v5.6.3: 15 Hours of Walk-Forward Research, One Shipping Config, and the Problem We Haven't Solved

After four sessions and roughly 15 hours of automated walk-forward research, the derivatives integration plan is complete. All five research steps have an answer. One config is validated and shipping. One problem remains open.

This is the full disclosure — what worked, what didn't, what the numbers actually say, and what we're still investigating.

The Research: 5 Steps, All Complete

The original plan asked five questions about integrating derivatives data into the BV-7X signal engine. Here's what we found.

Step	Question	Finding
1. IC Validation	Which derivatives signals are actionable?	Only oiChange7d (IC = −0.072, orthogonal). Funding & L/S ratio are redundant.
2. Discretization	Discrete combo keys or continuous?	Continuous wins. 5D combos cause sparsity — 40% had <8 samples, costing 56 correct calls.
3. Data Gap	Does pre-2022 data (no derivatives) break it?	No. UNKNOWN→NEUTRAL fallback holds. 2022+ subperiod: 62.02%.
4. Walk-Forward	Best validated config?	61% accuracy, 1,844 trades, p = 0.018. Monte Carlo: stable.
5. Error Analysis	Do corrections help?	6.4% correction rate, net −45. Veto thresholds too aggressive.
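The sparsity finding in step 2 comes down to counting how many distinct combo keys have too few historical samples to trust. A minimal Python sketch of that check follows; the key layout and the toy history are hypothetical, and only the 8-sample cutoff comes from the table above.

```python
from collections import Counter


def sparse_key_share(keys, min_samples=8):
    """Return the fraction of distinct combo keys seen fewer than min_samples times."""
    counts = Counter(keys)
    if not counts:
        return 0.0
    sparse = sum(1 for n in counts.values() if n < min_samples)
    return sparse / len(counts)


# Hypothetical 5D keys: (trend, flow, drawdown, sentiment, oi_state).
history = (
    [("UP", "IN", "LOW", "GREED", "RISING")] * 12
    + [("UP", "IN", "LOW", "GREED", "FALLING")] * 3
    + [("DOWN", "OUT", "HIGH", "FEAR", "RISING")] * 9
    + [("DOWN", "OUT", "HIGH", "FEAR", "FALLING")] * 2
)

share = sparse_key_share(history)
print(f"{share:.0%} of combo keys have fewer than 8 samples")
```

Every extra dimension multiplies the number of possible keys while the number of historical days stays fixed, which is why a fifth discrete dimension thins the buckets so quickly.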

What Ships: The v5.6.3 Config

The validated configuration we're deploying:

trendThreshold:    2     // was 3 — tighter trend detection
flowThreshold:     75    // was 100 — lower bar for ETF flow signal
drawdownThreshold: -3    // was -5 — faster drawdown recognition

useOIModifier:          true   // Binance OI 7d change as confidence adjuster
useFGMomentumFilter:    true   // block trades when F&G drops >20 in 7 days
useRSIMomentumFilter:   true   // block trades when RSI drops >20 in 7 days
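One way the OI modifier and the UNKNOWN→NEUTRAL fallback could fit together is sketched below in Python. This is an illustration, not the shipped logic: the function name, the `band` and `step` values, and the symmetric treatment of falling open interest are all assumptions. The only parts taken from this post are that rising open interest leans bearish and that missing data falls back to neutral.

```python
from typing import Optional


def oi_confidence_multiplier(
    oi_change_7d: Optional[float],
    direction: str,
    band: float = 0.10,   # assumed dead zone: moves smaller than 10% are ignored
    step: float = 0.15,   # assumed size of the confidence adjustment
) -> float:
    """Scale signal confidence by the 7-day open-interest change (hypothetical rule)."""
    if oi_change_7d is None:
        # UNKNOWN -> NEUTRAL: no derivatives data (e.g. older history), no adjustment.
        return 1.0
    if direction not in ("BUY", "SELL"):
        return 1.0
    if abs(oi_change_7d) < band:
        return 1.0
    rising = oi_change_7d > 0
    # Negative IC: rising OI supports SELL and undercuts BUY; falling OI does the reverse.
    agrees = (rising and direction == "SELL") or (not rising and direction == "BUY")
    return 1.0 + step if agrees else 1.0 - step


print(oi_confidence_multiplier(None, "BUY"))   # missing data leaves confidence unchanged
print(oi_confidence_multiplier(0.25, "BUY"))   # rising OI trims a BUY
```

Treating a missing value as a multiplier of 1.0 means days without derivatives data behave exactly as they did before the modifier existed.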

Metric	Value
Overall accuracy	61%
Actionable trades	1,844
p-value vs baseline	0.018

Accuracy by Period

Period	Accuracy	Trades
2018–2020	66.18%	408
2014–2017	63.22%	590
2021–2022	62.08%	240
2024+	55.81%	439
2023	53.29%	167
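As a consistency check, the per-period rows can be pooled back into the headline figure. The Python sketch below copies the numbers from the table above; rounding each row to a whole number of correct calls is an assumption about how the percentages were produced.

```python
# (period, accuracy in percent, trades), copied from the table above.
periods = [
    ("2018-2020", 66.18, 408),
    ("2014-2017", 63.22, 590),
    ("2021-2022", 62.08, 240),
    ("2024+", 55.81, 439),
    ("2023", 53.29, 167),
]


def pooled_accuracy(rows):
    """Return (correct calls, total trades, pooled accuracy in percent)."""
    total = sum(trades for _, _, trades in rows)
    # Each row's correct-call count is recovered by rounding accuracy * trades.
    correct = sum(round(acc / 100 * trades) for _, acc, trades in rows)
    return correct, total, 100 * correct / total


correct, total, accuracy = pooled_accuracy(periods)
print(f"{correct} / {total} = {accuracy:.2f}%")  # prints "1126 / 1844 = 61.06%"
```

The pooled result matches the 61% headline, and it also shows how much of that figure rests on the two pre-2021 periods, which together account for more than half the trades.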

Accuracy by Regime

Regime	Accuracy	Trades
Capitulation	65.64%	163
Bull	62.22%	990
Correction	59.79%	378
Bear	58.04%	224
Euphoria	53.62%	69

Accuracy by Action

Action	Accuracy	Trades
BUY	63.77%	69
WEAK_BUY	62.69%	1,351
WEAK_SELL	62.50%	56
SELL	54.35%	368

What Derivatives Data Actually Works

We tested four derivatives signals from Binance: funding rate, long/short ratio, taker buy/sell ratio, and open interest change. Only one survived validation.

Open Interest 7-day change (oiChange7d) has an Information Coefficient of −0.072. A negative IC means rising open interest leans bearish: when new leveraged positions flood in, the market is more exposed to a liquidation cascade. The correlation is small in absolute terms, but the signal is orthogonal to everything else in the model. It captures a dimension (leverage crowding) that price, sentiment, and flow data miss.

Funding rate — redundant. It correlates heavily with the existing sentiment and momentum signals. Adding it to the model doesn't improve accuracy; it just double-counts the same information.

Long/short ratio — same story. The retail positioning signal is already captured by Fear & Greed and RSI. Redundant.

Taker buy/sell ratio — zero predictive power. IC indistinguishable from random.

The lesson: more data doesn't mean better predictions. Only orthogonal data — information the model can't already see — improves accuracy. Out of four derivatives signals, only one passed that test.
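The two tests a candidate signal has to pass, predictive power and orthogonality, can be sketched in a few lines of Python. The post does not say how IC was computed; this sketch assumes the common definition (Spearman rank correlation between the signal and the forward return), and `max_overlap` is a hypothetical helper for the redundancy check.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def information_coefficient(signal, forward_returns):
    """Spearman rank correlation between a signal and the forward return."""
    return pearson(ranks(signal), ranks(forward_returns))


def max_overlap(candidate, existing_signals):
    """Largest absolute rank correlation between a candidate and any existing signal."""
    return max(abs(information_coefficient(candidate, s)) for s in existing_signals)
```

Under this framing, a signal is worth adding only when `information_coefficient` against the 7-day forward return is distinguishable from zero and `max_overlap` against the existing inputs is low. Funding rate and long/short ratio fail the second condition; taker buy/sell ratio fails the first.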

The Momentum Filters

Two new filters block trades during momentum collapse:

F&G Momentum Filter: If Fear & Greed drops more than 20 points in 7 days, block the trade → HOLD. Sentiment is in freefall — any directional call is unreliable.
RSI Momentum Filter: If RSI drops more than 20 points in 7 days, block the trade → HOLD. Technical momentum is collapsing — the model's trend signals are stale.
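Both filters share the same shape, so a single helper can cover them. The Python sketch below assumes daily series and a point-to-point comparison (today versus seven days ago); the function names are illustrative, and the real filter may measure the drop differently.

```python
def momentum_collapse(series, lookback=7, max_drop=20.0):
    """True if the latest value is more than max_drop below the value lookback days earlier."""
    if len(series) <= lookback:
        return False  # not enough history: do not block
    return series[-1 - lookback] - series[-1] > max_drop


def apply_momentum_filters(action, fear_greed, rsi):
    """Downgrade any action to HOLD when either F&G or RSI has collapsed."""
    if momentum_collapse(fear_greed) or momentum_collapse(rsi):
        return "HOLD"
    return action


# Fear & Greed falls from 60 to 38 over seven days (a 22-point drop); RSI is flat.
fear_greed = [60, 58, 55, 50, 45, 42, 40, 38]
rsi = [50] * 8
print(apply_momentum_filters("WEAK_BUY", fear_greed, rsi))  # prints "HOLD"
```

The comparison is strict, so a drop of exactly 20 points does not block the trade, which matches the "more than 20" wording above.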

Honest assessment: these filters blocked 96 trades. Those trades had 56.25% accuracy — below the 61% baseline but not terrible. The net improvement is +0.24 percentage points. That's within noise. The filters are a reasonable safeguard against momentum collapse, but they're not the primary driver of v5.6.3's improvement. The threshold changes (trend, flow, drawdown) are doing the heavy lifting.
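The +0.24 figure can be reproduced from numbers already in this post. The Python sketch below assumes the 96 blocked trades would otherwise have been taken with nothing else changing, and uses 1,126 correct calls out of 1,844, the count implied by the per-period accuracy table.

```python
blocked_trades = 96
blocked_correct = round(0.5625 * blocked_trades)  # 54 correct calls among blocked trades

kept_trades = 1844
kept_correct = 1126  # implied by the per-period table (61.06% of 1,844)

with_filters = kept_correct / kept_trades
without_filters = (kept_correct + blocked_correct) / (kept_trades + blocked_trades)
delta_pp = 100 * (with_filters - without_filters)

print(f"without filters: {100 * without_filters:.2f}%")  # prints "without filters: 60.82%"
print(f"with filters:    {100 * with_filters:.2f}%")     # prints "with filters:    61.06%"
print(f"net change:      {delta_pp:+.2f} pp")            # prints "net change:      +0.24 pp"
```

Removing 96 slightly-below-average trades from a pool of roughly 1,900 can only move the average a little, which is why the effect sits within noise.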

The Unsolved Problem: Choppy Markets

The model's weakest periods aren't crashes. They're chop.

We ran a full investigation of every trade in 2021–2022 — the period where prior configs collapsed to coin-flip accuracy. What we found surprised us.

Crashes are mostly fine. During the Luna collapse (May–Jun 2022), the model hit 71.4% accuracy. During China's mining ban (May–Jul 2021), 66.1%. During the 2022 Q1 decline, 60.2%. The model correctly identifies SELL signals during sharp downturns.

Chop is the problem. The non-event periods in 2021–2022 — sideways grind, range-bound action, repeated MA200 crossovers — that's where accuracy collapsed to 44.8%. The model generates directional signals in a market that isn't going anywhere.

Context	Accuracy	Trades
Luna Crash (May–Jun 2022)	71.4%	42
China Ban (May–Jul 2021)	66.1%	56
Q1 2022 Decline	60.2%	88
Non-Event Periods	44.8%	143
FTX Collapse (Nov 2022)	39.3%	28

The regime transition data confirms it: 2021–2022 had nearly double the MA200 crossover rate (0.78/month) compared to 2018–2020 (0.40/month). A trend-following model generates signals on every crossover. In a choppy market, most of those signals are wrong.
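A crossover rate like the one quoted above can be measured by counting sign changes of price minus its moving average. The Python sketch below assumes daily closes and a simple moving average; the post does not specify the MA type or how days exactly on the average are handled, so both are assumptions here.

```python
def ma_crossovers(prices, window=200):
    """Count how often price switches sides of its simple moving average."""
    crossings = 0
    prev_side = 0
    running = sum(prices[:window])
    for i in range(window - 1, len(prices)):
        if i >= window:
            running += prices[i] - prices[i - window]  # slide the window by one day
        diff = prices[i] - running / window
        side = (diff > 0) - (diff < 0)
        if side != 0:
            # Days exactly on the average are skipped rather than counted as a cross.
            if prev_side != 0 and side != prev_side:
                crossings += 1
            prev_side = side
    return crossings


def crossovers_per_month(prices, window=200, days_per_month=30.44):
    """Crossovers divided by the number of months for which the average is defined."""
    days = len(prices) - window + 1
    if days <= 0:
        return 0.0
    return ma_crossovers(prices, window) / (days / days_per_month)


# Toy series with a 3-day window: price moves above, below, then above the average.
toy = [1, 1, 1, 5, 5, 5, 1, 1, 1, 5]
print(ma_crossovers(toy, window=3))  # prints 2
```

With the rates quoted above, 0.78 / 0.40 is about 1.95, which is where "nearly double" comes from.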

The worst months tell the story. September 2021: 8.3% accuracy. March 2022: 21.1%. These aren't crash months — they're sideways grind months where the model kept calling direction in a directionless market.

The FTX collapse (39.3%) is the exception — a crash where the model failed. FTX was an idiosyncratic fraud event, not a market-driven correction. The model's signals were structurally correct (SELL in a downtrend) but the timing and magnitude were unpredictable. This is a known limitation of any model that can't anticipate fraud.

What We're NOT Doing

The research eliminated several dead ends. Documenting them so we don't revisit:

No 5D combo keys. Adding derivatives as a 5th dimension to the signal combination caused sparsity. 40% of combos had fewer than 8 samples. Net cost: 56 correct calls lost.
No funding rate or L/S ratio as primary signals. Both are redundant with existing sentiment and momentum data. Adding them doesn't improve accuracy.
No shorter prediction horizons. 7-day forward return is the most predictable window. 1-day and 3-day horizons have worse signal-to-noise ratios.
No ensembles. Phase 2 showed ensemble methods converge to the same answer (59.02–59.07%). The model isn't ensemble-limited; it's data-limited.

What's Next

v5.6.3 is shipping. The validated config is in production. But the choppy-market problem remains the single biggest accuracy risk.

The investigation points to a clear next step: a chop detector. When the model detects high regime-transition frequency (frequent MA200 crossovers, low directional momentum, compressed volatility), it should reduce position sizing or suppress signals entirely. The model knows when it doesn't know — it just doesn't act on that knowledge yet.
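To make the idea concrete, here is one possible shape for such a gate, written in Python. Nothing here is validated or shipped: the function name, all three thresholds, and the 1.0 / 0.5 / 0.0 sizing ladder are placeholders chosen for illustration. Only the three symptoms come from the paragraph above.

```python
def chop_position_scale(
    crossovers_per_month,
    abs_momentum,
    realized_vol,
    max_crossovers=0.6,   # hypothetical threshold on MA200 crossovers per month
    min_momentum=0.05,    # hypothetical floor on absolute directional momentum
    min_vol=0.02,         # hypothetical floor on realized volatility
):
    """Return a position multiplier based on how many chop symptoms are present."""
    symptoms = sum([
        crossovers_per_month > max_crossovers,  # frequent regime transitions
        abs_momentum < min_momentum,            # low directional momentum
        realized_vol < min_vol,                 # compressed volatility
    ])
    if symptoms >= 3:
        return 0.0  # suppress the signal entirely
    if symptoms == 2:
        return 0.5  # reduce position size
    return 1.0      # trade normally


print(chop_position_scale(0.78, 0.01, 0.01))  # prints 0.0
print(chop_position_scale(0.40, 0.10, 0.05))  # prints 1.0
```

Any real version would need its thresholds chosen inside the same walk-forward process as the rest of the config, since tuning them on 2021–2022 alone would fit the detector to the very period it is meant to explain.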

We'll report the results when it ships.

Track the Model

Every BV-7X prediction is attested on-chain via EAS on Base. Accuracy is verifiable, not claimed.


Mischa0X

Building BV-7X — autonomous prediction oracle, on-chain attestation, and adversarial AI markets.

Previously: Depeg.io

@Mischa0X bv7x.ai