The Setup
Kalshi has run daily Bitcoin prediction markets since June 2024. Each market asks the same question: will BTC be higher or lower in 7 days? When the window closes, the market settles. Up or down. Binary outcome, publicly recorded, no ambiguity.
561 markets have settled. We took the BV-7X V5 model — four macro signals (trend, momentum, flow, value) with a 7-day horizon — and ran it against every Kalshi observation where we had overlapping data. 520 matched days, from June 2024 through January 2026. Ground truth is Kalshi's own settlement brackets, not ours.
This is the head-to-head.
What this comparison does
It compares BV-7X's signal on day T against the actual 7-day outcome — which is what Kalshi settles on. The Kalshi data provides the ground truth timeline, not the crowd's implied odds.
What it does not do: compare against Kalshi's implied probability at event open. The dataset contains only settlement outcomes, not crowd probabilities. The "48.5% base rate" is simply the percentage of 7-day windows that resolved UP — the accuracy of a naive "always predict UP" strategy, not a crowd consensus figure.
The framing: BV-7X 59.7% vs coin flip 50% vs naive-UP 48.5%, all measured against Kalshi's settled 7-day brackets as ground truth.
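The scoring rule above is simple enough to sketch. This is an illustration, not the BV-7X pipeline: the `score_signals` helper, the string labels `"BUY"`, `"SELL"`, `"HOLD"`, and the close-to-close price comparison standing in for Kalshi's settlement are all assumptions made for the example.

```python
from typing import Sequence

HORIZON_DAYS = 7  # the 7-day window described in the text


def score_signals(prices: Sequence[float], signals: Sequence[str]) -> tuple[int, int]:
    """Return (correct, actionable) for signals scored HORIZON_DAYS later.

    Hypothetical helper: day t resolves UP if prices[t + 7] > prices[t].
    In the article the outcome comes from Kalshi's settlement, not this rule.
    """
    correct = 0
    actionable = 0
    for t, signal in enumerate(signals):
        if signal == "HOLD" or t + HORIZON_DAYS >= len(prices):
            continue  # HOLDs and windows that have not closed are not scored
        went_up = prices[t + HORIZON_DAYS] > prices[t]
        actionable += 1
        if (signal == "BUY") == went_up:  # BUY wants UP, SELL wants DOWN
            correct += 1
    return correct, actionable


# Toy data, not real BTC prices.
toy_prices = [100, 101, 102, 103, 104, 105, 106, 110, 90, 95]
toy_signals = ["BUY", "SELL", "BUY", "BUY"]
toy_correct, toy_actionable = score_signals(toy_prices, toy_signals)
```

HOLD days drop out of the denominator, which is why the headline accuracy is quoted over actionable signals rather than over all matched days.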
The Baseline
Coin flip: 50%. The base rate across all 561 settled Kalshi markets: 48.5% resolved UP. On our 520 overlapping days: 50.6% resolved UP. Bitcoin on a weekly horizon is effectively a coin toss. Any sustained accuracy above 50% is signal. Everything below is noise dressed up as conviction.
Results
| Metric | BV-7X V5 | Coin Flip | Naive UP |
|---|---|---|---|
| Accuracy | 59.7% (273/457) | 50.0% | 48.5% (272/561) |
| BUY accuracy | 56.6% (146/258) | — | — |
| SELL accuracy | 63.8% (127/199) | — | — |
| Signal rate | 87.9% (457/520) | 100% | 100% |
| Edge vs flip | +9.7pp | — | -1.5pp |
| Kelly fraction | 19.5% | 0% | 0% |
457 actionable signals out of 520 days. 63 HOLDs. The model chose not to play 12.1% of the time. When it played, it was right 59.7% of the time. A naive always-UP strategy plays every day and lands below 50%.
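Every derived figure in the table follows from four raw counts. The sketch below recomputes them. One assumption is worth flagging: the 19.5% Kelly figure matches the even-money formula f = 2p - 1 (win one unit, lose one unit). That payoff structure is assumed here, not stated in the table, and a contract bought at a different price would give a different fraction.

```python
# Counts taken from the results table above.
buy_correct, buy_total = 146, 258
sell_correct, sell_total = 127, 199
matched_days = 520

signals = buy_total + sell_total    # actionable (non-HOLD) days
correct = buy_correct + sell_correct
holds = matched_days - signals

accuracy = correct / signals
signal_rate = signals / matched_days
edge_pp = (accuracy - 0.5) * 100    # percentage points over a coin flip

# Assumption: even-money payoff, so the Kelly fraction is f = 2p - 1.
kelly = 2 * accuracy - 1

print(f"accuracy     {accuracy:.1%} ({correct}/{signals})")
print(f"signal rate  {signal_rate:.1%} ({signals}/{matched_days}), {holds} HOLDs")
print(f"edge vs flip {edge_pp:+.1f}pp")
print(f"Kelly (even money) {kelly:.1%}")
```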
The SELL Edge
SELL accuracy at 63.8% across 199 signals is the standout. The V5 strength-mode SELL gate plus a flow-confirmation filter requires institutional selling conviction before calling a decline. When trend turns bearish but ETF flows haven't confirmed, the model stands aside rather than forcing a weak sell. It doesn't short Bitcoin recklessly. It requires convergence from multiple independent indicators before calling a 7-day decline.
BUY at 56.6% across 258 signals. Both sides are well above the coin flip baseline. 457 out of 520 days produced an actionable signal — the model sits out 12% of the time, mostly during ambiguous regime transitions, and still maintains a positive edge on both sides of the trade.
Month by Month
| Month | Signals | Correct | Accuracy | HOLDs | Base UP |
|---|---|---|---|---|---|
| Jun 2024 | 15 | 9 | 60.0% | 0 | 13.3% |
| Jul 2024 | 17 | 9 | 52.9% | 5 | 54.5% |
| Aug 2024 | 13 | 9 | 69.2% | 9 | 40.9% |
| Sep 2024 | 14 | 7 | 50.0% | 6 | 65.0% |
| Oct 2024 | 20 | 12 | 60.0% | 3 | 73.9% |
| Nov 2024 | 21 | 18 | 85.7% | 0 | 85.7% |
| Dec 2024 | 29 | 15 | 51.7% | 2 | 51.6% |
| Jan 2025 | 31 | 17 | 54.8% | 0 | 29.0% |
| Feb 2025 | 28 | 19 | 67.9% | 0 | 39.3% |
| Mar 2025 | 27 | 17 | 63.0% | 4 | 38.7% |
| Apr 2025 | 20 | 10 | 50.0% | 10 | 76.7% |
| May 2025 | 31 | 19 | 61.3% | 0 | 61.3% |
| Jun 2025 | 28 | 17 | 60.7% | 2 | 63.3% |
| Jul 2025 | 31 | 16 | 51.6% | 0 | 51.6% |
| Aug 2025 | 20 | 9 | 45.0% | 11 | 41.9% |
| Sep 2025 | 26 | 14 | 53.8% | 4 | 66.7% |
| Oct 2025 | 28 | 15 | 53.6% | 3 | 29.0% |
| Nov 2025 | 29 | 24 | 82.8% | 1 | 26.7% |
| Dec 2025 | 28 | 17 | 60.7% | 3 | 51.6% |
The edge is distributed. No single month carries the aggregate. Best months: Nov 2024 (85.7%), Nov 2025 (82.8%), Aug 2024 (69.2%). Only one month below 50%: Aug 2025 (45.0%). Two more land exactly on 50.0%: Sep 2024 and Apr 2025. Months that previously struggled (Sep 2024, Dec 2024, Oct 2025) now sit at or above 50% thanks to the correction override and flow-confirmation gate. The model's remaining weak spot is choppy sideways action where no trend-following system has an edge.
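The monthly claims can be checked straight from the table. A small sketch, with the rows copied from the Signals and Correct columns; the variable names are illustrative only.

```python
# (month, signals, correct) copied from the month-by-month table above.
MONTHLY = [
    ("Jun 2024", 15, 9), ("Jul 2024", 17, 9), ("Aug 2024", 13, 9),
    ("Sep 2024", 14, 7), ("Oct 2024", 20, 12), ("Nov 2024", 21, 18),
    ("Dec 2024", 29, 15), ("Jan 2025", 31, 17), ("Feb 2025", 28, 19),
    ("Mar 2025", 27, 17), ("Apr 2025", 20, 10), ("May 2025", 31, 19),
    ("Jun 2025", 28, 17), ("Jul 2025", 31, 16), ("Aug 2025", 20, 9),
    ("Sep 2025", 26, 14), ("Oct 2025", 28, 15), ("Nov 2025", 29, 24),
    ("Dec 2025", 28, 17),
]

# Integer comparisons avoid floating-point edge cases at exactly 50%.
below_50 = [month for month, n, c in MONTHLY if 2 * c < n]
exactly_50 = [month for month, n, c in MONTHLY if 2 * c == n]

# Three best months by accuracy.
ranked = sorted(MONTHLY, key=lambda row: row[2] / row[1], reverse=True)
best_months = [month for month, _, _ in ranked[:3]]

print("below 50%:  ", below_50)
print("exactly 50%:", exactly_50)
print("best three: ", best_months)
```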
Caveats
The 59.7% is an in-sample number. The V5 model's thresholds were optimized on historical data that overlaps this Kalshi period. That means the comparison flatters the model. We know this and we're saying it upfront.
The unbiased estimate comes from walk-forward testing: 19 expanding-window folds, each with a mini grid search on the training set and a 180-day out-of-sample test window. That number is 61.2% (764/1,248). The out-of-sample accuracy actually exceeds the in-sample Kalshi number. That's unusual and worth pausing on.
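For readers who haven't seen walk-forward testing, here is a minimal sketch of the fold layout. It is not the BV-7X test harness. The text gives 19 folds and a 180-day test window; everything else is an assumption for illustration: the function names, the consecutive non-overlapping test windows, the size of the first training window, and the `fit` and `score` callables standing in for the grid search and the out-of-sample scoring.

```python
from typing import Callable, Iterator


def expanding_folds(
    n_days: int, n_folds: int = 19, test_len: int = 180
) -> Iterator[tuple[range, range]]:
    """Yield (train_days, test_days) index ranges for an expanding window.

    Training always starts at day 0 and grows by test_len each fold.
    The test window is the next test_len days, unseen during training.
    """
    first_train_len = n_days - n_folds * test_len  # history left for fold 0
    if first_train_len <= 0:
        raise ValueError("not enough history for the requested folds")
    for k in range(n_folds):
        train_end = first_train_len + k * test_len
        yield range(0, train_end), range(train_end, train_end + test_len)


def walk_forward(
    n_days: int,
    fit: Callable[[range], object],
    score: Callable[[object, range], tuple[int, int]],
    n_folds: int = 19,
    test_len: int = 180,
) -> tuple[int, int]:
    """Pool out-of-sample (correct, signals) across all folds."""
    correct = total = 0
    for train, test in expanding_folds(n_days, n_folds, test_len):
        params = fit(train)            # e.g. a grid search on training days only
        hits, n = score(params, test)  # evaluated on days the fit never saw
        correct += hits
        total += n
    return correct, total
```

The point of the layout is that thresholds are only ever chosen on days that precede the test window, so the pooled accuracy is not contaminated by the data it is scored on.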
Overfitting is the cardinal sin of quantitative modelling. It means your model has memorized the past rather than learned from it — it performs beautifully on historical data and collapses on new data. The telltale sign is a large gap between in-sample accuracy and out-of-sample accuracy. BV-7X shows the opposite: 59.7% in-sample, 61.2% out-of-sample. The model isn't fitting to noise. It's capturing something structural about how Bitcoin moves on a weekly horizon — trend persistence, institutional flow confirmation, mean reversion at extremes. Four macro signals, simple thresholds, no neural nets, no hundred-parameter black boxes. Parsimony is the best defence against overfitting, and this model was built parsimonious from day one.
What This Means
+9.7 percentage points over a coin flip. At even-money payoffs, the Kelly criterion (f = 2p - 1, with p = 273/457) says 19.5% of bankroll per bet. On Kalshi's weekly BTC markets, that's a quantifiable, repeatable edge. The model doesn't predict the future; it tilts the odds. Over hundreds of bets, tilted odds compound. That's the thesis. 520 days of data say the tilt is real.
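One way to put a number on "the tilt is real" is an exact one-sided binomial tail: how often would a fair coin get at least 273 of 457 calls right? The sketch below computes that. It treats the 457 signals as independent, which is an assumption and an optimistic one: consecutive daily signals share six of their seven days, so outcomes are correlated and the effective sample is smaller than 457. Read the result as a lower bound on the probability, not a final verdict.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int) -> float:
    """P(X >= successes) for X ~ Binomial(trials, 0.5), computed exactly."""
    favourable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favourable / 2 ** trials


# Chance that a fair coin matches or beats 273 correct calls out of 457,
# assuming (optimistically) that the 457 outcomes are independent.
p_value = binomial_upper_tail(273, 457)
print(f"one-sided p-value vs coin flip: {p_value:.2e}")
```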