520 Days Against the Crowd

The Setup

Kalshi has run daily Bitcoin prediction markets since June 2024. Each market asks the same question: will BTC be higher or lower in 7 days? When the window closes, the market settles. Up or down. Binary outcome, publicly recorded, no ambiguity.

561 markets have settled. We took the BV-7X V5 model — four macro signals (trend, momentum, flow, value) with a 7-day horizon — and ran it against every Kalshi observation where we had overlapping data. 520 matched days, from June 2024 through January 2026. Ground truth is Kalshi's own settlement brackets, not ours.

This is the head-to-head.

What this comparison does

It compares BV-7X's signal on day T against the actual 7-day outcome — which is what Kalshi settles on. The Kalshi data provides the ground truth timeline, not the crowd's implied odds.
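
As a rough illustration of that scoring rule, the sketch below marks a signal on day T as correct when its direction matches the sign of the price move from T to T+7. The function names, the string labels for BUY/SELL/HOLD, the tie handling, and the use of a plain price list in place of Kalshi's settlement brackets are all assumptions made for the example, not the BV-7X pipeline.

```python
from typing import List, Optional, Sequence

HORIZON_DAYS = 7  # forecast window described in the text


def score_signal(signal: str, price_t: float, price_t7: float) -> Optional[bool]:
    """Return True/False for a BUY or SELL call, or None when the model holds."""
    if signal == "HOLD":
        return None  # no actionable signal, so nothing to score
    # Assumption: an unchanged price counts as "not up".
    went_up = price_t7 > price_t
    if signal == "BUY":
        return went_up
    if signal == "SELL":
        return not went_up
    raise ValueError(f"unknown signal: {signal!r}")


def score_series(signals: Sequence[str], prices: Sequence[float]) -> List[Optional[bool]]:
    """Score every day T that has a price available at T + HORIZON_DAYS."""
    results = []
    for t in range(len(prices) - HORIZON_DAYS):
        results.append(score_signal(signals[t], prices[t], prices[t + HORIZON_DAYS]))
    return results
```

Days without a price seven days later are simply left unscored, which is why the last week of any series produces no results.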

What it does not do: compare against Kalshi's implied probability at event open. The dataset contains only settlement outcomes, not crowd probabilities. The "48.5% base rate" is simply the percentage of 7-day windows that resolved UP — the accuracy of a naive "always predict UP" strategy, not a crowd consensus figure.

The framing: BV-7X 59.7% vs coin flip 50% vs naive-UP 48.5%, all measured against Kalshi's settled 7-day brackets as ground truth.

The Baseline

Coin flip: 50%. Base rate across all 561 settled Kalshi markets: 48.5% resolved UP, which is what a naive always-UP strategy would have scored. On our 520 overlapping days: 50.6% resolved UP. Bitcoin on a weekly horizon is effectively a coin toss. Any sustained accuracy above 50% is signal. Everything below is noise dressed up as conviction.

Results

Metric	BV-7X V5	Coin Flip	Naive UP (Kalshi base rate)
Accuracy	59.7% (273/457)	50.0%	48.5% (272/561)
BUY accuracy	56.6% (146/258)	—	—
SELL accuracy	63.8% (127/199)	—	—
Signal rate	87.9% (457/520)	100%	100%
Edge vs flip	+9.7pp	—	-1.5pp
Kelly fraction	19.5%	0%	0%
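
The headline figures can be re-derived from the raw counts in the table. The snippet below is a small arithmetic check, not part of the model; the variable names are ours.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as in the table."""
    return round(100.0 * numerator / denominator, 1)


# Counts taken from the results table.
BUY_CORRECT, BUY_TOTAL = 146, 258
SELL_CORRECT, SELL_TOTAL = 127, 199
MATCHED_DAYS = 520

signals = BUY_TOTAL + SELL_TOTAL        # actionable signals
correct = BUY_CORRECT + SELL_CORRECT    # correct calls
holds = MATCHED_DAYS - signals          # days the model sat out

accuracy = pct(correct, signals)
signal_rate = pct(signals, MATCHED_DAYS)
edge_vs_flip = round(accuracy - 50.0, 1)

print(signals, correct, holds)              # 457 273 63
print(accuracy, signal_rate, edge_vs_flip)  # 59.7 87.9 9.7
```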

457 actionable signals out of 520 days. 63 HOLDs. The model chose not to play 12.1% of the time. When it played, it was right 59.7% of the time. A naive always-UP bettor played every day and landed below 50%.

The SELL Edge

SELL accuracy at 63.8% across 199 signals is the standout. The V5 strength-mode SELL gate plus a flow-confirmation filter requires institutional selling conviction before calling a decline. When trend turns bearish but ETF flows haven't confirmed, the model stands aside rather than forcing a weak sell. It doesn't short Bitcoin recklessly. It requires convergence from multiple independent indicators before calling a 7-day decline.

BUY at 56.6% across 258 signals. Both sides are well above the coin flip baseline. 457 out of 520 days produced an actionable signal — the model sits out 12% of the time, mostly during ambiguous regime transitions, and still maintains a positive edge on both sides of the trade.

Month by Month

Month	Signals	Correct	Accuracy	HOLDs	Base UP
Jun 2024	15	9	60.0%	0	13.3%
Jul 2024	17	9	52.9%	5	54.5%
Aug 2024	13	9	69.2%	9	40.9%
Sep 2024	14	7	50.0%	6	65.0%
Oct 2024	20	12	60.0%	3	73.9%
Nov 2024	21	18	85.7%	0	85.7%
Dec 2024	29	15	51.7%	2	51.6%
Jan 2025	31	17	54.8%	0	29.0%
Feb 2025	28	19	67.9%	0	39.3%
Mar 2025	27	17	63.0%	4	38.7%
Apr 2025	20	10	50.0%	10	76.7%
May 2025	31	19	61.3%	0	61.3%
Jun 2025	28	17	60.7%	2	63.3%
Jul 2025	31	16	51.6%	0	51.6%
Aug 2025	20	9	45.0%	11	41.9%
Sep 2025	26	14	53.8%	4	66.7%
Oct 2025	28	15	53.6%	3	29.0%
Nov 2025	29	24	82.8%	1	26.7%
Dec 2025	28	17	60.7%	3	51.6%
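
A monthly table like this can be built by tallying scored days per month. The sketch below assumes each day has already been reduced to a (month, outcome) pair, where outcome is True or False for a scored signal and None for a HOLD; that record layout and the function name are hypothetical. The Base UP column is not computed here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def monthly_accuracy(records: Iterable[Tuple[str, Optional[bool]]]) -> Dict[str, dict]:
    """Tally signals, correct calls and HOLDs per month, then add accuracy in %."""
    tally = defaultdict(lambda: {"signals": 0, "correct": 0, "holds": 0})
    for month, outcome in records:
        row = tally[month]
        if outcome is None:
            row["holds"] += 1          # HOLD days are counted but never scored
        else:
            row["signals"] += 1
            row["correct"] += int(outcome)

    report = {}
    for month, row in tally.items():
        accuracy = None
        if row["signals"]:
            accuracy = round(100.0 * row["correct"] / row["signals"], 1)
        report[month] = {**row, "accuracy": accuracy}
    return report


# Example: counts matching the Aug 2025 row (9 correct, 11 wrong, 11 HOLDs).
aug_2025 = [("Aug 2025", True)] * 9 + [("Aug 2025", False)] * 11 + [("Aug 2025", None)] * 11
print(monthly_accuracy(aug_2025)["Aug 2025"])
```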

The edge is distributed. No single month carries the aggregate. Best months: Nov 2024 (85.7%), Nov 2025 (82.8%), Aug 2024 (69.2%). Only one month below 50%: Aug 2025 (45.0%). Two more, Sep 2024 and Apr 2025, land exactly on 50.0%. Months that previously struggled (Sep 2024, Dec 2024, Oct 2025) now sit at or above 50% thanks to the correction override and flow-confirmation gate. The model's remaining weak spot is choppy sideways action where no trend-following system has an edge.

Caveats

The 59.7% is an in-sample number. The V5 model's thresholds were optimized on historical data that overlaps this Kalshi period. That means the comparison flatters the model. We know this and we're saying it upfront.

A less biased estimate comes from walk-forward testing: 19 expanding-window folds, each with a mini grid search on the training set and a 180-day out-of-sample test window. That number is 61.2% (764/1,248). The out-of-sample accuracy actually exceeds the in-sample Kalshi number. That's unusual and worth pausing on.
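
For readers unfamiliar with walk-forward testing, here is a minimal sketch of the expanding-window idea: thresholds are chosen on everything up to a cutoff, scored on the next 180 days, and then that test window is absorbed into the training set for the next fold. The initial training length, the `fit` and `score` callables, and the function names are assumptions for illustration; the actual BV-7X fold boundaries and grid search are not given here.

```python
from typing import Callable, Iterator, Tuple


def expanding_window_folds(
    n_days: int, initial_train: int, test_size: int = 180
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges; the training range grows every fold."""
    train_end = initial_train
    while train_end + test_size <= n_days:
        yield range(0, train_end), range(train_end, train_end + test_size)
        train_end += test_size  # absorb the just-tested window into training


def walk_forward(
    n_days: int,
    initial_train: int,
    fit: Callable[[range], object],
    score: Callable[[object, range], Tuple[int, int]],
    test_size: int = 180,
) -> Tuple[int, int]:
    """Pool (correct, signals) over all out-of-sample test windows."""
    correct = total = 0
    for train, test in expanding_window_folds(n_days, initial_train, test_size):
        params = fit(train)               # e.g. a small grid search on training days only
        fold_correct, fold_total = score(params, test)  # evaluated on unseen days
        correct += fold_correct
        total += fold_total
    return correct, total


# Hypothetical sizing: 365 training days up front, then 19 test windows of 180 days.
folds = list(expanding_window_folds(n_days=365 + 19 * 180, initial_train=365))
print(len(folds))  # 19
```

The key property is that no test day is ever visible to the fit step that precedes it, which is what makes the pooled accuracy an out-of-sample figure.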

Overfitting is the cardinal sin of quantitative modelling. It means your model has memorized the past rather than learned from it: it performs beautifully on historical data and collapses on new data. The telltale sign is a large gap between in-sample accuracy and out-of-sample accuracy. BV-7X shows the opposite: 59.7% in-sample, 61.2% out-of-sample. The two figures come from different samples (457 Kalshi-matched signals versus 1,248 walk-forward signals), so they are not a like-for-like pair, but there is no sign of the collapse that overfitting produces. The model isn't fitting to noise. It's capturing something structural about how Bitcoin moves on a weekly horizon: trend persistence, institutional flow confirmation, mean reversion at extremes. Four macro signals, simple thresholds, no neural nets, no hundred-parameter black boxes. Parsimony is the best defence against overfitting, and this model was built parsimonious from day one.

What This Means

+9.7 percentage points over a coin flip. For an even-money payoff, the Kelly criterion gives 2p - 1, or 19.5% of bankroll per bet at p = 59.7%. On Kalshi's weekly BTC markets, that's a quantifiable, repeatable edge. The model doesn't predict the future; it tilts the odds. Over hundreds of bets, tilted odds compound. That's the thesis. 520 days of data say the tilt is real.
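
The 19.5% figure follows from the standard Kelly formula f = p - (1 - p) / b, where p is the win probability and b is the net odds received on a win. The sketch below assumes an even-money payoff (b = 1), which is what reproduces the number in the table. A binary contract bought at price c (in dollars, paying $1) has net odds of (1 - c) / c, so the appropriate stake would differ whenever the contract is not priced at 50 cents. The function name and the floor at zero are our choices.

```python
def kelly_fraction(p_win: float, net_odds: float = 1.0) -> float:
    """Kelly stake as a fraction of bankroll; floored at 0 when there is no edge."""
    if not 0.0 <= p_win <= 1.0:
        raise ValueError("p_win must be in [0, 1]")
    if net_odds <= 0:
        raise ValueError("net_odds must be positive")
    fraction = p_win - (1.0 - p_win) / net_odds
    return max(fraction, 0.0)


print(round(100 * kelly_fraction(273 / 457), 1))  # 19.5 (model hit rate, even money)
print(kelly_fraction(0.5))                        # 0.0 (coin flip)
print(kelly_fraction(272 / 561))                  # 0.0 (naive always-UP, negative edge)
```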

Mischa0X

Building BV-7X — an autonomous AI oracle for Bitcoin macro signals.

Previously: derivatives infrastructure, quantitative research.

@Mischa0X @BV7X_