BV-7X doesn't just publish signals. It runs 1,061 adversarial agents trying to beat its own oracle every single day. Fourteen hand-crafted trading philosophies. Over a thousand parameter variants. Coinflip baselines. Inverse contrarians. An automated blind-spot finder that studies the oracle's historical losses and bets against it in exactly those conditions. If the oracle doesn't rank near the top of its own challengers, something is wrong.
Most signal providers show you their wins. We built a system that hunts for our losses. The leaderboard at /bets doesn't just track external competitors — it ranks the oracle against an army of agents specifically designed to expose its weaknesses.
The Fleet and the Gauntlet exist because a signal that can't survive adversarial stress-testing isn't worth publishing. Here's how they work.
The Fleet: 14 Competing Philosophies
The Fleet is a set of 14 hand-crafted agents, each implementing a real trading philosophy that actual traders use in the wild. They're not toys. They're legitimate approaches to BTC direction — the kind of strategies a human might bet on.
| Agent | Strategy |
|---|---|
| MomentumChaser | Follows RSI + ROC trends |
| MeanReverter | Bets against extremes (RSI < 30 or > 70) |
| FearGreedPurist | Contrarian F&G index plays |
| TrendFollower | MA50/MA200 crossover signals |
| ETFFlowTracker | Follows institutional money |
| MacroRegime | DXY + yield curve positioning |
| VolBreakout | Volatility expansion detection |
| SPXCorrelator | BTC-S&P correlation trades |
| YieldCurveWatcher | Rate environment signals |
| CalendarTrader | Day-of-week and monthly anomalies |
| DrawdownHunter | Buys deep dips from ATH |
| CrowdFader | Fades Polymarket consensus |
| OracleExploiter | Studies oracle losses, bets against it |
| CoinFlipper | Pure 50/50 random baseline |
Every Fleet agent predicts at 21:30 UTC, thirty minutes before the oracle publishes at 22:00. They commit blind, without knowing the oracle's call. The predictions are timestamped and resolved automatically, just like those of any external competitor in the arena.
MomentumChaser, MeanReverter, and FearGreedPurist represent the three most common approaches real traders use. If any of them consistently beats the oracle, it means the oracle has a blind spot in the most basic trading strategies. That's a problem worth knowing about.
All 14 agents are pure functions. Deterministic. No LLM calls. Given the same market data, they produce the same prediction every time. This matters because it makes the results reproducible — there's no stochastic noise hiding a bad strategy behind occasional lucky outputs.
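To make "pure function" concrete, here is a minimal sketch of what a MeanReverter-style agent could look like. The `MarketSnapshot` shape, the function name, and the `NEUTRAL` fallback between the two thresholds are assumptions for illustration; the article only states the RSI < 30 or > 70 rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarketSnapshot:
    """Hypothetical input shape; the real feature set is not described here."""
    rsi: float  # relative strength index, 0-100


def mean_reverter(snap: MarketSnapshot, low: float = 30.0, high: float = 70.0) -> str:
    """Bet against extremes: oversold means UP, overbought means DOWN."""
    if snap.rsi < low:
        return "UP"
    if snap.rsi > high:
        return "DOWN"
    # Assumption: what the real agent does between the thresholds is not stated.
    return "NEUTRAL"


# Same input, same output, no hidden state and no randomness.
snapshot = MarketSnapshot(rsi=24.5)
first_call = mean_reverter(snapshot)
second_call = mean_reverter(snapshot)
```

Because the function reads nothing but its arguments, any past prediction can be replayed from the stored market data and checked.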
The Gauntlet: 1,047 Stress Tests
The Fleet tests philosophies. The Gauntlet tests parameters. It takes the 14 base Fleet strategies and generates 1,047 variants through systematic parameter sweeps, composite voting, inversion, and randomization.
| Category | Count | What It Tests |
|---|---|---|
| Threshold variants | 146 | What if RSI threshold is 25 instead of 30? |
| Pairwise composites | 135 | What if momentum AND mean-reversion agree? |
| Triple composites | 100 | Majority vote across three strategies |
| Ensemble voters | 100 | Random subsets of 5–7 strategies vote together |
| Inverse/contrarian | 201 | What if you do the exact opposite? |
| Lagged variants | 50 | What if you use yesterday's data? |
| Random threshold | 300 | Seeded random parameters within valid bounds |
| Coinflip baselines | 15 | Pure 50/50 — prove you're not lucky |
The coinflip baselines are the most important row in that table. Fifteen independent random agents, each with a different seed. Their expected accuracy is ~50%. If the oracle can't beat random, nothing else in the system matters. It's the sanity check everything else builds on.
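A seeded coinflip baseline, and the "prove you're not lucky" check it enables, can be sketched as follows. The function names are hypothetical, and the one-sided binomial test is a standard technique swapped in for illustration, not necessarily the check BV-7X runs.

```python
import random
from math import comb


def coinflip_agent(seed: int, n_rounds: int) -> list[str]:
    """One baseline: a reproducible 50/50 sequence driven by its own seed."""
    rng = random.Random(seed)
    return [rng.choice(["UP", "DOWN"]) for _ in range(n_rounds)]


def p_value_vs_coinflip(wins: int, rounds: int) -> float:
    """Probability that a fair coin scores at least `wins` out of `rounds`."""
    favourable = sum(comb(rounds, k) for k in range(wins, rounds + 1))
    return favourable / 2 ** rounds


# Fifteen baselines with different seeds, as in the table above.
baselines = [coinflip_agent(seed, n_rounds=100) for seed in range(15)]

# Example: how surprising would 60 correct calls in 100 rounds be for a coin?
p = p_value_vs_coinflip(wins=60, rounds=100)
```

A small p-value says the record is unlikely to come from guessing. Running several seeds matters because any single coinflip agent can get a lucky streak over a short window.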
The inverse agents are the most adversarial. They take every fleet strategy and flip the output: if MomentumChaser says UP, inv_MomentumChaser says DOWN. If any inverse agent beats the oracle, it means that doing the opposite of that strategy carries more edge than the oracle's own call, and the oracle is failing to capture it.
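Most Gauntlet categories can be read as small wrappers around a base strategy. The sketch below shows three of them: a threshold variant, an inverse, and a majority-vote composite. All names and the dictionary-based feature format are assumptions, and the base agent is a toy binary rule, not a real Fleet strategy.

```python
from typing import Callable, Mapping

Features = Mapping[str, float]
Agent = Callable[[Features], str]


def rsi_threshold_agent(threshold: float) -> Agent:
    """Threshold variant: a toy rule whose cutoff is the swept parameter."""
    def agent(features: Features) -> str:
        return "UP" if features["rsi"] < threshold else "DOWN"
    return agent


def invert(agent: Agent) -> Agent:
    """Inverse variant: flip a binary UP/DOWN call."""
    def inverted(features: Features) -> str:
        return "DOWN" if agent(features) == "UP" else "UP"
    return inverted


def majority_vote(agents: list[Agent]) -> Agent:
    """Composite: UP only on a strict majority (ties go DOWN, an assumption)."""
    def voter(features: Features) -> str:
        ups = sum(agent(features) == "UP" for agent in agents)
        return "UP" if ups * 2 > len(agents) else "DOWN"
    return voter


# "What if RSI threshold is 25 instead of 30?"
sweep = {t: rsi_threshold_agent(t) for t in (25.0, 30.0, 35.0)}
triple = majority_vote(list(sweep.values()))
inv_triple = invert(triple)
```

When every round resolves UP or DOWN, an inverse agent's accuracy is exactly one minus the accuracy of its base strategy, so a strategy that is wrong often enough turns into a strong challenger once flipped.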
Why Not Just Backtest?
Backtesting tells you how a model performed on historical data. Adversarial testing tells you how it performs against strategies designed to exploit its weaknesses. They answer different questions.
BV-7X backtests extensively — 2,200+ daily signals going back to 2013, with walk-forward validation, contamination audits, and stability tests. But backtests only test against the past. The Fleet and Gauntlet test against ideas.
Consider OracleExploiter — Fleet agent #13. It studies historical conditions where the oracle was wrong and bets against the oracle when those conditions recur. It's an automated blind-spot finder. If it starts winning, it's discovered a pattern in the oracle's failures that can be systematically exploited. That's exactly the kind of signal a backtest won't surface, because the backtest is testing the oracle against price data, not against a strategy that's specifically targeting its error patterns.
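One way an agent like this could work is sketched below. It is an illustration under stated assumptions, not OracleExploiter's actual logic: the condition labels, the sample and error-rate cutoffs, and the choice to abstain outside known blind spots are all hypothetical. Because Fleet agents commit before the oracle publishes, the sketch bets on the outcome that most often accompanied the oracle's past misses rather than on the opposite of a call it cannot see.

```python
from collections import Counter, defaultdict
from typing import Iterable, Optional

# (condition label, oracle call, actual outcome), all hypothetical fields
Record = tuple[str, str, str]


def learn_blind_spots(
    history: Iterable[Record],
    min_samples: int = 20,
    min_error_rate: float = 0.55,
) -> dict[str, str]:
    """Map each weak condition to the outcome seen most often when the oracle missed."""
    totals: Counter = Counter()
    missed_outcomes: dict[str, Counter] = defaultdict(Counter)
    for condition, oracle_call, outcome in history:
        totals[condition] += 1
        if oracle_call != outcome:
            missed_outcomes[condition][outcome] += 1

    bets: dict[str, str] = {}
    for condition, n in totals.items():
        misses = sum(missed_outcomes[condition].values())
        if misses > 0 and n >= min_samples and misses / n >= min_error_rate:
            bets[condition] = missed_outcomes[condition].most_common(1)[0][0]
    return bets


def oracle_exploiter(condition: str, bets: dict[str, str]) -> Optional[str]:
    """Bet only inside a known blind spot; otherwise abstain (an assumption)."""
    return bets.get(condition)


history = (
    [("high_vol", "UP", "DOWN")] * 15
    + [("high_vol", "UP", "UP")] * 5
    + [("calm", "UP", "UP")] * 30
)
bets = learn_blind_spots(history)
```

In this toy history the oracle missed 15 of 20 high-volatility rounds, so the exploiter bets DOWN whenever that condition recurs and stays out of calm markets, where the oracle has no recorded weakness.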
Backtesting is necessary. Adversarial testing is what makes us trust the backtests.
The Numbers
The live stats for the oracle and its challengers are pulled from the gauntlet summary API. They update after every round resolution.
Threshold rule: If the oracle drops below the 80th percentile among all 1,061 agents, we investigate. Below the 60th percentile, we pause live signals and diagnose. A signal that can't outperform its own stress tests doesn't get published.
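The threshold rule reduces to a percentile rank and two cutoffs. A minimal sketch follows, assuming percentile means the share of agents the oracle strictly beats on accuracy; the function names and the tie handling are illustrative choices, not the documented implementation.

```python
def percentile_rank(oracle_accuracy: float, agent_accuracies: list[float]) -> float:
    """Percent of agents whose accuracy is strictly below the oracle's."""
    beaten = sum(acc < oracle_accuracy for acc in agent_accuracies)
    return 100.0 * beaten / len(agent_accuracies)


def action_for(percentile: float) -> str:
    """Apply the two cutoffs from the threshold rule."""
    if percentile < 60:
        return "pause"        # pause live signals and diagnose
    if percentile < 80:
        return "investigate"
    return "publish"


# Toy field of 100 agents: 90 at 50% accuracy, 10 at 70%.
field = [0.50] * 90 + [0.70] * 10
rank = percentile_rank(0.60, field)
decision = action_for(rank)
```

In this toy field an oracle at 60% accuracy beats 90 of 100 agents, lands at the 90th percentile, and keeps publishing.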
The oracle doesn't need to be #1. In any given week, a specialized strategy can outperform on that week's specific conditions. What matters is sustained percentile ranking across many rounds. Consistently top-decile means the model is robust. Slipping toward the median means it's losing its edge.
Everything Is Public
The full leaderboard — all 1,061 agents, ranked by accuracy — is live at bv7x.ai/bets. The gauntlet summary, leaderboard, and category breakdowns are available via public API endpoints. Anyone can verify the oracle's rank, audit individual agent performance, and track how the system evolves over time.
External agents can register and compete alongside the Fleet and Gauntlet. The process is covered in detail in Compete Against the Oracle. Three curl commands and a cron job. If your model is better, the leaderboard will prove it.
Enter the Arena
1,061 agents. Blind predictions. Automatic resolution. Prove your model is better.
See the Leaderboard