For any active Polymarket market, VIGIL aggregates the positions of every graded wallet in our universe (currently ~600 A/B/C/D-graded traders) into a single probability that represents "what does the calibrated money think about this market?" The number you see is not an average of all traders — it's weighted by each trader's historical forecasting skill, the size of their position, and how recently they took it.
weight_i = grade_weight_i × √(stake_i) × exp(−days_since_entry_i / 30)

P_consensus = Σ(weight_i × impliedP_Yes_i) / Σ(weight_i)
Three weights multiply together:
A trader's grade is a proxy for their calibration skill — how well their stated probabilities match observed outcomes over their full resolved-bet history. Higher grades get more weight:
| Grade | Weight | Meaning |
|---|---|---|
| A | 1.00 | Demonstrated calibration across 100+ resolved bets. ~3% of scored wallets. |
| B | 0.60 | Solid calibration. ~8%. |
| C | 0.25 | Developing. ~20%. |
| D | 0.05 | Below naïve. Included but heavily down-weighted. |
| F | 0.00 | Excluded from consensus. Too unreliable to contribute. |
A trader staking $100,000 knows more than a trader staking $100 — but not 1000× more. We use √(stake) to dampen whale over-influence. A $1M position is 10× (not 100×) the weight of a $10K position.
Beliefs go stale. A position taken 60 days ago reflects the trader's forecast at the time of entry, not today's information. We apply exp(−days/30), so a 30-day-old position counts for about 37% of a fresh one (e⁻¹ ≈ 0.37) and a 60-day-old position for about 14% (e⁻² ≈ 0.14). The weight halves roughly every 21 days (30 × ln 2 ≈ 20.8). Fresh positions dominate.
Each trader's revealed probability is their average entry price, inverted for No-side holders:
Yes-side holder: impliedP_Yes = avgPrice

No-side holder: impliedP_Yes = 1 − avgPrice
The result is clamped to [0.001, 0.999] to avoid log-singularities downstream.
The 95% CI is computed with a 1,000-resample bootstrap. We resample (wallet, position) pairs with replacement, recompute the weighted mean, and take the 2.5 / 97.5 percentiles. Wider CI ⇒ more disagreement among contributors.
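A percentile bootstrap of this kind might look like the sketch below. It assumes each contributor has already been reduced to a `(weight, implied_p_yes)` pair; the function names, the fixed seed, and the simple nearest-rank percentile rule are illustrative choices, not details confirmed by the methodology.

```python
import random


def weighted_mean(pairs):
    """Weighted mean of (weight, probability) pairs; None if total weight is zero."""
    total = sum(w for w, _ in pairs)
    if total == 0:
        return None
    return sum(w * p for w, p in pairs) / total


def bootstrap_ci(pairs, n_resamples=1000, seed=0):
    """Percentile bootstrap: resample pairs with replacement, recompute the mean."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    means = []
    for _ in range(n_resamples):
        sample = rng.choices(pairs, k=len(pairs))  # sampling with replacement
        m = weighted_mean(sample)
        if m is not None:
            means.append(m)
    if not means:
        return None
    means.sort()
    # Nearest-rank style percentiles (an assumption; other rules are possible).
    lo = means[int(0.025 * (len(means) - 1))]
    hi = means[int(0.975 * (len(means) - 1))]
    return lo, hi
```

When contributors hold similar implied probabilities, the resampled means cluster and the interval is narrow; a few heavily weighted dissenters widen it.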
We refuse to publish a consensus if either gate fails:
| Gate | Threshold | Rationale |
|---|---|---|
| Minimum wallets | ≥ 5 | Fewer than 5 holders is noise, not signal. |
| Effective sample size | Σ(grade_weight × decay) ≥ 0.5 | E.g. 5 fresh C-grades (5 × 0.25 = 1.25) clear this; 10 F-grades (10 × 0 = 0) do not. |
| Band | Effective sample | Interpretation |
|---|---|---|
| strong | ≥ 10 | Multiple A/B-grade contributors. Treat seriously. |
| moderate | 5 – 10 | Useful signal; note the CI width. |
| weak | 0.5 – 5 | Light signal. Could flip with one more data point. |
| insufficient | < 0.5 | Not published. |
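The gates and bands can be expressed as a small classifier. The sketch below makes three assumptions the tables leave open: band boundaries are inclusive at the lower end, the wallet count is supplied by the caller, and a market that fails the minimum-wallet gate is reported as `insufficient` as well. The function names are hypothetical.

```python
import math

# Grade weights as listed in the grade table.
GRADE_WEIGHT = {"A": 1.00, "B": 0.60, "C": 0.25, "D": 0.05, "F": 0.00}


def effective_sample_size(contributors):
    """Sum of grade_weight × decay over (grade, days_since_entry) pairs."""
    return sum(GRADE_WEIGHT[g] * math.exp(-d / 30) for g, d in contributors)


def confidence_band(n_wallets, ess):
    """Map wallet count and effective sample size to a band name."""
    # Either failed gate means nothing is published (assumed to share one label).
    if n_wallets < 5 or ess < 0.5:
        return "insufficient"
    if ess >= 10:
        return "strong"
    if ess >= 5:       # assumption: the lower bound of each band is inclusive
        return "moderate"
    return "weak"
```

Five fresh C-grade holders give an effective sample size of 1.25 and land in the `weak` band; the same five holders at 60 days old drop to about 0.17 and fail the gate.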
This is not a price prediction. It is an aggregation of revealed beliefs from traders who have historically been well-calibrated. A divergence between market and consensus is a disagreement, not a trade signal. Skilled traders can be wrong; new information can move the market before it moves consensus.
Consensus also inherits the limits of its inputs: our graded universe is currently ~600 wallets (growing), and the grade itself is an estimate with its own confidence interval. See the scoring methodology for how grades are computed.
The "days since entry" field is a heuristic: Polymarket's position endpoint doesn't expose first-entry timestamps, so we approximate it from the market-close date. v1.22 will cross-reference the trade log to recover exact entry timestamps.
Positions that have been fully exited aren't counted; only current holdings are. If an A-grade trader bought at 0.30 and sold at 0.55 two days ago, their conviction is gone from our aggregate.
GET /v1/polymarket/markets/:marketId/skill-consensus
Accepts a Polymarket conditionId (0x…) or a slug (will-trump-win-2024). Returns the full breakdown including top contributors. 5-minute TTL cache.
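A minimal client sketch for this endpoint, using only the Python standard library. The base URL below is a placeholder, since the API host is not stated here, and the response is returned as parsed JSON without assuming any particular field names.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Placeholder host: substitute the real API base URL.
BASE_URL = "https://api.example.com"


def skill_consensus_url(market_id, base_url=BASE_URL):
    """Build the endpoint URL for a conditionId or a market slug."""
    return f"{base_url}/v1/polymarket/markets/{quote(market_id, safe='')}/skill-consensus"


def fetch_skill_consensus(market_id, base_url=BASE_URL, timeout=10.0):
    """GET the consensus breakdown and return the decoded JSON body."""
    with urlopen(skill_consensus_url(market_id, base_url), timeout=timeout) as resp:
        return json.load(resp)
```

Because responses are cached for 5 minutes, polling the same market more often than that is unlikely to return fresher data.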