// v1.20.2 — fixed receive-only wallet detection after @0xc4rson found a false positive. Credit to them. full changelog →
VIGIL Forecasting Skill Engine
Polymarket · Live model v1.20.2 · scanning now · no VCs

PnL lies. Calibration doesn't.

VIGIL grades every Polymarket wallet A through F on actual forecasting skill — Brier Skill Score against the market's implied probability, not dollars earned. Paste any wallet. See the grade render below in about two seconds. We've graded over 23,000 wallets and about half of the self-declared "whales" on this platform aren't actually sharp.

// paste-and-grade · GET /v1/polymarket/score
~2s response. No sign-up. Free. Every grade is a shareable URL.
Wallets Scanned: 23,659 (as of today · live)
On Leaderboard: ~500 (min. 50 resolved bets)
Grade Distribution: A · B · C · D · F
95% CI at 500+ bets: ±2.3 (10k bootstrap resamples)
RECENT
A · influenz.eth · 95 · 2m ago
C · 0x8a4c…d1f2 · 68 · 3m ago
B · mesterton · 81 · 4m ago
D · 0x22fe…aa03 · 42 · 5m ago
A · products · 79 · 6m ago
F · 0xbotcluster1 · 18 · 7m ago
C · 123987456 · 84 · 8m ago
B · degenpredict · 74 · 9m ago
D · 0xf00d…beef · 49 · 10m ago
A · silentsharp.eth · 91 · 11m ago
// how it works

Six dimensions. One grade. Every grade backed by onchain evidence.

Each wallet is scored across six weighted dimensions of forecasting skill. Brier Skill Score is the backbone — a proper scoring rule measured against the market's implied probability at entry. Every input is publicly verifiable onchain; you can audit any grade by walking the wallet's trade history.

25%
Calibration
Were your stated odds close to reality?
Brier Skill Score vs market-implied probability, mapped non-linearly. BSS > 0.10 elite. BSS > 0.25 world-class.
20%
Profitability
Did calibration actually print?
ROI-scaled, not raw dollars. +30% ROI caps the score. Can't farm with size alone.
20%
Live Edge
Are your open positions priced better than market?
Position-weighted delta between your entry price and current mark-to-market probability.
15%
Consistency
Is the edge repeatable or one lucky week?
Inter-quartile range of bet returns. Rewards steady calibration. Doesn't punish winners.
10%
Discipline
Do you diversify or all-in on one binary?
Diversification across markets × categories × time. Concentration penalty for single-binary farmers.
10%
Sample Size
Is the grade earned or hallucinated?
Tiered bonuses at 100 / 250 / 500+ resolved bets with positive BSS. Under 50 caps at C.
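As a rough illustration of how the six weights could combine, the sketch below treats each dimension as a 0-to-1 sub-score and scales it by the weight listed above, so the dimension maxima add up to 100 points. The `WEIGHTS` table and the `composite_score` helper are hypothetical names, and caps, penalties, and tier bonuses described elsewhere on this page are not modeled.

```python
from typing import Mapping

# Weights as published above (percent of the total score).
WEIGHTS = {
    "calibration": 25,
    "profitability": 20,
    "liveEdge": 20,
    "consistency": 15,
    "discipline": 10,
    "sampleSize": 10,
}


def composite_score(sub_scores: Mapping[str, float]) -> float:
    """Return a 0..100 total from per-dimension sub-scores in [0, 1].

    Assumption: each dimension contributes weight * sub_score points.
    """
    total = 0.0
    for name, weight in WEIGHTS.items():
        # Missing dimensions count as zero; out-of-range inputs are clamped.
        fraction = min(1.0, max(0.0, sub_scores.get(name, 0.0)))
        total += weight * fraction
    return total
```

A wallet that maxes every dimension lands on 100, and one that scores half on each lands on 50.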
The one-page methodology

Brier Skill Score — the reference

BSS = 1 − (BrierForecast / BrierReference), where BrierForecast is the Brier score of your forecasts (mean squared error against the 0/1 outcome) and BrierReference is the Brier score of the market-implied probability at your entry. A positive BSS means your forecast beat the consensus; a negative one means you underperformed it. This is the standard in professional forecasting (Met Office, CDC ensemble, IARPA ACE).
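A minimal sketch of that formula, assuming the wallet's forecast probability for each bet is already known. How VIGIL derives that probability from a trade is not specified here, so `forecast` is simply an input, and the function names are illustrative.

```python
from typing import Sequence


def brier(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(
    forecast: Sequence[float],
    market: Sequence[float],
    outcomes: Sequence[int],
) -> float:
    """BSS = 1 - Brier(forecast) / Brier(market-implied probability)."""
    reference = brier(market, outcomes)
    if reference == 0:
        raise ValueError("reference Brier score is zero; BSS is undefined")
    return 1.0 - brier(forecast, outcomes) / reference


# Toy data (made up for illustration): five resolved binary markets.
outcomes = [1, 0, 1, 1, 0]
market = [0.6, 0.4, 0.5, 0.7, 0.3]    # market-implied probability at entry
forecast = [0.8, 0.2, 0.7, 0.9, 0.1]  # hypothetical wallet forecast

print(round(brier_skill_score(forecast, market, outcomes), 3))
print(brier_skill_score(market, market, outcomes))
```

The first call is positive because the toy forecast is sharper than the market in the right direction. The second is exactly zero: a forecast that only repeats the market price earns no skill, which is why sitting on favorites does not move the score.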

Why non-linear mapping?

A linear BSS→score would reward a BSS of 0.02 almost as much as 0.20. But the jump from 0 to 0.10 is roughly the difference between "noise" and "calibrated trader." We use a logistic curve that stays flat near zero and steepens on the way to 0.10: small BSS deltas near zero matter less, and BSS > 0.10 gets meaningful lift. See the curve below.

Validation

18 months of resolved-market data as training; final 90 days as holdout. Grade-to-out-of-sample-BSS correlation ρ ≈ 0.71 on the holdout. Weights ridge-regularized against rank stability. Full train/test notebook linked below.

Proven-winner tiers

A wallet with 40 resolved bets and BSS +0.15 is probably calibrated. A wallet with 500 resolved bets and BSS +0.15 is almost certainly calibrated. Tiered bonuses reward sustained evidence: +3 at 100, +5 at 250, +8 at 500.
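The tier rule can be sketched in a few lines. This assumes the tiers do not stack (only the highest tier reached applies) and that a positive BSS is the only precondition; `proven_winner_bonus` is a hypothetical name, not VIGIL's function.

```python
def proven_winner_bonus(resolved_bets: int, bss: float) -> int:
    """Bonus points for sustained evidence: +3 at 100, +5 at 250, +8 at 500.

    Assumption: only the highest tier reached counts, and the bonus
    requires a positive Brier Skill Score.
    """
    if bss <= 0:
        return 0
    # Check the largest threshold first so the highest tier wins.
    for threshold, bonus in ((500, 8), (250, 5), (100, 3)):
        if resolved_bets >= threshold:
            return bonus
    return 0
```

Under these assumptions a 40-bet wallet with BSS +0.15 earns no bonus, while the same BSS over 500 bets earns +8.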

Uncertainty bands

Every grade ships with a 95% confidence interval, bootstrapped from the wallet's resolved-bet record (10k resamples). Under 100 bets the CI is wide enough that we surface it prominently. Over 500, it tightens to ±2–3 points.
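A percentile bootstrap along these lines can be sketched as follows. It assumes a hypothetical `score_fn` that turns a list of resolved bets into a single number; VIGIL's actual scoring function and interval method are not shown on this page, so this is only an illustration of the resampling idea.

```python
import random
from typing import Callable, Sequence, Tuple, TypeVar

Bet = TypeVar("Bet")


def bootstrap_ci(
    bets: Sequence[Bet],
    score_fn: Callable[[Sequence[Bet]], float],
    n_resamples: int = 10_000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for score_fn over resampled bets."""
    if not bets:
        raise ValueError("need at least one resolved bet")
    rng = random.Random(seed)
    n = len(bets)
    # Resample the bet record with replacement and rescore each replicate.
    stats = sorted(
        score_fn(rng.choices(bets, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return stats[lo_idx], stats[hi_idx]
```

Calling `bootstrap_ci(bets, score_fn)` with the default 10,000 resamples returns the lower and upper bounds of a 95% interval; fewer bets give a wider spread between them.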

Validation notebook: vigil-v1.20.2-validation.ipynb · 18mo train / 90d test · ρ=0.71 on holdout
open on GitHub →

// the curve

How Brier Skill Score becomes a letter.

The mapping from BSS to the calibration component of the VIGIL score. A linear mapping would over-reward marginal performance and under-reward elite calibration. Our logistic curve stays low near zero and climbs steeply toward the elite range: being elite matters, being mediocre doesn't pretend to be fine.

// BSS → Calibration Component (25 pts max)
x-axis: Brier Skill Score (-0.2 to 0.3) · y-axis: calibration points (0 to 25)
Labeled points: BSS 0 · ~5 pts; BSS 0.10 · ~21 pts (elite); BSS 0.25 · 25 pts (world-class). A linear reference line is drawn for comparison.
BSS < 0: Worse than the consensus. Probably a copy trader or a gambler.
BSS 0 – 0.05: Roughly at market. ~5 calibration points.
BSS 0.05 – 0.10: Calibrated trader. Curve steepens here on purpose.
BSS > 0.10: Elite. BSS > 0.25 is world-class and very rare in the sample.
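For intuition, here is a two-parameter logistic fitted to the points labeled on the chart (about 5 points at BSS 0 and about 21 at BSS 0.10, with a 25-point ceiling). The steepness and midpoint are derived from those two anchors and are not VIGIL's published parameters; the real curve may differ between the anchors.

```python
import math
from typing import Tuple

MAX_POINTS = 25.0  # ceiling of the calibration component


def _logit(points: float) -> float:
    """Inverse of the logistic for a curve that saturates at MAX_POINTS."""
    return math.log(points / (MAX_POINTS - points))


def fit_logistic(
    anchor_a: Tuple[float, float], anchor_b: Tuple[float, float]
) -> Tuple[float, float]:
    """Solve steepness k and midpoint x0 from two (bss, points) anchors."""
    (x1, y1), (x2, y2) = anchor_a, anchor_b
    k = (_logit(y2) - _logit(y1)) / (x2 - x1)
    x0 = x1 - _logit(y1) / k
    return k, x0


# Anchors read off the chart labels; treat them as approximate.
K, X0 = fit_logistic((0.0, 5.0), (0.10, 21.0))


def calibration_points(bss: float) -> float:
    """Map a Brier Skill Score to 0..25 calibration points."""
    z = -K * (bss - X0)
    if z > 700:  # avoid math.exp overflow for extremely negative BSS
        return 0.0
    return MAX_POINTS / (1.0 + math.exp(z))
```

With this fit the curve reproduces both anchors and is within a tenth of a point of the 25-point ceiling at BSS 0.25, which matches the third label on the chart.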

// receipts

Real wallets. Real grades. Same data you can audit.

Three live examples. PnL alone won't tell you which one to copy. The grades do. Each card includes the 95% confidence interval and a percentile so you know how earned the score is. Numbers from today's crawl — the live ticker above confirms freshness.

influenz.eth · A · 95 / 100 (SHARP) · top 3%
PnL $1,000,000+ · BSS +0.22 · Resolved 612 · ROI +28% · Discipline High
95% CI: [92.8 – 96.4] · sample: 612 bets

products · B · 79 / 100 (SOLID) · top 22%
PnL $700,000 · BSS +0.14 · Resolved 418 · ROI +18% · Discipline Med
95% CI: [75.1 – 82.6] · sample: 418 bets

123987456 · A · 84 / 100 (SHARP) · top 11%
PnL $121,000 · BSS +0.19 · Resolved 263 · ROI +22% · Discipline High
95% CI: [78.2 – 89.4] · sample: 263 bets

"But this wallet has $400K PnL and only scored a C."
Common cause: heavy concentration on 2–3 lucky binaries, or a penny-lottery pattern. The dimension breakdown on the profile page shows which component dragged the grade.
"Can't you game BSS on lopsided markets?"
No. BSS is measured against market-implied probability at your entry. Sitting on 95% favorites doesn't beat the market — it matches it. Skill score ≈ 0.
"What about bots and wash trading?"
Penalized. Bots lose calibration on random trades. Wash-trade patterns trip discipline + receive-only checks. Known bot clusters grade F across the board.

// the landscape

Where VIGIL sits vs. everything else.

Four ways people size up Polymarket traders today. Only one measures forecasting skill.

Compared: VIGIL · Polymarket leaderboard · Nansen / Arkham · Self-reported tweets
Measures forecasting skill (not PnL): ✗ PnL only · ✗ onchain flow · ✗ vibes
Brier Skill Score backbone
Confidence intervals on every grade: ✓ (95% CI, bootstrap)
Catches penny-lottery + bot patterns: ✗ bots can top it · partial
Free · no sign-up · free API tier: ✗ $150+/mo
Chrome extension injects inline badges
Open source · verifiable onchain: ✓ MIT · ✗ black box
Time-decays stale activity: ✓ rolling 90d weight · ✗ lifetime PnL · partial
Published validation notebook: ✓ ρ=0.71 holdout
* Polymarket leaderboard is ranked by PnL — a useful dashboard, not a skill signal. VIGIL is built on top of their public data with attribution and honors opt-out requests within 24h.

// what the grade actually means

A D-grade wallet can still be up $500K.

PnL measures outcomes. The grade measures the quality of the forecast. A lucky gambler with huge PnL and a penny-lottery pattern gets a D. A small, calibrated, high-BSS wallet with modest PnL gets an A.

// WHAT WE PENALIZE

  • Penny-lottery spraying (80%+ sub-$0.10 bets with negative BSS → hard cap at D/49)
  • Receive-only wallets with no outbound transactions
  • High-concentration, all-in-on-one-binary patterns
  • Negative Brier Skill Score regardless of bottom-line PnL
  • Under-sampling (<50 resolved bets caps at C until proven)
  • Stale activity — idle wallets time-decay after 90 days
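The penny-lottery rule in the first bullet is concrete enough to sketch. This assumes "sub-$0.10 bets" means bets whose entry price per share is below $0.10 and that the 80% share is counted by number of bets rather than by dollars; the helper name is hypothetical.

```python
from typing import Sequence

PENNY_PRICE = 0.10   # entry price threshold in dollars per share
PENNY_SHARE = 0.80   # fraction of bets that must be under the threshold
PENNY_CAP = 49       # hard cap on the score (grade D)


def apply_penny_lottery_cap(
    score: float, entry_prices: Sequence[float], bss: float
) -> float:
    """Cap the score at 49 for penny-lottery wallets with negative BSS."""
    if not entry_prices:
        return score
    penny_bets = sum(1 for price in entry_prices if price < PENNY_PRICE)
    penny_share = penny_bets / len(entry_prices)
    # Both conditions must hold: mostly penny bets AND negative skill.
    if penny_share >= PENNY_SHARE and bss < 0:
        return min(score, PENNY_CAP)
    return score
```

A wallet with 90% penny bets and a negative BSS drops from 72 to 49, while the same betting pattern with a positive BSS keeps its 72.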

// WHAT WE REWARD

  • Positive Brier Skill Score across 100+ resolved markets
  • Calibrated entries near the eventual resolved probability
  • Stable IQR — calibrated wins that repeat across eras
  • Market diversification across categories and time horizons
  • Cross-category transfer (politics + sports + crypto + news)
  • Proven-winner bonus at 500+ resolved bets with positive BSS and PnL (+8 pts)

// developers

Free JSON API. One endpoint to learn.

VIGIL's scoring engine is the product. The landing page is a skin over it. If you want to build on top — copy-trade filters, analytics dashboards, Discord bots — the API is live, free for the hobby tier, and the spec is one page.

$ curl -s "https://vigilscore.xyz/v1/polymarket/score?wallet=influenz.eth"
# response (truncated)
{
  "wallet": "influenz.eth",
  "grade": "A",
  "score": 95,
  "percentile": 97,
  "bss": 0.22,
  "resolved": 612,
  "ci95": [92.8, 96.4],
  "dimensions": {
    "calibration": 24.1,
    "profitability": 18.6,
    "liveEdge": 16.4,
    "consistency": 13.1,
    "discipline": 9.3,
    "sampleSize": 10.0
  },
  "class": "SHARP",
  "onchainRefs": ["0x…", …]
}
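The same lookup from Python, as a minimal sketch using only the standard library. It relies on the endpoint and the response fields shown in the truncated sample above; the helper names are illustrative, and fields beyond those in the sample are not assumed.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://vigilscore.xyz/v1/polymarket/score"


def score_url(wallet: str) -> str:
    """Build the lookup URL, URL-encoding the wallet parameter."""
    return f"{BASE_URL}?{urllib.parse.urlencode({'wallet': wallet})}"


def fetch_score(wallet: str, timeout: float = 10.0) -> dict:
    """GET the score for one wallet and decode the JSON body."""
    with urllib.request.urlopen(score_url(wallet), timeout=timeout) as resp:
        return json.load(resp)


def summarize(payload: dict) -> str:
    """One-line summary built from fields present in the sample response."""
    lo, hi = payload["ci95"]
    return (
        f'{payload["wallet"]}: {payload["grade"]} '
        f'({payload["score"]}/100, 95% CI {lo}-{hi}, '
        f'BSS {payload["bss"]:+.2f}, {payload["resolved"]} resolved)'
    )
```

Calling `summarize(fetch_score("influenz.eth"))` would print a one-line grade summary; the network call is kept in its own function so the formatting can be exercised offline.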

// API Tiers

Hobby | 60 req/min · score lookups · leaderboard | $0
Pro | 600 req/min · webhooks · tier-change alerts | $29/mo
Enterprise | unlimited · custom dimensions · SLA · feed | hit me up

No auth on hobby. Rate-limited by IP. If you need more, email gatson32@gmail.com and tell me what you're building — I'll turn the spigot up.
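To stay under the hobby limit from a script, a client can space its own requests. The sketch below is a simple client-side throttle; how the server counts its 60 req/min window is not documented here, so evenly spacing calls is an assumption that should be safe for any window shape. The `Throttle` class is hypothetical.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Space calls at least 60 / max_per_minute seconds apart."""

    def __init__(
        self,
        max_per_minute: int = 60,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None

    def wait(self) -> None:
        """Block until the next request is allowed, then reserve a slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._min_interval
```

Call `throttle.wait()` before each request. The clock and sleep functions are injectable so the pacing logic can be checked without real delays.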


// coming soon

Single-player today. Social tomorrow.

Grading one wallet at a time is step one. Grading a group of wallets — your crew, a Discord, a copy-trade list — is where this gets fun.


// chrome extension

Trust badges on every Polymarket profile.

Install once. Every Polymarket profile page you visit gets a VIGIL badge injected next to the wallet handle. Works silently. Free. Open source. Zero tracking.

● Submitted · rolling out this week

Before a trade, before a copy, before a tweet — know who you're looking at. The badge shows letter grade, score, percentile, BSS, confidence interval, and resolved-bet count inline on the page. Manifest V3. Zero tracking. Site-scoped to polymarket.com.

Install on Chrome · GitHub repo →
VIGIL BADGE · AS RENDERED
influenz.eth · A
Score: 95 / 100 ±1.8
Percentile: top 3%
BSS: +0.22
Resolved: 612
Class: SHARP

// who built this

Solo build. Public wallet. No investors.

In a trust product, the people behind it matter more than the logo. VIGIL is built by one person, in public, over the last six months. You can read every commit, fork the model, or tell me I'm wrong on X.


Chris Gatson

Solo builder · forecasting nerd · recovering quant · shipping in public

I started VIGIL because I kept watching Polymarket "whales" get copied on X while quietly losing money on a calibration-adjusted basis. PnL rewards size. VIGIL rewards being right. If you find a bug in the model, email me and I'll credit you in the next release notes. See the changelog at the top of this page for the first paid-in-credit bug fix.

182 commits · last 6 months

// frequently objected

Harder questions.

Quick objections live near the proof cards above. These are the ones that require a paragraph.

How did you validate the weights (25/20/20/15/10/10)?
Held out the final 90 days of resolved markets as an out-of-sample set, trained the weights on the remaining 18 months, and the grade-to-out-of-sample-BSS correlation stayed at ρ ≈ 0.71 on the holdout. Weights are not hand-tuned vibes — they're ridge-regularized against rank stability. Full train/test split and cross-validation notes in the notebook. Weights will move ±2pts as the sample grows; any change ships with a changelog entry and a version bump.
Why not just use Sharpe or Sortino?
Sharpe is returns over volatility. Sortino is returns over downside volatility. Both measure the outcome of trading, not the quality of the forecast. A wallet can be high-Sharpe and still be systematically overconfident on binaries. BSS measures "did your stated probability match reality?" — which is what you actually want to know before copying someone's trade. We expose Sharpe-style stats on the profile page as sidecar metrics, not as part of the grade.
Isn't this going to get cease-and-desist'd by Polymarket?
We use only publicly observable onchain data and Polymarket's public API, with attribution. The wallets are pseudonymous and already ranked by PnL on Polymarket's own leaderboard. Opt-out flow is live: email us to remove a wallet and we'll honor it within 24 hours with a one-line audit log entry noting the removal.
What's the moat? Anyone can recompute BSS.
Three layers. (1) The scoring model's weights, caps, and penalty structure are iterated with backtests against resolved markets — copying the formula is easy, calibrating the non-linear curves against real out-of-sample data takes cycles and a public track record of mistakes-and-fixes. (2) The distribution channel: the Chrome extension lives where trades happen. (3) The recurring artifact — VIGIL Weekly Report surfaces the top grade-movers every Sunday; cadence builds trust and habit. None are diamond-hard moats individually; together they compound.
Does a wallet sharp on politics get credit on sports markets?
Partially. Cross-category BSS transfer is real but imperfect — a wallet with +0.20 BSS on politics and 0 resolved sports bets gets the grade applied to the politics record; as it builds a sports sample, the per-category BSSs get weighted into a blended grade. The profile page shows per-category breakdowns so you can see where the sharpness concentrates.
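One plausible way to blend per-category scores is a mean weighted by resolved-bet count, sketched below. The actual weighting VIGIL uses is not stated on this page, so both the scheme and the `blended_bss` name are assumptions for illustration only.

```python
from typing import Mapping, Tuple


def blended_bss(per_category: Mapping[str, Tuple[float, int]]) -> float:
    """Blend category BSS values, weighting each by its resolved-bet count.

    per_category maps a category name to (bss, resolved_bets).
    Assumption: weights are proportional to sample size.
    """
    total_bets = sum(count for _, count in per_category.values())
    if total_bets == 0:
        raise ValueError("no resolved bets in any category")
    weighted = sum(bss * count for bss, count in per_category.values())
    return weighted / total_bets
```

With only a politics record the blend equals the politics BSS, and it shifts toward the sports BSS as the sports sample grows.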
How fresh is the data?
The discovery crawler scans ~500 resolved markets every 2 hours and scores ~500 wallets per cycle. Hot wallets on the prescore list refresh hourly. Scoring any wallet on-demand via the search box is always live — it hits the Polymarket API, pulls the full trade record, and scores in ~2s.
Is VIGIL financial advice?
No. VIGIL is a forecasting-skill metric. We don't recommend trades, wallets to copy, or market positions. Treat it as information, not advice. Jurisdictions and market types vary; do your own due diligence.