Methodology · sgdi-1.0.0

How SGDI is computed

The score is reproducible from on-chain data and a small set of public APIs. If you can't reproduce it from this page, that's a bug — please file an issue.

What we're measuring

A simple question: does this pool delegate to places that need stake, or does it pile more onto already-popular validators?

A pool whose stake sits in the same handful of overweight cities and ASNs as everyone else isn't improving network decentralisation, regardless of how internally diverse its own delegations look. A pool that finds underweight regions — Hong Kong, São Paulo, Manila, Lagos — is doing the actual work of decentralising the network.

The formula

For each validator v in a pool with stake fraction wᵥ, we compute its rarity on three dimensions — country, city, and autonomous system number (ASN):

rarity_D(v)  =  -ln( network_share_D(category of v) )       D ∈ {country, city, ASN}

network_share_D(category) is the fraction of total network stake currently delegated to validators in that category. A validator in NYC (where ~8% of network stake sits) has a low rarity; a validator in Manila (~0.1%) has a high rarity.
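
As a minimal sketch of the rarity step, the snippet below derives network shares from a list of validators and takes the negative log. The record layout (dicts with "city" and "stake" keys), the function names, and the three-validator toy network are assumptions for illustration; they are not the actual ingest schema.

```python
import math
from collections import defaultdict


def network_shares(validators, dimension):
    """Fraction of total network stake held by each category on one dimension."""
    totals = defaultdict(float)
    for v in validators:
        totals[v[dimension]] += v["stake"]
    network_total = sum(totals.values())
    return {category: stake / network_total for category, stake in totals.items()}


def rarity(share):
    """rarity_D(v) = -ln(network_share_D(category of v))."""
    return -math.log(share)


# Hypothetical toy network: stakes chosen so NYC holds 8% and Manila 0.1%.
validators = [
    {"city": "NYC", "stake": 80.0},
    {"city": "Frankfurt", "stake": 919.0},
    {"city": "Manila", "stake": 1.0},
]

shares = network_shares(validators, "city")
nyc_rarity = rarity(shares["NYC"])        # -ln(0.08), roughly 2.53
manila_rarity = rarity(shares["Manila"])  # -ln(0.001), roughly 6.91
```

Because the log is taken of a share between 0 and 1, rarity is never negative, and halving a category's share always adds the same amount (ln 2) to its rarity.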

The pool's Decentralisation Contribution on each dimension is the stake-weighted average rarity of its validators:

DC_D  =  Σᵥ wᵥ · rarity_D(v)
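
The stake-weighted average can be sketched as follows. The pool layout, the function name, and the rounded rarity values are hypothetical; wᵥ is taken to be each validator's stake divided by the pool's total stake.

```python
def decentralisation_contribution(pool, rarity_by_category, dimension):
    """DC_D = sum over validators of w_v * rarity_D(v)."""
    pool_total = sum(v["stake"] for v in pool)
    return sum(
        (v["stake"] / pool_total) * rarity_by_category[v[dimension]]
        for v in pool
    )


# Hypothetical inputs: rounded city rarities and a two-validator pool.
rarity_city = {"NYC": 2.53, "Manila": 6.91}
pool = [
    {"city": "NYC", "stake": 75.0},
    {"city": "Manila", "stake": 25.0},
]

dc_city = decentralisation_contribution(pool, rarity_city, "city")
# 0.75 * 2.53 + 0.25 * 6.91 = 3.625
```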

The composite GDI is the geometric mean of the three:

GDI  =  ( DC_country · DC_city · DC_asn )^(1/3)

The geometric mean penalises a pool that is strong on one dimension and weak on another, because the three dimensions are distinct decentralisation risk classes. A pool that is geographically diverse but runs every validator on AWS still has a single-ASN failure mode.
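
A small sketch of the composite, comparing two pools with invented DC values that share the same arithmetic mean. The function name and the numbers are assumptions for illustration only.

```python
def composite_gdi(dc_country, dc_city, dc_asn):
    """GDI = (DC_country * DC_city * DC_asn) ** (1/3)."""
    return (dc_country * dc_city * dc_asn) ** (1.0 / 3.0)


# Both hypothetical pools have an arithmetic mean of 3.0 across dimensions.
balanced = composite_gdi(3.0, 3.0, 3.0)   # 3.0
lopsided = composite_gdi(5.0, 3.5, 0.5)   # cube root of 8.75, a little above 2.0
```

The lopsided pool loses roughly a third of its score to the weak ASN dimension, which an arithmetic mean would have hidden.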

A secondary signal, the Network Impact Score:

NIS  =  Σᵥ wᵥ · stakewiz_wiz_score(v)

NIS captures whether a pool delegates to validators that improve the network's overall health (as scored by Stakewiz). A pool can be geographically well-distributed and still delegate to under-performing validators; NIS surfaces that.
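
NIS uses the same stake weighting as DC, with the Stakewiz score in place of rarity. In this sketch the "wiz_score" key, the function name, and the score values are hypothetical placeholders rather than real Stakewiz output.

```python
def network_impact_score(pool):
    """NIS = sum over validators of w_v * wiz_score(v)."""
    pool_total = sum(v["stake"] for v in pool)
    return sum((v["stake"] / pool_total) * v["wiz_score"] for v in pool)


# Hypothetical pool: 60% of stake on a well-scored validator, 40% on a weaker one.
pool = [
    {"stake": 60.0, "wiz_score": 90.0},
    {"stake": 40.0, "wiz_score": 50.0},
]

nis = network_impact_score(pool)  # 0.6 * 90 + 0.4 * 50 = 74.0
```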

Why three dimensions?

The obvious objection: country and city are correlated. If you know a validator is in Frankfurt, you know it's in Germany. So isn't the country dimension redundant once city is in the formula?

Correlated, yes. Redundant, no. Consider a pool with five validators in LA, San Francisco, NYC, Chicago, and Dallas — five different US cities on five different ASNs. On a city-and-ASN-only composite that pool looks well-decentralised. With the country dimension included, the same pool scores poorly on country (effectively one bucket: US), and the geometric mean drags the composite down. That's the right answer: single-jurisdiction concentration is a real risk class, distinct from physical-location and network-operator risk.
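
To make the five-US-cities case concrete, here is a rough sketch with invented network shares: each city at 1%, each ASN at 2%, the US at 25%, and a comparison pool whose countries each hold 3%. None of these figures come from real network data, and the helper names are assumptions.

```python
import math


def rarity(share):
    return -math.log(share)


def geo_mean(values):
    return math.prod(values) ** (1.0 / len(values))


# With equal stake per validator and identical shares, each DC equals
# the rarity of a single validator on that dimension.
dc_city = rarity(0.01)            # roughly 4.61
dc_asn = rarity(0.02)             # roughly 3.91
dc_country_us = rarity(0.25)      # roughly 1.39, all five validators in the US
dc_country_spread = rarity(0.03)  # roughly 3.51, five less-crowded countries

city_asn_only = geo_mean([dc_city, dc_asn])
us_pool = geo_mean([dc_country_us, dc_city, dc_asn])
spread_pool = geo_mean([dc_country_spread, dc_city, dc_asn])
```

Under these assumed shares the US-only pool lands well below both the two-dimension composite and the multi-country pool, even though its city and ASN contributions are identical.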

The three failure modes are distinct, even though the underlying categories are correlated:

Country — regulatory action, sanctions, jurisdiction-specific rule changes (e.g. China's 2021 crypto crackdown took roughly half of Bitcoin hashrate offline within weeks).
City — power outage, datacenter incident, regional fiber cut, weather event.
ASN — cloud-provider outage, BGP misconfiguration, network-operator-level disruption.

The geometric mean weights the three dimensions equally, with no domain-expert opinion baked in about which risk class matters most. A pool concentrated on any single dimension gets pulled down by the geometric mean regardless of how diverse it looks on the other two. This is the intended behaviour: a pool with all its stake on AWS (a single-ASN failure mode) should not be able to claim a high score just because its cities and countries are diverse.

Network baseline — how to read it

Applying the same formula to the entire active validator set gives the network baseline GDI — by construction, the network's own stake-weighted average rarity. A pool whose GDI is above the baseline is preferentially delegating to less-popular places than the network average — directly reducing concentration. A pool whose GDI is below the baseline is reinforcing already-popular spots — concentrating the network further.
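
One way to read the baseline: when the "pool" is the whole network, each category's weight equals its own share, so the per-dimension baseline is the sum of share × (-ln share), which is the Shannon entropy (in nats) of the stake distribution on that dimension. The sketch below uses an invented three-category network; the function name is an assumption.

```python
import math


def baseline_dc(shares):
    """Network-wide DC on one dimension: sum of share * -ln(share)."""
    return sum(s * -math.log(s) for s in shares.values() if s > 0)


# Hypothetical network with three categories on one dimension.
shares = {"A": 0.5, "B": 0.25, "C": 0.25}

baseline = baseline_dc(shares)            # 1.5 * ln 2, roughly 1.04
pool_all_in_c = -math.log(shares["C"])    # roughly 1.39, above baseline
pool_all_in_a = -math.log(shares["A"])    # roughly 0.69, below baseline
```

A pool delegating only to the crowded category A scores below the baseline on this dimension, while one delegating only to C scores above it, matching the reading given above.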

This is the metric's honest claim: it isolates which pools are contributing to decentralisation versus exacerbating concentration, regardless of size or yield.

Data sources

Source	Provides	Trust
Helius RPC	Pool → validator → stake mapping (current epoch)	Authoritative (on-chain)
Stakewiz	IP-derived country / city / ASN; activated stake; wiz_score	Primary for location + network shares
Validators.app	Cross-reference for validator metadata	Fallback / disagreement check

Concentration: computed vs reported

Stakewiz publishes its own per-validator city_concentration and asn_concentration fields. SGDI does not use these directly for scoring. Instead, we compute network shares ourselves from the raw activated_stake + IP fields, so the math is fully reproducible from public inputs and all three dimensions (country, city, ASN) are handled the same way. Stakewiz doesn't expose a country-concentration field.

We do store Stakewiz's reported concentration values alongside our own computed shares as a sanity check. A side-by-side comparison for the top buckets is published at /gdi/concentration-crosscheck.json on each ingest. Wide divergence between our numbers and Stakewiz's would be a red flag; if you spot one, file an issue.
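
A divergence check along these lines could look like the sketch below. It assumes both sets of values have already been put on the same scale (fractions of network stake); the function name, the 0.01 tolerance, and the sample numbers are all hypothetical and do not describe the actual crosscheck file format.

```python
def crosscheck(computed, reported, tolerance=0.01):
    """Return buckets whose computed and reported shares differ by more than tolerance."""
    flagged = {}
    for bucket, ours in computed.items():
        theirs = reported.get(bucket)
        if theirs is None:
            continue  # nothing to compare against for this bucket
        if abs(ours - theirs) > tolerance:
            flagged[bucket] = (ours, theirs)
    return flagged


# Invented shares for three city buckets.
computed = {"Frankfurt": 0.120, "NYC": 0.080, "Amsterdam": 0.070}
reported = {"Frankfurt": 0.121, "NYC": 0.079, "Amsterdam": 0.045}

flagged = crosscheck(computed, reported)  # only Amsterdam exceeds the tolerance
```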

Limitations (read these before quoting a score)

IP-derived geography is imperfect. Cloud-provider IPs occasionally place a chunk of stake in tiny countries (e.g. Andorra) that have no real node presence. This shifts absolute rarity numbers by a few percent but doesn't change the relative ranking of pools, because every pool is scored against the same data.
Stake within an epoch is fixed. Solana stake delegations only activate at epoch boundaries, so per-epoch resolution is the natural cadence. Don't expect intra-epoch updates.
Per-epoch numbers are noisy. Pool rebalancing causes legitimate single-epoch swings of several percent. The 5-epoch and 10-epoch rolling averages are the trustworthy signal.
A pool with one validator in a rare place can score very high. The leaderboard surfaces validator-count alongside the score so small pools are visually distinct, and we focus the leaderboard on top-25-by-TVL pools (which all have multiple validators).
Placement coverage. Each pool's row reports the fraction of its stake we could place geographically (typically 100%; lower if a validator's metadata is unavailable from both Stakewiz and Validators.app). Stake we can't place is excluded from the rarity calculation so it neither helps nor hurts the pool's score.
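
Two of the points above, rolling averages and placement coverage, can be sketched as follows. This assumes a simple trailing mean and treats a validator as unplaced when its "city" value is None; the actual smoothing method and metadata schema may differ, and all names and numbers are invented.

```python
def rolling_mean(scores, window):
    """Trailing mean over the last `window` epochs; None until enough history exists."""
    out = []
    for i in range(len(scores)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(scores[i + 1 - window : i + 1]) / window)
    return out


def placement_coverage(pool):
    """Fraction of pool stake that could be placed geographically."""
    total = sum(v["stake"] for v in pool)
    placed = sum(v["stake"] for v in pool if v["city"] is not None)
    return placed / total


def placed_weights(pool):
    """Stake weights renormalised over placed validators only."""
    placed = [v for v in pool if v["city"] is not None]
    placed_total = sum(v["stake"] for v in placed)
    return [v["stake"] / placed_total for v in placed]


# Hypothetical per-epoch scores and a pool with one unplaceable validator.
epoch_scores = [3.0, 3.3, 2.9, 3.1, 3.2, 3.0]
smoothed = rolling_mean(epoch_scores, window=5)

pool = [
    {"stake": 60.0, "city": "Manila"},
    {"stake": 30.0, "city": "Lagos"},
    {"stake": 10.0, "city": None},
]
coverage = placement_coverage(pool)  # 0.9
weights = placed_weights(pool)       # 60/90 and 30/90
```

Renormalising over placed stake is what keeps unplaced stake from helping or hurting the score: the remaining weights still sum to 1.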

Version policy

Methodology version: sgdi-1.0.0. Historical scores remain reproducible under their original version forever; the leaderboard transparently flags any historical epoch computed under an older methodology version. See CONTRIBUTING.md for the bump policy.