Bharat Strata

The cube.

A versioned multi-witness substrate that lets software reason about Indian locality through evidence, reconciliation, falsifiers, and named floor keys.

§1 What Bharat Strata is

Bharat Strata is a parquet-backed knowledge graph of Indian locality. Each cell is a reconciled record of what 19 witness tiers reported about a place, what 9 reconciliation passes inferred, which of 23 falsifier modules passed, and which of 10 named floor keys apply when the substrate cannot honestly answer. Every cluster carries its own SHA-256 + schema fingerprint and the doctrine version that produced it.

Witness tiers: 19 (A → S)
Reconciliation passes: 9 (R1 → R9)
Falsifier modules: 23 (all must pass)
Named floor keys: 10 (what we cannot know)
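
The SHA-256 provenance field can be checked locally against a downloaded parquet file. The following is a minimal sketch, assuming the published digest is the hex SHA-256 of the raw file bytes; the function names are hypothetical, and how the schema fingerprint is derived is not stated on this page, so only the file digest is shown.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, published_sha256: str) -> bool:
    """Compare a local file with a published digest.

    Assumption: the published value is the full 64-character hex digest,
    not a truncated display form.
    """
    return file_sha256(path) == published_sha256.strip().lower()
```

Reading in chunks keeps memory use flat for large parquet files.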

§2 What's inside a cell

A cluster is a polygon (typically H3-r9 at MMR / MH scope, H3-r6 at pan-India). Each cell persists a header (identity + lineage + provenance) and one row per column with sibling disagreement + witness count.

Group | Columns (excerpt)
Identity | cluster_id, h3_r9, admin_code, geom (WKT)
Morphology | built_fraction, building_count, building_height_p50/p90, road_length_km
Population | pop_density_per_km2, pop_count (WorldPop), pop_age_brackets (where present)
Land cover | lulc_class_primary, ndvi_p50, ndwi_p50, lulc_share_{urban,water,veg,bare}
Terrain | elev_m, slope_deg, aspect_deg, hand_m (NASADEM)
Hydrology | river_distance_m, stream_order, flood_2yr_class (WRIS / GLOFAS)
Climate | tmax_p95, precip_p95, rh_p50, aqi_p50 (IMD / IMERG / CHIRPS)
Per column | _witness_count, _disagreement, _agreement_score, _floor_key (when active)
Lineage | tier_contributions[], reconciliation_pass_id, falsifier_pass_ids[]
Provenance | schema_fingerprint, parquet_sha256, build_doctrine_version
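
To make the per-column layout concrete, here is a small sketch that groups a value column with its metadata. It assumes the per-column fields are stored as suffixed siblings of the value column (for example `ndvi_p50_witness_count`); that naming is an assumption drawn from the table above, not a confirmed schema, and `nest_columns` is a hypothetical helper.

```python
from typing import Any

# Suffixes taken from the "Per column" row; how they attach to a column name is assumed.
META_SUFFIXES = ("_witness_count", "_disagreement", "_agreement_score", "_floor_key")


def nest_columns(flat_row: dict[str, Any], value_columns: list[str]) -> dict[str, dict[str, Any]]:
    """Group each value column with its per-column metadata fields."""
    nested: dict[str, dict[str, Any]] = {}
    for col in value_columns:
        entry: dict[str, Any] = {"value": flat_row.get(col)}
        for suffix in META_SUFFIXES:
            key = col + suffix
            if key not in flat_row:
                continue
            if suffix == "_floor_key" and flat_row[key] is None:
                continue  # the floor key is only carried when active
            entry[suffix[1:]] = flat_row[key]
        nested[col] = entry
    return nested


row = {
    "ndvi_p50": 0.34,
    "ndvi_p50_witness_count": 3,
    "ndvi_p50_disagreement": 0.02,
    "ndvi_p50_agreement_score": 0.96,
    "ndvi_p50_floor_key": None,
}
print(nest_columns(row, ["ndvi_p50"]))
# {'ndvi_p50': {'value': 0.34, 'witness_count': 3, 'disagreement': 0.02, 'agreement_score': 0.96}}
```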

§3 Witness tiers (19)

Every column is sourced from at least one tier. Triangulated columns are sourced from three or more. The list is doctrine-pinned; tiers are not added without a closure audit.

Tier | Source | What it contributes
A | NRSC Bhuvan + Overture divisions | admin base, GoI canonical divisions
B | Mapillary + Sentinel-2 RGB | street-level + optical context
C | Microsoft GBF v3 | building footprints
D | GHSL BUILT-H | built-up height
E | Sentinel-2 L2A (ESA) | NDVI / NDWI / NDBI, 10 m
F | Sentinel-1 GRD (ESA) | SAR backscatter VV/VH, all-weather
G | HLS L30 + S30 (NASA) | harmonised Landsat-Sentinel, 30 m
H | GPM IMERG (NASA) | precip 0.1°, 30 min
H′ | GPM IMERG-30 diurnal | sub-daily precip decomposition
I | GEDI L4A (NASA) | canopy biomass
J | CHIRPS (UCSB) | precip 0.05°, daily
K | Copernicus GLO-30 DEM | elevation, slope, aspect
L | ESA WorldCover v200 | global land cover
M | CPCB CAAQMS | air quality (AQI / PM2.5)
N | WRIS river-discharge + raingauge | level, discharge
O | IMD gridded rainfall | district rainfall
P | WorldPop India | population density
Q | IMD 125-yr climate dynamics | long-term rainfall trends (Theil-Sen + Mann-Kendall), 4-season + concentration
R | JRC-EDGAR v8.1 | CO2 / NOx / PM2.5 emission grids
S | India-WRIS ground-water | GW level, quality
T | GGOS first-party perception | in-cab attention telemetry
U | NOAA GSOD 50-yr temperature | long-term warming trends (Theil-Sen + Mann-Kendall), summer-max / winter-min, IDW from stations
V | NRSC Bhuvan LULC 250k change | observed land-use change 2004→18 (built-up / crop / forest / water deltas), per-district
W | JRC GHSL built-up 45-yr | observed built-up-surface trajectory 1975→2020 (Theil-Sen trend + Mann-Kendall), per-cell at 1 km

§4 Reconciliation passes (9)

Every claim survives at least one pass. Most claims survive all nine. The order is fixed and the artefacts are emitted at R8 (closure quintet) and R9 (pristine fingerprint seal).

# | Pass | What it guarantees
R1 | Source-axis isolation | each tier reduced to its own canonical schema first
R2 | Schema fingerprint witness | every input has a SHA-256 + schema fp; rows blocked without one
R3 | Anchor-id reconciliation | cluster ↔ H3 ↔ admin_code joined through LGD canonical tag
R4 | Geometry sanity | polygon area, centroid, convexity within MAD-bounded sanity
R5 | Per-column triangulation | three or more tiers vote on each column; disagreement persisted
R6 | Source-priority arbitration | when sources disagree, the doctrine-declared canonical source wins; loser preserved as _alt
R7 | Floor-key resolution | columns the substrate cannot know are emitted as one of 10 named floor keys
R8 | Closure-quintet attestation | each scope emits 5 audit artefacts: schema, counts, joins, falsifier matrix, lineage
R9 | Pristine-fingerprint seal | final parquet sha256 + schema fp pinned into closure artefact and into the cube column itself
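
Passes R5 to R7 can be pictured for a single numeric column as follows. This is an illustrative sketch only: the disagreement metric (maximum relative deviation from the median), the `priority` list, and the `Reconciled` type are assumptions made for the example, not the doctrine's definitions.

```python
from dataclasses import dataclass, field
from statistics import median
from typing import Optional


@dataclass
class Reconciled:
    value: Optional[float]
    witness_count: int
    disagreement: Optional[float]
    floor_key: Optional[str] = None
    alt: dict[str, float] = field(default_factory=dict)  # non-winning reports, kept as _alt


def reconcile(reports: dict[str, float], priority: list[str]) -> Reconciled:
    """Reconcile one column from tier reports.

    reports:  tier letter -> reported value
    priority: source order, most canonical first (hypothetical input)
    """
    if not reports:
        # R7: name the gap instead of emitting a bare null
        return Reconciled(None, 0, None, floor_key="FK1")

    # R6: the highest-priority tier that actually reported wins
    winner = next((tier for tier in priority if tier in reports), None)
    if winner is None:
        # sources reported, but none can be arbitrated
        return Reconciled(None, len(reports), None, floor_key="FK2", alt=dict(reports))

    values = list(reports.values())
    disagreement = None
    if len(values) > 1:
        # R5: assumed metric, max relative deviation from the median
        centre = median(values)
        scale = abs(centre) or 1.0
        disagreement = max(abs(v - centre) for v in values) / scale

    alt = {tier: v for tier, v in reports.items() if tier != winner}
    return Reconciled(reports[winner], len(values), disagreement, alt=alt)
```

For example, `reconcile({"C": 0.70, "D": 0.72, "E": 0.71}, ["E", "C", "D"])` keeps tier E's value and records the other two reports in `alt`, so the losing sources remain inspectable.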

§5 Falsifier modules (23)

Every cluster is held against all 23 falsifiers at the publish gate. Any FAIL blocks the cluster from the cube; floor key FK8 is emitted in its place and the row is redacted, not silently dropped.

# | Falsifier
F1 | cluster_id uniqueness
F2 | h3 ↔ admin_code consistency
F3 | geometry well-formedness (no self-intersection, valid WKT)
F4 | polygon area MAD-bound (per admin level)
F5 | built_fraction ∈ [0,1] and monotonic vs building_count
F6 | building_height_p50 ≤ building_height_p90
F7 | road_length_km within OSM-density envelope
F8 | pop_density within WorldPop ± 3σ
F9 | ndvi_p50, ndwi_p50 ∈ [-1, 1]
F10 | lulc class shares sum to 1 ± ε
F11 | elev_m within national min/max
F12 | slope_deg ∈ [0, 90]
F13 | tmax_p95 ≥ tmax_p50; precip_p95 ≥ precip_p50
F14 | witness_count ≥ 1 for every retained column
F15 | _disagreement ≥ 0; _agreement_score ∈ [0,1]
F16 | floor_key, when present, draws from the 10-name registry
F17 | schema_fingerprint matches the closure-quintet fingerprint
F18 | parquet_sha256 reproduces under doctrine-pinned build
F19 | no anchor row references a deleted admin_code
F20 | no cluster appears in two scopes with conflicting columns
F21 | doctrine_version is parseable and not pre-release
F22 | every column listed in the schema appears in the parquet
F23 | every parquet column is listed in the schema

Live PASS/FAIL state per scope is at /docs/falsifier-register.
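
Several falsifiers are plain range or ordering checks, so they are easy to picture as predicates over one cluster's values. The sketch below covers F6, F9, F10 and F12 only. It assumes a flat dict of column values and an ε of 1e-6 for F10; both are illustrative choices, since the page does not state the real tolerance or the gate's actual interface.

```python
import math
from typing import Any, Callable

Cell = dict[str, Any]  # assumed shape: column name -> value


def f6_height_order(cell: Cell) -> bool:
    return cell["building_height_p50"] <= cell["building_height_p90"]


def f9_index_range(cell: Cell) -> bool:
    return all(-1.0 <= cell[name] <= 1.0 for name in ("ndvi_p50", "ndwi_p50"))


def f10_lulc_shares(cell: Cell, eps: float = 1e-6) -> bool:
    # eps is an assumed tolerance; the doctrine's value is not given here
    total = sum(cell[f"lulc_share_{k}"] for k in ("urban", "water", "veg", "bare"))
    return math.isclose(total, 1.0, abs_tol=eps)


def f12_slope_range(cell: Cell) -> bool:
    return 0.0 <= cell["slope_deg"] <= 90.0


FALSIFIERS: dict[str, Callable[[Cell], bool]] = {
    "F6": f6_height_order,
    "F9": f9_index_range,
    "F10": f10_lulc_shares,
    "F12": f12_slope_range,
}


def run_gate(cell: Cell) -> dict[str, bool]:
    """Return PASS (True) or FAIL (False) per falsifier id."""
    return {fid: check(cell) for fid, check in FALSIFIERS.items()}


def failed(cell: Cell) -> list[str]:
    """List the falsifier ids that failed; an empty list means the subset passed."""
    return [fid for fid, ok in run_gate(cell).items() if not ok]
```

A non-empty result from `failed` corresponds to the case where the publish gate would block the cluster and emit FK8.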

§6 Floor keys (10) — what we cannot know

A floor key is the cube's way of naming what it cannot honestly answer. A null without a floor key is forbidden by R7. There are exactly ten; the registry is doctrine-pinned.

Key | Name | Fires when
FK1 | no_witness | no source reported on this cell
FK2 | source_disagreement | sources reported but cannot be arbitrated under R6
FK3 | below_resolution | cell is finer than the source can support
FK4 | temporal_gap | source has not refreshed within the doctrine window
FK5 | geometry_excluded | cell falls in an exclusion polygon (military, sea, foreign)
FK6 | admin_unmapped | admin_code does not resolve to an LGD-canonical tag
FK7 | tier_blocked | tier explicitly opted out of this scope
FK8 | falsifier_fail | a falsifier flagged this column; redacted not omitted
FK9 | license_blocked | source license forbids this scope of re-emission
FK10 | doctrine_deferred | reserved by doctrine for a future column; emitted as null with reason
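
A consumer can mirror the registry as an enumeration and enforce the "no null without a floor key" rule on its side. This is a sketch under assumptions: it takes the nested column shape shown in §8 (`value`, optional `floor_key`), and `check_floor_key` is a hypothetical helper rather than part of any published client.

```python
from enum import Enum
from typing import Any, Optional


class FloorKey(Enum):
    FK1 = "no_witness"
    FK2 = "source_disagreement"
    FK3 = "below_resolution"
    FK4 = "temporal_gap"
    FK5 = "geometry_excluded"
    FK6 = "admin_unmapped"
    FK7 = "tier_blocked"
    FK8 = "falsifier_fail"
    FK9 = "license_blocked"
    FK10 = "doctrine_deferred"


def check_floor_key(column: dict[str, Any]) -> Optional[str]:
    """Return an error message if the column breaks the floor-key rules, else None."""
    floor_key = column.get("floor_key")
    if floor_key is not None and floor_key not in FloorKey.__members__:
        return f"unknown floor key {floor_key!r}"
    if column.get("value") is None and floor_key is None:
        return "null value without a floor key"
    return None


print(check_floor_key({"value": None, "floor_key": "FK4"}))  # None
print(check_floor_key({"value": None}))                      # null value without a floor key
print(FloorKey["FK8"].value)                                 # falsifier_fail
```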

§7 Coverage

Three scopes ship today. Numbers below are read from public/coverage.json, regenerated nightly by scripts/publish-coverage.py against the canonical S3 prefix.

Cube coverage

as of 2026-06-02T06:14:47Z
Scope | Status | Clusters / districts | Columns | Tiers | Falsifiers | Schema fp
MMR-5 | Full closure | 13,619 clusters | 368 | 19 | 23/23 | 73340df4
MH-36 | Full closure | 2,64,583 clusters | 336 | 19 | 23/23 | 73340df4
Pan-India | Shipped (v0.6-beta) | 653 districts | 39 | 8 | 19/23 | pan_india_v0_6_beta

Every row is a real S3 artefact. Every count is verifiable by re-running the publisher.
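
The MH-36 count is written with Indian digit grouping (2,64,583 is 264,583), which some number parsers reject. A minimal sketch for reading such counts and multiplying out clusters × columns; the helper names are hypothetical, and the structure of public/coverage.json is not assumed here.

```python
def parse_grouped_int(text: str) -> int:
    """Parse a comma-grouped integer, Western (13,619) or Indian (2,64,583)."""
    digits = text.replace(",", "").strip()
    if not digits.isdecimal():
        raise ValueError(f"not a grouped integer: {text!r}")
    return int(digits)


def value_count(rows: str, columns: int) -> int:
    """Rows × columns for one scope."""
    return parse_grouped_int(rows) * columns


print(parse_grouped_int("2,64,583"))  # 264583
print(value_count("13,619", 368))     # 5011792
print(value_count("2,64,583", 336))   # 88899888
```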

MMR-5: 13,619 × 368 · Full closure
MH-36: 2,64,583 × 336 · Full closure
Pan-India: 653 × 39 · Shipped (v0.6-beta) · 75,403 H3-r6 cells

§8 An example locality record

A real cluster from admin_code=482/mmr5_unified_v4.parquet. One row, eight columns shown; every column carries a witness count, a disagreement value and an agreement score, and one column is floor-keyed for honesty.

{
  "cluster_id": "MMR5-482-008-0014",
  "h3_r9": "892a1072b3fffff",
  "admin_code": "482",
  "geom_wkt": "POLYGON((...))",

  "built_fraction":        { "value": 0.71,  "witness_count": 4, "disagreement": 0.04, "agreement_score": 0.92 },
  "building_count":        { "value": 1238,  "witness_count": 2, "disagreement": 0.07, "agreement_score": 0.86 },
  "building_height_p50":   { "value": 18.4,  "witness_count": 2, "disagreement": 0.05, "agreement_score": 0.90 },
  "road_length_km":        { "value": 12.3,  "witness_count": 2, "disagreement": 0.03, "agreement_score": 0.94 },
  "pop_density_per_km2":   { "value": 28400, "witness_count": 1, "disagreement": null, "agreement_score": null,
                             "floor_key": "FK1" },
  "ndvi_p50":              { "value": 0.34,  "witness_count": 3, "disagreement": 0.02, "agreement_score": 0.96 },
  "lulc_class_primary":    { "value": "urban","witness_count": 3, "disagreement": 0.00, "agreement_score": 1.00 },
  "elev_m":                { "value": 14,    "witness_count": 2, "disagreement": 0.00, "agreement_score": 1.00 },

  "lineage": {
    "tier_contributions":             ["A","B","C","E","I","J","K","O","P","Q","R"],
    "reconciliation_passes_applied":  ["R1","R2","R3","R4","R5","R6","R7","R8","R9"],
    "falsifiers_passed":              23
  },

  "doctrine_version":   "v1.0.24",
  "build_id":           "mmr5_unified_v4",
  "parquet_sha256":     "9a4f...8c1b",
  "schema_fingerprint": "73340df4"
}
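
A record like this can be consumed by separating floor-keyed columns from the rest before using any values. The sketch below works on a trimmed copy of the record above. The 0.9 agreement threshold and the `split_columns` helper are illustrative assumptions, not part of the cube.

```python
import json
from typing import Any

RECORD = json.loads("""
{
  "cluster_id": "MMR5-482-008-0014",
  "built_fraction":      {"value": 0.71,  "witness_count": 4, "disagreement": 0.04, "agreement_score": 0.92},
  "building_count":      {"value": 1238,  "witness_count": 2, "disagreement": 0.07, "agreement_score": 0.86},
  "pop_density_per_km2": {"value": 28400, "witness_count": 1, "disagreement": null, "agreement_score": null,
                          "floor_key": "FK1"},
  "lineage": {"falsifiers_passed": 23}
}
""")


def split_columns(record: dict[str, Any], min_agreement: float = 0.9):
    """Sort data columns into trusted, floor-keyed and low-agreement groups."""
    trusted: list[str] = []
    floored: dict[str, str] = {}
    weak: list[str] = []
    for name, col in record.items():
        if not (isinstance(col, dict) and "value" in col):
            continue  # skip identity, lineage and provenance fields
        if col.get("floor_key"):
            floored[name] = col["floor_key"]
        elif (col.get("agreement_score") or 0.0) >= min_agreement:
            trusted.append(name)
        else:
            weak.append(name)
    return trusted, floored, weak


trusted, floored, weak = split_columns(RECORD)
print(trusted)  # ['built_fraction']
print(floored)  # {'pop_density_per_km2': 'FK1'}
print(weak)     # ['building_count']
```

Checking the floor key first means a floor-keyed column is never treated as an ordinary measurement, even when it still carries a value.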

§9 Read further

Technical deep dive

v2.2 reference document, all 23 tiers, all 9 passes, all 23 falsifiers, all 10 floor keys, derived in full.

Download PDF →

Audit artefacts

Every _audit.json on S3, with schema fingerprint and pre-filled aws s3 cp.

Browse →

Open the Explorer

The 5 reasoning operators of the cube, made interactive. Free, web, rate-limited.

Open Explorer →