// geo-standard@2026.05

Methodology

How we compute our agent-readability score, what “pass” means for each signal, how often we refresh the dataset, and where the measurement stops being precise — laid out so anyone can reproduce it.

The five signals

Each domain is scored on five signals, weighted to a 100-point total:

Semantic HTML — 25 pts. Landmark elements, single H1, alt text coverage.
JSON-LD — 20 pts. Organization at root; type-specific schema per leaf (Article, Product, FAQPage, BreadcrumbList).
llms.txt — 15 pts. Presence, validity, and content coverage at /llms.txt.
Citability — 20 pts. Front-loaded answer, named entities, numbers and dates in the first 50–70 words.
Speed — 20 pts. TTFB under 800ms (full 100 below 300ms); SSR HTML below 1MB.

Pass thresholds

A site "passes" a signal when it clears roughly 75% of that signal's max — the empirical line above which AI engines reliably extract a clean answer:

Semantic ≥ 19 / 25
JSON-LD ≥ 15 / 20
llms.txt ≥ 11 / 15
Citability ≥ 15 / 20
Speed ≥ 15 / 20

Total-score milestones: ≥ 85/100 "agent-native" (AI engines cite by name); < 55/100 "effectively opaque" (engines time out or skip).

Dataset construction

The Agent Readability Leaderboard tracks AI-industry companies across four canonical categories — Infra, Models, Agents, Dev Tools. Flagship rows (top ~30) are hand-scored; long-tail rows are deterministic estimates seeded from domain characteristics and re-scored weekly by an automated scan against the same five-signal scorer.

Any row can be re-scored live by anyone at /check?u=<domain>. Discrepancies between the leaderboard score and a live re-scan are expected when a site ships changes between scheduled scans.

Scan cadence

Weekly re-scan of the full dataset via /api/public/hooks/rescan-leaderboard. Quarterly report aggregates the most recent four weekly snapshots. Monthly data drops surface single-stat findings between quarterly cuts.

Limitations

Marketing pages only. Docs sites and product UIs are out of scope — their access patterns and citation needs differ.
Public surface only. Auth-gated content cannot be scored; sites that hide everything behind login will under-score regardless of internal quality.
Snapshot in time. Scores reflect the most recent scan; sites that ship daily will drift between scans.
Five signals, not all signals. Backlinks, brand authority, and Information Gain affect citation rates but are out of scope for this measurement.

License & attribution

Report text, headline statistics, and the underlying dataset are licensed CC BY 4.0. Attribution: "grow.contact, State of the Agent-Readable Web (CC BY 4.0)" with a link back to the report URL.

Dataset endpoint: /api/public/leaderboard.json

Changelog

geo-standard@2026.05 — Current. Adds Citability weighting refinement (front-loaded answer detection).
geo-standard@2026.02 — Added speed threshold step function below 300ms TTFB.
geo-standard@2025.11 — Initial five-signal scorer released.