Guide · Updated 2026-05-30

llms.txt: the complete spec

llms.txt is a markdown file served at /llms.txt that lists your site's public routes, each with a short description. Think of it as a curated sitemap written for LLMs: a map that agents load into context before answering questions about your product. It complements robots.txt and sitemap.xml rather than replacing either.

Why llms.txt exists

Generative AI agents have limited context windows and most cannot afford to crawl your entire site to answer one question. llms.txt gives them a single short file that answers two questions at once: what is this site and where should I look for details.

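The spec was proposed by Jeremy Howard in September 2024 (llmstxt.org). It is a community proposal rather than a formal standard. It has since been picked up, to varying degrees, by Anthropic's Claude project ingestion, Perplexity scraping, and the agent tool ecosystem; support remains uneven (see the FAQ below).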

File structure

  1. H1: the project name. One per file, and the only element the spec requires.
  2. Blockquote: a short summary the LLM can quote verbatim, ideally one sentence.
  3. Zero or more paragraphs or lists (no headings): context that does not fit in the summary.
  4. H2 sections: grouped lists of links, each item written as [Title](/path): description. The description is optional.
  5. Optional "Optional" H2: links the agent may skip if context is tight.
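The layout above is regular enough to check mechanically. The following is a minimal sketch of a parser, not a reference implementation: `parse_llms_txt` and the shape of the dictionary it returns are hypothetical names chosen for illustration, and the regular expression handles only simple single-line list items.

```python
import re

# One list item: "- [Title](url)" with an optional ": description" tail.
LINK_ITEM = re.compile(r"^[-*]\s+\[([^\]]+)\]\(([^)\s]+)\)(?:\s*:\s*(.*))?$")


def parse_llms_txt(text):
    """Split an llms.txt body into title, summary, context and link sections."""
    doc = {"title": None, "summary": None, "context": [], "sections": {}}
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc["sections"][current] = []
        elif current is None and line.startswith(">") and doc["summary"] is None:
            doc["summary"] = line.lstrip(">").strip()
        elif current is None:
            # Anything else before the first H2 is free-form context.
            doc["context"].append(line)
        else:
            match = LINK_ITEM.match(line)
            if match:  # lines that are not link items are ignored in this sketch
                title, url, desc = match.groups()
                doc["sections"][current].append((title, url, desc or ""))
    return doc
```

A file with a missing H1 comes back with `title` set to `None`, which is the one structural error worth failing a build on, since the H1 is the only required element.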

Keep the file under 100 KB so it fits comfortably in most models' context windows (on the order of 25,000 tokens of English text). Serve it with Content-Type: text/markdown or text/plain.
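Both delivery rules can be checked once you have the response headers and body in hand. This is a small sketch under stated assumptions: `check_delivery` is a hypothetical helper, and "100 KB" is read here as 100 × 1024 bytes, which the guide does not specify.

```python
MAX_BYTES = 100 * 1024  # assumption: "100 KB" interpreted as 102,400 bytes
ALLOWED_TYPES = {"text/markdown", "text/plain"}


def check_delivery(content_type_header, body):
    """Return a list of problems with how an llms.txt response was served."""
    problems = []
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    if media_type not in ALLOWED_TYPES:
        problems.append("unexpected Content-Type: " + (media_type or "(missing)"))
    if len(body) > MAX_BYTES:
        problems.append(
            "file is %d bytes, over the %d-byte budget" % (len(body), MAX_BYTES)
        )
    return problems
```

An empty list means the response passed both checks. The function takes the header value and raw bytes as arguments, so it works with whatever HTTP client you already use.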

Working example

# Acme Robotics

> Industrial automation for warehouse fulfillment. We sell autonomous picking robots, conveyor integration, and the software stack that runs them.

Acme Robotics makes robots that pick and pack orders in warehouses. Our customers are mid-market 3PLs and DTC brands processing 10k–100k orders per month.

## Products

- [Picker R1](/products/picker-r1): Autonomous bin-picking robot, $48k, 18-month lead time.
- [Conveyor Bridge](/products/conveyor-bridge): Software-defined conveyor sortation, SaaS $2k/mo per zone.

## Docs

- [API reference](/docs/api): REST + WebSocket endpoints for fleet control.
- [Integrations](/docs/integrations): Shopify, NetSuite, Manhattan WMS.

## Company

- [About](/about): Founded 2021, Series B, 84 employees, HQ Reykjavík.
- [Careers](/careers): Open roles in robotics, ML, and field engineering.

## Contact

- [Sales](mailto:sales@acmerobotics.com)
- [Support](mailto:support@acmerobotics.com)

llms.txt vs llms-full.txt

llms.txt is the short index. llms-full.txt is the full content dump — every public page's markdown concatenated into one file, served at /llms-full.txt.

Use llms-full.txt only when the site is docs-heavy and an agent benefits from loading the whole product reference at once. For marketing sites the short index is enough — agents fetch individual routes when they need depth.
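Concatenation is simple to sketch. The version below assumes you already have each public page as markdown; `build_llms_full`, the `Source:` line, and the `---` separator are illustrative choices, since the guide does not prescribe a layout for the combined file.

```python
def build_llms_full(site_name, pages):
    """Join (route, markdown) pairs into one llms-full.txt body.

    pages is an ordered list of (route, markdown) tuples; order is preserved.
    """
    parts = ["# " + site_name]
    for route, markdown in pages:
        # A rule plus a Source line marks where each page begins.
        parts.append("---\n\nSource: " + route + "\n\n" + markdown.strip())
    return "\n\n".join(parts) + "\n"
```

Because the result grows with every page, it is worth measuring its size before publishing; a docs set of any real length will exceed the budget recommended for the short index.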

Common mistakes

  • Listing private routes. llms.txt is public; never reference admin URLs, draft posts, or anything gated.
  • Out of sync with the sitemap. If a route exists in sitemap.xml but not in llms.txt, agents may miss it. Regenerate both from the same source so that anything left out of llms.txt is a deliberate curation choice, not drift.
  • HTML inside the file. llms.txt is markdown. Strip any HTML and use markdown links.
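  • Wrong content-type. Some CDNs serve the file as application/octet-stream, which makes browsers download it instead of rendering it inline and may cause clients that check the type to skip it. Set Content-Type to text/markdown or text/plain explicitly.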
  • Treating it as a replacement for robots.txt. It is not. llms.txt does not control crawler access; you still need robots.txt for that.
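The first two mistakes in this list both go away if llms.txt and sitemap.xml are rendered from one route registry that knows which routes are public. A minimal sketch, assuming a hypothetical `Route` record and two hypothetical render functions (none of these names come from the spec):

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class Route:
    path: str
    title: str
    description: str
    section: str
    public: bool = True  # private routes are excluded from both outputs


def render_llms_txt(name, summary, routes):
    """Render the short index: H1, blockquote, then one H2 per section."""
    lines = ["# " + name, "", "> " + summary]
    sections = {}
    for route in routes:
        if route.public:
            sections.setdefault(route.section, []).append(route)
    for section, items in sections.items():
        lines += ["", "## " + section, ""]
        lines += ["- [%s](%s): %s" % (r.title, r.path, r.description) for r in items]
    return "\n".join(lines) + "\n"


def render_sitemap(base_url, routes):
    """Render a bare-bones sitemap.xml from the same public routes."""
    urls = "".join(
        "  <url><loc>%s</loc></url>\n" % escape(base_url + r.path)
        for r in routes
        if r.public
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + urls
        + "</urlset>\n"
    )
```

Marking a route `public=False` removes it from both files at once, so an admin URL cannot leak into one while being hidden from the other.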

FAQ

What is llms.txt?

llms.txt is a plain markdown file served at the root of a website (/llms.txt) that lists the site's public routes with a short description each. It is designed to be loaded into an LLM's context window at inference time so the model can answer questions about the site without crawling every page.

Is llms.txt the same as robots.txt?

No. robots.txt tells crawlers what they may or may not fetch. llms.txt tells an LLM what's on the site and how to think about it — it is a curated map, not an access policy. Most sites need both.

What's the difference between llms.txt and llms-full.txt?

llms.txt is the short index — a list of routes with descriptions. llms-full.txt is the full content dump — every public page's markdown concatenated into one file. Use llms-full.txt only for docs-heavy sites where an agent benefits from loading the whole product reference at once.

Do AI engines actually read llms.txt today?

Adoption is uneven but growing, and vendor behavior changes often, so check each vendor's current documentation before relying on it. ChatGPT and Claude can load the file via tool use, Perplexity reportedly reads it when scraping, and Anthropic's Claude project ingestion reportedly accepts it natively. Even where it is not auto-fetched, the file documents the site's structure for any agent integration you build.

Where do I put llms.txt?

At the root of the domain: https://example.com/llms.txt. Serve it with Content-Type: text/markdown or text/plain. Keep it under 100 KB so it fits comfortably in most models' context windows.

How do I validate llms.txt?

Run the URL through the grow.contact /check scanner — it tests presence, content-type, route coverage against your sitemap, and markdown validity in one pass. The official spec lives at llmstxt.org.
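If you want a local version of the route-coverage check, the comparison can be sketched with the standard library alone. This is not the scanner's implementation: `coverage_report` and its helpers are hypothetical, the comparison uses URL paths only (host and trailing-slash differences are not normalized), and non-web links such as mailto: are skipped.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MARKDOWN_LINK = re.compile(r"\[[^\]]+\]\(([^)\s]+)\)")


def sitemap_paths(sitemap_xml):
    """Collect the URL paths listed in a sitemap.xml string."""
    root = ET.fromstring(sitemap_xml)
    locs = root.findall("sm:url/sm:loc", SITEMAP_NS)
    return {urlparse(loc.text.strip()).path or "/" for loc in locs if loc.text}


def llms_paths(llms_txt):
    """Collect the URL paths of markdown links in an llms.txt string."""
    paths = set()
    for target in MARKDOWN_LINK.findall(llms_txt):
        parsed = urlparse(target)
        if parsed.scheme in ("", "http", "https"):  # skip mailto: and similar
            paths.add(parsed.path or "/")
    return paths


def coverage_report(sitemap_xml, llms_txt):
    """List paths that appear in one file but not the other."""
    in_sitemap = sitemap_paths(sitemap_xml)
    in_llms = llms_paths(llms_txt)
    return {
        "only_in_sitemap": sorted(in_sitemap - in_llms),
        "only_in_llms": sorted(in_llms - in_sitemap),
    }
```

Entries under `only_in_sitemap` are not necessarily errors, since llms.txt is a curated subset. Entries under `only_in_llms` deserve a closer look, because they point at routes the sitemap does not advertise.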
