
llms.txt: The AI-Crawler Manifest

A small markdown file at your site root that tells AI models which of your pages matter most. Adopted by Anthropic, Mintlify, and a growing list of AI-native sites — here's what it is, why it matters, and how to deploy yours in under an hour.

Last Updated: April 2026

Quick Answer

llms.txt is a markdown file you place at the root of your website (yoursite.com/llms.txt) that gives AI models a curated, machine-readable map of your most important pages. Created by Jeremy Howard in September 2024, it's now adopted by Anthropic, Mintlify, Cursor, and a growing list of AI-native sites. Think of it as sitemap.xml optimized for LLM ingestion: small (under 5K tokens), curated (your top pages, not all pages), and human-readable. Build yours with our free llms.txt generator.

Why a new file when sitemap.xml exists

sitemap.xml lists every URL on your site so search engine crawlers can discover them all. The audience is search engine indexing infrastructure — comprehensive coverage, automated parsing, no curation. It works because Google's index has effectively unlimited storage and processing capacity.

LLMs don't have unlimited capacity. When a model is grounding a response with web context, it can typically only ingest a small slice of your site (often a single page, sometimes a handful). What it needs from you is curation: which pages are highest-quality, what each one covers, and how they fit together. sitemap.xml doesn't provide that.

llms.txt fills the gap with a small markdown file that says, in human-readable form: "If you're trying to understand this site, start with these pages, in this order, and here's what each is about." It's the difference between handing someone your full file system and handing them a curated reading list.

The minimum-viable llms.txt

# Your Site Name

> One-line description of what this site is and who it serves.

## Core pages

- [Homepage](https://yoursite.com/): What the site does
- [About](https://yoursite.com/about): The team and mission
- [Pricing](https://yoursite.com/pricing): Plans and what's included

## Documentation

- [Getting started](https://yoursite.com/docs/start): First-time setup
- [API reference](https://yoursite.com/docs/api): Endpoint reference

## Blog

- [Why we built X](https://yoursite.com/blog/why-x): Founding story
- [How to do Y](https://yoursite.com/blog/how-y): Core tutorial

That's it. Save the file as llms.txt at your site root and serve it as text/plain or text/markdown.
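
If your page list already lives in code or a CMS export, a file with this shape can be generated rather than hand-written. The following is a minimal Python sketch under stated assumptions: the function name `build_llms_txt` and the (title, url, description) input shape are illustrative choices, not part of the llms.txt proposal.

```python
from typing import Dict, List, Tuple

# (title, url, description) for one link line; this shape is an assumption.
Link = Tuple[str, str, str]


def build_llms_txt(site_name: str, summary: str, sections: Dict[str, List[Link]]) -> str:
    """Render an H1, a blockquote summary, and one H2 link list per section."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, description in links:
            lines.append(f"- [{title}]({url}): {description}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(
        build_llms_txt(
            "Your Site Name",
            "One-line description of what this site is and who it serves.",
            {
                "Core pages": [
                    ("Homepage", "https://yoursite.com/", "What the site does"),
                    ("Pricing", "https://yoursite.com/pricing", "Plans and what's included"),
                ],
            },
        )
    )
```

Keeping the sections in an ordered mapping preserves the reading order you chose, which is the point of the file.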

Who reads llms.txt today

• Anthropic Claude — reads llms.txt during web browsing for context-grounding.
• Mintlify, Cursor, dev-tool integrations — use llms.txt to ingest documentation into IDE assistants and code agents.
• Smaller AI search engines — Phind, Komo, Andi, and others have signaled or implemented llms.txt support.
• ChatGPT, Perplexity, Gemini, AI Overviews — no official llms.txt support yet, but the file is increasingly used as a publishing signal that correlates with other AEO best practices.

llms.txt vs llms-full.txt

Two related files are commonly published together:

• llms.txt — the curated summary. Small (~5K tokens), human-readable, lists your top pages with descriptions. Most sites only need this one.
• llms-full.txt — optional companion containing the full markdown content of every page listed in llms.txt. Designed for context-window ingestion when an LLM needs to read your entire knowledge base in one shot. Most useful for documentation sites and developer tools where coherent multi-page context matters.
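
One way to produce the companion file is to walk the links in llms.txt and concatenate each page's markdown. The sketch below assumes you can supply a `load_markdown` callback (hypothetical) that returns a page's content as markdown for a given URL. The per-page heading and `Source:` line are an illustrative layout, not a required format.

```python
import re
from typing import Callable, List, Tuple

# Matches link lines of the form "- [Title](URL): description".
LINK_RE = re.compile(r"^\s*-\s*\[([^\]]+)\]\(([^)\s]+)\)")


def listed_links(llms_txt: str) -> List[Tuple[str, str]]:
    """Return (title, url) pairs in the order they appear in llms.txt."""
    links = []
    for line in llms_txt.splitlines():
        match = LINK_RE.match(line)
        if match:
            links.append((match.group(1), match.group(2)))
    return links


def build_llms_full(llms_txt: str, load_markdown: Callable[[str], str]) -> str:
    """Concatenate the markdown body of every listed page under its own heading.

    load_markdown is a hypothetical callback you supply: given a URL, it
    returns that page's content as markdown (from a CMS, a repo, a converter).
    """
    parts = []
    for title, url in listed_links(llms_txt):
        body = load_markdown(url).strip()
        parts.append(f"# {title}\n\nSource: {url}\n\n{body}\n")
    return "\n".join(parts)
```

Because the output follows the order of llms.txt, the curated reading order carries over to the full-content file.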

Why deploy now even before universal adoption

• Asymmetric upside. The file takes under an hour to create and needs only quarterly maintenance. The cost of not having one if llms.txt becomes a standard is higher than the cost of having one if it doesn't.
• Curation is itself useful. Identifying your top 20-50 pages and writing one-line descriptions is a useful information-architecture audit regardless of who reads the file.
• Signal correlation. Sites that deploy llms.txt are typically also implementing schema markup, entity consistency, and other AEO best practices. The file functions as a "we take this seriously" signal.

Generate yours in 60 seconds

Use our free llms.txt generator — paste your sitemap URL, pick which sections to include, and copy the validated markdown.
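
If you would rather script the first step yourself, a sitemap can be turned into candidate sections with the Python standard library. This is a rough sketch of one possible approach, not a description of how the generator above works. Grouping by first path segment and the name `group_sitemap_urls` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List
from urllib.parse import urlparse

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def group_sitemap_urls(sitemap_xml: str) -> Dict[str, List[str]]:
    """Group <loc> URLs by first path segment; top-level pages go under "root"."""
    root = ET.fromstring(sitemap_xml)
    groups: Dict[str, List[str]] = defaultdict(list)
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        if not url:
            continue
        segments = [s for s in urlparse(url).path.split("/") if s]
        # "/docs/start" is grouped under "docs"; "/" and "/about" under "root".
        key = segments[0] if len(segments) > 1 else "root"
        groups[key].append(url)
    return dict(groups)
```

The groups are only candidates. Choosing which URLs stay and writing the one-line descriptions is still a manual curation step.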

Frequently Asked Questions

What is llms.txt?

llms.txt is a small markdown file placed at the root of your website (yoursite.com/llms.txt) that provides AI crawlers and large language models with a curated, machine-readable summary of your most important pages. Think of it as a sitemap optimized specifically for AI ingestion — instead of letting models discover your content through full crawls, you give them a hand-picked list of your high-quality, citation-worthy pages with brief descriptions of each.

Who created the llms.txt standard?

The llms.txt proposal was introduced by Jeremy Howard (co-founder of fast.ai and Answer.AI) in September 2024. It builds on the same conceptual lineage as robots.txt and sitemap.xml (simple text files at the site root that signal site structure to crawlers) but is purpose-built for the LLM era. As of early 2026 it has been adopted by Anthropic, Mintlify, Cursor, and a growing list of AI-native and developer-tools companies.

Is llms.txt the same as robots.txt?

No. They serve different purposes. robots.txt tells crawlers which paths they may NOT visit (a deny-list). llms.txt tells AI models which pages are MOST IMPORTANT (a positive curation). Both files can coexist; in practice most sites that deploy llms.txt also maintain robots.txt for traditional crawl control. llms.txt does not replace robots.txt — and importantly, llms.txt is not a permission file; AI models can ignore it just like they can ignore robots.txt.
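
The difference shows up in how each file is consumed. robots.txt is evaluated as a yes/no check on a path, which Python's standard `urllib.robotparser` can demonstrate. llms.txt has no such check and is simply read as a list. In this small sketch, `ExampleBot` and the rules are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt that blocks one directory for every crawler.
robots_txt = """\
User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# robots.txt answers a permission question for a crawler and a URL.
blocked = parser.can_fetch("ExampleBot", "https://yoursite.com/admin/users")
allowed = parser.can_fetch("ExampleBot", "https://yoursite.com/pricing")
print(blocked, allowed)  # False True
```

Nothing in either file is enforced. A well-behaved crawler runs a check like this voluntarily, and a client that reads llms.txt is equally free to follow or ignore its suggestions.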

Does Google or ChatGPT actually read llms.txt?

As of early 2026, adoption is partial. Anthropic's Claude reads llms.txt for context-grounding when browsing a site, and some smaller AI search engines and dev-tool integrations read it as well. ChatGPT, Perplexity, and Gemini have not officially announced llms.txt support, but the file is increasingly used as a publishing signal: sites that maintain it are typically also implementing other AEO best practices (schema, entity consistency), and that correlation matters even if the file isn't directly parsed.

How do I create a llms.txt file?

Create a markdown file at your site root following this structure: an H1 with your site name, a blockquote summarizing what the site is, then sectioned H2 lists with bullet points linking to your most important pages and a one-line description of each. Keep the entire file under 5,000 tokens — its purpose is to give an LLM a quick, high-quality map of your site, not to mirror your full sitemap. Use our free llms.txt generator to build it from your sitemap.
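
A quick structural check can catch a missing piece before you publish. The sketch below tests for the H1, blockquote, H2 sections, and link-style bullets described above, and estimates size with a rough four-characters-per-token heuristic. That heuristic is an assumption (real tokenizers vary), and `check_llms_txt` is a hypothetical helper, not an official validator.

```python
import re
from typing import List

MAX_TOKENS = 5000    # budget suggested in the text above
CHARS_PER_TOKEN = 4  # rough heuristic (assumption); real tokenizers vary

# "- [Title](URL)" optionally followed by ": description".
LINK_LINE = re.compile(r"^-\s*\[[^\]]+\]\([^)\s]+\)(:\s*\S.*)?$")


def check_llms_txt(text: str) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    content = [ln.rstrip() for ln in text.splitlines() if ln.strip()]

    if not content or not content[0].startswith("# "):
        problems.append("first non-blank line should be an H1 with the site name")
    if not any(ln.startswith("> ") for ln in content):
        problems.append("missing blockquote summary")
    if not any(ln.startswith("## ") for ln in content):
        problems.append("no H2 sections found")

    for ln in content:
        if ln.startswith("- ") and not LINK_LINE.match(ln):
            problems.append(f"list item is not a '[title](url): description' link: {ln}")

    estimated_tokens = len(text) // CHARS_PER_TOKEN
    if estimated_tokens > MAX_TOKENS:
        problems.append(f"about {estimated_tokens} tokens, over the {MAX_TOKENS} budget")
    return problems
```

Running this in a pre-deploy step keeps the file from drifting toward a full sitemap as pages are added over time.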

What is the difference between llms.txt and llms-full.txt?

llms.txt is the curated summary file (intended to be small, ~5K tokens or less). llms-full.txt is an optional companion that contains the full markdown content of every page listed in llms.txt — designed for context-window ingestion when an LLM needs to read your entire knowledge base in one shot. Most sites only need llms.txt; documentation sites and developer tools (Mintlify, Anthropic's docs) often publish both.

Where exactly should I put the llms.txt file?

At the root of your domain (yoursite.com/llms.txt), accessible via a plain HTTP GET request. The proposal names the root path as the standard location and treats a subpath as optional, and most LLM clients only check the root, so don't rely on /docs/llms.txt or similar subpaths alone. Make sure the file is served as text/plain or text/markdown (not as text/html) so crawlers parse it correctly.
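
After deploying, it is worth confirming both the location and the Content-Type header. The sketch below uses only the Python standard library. The helper names are hypothetical, and `check_deployment` makes a live request, so point it at your own site.

```python
import urllib.request
from urllib.parse import urljoin

ACCEPTED_TYPES = {"text/plain", "text/markdown"}


def media_type(content_type_header: str) -> str:
    """Strip parameters such as '; charset=utf-8' and normalise case."""
    return content_type_header.split(";", 1)[0].strip().lower()


def is_acceptable(content_type_header: str) -> bool:
    """True when the header's media type is text/plain or text/markdown."""
    return media_type(content_type_header) in ACCEPTED_TYPES


def check_deployment(site_root: str, timeout: float = 10.0) -> bool:
    """GET <site_root>/llms.txt and report whether the type is acceptable.

    urlopen raises urllib.error.HTTPError for responses such as 404, so a
    missing file surfaces as an exception rather than a False result.
    """
    url = urljoin(site_root, "/llms.txt")
    with urllib.request.urlopen(url, timeout=timeout) as response:
        header = response.headers.get("Content-Type", "")
        return response.status == 200 and is_acceptable(header)
```

For example, `check_deployment("https://yoursite.com")` requests https://yoursite.com/llms.txt. A False result usually means the server is returning the file as text/html.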

Is llms.txt worth the effort if adoption is still partial?

Yes, for three reasons. First, it takes less than an hour to create and needs maintenance no more than quarterly. Second, the upside is asymmetric: the cost of not having one if it becomes a standard is higher than the cost of having one if it doesn't. Third, building one forces you to identify your most important pages, which is itself a useful exercise. Treat it as cheap insurance plus a useful by-product.