Web Standard Protocol

llms.txt Generator

robots.txt controls indexing. llms.txt controls ingestion. Generate the standardized markdown file that tells AI agents exactly what your website does.

We will automatically generate links for your Pricing, About, and Docs pages.


The Protocol for AI Crawler Optimization

As the web transitions from search engines to answer engines, the mechanism for site discovery has changed. This document outlines the technical specification and deployment strategy for llms.txt.

From Indexing to Ingestion

Legacy crawlers (Googlebot) focus on discovering URLs and mapping keyword density. Modern AI crawlers (GPTBot, ClaudeBot, Applebot-Extended) operate differently: they perform Retrieval-Augmented Generation (RAG), so they do not just want to know that a page exists; they need to ingest the semantic meaning of that page to answer user queries.

Standard HTML is noisy. It contains navigation bars, footers, JavaScript, and CSS classes that waste "tokens"—the computational currency of LLMs. The llms.txt specification solves this by offering a clean, markdown-based representation of your site located at the root directory.
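
To make the ingestion pattern concrete, here is a minimal sketch (in Python, standard library only) of how an ingestion-oriented crawler might check for the clean file at the root before falling back to HTML. The helper name and User-Agent string are illustrative assumptions; real crawlers add robots.txt checks, timeouts, and caching.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen

def fetch_context(site_url: str) -> tuple[str, str]:
    """Prefer the clean /llms.txt at the site root; fall back to raw HTML."""
    llms_url = urljoin(site_url, "/llms.txt")
    try:
        with urlopen(Request(llms_url, headers={"User-Agent": "example-ingest-bot"})) as resp:
            return "markdown", resp.read().decode("utf-8")
    except (HTTPError, URLError):
        # No llms.txt published: ingest the noisy HTML homepage instead.
        with urlopen(Request(site_url, headers={"User-Agent": "example-ingest-bot"})) as resp:
            return "html", resp.read().decode("utf-8", errors="replace")

kind, text = fetch_context("https://example.com")
print(kind, len(text))
```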

The llms.txt Specification

The file follows a strict Markdown structure designed for machine readability. The llms generator above produces a file adhering to these three core sections:

# Project Name
> A concise summary of the project...
## Secondary Section
- [Link Title](URL): Description of the resource
## Optional Resources
- [Docs](URL): Technical documentation
- The Summary Block: A high-density paragraph explaining the entity's purpose. This is often the only text an LLM reads before deciding whether the site is relevant to the user's prompt.
- The Core Map: A list of URLs pointing to the most information-dense pages (Pricing, About, Documentation). Unlike a sitemap.xml, which lists every page, this lists only context-heavy pages.
- Optional Resources: Supplementary links, such as full technical documentation, that an agent can follow when it needs deeper context.
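
As a rough illustration of how simple the format is to produce, the sketch below assembles these sections from a name, a summary, and a map of links. The function name, section titles, and sample data are illustrative assumptions, not part of the specification.

```python
def build_llms_txt(name: str, summary: str, core: dict[str, tuple[str, str]],
                   optional: dict[str, tuple[str, str]] | None = None) -> str:
    """Assemble the summary block, core map, and optional resources.
    `core` and `optional` map link titles to (URL, description) pairs."""
    lines = [f"# {name}", "", f"> {summary}", "", "## Core Pages"]
    lines += [f"- [{title}]({url}): {desc}" for title, (url, desc) in core.items()]
    if optional:
        lines += ["", "## Optional"]
        lines += [f"- [{title}]({url}): {desc}" for title, (url, desc) in optional.items()]
    return "\n".join(lines) + "\n"

print(build_llms_txt(
    "Acme Analytics",
    "Self-hosted product analytics with a privacy-first data model.",
    {"Pricing": ("https://acme.example/pricing", "Plans and usage limits"),
     "About": ("https://acme.example/about", "Company and team")},
    {"Docs": ("https://acme.example/docs", "Technical documentation")},
))
```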

Token Economics & Crawler Incentives

Why would OpenAI or Perplexity prioritize sites with an llms.txt file? The answer is computational cost.

Parsing a React-heavy webpage requires rendering JavaScript, stripping DOM elements, and normalizing text. This is expensive. Reading a static markdown file is nearly free by comparison. By providing this file, you align your website's architecture with the economic incentives of AI companies: you make your content "cheaper" to learn, increasing the probability of citation.
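
A back-of-the-envelope comparison makes the point. The sketch below uses Python's built-in HTML parser and a crude four-characters-per-token heuristic; the sample pages and numbers are illustrative only.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> payloads."""
    def __init__(self):
        super().__init__()
        self.chunks, self._skip_depth = [], 0
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def rough_tokens(text: str) -> int:
    return len(text) // 4  # crude ~4-characters-per-token heuristic for English

html_page = (
    "<html><head><style>.nav{display:flex}</style></head>"
    "<body><nav class='nav'><a href='/'>Home</a><a href='/pricing'>Pricing</a></nav>"
    "<main><p>Acme builds privacy-first product analytics.</p></main>"
    "<script>window.__APP__={hydrate:true}</script></body></html>"
)
llms_txt = "# Acme\n> Acme builds privacy-first product analytics.\n"

extractor = TextExtractor()
extractor.feed(html_page)          # extra parsing pass the crawler must pay for
clean_text = " ".join(extractor.chunks)

print("raw HTML tokens:   ", rough_tokens(html_page))   # markup, CSS, JS included
print("after stripping:   ", rough_tokens(clean_text))  # still required a parser
print("llms.txt tokens:   ", rough_tokens(llms_txt))    # clean from the start
```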

Protocol Comparison: Old Web vs. New Web

| Feature | robots.txt | llms.txt |
| --- | --- | --- |
| Primary Function | Access Control | Context Provision |
| Target Audience | Search Spiders | LLM Agents (RAG) |
| File Format | Proprietary Syntax | Standard Markdown |
| Processing Cost | Low | Near Zero |

Advanced: The /llms-full.txt Strategy

The standard proposes a two-file system. The primary /llms.txt acts as a signpost or table of contents. However, for documentation-heavy sites, you may also generate an /llms-full.txt.

This secondary file contains the concatenated full text of your entire documentation or core content. This allows an AI agent to ingest your entire knowledge base in a single HTTP request, rather than recursively crawling links. Our llms generator supports the creation of the primary signpost file, which is the requisite first step for this architecture.
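
If you maintain your documentation as markdown files, generating the secondary file can be as simple as concatenation. The directory layout, output path, and separators below are assumptions, not part of the standard.

```python
from pathlib import Path

def build_llms_full(docs_dir: str, out_path: str = "public/llms-full.txt") -> None:
    """Concatenate every markdown doc into one file so an agent can ingest
    the whole knowledge base in a single HTTP request."""
    parts = []
    for doc in sorted(Path(docs_dir).rglob("*.md")):
        # Keep a provenance marker so the agent can attribute each section.
        parts.append(f"<!-- source: {doc} -->\n{doc.read_text(encoding='utf-8')}")
    Path(out_path).write_text("\n\n---\n\n".join(parts), encoding="utf-8")

build_llms_full("docs")
```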

Technical FAQ

How do I validate the syntax?
The syntax is valid Markdown. If the file renders correctly in a standard markdown viewer (like GitHub's preview or Obsidian), it is valid for LLM ingestion.
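
If you want an automated check in CI, a minimal structural lint is enough. The rules below are a conservative reading of the format (one H1, a blockquote summary, well-formed link lines) and the file path is an assumption; this is a sketch, not an official validator.

```python
import re

def check_llms_txt(text: str) -> list[str]:
    """Minimal structural lint for an llms.txt candidate."""
    problems = []
    if len(re.findall(r"^# .+", text, flags=re.M)) != 1:
        problems.append("expected exactly one top-level '# ' heading")
    if not re.search(r"^> .+", text, flags=re.M):
        problems.append("missing '> ' summary block under the title")
    for line in text.splitlines():
        if line.startswith("- ") and not re.match(r"- \[[^\]]+\]\([^)]+\)(: .+)?$", line):
            problems.append(f"malformed link line: {line!r}")
    return problems

# Assumes the file lives at public/llms.txt in your repository.
print(check_llms_txt(open("public/llms.txt", encoding="utf-8").read()))
```
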
Should I include my entire sitemap?
No. That is an anti-pattern. Only include high-value, semantic pages. Avoid listing paginated archives, tag pages, or login screens. The goal is signal, not noise.
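
A simple allow/deny filter over your sitemap URLs is usually sufficient. The path patterns below are illustrative defaults, not part of the specification; tune them to your own architecture.

```python
from urllib.parse import urlparse

LOW_VALUE = ("/tag/", "/page/", "/login", "/search", "/cart")
HIGH_VALUE = ("/pricing", "/about", "/docs")

def select_core_urls(sitemap_urls: list[str]) -> list[str]:
    """Keep only context-heavy pages; drop archives, tag pages, and app screens."""
    keep = []
    for url in sitemap_urls:
        path = urlparse(url).path.rstrip("/").lower() or "/"
        if any(pattern in path for pattern in LOW_VALUE):
            continue
        if path == "/" or any(path.startswith(prefix) for prefix in HIGH_VALUE):
            keep.append(url)
    return keep

print(select_core_urls([
    "https://acme.example/",
    "https://acme.example/pricing",
    "https://acme.example/tag/changelog",
    "https://acme.example/blog/page/7",
]))
```
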
How often should I update this file?
Update the file whenever your core site architecture changes or when you release a major new product feature. Since it is a static file, aggressive caching is recommended.