llms.txt Generator
robots.txt controls indexing. llms.txt controls ingestion. Generate the markdown file that tells AI agents exactly what your website does.
We will automatically generate links for your Pricing, About, and Docs pages.
1. Enter your website URL
2. Describe your project
3. Click Generate to see the magic 🪄

The Protocol for AI Crawler Optimization
As the web transitions from search engines to answer engines, the mechanism for site discovery has changed. This document outlines the technical specification and deployment strategy for llms.txt.
From Indexing to Ingestion
Legacy crawlers (Googlebot) focus on discovering URLs and mapping keyword density. Modern AI crawlers (GPTBot, ClaudeBot, Applebot-Extended) operate differently: they perform retrieval-augmented generation (RAG). They do not just want to know that a page exists; they need to ingest its semantic meaning to answer user queries.
Standard HTML is noisy. It contains navigation bars, footers, JavaScript, and CSS classes that waste "tokens"—the computational currency of LLMs. The llms.txt specification solves this by offering a clean, markdown-based representation of your site located at the root directory.
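To make this concrete, here is a minimal sketch of how an agent-side fetcher might prefer the markdown file over raw HTML. The fallback logic and function name are our own illustration; the spec only defines where the file lives:

```python
# Illustrative sketch: an agent checks the root for /llms.txt before
# paying the cost of parsing raw HTML. Fallback behavior is an assumption.
from urllib.parse import urljoin
import requests

def fetch_context(site_url: str) -> str:
    """Return clean markdown context for a site, falling back to raw HTML."""
    llms_url = urljoin(site_url, "/llms.txt")  # spec: file lives at the root
    resp = requests.get(llms_url, timeout=10)
    if resp.ok and resp.text.strip():
        return resp.text  # token-efficient markdown, no DOM stripping needed
    # Fallback: fetch the homepage HTML and pay the parsing cost.
    return requests.get(site_url, timeout=10).text

context = fetch_context("https://example.com")
```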
The llms.txt Specification
The file follows a strict Markdown structure designed for machine readability. The llms generator above produces a file built around two core sections (a minimal example follows the list):
- The Summary Block: a high-density paragraph explaining the entity's purpose. This is often the only text an LLM reads before deciding if the site is relevant to the user's prompt.
- The Core Map: a list of URLs pointing to the most information-dense pages (Pricing, About, Documentation). Unlike a sitemap.xml, which lists every page, this lists only context-heavy pages.
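For reference, a minimal llms.txt might look like the following; the company name, summary text, and URLs are placeholders:

```markdown
# Example Corp

> Example Corp provides hosted vector search for e-commerce catalogs,
> with a REST API and client SDKs for Python and TypeScript.

## Core Pages

- [Pricing](https://example.com/pricing): plans, limits, and overage rates
- [About](https://example.com/about): company background and team
- [Docs](https://example.com/docs): API reference and integration guides
```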
Token Economics & Crawler Incentives
Why would OpenAI or Perplexity prioritize sites with an llms.txt file? The answer is computational cost.
Parsing a React-heavy webpage requires rendering JavaScript, stripping DOM elements, and normalizing text. This is expensive. Reading a markdown file is nearly free. By providing this file, you align your website's architecture with the economic incentives of AI companies. You make your content "cheaper" to learn, increasing the probability of citation.
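As a rough illustration of the gap, you can count tokens for the same sentence expressed as HTML markup versus markdown. The encoding choice and sample strings here are our own assumptions, not part of any standard:

```python
# Rough illustration: the same sentence costs more tokens wrapped in HTML
# than as plain markdown. Encoding choice (cl100k_base) is an assumption.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

html = '<div class="hero__copy"><p><span style="font-weight:600">Acme</span> sells widgets.</p></div>'
markdown = "**Acme** sells widgets."

print(len(enc.encode(html)))      # markup inflates the token count
print(len(enc.encode(markdown)))  # the same meaning in far fewer tokens
```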
Protocol Comparison: Old Web vs. New Web
| Feature | robots.txt | llms.txt |
|---|---|---|
| Primary Function | Access Control | Context Provision |
| Target Audience | Search Spiders | LLM Agents (RAG) |
| File Format | Plain-text directives (RFC 9309) | Standard Markdown |
| Processing Cost | Low | Near Zero |
Advanced: The /llms-full.txt Strategy
The standard proposes a two-file system. The primary /llms.txt acts as a signpost or table of contents. In addition, for documentation-heavy sites, you may also generate an /llms-full.txt.
This secondary file contains the concatenated full text of your entire documentation or core content. This allows an AI agent to ingest your entire knowledge base in a single HTTP request, rather than recursively crawling links. Our llms generator supports the creation of the primary signpost file, which is the requisite first step for this architecture.
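For teams that want to assemble the companion file themselves, a minimal build script might look like this sketch; the docs/ directory layout and the source-comment separators are assumptions about your project:

```python
# Minimal sketch: concatenate all markdown docs into a single llms-full.txt.
# The docs/ layout and separator format are assumptions, not part of the spec.
from pathlib import Path

docs = sorted(Path("docs").rglob("*.md"))
sections = []
for doc in docs:
    # Prefix each file with its path so agents can attribute content.
    sections.append(f"<!-- source: {doc} -->\n{doc.read_text(encoding='utf-8')}")

Path("llms-full.txt").write_text("\n\n".join(sections), encoding="utf-8")
```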