Web Standard Protocol

llms.txt Generator

robots.txt controls indexing. llms.txt controls ingestion. Generate the standardized markdown file that tells AI agents exactly what your website does.

We will automatically generate links for your Pricing, About, and Docs pages.


The Protocol for AI Crawler Optimization

As the web transitions from search engines to answer engines, the mechanism for site discovery has changed. This document outlines the technical specification and deployment strategy for llms.txt.

From Indexing to Ingestion

Legacy crawlers (Googlebot) focus on discovering URLs and mapping keyword density. Modern AI crawlers (GPTBot, ClaudeBot, Applebot-Extended) operate differently: they perform Retrieval-Augmented Generation (RAG), so they do not just want to know that a page exists; they need to ingest the semantic meaning of that page to answer user queries.

Standard HTML is noisy. It contains navigation bars, footers, JavaScript, and CSS classes that waste "tokens"—the computational currency of LLMs. The llms.txt specification solves this by offering a clean, markdown-based representation of your site located at the root directory.
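
To make the ingestion pattern concrete, here is a minimal sketch (in Python, standard library only) of how an ingestion-oriented crawler might check for the clean file at the root before falling back to HTML. The helper name and User-Agent string are illustrative assumptions; real crawlers add robots.txt checks, timeouts, and caching.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen

def fetch_context(site_url: str) -> tuple[str, str]:
    """Prefer the clean /llms.txt at the site root; fall back to raw HTML."""
    llms_url = urljoin(site_url, "/llms.txt")
    try:
        with urlopen(Request(llms_url, headers={"User-Agent": "example-ingest-bot"})) as resp:
            return "markdown", resp.read().decode("utf-8")
    except (HTTPError, URLError):
        # No llms.txt published: ingest the noisy HTML homepage instead.
        with urlopen(Request(site_url, headers={"User-Agent": "example-ingest-bot"})) as resp:
            return "html", resp.read().decode("utf-8", errors="replace")

kind, text = fetch_context("https://example.com")
print(kind, len(text))
```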

The llms.txt Specification

The file follows a strict Markdown structure designed for machine readability. The llms generator above produces a file adhering to these three core sections:

# Project Name
> A concise summary of the project...
## Secondary Section
- [Link Title](URL): Description of the resource
## Optional Resources
- [Docs](URL): Technical documentation
- The Summary Block: A high-density paragraph explaining the entity's purpose. This is often the only text an LLM reads before deciding whether the site is relevant to the user's prompt.
- The Core Map: A list of URLs pointing to the most information-dense pages (Pricing, About, Documentation). Unlike a sitemap.xml, which lists every page, this lists only context-heavy pages.
- Optional Resources: Supplementary links, such as full technical documentation, that an agent can follow when it needs deeper context.
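
As a rough illustration of how simple the format is to produce, the sketch below assembles these sections from a name, a summary, and a map of links. The function name, section titles, and sample data are illustrative assumptions, not part of the specification.

```python
def build_llms_txt(name: str, summary: str, core: dict[str, tuple[str, str]],
                   optional: dict[str, tuple[str, str]] | None = None) -> str:
    """Assemble the summary block, core map, and optional resources.
    `core` and `optional` map link titles to (URL, description) pairs."""
    lines = [f"# {name}", "", f"> {summary}", "", "## Core Pages"]
    lines += [f"- [{title}]({url}): {desc}" for title, (url, desc) in core.items()]
    if optional:
        lines += ["", "## Optional"]
        lines += [f"- [{title}]({url}): {desc}" for title, (url, desc) in optional.items()]
    return "\n".join(lines) + "\n"

print(build_llms_txt(
    "Acme Analytics",
    "Self-hosted product analytics with a privacy-first data model.",
    {"Pricing": ("https://acme.example/pricing", "Plans and usage limits"),
     "About": ("https://acme.example/about", "Company and team")},
    {"Docs": ("https://acme.example/docs", "Technical documentation")},
))
```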

Token Economics & Crawler Incentives

Why would OpenAI or Perplexity prioritize sites with an llms.txt file? The answer is computational cost.

Parsing a React-heavy webpage requires rendering JavaScript, stripping DOM elements, and normalizing text. This is expensive. Reading a static markdown file is nearly free by comparison. By providing this file, you align your website's architecture with the economic incentives of AI companies: you make your content "cheaper" to learn, increasing the probability of citation.
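
A back-of-the-envelope comparison makes the point. The sketch below uses Python's built-in HTML parser and a crude four-characters-per-token heuristic; the sample pages and numbers are illustrative only.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> payloads."""
    def __init__(self):
        super().__init__()
        self.chunks, self._skip_depth = [], 0
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def rough_tokens(text: str) -> int:
    return len(text) // 4  # crude ~4-characters-per-token heuristic for English

html_page = (
    "<html><head><style>.nav{display:flex}</style></head>"
    "<body><nav class='nav'><a href='/'>Home</a><a href='/pricing'>Pricing</a></nav>"
    "<main><p>Acme builds privacy-first product analytics.</p></main>"
    "<script>window.__APP__={hydrate:true}</script></body></html>"
)
llms_txt = "# Acme\n> Acme builds privacy-first product analytics.\n"

extractor = TextExtractor()
extractor.feed(html_page)          # extra parsing pass the crawler must pay for
clean_text = " ".join(extractor.chunks)

print("raw HTML tokens:   ", rough_tokens(html_page))   # markup, CSS, JS included
print("after stripping:   ", rough_tokens(clean_text))  # still required a parser
print("llms.txt tokens:   ", rough_tokens(llms_txt))    # clean from the start
```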

Protocol Comparison: Old Web vs. New Web

| Feature | robots.txt | llms.txt |
| --- | --- | --- |
| Primary Function | Access Control | Context Provision |
| Target Audience | Search Spiders | LLM Agents (RAG) |
| File Format | Proprietary Syntax | Standard Markdown |
| Processing Cost | Low | Near Zero |

Advanced: The /llms-full.txt Strategy

The standard proposes a two-file system. The primary /llms.txt acts as a signpost or table of contents. However, for documentation-heavy sites, you may also generate an /llms-full.txt.

This secondary file contains the concatenated full text of your entire documentation or core content. This allows an AI agent to ingest your entire knowledge base in a single HTTP request, rather than recursively crawling links. Our llms generator supports the creation of the primary signpost file, which is the requisite first step for this architecture.
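
If you maintain your documentation as markdown files, generating the secondary file can be as simple as concatenation. The directory layout, output path, and separators below are assumptions, not part of the standard.

```python
from pathlib import Path

def build_llms_full(docs_dir: str, out_path: str = "public/llms-full.txt") -> None:
    """Concatenate every markdown doc into one file so an agent can ingest
    the whole knowledge base in a single HTTP request."""
    parts = []
    for doc in sorted(Path(docs_dir).rglob("*.md")):
        # Keep a provenance marker so the agent can attribute each section.
        parts.append(f"<!-- source: {doc} -->\n{doc.read_text(encoding='utf-8')}")
    Path(out_path).write_text("\n\n---\n\n".join(parts), encoding="utf-8")

build_llms_full("docs")
```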

Technical FAQ

How do I validate the syntax?
The syntax is valid Markdown. If the file renders correctly in a standard markdown viewer (like GitHub's preview or Obsidian), it is valid for LLM ingestion.
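
If you want an automated check in CI, a minimal structural lint is enough. The rules below are a conservative reading of the format (one H1, a blockquote summary, well-formed link lines) and the file path is an assumption; this is a sketch, not an official validator.

```python
import re

def check_llms_txt(text: str) -> list[str]:
    """Minimal structural lint for an llms.txt candidate."""
    problems = []
    if len(re.findall(r"^# .+", text, flags=re.M)) != 1:
        problems.append("expected exactly one top-level '# ' heading")
    if not re.search(r"^> .+", text, flags=re.M):
        problems.append("missing '> ' summary block under the title")
    for line in text.splitlines():
        if line.startswith("- ") and not re.match(r"- \[[^\]]+\]\([^)]+\)(: .+)?$", line):
            problems.append(f"malformed link line: {line!r}")
    return problems

# Assumes the file lives at public/llms.txt in your repository.
print(check_llms_txt(open("public/llms.txt", encoding="utf-8").read()))
```
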
Should I include my entire sitemap?
No. That is an anti-pattern. Only include high-value, semantic pages. Avoid listing paginated archives, tag pages, or login screens. The goal is signal, not noise.
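
A simple allow/deny filter over your sitemap URLs is usually sufficient. The path patterns below are illustrative defaults, not part of the specification; tune them to your own architecture.

```python
from urllib.parse import urlparse

LOW_VALUE = ("/tag/", "/page/", "/login", "/search", "/cart")
HIGH_VALUE = ("/pricing", "/about", "/docs")

def select_core_urls(sitemap_urls: list[str]) -> list[str]:
    """Keep only context-heavy pages; drop archives, tag pages, and app screens."""
    keep = []
    for url in sitemap_urls:
        path = urlparse(url).path.rstrip("/").lower() or "/"
        if any(pattern in path for pattern in LOW_VALUE):
            continue
        if path == "/" or any(path.startswith(prefix) for prefix in HIGH_VALUE):
            keep.append(url)
    return keep

print(select_core_urls([
    "https://acme.example/",
    "https://acme.example/pricing",
    "https://acme.example/tag/changelog",
    "https://acme.example/blog/page/7",
]))
```
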
How often should I update this file?
Update the file whenever your core site architecture changes or when you release a major new product feature. Since it is a static file, aggressive caching is recommended.