llms.txt File

I want to read about the llms.txt file - "A proposal to standardize on using an /llms.txt file to provide information to help LLMs use a website at inference time."


Notes


  • This is a proposal, created by Jeremy Howard, to standardize on using an /llms.txt file to provide information that helps LLMs use a website at inference time.
  • Today's websites provide information not only to people but also to large language models. Providing information to language models differs a little from providing it to humans: LLMs generally benefit from information in a more concise form.
  • Language models can ingest a lot of information quickly, so it can be helpful to have a single place where all of the key information can be collated.
  • Right now, context windows are too small to handle most sites in their entirety. Combined with the JS-heavy nature of some sites, which can obscure content, this can cause problems for LLMs.
  • The proposal is that those interested in providing LLM-friendly content add an /llms.txt file to their site.
  • This is a markdown file that provides brief background information and guidance, along with links to markdown files (which can also link to external sites) providing more detailed information.
  • The llms.txt markdown is readable by both humans and LLMs, but is also in a precise format that allows fixed processing methods such as parsers and regular expressions.
  • The proposal also suggests that pages containing information that might be useful for LLMs offer a clean markdown version at the same URL as the original page, but with .md appended.
  • Example: FastHTML Documentation llms.txt
  • Example: a page with the .md extension
  • The llms.txt file spec is for files located at the root path /llms.txt of a website. A file following the spec contains the following sections as markdown, in this order:
    • An H1 with the name of the project or site (the only required section)
    • A blockquote with a short summary of the project, containing key information necessary for understanding the rest of the file.
    • Zero or more markdown sections of any type except headings, containing more information about the project and how to interpret the provided files.
    • Zero or more markdown sections delimited by H2 headers, containing "file lists" of URLs where further detail is available.
      • Each "file list" is a markdown list, containing a required markdown hyperlink [name](url), then optionally a : and notes about the file.

A mock example of the format:

# Title

> Optional description goes here

Optional details go here

## Section name

- [Link title](https://link_url): Optional link details

## Optional

- [Link title](https://link_url)
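
Because the format is fixed, a file like the mock example above can be split into its parts with a few lines of standard-library Python. This is only a minimal sketch, not the parser shipped with any library: the function name `parse_llms_txt`, the regular expression, and the returned dictionary layout are assumptions made for illustration.

```python
import re

# Matches "- [name](url)" with an optional ": notes" suffix.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text: str) -> dict:
    """Split llms.txt text into title, summary, info lines, and H2 file lists.

    Hypothetical helper: handles a single-line blockquote and ignores
    anything inside an H2 section that is not a link list item.
    """
    result = {"title": None, "summary": None, "info": [], "sections": {}}
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                result["sections"][current].append(match.groupdict())
        elif line.startswith(">") and result["summary"] is None:
            result["summary"] = line[1:].strip()
        else:
            result["info"].append(line)
    return result


sample = """# Title

> Optional description goes here

Optional details go here

## Section name

- [Link title](https://link_url): Optional link details

## Optional

- [Link title](https://link_url)
"""

parsed = parse_llms_txt(sample)
print(parsed["title"], list(parsed["sections"]))
```

Each link is stored as a dictionary with `title`, `url`, and `desc` keys; `desc` is `None` when the list item has no notes after the link.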

The purpose of llms.txt is to provide a curated overview for LLMs. The expectation is that llms.txt will mainly be useful for inference rather than for training.

Example llms.txt, a cut-down version of the file used for the FastHTML project:

# FastHTML

> FastHTML is a python library which brings together Starlette, Uvicorn, HTMX, and fastcore's `FT` "FastTags" into a library for creating server-rendered hypermedia applications.

Important notes:

- Although parts of its API are inspired by FastAPI, it is *not* compatible with FastAPI syntax and is not targeted at creating API services
- FastHTML is compatible with JS-native web components and any vanilla JS library, but not with React, Vue, or Svelte.

## Docs

- [FastHTML quick start](https://docs.fastht.ml/path/quickstart.html.md): A brief overview of many FastHTML features
- [HTMX reference](https://raw.githubusercontent.com/path/reference.md): Brief description of all HTMX attributes, CSS classes, headers, events, extensions, js lib methods, and config options

## Examples

- [Todo list application](https://raw.githubusercontent.com/path/adv_app.py): Detailed walk-thru of a complete CRUD app in FastHTML showing idiomatic use of FastHTML and HTMX patterns.

## Optional

- [Starlette full documentation](https://gist.githubusercontent.com/path/starlette-sml.md): A subset of the Starlette documentation useful for FastHTML development
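
The quick start link in the example above ends in .html.md, which matches the proposal's convention of appending .md to a page's URL. A minimal sketch of deriving such a URL follows; the helper name `markdown_url` is hypothetical, and mapping directory-style URLs to index.html.md is an assumption that these notes do not state.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return the URL where a clean markdown version of a page would live.

    Hypothetical helper: appends ".md" to the path, leaving any query
    string or fragment untouched.
    """
    parts = urlsplit(page_url)
    path = parts.path
    if not path or path.endswith("/"):
        # Assumption: URLs without a file name use index.html.md.
        path = (path or "/") + "index.html.md"
    else:
        path += ".md"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(markdown_url("https://docs.fastht.ml/path/quickstart.html"))
```

A site generator could call a helper like this for every published page to decide where to write the markdown copy.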

To create effective llms.txt files, consider these guidelines:

  • Use concise, clear language.
  • When linking to resources, include brief, informative descriptions.
  • Avoid ambiguous terms or unexplained jargon.
  • Run a tool that expands your llms.txt file into an LLM context file, and test a number of language models to see if they can answer questions about your content.
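
To make the last guideline concrete, here is a rough sketch of what "expanding" an llms.txt file into a single context string could look like. It is not the output format of any existing tool: the function name `build_context`, the XML-like tag names, and the choice to skip a section named "Optional" by default are all assumptions for illustration. Fetching is delegated to a caller-supplied function so the sketch runs without network access.

```python
from typing import Callable
from xml.sax.saxutils import escape, quoteattr


def build_context(
    title: str,
    summary: str,
    sections: dict,
    fetch: Callable[[str], str],
    include_optional: bool = False,
) -> str:
    """Concatenate linked documents into one XML-like context string.

    Hypothetical helper: `sections` maps a section name to a list of
    {"title": ..., "url": ...} dictionaries, and `fetch` returns the
    text behind a URL.
    """
    parts = [f"<project title={quoteattr(title)} summary={quoteattr(summary)}>"]
    for name, links in sections.items():
        # Assumption: the "Optional" section may be left out to save context.
        if name == "Optional" and not include_optional:
            continue
        parts.append(f"<section name={quoteattr(name)}>")
        for link in links:
            body = fetch(link["url"])
            parts.append(f"<doc title={quoteattr(link['title'])}>{escape(body)}</doc>")
        parts.append("</section>")
    parts.append("</project>")
    return "\n".join(parts)


# Stand-in for real HTTP requests, so the example is self-contained.
fake_pages = {
    "https://link_url/quickstart.md": "Install & run the app.",
    "https://link_url/extra.md": "Skipped unless include_optional=True.",
}

ctx = build_context(
    title="Title",
    summary="Optional description goes here",
    sections={
        "Docs": [{"title": "Quick start", "url": "https://link_url/quickstart.md"}],
        "Optional": [{"title": "Extra", "url": "https://link_url/extra.md"}],
    },
    fetch=fake_pages.__getitem__,
)
print(ctx)
```

The resulting string can be pasted into a prompt, followed by a question about the site, to check whether a given model answers correctly from the curated content.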


Python Module and CLI

$ pip install llms-txt
$ llms_txt2ctx -h # Get help for the CLI
$ llms_txt2ctx llms.txt > llms.md # Convert the llms.txt file to an XML context file and save it as llms.md

from pathlib import Path
from llms_txt import *

samp = Path('llms-sample.txt').read_text()
# Create a data structure with the sections of an llms.txt file
parsed = parse_llms_file(samp)
list(parsed)
# ['title', 'summary', 'info', 'sections']
# More...
