August 24, 2025

Plain-Text Context Files for LLMs

What They Are, Why They Exist, Where They Break

Plain-text context files are the new config surface for AI. A few kilobytes of Markdown can change how a coding agent behaves, how a web crawler assembles context, and how your IDE chooses files.

This post maps some of the landscape, explains why so many tools settled on “just use a text file,” and calls out the sharp edges.

What we’re talking about

You've probably encountered these files when using AI-powered IDEs. Those IDEs essentially invented the concept, and the files generally take the form of repo-level text files that are automatically loaded and presented to the LLM.

Examples include:

  • AGENTS.md: OpenAI’s neutral “README for agents.” Tools look here for project-specific guidance.

  • CLAUDE.md: Claude Code auto-loads this file at session start; commonly used for run/test instructions, coding conventions, and local quirks.

  • GEMINI.md: Gemini Code Assist discovers these files (hierarchically) and treats them as agent memory.

  • Copilot instructions: GitHub Copilot supports repo-scoped instruction files like .github/copilot-instructions.md.

  • Other IDE vendor rule files: Cursor .cursorrules (legacy; newer versions use MDC-format rules under .cursor/rules), Windsurf .windsurfrules, Amazon Q Markdown rules in .amazonq/rules.

AGENTS.md is probably the best illustration of this:

    # AGENTS.md

    ## Project
    TypeScript monorepo. pnpm. Turbo for tasks. Next.js app, FastAPI service.

    ## Stack & commands
    - Install: pnpm i
    - Dev: pnpm dev:all
    - Test: pnpm test -w

    ## Conventions
    - TS strict; no any
    - API errors: RFC 9457
    - Commit style: Conventional Commits

    ## Tasks the agent may do
    - Add endpoints in /apps/api with tests
    - Touch only files it just created when refactoring

    ## Guardrails
    - Never commit .env
    - Don’t bump Node version

It's also worth mentioning /llms.txt here, although it serves a slightly different purpose: it is a Markdown index, served from a website, that points models to the right pages. GitBook and others now generate it automatically. Variants include llms-full.txt (the entire docs rendered to Markdown).
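To make the shape of such an index concrete, here is a minimal Python sketch that renders an /llms.txt-style file from a list of pages. The layout (an H1 title, a blockquote summary, then H2 sections of links) follows the public llms.txt proposal; treat the exact details as an assumption. The `Page` type, the `render_llms_txt` function, and the example.com URLs are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """One documentation page to list in the index (hypothetical type)."""
    section: str
    title: str
    url: str
    description: str = ""


def render_llms_txt(site_name: str, summary: str, pages: list[Page]) -> str:
    """Render a Markdown index: H1 title, blockquote summary, H2 link sections."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    # Group pages by section, preserving the order in which sections first appear.
    sections: dict[str, list[Page]] = {}
    for page in pages:
        sections.setdefault(page.section, []).append(page)
    for section, items in sections.items():
        lines.append(f"## {section}")
        lines.append("")
        for page in items:
            suffix = f": {page.description}" if page.description else ""
            lines.append(f"- [{page.title}]({page.url}){suffix}")
        lines.append("")
    # End with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    demo = [
        Page("Docs", "Quickstart", "https://example.com/docs/quickstart.md", "Install and run"),
        Page("Docs", "API reference", "https://example.com/docs/api.md"),
        Page("Optional", "Changelog", "https://example.com/changelog.md"),
    ]
    print(render_llms_txt("Example Project", "Docs index for LLMs.", demo))
```

A docs build could call a function like this with its page list and write the result next to the rendered site.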

Why plain text won

  • Zero-ops & portable. A Markdown file commits to Git, diffs cleanly and works offline (useful if you're using local coding agents). Many new IDEs auto-discover these files by name.

  • Markdown as a schema. The general shape of all these formats is the same: sections of text the model can rely on. Formalisations like AGENTS.md are useful but don't really constrain you in any way.

  • Zero learning curve. It’s just text in the repo. No new DSL or plugin config, no schema to memorise. Edit in any editor, review in PRs, search with grep/rg, and copy between vendors unchanged. CI can lint size and required sections with the tools you already use.
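The CI lint mentioned in the last bullet can be very small. The sketch below is one possible shape, not a standard tool: the required section names and the 300-line budget are assumed policy choices, and `lint_context_file` is a hypothetical helper name.

```python
import re

# Assumed policy for illustration; pick whatever your team actually requires.
REQUIRED_SECTIONS = ("Project", "Conventions", "Guardrails")
MAX_LINES = 300


def lint_context_file(text: str, required=REQUIRED_SECTIONS, max_lines=MAX_LINES) -> list[str]:
    """Return human-readable problems; an empty list means the file passes."""
    problems = []
    lines = text.splitlines()
    if len(lines) >= max_lines:
        problems.append(f"too long: {len(lines)} lines (must stay under {max_lines})")
    # Collect level-2 headings such as "## Guardrails".
    # Note: this naive scan also counts headings inside code samples.
    headings = set()
    for line in lines:
        match = re.match(r"^##\s+(.+)$", line)
        if match:
            headings.add(match.group(1).strip().lower())
    for section in required:
        if section.lower() not in headings:
            problems.append(f"missing section: ## {section}")
    return problems


if __name__ == "__main__":
    sample = "# AGENTS.md\n\n## Project\nTypeScript monorepo.\n\n## Conventions\n- TS strict\n"
    for problem in lint_context_file(sample):
        print(problem)
```

In CI you would read the real file, print the problems, and exit non-zero when the list is not empty.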

The sharp edges

  • Duplication & drift. Teams end up with CLAUDE.md, GEMINI.md, AGENTS.md, .cursorrules, and .amazonq/rules that all say the same thing, then drift. Even Google’s CLI exposes a setting to accept multiple context filenames, which is a tell.

  • Ambiguous precedence. Multiple files across directories make it unclear what actually got loaded. Some tools do hierarchical search, but specifics vary.

  • Context window pressure. Shoving giant files into prompts burns tokens and can drown out local context. You need discipline and evals, both of which are often missing.

  • Security & governance. Markdown looks harmless, but these files can leak secrets and internal URLs, or encode compliance-relevant steps that never went through review. Treat them like code: Copilot’s repo-level instructions and Amazon Q’s rules should go through PRs like any other change.
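One way to take the guesswork out of precedence is to make discovery explicit and print what would be loaded. The Python sketch below walks from a working directory up to the repository root and lists matching files from outermost to innermost. It illustrates the general idea of hierarchical search only; it is not a description of how any particular tool resolves files, and `discover_context_files` is a hypothetical helper.

```python
from pathlib import Path

# File names taken from this post; real tools each look for their own names.
CONTEXT_NAMES = ("AGENTS.md", "CLAUDE.md", "GEMINI.md")


def discover_context_files(start: Path, names=CONTEXT_NAMES) -> list[Path]:
    """Walk from `start` up to the repo root (first directory containing .git)
    and return matching files ordered from outermost to innermost directory.
    If no .git directory is found, the walk continues to the filesystem root."""
    start = start.resolve()
    found_per_dir: list[list[Path]] = []
    for directory in [start, *start.parents]:
        hits = [directory / name for name in names if (directory / name).is_file()]
        found_per_dir.append(hits)
        if (directory / ".git").exists():
            break  # stop at the repository root
    # Reverse so broad (root) guidance comes first and local files come last.
    return [path for hits in reversed(found_per_dir) for path in hits]


if __name__ == "__main__":
    for path in discover_context_files(Path.cwd()):
        print(path)
```

Running something like this in CI or a pre-commit hook gives reviewers a concrete list to compare against what an agent reports having loaded.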

A simple decision guide

| Scenario | Recommendation |
| --- | --- |
| Single IDE/agent, small team | One repo file the tool auto-loads (vendor-specific: CLAUDE.md or GEMINI.md). Keep it < 300 lines. |
| Multiple agents/tools | Adopt AGENTS.md as the canonical source and generate vendor files as needed. Keep vendor files thin. |
| Docs on the web | Serve /llms.txt. Consider /llms-full.txt if your audience is AI agents, not humans. Automate generation in CI. |
| Dynamic data, tools | Move context & tools behind MCP and treat context files as pointers to these tools. |
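For the multi-tool case, generating vendor files from AGENTS.md can be a few lines of Python. The sketch below copies the canonical file verbatim under a "generated" header, which is the simplest way to rule out drift; whether a given tool accepts a shorter pointer file instead is tool-specific and not assumed here. The target list, the header text, and `sync_vendor_files` are illustrative choices, not part of any vendor's tooling.

```python
from pathlib import Path

# Vendor file names mentioned in this post; adjust to the tools you actually use.
VENDOR_FILES = ("CLAUDE.md", "GEMINI.md", ".github/copilot-instructions.md")
HEADER = "<!-- Generated from AGENTS.md. Edit AGENTS.md instead. -->\n\n"


def sync_vendor_files(repo: Path, targets=VENDOR_FILES, check_only=False) -> list[str]:
    """Copy AGENTS.md into each vendor file and return the targets that were stale.
    With check_only=True nothing is written, so CI can fail on drift."""
    canonical = (repo / "AGENTS.md").read_text(encoding="utf-8")
    expected = HEADER + canonical
    stale = []
    for name in targets:
        target = repo / name
        current = target.read_text(encoding="utf-8") if target.is_file() else None
        if current != expected:
            stale.append(name)
            if not check_only:
                target.parent.mkdir(parents=True, exist_ok=True)
                target.write_text(expected, encoding="utf-8")
    return stale
```

Locally you would call `sync_vendor_files(Path("."))` after editing AGENTS.md; in CI, `sync_vendor_files(Path("."), check_only=True)` returning a non-empty list is the signal to fail the build.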