Our CLAUDE.md Is 162 Lines. Here's What's In It — and What We Cut.

The Most Expensive File In Your Repo You Didn’t Know About

If you use Claude Code, your project-root CLAUDE.md is the single most expensive file in your repository. Not because of disk space — because every byte of it loads into context at every session start with no lazy loading. Skills load progressively. MCP tools can be deferred. CLAUDE.md just sits there, always loaded, always taxing your window.

Ours was 820 lines. We stripped it to 162 without losing a single guardrail. This post covers exactly what came out, what stayed, and the three patterns that did the work.

Headline numbers: 820 → 162 lines, ~8,000 → ~1,200 tokens, and a ~4% → ~0.6% share of the 200K context window.

Why CLAUDE.md Deserves Its Own Post

In our broader token optimization writeup, MCP Tool Search was the single biggest lever. CLAUDE.md is the second. Research on large Claude Code frameworks reports optimized root configs achieving 54–62% reduction in baseline overhead, and one documented anti-pattern — a 2,800-line CLAUDE.md — wasted roughly 62% of tokens per session.

The mechanism is the reason. CLAUDE.md content from each of the following sources loads unconditionally at session start:

Enterprise policy files
Project root CLAUDE.md
User-level ~/.claude/CLAUDE.md
Project-local overrides
Any file imported via the @path/to/file syntax (recursive up to 5 levels deep)

Everything else Claude Code does around context — progressive skill loading, MCP Tool Search, subagent isolation — is designed to dodge the always-loaded tax. CLAUDE.md is the one thing you pay for unconditionally, every single turn.

Our Before: 820 Lines, Roughly 4% Of Context

Our original CLAUDE.md read like a company handbook. It documented every skill we used, every release-gate step, every nuance of our WordPress Coding Standards configuration, every naming convention for React components, every admin-asset scoping rule. It was genuinely useful reference material. It was also loaded into every session, including sessions that had nothing to do with most of it.

Here’s a rough accounting of what was in it:

Section	Lines	Approx tokens	Load frequency justified?
Workspace map and project structure	90	900	Yes — context for any session
Architecture and tech stack	60	600	Yes
Privacy rules (non-negotiable)	40	400	Yes — critical invariants
Per-skill descriptions (30+ entries)	180	1,800	No — individual skills load on demand
Per-workflow protocols (detailed)	200	2,000	No — needed only during that workflow
Testing standards in detail	120	1,200	No — relevant only to test-writing sessions
Release gate walkthrough	80	800	No — lives in the release skill
Troubleshooting notes	50	500	No — rarely needed

Everything marked “No” was paying tax on every session to be available for a fraction of them. That’s roughly 6,300 tokens of waste — about 3% of the entire context window, permanently.
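The accounting above is easy to re-derive. Here is a minimal sketch that sums the table and computes the share of the window taken by the rows marked "No". The line and token figures are copied from the table; the variable names are illustrative, and the 200K window is the one quoted in this post.

```python
# Rows copied from the table above: (section, lines, approx tokens, justified?)
SECTIONS = [
    ("Workspace map and project structure", 90, 900, True),
    ("Architecture and tech stack", 60, 600, True),
    ("Privacy rules (non-negotiable)", 40, 400, True),
    ("Per-skill descriptions (30+ entries)", 180, 1_800, False),
    ("Per-workflow protocols (detailed)", 200, 2_000, False),
    ("Testing standards in detail", 120, 1_200, False),
    ("Release gate walkthrough", 80, 800, False),
    ("Troubleshooting notes", 50, 500, False),
]

CONTEXT_WINDOW = 200_000  # tokens, as quoted in the post

total_lines = sum(lines for _, lines, _, _ in SECTIONS)
total_tokens = sum(tokens for _, _, tokens, _ in SECTIONS)

# Tokens paid on every session for content that most sessions never need.
wasted_tokens = sum(tokens for _, _, tokens, ok in SECTIONS if not ok)
wasted_share = wasted_tokens / CONTEXT_WINDOW

print(f"{total_lines} lines, {total_tokens} tokens in total")
print(f"{wasted_tokens} tokens unjustified = {wasted_share:.2%} of the window")
```

The same loop works as a quick audit template: list your own sections, mark each one as justified or not, and see what the "not" rows cost.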

Our After: 162 Lines, Roughly 0.6%

The new CLAUDE.md covers the same surface area with three patterns: a trigger table, nested CLAUDE.md files, and path-scoped rules.

Pattern 1: The Skill Trigger Table

Instead of documenting every skill with a paragraph, we replaced ~30 per-skill descriptions with a single lookup table:

## Skill triggers
| Trigger keywords | Skill | Domain |
|------------------|-------|--------|
| sprint, backlog, iteration | pm-sprint-plan | PM |
| story, acceptance criteria | pm-story-write | PM |
| deploy, release, ship | statnive-release | Dev |
| security, audit, CVE | sec-audit-remediate | Security |
| wireframe, mockup | ux-design | UX |
| benchmark, performance | wp-performance | WP |

Research on large frameworks reports this pattern costing ~800 tokens versus 3,000+ for verbose per-skill prose. Claude’s routing still works correctly because the trigger keywords are what it actually matches against — the long-form “when to use this skill and when not to” prose we used to write never helped routing, it just filled context.

All the detailed “when to use / when not to use” content lives in the individual SKILL.md files, loaded only when Claude routes to them. The trigger table is the index; the skills are the chapters.

Pattern 2: Nested CLAUDE.md Files For Domain Rules

Here’s the critical property most developers miss: nested CLAUDE.md files in subdirectories are lazily loaded. They enter context only when Claude reads files in those subtrees.

Our repo layout now looks like:

statnive-workflow/
├── CLAUDE.md                    # 162 lines — global only
├── statnive/
│   ├── CLAUDE.md                # PHP / WordPress plugin conventions
│   ├── src/
│   └── tests/
├── statnive-website/
│   └── CLAUDE.md                # Astro / MDX / frontend conventions
└── jaan-to/
    └── CLAUDE.md                # AI framework conventions

When we’re writing PHP for the plugin, statnive/CLAUDE.md loads automatically and brings in our WordPress Coding Standards notes, the $wpdb->prepare() enforcement rule, and the admin-asset scoping rule. When we’re writing MDX blog content, statnive-website/CLAUDE.md loads with Astro conventions and house-style notes. Neither interferes with the other.

This pattern cut about 2,200 tokens of “only relevant sometimes” content from the root.
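One way to picture the lazy-loading behavior is as a walk from the repo root down to the file being read, collecting any CLAUDE.md found along the way. The sketch below is only a mental model under that assumption, not Claude Code's actual implementation. The function name and the set of directories are illustrative, with the directories taken from the layout above.

```python
from pathlib import PurePosixPath

# Directories in the layout above that contain a CLAUDE.md ("." is the repo root).
DIRS_WITH_CLAUDE_MD = {".", "statnive", "statnive-website", "jaan-to"}


def claude_md_in_scope(file_path: str, dirs_with_claude_md: set[str]) -> list[str]:
    """List the CLAUDE.md files on the path from the repo root to file_path."""
    current = PurePosixPath(".")
    candidates = [current]
    # Build the chain of ancestor directories: ".", "statnive", "statnive/src", ...
    for part in PurePosixPath(file_path).parent.parts:
        current = current / part
        candidates.append(current)
    return [
        str(directory / "CLAUDE.md")
        for directory in candidates
        if str(directory) in dirs_with_claude_md
    ]


print(claude_md_in_scope("statnive/src/Admin.php", DIRS_WITH_CLAUDE_MD))
print(claude_md_in_scope("statnive-website/src/content/post.mdx", DIRS_WITH_CLAUDE_MD))
```

Under this model, a PHP session never pays for the Astro conventions and an MDX session never pays for the WordPress ones, while the root file appears in both lists.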

Pattern 3: Path-Scoped Rules In `.claude/rules/`

The third mechanism is .claude/rules/ — rule files with YAML frontmatter that can declare which paths they apply to:

---
paths: ["statnive-website/src/**/*.tsx", "statnive-website/src/**/*.astro"]
---
Use Tailwind utilities scoped under `#statnive-app`. Never import the default
Tailwind bundle — it restyles the WordPress admin chrome.

Rules with a paths: field load conditionally — only when Claude is working with matching files. Rules without a paths: field load unconditionally, so we keep those to a tight handful (privacy rules, security rules, commit conventions).

We moved about 1,800 tokens of framework-specific conventions out of the root CLAUDE.md into path-scoped rules under .claude/rules/. They still fire reliably when relevant, and they cost nothing when irrelevant.

The Anti-Pattern We Removed: `@-Imports`

Our original CLAUDE.md had three @-imports:

@jaan-to/docs/best-practices.md
@jaan-to/context/tech.md
@jaan-to/outputs/ROADMAP.md

It looks tidy. It is a trap. The @ syntax recursively loads the full content of each target file on every session, up to five levels deep. Our three imports alone were adding roughly 6,000 tokens of permanent overhead for content the model needed on maybe one in twenty sessions.

We replaced the imports with plain-text pointers:

For architectural context see `jaan-to/context/tech.md`.
For the current roadmap see `jaan-to/outputs/ROADMAP.md`.

Claude reads these paths when it actually needs them — via the Read tool, on demand. Zero baseline cost, same practical outcome.

What Still Lives In The Root File

After the cuts, the 162-line root CLAUDE.md contains exactly:

Section	Lines	Why it stayed
Product philosophy (8 principles from research)	28	Shapes every decision; must influence all sessions
Workspace map	18	One-glance orientation, loaded for any session
Privacy rules (non-negotiable)	16	Critical invariants — cookies, raw IPs, salts, SHA-256
Admin asset scoping rule	22	Has bitten us before, applies broadly
Commit policy (co-authored-by trailer is banned)	8	Global git convention
`/simplify` workflow rule	18	Enforcement of our quality gate
Skill trigger table	32	Replaces ~30 verbose per-skill sections
Key paths (pointers, not imports)	20	One-line references to docs

Everything else moved into nested CLAUDE.md files, path-scoped rules, or individual skill bodies.

What We Did Not Optimize

Honest caveats, same pattern as the rest of the series:

User-level ~/.claude/CLAUDE.md is out of our control. Our engineers each have their own global config, and those load on top of the project file. Research calls this out as a common source of hidden overhead — a global CLAUDE.md containing “remember to run tests before committing” on top of a project-level workflow doing exactly that costs tokens without adding signal. We asked team members to audit theirs. Yours is probably worth a look.
We do not have a CI gate on root CLAUDE.md size yet. Research suggests failing the PR if root CLAUDE.md plus any @-imports exceed ~2,500 tokens. For now we enforce the budget with manual /context spot-checks; a CI gate is on the roadmap.
IMPORTANT and YOU MUST markers are tempting. Anthropic internally runs CLAUDE.md files through their prompt improver and uses emphasis markers for critical rules. We use them sparingly — every one is permanent overhead, so we reserve them for the privacy rules (no cookies, no raw IPs) and the commit policy (no co-authored-by trailer).
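The CI gate mentioned in the caveats above could start as small as the following sketch. It is not something we run today. The ~2,500-token budget is the figure quoted above; the four-characters-per-token estimate and the function names are assumptions, and a real gate would want an actual tokenizer and would also need to expand any @-imports.

```python
from pathlib import Path

TOKEN_BUDGET = 2_500  # suggested ceiling for root CLAUDE.md plus imports


def estimate_tokens(text: str) -> int:
    """Rough estimate: about four characters per token (heuristic only)."""
    return len(text) // 4


def within_budget(text: str, budget: int = TOKEN_BUDGET) -> bool:
    return estimate_tokens(text) <= budget


def main(paths: list[str]) -> int:
    """Return a process exit code: 0 if the combined files fit, 1 otherwise."""
    combined = "\n".join(Path(p).read_text(encoding="utf-8") for p in paths)
    tokens = estimate_tokens(combined)
    print(f"always-loaded config: ~{tokens} tokens (budget {TOKEN_BUDGET})")
    return 0 if within_budget(combined) else 1


# In CI this might be called as: raise SystemExit(main(["CLAUDE.md"]))
```

Returning an exit code from `main` rather than calling `sys.exit` inside it keeps the check easy to unit-test and easy to wire into whatever CI runner is in use.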

The Measurement Step You Can’t Skip

If you’re auditing your own CLAUDE.md, the one command that matters is /context. Run it at the very start of a fresh session, before any prompts. It breaks down your context usage by source: system prompt, tools, memory (that’s CLAUDE.md), skills metadata, and MCP tool schemas.

Numbers we aim for on a 200K-window session:

Source	Target	Hard cap
Root CLAUDE.md + unscoped rules	≤ 1,500 tokens	2,500
Skill metadata	≤ 2,500 tokens	5,000
MCP tool schemas (with Tool Search)	≤ 3,000 tokens	8,000
Hook stdout (SessionStart, UserPromptSubmit)	0 tokens	300 per hook

If your /context reading exceeds any of these targets by more than ~50%, you have the same problem we had. The fixes are the three patterns above plus MCP Tool Search.
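The table translates directly into a lookup you can keep next to your audit notes. In this sketch the readings are typed in by hand from /context; it does not parse the command's output, whose format is not assumed here. The three status labels are illustrative.

```python
# (target, hard cap) in tokens, copied from the table above.
BUDGETS = {
    "Root CLAUDE.md + unscoped rules": (1_500, 2_500),
    "Skill metadata": (2_500, 5_000),
    "MCP tool schemas (with Tool Search)": (3_000, 8_000),
    "Hook stdout (per hook)": (0, 300),
}


def classify(source: str, tokens: int) -> str:
    """Compare one hand-entered /context reading against its target and cap."""
    target, cap = BUDGETS[source]
    if tokens <= target:
        return "within target"
    if tokens <= cap:
        return "over target"
    return "over hard cap"


# Example readings: our root file after and before the cleanup.
print(classify("Root CLAUDE.md + unscoped rules", 1_200))
print(classify("Root CLAUDE.md + unscoped rules", 8_200))
```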

Why This Matters For The People Running Statnive

Our CLAUDE.md is public in the Statnive repo on GitHub — same reason our test pipeline is public. A tight root config means faster sessions, cheaper runs, and a model that actually reads what matters instead of drowning in “always available, rarely needed” reference material. That efficiency compounds into the same thing our users care about: a plugin that ships often and stays under its 5KB tracker budget.

Try Statnive

Privacy-first WordPress analytics, built by a team that takes token budgets as seriously as performance budgets. Install Statnive free from WordPress.org — your data stays on your server, our engineering practices stay on GitHub for anyone to audit.