System Prompt Assembly

Claude Code does not ship with a single static system prompt. Instead, getSystemPrompt() in src/constants/prompts.ts assembles the prompt at runtime from 15+ discrete sections, splitting them into a globally cacheable static half and a per-session dynamic half. Understanding this architecture is key to optimising your CLAUDE.md placement and minimising cache misses.

The Two-Part Architecture

Every system prompt is divided by a hard boundary marker into two halves. Everything above the marker is identical across all sessions for a given build, so it can be cached once and reused. Everything below changes per session, per project, or per user.

src/constants/prompts.ts
```typescript
const CACHE_BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__";

function getSystemPrompt(config: SessionConfig): SystemPromptParts {
  const staticSections = [
    identitySection,  // 1. Identity + CYBER_RISK_INSTRUCTION
    behaviourRules,   // 2. System behaviour rules
    taskExecution,    // 3. Task execution guidelines
    safetyGuidance,   // 4. Safety & reversibility
    toolUsage,        // 5. Tool usage guidance
    toneAndStyle,     // 6. Tone & style
  ];

  const dynamicSections = [
    toolTips(config.tools),    // 7. Session-specific tool tips
    claudeMd(config.project),  // 8. MEMORY.md + CLAUDE.md
    envInfo(config.env),       // 9. CWD, git, OS, model
    languagePref,              // 10. Language preference
    outputStyle,               // 11. Output style
    mcpInstructions,           // 12. MCP server instructions
    scratchpadGuidance,        // 13. Scratchpad guidance
    tokenManagement,           // 14. Token management hints
  ];

  return {
    static: staticSections.join("\n"),
    boundary: CACHE_BOUNDARY,
    dynamic: dynamicSections.join("\n"),
  };
}
```
Why Two Parts?

Anthropic's API supports prompt caching. By keeping sections 1-6 identical across sessions, the first ~60% of the system prompt hits the cache on every request. The dynamic half (sections 7-14+) is rebuilt per session but is smaller, so the cache miss cost is contained.


Static Sections (Globally Cacheable)

These six sections are identical for every user running the same build and do not change between sessions.

STATIC HALF -- Cached Globally Per Build

1. Identity + CYBER_RISK_INSTRUCTION -- who am I, what am I not allowed to do
2. System Behaviour Rules -- core operational constraints
3. Task Execution Guidelines -- how to approach work step-by-step
4. Safety & Reversibility Guidance -- prefer reversible actions, ask before destructive ops
5. Tool Usage Guidance -- when and how to use each tool
6. Tone & Style -- conciseness, formatting, language register
| # | Section | Purpose | Key Contents |
|---|---------|---------|--------------|
| 1 | Identity | Establish who Claude is | Name, model, CYBER_RISK_INSTRUCTION block (security hardening) |
| 2 | Behaviour Rules | Core operational constraints | Never fabricate, admit uncertainty, respect boundaries |
| 3 | Task Execution | How to approach work | Read before editing, verify after changes, prefer small diffs |
| 4 | Safety Guidance | Reversibility rules | Prefer non-destructive git ops, ask before deleting, sandbox awareness |
| 5 | Tool Usage | Per-tool instructions | When to use Bash vs Read vs Grep, file editing best practices |
| 6 | Tone & Style | Output formatting | Be concise, no preamble, no unnecessary apologies, markdown formatting |

The Cache Boundary

Between the static and dynamic halves sits a sentinel string that the caching layer uses to split the prompt:

Cache Marker
__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__

The API client splits on this marker. Everything before it is sent with cache_control: { "type": "ephemeral" }, telling the Anthropic API to cache that prefix. Everything after is sent as a normal (uncached) system block. This means the static half is a write-once, read-many cache entry -- typically amortised across hundreds of requests.
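A minimal sketch of how a client could perform this split (the `SystemBlock` shape and `splitAtBoundary` helper are illustrative, not the actual client code; the boundary constant is repeated here so the sketch is self-contained):

```typescript
const CACHE_BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__";

interface SystemBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

function splitAtBoundary(fullPrompt: string): SystemBlock[] {
  const idx = fullPrompt.indexOf(CACHE_BOUNDARY);
  if (idx === -1) {
    // No marker: send the whole prompt as a single uncached block.
    return [{ type: "text", text: fullPrompt }];
  }
  const staticHalf = fullPrompt.slice(0, idx);
  const dynamicHalf = fullPrompt.slice(idx + CACHE_BOUNDARY.length);
  return [
    // Everything before the marker is flagged cacheable.
    { type: "text", text: staticHalf, cache_control: { type: "ephemeral" } },
    // Everything after is sent as a normal, uncached block.
    { type: "text", text: dynamicHalf },
  ];
}
```

The key property is that the static prefix is byte-identical across requests, so the cache key remains stable until a new build changes sections 1-6.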

Cache-Breaking Gotcha

Your CLAUDE.md lives in the dynamic half (section 8). Every edit to CLAUDE.md changes the dynamic portion, but does not break the static cache. However, if Anthropic ships a new build that changes sections 1-6, every user's static cache is invalidated globally.


Dynamic Sections (Per-Session)

These sections are rebuilt for each session based on your project, environment, and configuration.

DYNAMIC HALF -- Rebuilt Per Session

7. Session-Specific Tool Tips -- tips for currently available tools
8. MEMORY.md + CLAUDE.md -- YOUR INSTRUCTIONS HERE
9. Environment Info -- CWD, git status, OS, shell, model name
10. Language Preference -- user's preferred language (if set)
11. Output Style -- conciseness level, formatting rules
12. MCP Server Instructions -- connected MCP servers & their tools
13. Scratchpad Guidance -- how to use the scratchpad tool
14. Token Management Hints -- budget awareness, output length guidance
| # | Section | Varies By | Details |
|---|---------|-----------|---------|
| 7 | Tool Tips | Available tools | Generated from the tool set actually loaded for this session (MCP tools, built-in tools, agent-restricted tools) |
| 8 | MEMORY.md + CLAUDE.md | Project + user | Three tiers loaded: ~/.claude/CLAUDE.md (global), .claude/CLAUDE.md (project), CLAUDE.md (workspace root). Plus MEMORY.md files from ~/.claude/projects/ |
| 9 | Environment Info | Machine + session | Working directory, git branch/status, OS name + version, shell, active model ID |
| 10 | Language Preference | User setting | If set, instructs Claude to respond in that language. Defaults to English. |
| 11 | Output Style | User setting | Conciseness level (concise, normal, verbose), code-only mode |
| 12 | MCP Instructions | Connected servers | Descriptions, tool schemas, and usage hints for each MCP server. Uses DANGEROUS_uncachedSystemPromptSection |
| 13 | Scratchpad | Feature flag | Guidance on using the scratchpad tool for persistent notes within a session |
| 14 | Token Management | Model + context | Budget hints, output length targets, compaction awareness |
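The three-tier CLAUDE.md loading described for section 8 can be sketched as an ordered path list, from most global to most local (the helper name and exact resolution order are assumptions based on the tiers above, not the actual implementation):

```typescript
// Illustrative only: candidate CLAUDE.md locations in precedence order.
// When concatenated into section 8, later (more local) files would
// override or extend earlier (more global) ones.
function claudeMdCandidates(homeDir: string, projectRoot: string): string[] {
  return [
    `${homeDir}/.claude/CLAUDE.md`,      // global, user-wide
    `${projectRoot}/.claude/CLAUDE.md`,  // project-level
    `${projectRoot}/CLAUDE.md`,          // workspace root
  ];
}
```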

System Prefix Selection

Before any sections are assembled, the system selects one of three prefix variants based on how Claude Code was invoked. The prefix is prepended to the entire system prompt.

| Variant | When Used | Key Difference |
|---------|-----------|----------------|
| Default CLI | Standard claude command | Full interactive identity: "You are Claude Code, an interactive CLI tool..." |
| Agent SDK (Claude Code) | Used via the Agent SDK as a Claude Code instance | Adds SDK-specific constraints, preserves Claude Code identity |
| Agent SDK (Generic) | Used via the Agent SDK as a generic agent | Minimal identity, no Claude Code branding: "You are an AI agent..." |
Prefix Selection Logic
```typescript
function getSystemPrefix(mode: InvocationMode): string {
  switch (mode) {
    case "cli":
      return "You are Claude Code, a CLI-based AI coding assistant...";
    case "agent-sdk-claude-code":
      return "You are Claude Code, running via the Agent SDK...";
    case "agent-sdk-generic":
      return "You are an AI agent with access to tools...";
  }
}
```
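To make "prepended to the entire system prompt" concrete, a hedged sketch (the `buildFullPrompt` helper is an assumption; the real assembly may join parts differently):

```typescript
// Sketch: the selected prefix string leads the prompt, followed by the
// static half, the cache boundary marker, and the dynamic half.
function buildFullPrompt(
  prefix: string,
  parts: { static: string; boundary: string; dynamic: string }
): string {
  return [prefix, parts.static, parts.boundary, parts.dynamic].join("\n");
}
```

Because the prefix sits above the cache boundary, switching invocation modes changes the static half and therefore uses a different cache entry.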

Knowledge Cutoff Dates

The system prompt includes a hardcoded knowledge cutoff date that varies by model. This is injected into the environment info section so Claude knows the limits of its training data.

| Model | Cutoff Date | Notes |
|-------|-------------|-------|
| claude-sonnet-4-6 | August 2025 | Most recent training data |
| claude-opus-4-6 | May 2025 | Same cutoff as Opus 4.5 |
| claude-opus-4-5 | May 2025 | -- |
| claude-haiku-4 | February 2025 | Used by Explore agent, cheapest option |
Source: Knowledge Cutoff Map
```typescript
const KNOWLEDGE_CUTOFFS: Record<string, string> = {
  "claude-sonnet-4-6": "August 2025",
  "claude-opus-4-6": "May 2025",
  "claude-opus-4-5": "May 2025",
  "claude-haiku-4": "February 2025",
};
```
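A sketch of how the cutoff might be injected into the environment info section (the `envCutoffLine` helper, its wording, and the fallback for unknown model IDs are all assumptions; the map is repeated so the sketch is self-contained):

```typescript
const KNOWLEDGE_CUTOFFS: Record<string, string> = {
  "claude-sonnet-4-6": "August 2025",
  "claude-opus-4-6": "May 2025",
  "claude-opus-4-5": "May 2025",
  "claude-haiku-4": "February 2025",
};

// Look up the cutoff for the active model; unknown IDs fall back to a
// neutral string rather than omitting the line entirely.
function envCutoffLine(modelId: string): string {
  const cutoff = KNOWLEDGE_CUTOFFS[modelId] ?? "unknown";
  return `Knowledge cutoff: ${cutoff}`;
}
```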

Feature-Gated Sections

Several prompt sections are conditionally included based on feature flags (via GrowthBook), user type, or environment variables. These sections only appear when their gate is satisfied.

| Feature | Gate | What It Adds |
|---------|------|--------------|
| Proactive Mode | Feature flag | Enables tick prompts: Claude can act autonomously between user messages, polling for state changes and taking action without being asked |
| Verification Agent | Ant-only A/B test | Background agent that verifies outputs. Only available to internal Anthropic users as part of an A/B experiment |
| Token Budget | Feature flag | Injects explicit token budget targets ("aim for under N tokens in this response") to help control output length |
| Numeric Length Anchors | Internal builds | Hardcoded numeric targets like "keep responses under 4000 tokens". Reduced output by ~1.2% in internal testing |
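Of these gates, the token budget injection is the simplest to picture. A hedged sketch (the function name, flag shape, and exact wording are assumptions):

```typescript
// Illustrative sketch of a gated section: when the flag is off, the
// section is omitted entirely; when on, an explicit numeric target is
// injected into the dynamic half.
function tokenBudgetSection(enabled: boolean, maxTokens: number): string | null {
  if (!enabled) return null;
  return `Aim for under ${maxTokens} tokens in this response.`;
}
```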

Proactive Mode Deep Dive

When proactive mode is enabled, an additional section is injected that instructs Claude to use "tick prompts" -- periodic self-triggered turns where Claude can check build status, test results, or file system changes and respond without waiting for the user.

Proactive Mode Prompt Injection
```typescript
// When the proactive mode feature flag is enabled:
if (features.proactiveMode) {
  dynamicSections.push(`
You have proactive mode enabled. You may take autonomous actions
between user messages when you detect something that needs attention.
Use tick prompts to poll for changes in build output, test results,
or file system state.
`);
}
```

Numeric Length Anchors

Internal Anthropic builds include specific numeric targets in the style section. These anchors measurably reduced output tokens by approximately 1.2% in A/B testing. The technique works because concrete numbers are harder for the model to rationalise away than vague instructions like "be concise".

Steal This: Add Numeric Anchors to CLAUDE.md

You can replicate the internal build's length anchors in your own CLAUDE.md. Instead of "be concise", use specific numbers: "Keep responses under 100 words. Keep text between tool calls under 25 words." This exploits the same mechanism Anthropic uses internally.
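For example, a CLAUDE.md fragment using this technique might look like the following (the numbers are illustrative starting points, not values from Anthropic's builds; tune them for your workflow):

```markdown
## Output length
- Keep responses under 100 words.
- Keep text between tool calls under 25 words.
```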

Impact: Medium

Ant-Only vs External Builds

The prompt assembly checks a USER_TYPE gate to determine whether the user is an internal Anthropic employee ("ant") or an external user. Several sections differ between the two builds.

| Feature | External Build | Ant-Only Build |
|---------|----------------|----------------|
| Verification Agent | Not available | Available (A/B tested) |
| Numeric Length Anchors | Not included | Included (reduced output ~1.2%) |
| Explore Agent Model | Haiku | May use higher-tier models |
| Feature Flag Overrides | Standard GrowthBook | Additional internal-only flags |
| Debug Sections | Stripped | Optional debug prompt sections available |
USER_TYPE Gate
```typescript
const isInternal = process.env.USER_TYPE === "ant";

if (isInternal) {
  staticSections.push(numericLengthAnchors);

  // Verification agent A/B test gate
  if (features.verificationAgent) {
    dynamicSections.push(verificationAgentSection);
  }
}
```

Caching Strategy

The prompt uses section-level caching. Each section is individually cacheable, and the overall prompt is split at the boundary marker for API-level caching. However, one notable exception exists.

Static sections (1-6): Cached globally per build version. Identical for all users on the same build.
Dynamic sections (7-14): Cached per session. Rebuilt when tools, CLAUDE.md, or environment change.
MCP section (12): Uses DANGEROUS_uncachedSystemPromptSection -- deliberately excluded from caching because MCP server availability can change mid-session.
Why DANGEROUS_uncachedSystemPromptSection?

The MCP instructions section is the only section that uses this escape hatch. MCP servers can connect and disconnect during a session, so caching their instructions would cause Claude to reference tools that no longer exist (or miss tools that just became available). The "DANGEROUS" prefix is a deliberate naming convention to discourage casual use -- it forces cache misses on every request.
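One way to picture the escape hatch is as a per-section cacheability flag. A sketch under the assumption that sections carry such a flag (the `PromptSection` shape and `cachedSection` helper are illustrative; only the DANGEROUS_ name comes from the source):

```typescript
// Hypothetical shape for a prompt section that can opt out of caching.
interface PromptSection {
  text: string;
  cacheable: boolean;
}

function cachedSection(text: string): PromptSection {
  return { text, cacheable: true };
}

// Deliberately awkward name: every use is a guaranteed cache miss, so the
// MCP instructions are rebuilt and re-sent on every request, always
// reflecting the servers connected right now.
function DANGEROUS_uncachedSystemPromptSection(text: string): PromptSection {
  return { text, cacheable: false };
}
```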

Practical Insight

Since CLAUDE.md is in the dynamic half (section 8), editing it between turns does not break the static cache. But it does invalidate the per-session dynamic cache, meaning the next API call after a CLAUDE.md edit will be slightly more expensive (no dynamic cache hit). For this reason, avoid editing CLAUDE.md in tight loops or automation.


Full Assembly Flow

Putting it all together, here is the complete assembly pipeline from invocation to API call:

1. Detect invocation mode
2. Select system prefix variant
3. Assemble static sections 1-6
4. Check feature flags + USER_TYPE
5. Insert CACHE_BOUNDARY marker
6. Assemble dynamic sections 7-14+
7. Inject gated sections if active
8. Send to API (static half cached)
Key Takeaway

The system prompt is not a monolith. It is a pipeline of conditional sections, split by a cache boundary, with feature-gated additions and a deliberate uncached escape hatch for MCP. Understanding this structure lets you write CLAUDE.md instructions that complement (rather than duplicate) the built-in sections, and helps you predict when cache misses will increase your API costs.