Hacks & Optimisations
Practical, actionable tricks extracted from Claude Code's source. These are undocumented environment variables, hidden commands, and configuration patterns that let you control behaviour, cut costs, and work more efficiently.
Environment Variable Reference
All undocumented environment variables extracted from the source. Each variable is explained in detail below the table.
| Variable | Default | Category | Description | Impact |
|---|---|---|---|---|
| `CLAUDE_CODE_SIMPLE` | unset | Prompt | Strips system prompt to minimal version | High |
| `CLAUDE_CODE_AUTO_COMPACT_WINDOW` | ctx - 33K | Compaction | Override auto-compact threshold (tokens) | Med |
| `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` | unset | Compaction | Trigger compaction at % of context used | Med |
| `DISABLE_AUTO_COMPACT` | unset | Compaction | Disable auto-compaction, keep manual | Med |
| `DISABLE_COMPACT` | unset | Compaction | Disable all compaction entirely | Low |
| `CLAUDE_CODE_DISABLE_1M_CONTEXT` | unset | Context | Force 200K window even on 1M models | High |
| `CLAUDE_CODE_MAX_CONTEXT_TOKENS` | model max | Context | Hard-cap maximum context tokens | Med |
| `CLAUDE_CODE_DISABLE_THINKING` | unset | Thinking | Disable extended thinking | Med |
| `DISABLE_INTERLEAVED_THINKING` | unset | Thinking | Disable reasoning between tool calls | Low |
| `DISABLE_BACKGROUND_TASKS` | unset | Background | Disable all background processes | Med |
| `CLAUDE_CODE_DISABLE_AUTO_MEMORY` | unset | Background | Disable auto-memory extraction only | Low |
| `CLAUDE_CODE_COORDINATOR_MODE` | unset | Mode | Multi-agent orchestrator mode | High |
| `CLAUDE_CODE_FORK_SUBAGENT` | unset | Mode | Fork full context into background agent | Med |
Set these as environment variables before launching Claude Code. You can either export them for the whole shell session or prefix them inline for a single run (e.g., `CLAUDE_CODE_SIMPLE=1 claude`).
Detailed Variable Reference
CLAUDE_CODE_SIMPLE
What it does: Strips the system prompt down to a minimal version. The full system prompt is massive (15+ sections covering identity, security guardrails, tool usage rules, task execution guidelines, style rules, and more). This flag bypasses nearly all of it, reducing input tokens significantly and making each API call faster and cheaper.
When to use: Scripted/automated pipelines where you send structured prompts and don't need the interactive hand-holding. Also useful for batch operations where you're driving Claude with precise instructions and want maximum throughput.
Not recommended for interactive use. Without its full instruction set, Claude loses all guardrails, style guidance, and tool usage tips. It may use tools incorrectly (e.g., running grep via Bash instead of using the Grep tool), skip safety checks on destructive operations, produce inconsistent output styles, and forget conventions like reading files before editing them.
CLAUDE_CODE_AUTO_COMPACT_WINDOW
What it does: Overrides the effective context window size used to calculate when auto-compaction triggers. The default formula is modelContextWindow - 33K. Setting this lower (e.g., 150000) forces compaction to trigger earlier, keeping per-turn token costs down.
When to use: Long-running sessions where you want to stay well under the context limit to maintain fast response times and avoid hitting the blocking limit.
Setting this too low causes frequent compactions, which means you lose conversation detail more often. Each compaction replaces your full history with a 9-section summary, so nuance and intermediate reasoning get lost. You may find Claude asking to re-read files it already had in context.
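The trigger math described above can be sketched directly; this treats the doc's "33K" as 33000 tokens:

```shell
# Auto-compact trigger = effective context window minus 33000 tokens.
default_trigger() {
  echo $(( $1 - 33000 ))
}

default_trigger 200000    # standard 200K window -> 167000
default_trigger 1000000   # extended 1M window   -> 967000
```

By the same formula, setting `CLAUDE_CODE_AUTO_COMPACT_WINDOW=150000` would move the trigger down to 117000 tokens.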
CLAUDE_AUTOCOMPACT_PCT_OVERRIDE
What it does: Triggers auto-compaction when context usage reaches a specific percentage of the window, rather than using the absolute token threshold. For example, 85 means compact when 85% of the context window is used.
When to use: When you want a predictable compaction point that scales automatically regardless of which model or context window size you're using. This is often simpler to reason about than an absolute token number.
DISABLE_AUTO_COMPACT
What it does: Disables the automatic compaction trigger entirely. Compaction will only happen when you explicitly run /compact. The silent microcompaction (which cleans old tool results before each API call) still runs.
When to use: When you want full control over when context is summarised -- useful for sessions where you need Claude to retain precise details of earlier work and you're willing to manage the context window manually.
If you forget to manually compact, the conversation will eventually hit the blocking limit and Claude will be unable to respond until you run /compact. Long uncompacted contexts also cost significantly more per turn since every API call includes the full history.
DISABLE_COMPACT
What it does: Disables all compaction -- both automatic and manual. The /compact command itself becomes a no-op. No summarisation, no microcompaction, nothing.
When to use: Short, scripted sessions where you know the conversation will stay well within the context window and you want zero overhead from compaction logic.
If the conversation exceeds the context window, Claude will hit the blocking limit with no way to recover other than restarting the session. Only use this when you are certain the session will be short-lived.
CLAUDE_CODE_DISABLE_1M_CONTEXT
What it does: Forces Claude Code to use the standard 200K context window even when the model supports the extended 1M window. This means auto-compaction triggers at ~167K tokens instead of ~967K.
When to use: When you want to cap costs. The 1M context window means each turn can include up to 5x more tokens, which directly multiplies your API costs. Using the 200K window keeps per-turn costs predictable.
You lose the ability to hold large codebases or long conversations in context. If your work requires cross-referencing many files simultaneously, the 200K limit can force frequent compactions that lose important context.
CLAUDE_CODE_MAX_CONTEXT_TOKENS
What it does: Hard-caps the maximum context token count, overriding whatever the model's native window supports. For example, setting 500000 on a 1M model means compaction triggers at ~467K instead of ~967K.
When to use: A more flexible alternative to CLAUDE_CODE_DISABLE_1M_CONTEXT. Lets you pick an exact ceiling -- useful for budgeting or when you want some extended context but not the full 1M.
CLAUDE_CODE_DISABLE_THINKING
What it does: Disables Claude's extended thinking (the internal reasoning step before generating a response). Without extended thinking, Claude skips the deliberation phase and responds immediately, saving output tokens and latency.
When to use: Straightforward, repetitive tasks where the extra reasoning doesn't improve quality -- batch file edits, simple code generation, or scripted pipelines.
Extended thinking is what gives Claude the ability to reason through complex multi-step problems, architectural decisions, and debugging. Disabling it noticeably degrades quality on anything non-trivial. Claude is more likely to make mistakes, miss edge cases, and produce shallow solutions.
DISABLE_INTERLEAVED_THINKING
What it does: Prevents Claude from reasoning between tool calls. Normally, Claude can "think" between each tool use (e.g., reason about a file's contents before editing it). This flag forces Claude to chain tool calls without intermediate reasoning.
When to use: When you want faster tool execution chains and are willing to trade reasoning quality. Works well for predictable, linear workflows where Claude doesn't need to adapt between steps.
Without interleaved thinking, Claude cannot course-correct between tool calls. If one tool returns unexpected results, Claude will blindly proceed with the next step rather than reconsidering its approach. This can lead to cascading errors in multi-step tasks.
DISABLE_BACKGROUND_TASKS
What it does: Prevents Claude Code from spawning any background processes. This includes auto-memory extraction (the forked sub-agent that analyses conversations for memories), the "dream" system (background processing during idle periods), and any other background agent spawns.
When to use: Short scripted sessions, CI/CD pipelines, or any automated workflow where background processing is pure overhead. Also useful if you're debugging and want to eliminate all non-deterministic background activity.
You lose all automatic memory extraction. Claude won't remember anything from the session for future conversations unless you explicitly ask it to save a memory. The dream system, which pre-processes tasks during idle time, is also disabled.
CLAUDE_CODE_DISABLE_AUTO_MEMORY
What it does: Specifically disables the auto-memory extraction sub-agent without affecting other background tasks. The extraction agent normally forks after every N turns to analyse the conversation for memory-worthy information (corrections, preferences, project context).
When to use: When you want the dream system and other background tasks to remain active but don't want the cost of running the memory extraction agent. Useful if you manage your memories manually or don't use the memory system at all.
CLAUDE_CODE_COORDINATOR_MODE
What it does: Activates coordinator mode, where Claude acts as an orchestrator rather than doing work directly. In this mode, Claude delegates all tasks to worker sub-agents -- it researches via workers, synthesises their findings into implementation specs, then dispatches workers to implement and verify changes. The coordinator follows four phases: research, synthesis, implementation, verification.
When to use: Large, multi-file changes that benefit from parallel execution. The coordinator spawns multiple workers simultaneously, each operating in isolated worktrees, which can significantly speed up broad refactors or multi-component features.
Coordinator mode spawns multiple sub-agents, each consuming their own API calls and tokens. Costs can multiply quickly -- a single task might spawn 3-5 workers, each running their own conversation. The synthesis step is also critical: if the coordinator writes a poor spec, all workers execute on a flawed plan. Best paired with /plan mode first to review the approach before execution begins.
CLAUDE_CODE_FORK_SUBAGENT
What it does: Enables fork-based sub-agent spawning. When Claude uses the Agent tool without specifying a subagent_type, instead of spawning a fresh agent, it forks the entire current context into a background agent. The forked agent inherits the full conversation history, system prompt, and file state byte-for-byte, including prompt cache.
When to use: When you want a sub-agent that already understands what you've been working on -- no need to re-explain the context. The fork inherits everything, so the first API call to the forked agent is cache-warm.
Forked agents are constrained by safety rules: they cannot spawn their own sub-agents (no recursive forking), must execute directly rather than suggesting next steps, and are expected to commit changes and report a hash. The full context duplication also means the forked agent's first turn is expensive -- it includes everything from the parent conversation.
CLAUDE.md Optimisation Tricks
Where Your Instructions Actually Land
The source reveals the exact prompt order Claude sees. Your CLAUDE.md lands at position 10 out of 15+ sections, after nine sections of Anthropic's own instructions.
Don't repeat what sections 1-6 already cover (e.g., "be concise", "use tools properly"). Instead, add project-specific context that Claude cannot derive from the codebase. Every byte of CLAUDE.md is injected on every turn, so 1KB of CLAUDE.md = 1KB per API call.
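To put a number on that recurring cost, here is a rough sketch; the 4-characters-per-token ratio is a rule-of-thumb assumption, not a figure from the source:

```shell
# Estimate the per-turn token overhead of CLAUDE.md
# (assumption: ~4 characters per token, a common rule of thumb).
bytes=$(wc -c < CLAUDE.md 2>/dev/null || echo 0)
echo "CLAUDE.md adds roughly $((bytes / 4)) tokens to every API call"
```

A 4KB CLAUDE.md therefore costs on the order of a thousand input tokens per turn, every turn.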
Steal Anthropic's Internal Prompt Tricks
Internal Anthropic builds include numeric length anchors that reduced output tokens by ~1.2%. You can add similar instructions to your own CLAUDE.md.
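The source doesn't expose Anthropic's exact internal wording, but a hedged illustration of the pattern (the numbers and phrasing here are invented for demonstration) might look like:

```markdown
## Output length
- Default to answers of 1-3 sentences; never exceed 10 lines unless asked for detail.
- Commit message subjects: 50 characters or fewer.
```

The idea is that a concrete number anchors the model's output length far more reliably than a vague "be concise".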
Hidden Slash Commands
| Command | What It Does | Status |
|---|---|---|
| `/rewind` | Restore code AND conversation to a previous point (full undo) | Hidden |
| `/compact <instructions>` | Custom compaction with priority instructions | Public |
| `/context` | See what's actually in your context window | Public |
| `/stats` | Token usage, costs, timing statistics | Public |
| `/cost` | Accumulated session cost breakdown | Public |
| `/diff` | See what files changed this session | Public |
| `/export` | Export conversation transcript | Public |
| `/stickers` | Easter egg -- opens StickerMule for Claude Code stickers | Hidden |
Custom Compact Instructions
One of the most useful undocumented features: when running /compact, you can tell the compactor what to prioritise.
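For example (the wording here is illustrative, not a fixed syntax):

```text
/compact Keep the full plan for the auth refactor and every unresolved error; drop file-exploration detail.
```

The instructions are passed to the compaction summariser, so anything you name explicitly is far more likely to survive the summary.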
Memory System Tricks
MEMORY.md Has a 200-Line Hard Limit
Lines after 200 in MEMORY.md are silently truncated. If your index is bloated, Claude literally cannot see entries past line 200. Prune regularly.
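A quick way to check whether you are past the limit; the MEMORY.md path is an assumption, so adjust it to wherever yours actually lives:

```shell
# Lines beyond 200 in MEMORY.md are silently ignored -- count the overflow.
memfile="${MEMORY_FILE:-$HOME/.claude/memory/MEMORY.md}"   # path is an assumption
lines=$(wc -l < "$memfile" 2>/dev/null || echo 0)
if [ "$lines" -gt 200 ]; then
  echo "MEMORY.md: $((lines - 200)) line(s) past the 200-line limit are invisible"
else
  echo "MEMORY.md: $lines/200 lines used"
fi
```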
Memory Staleness Caveats (High Impact)
Memories older than 1 day get a caveat injected into the prompt that makes Claude second-guess them. If you have stable memories that shouldn't be doubted, reset the modification time of the memory file.
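A minimal sketch; the memory path is an assumption, so point it at wherever your MEMORY.md actually lives:

```shell
# Bump the modification time to "now" so the >1-day staleness caveat stops firing.
memfile="$HOME/.claude/memory/MEMORY.md"   # path is an assumption
mkdir -p "$(dirname "$memfile")"           # harmless if the directory already exists
touch "$memfile"
```

Note that `touch` creates the file if it is missing, so double-check the path before scripting this.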
Auto-Extraction Behaviour
- Runs as a forked sub-agent after every N turns
- Has a 5-turn budget max per extraction run
- Sandboxed to only write to the memory directory
- Skips if you manually wrote to memory in the same turn
- Throttled by GrowthBook feature flags
Explicitly saying "remember this" is more reliable than hoping auto-extraction catches it, because auto-extraction has throttling and conflict detection that can cause it to skip turns.
Tool Permission Tricks
Commands That Never Prompt
These read-only commands are auto-approved in default permission mode -- no dialog, no delay:
| Category | Auto-Approved Commands |
|---|---|
| Git | git log, git show, git diff, git status, git config --get, git branch, git remote |
| Files | cat, head, tail, less, more, file, strings, od, xxd, hexdump |
| Search | grep, rg, find, fd, locate, which, type |
| Docker | docker ps, docker logs, docker inspect, docker images |
| Other | xargs (read-only), jq, pyright |
Agent Optimisation
Explore Agent Uses Haiku (Cheapest Model)
The built-in Explore agent runs on Haiku for external builds. It's read-only, skips CLAUDE.md loading, and is designed for fire-and-forget codebase searches. Use it liberally without worrying about cost.
Build Custom Agents on Haiku (Medium Impact)
Create cheap, specialised agents in `~/.claude/agents/` or `.claude/agents/`.
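A minimal agent definition might look like this, saved as e.g. `.claude/agents/test-runner.md` (a sketch using the commonly documented frontmatter fields; the filename and wording are illustrative):

```markdown
---
name: test-runner
description: Runs the test suite and reports failures concisely
tools: Bash, Read, Grep
model: haiku
---
Run the project's test suite. Report only failing tests, each with a
one-line cause, and suggest the smallest fix.
```

Pinning `model: haiku` keeps repetitive runs cheap, and restricting `tools` limits what the agent can touch.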
Full Agent Frontmatter Options
Cost Reduction Cheat Sheet
Ordered by estimated impact:
1. `CLAUDE_CODE_SIMPLE=1` for automated/scripted pipelines -- massive prompt reduction
2. Use the Explore agent for searches -- runs on Haiku, skips CLAUDE.md
3. Keep CLAUDE.md short -- it's injected on every turn
4. Custom compact instructions -- `/compact keep only X` prevents re-doing work
5. Create Haiku-based custom agents for repetitive tasks (linting, testing, searching)
6. Set `CLAUDE_CODE_MAX_CONTEXT_TOKENS` lower if you don't need the full window
7. Disable auto-memory (`CLAUDE_CODE_DISABLE_AUTO_MEMORY=1`) if you don't use it
8. Disable background tasks (`DISABLE_BACKGROUND_TASKS=1`) for short scripted sessions
Compaction Tricks
Microcompaction Runs Silently
Two types happen before every API call without you knowing:
- Cached microcompaction: deletes the oldest tool results via API `cache_edits` (preserves the prompt cache)
- Time-based microcompaction: if idle for more than 60 minutes, clears old tool results (the cache has expired anyway)
If you step away for an hour and come back, your oldest tool results have been silently removed. This is why Claude sometimes says "let me re-read that file" after breaks -- the earlier read result was garbage-collected.
Compaction Circuit Breaker
After 3 consecutive auto-compact failures, the system stops retrying. If compaction seems stuck, run /compact manually to reset the circuit breaker.