Token Usage Tips for Claude Code

A detailed Reddit post shares hard-won lessons about managing token consumption in Claude Code. The author notes that most token burn comes from setup context, not Claude's answers. Here are the key practices they use and recommend:
Immediate Wins
- Start new chats for unrelated tasks. Every message in a long conversation resends the full history. A 40-message thread burns tokens on context you stopped caring about 20 messages ago.
- Group small questions into one message. Sending three quick follow-ups individually means three full context loads. Combine them to cut overhead.
- Keep CLAUDE.md short and use it as an index. Dumping everything into it causes Claude to reread it every turn. Instead, point to separate files so only relevant context loads.
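A little rough arithmetic shows why long threads get expensive. The sketch below is illustrative only: the function name, the 500 tokens per turn, and the 2,000 tokens of setup context are assumed values, not figures from the post, and the model ignores prompt caching.

```python
def total_input_tokens(turns: int, tokens_per_turn: int, setup_tokens: int) -> int:
    """Estimate input tokens sent over a thread where every turn resends
    the setup context plus all history so far (assumed model, no caching)."""
    total = 0
    for turn in range(1, turns + 1):
        # Turn N sends the setup context plus N turns' worth of history.
        total += setup_tokens + turn * tokens_per_turn
    return total


# Assumed numbers for illustration: 500 tokens per turn, 2,000 tokens of setup.
one_long_thread = total_input_tokens(turns=40, tokens_per_turn=500, setup_tokens=2000)
two_short_threads = 2 * total_input_tokens(turns=20, tokens_per_turn=500, setup_tokens=2000)

print(one_long_thread)    # 490000
print(two_short_threads)  # 290000
```

Under these assumptions, splitting one 40-turn thread into two 20-turn threads cuts input tokens by roughly 40%, because the history term grows with the square of the thread length.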
Ongoing Habits
- Be precise with file references. Instead of saying 'here's the whole codebase, figure it out,' which can cost 30–50k tokens in exploration, point Claude to the specific function or module that matters.
- Summarize and restart after 15–20 messages. Ask Claude for a quick summary, then paste it into a fresh thread. This drops dead context without losing progress.
- Use lighter models for lighter work. Route drafting, reformatting, and explaining to smaller models, and reserve the heavy model for reasoning-heavy tasks.
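The last two habits can be written down as simple rules of thumb. This is a minimal sketch, not something the post provides: the names `pick_tier` and `should_restart`, the "light"/"heavy" labels, and the default threshold of 15 (the low end of the post's 15–20 range) are all hypothetical choices made for illustration.

```python
# Task kinds the post gives as examples of lighter work.
LIGHT_TASKS = {"drafting", "reformatting", "explaining"}


def pick_tier(task_kind: str) -> str:
    """Return "light" for the post's example light tasks, otherwise "heavy".
    The tier labels are placeholders, not real model names."""
    return "light" if task_kind.strip().lower() in LIGHT_TASKS else "heavy"


def should_restart(message_count: int, threshold: int = 15) -> bool:
    """Suggest summarizing and starting a fresh thread once the message
    count reaches the threshold (assumed default: 15)."""
    return message_count >= threshold


print(pick_tier("Reformatting"))  # light
print(pick_tier("debugging"))     # heavy
print(should_restart(18))         # True
```

Treating anything unlisted as "heavy" is a deliberately conservative default; in practice the task list and threshold would be tuned to your own workflow.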
The post invites the community to share their own tricks for keeping token usage under control.
📖 Read the full source: r/ClaudeAI