Claude 1M Context Token Burn: Unbounded Growth & Cache Misses

Token Burn Analysis from Real Usage Data

A detailed analysis of Claude's 1M context window implementation reveals specific technical factors causing rapid token consumption. The author parsed JSONL session files across multiple conversations to identify patterns.

Key Findings from the Data

Unbounded Context Growth: Before the 1M context window, auto-compaction triggered at approximately 160K tokens. With the 1M window that ceiling is gone, and sessions regularly reach 500K+ tokens. Every request resends the entire context, so at 500K tokens even a simple confirmation consumes about 500K input tokens. If Claude makes 3 tool calls to answer a prompt, each call resends the context, which comes to roughly 1.5M tokens for a single interaction.
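The resend arithmetic above can be sketched in a few lines of Python. This is an illustrative simplification, not the author's script: the function name `resend_tokens` and the `round_trips` parameter are hypothetical, and the sketch assumes the context stays the same size across round trips (in practice it grows slightly with each tool result).

```python
def resend_tokens(context_tokens: int, round_trips: int = 1) -> int:
    """Approximate input tokens for one interaction.

    Assumption: every round trip (the prompt itself, or each tool call
    as counted in the text) resends the full, unchanged context.
    """
    if context_tokens < 0 or round_trips < 1:
        raise ValueError("context_tokens must be >= 0 and round_trips >= 1")
    return context_tokens * round_trips


# A simple confirmation at 500K context: one round trip.
simple = resend_tokens(500_000)

# The same context with 3 tool calls, counted as 3 round trips.
with_tools = resend_tokens(500_000, round_trips=3)
```

Under these assumptions, `simple` is 500,000 and `with_tools` is 1,500,000, matching the figures in the text.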

Cache Miss Compounding: Anthropic caches context server-side for approximately 5 minutes. Once that window lapses, the next prompt reprocesses the full context at approximately 10x the cached price. The cache miss rate itself has not changed (it remains at about 2.5% of turns), but the cost of a miss scales with context size: a miss at 500K context costs roughly 3.3x as much as one at 150K context.
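To make the compounding concrete, here is a minimal sketch that expresses cost in "cached-token equivalents", where one token read from cache costs 1 unit and one uncached token costs 10 units. The 10x multiplier and the 2.5% miss rate come from the text; the function names and the linear cost model are assumptions for illustration, not actual billing logic.

```python
MISS_MULTIPLIER = 10.0  # from the text: uncached is ~10x the cached price
MISS_RATE = 0.025       # from the text: ~2.5% of turns miss the cache


def turn_cost(context_tokens: int, cache_hit: bool) -> float:
    """Cost of one turn in cached-token-equivalent units (hypothetical model)."""
    return context_tokens * (1.0 if cache_hit else MISS_MULTIPLIER)


def expected_turn_cost(context_tokens: int, miss_rate: float = MISS_RATE) -> float:
    """Average cost per turn, weighting hits and misses by the miss rate."""
    hit = turn_cost(context_tokens, cache_hit=True)
    miss = turn_cost(context_tokens, cache_hit=False)
    return (1.0 - miss_rate) * hit + miss_rate * miss


# Same miss rate, different context sizes.
small = expected_turn_cost(150_000)
large = expected_turn_cost(500_000)

# Penalty of a single miss at each size, relative to each other.
miss_ratio = turn_cost(500_000, False) / turn_cost(150_000, False)
```

With these inputs the expected per-turn cost comes to about 183,750 units at 150K context and about 612,500 units at 500K, and a single miss at 500K costs about 3.3 times as much as one at 150K. The miss rate is identical in both cases; only the context size changed.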

Analysis Tool

The author created a Python script that parses token counts from Claude JSONL session files without accessing conversation content. The script auto-detects your Claude data directory and requires matplotlib and numpy. The script is available at: https://github.com/RyanSeanPhillips/cldctrl/blob/master/docs/context_analysis.py
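For readers who want a feel for what content-free parsing looks like, the following is a minimal sketch and not the linked script. It assumes each JSONL line is a JSON object and that assistant entries carry a `message.usage` object with `input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`, and `output_tokens`; those field names are an assumption about the session file layout and should be checked against your own files. The helper names `iter_usage` and `context_tokens` are hypothetical.

```python
import json
from pathlib import Path
from typing import Iterator

# Assumed usage field names; verify against your own session files.
TOKEN_FIELDS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
    "output_tokens",
)


def iter_usage(path: Path) -> Iterator[dict]:
    """Yield token counts per entry, never touching conversation content."""
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines rather than abort
            if not isinstance(record, dict):
                continue
            message = record.get("message")
            usage = message.get("usage") if isinstance(message, dict) else None
            if isinstance(usage, dict):
                # Copy only the numeric counters; missing fields count as 0.
                yield {f: int(usage.get(f) or 0) for f in TOKEN_FIELDS}


def context_tokens(usage: dict) -> int:
    """Total input-side tokens for a turn: fresh + cache-written + cache-read."""
    return (
        usage["input_tokens"]
        + usage["cache_creation_input_tokens"]
        + usage["cache_read_input_tokens"]
    )
```

Calling `max(context_tokens(u) for u in iter_usage(path))` on a session file would then give a rough peak context size for that session, provided the file contains at least one entry with usage data.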

The author also mentions CLD CTRL (https://github.com/RyanSeanPhillips/cldctrl), a terminal dashboard for launching and monitoring Claude Code sessions, token usage, and project activity.

📖 Read the full source: r/ClaudeAI
