99.4% Input Tokens: Claude Code Analysis of 100M Tokens

Token usage breakdown from 100M tokens tracked

A detailed analysis of Claude Code usage tracked 1,289 requests across extended coding sessions, totaling approximately 100.9M tokens. The breakdown reveals a significant imbalance between input and output tokens.

Token distribution:

Input tokens: 100.3M (99.4% of total)
Cached tokens: 84.2M (84% of input)
Output tokens: 616K (0.6% of total)
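The percentages above can be re-derived from the reported counts. The following is a minimal sketch that uses only the figures quoted in this article; it assumes cached tokens are a subset of input tokens, as the "84% of input" label suggests, and the variable names are illustrative.

```python
# Figures as reported in the breakdown above.
requests = 1_289
input_tokens = 100_300_000
cached_tokens = 84_200_000   # assumed to be a subset of input_tokens
output_tokens = 616_000

total_tokens = input_tokens + output_tokens


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


input_share = share(input_tokens, total_tokens)
output_share = share(output_tokens, total_tokens)
cached_share = share(cached_tokens, input_tokens)

# Per-request averages give a feel for how lopsided a single turn is.
avg_input_per_request = input_tokens / requests
avg_output_per_request = output_tokens / requests

print(f"total tokens:        {total_tokens:,}")
print(f"input share:         {input_share}%")
print(f"output share:        {output_share}%")
print(f"cached (of input):   {cached_share}%")
print(f"avg input/request:   {avg_input_per_request:,.0f}")
print(f"avg output/request:  {avg_output_per_request:,.0f}")
```

On these numbers, an average request reads tens of thousands of tokens in order to write a few hundred.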

The context re-reading bottleneck

Claude Code spends 99.4% of its token budget reading context and only 0.6% producing output, which includes the code it writes. This pattern isn't specific to Claude Code but reflects how agentic coding systems in general currently operate. Every time Claude Code makes a move (reading a file, running a command, editing code), the full context has to be fed back in, including repository structure, conversation history, tool results, and error logs.

The 84.2M cached tokens are not 84.2M distinct tokens. They are largely the same context, re-sent and re-read across the 1,289 requests, because the model lacks persistent memory between turns. Unlike human developers, who maintain a mental model of their codebase, Claude Code follows a pattern of: forget everything → re-read everything → write code → forget everything again.
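A toy model can make this loop concrete. The sketch below assumes that every request re-sends the whole accumulated context and that the context grows by a fixed amount per turn. The function name and all parameter values are hypothetical, chosen only for illustration; they are not taken from the analysis, and the model ignores context compaction.

```python
def simulate_session(turns, base_context, growth_per_turn, output_per_turn):
    """Toy model of a stateless agent loop.

    Each turn re-reads the full context, emits some output, and then
    appends that output plus new tool results to the context.
    Returns (total_input_tokens, total_output_tokens).
    """
    context = base_context
    total_input = 0
    total_output = 0
    for _ in range(turns):
        total_input += context            # the whole context is read again
        total_output += output_per_turn   # only a small amount is written
        context += growth_per_turn + output_per_turn
    return total_input, total_output


# Hypothetical session: 200 turns, 20K tokens of starting context,
# 300 tokens of tool results and 500 tokens of output per turn.
total_in, total_out = simulate_session(200, 20_000, 300, 500)
print(f"input:  {total_in:,} tokens")
print(f"output: {total_out:,} tokens")
print(f"input share: {100 * total_in / (total_in + total_out):.1f}%")
```

Because the context is re-read on every turn, cumulative input grows much faster than output even when each individual turn adds little new information.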

Prompt caching limitations

Anthropic's prompt caching makes this process cheaper, but the analysis argues that it doesn't make it faster. The bottleneck isn't inference speed; it's the re-reading loop. The analysis suggests the real unlock for Claude Code, and for agentic coding in general, would be persistent project memory: not just saved facts via memory files or CLAUDE.md, but a compressed, evolving understanding of the codebase that carries forward across sessions.
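The "cheaper, but the loop remains" point can be illustrated with a small cost model. This is a sketch under stated assumptions: the base input price and the cache-read discount below are placeholder values, not quoted pricing, cache-write charges are ignored, and the function name is hypothetical. Only the token counts come from the article.

```python
def input_cost(input_tokens, cached_tokens, price_per_mtok, cache_read_multiplier):
    """Cost of the input side of a session, in currency units.

    Cached tokens are billed at a fraction of the base input price;
    the rest are billed in full. Cache writes are ignored.
    """
    uncached_tokens = input_tokens - cached_tokens
    billed_tokens = uncached_tokens + cached_tokens * cache_read_multiplier
    return billed_tokens * price_per_mtok / 1_000_000


INPUT_TOKENS = 100_300_000    # from the article
CACHED_TOKENS = 84_200_000    # from the article
PRICE_PER_MTOK = 3.0          # placeholder price per million input tokens
CACHE_READ_MULTIPLIER = 0.1   # placeholder discount for cache reads

with_cache = input_cost(INPUT_TOKENS, CACHED_TOKENS, PRICE_PER_MTOK, CACHE_READ_MULTIPLIER)
without_cache = input_cost(INPUT_TOKENS, 0, PRICE_PER_MTOK, CACHE_READ_MULTIPLIER)

print(f"input cost with caching:    {with_cache:.2f}")
print(f"input cost without caching: {without_cache:.2f}")
print(f"tokens in context either way: {INPUT_TOKENS:,}")
```

Under these placeholder values the bill shrinks substantially, yet the number of tokens placed in front of the model on each turn is unchanged, which is the loop the analysis is pointing at.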

Current systems essentially brute-force intelligence through repeated context instead of building understanding. If this changes, AI coding could become genuinely faster, because the same information would no longer need to be processed repeatedly.

📖 Read the full source: r/ClaudeAI

