Claude Code Token Waste Fix: Disable Attribution Header for Better Cache Hits

✍️ OpenClawRadar📅 Published: April 1, 2026🔗 Source
Claude Code Token Waste Fix: Disable Attribution Header for Better Cache Hits
Ad

Claude Code has been wasting tokens on every new session since version 2.1.69 due to a billing attribution header that breaks prompt caching. The issue is documented in multiple GitHub issues (#40652, #34629, #40524) with no official response from Anthropic as of the source publication.

What's Happening

Since v2.1.69, Claude Code inserts a billing attribution string into the first block of your system prompt: x-anthropic-billing-header: cc_version=2.1.88.a3f; cc_entrypoint=cli; cch=00000;

The .a3f part is a 3-character hash computed from your first message in each conversation using this function:

function computeHash(firstUserMessage, version) {
  const chars = [4, 7, 20].map(i => firstUserMessage[i] || "0").join("");
  return sha256("59cf53e54c78" + chars + version).slice(0, 3);
}

Different conversations with different first messages generate different hashes every time.

Why This Breaks Caching

Anthropic's caching requires 100% identical prompt segments. The cache is shared across your entire Organization or Workspace, not per session. The billing header sits at the front of the ~23K token system prompt, and since it changes per conversation, the prefix never matches, causing cache misses on every new chat.

Ad

Benchmark Results

A controlled A/B test showed:

  • Header ON (default): 48% cache hit rate, ~12K tokens rebuilt per session
  • Header OFF: 99.98% cache hit rate, zero cache creation on 3 out of 4 sessions

The result is 7x cheaper per session on system prompt processing.

The Fix

Add this to your shell configuration:

export CLAUDE_CODE_ATTRIBUTION_HEADER=false

For zsh users:

echo 'export CLAUDE_CODE_ATTRIBUTION_HEADER=false' >> ~/.zshrc
source ~/.zshrc

New sessions pick it up automatically. Existing sessions don't need restarting—the hash doesn't change mid-conversation, and they don't interfere with new sessions.

Safety and Background

This is not a hack—the environment variable exists in the source code as a proper feature toggle. claude-code-router and CLIProxyAPI have been shipping with this disabled in production with no reported issues.

Anthropic likely implemented this to track which version and entrypoint (CLI vs SDK vs GitHub Action) made each API call, placing it in the system prompt because Bedrock/Vertex don't forward custom headers.

📖 Read the full source: r/ClaudeAI

Ad

👀 See Also

Using project narratives to manage memory in large OpenClaw projects
Tips

Using project narratives to manage memory in large OpenClaw projects

A developer shares a process where after each major milestone, they spawn a separate OpenClaw worker to analyze the codebase and write a 'project narrative' document, which helps identify broken pipelines, redundancies, and missing pieces that the main worker might overlook.

OpenClawRadar
Claude Prompt Codes Retested: L99 Sharper, OODA Narrower, ARTIFACTS Faded, and 3 New Codes to Use
Tips

Claude Prompt Codes Retested: L99 Sharper, OODA Narrower, ARTIFACTS Faded, and 3 New Codes to Use

A 6-month retest of L99, OODA, and ARTIFACTS prompt codes on Claude shows L99 sharper on Sonnet 4.6/Opus 4.7, OODA failing on strategic prompts, ARTIFACTS unnecessary for code, and three new codes (/skeptic, /blindspots, /decompose) earning daily use. Stack no more than 2 codes.

OpenClawRadar
Governance Layer for Claude Agents: Hard Safety Boundaries and Live Traces in Production
Tips

Governance Layer for Claude Agents: Hard Safety Boundaries and Live Traces in Production

A Claude API user built a lightweight governance layer below the agent to add hard safety boundaries, real-time traces, human-in-the-loop control via Telegram, and automatic checkpointing — solving silent failures and runaway token costs in long-running agent loops.

OpenClawRadar
Claude Code's tendency to validate flawed assumptions and prompting workarounds
Tips

Claude Code's tendency to validate flawed assumptions and prompting workarounds

A developer reports Claude Code will enthusiastically implement flawed architectures without questioning incorrect assumptions, leading to wasted debugging time. The workaround is to explicitly add "assume I might be wrong about the framing" to complex requests.

OpenClawRadar