Claude Code Compute Costs: 10% of API vs $5k Report

A recent Forbes article claimed Anthropic's $200/month Claude Code Max plan can consume about $5,000 in compute, suggesting the company is losing money on inference. This analysis examines why that figure is misleading.

API pricing vs actual compute costs

The $5,000 figure comes from Anthropic's retail API pricing: $5 per million input tokens and $25 per million output tokens for Opus 4.6. At these prices, a heavy user could indeed rack up $5,000/month in API-equivalent usage.

However, API pricing doesn't reflect what it actually costs Anthropic to serve those tokens. To estimate real inference costs, look at competitive pricing for similar models on OpenRouter:

Qwen 3.5 397B-A17B (comparable to Opus 4.6): $0.39 per million input tokens, $2.34 per million output tokens
Kimi K2.5 1T params with 32B active: $0.45 per million input tokens, $2.25 per million output tokens
DeepInfra cache reads on Kimi K2.5: $0.07/MTok vs Anthropic's $0.50/MTok

The real math

These OpenRouter providers are running businesses with margins, not taking enormous losses. If they can serve comparable models at roughly 10% of Anthropic's API price, actual compute costs are likely in that range.

Therefore:

Heavy user consuming $5,000 in API-equivalent tokens ≈ $500 in actual compute cost
Loss on extreme power users: $300/month (not $4,800)
Most users don't approach limits: Anthropic says fewer than 5% of subscribers would be affected by weekly caps
Typical Max 20x plan usage around 50% of weekly token budget ≈ break-even or profitable for Anthropic

Who actually faces $5,000 costs?

The $5,000 figure comes from Cursor's internal analysis. For Cursor, the number is roughly correct because they pay Anthropic's retail API prices (or close to them) for Opus 4.6 access.

Developers want Anthropic models in Cursor due to brand awareness and current performance advantages over cheaper open-weight alternatives.

Broader implications

Anthropic isn't profitable overall due to training costs, researcher salaries, and compute commitments - not inference. On a per-user, per-token basis for inference, Anthropic is likely profitable on average Claude Code subscribers.

The "AI inference is a money pit" narrative plays into frontier labs' hands by discouraging competition and making their moats appear deeper than they are.

📖 Read the full source: HN AI Agents