Claude 4.6: 1M Context at Standard Pricing

What's available now

Claude Opus 4.6 and Sonnet 4.6 now include the full 1M context window at standard pricing on the Claude Platform. Standard pricing applies across the full window — $5/$25 per million tokens for Opus 4.6 and $3/$15 for Sonnet 4.6. There's no multiplier: a 900K-token request is billed at the same per-token rate as a 9K one.

Key changes with general availability

One price for the full context window with no long-context premium
Full rate limits at every context length — your standard account throughput applies across the entire window
6x more media per request: up to 600 images or PDF pages, up from 100
Available today on Claude Platform natively, Microsoft Azure Foundry, and Google Cloud's Vertex AI
No beta header required — requests over 200K tokens work automatically
If you're already sending the beta header, it's ignored so no code changes are required

Integration with Claude Code

1M context is now included in Claude Code for Max, Team, and Enterprise users with Opus 4.6. Opus 4.6 sessions can use the full 1M context window automatically, meaning fewer compactions and more of the conversation kept intact.

Performance benchmarks

Opus 4.6 scores 78.3% on MRCR v2, the highest among frontier models at that context length. Claude Opus 4.6 and Sonnet 4.6 maintain accuracy across the full 1M window. Long context retrieval has improved with each model generation.

Practical implications for developers

This means you can load an entire codebase, thousands of pages of contracts, or the full trace of a long-running agent — tool calls, observations, intermediate reasoning — and use it directly. The engineering work, lossy summarization, and context clearing that long-context work previously required are no longer needed.

According to user feedback:

Software engineers report being able to search, re-search, aggregate edge cases, and propose fixes all in one window without losing context
Teams have seen a 15% decrease in compaction events, with agents able to hold context and run for hours without forgetting initial content
Agent systems can now process full diffs without chunking, leading to higher-quality reviews from simpler, more token-efficient harnesses
Scientific research systems can synthesize hundreds of papers, proofs, and codebases in a single pass

📖 Read the full source: HN AI Agents