Claude Opus 4.6 and Sonnet 4.6 now offer 1M context at standard pricing

What's available now
Claude Opus 4.6 and Sonnet 4.6 now include the full 1M context window on the Claude Platform. Standard pricing applies across the full window: $5/$25 per million input/output tokens for Opus 4.6 and $3/$15 for Sonnet 4.6. There's no multiplier: a 900K-token request is billed at the same per-token rate as a 9K one.
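To make the flat-rate arithmetic concrete, here is a minimal sketch of a per-request cost estimate. It reads the $5/$25 and $3/$15 figures as input/output prices per million tokens; the dictionary keys, the `Price` class, and the `request_cost` function are illustrative names, not API model identifiers or part of any SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Price in USD per million tokens (figures quoted in the announcement)."""
    input_per_mtok: float
    output_per_mtok: float


# ASSUMPTION: these keys are labels for this sketch, not real API model IDs.
PRICES = {
    "opus-4.6": Price(input_per_mtok=5.0, output_per_mtok=25.0),
    "sonnet-4.6": Price(input_per_mtok=3.0, output_per_mtok=15.0),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at a flat per-token rate.

    No long-context multiplier is applied: the same rate is used
    whether the prompt is 9K or 900K tokens.
    """
    price = PRICES[model]
    total = (input_tokens * price.input_per_mtok
             + output_tokens * price.output_per_mtok)
    return total / 1_000_000


if __name__ == "__main__":
    for prompt_size in (9_000, 900_000):
        cost = request_cost("opus-4.6", prompt_size, output_tokens=2_000)
        print(f"{prompt_size:>7} input tokens: ${cost:.3f}")
```

Under these assumptions the input portion of the cost scales linearly with prompt size, so a prompt 100 times larger costs 100 times more on the input side and nothing extra beyond that.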
Key changes with general availability
- One price for the full context window with no long-context premium
- Full rate limits at every context length — your standard account throughput applies across the entire window
- 6x more media per request: up to 600 images or PDF pages, up from 100
- Available today on Claude Platform natively, Microsoft Azure Foundry, and Google Cloud's Vertex AI
- No beta header required — requests over 200K tokens work automatically
- If you're already sending the beta header, it's ignored so no code changes are required
Integration with Claude Code
1M context is now included in Claude Code for Max, Team, and Enterprise users with Opus 4.6. Opus 4.6 sessions can use the full 1M context window automatically, meaning fewer compactions and more of the conversation kept intact.
Performance benchmarks
Claude Opus 4.6 and Sonnet 4.6 maintain accuracy across the full 1M window. Opus 4.6 scores 78.3% on MRCR v2, the highest among frontier models at that context length. Long-context retrieval has improved with each model generation.
Practical implications for developers
With the full window available, you can load an entire codebase, thousands of pages of contracts, or the full trace of a long-running agent (tool calls, observations, intermediate reasoning) and use it directly. The engineering work, lossy summarization, and context clearing that long-context work previously required are no longer needed.
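As a rough illustration of checking whether a body of material fits in one request, the sketch below estimates a token total before sending anything. The 4-characters-per-token ratio is an assumed rule of thumb, not the real tokenizer, and `estimate_tokens`, `fits_in_window`, and the output reserve are hypothetical names and values chosen for this example. Actual counts can differ noticeably, especially for code.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # window size stated in the announcement
CHARS_PER_TOKEN = 4  # ASSUMPTION: crude heuristic, not the model's tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_window(documents: dict[str, str],
                   reserve_for_output: int = 50_000) -> tuple[bool, int]:
    """Return (fits, estimated_input_tokens) for a set of named documents.

    `reserve_for_output` is an arbitrary budget held back for the reply.
    """
    total = sum(estimate_tokens(body) for body in documents.values())
    return total + reserve_for_output <= CONTEXT_WINDOW_TOKENS, total


if __name__ == "__main__":
    repo = {"main.py": "x" * 400_000, "utils.py": "y" * 1_000_001}
    ok, tokens = fits_in_window(repo)
    print(f"estimated tokens: {tokens}, fits: {ok}")
```

An estimate like this is only useful as a quick pre-check; when a request sits near the limit, an exact token count is the safer basis for the decision.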
According to user feedback:
- Software engineers report being able to search, re-search, aggregate edge cases, and propose fixes all in one window without losing context
- Teams have seen a 15% decrease in compaction events, with agents able to hold context and run for hours without forgetting initial content
- Agent systems can now process full diffs without chunking, leading to higher-quality reviews from simpler, more token-efficient harnesses
- Scientific research systems can synthesize hundreds of papers, proofs, and codebases in a single pass
📖 Read the full source: HN AI Agents