Subquadratic Debuts 12M-Token Context Window for AI Models

Subquadratic has announced a 12-million-token context window, which it attributes to a breakthrough in subquadratic attention. Current models typically offer windows of 128K to 1M tokens. According to the company, the technique lets models handle far larger contexts without compute or memory growing quadratically with sequence length.
Key Details
- Context window: 12 million tokens (about 94x a 128K-token window such as GPT-4 Turbo's, and 12x a 1M-token window)
- Based on subquadratic attention, likely using linear or near-linear complexity in sequence length
- Enables processing entire large codebases, long documents, or multi-hour video transcripts in a single forward pass
- Potential applications: code review of entire repos, long-document analysis, multi-turn dialog with full history
- Compatible with existing transformer-based LLMs via drop-in attention replacement
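
To put the window sizes above in perspective, here is a rough back-of-the-envelope sketch of how cost grows with sequence length. The function names and the 2-bytes-per-score figure (fp16) are illustrative assumptions, not numbers from the announcement. Fused attention kernels avoid storing the full score matrix, but the number of score computations still grows as n².

```python
# Back-of-the-envelope scaling comparison. All constants are illustrative
# assumptions; none of them come from the Subquadratic announcement.
BYTES_PER_SCORE = 2  # assumes fp16 attention scores


def quadratic_score_bytes(n_tokens: int) -> int:
    """Bytes needed to materialize one n x n score matrix (one head, one layer)."""
    return n_tokens * n_tokens * BYTES_PER_SCORE


def relative_cost(n_tokens: int, baseline: int, exponent: float) -> float:
    """Cost of n_tokens relative to baseline when cost grows as n**exponent."""
    return (n_tokens / baseline) ** exponent


if __name__ == "__main__":
    baseline = 128_000
    for n in (128_000, 1_000_000, 12_000_000):
        terabytes = quadratic_score_bytes(n) / 1e12
        print(
            f"{n:>12,} tokens: {terabytes:10.3f} TB of scores, "
            f"quadratic cost x{relative_cost(n, baseline, 2):,.0f}, "
            f"linear cost x{relative_cost(n, baseline, 1):,.2f}"
        )
```

Under these assumptions, a single 12M-token score matrix would occupy 288 TB, and quadratic work grows by a factor of roughly 8,800 relative to a 128K window, versus about 94 for linear scaling.
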
The approach is said to reduce the O(n²) cost of standard attention to near-O(n) in sequence length n, reportedly using techniques such as state-space models or low-rank factorizations. The source provides no benchmark numbers; it claims only that this makes 12M-token windows practical on a single GPU.
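
The source does not say which method Subquadratic uses, so the following is only a minimal sketch of one well-known way to get linear scaling: kernelized linear attention. It replaces the softmax similarity with a positive feature map φ, so the unnormalized weight matrix φ(Q)φ(K)ᵀ has rank at most d and the product can be regrouped as φ(Q)(φ(K)ᵀV). The function names and the elu(x) + 1 feature map are choices made for this illustration, and the sketch is non-causal and written in pure Python for readability rather than speed.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def feature_map(m):
    """elu(x) + 1, a strictly positive feature map (an illustrative choice)."""
    return [[x + 1.0 if x > 0 else math.exp(x) for x in row] for row in m]


def quadratic_attention(q, k, v):
    """Kernel attention computed the direct way: builds the full n x n matrix."""
    fq, fk = feature_map(q), feature_map(k)
    weights = matmul(fq, transpose(fk))  # n x n, the quadratic part
    weights = [[w / sum(row) for w in row] for row in weights]
    return matmul(weights, v)


def linear_attention(q, k, v):
    """Same result via associativity: never builds an n x n matrix."""
    fq, fk = feature_map(q), feature_map(k)
    kv = matmul(transpose(fk), v)           # d x d_v summary of keys and values
    k_sum = [sum(col) for col in zip(*fk)]  # length-d normalizer
    out = []
    for q_row in fq:
        norm = sum(a * b for a, b in zip(q_row, k_sum))
        numer = matmul([q_row], kv)[0]
        out.append([x / norm for x in numer])
    return out


if __name__ == "__main__":
    q = [[0.1, -0.2], [0.5, 0.3], [-0.4, 0.8]]
    k = [[0.7, 0.1], [-0.3, 0.9], [0.2, -0.6]]
    v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
    slow = quadratic_attention(q, k, v)
    fast = linear_attention(q, k, v)
    print(all(abs(x - y) < 1e-9 for r1, r2 in zip(slow, fast) for x, y in zip(r1, r2)))
```

Both functions compute the same kernel attention, which is not the same as softmax attention. The linear version costs O(n·d²) instead of O(n²·d), which is the kind of regrouping that would make very long sequences tractable.
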
Who It's For
AI engineers working on code analysis, document processing, or any task requiring long-context understanding without expensive chunking or retrieval.
📖 Read the full source: HN AI Agents