SubQ: A Sub-Quadratic LLM with 12M-Token Context Window

✍️ OpenClawRadar📅 Published: May 6, 2026🔗 Source

SubQ from Subquadratic is a production-ready LLM built on a fully sub-quadratic sparse-attention architecture. It handles up to 12M tokens in a single prompt, runs at 150 tokens per second, and costs roughly 1/5 of leading models like GPT-5 or Opus.

Architecture & Benchmarks

Unlike standard transformers with O(n²) attention, SubQ uses a sub-quadratic sparse-attention mechanism that only processes relevant token relationships. At 12M tokens, this reduces attention compute by nearly 1000×. Benchmarks (third-party validated):

SWE-Bench Verified (real-world coding): 81.8%
RULER @ 128K (long-context accuracy): 95.0%
MRCR v2 (8-needle, 1M): 65.9%

For comparison, SubQ's SWE-Bench score sits between Gemini 3.1 Pro (80.6%) and Opus 4.6 (80.8%). The model also outperforms Opus 4.7 (87.6%? – not reported at time) and GPT-5.5 (n/r) on MRCR v2.

Products & Integration

Two access options:

Full-Context API: 12M-token context, streaming, tool use, OpenAI-compatible endpoints. Process entire repositories in one call at linear cost.
SubQ Code (long-context layer for coding agents): Plug into Claude Code, Codex, or Cursor. ~25% lower bill, 10× faster exploration, auto-redirects expensive model turns. One-line install.

Who It's For

Developers and teams running AI agents that need to reason across full codebases, long PR histories, or persistent state without quality loss.

📖 Read the full source: HN AI Agents

👀 See Also

Tools

Developer shares hybrid AI coding workflow: Claude for planning, local models for execution

A developer built a pipeline using Claude 3.5 Sonnet for task planning and local Qwen2.5-Coder models via Ollama for code generation, achieving 85% token reduction compared to using Claude alone.

Apr 16, 2026, 09:45 AM UTC

OpenClawRadar

Tools

TradingView MCP Server Enables Claude to Backtest Trading Strategies

A developer has released an MCP server that allows Claude to backtest six trading strategies using Yahoo Finance data without API keys. Setup involves adding one line to the claude_desktop_config.json file.

Mar 29, 2026, 05:45 AM UTC

OpenClawRadar

Tools

Feynman: Open Source Research Agent with Paper-Codebase Audit Tool

Feynman is an open source research agent CLI that dispatches four subagents in parallel to answer research questions and includes a unique audit tool that compares paper claims against actual codebases. It features one-command installation, MIT license, and runs on pi for agent runtime with alphaxiv for paper search.

Apr 17, 2026, 08:34 AM UTC

OpenClawRadar

Tools

Reflect MCP Server Implements Reflexion Paper for Persistent Coding Agent Memory

A developer implemented the Reflexion paper (Shinn et al., NeurIPS 2023) as an MCP server to give local coding agents persistent memory of their mistakes. The system uses regex-based pattern matching on error messages and stores lessons in SQLite with FTS5.

Apr 16, 2026, 11:24 AM UTC

OpenClawRadar