OpenRouter Pricing: Best-Value AI Models by Intelligence per Dollar

Model Intelligence and Pricing Comparison

A developer analyzed OpenRouter API pricing for 18 AI models and calculated an intelligence-per-dollar value for each to help select models for specific tasks. The intelligence metric combines seven benchmarks: Artificial Analysis Intelligence Index, Agentic Index, Coding Index, Artificial Analysis Omniscience Index (rescaled to 0-100), GDPval-AA, Terminal-Bench Hard, and τ²-Bench Telecom.
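The source does not say how the seven scores are combined, or what range the Omniscience Index is rescaled from. As a rough sketch only, assuming a linear rescale from a -100 to 100 range and an unweighted mean, the calculation might look like this in Python. The function names are illustrative and every benchmark value below is a placeholder, not a real score.

```python
# Sketch only: the source does not specify the aggregation method.
# Assumptions: linear rescale of one index from [-100, 100] to [0, 100],
# then an unweighted mean across all seven benchmark scores.
from statistics import mean


def rescale(x: float, lo: float, hi: float) -> float:
    """Map x linearly from [lo, hi] onto [0, 100]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return (x - lo) / (hi - lo) * 100


def composite(scores: dict[str, float], omniscience_raw: float) -> float:
    """Unweighted mean of six 0-100 scores plus the rescaled index."""
    values = list(scores.values()) + [rescale(omniscience_raw, -100, 100)]
    return mean(values)


# Placeholder numbers for a hypothetical model, not real benchmark results.
example = {
    "intelligence_index": 50.0,
    "agentic_index": 40.0,
    "coding_index": 45.0,
    "gdpval_aa": 30.0,
    "terminal_bench_hard": 35.0,
    "tau2_bench_telecom": 80.0,
}
print(round(composite(example, omniscience_raw=-10.0), 1))
```

A weighted mean or a different source range would change the composite, so the published intelligence figures should not be expected to match this sketch exactly.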

Key Findings

The analysis identified several standout models:

Top intelligence: GPT-5.4 (58.8 intelligence, $2.50/M tokens) and Gemini 3.1 Pro (58.6 intelligence, $2.00/M tokens)
Best value: MiMo-V2-Flash (39.9 intelligence, $0.09/M tokens, 443 value score)
Balanced models: GLM-5, Kimi K2.5, and Gemini 3 Flash

Model Details and Capabilities

The complete dataset includes:

MiMo-V2-Flash: 39.9 intelligence, $0.09/M tokens, 443 value, text-only
Step 3.5 Flash: 34.8 intelligence, $0.10/M tokens, 348 value, general fast text tasks
Grok 4.1 Fast: 41.2 intelligence, $0.20/M tokens, 205 value, 2M context window, high-speed routing and extraction
MiniMax M2.5: 40.3 intelligence, $0.27/M tokens, 149 value, open-source, excellent performance in real coding tasks
DeepSeek V3.2: 34.6 intelligence, $0.25/M tokens, 138 value, strong coding and logic capabilities, supports API cache hits
Kimi K2.5: 45.8 intelligence, $0.45/M tokens, 101 value, 262K context window, broad general knowledge
Gemini 3 Flash: 47.7 intelligence, $0.50/M tokens, 95 value, multimodal with audio input support
GLM-4.7: 31.6 intelligence, $0.38/M tokens, 83 value, general text generation
Qwen 3.5: 41.1 intelligence, $0.60/M tokens, 68 value, strong overall performance, general purpose
GLM-5: 49.5 intelligence, $0.80/M tokens, 61 value, 200K context window, general knowledge
Claude Haiku 4.5: 36.5 intelligence, $1.00/M tokens, 36 value, fast and cheap, extended thinking support
GPT-5.3: 55.9 intelligence, $1.75/M tokens, 32 value, general reasoning and text processing
GPT-5.2: 50.8 intelligence, $1.75/M tokens, 29 value, excellent for coding and agentic tasks
Gemini 3.1 Pro: 58.6 intelligence, $2.00/M tokens, 29 value, multimodal analysis, image output support
Grok 4.2 Beta: 49.6 intelligence, $2.00/M tokens, 25 value, heavy reasoning, broad knowledge base
GPT-5.4: 58.8 intelligence, $2.50/M tokens, 24 value, variable context tiers (<272K / >272K), top-tier reasoning
Claude Sonnet 4.6: 52.3 intelligence, $3.00/M tokens, 17 value, workhorse model, trained through Jan 2026
Claude Opus 4.6: 51.9 intelligence, $5.00/M tokens, 10 value, top-tier reasoning, strongest for coding and software engineering
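
The value column appears to be intelligence divided by price per million tokens. The source does not state the formula, so treat this as an inference from the listed numbers. A minimal Python sketch under that assumption, using a handful of rows from the list (the names `value_score` and `ranked` are illustrative):

```python
# Assumption: value = intelligence / price per million tokens (USD).
# The source does not state the formula; this is inferred from the listed numbers.

# (model, intelligence, price in USD per million tokens, value as listed above)
ROWS = [
    ("MiMo-V2-Flash", 39.9, 0.09, 443),
    ("Step 3.5 Flash", 34.8, 0.10, 348),
    ("MiniMax M2.5", 40.3, 0.27, 149),
    ("Gemini 3 Flash", 47.7, 0.50, 95),
    ("GLM-5", 49.5, 0.80, 61),
    ("Claude Haiku 4.5", 36.5, 1.00, 36),
    ("Claude Opus 4.6", 51.9, 5.00, 10),
]


def value_score(intelligence: float, price_per_m: float) -> float:
    """Intelligence points per dollar (per million tokens)."""
    if price_per_m <= 0:
        raise ValueError("price must be positive")
    return intelligence / price_per_m


def ranked(rows):
    """Return (model, computed value) pairs, best value first."""
    scored = [(name, value_score(iq, price)) for name, iq, price, _ in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in ranked(ROWS):
        print(f"{name:<18} {score:6.1f}")
```

For these rows the computed ratios land within one point of the listed values (for example, 49.5 / 0.80 is about 61.9 against a listed 61), which suggests the published figures were rounded, truncated, or derived from unrounded inputs.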

Notable Insights

The analysis notes that smarter models typically have worse value scores, but the score may not reflect real-world efficiency. For example, if Qwen 3.5 spends 500,000 tokens and 30 minutes solving a problem incorrectly while Claude Sonnet 4.6 solves it correctly in one-tenth the time, Sonnet may be the better value despite its lower intelligence-per-dollar score.
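
One way to make this concrete is to compare cost per correct answer rather than cost per token. The Python sketch below is illustrative only: the 500,000-token figure and the list prices come from the text above, while the second token count and both success rates are hypothetical numbers chosen for the example. It also assumes a single flat price per million tokens.

```python
# Hypothetical illustration: expected spend per *correct* solution.
# Only the list prices and the 500,000-token figure come from the text;
# the other token count and both success rates are made-up assumptions.

def cost_per_attempt(tokens: int, price_per_m: float) -> float:
    """USD cost of one attempt at a flat price per million tokens."""
    return tokens / 1_000_000 * price_per_m


def cost_per_success(tokens: int, price_per_m: float, success_rate: float) -> float:
    """Expected USD spent per correct solution, retrying until one succeeds."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt(tokens, price_per_m) / success_rate


# Qwen 3.5: $0.60/M, 500,000 tokens per attempt, assumed 20% success rate.
qwen = cost_per_success(500_000, 0.60, 0.20)
# Claude Sonnet 4.6: $3.00/M, assumed 50,000 tokens per attempt, assumed 90% success rate.
sonnet = cost_per_success(50_000, 3.00, 0.90)

print(f"Qwen 3.5:          ${qwen:.2f} per correct solution")
print(f"Claude Sonnet 4.6: ${sonnet:.2f} per correct solution")
```

Under these made-up inputs the pricier model comes out cheaper per correct answer (about $0.17 versus $1.50), and that is before counting the wall-clock time mentioned in the example. Different assumptions can easily reverse the result, so measure token usage and success rates on your own tasks.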

Grok 4.1 Fast's 2M context window gives it an intelligence boost that won't show up in most use cases. According to the analysis, MiniMax M2.5 outperforms it on every metric except context window.

GLM-5 is the last model before a significant value drop (from 61 to 36 at Claude Haiku 4.5) and is reportedly almost as smart as GPT-5.2 (49.5 versus 50.8 intelligence).

📖 Read the full source: r/openclaw