Stop Asking Which AI Model to Use: Route Tasks to Haiku, Sonnet, and Opus Tiers

Reddit user u/spencer_kw calls out the daily "which model should I use?" posts and gives a concrete answer based on a month of routing by task type. The core insight: no single model is optimal for everything, and you should be routing tasks to at least three tiers.
Model Tiers by Task
- Reading files, summarizing, answering code questions: Use the cheapest model — Haiku, Qwen 3.6 via Ollama, Gemma 4. Sending file reads to Opus is burning money.
- Writing code, tests, boilerplate: Sonnet-tier — GPT-5.5 mini, DeepSeek v4. Solid generation at a fraction of frontier cost.
- Multi-file refactors, architecture, complex async debugging: This is the only time you need Opus or GPT-5.5, and it covers roughly 15-20% of your day.
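The tier list above can be sketched as a simple lookup table. This is a minimal illustration, not the Reddit author's actual setup: the task-type keys, the `Tier` enum, and the `route` function are hypothetical names, and defaulting unknown tasks to the middle tier is an assumption made for this example.

```python
from enum import Enum


class Tier(Enum):
    CHEAP = "haiku-tier"      # reading, summarizing, code questions
    MID = "sonnet-tier"       # code, tests, boilerplate
    FRONTIER = "opus-tier"    # refactors, architecture, hard debugging


# Hypothetical task labels mapped to the three tiers described in the list.
ROUTES = {
    "read_file": Tier.CHEAP,
    "summarize": Tier.CHEAP,
    "code_question": Tier.CHEAP,
    "write_code": Tier.MID,
    "write_tests": Tier.MID,
    "boilerplate": Tier.MID,
    "multi_file_refactor": Tier.FRONTIER,
    "architecture": Tier.FRONTIER,
    "async_debugging": Tier.FRONTIER,
}


def route(task_type: str) -> Tier:
    """Return the tier for a task type.

    Unknown task types fall back to the mid tier (an assumption of this
    sketch, chosen so unclassified work is neither starved nor overpaid).
    """
    return ROUTES.get(task_type, Tier.MID)


if __name__ == "__main__":
    for task in ("read_file", "write_tests", "architecture", "unknown_task"):
        print(f"{task} -> {route(task).value}")
```

Keeping the mapping in a plain dictionary means the routing policy can be adjusted without touching the code that calls the models.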
Practical Routing Setup
u/spencer_kw's current distribution:
- ~40% of tasks → Haiku-tier (cheap readers)
- ~35% → Sonnet-tier (generation)
- ~25% → Opus-tier (complex reasoning)
Total monthly spend: $30–40 depending on workload.
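To see why this split matters, the distribution above can be turned into a blended cost per task. The sketch below uses the three percentages from the post, but the relative unit costs are placeholders invented for illustration, not real provider prices; the post itself only reports the total monthly spend.

```python
# Task shares taken from the distribution above.
SHARES = {"haiku": 0.40, "sonnet": 0.35, "opus": 0.25}

# HYPOTHETICAL relative cost per task, in arbitrary units.
# These are illustrative placeholders, not actual pricing.
RELATIVE_COST = {"haiku": 1.0, "sonnet": 5.0, "opus": 25.0}


def blended_cost(shares: dict[str, float], cost: dict[str, float]) -> float:
    """Weighted average cost per task for a given routing distribution."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(shares[tier] * cost[tier] for tier in shares)


routed = blended_cost(SHARES, RELATIVE_COST)
all_opus = RELATIVE_COST["opus"]       # cost if every task went to the top tier
savings = 1 - routed / all_opus        # fraction saved by routing

print(f"routed: {routed:.2f} units/task")
print(f"all-Opus: {all_opus:.2f} units/task")
print(f"savings from routing: {savings:.0%}")
```

Swapping in your own per-task costs and shares shows how sensitive the total is to the fraction of work sent to the top tier.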
The "daily driver" framing is broken: asking for one model for everything is like asking for one vehicle that both hauls freight and handles the daily commute. Use multiple models and route by task.
📖 Read the full source: r/openclaw