Developer shares hybrid AI coding workflow: Claude for planning, local models for execution

By OpenClawRadar | Published: April 16, 2026

Hybrid AI coding workflow reduces cloud costs

A developer on r/LocalLLaMA shared a detailed workflow that combines cloud and local AI models to cut token costs while maintaining coding quality. The approach rests on the observation that many coding tasks do not need an expensive cloud model.

The workflow architecture

The system follows a "Reason in the cloud, Execute locally" logic:

  • Planner (Claude 3.5 Sonnet): Receives the task and generates a precise task_context.md file containing instructions, file paths, and logic. The plan costs approximately 300-500 tokens.
  • Coder (Local Qwen2.5-Coder 30B via Ollama): Takes the specification and the actual file content and writes the code. It runs locally, so it adds no per-token cost.
  • Validator: A simple Bash script runs tsc --noEmit or mypy for type checking.
  • Reviewer (Local Qwen2.5-Coder 7B): Runs in parallel to check for obvious logic flaws.
  • Auto-fix: If the build fails, the error log goes back to the local coder for up to 2-3 repair iterations.
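
The validate-and-repair steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the developer's code (the post describes Bash scripts). The names run_validator and auto_fix_loop, the prompt wording, and the default of three repair attempts are assumptions, and the parallel reviewer is left out for brevity.

```python
import subprocess
from typing import Callable, List, Tuple


def run_validator(command: List[str], cwd: str = ".") -> Tuple[bool, str]:
    """Run a type checker such as ['tsc', '--noEmit'] and return (passed, log)."""
    result = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def auto_fix_loop(
    generate: Callable[[str], None],
    validate: Callable[[], Tuple[bool, str]],
    spec: str,
    max_fixes: int = 3,  # assumption: the post says "2-3 iterations"
) -> bool:
    """Generate code once, then feed validator errors back until it passes."""
    generate(spec)  # first pass: the local coder works from the plan alone
    passed, log = validate()
    fixes = 0
    while not passed and fixes < max_fixes:
        # Hypothetical repair prompt: the original spec plus the error log.
        generate(f"{spec}\n\nThe build failed with:\n{log}\nFix these errors.")
        passed, log = validate()
        fixes += 1
    return passed
```

Passing generate and validate in as callables keeps the loop independent of any particular model or type checker, so validate could be as small as `lambda: run_validator(["tsc", "--noEmit"])`.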

Implementation details

The entire pipeline is a set of Bash scripts that use only jq and curl to talk to the Ollama API. The system infers which language standards to apply (TypeScript, Python, C++, etc.) from the planner's output, and it needs no heavy Python or Node runtime.
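
For readers who want to see the shape of such a call, here is a minimal sketch of one request to a local Ollama server, together with a language-to-validator lookup. It is written in Python purely for illustration; the developer's scripts do the same job with jq and curl. The function names, the VALIDATORS table, and the "mypy ." argument are assumptions, while the endpoint and the model, prompt, and stream fields follow Ollama's generate API.

```python
import json
import urllib.request
from typing import Dict, List

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

# Hypothetical table: the post names only `tsc --noEmit` and `mypy`.
VALIDATORS: Dict[str, List[str]] = {
    "typescript": ["tsc", "--noEmit"],
    "python": ["mypy", "."],
}


def build_request(model: str, prompt: str) -> bytes:
    """Serialize a non-streaming generate request (the role jq plays in Bash)."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")


def ask_ollama(model: str, prompt: str, url: str = OLLAMA_URL) -> str:
    """POST the request and return the generated text (the role curl plays)."""
    request = urllib.request.Request(
        url,
        data=build_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


def validator_for(language: str) -> List[str]:
    """Pick a type-check command from the language named in the plan."""
    key = language.strip().lower()
    if key not in VALIDATORS:
        raise ValueError(f"no validator configured for {language!r}")
    return VALIDATORS[key]
```

The model string passed to ask_ollama must match a tag that has already been pulled locally, and how the real scripts read the language out of the planner's output is not described in the post.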

The developer notes that local models (even 30B ones) often fail at complex architectural reasoning but are surprisingly good at execution when given crystal-clear specifications.
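
One way to picture a "crystal-clear specification" is a prompt that pairs the planner's instructions with the current contents of every file the coder may touch. The sketch below is a guess at that assembly step, in Python for illustration only; build_coder_prompt, the section labels, and the opening instruction are hypothetical and do not come from the developer's repository.

```python
from typing import Mapping


def build_coder_prompt(spec: str, files: Mapping[str, str]) -> str:
    """Combine the planner's spec with current file contents into one prompt."""
    sections = [
        "Implement the specification below. Change only the listed files.",
        "",
        "## Specification",
        spec.strip(),
    ]
    for path, content in files.items():
        # An empty string marks a file the coder is expected to create.
        sections += ["", f"## File: {path}", content.rstrip() or "(new file)"]
    return "\n".join(sections)
```

Giving the local model the real file content alongside the plan leaves it little to infer, which matches the developer's point that these models execute well but reason poorly about architecture.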

Results and savings

On a recent TypeScript project that changed 12 files:

  • Claude usage was limited to the initial planning phase only
  • Local models handled everything else: writing 12 files, linting, and reviewing
  • Total savings: approximately 85% token reduction compared to doing everything inside the Claude Code CLI
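
The post gives only the percentage, not the underlying token counts. As a rough illustration of how such a figure is computed, the sketch below compares cloud tokens used in a hybrid run against an all-cloud baseline. The function name and both input numbers are made up for the example and are not measurements from the post.

```python
def token_reduction(baseline_tokens: int, hybrid_tokens: int) -> float:
    """Return the fraction of cloud tokens saved relative to the baseline."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 1 - hybrid_tokens / baseline_tokens


# Illustrative numbers only: 100,000 cloud tokens for an all-cloud run
# versus 15,000 when the cloud model is used for planning alone.
print(f"{token_reduction(100_000, 15_000):.0%}")  # prints 85%
```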

The developer has made the scripts available in a repository called ai-orchestrator on GitHub (username: Mybono) for those interested in implementation details.

📖 Read the full source: r/LocalLLaMA

