RouteLLM Setup for Cost-Effective AI Task Routing

Docker Compose Configuration for Hybrid AI Setup
A Reddit user posted a detailed Docker Compose setup that implements what they call "Poor Man's Superintelligence" - a hybrid AI system that routes tasks between local and cloud models based on complexity.
Core Components
The system uses four main services:
- vscode-openwire: Uses image
sendmeticket/vscode-openwire:1.0.0with ports 3000 and 3030 exposed. This provides access to GitHub Copilot through OpenWire, though the source notes this may violate TOS and suggests using an available API key instead. - ollama: Runs
ollama/ollama:latestwith port 11434 exposed. It automatically pulls and serves theqwen3.5:4bmodel as the local "weak" model. - openroutellm: Uses image
sendmeticket/openroutellm:1.0.0on port 6060. This is the routing service that decides which model handles each request. - openclaw: Runs
ghcr.io/openclaw/openclaw:latestwith ports 18789 and 18790 exposed, serving as the main interface.
RouteLLM Configuration
The openroutellm service is configured with specific parameters:
python -m routellm.openai_server --routers bert --default-router-threshold 0.75 --port 6060 --openwire-base-url http://vscode-openwire:3030/v1 --ollama-base-url http://ollama:11434/v1 --strong-model gpt-4o --weak-model qwen3.5:4bThis setup uses BERT-based routing with a 0.75 threshold to determine when to send tasks to the "strong" model (GPT-4o) versus the local "weak" model (Qwen3.5:4b).
How It Works
The system routes difficult tasks to the paid GPT-4o model through OpenWire/Copilot, while simpler tasks are handled by the local Qwen3.5:4b model running in Ollama. This creates what the author describes as a "fail-safe, local-first AI model with low base intelligence but really high max intelligence."
All services are connected through a custom Docker network (openclaw_net with subnet 172.10.10.0/24) and include health checks to ensure service availability.
📖 Read the full source: r/LocalLLaMA
👀 See Also

Collection of 177 OpenClaw SOUL.md Templates Organized into 24 Categories
A developer has compiled 177 ready-to-use SOUL.md templates for OpenClaw agents across 24 categories including Marketing, Development, Business, DevOps, Finance, Creative, Data, Security, Healthcare, Legal, HR, and Education. All templates are MIT licensed and available on GitHub.

Vibeyard Adds Kanban Board for Managing Multiple Claude Code Sessions
An open-source IDE called Vibeyard now includes a Kanban board that lets you spin up Claude Code agent sessions directly from cards. Cards auto-move to Done when the agent finishes.

Academic Research Skills for Claude Code: A Human-in-the-Loop Pipeline for Paper Writing
Academic Research Skills (ARS) v3.7.0+ is a Claude Code plugin that automates reference hunting, citation formatting, data checking, and logical consistency review while keeping the human researcher in control. Install via /plugin marketplace add Imbad0202/academic-research-skills.

Jan Adds One-Click OpenClaw Installation with Jan-v3-Base Model Integration
Jan now supports one-click installation of OpenClaw with direct integration to the Jan-v3-base model, keeping all operations local and private on your computer.