Orkestra: Cost-Aware LLM Routing Layer for OpenClaw Reduces API Costs by 60-80%

✍️ OpenClawRadar📅 Published: February 28, 2026🔗 Source

What Orkestra Does

Orkestra is a cost-aware LLM routing layer built for OpenClaw that reduces API costs by 60-80%. It's a modular architecture that sits in front of model calls and decides which tier should handle each request based on semantic similarity.

How It Works

When a prompt comes in, it gets embedded and passed through a lightweight KNN classifier trained on previously labeled workloads. Based on semantic similarity, the router categorizes it as budget, balanced, or premium and forwards the call accordingly.

There's no prompt rewriting and no complex rule tree — just semantic classification at call time. The reduction in API costs comes primarily from preventing simpler prompts from defaulting to the most expensive models.

Integration with OpenClaw

Orkestra plugs in as an OpenClaw skill via a local proxy, so existing pipelines stay completely intact. The agent calls it through bash/curl to an OpenAI-compatible endpoint on 127.0.0.1:8765.

The response includes full cost transparency with the fields _orkestra.cost and _orkestra.savings_percent.

Supported Providers and Configuration

Supported providers: Google (Gemini), Anthropic (Claude), OpenAI
Routes across budget/balanced/premium tiers within each provider
Supports multi-provider mode across all three providers
Repository and OpenClaw integration available at: github.com/imperativelabs/orkestra
See integrations/openclaw/ for the skill files, proxy, and config examples

📖 Read the full source: r/openclaw

👀 See Also

Tools

Chrome Extension Adds Live Preview to Claude Code Web

A Chrome extension called Claude Code Preview adds live preview functionality to Claude Code Web, similar to Lovable and other 'vibecoding' sites, allowing side-by-side viewing of deployments.

Apr 20, 2026, 10:15 PM UTC

OpenClawRadar

Tools

llm-use – An Open-Source Framework for Routing and Orchestrating Multi-LLM Agent Workflows

llm-use is revolutionizing automation with its open-source framework designed to efficiently route and orchestrate multi-LLM agent workflows. Explore its impact on AI operations.

Feb 8, 2026, 01:45 PM UTC

OpenClawRadar

Tools

Open Source Grafana Dashboard Tracks Claude Code Costs and Usage via OpenTelemetry

An SRE built a free Grafana dashboard to visualize Claude Code spend, token usage, cache hit ratios, and edit decisions by pulling OpenTelemetry metrics into Prometheus-compatible backends.

May 16, 2026, 04:15 PM UTC

OpenClawRadar

Tools

Open-source CLI uses Claude Haiku to automate Xero expense auditing

A developer has released an open-source Python CLI tool that uses Claude Haiku 4.5 to automate Xero expense auditing. The tool follows a 'deterministic code first, then AI to fill in the gaps' approach, keeping costs to a few cents per audit run.

Apr 20, 2026, 05:38 PM UTC

OpenClawRadar