Open-source local hook automatically switches Claude models to cut AI costs

✍️ OpenClawRadar · 📅 Published: March 7, 2026

A developer has open-sourced a local hook that automatically picks the most cost-effective Claude model for each type of coding task. By the developer's own estimate, it could cut AI spend by 50-70% with no loss in quality.

How it works

The tool runs as a local hook in Cursor and Claude Code (both use the same hook system) and fires before each prompt is sent. It sits alongside Opus/plan and acts as a lightweight front-end filter, catching obviously mismatched model choices before the prompt reaches the model.

Key functionality

Reads the prompt and current model selection
Uses simple keyword rules to classify tasks (git operations, feature work, architecture/deep analysis)
Blocks if you're overpaying (e.g., Opus for git commit) and suggests Haiku or Sonnet
Blocks if you're underpowered (Sonnet/Haiku for architecture) and suggests Opus
Lets everything else through unchanged
Starting a prompt with ! bypasses the filter completely if you disagree with its suggestion
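
The behavior listed above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the keyword patterns, the PREFERRED mapping, the function names classify and check, and the model string "claude-opus" are all assumptions made for the example.

```python
import re

# Hypothetical keyword rules, checked in order; the project's real rules may differ.
RULES = [
    ("git", re.compile(r"\b(git|commit|rebase|rename|reformat|formatting)\b", re.I)),
    ("architecture", re.compile(r"\b(architecture|architect|system design|deep analysis)\b", re.I)),
    ("feature", re.compile(r"\b(implement|add|build|feature)\b", re.I)),
]

# Assumed cheapest adequate model family for each task type.
PREFERRED = {"git": "haiku", "feature": "sonnet", "architecture": "opus"}
FAMILIES = ("haiku", "sonnet", "opus")


def classify(prompt):
    """Return the first task type whose keywords appear in the prompt."""
    for task, pattern in RULES:
        if pattern.search(prompt):
            return task
    return "other"


def check(prompt, model):
    """Return (allowed, suggested_family) for a prompt and the selected model."""
    if prompt.startswith("!"):
        return True, None  # explicit bypass requested by the user
    family = next((f for f in FAMILIES if f in model.lower()), None)
    task = classify(prompt)
    if family is None or task == "other":
        return True, None  # unknown model or unclassified task: let it through
    preferred = PREFERRED[task]
    overpaying = family == "opus" and preferred != "opus"
    underpowered = family != "opus" and preferred == "opus"
    if overpaying or underpowered:
        return False, preferred
    return True, None
```

With these assumed rules, check("git commit with a short message", "claude-opus") returns (False, "haiku"), while the same prompt prefixed with ! returns (True, None).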

Technical details

3 files: bash + python3 + JSON
No proxy, no API calls, no external services
Fail-open design: if it hangs, Claude Code proceeds normally
Open-sourced at: https://github.com/coyvalyss1/model-matchmaker
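
One way to get the fail-open behavior described above is to run the classifier as a child process with a timeout and treat every failure as "allow". The sketch below assumes a hypothetical decision format ({"allow": ...}) and a hypothetical helper named run_fail_open; it is not the hook schema used by Cursor or Claude Code, and the project may implement fail-open differently.

```python
import json
import subprocess
import sys

# Hypothetical decision shape for illustration; not the real hook schema.
ALLOW = {"allow": True}


def run_fail_open(cmd, payload, timeout_s=2.0):
    """Run a classifier command; return its decision, or ALLOW on any failure."""
    try:
        proc = subprocess.run(
            cmd,
            input=json.dumps(payload),
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed if it runs past this
        )
        if proc.returncode != 0:
            return dict(ALLOW)
        decision = json.loads(proc.stdout)
        if not isinstance(decision, dict) or "allow" not in decision:
            return dict(ALLOW)
        return decision
    except (subprocess.TimeoutExpired, OSError, ValueError):
        # Timeout, missing executable, or unparsable output: never block the user.
        return dict(ALLOW)


# Example: a child that hangs is cut off after the timeout and the prompt is allowed.
hanging_child = [sys.executable, "-c", "import time; time.sleep(30)"]
print(run_fail_open(hanging_child, {"prompt": "git commit"}, timeout_s=0.5))
```

The point of the design is that only a well-formed, explicit decision from the classifier can block a prompt; a hang, a crash, or garbage output all fall through to the default.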

Performance and testing

The developer analyzed several weeks of their own prompts and found:

60-70% were standard feature work Sonnet could handle
5-20% were debugging/troubleshooting
A significant portion were pure git/rename/formatting tasks that Haiku handles identically at 90% lower cost

Retroactive analysis showed the tool would have cut 50-70% of AI spend with no quality drop. After tuning, it correctly handled 12/12 real test prompts.

Problem it solves

The issue is friction, not knowledge: developers know they should switch models, but in a flow state they don't want to think about dropdown menus. This tool makes the decision for them.

📖 Read the full source: r/ClaudeAI
