Open-source local hook automatically switches Claude models to cut AI costs

A developer has open-sourced a local hook that checks each prompt against the currently selected Claude model and steers routine coding tasks toward cheaper models. By the developer's own estimate, this could reduce AI costs by 50-70% without quality loss.
How it works
The tool runs as a local hook in Cursor and Claude Code (both use the same hook system) and fires before each prompt is sent. It sits next to Opus/plan and acts as a lightweight front-end filter that catches obvious mismatches between task and model before the prompt goes out.
Key functionality
- Reads the prompt and current model selection
- Uses simple keyword rules to classify tasks (git operations, feature work, architecture/deep analysis)
- Blocks if you're overpaying (e.g., Opus for git commit) and suggests Haiku or Sonnet
- Blocks if you're underpowered (Sonnet/Haiku for architecture) and suggests Opus
- Lets everything else through unchanged
- Starting a prompt with ! bypasses the filter completely if you disagree with its suggestion
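
The rules listed above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the keyword lists, the tier names, and the function names classify and check are all hypothetical, and only the two mismatch cases the article names (Opus for git work, Sonnet/Haiku for architecture) are modeled.

```python
# Hypothetical keyword rules; the real tool's keyword lists are not given in the article.
RULES = {
    "git": ("git commit", "git push", "rename", "reformat"),
    "architecture": ("architecture", "system design", "deep analysis"),
}

# Hypothetical mapping from task class to the model tiers considered a good match.
ALLOWED = {
    "git": {"haiku", "sonnet"},
    "architecture": {"opus"},
}


def classify(prompt: str) -> str:
    """Return the first task class whose keywords appear in the prompt."""
    text = prompt.lower()
    for task, keywords in RULES.items():
        if any(keyword in text for keyword in keywords):
            return task
    return "other"


def check(prompt: str, model: str) -> dict:
    """Decide whether to let a prompt through or block it with a suggestion."""
    # A leading "!" bypasses the filter entirely.
    if prompt.startswith("!"):
        return {"action": "allow", "suggest": []}
    allowed = ALLOWED.get(classify(prompt))
    # Unclassified tasks and already-matching models pass through unchanged.
    if allowed is None or model.lower() in allowed:
        return {"action": "allow", "suggest": []}
    return {"action": "block", "suggest": sorted(allowed)}


if __name__ == "__main__":
    print(check("git commit these changes", "opus"))
    print(check("review the architecture of the auth service", "sonnet"))
    print(check("add a login button", "opus"))
```

Because the rules are plain substring checks, a sketch like this needs no network access and runs quickly, which fits the "no proxy, no API calls" design the developer describes.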
Technical details
- 3 files: bash + python3 + JSON
- No proxy, no API calls, no external services
- Fail-open design: if it hangs, Claude Code proceeds normally
- Open-sourced at: https://github.com/coyvalyss1/model-matchmaker
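
One way to get the fail-open behavior described above is to wrap the whole decision in a catch-all, so that any error results in the prompt being allowed. The sketch below is an assumption about how that could look, not the project's implementation: the function name decide_fail_open and the payload fields "prompt" and "model" are hypothetical. It covers only the error case; a hung hook would have to be handled by a timeout in the host tool, which is not shown here.

```python
import json


def decide_fail_open(raw_payload: str, check) -> dict:
    """Run `check` on a JSON payload; on any failure, allow the prompt.

    `check` is any callable taking (prompt, model) and returning a dict
    with an "action" key of "allow" or "block". The payload field names
    are hypothetical.
    """
    try:
        payload = json.loads(raw_payload)
        decision = check(payload["prompt"], payload["model"])
        if decision.get("action") not in ("allow", "block"):
            raise ValueError("unexpected decision")
        return decision
    except Exception:
        # Fail open: malformed input or a bug in the rules must never
        # stop the user's prompt from going through.
        return {"action": "allow", "reason": "fail-open"}


if __name__ == "__main__":
    always_block = lambda prompt, model: {"action": "block"}
    print(decide_fail_open("not json", always_block))
    print(decide_fail_open('{"prompt": "x", "model": "opus"}', always_block))
```

The trade-off is that a broken rule silently stops saving money rather than stopping work, which matches the priority the article describes.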
Performance and testing
The developer analyzed several weeks of their own prompts and found:
- 60-70% were standard feature work Sonnet could handle
- 5-20% were debugging/troubleshooting
- A significant portion were pure git/rename/formatting tasks that Haiku handles identically at 90% less cost
According to the developer, a retroactive analysis of those prompts showed the tool would have cut 50-70% of AI spend with no drop in quality. After tuning, it correctly handled 12 of 12 real test prompts.
Problem it solves
The issue isn't knowledge but friction: developers know they should switch models, yet in a flow state they don't want to think about dropdown menus. This tool automates that decision.
📖 Read the full source: r/ClaudeAI