Routerly: Self-Hosted LLM Gateway with Runtime Routing Policies and Budget Control

Routerly is a self-hosted LLM gateway built to address gaps in existing solutions. The developer created it because OpenRouter is cloud-based, and they wanted something runnable on their own infrastructure, while LiteLLM's routing felt too manual despite handling budgeting well.
Core Features
Instead of hardcoding a specific model in your application, Routerly lets you define routing policies that determine model selection at runtime. Available policies include:
- Cheapest
- Fastest
- Most capable
- Combinations of these policies
Budget control operates at the project level with actual per-token tracking, providing granular cost management.
Compatibility and Use
Routerly is OpenAI-compatible, meaning it can drop into existing workflows without code changes. Specifically mentioned compatible tools include:
- Cursor
- LangChain
- Open WebUI
It works with "anything else" that uses the OpenAI API format.
Current Status
The developer acknowledges there are rough edges and is seeking community feedback on:
- What's broken
- What's missing
- Whether the routing logic makes sense in practice
- Whether it solves a real problem people have
The tool is completely free and open source, with no commercial sales pitch. The developer is focused on practical feedback from the technical community.
Resources
- GitHub Repository: https://github.com/Inebrio/Routerly
- Website: https://www.routerly.ai
📖 Read the full source: r/LocalLLaMA
👀 See Also

Using a Smart Pixel Clock for Claude AI Completion Notifications
A Reddit user shares a method to display Claude AI completion notifications using a ULANZI TC001 Smart Pixel Clock with custom firmware and an HTTP endpoint.

Tether: An MCP Server for Sharing Context Between AI Models via SQLite
Tether is an open-source tool that collapses JSON data into 28-byte content-addressed handles, allowing multiple AI models to share context through a shared SQLite database. It functions as an MCP server, enabling direct communication between models like Claude and MiniMax without copy-pasting.

Improving Claude Code Sessions with claude-self-improve
Claude-self-improve is a CLI tool that enhances Claude Code's AI performance by analyzing session data and updating memory files automatically.

Open-source pipeline turns Claude Code workflow into reusable skills
A developer who used Claude Code daily for 9 months has open-sourced a pipeline that structures feature development with checkpoints like functional documentation, technical documentation, complexity estimation, and security checks. The pipeline includes /new-feature and /bug-fix entry points that guide implementation.