ClawRelay: macOS-native OpenAI-compatible LLM proxy with automatic failover

What ClawRelay does
ClawRelay is a native Swift application for macOS 15+ that runs an OpenAI-compatible HTTP server locally. You configure LLM providers in priority order (OpenAI, Groq, Nvidia NIMs, Ollama, or any service with a /v1/chat/completions endpoint). When a request comes in, it tries the first provider and automatically falls back to the next if there's a failure (rate limit, 5xx error, or timeout).
Setup and configuration
The app runs in the system tray with quick access and a full settings window. Provider API keys are stored in macOS Keychain. No Docker, Node.js, or config files are required.
To connect your tools:
- Base URL:
http://localhost:11434/v1 - API Key: optional for local use, can be generated in-app for LAN or tunnel setups
Works with Cursor, Continue.dev, LM Studio, the Python openai library, and any tool that accepts a custom base URL.
openClaw integration
For openClaw users, one command wires it up:
bash <(curl -fsSL https://www.desertstack.dev/clawrelay/enable-provider.sh ) \
--provider-id "clawrelay" \
--base-url "http://localhost:11434/v1" \
--api-key "claw_relay_key" \
--api "openai-completions" \
--model-id "clawrelay" \
--model-name "ClawRelay"Generate your key from the Servers tab in ClawRelay. Requires jq and the openclaw CLI.
Deployment options
Beyond localhost, you can bind ClawRelay to your LAN interface to reach it from any device on your network. You can also put Cloudflare Tunnel or ngrok in front to expose it to the internet. The same app and configuration work for all deployment scenarios.
Built-in features
- Request logs included
- System tray access
- Full settings window
- macOS Keychain storage for API keys
- Native Swift implementation
📖 Read the full source: r/clawdbot
👀 See Also

Community-voted Model Leaderboard for OpenClaw Released
A new community-voted leaderboard for models compatible with OpenClaw is now available, with Opus 4.5 currently leading.

Publicly Hosted MCP Servers for Health, Academic, and Government Data
A developer has built and publicly hosts 14 MCP servers providing access to CDC datasets, clinical trials, FDA data, academic publications, congressional information, weather data, and other utilities. These servers require no setup, API keys, or local installation.

RTX 5060 Ti 16GB Local LLM Benchmarks: 30B Models Still Lead for Coding
Benchmarks on an RTX 5060 Ti 16GB show Unsloth Qwen3-Coder-30B UD-Q3_K_XL achieving 76.3 tok/s on Ubuntu with quality score 8.14, making it the recommended default coding model. The Unsloth Qwen3.5-35B UD-Q2_K_XL hits 80.1 tok/s but with lower quality scores.

Jentic Mini: Self-Hosted API and Action Execution Layer for OpenClaw
Jentic Mini is a self-hosted API and action execution layer that sits between AI agents and external APIs, storing credentials in an encrypted vault and providing scoped toolkits with individually revocable keys. It automatically imports 10,000+ OpenAPI specs and Arazzo workflow sources when credentials are added.