RelayCode VS Code Extension Routes Claude Code Through Sovereign RDUs

OpenGPU has released RelayCode, a VS Code extension that acts as a local proxy for AI coding agents. The tool intercepts requests from Claude Code or GitHub Copilot and routes them through the OpenGPU Relay network to open-weight models running on sovereign infrastructure.
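
The interception step can be pictured as a small rewrite of each outgoing request. The sketch below is a minimal illustration, not RelayCode's actual code: `RELAY_BASE`, `TARGET_MODEL`, and `reroute` are hypothetical names, and the relay address is a placeholder. It assumes the proxy keeps the agent's path and query and only swaps the host and the model name.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical values for illustration only; not RelayCode's real endpoint or settings.
RELAY_BASE = "http://relay.example.invalid:8080"
TARGET_MODEL = "DeepSeek-R1"


def reroute(url: str, payload: dict) -> tuple[str, dict]:
    """Point a request at the relay and swap in an open-weight model name."""
    original = urlsplit(url)
    relay = urlsplit(RELAY_BASE)
    # Keep the path and query so the agent's API route is preserved.
    new_url = urlunsplit(
        (relay.scheme, relay.netloc, original.path, original.query, "")
    )
    # Copy the payload so the caller's dict is not mutated.
    new_payload = {**payload, "model": TARGET_MODEL}
    return new_url, new_payload
```

Calling `reroute("https://api.example.com/v1/messages", {"model": "some-model"})` would return the same path on the placeholder relay host, with the `model` field replaced and every other field left untouched.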
Key Details
The extension provides several specific features and performance characteristics:
- Infrastructure: Workloads are routed through Infercom's reconfigurable dataflow units (RDUs), described as dedicated sovereign compute that sits outside US jurisdiction and is GDPR-compliant by design.
- Performance: Benchmarks show 250+ tokens per second on DeepSeek-R1 (671B) and 400+ tokens per second on MiniMax M2.5. Model switching is near-instant (milliseconds) due to the dataflow architecture.
- Context Management: The extension automatically manages CLAUDE_AUTOCOMPACT settings to keep agents within model context windows without crashing.
- Privacy: Code stays on the local machine; only inference requests hit the relay network with no data retention.
- Current Status: The team reports about 23 installs and is seeking feedback on relay latency from the community.
- Access: Promo credits are available for testing RDU speeds for free.
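
To make the context-management point concrete, here is a rough sketch of how a compaction threshold could be derived from a model's context window. The source does not say how the extension computes its settings, so the function names, the 10% safety margin, and the window sizes in the usage note are all assumptions chosen for illustration.

```python
def compaction_threshold(
    context_window: int, reserved_output: int, safety_margin_pct: int = 10
) -> int:
    """Return the prompt-token count at which history should be compacted.

    All parameters are illustrative assumptions, not RelayCode settings.
    """
    if context_window <= 0 or reserved_output < 0:
        raise ValueError("context_window must be positive, reserved_output non-negative")
    if not 0 <= safety_margin_pct < 100:
        raise ValueError("safety_margin_pct must be in [0, 100)")
    # Tokens left for the prompt once room for the reply is set aside.
    usable = context_window - reserved_output
    if usable <= 0:
        raise ValueError("reserved_output leaves no room for the prompt")
    # Integer arithmetic avoids floating-point rounding surprises.
    return usable * (100 - safety_margin_pct) // 100


def needs_compaction(
    prompt_tokens: int, context_window: int, reserved_output: int
) -> bool:
    """True when the prompt has reached the compaction threshold."""
    return prompt_tokens >= compaction_threshold(context_window, reserved_output)
```

With an assumed 128,000-token window and 8,000 tokens reserved for output, `compaction_threshold(128_000, 8_000)` gives 108,000, so a 110,000-token conversation would be compacted before the next request rather than failing at the model's limit.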
The tool is positioned as a way to reduce Anthropic API costs while keeping Claude CLI workflows intact, and is pitched as particularly useful for refactoring work.
📖 Read the full source: r/LocalLLaMA