Voxray-AI: Production Go Backend for Real-Time Voice Agent Pipelines

Production Voice Agent Pipeline in Go
Voxray-AI provides a complete streaming pipeline in Go: it accepts client audio over WebSocket or WebRTC, runs it through STT → LLM → TTS, and returns the synthesized audio. The system is designed for production servers and high-concurrency voice workloads.
Transport Options
The system supports multiple transport mechanisms:
- WebSocket at /ws with RTVI serializer (?rtvi=1) and Protobuf (?format=protobuf) support
- WebRTC at /webrtc/offer with full SDP offer/answer, configurable STUN/TURN, and Opus encoding (requires CGO build)
- Telephony runner transports: Twilio, Telnyx, Plivo, Exotel, LiveKit, Daily.co
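As a rough illustration of the WebSocket options above, the following Go sketch builds a client URL for the /ws endpoint with the two documented query parameters. The host, port, and the helper name wsEndpoint are assumptions made for this example, and whether the two parameters can be combined is not stated in the source.

```go
package main

import (
	"fmt"
	"net/url"
)

// wsEndpoint is a hypothetical helper that builds a client URL for the
// /ws transport. It sets ?rtvi=1 when rtvi is true and ?format=<format>
// when format is non-empty (for example "protobuf").
func wsEndpoint(base string, rtvi bool, format string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", fmt.Errorf("parse base url: %w", err)
	}
	u.Path = "/ws"
	q := url.Values{}
	if rtvi {
		q.Set("rtvi", "1")
	}
	if format != "" {
		q.Set("format", format)
	}
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// localhost:8080 is a placeholder address, not taken from the project.
	u, err := wsEndpoint("ws://localhost:8080", true, "")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u) // ws://localhost:8080/ws?rtvi=1
}
```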
Pluggable Providers
All components are swappable via configuration:
- STT providers: OpenAI, Groq, Sarvam, Google, AWS
- LLM providers: OpenAI, Anthropic, Groq, others
- TTS providers: OpenAI, Google, AWS Polly, Sarvam
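One common way to make providers swappable by configuration is a registry that maps provider names to factory functions. The sketch below shows that pattern for the LLM stage only. The LLM interface, the Registry type, and the echo provider are all hypothetical and are not taken from the Voxray-AI codebase.

```go
package main

import (
	"fmt"
)

// LLM is a hypothetical minimal interface. A real streaming pipeline
// would more likely expose token streams than a single string.
type LLM interface {
	Complete(prompt string) (string, error)
}

// LLMFactory builds an LLM for a given model name.
type LLMFactory func(model string) LLM

// Registry maps provider names from the config to factories.
type Registry struct {
	factories map[string]LLMFactory
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{factories: make(map[string]LLMFactory)}
}

// Register associates a provider name with its factory.
func (r *Registry) Register(provider string, f LLMFactory) {
	r.factories[provider] = f
}

// Build looks up the provider and constructs an LLM for the model.
func (r *Registry) Build(provider, model string) (LLM, error) {
	f, ok := r.factories[provider]
	if !ok {
		return nil, fmt.Errorf("unknown llm provider %q", provider)
	}
	return f(model), nil
}

// echoLLM is a stand-in provider used only for this example.
type echoLLM struct{ model string }

func (e echoLLM) Complete(prompt string) (string, error) {
	return e.model + ": " + prompt, nil
}

func main() {
	reg := NewRegistry()
	reg.Register("echo", func(model string) LLM { return echoLLM{model: model} })

	llm, err := reg.Build("echo", "demo-model")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	out, _ := llm.Complete("hello")
	fmt.Println(out) // demo-model: hello

	// An unregistered provider name yields an error instead of a panic.
	if _, err := reg.Build("missing", ""); err != nil {
		fmt.Println("error:", err)
	}
}
```

The same shape would apply to the STT and TTS stages, with one registry per stage keyed by the "provider" value from the configuration.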
Configuration Examples
Minimal configuration example:
{
  "transport": "both",
  "stt": { "provider": "groq", "model": "whisper-large-v3" },
  "llm": { "provider": "anthropic", "model": "claude-3-5-haiku" },
  "tts": { "provider": "google", "voice": "en-US-Neural2-F" }
}
Turn-taking and voice activity detection configuration:
{
  "turn_detection": "silence",
  "vad_type": "silero",
  "vad_confidence": 0.7,
  "vad_start_secs_vad": 0.2,
  "vad_stop_secs": 0.8,
  "turn_max_duration_secs": 30,
  "user_idle_timeout_secs": 60
}
Observability & Storage
- /metrics endpoint for Prometheus (request counts, latency histograms, active connection gauges)
- Recording: Full session audio to S3 with configurable worker pool and format
- Transcripts: Per-message storage to Postgres or MySQL with configurable table
- /health and /ready endpoints with optional Redis session store check on /ready
Security Features
- server_api_key gates /ws, /webrtc/offer, /start, /sessions/* via Authorization: Bearer or X-API-Key
- CORS allowlist configuration
- TLS cert/key configuration
- 12-factor style: JSON config + environment variable overrides
This type of backend is useful for developers building real-time voice applications that need to integrate multiple AI services with production-ready infrastructure.
📖 Read the full source: r/LocalLLaMA