Building a Voice Assistant with OpenClaw, Alexa, and Local LLM

✍️ OpenClawRadar | 📅 Published: March 1, 2026 | 🔗 Source

A developer shared their implementation of a voice-first assistant that uses OpenClaw as the AI agent backbone, integrated with Alexa for voice interaction and a local LLM for cost-effective query handling.

How It Works

The system is triggered by saying "Alexa, open Lucy" to a custom Alexa skill. Queries are processed through a four-tier routing system:

  • Fast path (0ms): Handles time, date, and hardcoded responses
  • Ollama local LLM (<1s): Uses Qwen 2.5 3B for general knowledge queries, running on a Mac Mini with Apple Silicon
  • Claude agent (5-12s): Handles personal context, memory, and complex reasoning
  • Deferred + tools (up to 2min): Handles email, web search, and database queries, with the result spoken later via Home Assistant TTS
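
The four tiers above can be sketched as a small routing function. This is an illustrative Python sketch, not the author's Node.js proxy code: the phrase and keyword lists, the `Tier` names, and the functions `pick_tier` and `fast_path_answer` are all assumptions made for the example, since the post does not describe the actual routing rules.

```python
import re
from datetime import datetime
from enum import Enum


class Tier(Enum):
    FAST_PATH = "fast_path"            # hardcoded answers, no model call
    LOCAL_LLM = "local_llm"            # Ollama + Qwen 2.5 3B
    CLAUDE_AGENT = "claude_agent"      # personal context, memory, reasoning
    DEFERRED_TOOLS = "deferred_tools"  # email, web search, database


# Hypothetical rules for illustration; the real skill's rules are not published here.
FAST_PHRASES = {"what time is it", "what is the date"}
TOOL_WORDS = {"email", "search", "database"}
PERSONAL_WORDS = {"my", "remember", "remind"}


def normalize(query: str) -> str:
    """Lowercase the query and drop punctuation."""
    return " ".join(re.findall(r"[a-z0-9']+", query.lower()))


def pick_tier(query: str) -> Tier:
    """Pick a tier by checking the fast path, then tools, then personal context."""
    text = normalize(query)
    words = set(text.split())
    if text in FAST_PHRASES:
        return Tier.FAST_PATH
    if words & TOOL_WORDS:
        return Tier.DEFERRED_TOOLS
    if words & PERSONAL_WORDS:
        return Tier.CLAUDE_AGENT
    return Tier.LOCAL_LLM  # default: general knowledge goes to the local model


def fast_path_answer(query: str, now: datetime) -> str:
    """Answer a fast-path query without calling any model."""
    text = normalize(query)
    if text == "what time is it":
        return f"It is {now:%H:%M}."
    if text == "what is the date":
        return f"Today is {now:%Y-%m-%d}."
    raise ValueError("not a fast-path query")
```

Checking the tool keywords before the personal ones means a request such as "Search my email for invoices" goes to the deferred tier rather than to the Claude agent. A real router would likely need something more robust than keyword matching.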

Responses return to the same Alexa device that initiated the query, auto-detected via Home Assistant's last_called feature. The system uses Piper TTS on Home Assistant for neural Spanish voice output on Sonos speakers and can deliver morning briefings with market data, calendar information, and business metrics.
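
The device auto-detection can be sketched as a scan over Home Assistant entity states. This Python sketch assumes the states come from Home Assistant's REST endpoint `/api/states` (a list of objects with `entity_id` and `attributes`) and that the Alexa Media Player integration marks the most recently used Echo with a `last_called` attribute; the function name `find_last_called` is hypothetical, and the post does not show how the author reads this value.

```python
from __future__ import annotations

from typing import Optional


def find_last_called(states: list[dict]) -> Optional[str]:
    """Return the entity_id of the media player flagged as last_called, if any."""
    for state in states:
        entity_id = state.get("entity_id", "")
        attributes = state.get("attributes", {})
        # Only media_player entities carry the flag we care about.
        if entity_id.startswith("media_player.") and attributes.get("last_called") is True:
            return entity_id
    return None  # no device has been flagged yet
```

Returning `None` when no device is flagged lets the caller fall back to a default speaker instead of failing.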

Technical Stack

  • OpenClaw: AI agent backbone supporting Telegram, Alexa, and voice interfaces
  • Alexa Custom Skill: Node.js proxy with PIN authentication and session chaining
  • Ollama + Qwen 2.5 3B: Local LLM providing ~0.5s responses
  • Home Assistant: Integrates Alexa Media Player, Piper TTS, and device routing
  • Piper TTS: Neural Spanish voice for Sonos speakers
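
As an illustration of the local LLM tier, the sketch below calls Ollama's `/api/generate` endpoint with streaming disabled. It is written in Python for brevity, not taken from the author's Node.js proxy. The default port 11434 and the model tag `qwen2.5:3b` are assumptions about the setup, and the helper names `build_payload` and `ask_local_llm` are hypothetical.

```python
import json
import urllib.request

# Assumed defaults; adjust to match the actual Ollama host and model tag.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "qwen2.5:3b"


def build_payload(prompt: str) -> dict:
    """Build a non-streaming generate request for Ollama."""
    return {"model": MODEL, "prompt": prompt, "stream": False}


def ask_local_llm(prompt: str, timeout: float = 5.0) -> str:
    """Send the prompt to Ollama and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # With "stream": False, Ollama returns one JSON object with a "response" field.
    return body["response"]
```

A short timeout matters in a voice setting: if the local model stalls, the proxy can fall back to another tier before the Alexa request itself times out.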

Key Implementation Details

The developer found that using a local LLM saves approximately 80% of API costs for simple questions that don't require Claude. However, they noted that local models "hallucinate freely" and added a bypass filter for business and financial queries.
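
A bypass filter of this kind could look like the following Python sketch. The word list and the function name `must_bypass_local_llm` are assumptions for illustration only; the post does not say how the author detects business and financial queries.

```python
import re

# Hypothetical trigger words; a real deployment would tune this list.
SENSITIVE_WORDS = {"revenue", "invoice", "profit", "sales", "stock", "price", "budget"}


def must_bypass_local_llm(query: str) -> bool:
    """Return True when the query should skip the local model entirely."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return bool(words & SENSITIVE_WORDS)
```

Running the check before the local tier means a question such as "What was our revenue last month?" never reaches the small model, which would otherwise be free to invent a figure.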

Alexa speech recognition was identified as the bottleneck; using the AMAZON.SearchQuery slot type and adding multiple sample utterances helped improve accuracy. Authentication is keyed on userId rather than sessionId, since Alexa generates a new session for each invocation. The developer persists authentication state to a file because an in-memory Map does not survive proxy restarts.
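
The userId-keyed, file-backed authentication can be sketched as below. This is a Python illustration under stated assumptions, not the author's Node.js code: the `AuthStore` class, the JSON file layout, and `extract_user_id` are hypothetical, and the sketch assumes the userId is read from `context.System.user.userId` in the Alexa request envelope. PIN checking itself is left out.

```python
import json
import os
from pathlib import Path


def extract_user_id(alexa_request: dict) -> str:
    """Read the stable userId from an Alexa request envelope."""
    return alexa_request["context"]["System"]["user"]["userId"]


class AuthStore:
    """Remembers which userIds have passed the PIN check, across restarts."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        self._users = self._load()

    def _load(self) -> set:
        try:
            return set(json.loads(self.path.read_text(encoding="utf-8")))
        except FileNotFoundError:
            return set()  # first run: nobody is authenticated yet

    def _save(self) -> None:
        # Write to a temporary file, then swap it in, so a crash cannot
        # leave a half-written auth file behind.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(sorted(self._users)), encoding="utf-8")
        os.replace(tmp, self.path)

    def is_authenticated(self, user_id: str) -> bool:
        return user_id in self._users

    def mark_authenticated(self, user_id: str) -> None:
        self._users.add(user_id)
        self._save()
```

Because the store is reloaded from disk in the constructor, a restarted proxy recognizes a user who entered the PIN earlier. A production version would probably also record an expiry time per user.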

The proxy code is available as open source: openclaw-alexa-voice. Future plans include wake word detection ("Hey Lucy"), smart home control, and presence-based speaker routing.

📖 Read the full source: r/openclaw
