Local voice-to-text transcription for OpenClaw using Parakeet TDT 0.6b v3

Local transcription setup for OpenClaw
A community developer has adapted NVIDIA's Parakeet TDT 0.6b v3 model for local voice-to-text transcription within OpenClaw. The model runs via ONNX inference on CPU, eliminating API costs and supporting 25 European languages.
Technical implementation
The solution uses a GitHub repository (groxaxo/parakeet-tdt-0.6b-v3-fastapi-openai) that provides a Docker container for CPU deployment. The container exposes an OpenAI-compatible API endpoint at http://127.0.0.1:5092/v1.
Supported languages include: Bulgarian (bg), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), German (de), Greek (el), Hungarian (hu), Italian (it), Latvian (lv), Lithuanian (lt), Maltese (mt), Polish (pl), Portuguese (pt), Romanian (ro), Slovak (sk), Slovenian (sl), Spanish (es), Swedish (sv), Russian (ru), and Ukrainian (uk).
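For callers that want to check a language code before sending audio, the list above can be captured as a small lookup table. The sketch below is illustrative only: the names SUPPORTED_LANGUAGES and normalize_language are hypothetical, and the source does not say whether the server accepts or needs a language hint at all.

```python
# Language codes copied from the list above. The names in this block are
# hypothetical helpers, not part of the repository or of OpenClaw.
SUPPORTED_LANGUAGES = {
    "bg": "Bulgarian", "hr": "Croatian", "cs": "Czech", "da": "Danish",
    "nl": "Dutch", "en": "English", "et": "Estonian", "fi": "Finnish",
    "fr": "French", "de": "German", "el": "Greek", "hu": "Hungarian",
    "it": "Italian", "lv": "Latvian", "lt": "Lithuanian", "mt": "Maltese",
    "pl": "Polish", "pt": "Portuguese", "ro": "Romanian", "sk": "Slovak",
    "sl": "Slovenian", "es": "Spanish", "sv": "Swedish", "ru": "Russian",
    "uk": "Ukrainian",
}


def normalize_language(code: str) -> str:
    """Return the lowercase two-letter code, or raise ValueError if unsupported."""
    # Accept inputs such as "EN" or "pt-BR" by keeping only the primary subtag.
    primary = code.strip().lower().replace("_", "-").split("-")[0]
    if primary not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {code!r}")
    return primary
```

A caller could run normalize_language on a user-supplied locale and skip transcription, or fall back to another tool, when it raises.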
Integration with OpenClaw
The developer provides a Python script for transcription:
#!/home/openclaw/.local/share/pipx/venvs/openai/bin/python
import sys

from openai import OpenAI

# Point the OpenAI client at the local Parakeet server; the key is a placeholder.
client = OpenAI(
    base_url="http://127.0.0.1:5092/v1",
    api_key="sk-no-key-required",
)

# Open the audio file passed as the first argument and close it when done.
with open(sys.argv[1], "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="parakeet-tdt-0.6b-v3",
        file=audio_file,
        response_format="text",
    )

print(transcript)
Saved as an executable at /home/openclaw/.local/bin/transcribe, the script can be registered in OpenClaw's openclaw.json file:
"tools": {
  "media": {
    "audio": {
      "enabled": true,
      "models": [
        {
          "type": "cli",
          "command": "/home/openclaw/.local/bin/transcribe",
          "args": ["{{MediaPath}}"],
          "timeoutSeconds": 60
        }
      ]
    }
  }
}
Alternatively, OpenClaw can be configured to use the OpenAI-compatible API endpoint directly, with the model name and dummy API key from the script.
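To illustrate what talking to the endpoint directly involves, here is a minimal sketch that sends the same request using only the Python standard library. It assumes the server accepts the standard OpenAI multipart upload at /audio/transcriptions under the base URL, which is the path the openai client in the script targets. The helper names build_multipart and transcribe are hypothetical, and this hand-built request has not been verified against the container.

```python
import urllib.request
import uuid
from pathlib import Path

BASE_URL = "http://127.0.0.1:5092/v1"  # endpoint from the article
MODEL = "parakeet-tdt-0.6b-v3"         # model name from the article


def build_multipart(fields: dict, file_name: str, file_bytes: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one 'file' part as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, timeout: float = 60.0) -> str:
    """POST an audio file to the local server and return the plain-text reply."""
    audio = Path(path)
    body, content_type = build_multipart(
        {"model": MODEL, "response_format": "text"}, audio.name, audio.read_bytes()
    )
    request = urllib.request.Request(
        f"{BASE_URL}/audio/transcriptions",
        data=body,
        headers={
            "Content-Type": content_type,
            # Placeholder key, mirroring the dummy key in the script.
            "Authorization": "Bearer sk-no-key-required",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling transcribe("/path/to/audio.ogg") would return the transcript as a string if the assumption about the endpoint holds. The 60-second default mirrors timeoutSeconds in the configuration.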
Deployment notes
The developer tested this on an ARM64 Ubuntu Linux VM on a Mac Mini with M4 Pro, noting it should run reasonably fast on any decent Intel-compatible CPU. The Docker container is built following the README instructions in the GitHub repository.
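Model loading on CPU can take a while after the container starts, so a caller may want to wait until the port accepts connections before sending audio. The sketch below polls 127.0.0.1:5092 using only the standard library. The function name wait_for_port and the timing values are illustrative assumptions, and an open port does not guarantee that the model has finished loading.

```python
import socket
import time


def wait_for_port(host: str = "127.0.0.1", port: int = 5092, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Connection refused or timed out; pause briefly and retry.
            time.sleep(0.5)
    return False
```

A wrapper script could call wait_for_port() once at startup and exit with an error message if it returns False.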
📖 Read the full source: r/openclaw