Building an AI Receptionist for a Mechanic Shop: RAG Pipeline and Voice Integration

Building the RAG Pipeline
The first step was creating an accurate knowledge base to prevent hallucinations. The developer scraped the mechanic shop's website service pages and pricing into markdown files, creating a structured knowledge base covering 21+ documents including service types, pricing, turnaround times, hours, payment methods, cancellation policies, warranty info, loaner vehicles, and specialized car makes.
Each document was converted into a 1024-dimensional vector using Voyage AI (voyage-3-large) and stored in MongoDB Atlas alongside the raw text, with an Atlas Vector Search index on the embedding field.
When a customer asks a question, the query gets embedded using the same Voyage AI model and runs against the Atlas Vector Search index, returning the top 3 most semantically similar documents. Retrieved documents get passed as context to Anthropic Claude (claude-sonnet-4-6) with a strict system prompt: answer only from the knowledge base, keep responses short and conversational, and if you don't know — say so and offer to take a message.
Example response: "How much is an oil change?" → "$45 for conventional, $75 for synthetic. Includes oil filter, fluid top-off, and tire pressure check. Takes about 30 minutes."
Connecting to a Real Phone Line
The developer used Vapi as the voice platform to handle telephony: purchasing a phone number, speech-to-text (via Deepgram), text-to-speech (via ElevenLabs), and real-time function calling back to the server.
A FastAPI webhook server was built with a /webhook endpoint. When a caller asks a question, Vapi sends a tool-calls request to this endpoint with the caller's query. The server routes that to the RAG pipeline, gets a response from Claude, and sends it back to Vapi, which reads it aloud to the caller.
During development, the server runs locally on port 8000 and is exposed using Ngrok, which creates a tunnel to a public HTTPS URL that gets pasted into the Vapi dashboard as the webhook endpoint.
In the Vapi dashboard, the assistant was configured with a greeting ("Hi, thanks for calling Dane's Motorsport, how can I help you today?") and two tools: answerQuestion for RAG-backed responses and saveCallback for collecting a name and number when a question can't be answered.
Vapi sends the full conversation history with each request, enabling conversation memory.
📖 Read the full source: HN AI Agents
👀 See Also

AI Coding Agents Take Shortcuts: Developer Documents Cases of Claude and ChatGPT Choosing Easiest Path
A developer building a sensor fusion device found both Claude and ChatGPT merged dual microphone inputs into mono instead of implementing beamforming for spatial awareness. In a separate model training task, AI initially pooled subjects of different sizes together without grouping by age cohorts.

Three Practical Patterns for Making Money with OpenClaw
Analysis of 100 OpenClaw users shows three consistent approaches: turning existing knowledge into AI assistants, automating repetitive research, and selling time-saving outcomes rather than AI features.

Building a Productive Autonomous ML Research System with Claude Code
A developer built a system where Claude Code acts as an autonomous ML researcher on tabular data, running experiments overnight with constrained file editing and Docker sandboxing. Key learnings include locking down editable files, protecting experiment throughput with limits, and implementing persistent memory through structured logging.

Building a Discord Cat Monitoring Bot with ESP32-S3, MiniClaw, and Multimodal AI
A developer built a Discord bot using an ESP32-S3 Sense with MiniClaw that captures images or audio of their cat, sends them to Zhipu AI's VLM-4V model, and returns natural language descriptions instead of generic motion alerts.