How to Build an AI Receptionist: RAG Pipeline & Voice Integration

Building the RAG Pipeline

The first step was building an accurate knowledge base to prevent hallucinations. The developer scraped the service and pricing pages of the mechanic shop's website into markdown files, producing a structured knowledge base of 21+ documents that cover service types, pricing, turnaround times, hours, payment methods, cancellation policies, warranty info, loaner vehicles, and the car makes the shop specializes in.

Each document was converted into a 1024-dimensional vector using Voyage AI (voyage-3-large) and stored in MongoDB Atlas alongside the raw text, with an Atlas Vector Search index on the embedding field.
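The storage step above can be sketched in a few lines. This is a minimal illustration, not the developer's code: the field names title, text, and embedding, the cosine similarity metric, and the helper names are all assumptions, and the embedding call is passed in as a plain function so the sketch runs without network access.

```python
from typing import Callable, Sequence

EMBEDDING_DIMS = 1024  # dimensionality stated in the text for voyage-3-large


def build_vector_index_definition(path: str = "embedding",
                                  dims: int = EMBEDDING_DIMS,
                                  similarity: str = "cosine") -> dict:
    """Return an Atlas Vector Search index definition for one vector field."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": dims,
                # Assumption: the source does not say which metric was used.
                "similarity": similarity,
            }
        ]
    }


def make_record(title: str, text: str,
                embed: Callable[[str], Sequence[float]]) -> dict:
    """Pair a document's raw text with its embedding as one MongoDB document."""
    vector = list(embed(text))
    if len(vector) != EMBEDDING_DIMS:
        raise ValueError(f"expected {EMBEDDING_DIMS} dimensions, got {len(vector)}")
    return {"title": title, "text": text, "embedding": vector}
```

In a real run, embed would wrap a call to the Voyage AI client, and the resulting records would be written to the collection with pymongo's insert_many. Checking the vector length before insert catches a model mismatch early, since the index only accepts vectors of the declared size.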

When a customer asks a question, the query is embedded with the same Voyage AI model and run against the Atlas Vector Search index, which returns the 3 most semantically similar documents. The retrieved documents are passed as context to Anthropic Claude (claude-sonnet-4-6) with a strict system prompt: answer only from the knowledge base, keep responses short and conversational, and if you don't know, say so and offer to take a message.
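The retrieval and prompting step might look roughly like this. The sketch only builds the $vectorSearch aggregation pipeline and the text handed to the model; the index name vector_index, the numCandidates value, the field names, and the exact prompt wording are assumptions, since the source gives only the prompt's rules and the top-3 limit.

```python
from typing import Sequence

# Paraphrase of the rules described in the text; the real prompt wording is unknown.
SYSTEM_PROMPT = (
    "You are the phone receptionist for a mechanic shop. "
    "Answer only from the knowledge base excerpts provided. "
    "Keep responses short and conversational. "
    "If the answer is not in the excerpts, say so and offer to take a message."
)


def build_search_pipeline(query_vector: Sequence[float],
                          index_name: str = "vector_index",  # hypothetical name
                          top_k: int = 3,
                          num_candidates: int = 100) -> list:
    """Build an aggregation pipeline that returns the top_k nearest documents."""
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": "embedding",
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,  # assumption: not given in the source
                "limit": top_k,
            }
        },
        # Keep only the raw text and the similarity score.
        {"$project": {"_id": 0, "text": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


def build_user_message(question: str, docs: Sequence[dict]) -> str:
    """Combine the retrieved excerpts and the caller's question into one message."""
    context = "\n\n".join(f"[{i}] {d['text']}" for i, d in enumerate(docs, start=1))
    return f"Knowledge base excerpts:\n{context}\n\nCaller question: {question}"
```

The pipeline would be passed to the collection's aggregate method, and SYSTEM_PROMPT plus the built message would be sent to Claude. Keeping the excerpts in the user message and the rules in the system prompt makes it easy to tell, when debugging a wrong answer, whether retrieval or generation was at fault.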

Example response: "How much is an oil change?" → "$45 for conventional, $75 for synthetic. Includes oil filter, fluid top-off, and tire pressure check. Takes about 30 minutes."

Connecting to a Real Phone Line

The developer used Vapi as the voice platform to handle telephony: purchasing a phone number, speech-to-text (via Deepgram), text-to-speech (via ElevenLabs), and real-time function calling back to the server.

A FastAPI webhook server was built with a /webhook endpoint. When a caller asks a question, Vapi sends a tool-calls request to this endpoint with the caller's query. The server routes that to the RAG pipeline, gets a response from Claude, and sends it back to Vapi, which reads it aloud to the caller.
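The core of that request and response cycle can be sketched as a plain function. The payload field names used here (message, toolCallList, id, function, arguments, and the results / toolCallId reply shape) are assumptions about Vapi's tool-calls format and should be checked against its current documentation; the query argument name is hypothetical.

```python
import json
from typing import Callable


def handle_tool_calls(payload: dict, answer: Callable[[str], str]) -> dict:
    """Turn a tool-calls webhook payload into a results response.

    `answer` stands in for the RAG pipeline: it takes the caller's
    question and returns the text to be read aloud.
    """
    message = payload.get("message", {})
    results = []
    for call in message.get("toolCallList", []):
        # Assumption: the call may be flat or nested under "function".
        fn = call.get("function", call)
        args = fn.get("arguments") or {}
        if isinstance(args, str):
            # Some function-calling formats send arguments as a JSON string.
            args = json.loads(args)
        query = args.get("query", "")  # hypothetical parameter name
        results.append({"toolCallId": call.get("id"), "result": answer(query)})
    return {"results": results}
```

In the FastAPI server, a route registered with @app.post("/webhook") could read the body with await request.json(), pass it to this function, and return the resulting dict. Keeping the logic outside the route makes it testable without starting a server or placing a call.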

During development, the server ran locally on port 8000 and was exposed through ngrok, which tunnels a public HTTPS URL to the local port. That URL was pasted into the Vapi dashboard as the webhook endpoint.

In the Vapi dashboard, the assistant was configured with a greeting ("Hi, thanks for calling Dane's Motorsport, how can I help you today?") and two tools: answerQuestion for RAG-backed responses and saveCallback for collecting a name and number when a question can't be answered.
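On the server side, the two tools imply a small dispatch step. The sketch below is one way to route by tool name; the argument names (query, name, phone), the in-memory list, and the reply wording are all hypothetical, and answer_question is a placeholder for the RAG pipeline rather than a real implementation.

```python
from typing import Callable, Dict, List

# Stand-in for a real store such as a database collection.
callbacks: List[dict] = []


def answer_question(args: dict) -> str:
    """Placeholder for the RAG-backed answer (embed, search, ask Claude)."""
    return "(RAG answer for: " + str(args.get("query", "")) + ")"


def save_callback(args: dict) -> str:
    """Record a caller's name and number when a question can't be answered."""
    name = str(args.get("name", "")).strip()
    phone = str(args.get("phone", "")).strip()
    if not name or not phone:
        return "I still need a name and a phone number to take a message."
    callbacks.append({"name": name, "phone": phone})
    return f"Thanks {name}, someone will call you back at {phone}."


# Tool names as configured in the dashboard, mapped to their handlers.
TOOLS: Dict[str, Callable[[dict], str]] = {
    "answerQuestion": answer_question,
    "saveCallback": save_callback,
}


def dispatch(tool_name: str, args: dict) -> str:
    """Route a tool call to its handler, with a spoken fallback for unknown tools."""
    handler = TOOLS.get(tool_name)
    if handler is None:
        return "Sorry, I can't help with that right now."
    return handler(args)
```

Returning a speakable sentence from every branch, including the error paths, matters here because whatever the server sends back is read aloud to the caller.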

Vapi sends the full conversation history with each request, which gives the assistant conversation memory: a follow-up question can be answered in the context of earlier turns.
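One way to use that history is to convert it into the message list sent to Claude. The sketch assumes the history arrives as a list of {"role", "content"} turns (the actual field layout in Vapi's payload is not given in the source) and that the Anthropic Messages API expects the conversation to open with a user turn.

```python
def to_claude_messages(history: list) -> list:
    """Convert conversation turns into a user/assistant message list."""
    messages = []
    for turn in history:
        role = turn.get("role")
        text = (turn.get("content") or "").strip()
        if role not in ("user", "assistant") or not text:
            continue  # skip system and tool turns, and empty content
        if messages and messages[-1]["role"] == role:
            # Merge consecutive turns from the same speaker.
            messages[-1]["content"] += "\n" + text
        else:
            messages.append({"role": role, "content": text})
    # Drop leading assistant turns (such as the greeting) so a user turn comes first.
    while messages and messages[0]["role"] != "user":
        messages.pop(0)
    return messages
```

With this in place, a follow-up such as "and for synthetic?" reaches the model together with the earlier oil-change question, so it can be resolved without the caller repeating themselves.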

📖 Read the full source: HN AI Agents
