Kelet: Automated Root Cause Analysis for AI Agents

What Kelet Does

Kelet is a service that continuously monitors AI agents and LLM applications in production to automatically identify why they fail. Agents rarely crash with clear errors; more often they quietly give wrong answers, and finding out why requires manual trace analysis. Kelet automates this investigation by clustering failure patterns across thousands of sessions.

How It Works

You connect your traces and signals (user feedback, edits, clicks, sentiment, LLM-as-a-judge, etc.)
Kelet processes those signals and extracts facts about each session
It forms hypotheses about what went wrong in each case
It clusters similar hypotheses across sessions and investigates them together
It surfaces a root cause with a suggested fix you can review and apply

The key insight: individual session failures look random, but when you cluster the hypotheses, failure patterns emerge.
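
To make the clustering idea concrete, here is a minimal sketch in Python. It is not Kelet's algorithm, which the source does not describe. The `Hypothesis` class, the word-overlap (Jaccard) similarity, the greedy grouping, and the 0.5 threshold are all assumptions chosen only to show how per-session hypotheses that look unrelated in isolation can collapse into one recurring pattern.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    # Hypothetical record: one free-text guess at what went wrong in a session.
    session_id: str
    text: str


def tokens(text: str) -> frozenset[str]:
    """Lowercased word set, ignoring words of three letters or fewer."""
    return frozenset(w for w in text.lower().split() if len(w) > 3)


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Share of words the two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(hypotheses: list[Hypothesis], threshold: float = 0.5) -> list[list[Hypothesis]]:
    """Greedy single pass: join the first cluster whose first member is similar enough."""
    clusters: list[list[Hypothesis]] = []
    for h in hypotheses:
        for group in clusters:
            if jaccard(tokens(h.text), tokens(group[0].text)) >= threshold:
                group.append(h)
                break
        else:
            # No existing cluster matched, so this hypothesis starts a new one.
            clusters.append([h])
    # Largest clusters first: these are the recurring failure patterns.
    return sorted(clusters, key=len, reverse=True)


# Made-up example data, for illustration only.
hypotheses = [
    Hypothesis("s1", "retrieval returned stale pricing document"),
    Hypothesis("s2", "tool call timed out during checkout"),
    Hypothesis("s3", "retrieval returned stale pricing page"),
    Hypothesis("s4", "retrieval returned stale pricing document again"),
]

groups = cluster(hypotheses)
for group in groups:
    print(len(group), [h.session_id for h in group])
```

In this toy data, three of the four sessions land in the same group, which is the kind of signal that a single trace would not reveal. A production system would more likely compare hypotheses with embeddings or an LLM rather than word overlap.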

Integration Options

Three ways to integrate:

Kelet Skill for coding agents: Scans your codebase, discovers where signals should be collected, and sets everything up automatically
Python SDK: pip install kelet
TypeScript SDK: npm install kelet

Manual setup requires adding two lines to your agent code. Kelet is fully OpenTelemetry-compliant, so any OTEL-instrumented agent works out of the box.

Supported Frameworks and Platforms

Works with: OpenTelemetry, Langfuse, Mixpanel, OpenAI, Anthropic, LangChain, pydantic AI SDK, CrewAI, Strands, Agno, Mastra, PostHog, LangGraph, AutoGen, LlamaIndex, Haystack, Semantic Kernel, and Gemini APIs.

Works with any agent or LLM application where you own the code: agentic loops, multi-step workflows, RAG pipelines, chatbots, autonomous agents.

Two situations where Kelet isn't the right fit:

If you use AI tools built by others as a developer (Cursor, Claude Code, Copilot)
If you're building a skill or plugin inside an existing agentic platform

Technical Details

Runs on Kelet's servers (SOC 2 certified)
Continuously ingests traces 24/7
LLM tokens for analysis are covered by Kelet (they don't touch your model API bill)
Pricing based on usage (see kelet.ai/pricing)
Currently free during beta (no credit card required)

Performance Metrics

From pilot cohort data:

73% of teams had failures nobody noticed (Kelet found them)
14.3 minutes median time from trace ingestion to prompt patch
33K+ sessions analyzed across design partner deployments

📖 Read the full source: HN AI Agents
