Kelet: Automated Root Cause Analysis for AI Agents

What Kelet Does
Kelet is a service that continuously monitors AI agents and LLM applications in production and automatically identifies why they fail. Agents rarely crash with clear errors; more often they quietly give wrong answers, and finding the cause requires manual trace analysis. Kelet automates this investigation by clustering failure patterns across thousands of sessions.
How It Works
- You connect your traces and signals (user feedback, edits, clicks, sentiment, LLM-as-a-judge, etc.)
- Kelet processes those signals and extracts facts about each session
- It forms hypotheses about what went wrong in each case
- It clusters similar hypotheses across sessions and investigates them together
- It surfaces a root cause with a suggested fix you can review and apply
The key insight: individual session failures look random, but when you cluster the hypotheses, failure patterns emerge.
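To make the clustering idea concrete, here is a minimal sketch in Python. It is not Kelet's implementation: the Hypothesis type, the word-overlap (Jaccard) similarity, the 0.5 threshold, and the sample hypothesis texts are all assumptions chosen for illustration. It only shows how grouping per-session hypotheses can turn scattered failures into a visible pattern.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """Hypothetical record: one guess about why one session failed."""
    session_id: str
    text: str


def tokens(text: str) -> frozenset[str]:
    # Lowercased word set, used as a crude similarity signature.
    return frozenset(text.lower().split())


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    # Share of words the two sets have in common (1.0 means identical).
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_hypotheses(hypotheses, threshold=0.5):
    """Greedy single-pass clustering.

    Each hypothesis joins the first cluster whose first member is similar
    enough; otherwise it starts a new cluster. Largest clusters come first.
    """
    clusters: list[list[Hypothesis]] = []
    for h in hypotheses:
        for cluster in clusters:
            if jaccard(tokens(h.text), tokens(cluster[0].text)) >= threshold:
                cluster.append(h)
                break
        else:
            clusters.append([h])
    return sorted(clusters, key=len, reverse=True)


# Made-up sample data: three sessions share a cause, one does not.
sample = [
    Hypothesis("s1", "retriever returned stale pricing document"),
    Hypothesis("s2", "retriever returned stale pricing page"),
    Hypothesis("s3", "tool call timed out before response"),
    Hypothesis("s4", "retriever returned stale document"),
]
clusters = cluster_hypotheses(sample)
for cluster in clusters:
    print(len(cluster), "sessions:", cluster[0].text)
```

Viewed one at a time, the four sessions look unrelated; grouped, three of them point at the same retriever problem. A production system would likely compare meaning rather than raw word overlap, but the grouping step has the same shape.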
Integration Options
Three ways to integrate:
- Kelet Skill for coding agents: Scans your codebase, discovers where signals should be collected, and sets everything up automatically
- Python SDK: pip install kelet
- TypeScript SDK: npm install kelet
Manual setup requires adding two lines to your agent code. Kelet is fully OpenTelemetry-compliant, so any OTEL-instrumented agent works out of the box.
Supported Frameworks and Platforms
Works with: OpenTelemetry, Langfuse, Mixpanel, OpenAI, Anthropic, LangChain, pydantic AI SDK, CrewAI, Strands, Agno, Mastra, PostHog, LangGraph, AutoGen, LlamaIndex, Haystack, Semantic Kernel, and Gemini APIs.
Works with any agent or LLM application where you own the code: agentic loops, multi-step workflows, RAG pipelines, chatbots, autonomous agents.
Two situations where Kelet isn't the right fit:
- If you are a developer using AI tools built by others (Cursor, Claude Code, Copilot)
- If you're building a skill or plugin inside an existing agentic platform
Technical Details
- Runs on Kelet's servers (SOC 2 certified)
- Continuously ingests traces 24/7
- LLM tokens for analysis are covered by Kelet (the analysis doesn't touch your model API bill)
- Pricing based on usage (see kelet.ai/pricing)
- Currently free during beta (no credit card required)
Performance Metrics
From pilot cohort data:
- 73% of teams had failures nobody noticed (Kelet found them)
- 14.3 minutes median time from trace ingestion to prompt patch
- 33K+ sessions analyzed across design partner deployments
📖 Read the full source: HN AI Agents