Agent-Xray: Open-source tool for debugging AI agent failures from trace logs

✍️ OpenClawRadar📅 Published: April 15, 2026🔗 Source

Agent-Xray is an open-source tool for debugging AI agents by analyzing their trace logs. It was created to solve the problem of agents failing tasks without clear errors—situations where code runs fine but the agent makes wrong decisions, like repeatedly calling the wrong tool despite error messages suggesting the correct one.

Key Features

The tool reads trace logs and provides structural grading and root-cause classification for agent failures. It reconstructs what the agent was seeing at each step to help understand why bad decisions were made.

Failure Categories

spin
tool_bug
early_abort

Enforcement Mode

The most significant feature according to the creator is enforcement mode. After fixing an agent bug, this mode runs adversarial challenges against your fixes to verify they're legitimate. It checks for:

Hardcoded returns
Weakened assertions

This addresses the problem where fixes might work on specific test tasks but are actually fragile, or where agents learn to game the test.

Workflow Integration

The tool runs as MCP tools, allowing Claude Code to use it directly. A typical workflow described in the source:

Tell Claude Code to triage agent traces
It finds the worst failure
Replays what the agent saw
Suggests a fix
Enforcement mode verifies the fix is legitimate

The creator describes this as "agents debugging agents."

Technical Details

Installation: pip install agent-xray
Quickstart: agent-xray quickstart (includes sample traces to test without your own data)
License: MIT
Zero dependencies
Runs offline
Works with OpenAI, Anthropic, LangChain, CrewAI, OpenTelemetry traces
Project age: About 9 days old at time of posting

Use Case

This tool is for developers working with AI agents who need to debug failures that don't produce traditional errors or stack traces—situations where agents make incorrect decisions despite having access to correct tools and information.

📖 Read the full source: r/ClaudeAI

👀 See Also

Tools

Curated List of 260+ AI Agent Tools with Claude Ecosystem Highlights

A GitHub repository contains a curated list of 260+ AI agent tools, including specific Claude-related entries like Claude Code (80.9% SWE-bench), Claude Computer Use, and Claude in Chrome, plus tools that work well with Claude such as Cline and Cursor.

Mar 7, 2026, 08:45 PM UTC

OpenClawRadar

Tools

Open Source Grafana Dashboard Tracks Claude Code Costs and Usage via OpenTelemetry

An SRE built a free Grafana dashboard to visualize Claude Code spend, token usage, cache hit ratios, and edit decisions by pulling OpenTelemetry metrics into Prometheus-compatible backends.

May 16, 2026, 04:15 PM UTC

OpenClawRadar

Tools

Comparison of 14 Claw AI Agent Variants Across 10 Categories

A detailed comparison of 14 popular Claw AI agent variants including OpenClaw, NanoClaw, NemoClaw, ZeroClaw, PicoClaw, Moltis, IronClaw, and NullClaw, scored across 53 sub-parameters with composite rankings and ideal use cases for each.

Mar 31, 2026, 02:45 AM UTC

OpenClawRadar

Tools

Claude Code Plan Mode Reduces Redo Rate from 40% to Near Zero

A developer tracked 30+ coding sessions with Claude Code and found that skipping Plan Mode resulted in redoing tasks from scratch 40% of the time. With Plan Mode, the redo rate dropped to basically zero, with one feature taking 17 minutes total versus 35+ minutes without planning.

Feb 26, 2026, 05:45 AM UTC

OpenClawRadar