Culpa: Open Source Deterministic Replay Engine for AI Agent Debugging

Culpa is an open source deterministic replay engine for debugging AI agent sessions. It targets a core problem with LLM agents: their output is nondeterministic, so when a session fails, simply re-running it does not reliably reproduce the exact failure.
How It Works
The tool records every LLM call, along with the full execution context, during an agent session. When you need to debug a failure, it replays the session using the recorded responses as stubs instead of making new API calls. Because no request reaches the real APIs, the replay is deterministic and costs nothing.
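The record-then-replay idea can be illustrated with a minimal sketch. This is not Culpa's actual API: `Recorder`, `Replayer`, `request_hash`, `flaky_llm`, and `run_agent` are hypothetical names invented for illustration, and the "LLM" is a stand-in that returns random text.

```python
import hashlib
import json
import random


def request_hash(request: dict) -> str:
    """Stable fingerprint of a request, used to detect divergence on replay."""
    canonical = json.dumps(request, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Recorder:
    """Hypothetical wrapper: forwards each call and logs the response in order."""

    def __init__(self, llm_call):
        self._llm_call = llm_call
        self.log = []

    def __call__(self, request: dict) -> str:
        response = self._llm_call(request)
        self.log.append({"hash": request_hash(request), "response": response})
        return response


class Replayer:
    """Hypothetical stub: serves recorded responses and never calls a real API."""

    def __init__(self, log):
        self._log = list(log)
        self._cursor = 0

    def __call__(self, request: dict) -> str:
        if self._cursor >= len(self._log):
            raise RuntimeError("replay ran past the end of the recording")
        entry = self._log[self._cursor]
        if entry["hash"] != request_hash(request):
            raise RuntimeError(f"request {self._cursor} diverged from the recording")
        self._cursor += 1
        return entry["response"]


def flaky_llm(request: dict) -> str:
    """Stand-in for a nondeterministic model: a different answer on every call."""
    return f"{request['prompt']} [{random.randint(0, 10**9)}]"


def run_agent(llm, task: str) -> str:
    """Toy two-step agent: plan, then act on the plan."""
    plan = llm({"step": "plan", "prompt": task})
    return llm({"step": "act", "prompt": plan})


recorder = Recorder(flaky_llm)
recorded_result = run_agent(recorder, "fix the failing test")

# Replaying against the log reproduces the session exactly, with no live calls.
replayed_result = run_agent(Replayer(recorder.log), "fix the failing test")
assert replayed_result == recorded_result
```

Hashing each request is one possible design choice, assumed here: it lets the replay fail loudly if the agent code starts asking different questions than it did during recording, rather than silently serving mismatched responses.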
Key Features
- Proxy Mode: Works with tools like Claude Code and Cursor without requiring any code changes
- Python SDK: Available for developers building their own agents
- API Support: Compatible with Anthropic and OpenAI APIs
- Forking Capability: You can fork at any recorded decision point, inject a different response, and see what would have happened
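Forking at a decision point can be sketched in a few lines. The code below is an assumption about how such a feature might behave, not Culpa's implementation: `fork`, `run_agent`, and `live_llm` are hypothetical names, and the sketch assumes that calls after the fork go to a live model because the downstream context has changed.

```python
from typing import Callable, List

LLMCall = Callable[[str], str]


def fork(recorded: List[str], fork_at: int, injected: str, live_llm: LLMCall) -> LLMCall:
    """Build an LLM stub that replays up to fork_at, injects one response, then goes live."""
    if not 0 <= fork_at < len(recorded):
        raise IndexError("fork_at must point at a recorded call")
    step = 0

    def llm(prompt: str) -> str:
        nonlocal step
        index = step
        step += 1
        if index < fork_at:
            return recorded[index]   # before the fork: recorded response
        if index == fork_at:
            return injected          # at the fork: the alternative response
        return live_llm(prompt)      # after the fork: history has diverged

    return llm


def run_agent(llm: LLMCall) -> str:
    """Toy three-step agent whose later prompts depend on earlier answers."""
    tool = llm("Which tool should I use?")
    args = llm(f"Arguments for {tool}?")
    return llm(f"Summarize the result of {tool}({args})")


# Responses captured from an earlier (hypothetical) session.
recorded = ["grep", "pattern='TODO'", "Found 3 TODOs"]

live_calls: List[str] = []


def live_llm(prompt: str) -> str:
    """Stand-in for a real API call; records which prompts reached it."""
    live_calls.append(prompt)
    return f"<live answer to: {prompt}>"


# Fork at the second call: keep the tool choice, change its arguments.
forked_result = run_agent(fork(recorded, 1, "pattern='FIXME'", live_llm))
print(forked_result)
```

In this sketch only the calls after the fork point would cost anything, since everything before it is served from the recording.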
Practical Benefits
Because replays use recorded responses rather than live API calls, debugging sessions incur no API costs. Deterministic replay also makes it possible to reliably reproduce and analyze failures that the randomness of LLM responses would otherwise make hard to recreate.
The project is actively seeking feedback, particularly from developers building agent workflows. The creator, who notes they are a CS freshman, is looking to improve the tool.
📖 Read the full source: r/LocalLLaMA