Definable AI adds self-hosted observability dashboard with single flag

Built-in observability for AI agents
Definable AI, an open-source Python framework for building AI agents on top of FastAPI, has added a self-hosted observability dashboard that requires minimal setup. Where other frameworks often treat observability as an afterthought that depends on external services such as LangSmith or Arize, Definable builds it directly into the execution pipeline.
One-flag setup
To enable the dashboard, add a single parameter when creating your agent:
    from definable.agent import Agent

    agent = Agent(
        model="openai/gpt-4o",
        tools=[get_weather, calculate],  # tool functions defined elsewhere
        observability=True,  # <- this line
    )
    agent.serve(enable_server=True, port=8002)
    # Dashboard live at http://localhost:8002/obs/
The setup requires no API keys, cloud accounts, or separate infrastructure such as a Docker Compose metrics stack. The dashboard is served alongside your agent as a self-contained component.
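The snippet above passes get_weather and calculate without defining them. As a minimal sketch, assuming the framework accepts plain typed Python callables as tools (the source does not confirm this), hypothetical stand-ins might look like the following. The function bodies and the canned weather data are purely illustrative.

```python
# Hypothetical stand-ins for the tools named in the setup snippet.
# Assumption: plain Python functions with type hints can be passed as tools.


def get_weather(city: str) -> str:
    """Return a canned weather report for a city (no network call)."""
    fake_reports = {"Paris": "18°C, cloudy", "Tokyo": "24°C, clear"}
    return fake_reports.get(city, "no data available")


def calculate(a: float, b: float, op: str) -> float:
    """Apply a basic arithmetic operator to two numbers."""
    operations = {
        "+": lambda x, y: x + y,
        "-": lambda x, y: x - y,
        "*": lambda x, y: x * y,
        "/": lambda x, y: x / y,
    }
    if op not in operations:
        raise ValueError(f"unsupported operator: {op}")
    return operations[op](a, b)
```

Keeping tools small and side-effect free like this makes their calls easy to recognize in any event stream or per-tool report.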
Dashboard features
- Live event stream: Real-time streaming over Server-Sent Events (SSE) of every model call, tool execution, knowledge retrieval, and memory recall, spanning 60+ event types
- Token & cost accounting: Per-run and aggregate tracking to see exactly where your budget is going
- Latency percentiles: p50, p95, p99 metrics across all runs to spot regressions instantly
- Per-tool analytics: Which tools get called most frequently, which ones error, and average execution times
- Run replay: Click into any historical run and step through it turn-by-turn
- Run comparison: Side-by-side diff of two runs to see changed prompts or different tool calls immediately
- Timeline charts: Token consumption, costs, and error rates over time with 5-minute, 30-minute, hourly, and daily buckets
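To make the metrics in this list concrete, here is a rough sketch of how latency percentiles, per-run cost, and time-bucketed timelines can be computed from run records. It is not Definable's implementation: the RunRecord fields, the helper names, and the per-1k-token prices are all assumptions chosen for illustration, and the percentile uses the simple nearest-rank method.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class RunRecord:
    """Hypothetical per-run record; field names are illustrative only."""
    started_at: datetime  # must be timezone-aware
    latency_ms: float
    input_tokens: int
    output_tokens: int
    failed: bool = False


def percentile(values, pct):
    """Nearest-rank percentile: the value at rank ceil(pct% of n) in sorted order."""
    if not values:
        raise ValueError("percentile of empty sequence")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def run_cost(run, usd_per_1k_input, usd_per_1k_output):
    """Cost of one run given per-1k-token prices (prices are caller-supplied)."""
    return (run.input_tokens / 1000) * usd_per_1k_input + (
        run.output_tokens / 1000
    ) * usd_per_1k_output


def bucket_start(ts, bucket_seconds):
    """Floor a timestamp to the start of its bucket (e.g. 300 s = 5 minutes)."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % bucket_seconds, tz=timezone.utc)


def timeline(runs, bucket_seconds):
    """Aggregate run count, total tokens, and error count per time bucket."""
    buckets = defaultdict(lambda: {"runs": 0, "tokens": 0, "errors": 0})
    for run in runs:
        slot = buckets[bucket_start(run.started_at, bucket_seconds)]
        slot["runs"] += 1
        slot["tokens"] += run.input_tokens + run.output_tokens
        slot["errors"] += int(run.failed)
    return dict(sorted(buckets.items()))
```

Calling timeline(runs, 300) groups runs into 5-minute buckets, while 1800, 3600, and 86400 give the 30-minute, hourly, and daily views; percentile(latencies, 95) gives a p95 figure over whatever set of runs is passed in.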
Architecture approach
The observability system differs from alternatives like LangSmith or Phoenix in several ways:
- Self-hosted: Your data never leaves your machine, and there is no vendor lock-in
- Zero-config: No separate infrastructure or collector processes required
- Built into the pipeline: Events are emitted from inside the 8-phase execution pipeline rather than bolted on through monkey-patching or OpenTelemetry (OTEL) instrumentation
- Protocol-based: Write a 3-method class to export to any backend without installing SDKs
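The "3-method class" idea in the last bullet can be sketched with Python's structural typing. The source does not name the three methods, so export, flush, and shutdown below are assumptions, as are the EventExporter name and the event dictionary shape; consult the project itself for the real interface.

```python
import io
import json
from typing import Any, Protocol, TextIO, runtime_checkable


@runtime_checkable
class EventExporter(Protocol):
    """Hypothetical 3-method exporter interface (method names are assumptions)."""

    def export(self, event: dict[str, Any]) -> None: ...
    def flush(self) -> None: ...
    def shutdown(self) -> None: ...


class JsonLinesExporter:
    """Buffers events and writes them as JSON Lines to any text stream."""

    def __init__(self, stream: TextIO) -> None:
        self._stream = stream
        self._buffer: list[dict[str, Any]] = []
        self._closed = False

    def export(self, event: dict[str, Any]) -> None:
        if self._closed:
            raise RuntimeError("exporter is shut down")
        self._buffer.append(event)

    def flush(self) -> None:
        # Write one JSON object per line, then empty the buffer.
        for event in self._buffer:
            self._stream.write(json.dumps(event) + "\n")
        self._buffer.clear()

    def shutdown(self) -> None:
        # Drain remaining events before refusing new ones.
        self.flush()
        self._closed = True


# Usage: no inheritance needed; matching the method names is enough.
sink = io.StringIO()
exporter = JsonLinesExporter(sink)
exporter.export({"type": "tool_call", "tool": "get_weather"})  # illustrative event
exporter.shutdown()
```

Because the interface is a Protocol, a backend-specific exporter only needs to provide the same three methods, which is one way to support arbitrary backends without a vendor SDK.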
The maintainer notes this isn't intended to replace full-blown APM systems with enterprise features like RBAC and retention policies. It's designed for developers building agents who want to see what's happening during development.
The project is currently in early stages with the maintainer seeking additional contributors. The framework is available at https://github.com/definableai/definable.ai.
📖 Read the full source: r/LocalLLaMA