Self-Hosted AI Agent Dashboard: Enable With One Flag

Built-in observability for AI agents

Definable AI, an open-source Python framework built on FastAPI for building AI agents, has added a self-hosted observability dashboard that requires minimal setup. Where many frameworks leave observability to external services such as LangSmith or Arize, Definable builds it directly into the agent's execution pipeline.

One-flag setup

To enable the dashboard, add a single parameter when creating your agent:

from definable.agent import Agent

agent = Agent(
    model="openai/gpt-4o",
    tools=[get_weather, calculate],  # tool functions assumed to be defined elsewhere
    observability=True,  # <- this line
)
agent.serve(enable_server=True, port=8002)
# Dashboard live at http://localhost:8002/obs/

The setup requires no API keys, cloud accounts, or separate infrastructure such as a Docker Compose metrics stack. The dashboard is served alongside your agent as a self-contained component.
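
The setup snippet passes get_weather and calculate to the agent without defining them. The following is a minimal sketch of what such tools might look like, assuming the framework accepts plain typed Python functions as tools; the original post does not show how tools are declared, so check the project's documentation. Both function bodies and the canned weather data are hypothetical.

```python
import ast
import operator

# Hypothetical canned data; a real tool would call a weather API.
_FAKE_WEATHER = {"paris": "18°C, cloudy", "tokyo": "24°C, clear"}


def get_weather(city: str) -> str:
    """Return a short weather summary for a city (stubbed, no network call)."""
    summary = _FAKE_WEATHER.get(city.strip().lower(), "no data")
    return f"{city}: {summary}"


# Only these arithmetic operators are allowed; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def calculate(expression: str) -> float:
    """Evaluate a basic arithmetic expression without using eval()."""

    def _eval(node: ast.AST) -> float:
        # Plain numbers (bool is excluded even though it subclasses int).
        if isinstance(node, ast.Constant):
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
        # Binary operations such as 2 + 3 or 4 / 2.
        elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        # Unary plus and minus.
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    tree = ast.parse(expression, mode="eval")
    return float(_eval(tree.body))
```

Walking the parsed syntax tree instead of calling eval() keeps a model-supplied expression from running arbitrary code, which matters when the caller is an LLM rather than a person.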

Dashboard features

Live event stream: SSE-powered real-time streaming of every model call, tool execution, knowledge retrieval, and memory recall across 60+ event types
Token & cost accounting: Per-run and aggregate tracking to see exactly where your budget is going
Latency percentiles: p50, p95, p99 metrics across all runs to spot regressions instantly
Per-tool analytics: Which tools get called most frequently, which ones error, and average execution times
Run replay: Click into any historical run and step through it turn-by-turn
Run comparison: Side-by-side diff of two runs to see changed prompts or different tool calls immediately
Timeline charts: Token consumption, costs, and error rates over time with 5-minute, 30-minute, hourly, and daily buckets
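
To make the metrics above concrete, here is a rough sketch of how latency percentiles, per-tool analytics, and time-bucketed token totals could be computed from recorded events. It is not Definable's implementation: the ToolEvent record, its field names, and the nearest-rank percentile method are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ToolEvent:
    """Hypothetical event record; field names are illustrative only."""
    tool: str
    timestamp: float    # seconds since the epoch
    duration_ms: float
    tokens: int
    error: bool = False


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value covering pct% of samples."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def per_tool_stats(events):
    """Call count, error count, and mean duration for each tool."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event.tool].append(event)
    return {
        tool: {
            "calls": len(items),
            "errors": sum(1 for e in items if e.error),
            "avg_ms": sum(e.duration_ms for e in items) / len(items),
        }
        for tool, items in grouped.items()
    }


def tokens_per_bucket(events, bucket_seconds=300):
    """Sum tokens into fixed-width time buckets (default: 5 minutes)."""
    totals = defaultdict(int)
    for event in events:
        bucket_start = int(event.timestamp // bucket_seconds) * bucket_seconds
        totals[bucket_start] += event.tokens
    return dict(totals)


if __name__ == "__main__":
    sample = [
        ToolEvent("get_weather", 0, 100, 10),
        ToolEvent("get_weather", 100, 300, 20, error=True),
        ToolEvent("calculate", 400, 50, 5),
    ]
    durations = [e.duration_ms for e in sample]
    print("p50:", percentile(durations, 50), "p99:", percentile(durations, 99))
    print(per_tool_stats(sample))
    print(tokens_per_bucket(sample))
```

Changing bucket_seconds to 1800, 3600, or 86400 gives the 30-minute, hourly, and daily views. With few samples the high percentiles collapse onto the slowest run, so p95 and p99 only become informative once enough runs have accumulated.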

Architecture approach

The observability system differs from alternatives like LangSmith or Phoenix in several ways:

Self-hosted: Your data never leaves your machine, and there is no vendor lock-in
Zero-config: No separate infrastructure or collector processes required
Built into the pipeline: Events are emitted from inside the 8-phase execution pipeline rather than bolted on afterward through monkey-patching or OpenTelemetry (OTEL) instrumentation
Protocol-based: Write a 3-method class to export to any backend without installing SDKs
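
The protocol-based export idea can be sketched with Python's typing.Protocol. The post only says the exporter is a 3-method class; the method names used here (export, flush, shutdown), the EventExporter name, and the event dictionary shape are assumptions, not Definable's actual interface.

```python
import io
import json
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class EventExporter(Protocol):
    """Hypothetical 3-method exporter contract (method names are assumed)."""

    def export(self, event: dict[str, Any]) -> None: ...
    def flush(self) -> None: ...
    def shutdown(self) -> None: ...


class JsonLinesExporter:
    """Buffers events and writes them as one JSON object per line.

    It satisfies EventExporter structurally: no base class or SDK import
    is needed, only the three matching methods.
    """

    def __init__(self, stream) -> None:
        self._stream = stream
        self._buffer: list[dict[str, Any]] = []
        self._closed = False

    def export(self, event: dict[str, Any]) -> None:
        if self._closed:
            raise RuntimeError("exporter is shut down")
        self._buffer.append(event)

    def flush(self) -> None:
        # Write out everything buffered so far, then clear the buffer.
        for event in self._buffer:
            self._stream.write(json.dumps(event) + "\n")
        self._buffer.clear()

    def shutdown(self) -> None:
        # Flush any remaining events and refuse further exports.
        self.flush()
        self._closed = True


if __name__ == "__main__":
    out = io.StringIO()
    exporter = JsonLinesExporter(out)
    exporter.export({"type": "tool_call", "tool": "get_weather"})
    exporter.shutdown()
    print(out.getvalue(), end="")
```

Because the contract is structural, swapping the StringIO for an open file, a socket wrapper, or an HTTP client changes only the constructor argument, which is the property that lets a backend be added without installing its SDK.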

The maintainer notes this isn't intended to replace full-blown APM systems with enterprise features like RBAC and retention policies. It's designed for developers building agents who want to see what's happening during development.

The project is at an early stage, and the maintainer is seeking additional contributors. The framework is available at https://github.com/definableai/definable.ai.

📖 Read the full source: r/LocalLLaMA