LLMSpend: Open-source cost tracker for Anthropic and OpenAI SDKs

✍️ OpenClawRadar | 📅 Published: March 12, 2026

What LLMSpend does

LLMSpend is a Python package that tracks the cost of API calls made through the Anthropic and OpenAI SDKs. It was created because the Anthropic dashboard shows only total spend and does not break it down by feature. The tool records tokens, cost, and latency for each call and groups the data by feature, model, user, or project.

How to use it

Install with pip install llmspend. Integration requires two lines of code:

import anthropic  # the SDK import your app already has
from llmspend import monitor
client = monitor.wrap(anthropic.Anthropic(), project="my-app")

Then add an llmspend parameter to track specific features:

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1000,
    messages=[{"role": "user", "content": query}],
    llmspend={"feature": "chatbot"}
)
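The SDKs' own create methods do not define an llmspend argument, so a wrapper presumably has to strip it before forwarding the call. The following is a minimal sketch of that pattern, not LLMSpend's actual code: wrap_create, sink, and fake_create are hypothetical names, and the fake response only mimics the usage.input_tokens and usage.output_tokens fields of an Anthropic response.

```python
import time
from types import SimpleNamespace


def wrap_create(create_fn, project, sink):
    """Return a replacement for create_fn that records usage metadata.

    Hypothetical helper illustrating the pattern; not LLMSpend's real code.
    """
    def wrapped(*args, **kwargs):
        # Remove the tracking tags so the underlying SDK never sees them.
        tags = kwargs.pop("llmspend", None) or {}
        start = time.perf_counter()
        response = create_fn(*args, **kwargs)
        latency_ms = (time.perf_counter() - start) * 1000
        usage = getattr(response, "usage", None)
        # Only metadata is kept; the prompt and response text are not recorded.
        sink.append({
            "project": project,
            "feature": tags.get("feature"),
            "model": kwargs.get("model"),
            "input_tokens": getattr(usage, "input_tokens", 0),
            "output_tokens": getattr(usage, "output_tokens", 0),
            "latency_ms": latency_ms,
        })
        return response
    return wrapped


def fake_create(**kwargs):
    """Stand-in for an SDK method so the sketch runs without network access."""
    assert "llmspend" not in kwargs
    return SimpleNamespace(usage=SimpleNamespace(input_tokens=12, output_tokens=34))


records = []
create = wrap_create(fake_create, project="my-app", sink=records)
create(
    model="claude-sonnet-4-6",
    max_tokens=1000,
    messages=[],
    llmspend={"feature": "chatbot"},
)
```

After the call, records holds one entry tagged with the chatbot feature, and the stand-in SDK method never received the llmspend key.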

Reporting and dashboard

From the terminal, run llmspend stats --last 7d --by feature to get output like:

Total: $4.2100 across 847 calls
chatbot      512  $2.8900  1180ms
summarizer   335  $1.3200   640ms
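A report like this can be produced with a single GROUP BY query over a table of recorded calls. The sketch below uses an in-memory SQLite database with a hypothetical calls table. The column names are assumptions, the time-window filter is left out, and the latency column is assumed to be an average, which the source does not confirm.

```python
import sqlite3

# Hypothetical schema; LLMSpend's real table layout is not documented here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (feature TEXT, cost_usd REAL, latency_ms REAL)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?)",
    [
        ("chatbot", 0.006, 1200.0),
        ("chatbot", 0.004, 1100.0),
        ("summarizer", 0.003, 640.0),
    ],
)


def stats_by_feature(conn):
    """Return (feature, call count, total cost, mean latency), costliest first."""
    return conn.execute(
        "SELECT feature, COUNT(*), SUM(cost_usd), AVG(latency_ms) "
        "FROM calls GROUP BY feature ORDER BY SUM(cost_usd) DESC"
    ).fetchall()


rows = stats_by_feature(conn)
total_calls = sum(r[1] for r in rows)
total_cost = sum(r[2] for r in rows)

print(f"Total: ${total_cost:.4f} across {total_calls} calls")
for feature, calls, cost, latency in rows:
    print(f"{feature:<12}{calls:>5}  ${cost:.4f}  {latency:.0f}ms")
```

Grouping by model, user, or project would only change the column named in the SELECT and GROUP BY clauses.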

Run llmspend dashboard to open a local web dashboard at localhost:8888.

Technical details

Local SQLite storage — no account needed, no data leaves your machine
Works with both Anthropic and OpenAI SDKs
Zero dependencies (pure Python standard library)
Never stores prompts or responses — only tracks cost metrics
No prompt logging, tracing, or evaluations — focused solely on cost tracking
MIT licensed, open source on GitHub

The tool was built entirely with Claude Code in a single session, with Claude writing the monkey-patching logic, pricing engine, CLI, and web dashboard.
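The source does not describe how the pricing engine works. As a rough sketch, per-call cost can be derived from token counts and a per-model rate table. Everything below is illustrative: PRICES, call_cost, and example-model are hypothetical names, and the rates are placeholders rather than real provider pricing.

```python
from decimal import Decimal

# Placeholder rates in USD per million tokens; NOT real provider pricing.
PRICES = {
    "example-model": {"input": Decimal("3.00"), "output": Decimal("15.00")},
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def call_cost(model, input_tokens, output_tokens):
    """Return the cost in USD of one call, given its token counts."""
    rates = PRICES[model]
    total = rates["input"] * input_tokens + rates["output"] * output_tokens
    return total / TOKENS_PER_UNIT


cost = call_cost("example-model", input_tokens=1200, output_tokens=300)
print(f"${cost:.4f}")
```

Decimal is used here so that summing many small per-call amounts does not accumulate binary floating-point error.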

📖 Read the full source: r/ClaudeAI
