User-built PTC for Claude Code shows 40-65% token savings on analysis tasks, not code writing

A developer has built a local Programmatic Tool Calling (PTC) implementation for Claude Code and analyzed 79 real usage sessions to measure actual benefits. PTC differs from normal tool calling by having the agent write code that runs in an isolated environment, with only final results entering the context window instead of every intermediate step.
What was built
The developer created Thalamus, a local MCP server that provides PTC-like capability to Claude Code. It includes four tools: execute() (runs Python with primitives), search, remember, and context. The implementation has 143 tests, uses Python stdlib only, and runs fully locally. The developer emphasizes this is their own implementation, not Anthropic's official PTC.
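To make the PTC idea concrete, here is a minimal sketch of the pattern: agent-written code runs in a separate process and only its final printed output is returned. This is not Thalamus's implementation; `run_isolated`, `demo`, the character cap, and the sample files are hypothetical names and values chosen for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_isolated(code: str, timeout: float = 10.0, max_chars: int = 2000) -> str:
    """Run `code` in a separate Python process and return only its output.

    Hypothetical helper: intermediate file contents stay inside the child
    process; the caller sees just what the script chose to print.
    """
    proc = subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    output = proc.stdout if proc.returncode == 0 else proc.stderr
    return output[:max_chars]  # cap what would enter the context window


def demo() -> tuple[int, str]:
    """Compare raw file size against the summary a script returns."""
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(5):
            Path(tmp, f"file_{i}.txt").write_text("TODO: fix\n" * 100 + "done\n")
        # The "agent-written" script: reads every file, prints one line.
        script = (
            "from pathlib import Path\n"
            f"files = sorted(Path({tmp!r}).glob('*.txt'))\n"
            "todo = sum(p.read_text().count('TODO') for p in files)\n"
            "print(f'{len(files)} files, {todo} TODO lines')\n"
        )
        raw_chars = sum(len(p.read_text()) for p in Path(tmp).glob("*.txt"))
        return raw_chars, run_isolated(script).strip()
```

Calling `demo()` returns the total size of the five files alongside the one-line summary, which shows why the approach helps most when a task reads a lot but needs to report little.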
Measured results from 79 sessions
- Token footprint per call: execute() averaged ~2,600 characters vs. Read averaging ~4,400 characters
- JSONL size reduction: sessions using PTC showed a 15.6% reduction
- Savings on analysis/research tasks: 40-65%
- Savings on code-writing tasks: ~0%
The developer notes these real-world numbers are "far from 98%", the savings Anthropic and Cloudflare report for optimal scenarios.
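As a quick check on the per-call figures above, the relative reduction can be worked out directly. This is only arithmetic on the two reported averages; the helper name `relative_reduction` is made up for the example, and the result describes per-call footprint, not whole-session savings.

```python
# Reported averages from the 79-session analysis (characters per call).
EXECUTE_AVG_CHARS = 2_600
READ_AVG_CHARS = 4_400


def relative_reduction(baseline: float, observed: float) -> float:
    """Fraction by which `observed` is smaller than `baseline`."""
    return (baseline - observed) / baseline


per_call = relative_reduction(READ_AVG_CHARS, EXECUTE_AVG_CHARS)
print(f"{per_call:.1%}")  # 40.9%
```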
How the agent actually uses execute()
Content analysis of 112 execute() calls revealed:
- 64% used standard Python (os.walk, open, sqlite3, subprocess) rather than the PTC primitives
- 30% used a single primitive (one fs.read or fs.grep)
- 5% did true batching (2+ primitives combined)
The "replace 5 Reads with 1 execute" pattern occurred in only 5% of actual usage. The agent mostly used execute() as a general-purpose compute environment for accessing files outside the project, running aggregations, and querying databases.
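The three usage categories can be approximated by counting primitive calls in each snippet. The sketch below is one plausible way to do such a content analysis, not the developer's actual method; it assumes primitives are invoked as `fs.<name>(...)`, and the sample snippets (including their argument shapes) are invented and only scanned as text, never executed.

```python
import re
from collections import Counter

# Assumption: PTC primitives appear as calls on an `fs` object, e.g. fs.read(...).
PRIMITIVE_CALL = re.compile(r"\bfs\.\w+\s*\(")


def classify(code: str) -> str:
    """Bucket an execute() snippet by how many primitive calls it contains."""
    n = len(PRIMITIVE_CALL.findall(code))
    if n == 0:
        return "standard_python"
    if n == 1:
        return "single_primitive"
    return "batching"  # 2+ primitives combined in one call


# Invented examples, one per category.
samples = [
    "import os\nfor root, _, files in os.walk('.'):\n    print(root, len(files))",
    "print(fs.read('README.md')[:200])",
    "a = fs.read('a.py')\nb = fs.grep('TODO', 'src')\nprint(len(a), len(b))",
]

counts = Counter(classify(s) for s in samples)
print(counts)
```

A regex count like this is crude (it would miss primitives called through an alias), so treat it as a starting point for a rough breakdown rather than an exact measurement.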
Adoption patterns
Initial measurements showed that only 25% of sessions used PTC, with the agent defaulting to Read/Grep/Glob. After the developer added a ~1,100-token operational manual to CLAUDE.md, adoption rose to 42.9%. Sessions focused on writing code (Edit + Bash dominant) showed zero PTC usage.
The developer concludes PTC shines in analysis, debugging, and cross-file research tasks, but not in edit-heavy development workflows.
📖 Read the full source: r/ClaudeAI