GrapeRoot: Open-source tool reduces Claude Code token usage by 40-80%

GrapeRoot is a free, open-source tool that reduces Claude Code token usage by 40-80% by acting as a local MCP server between your codebase and the AI model. Instead of sending full files repeatedly, it builds a structured understanding of your repository and tracks what the model has already seen during each session.
How it works
The tool builds a graph of your codebase (files, functions, dependencies) and tracks what the AI has already read or edited during the session. It then sends only deltas and relevant context instead of entire files, which avoids reloading the same context on every turn and, according to the developers, makes LLM responses more consistent across turns.
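To make the idea concrete, here is a minimal Python sketch of the two mechanisms described above: extracting dependency edges for a code graph, and sending only what changed since the model last saw a file. This is not GrapeRoot's actual implementation. `SessionContext`, `payload_for`, and `imported_modules` are hypothetical names invented for illustration, and the sketch assumes Python source files and a plain unified diff as the delta format.

```python
import ast
import difflib


def imported_modules(source: str) -> set[str]:
    """Return the module names a Python source string imports (graph edges)."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return modules


class SessionContext:
    """Hypothetical tracker of what the model has already seen in one session."""

    def __init__(self) -> None:
        self._seen: dict[str, str] = {}  # path -> text last sent to the model

    def payload_for(self, path: str, text: str) -> str:
        """Return the full file on first read, a diff on change, else nothing."""
        previous = self._seen.get(path)
        self._seen[path] = text
        if previous is None:
            return text  # first time: the model needs the whole file
        if previous == text:
            return ""  # unchanged: nothing to resend
        diff = difflib.unified_diff(
            previous.splitlines(), text.splitlines(),
            fromfile=path, tofile=path, lineterm="", n=1,
        )
        return "\n".join(diff)
```

In this sketch, a second request for an unchanged file costs no tokens at all, and an edited file costs only the size of its diff. A real tool would also need to decide which neighbouring files in the graph count as relevant context, which this example does not attempt.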
Performance results
- 40-80% token reduction depending on workflow
- Refactoring workflows show the biggest savings
- Greenfield development shows smaller gains
- 500+ users with ~200 daily active users
- ~4.5/5★ average rating
The developers report that pushing for 80-90% reduction caused output quality to drop. The sweet spot is around 40-60% reduction, where outputs improve rather than degrade. Benchmarks showing the quality improvements are available at graperoot.dev/benchmarks.
Technical details
- Runs 100% locally
- No account or API key needed
- No data leaves your machine
Installation and resources
- Open source repository: github.com/kunal12203/Codex-CLI-Compact
- Installation steps: graperoot.dev/#install
- Discord for debugging/feedback: discord.gg/YwKdQATY2d
According to the source, this approach means early-stage developers can work at almost no cost, and serious builders no longer need $200/month subscriptions: a basic subscription combined with better context handling is sufficient.
📖 Read the full source: r/ClaudeAI