Token Reducer: A Claude Code Plugin for Intelligent Context Compression

Token Reducer is a Claude Code plugin aimed at cutting token consumption when working with medium-to-large repositories. It processes repository context locally before anything is sent to Claude, which the developer says shrinks the context substantially without dropping code relevant to the task.
How It Works
The plugin uses several techniques to intelligently compress context:
- AST-based chunking — Parses code into meaningful units (functions, classes, blocks) instead of naive text splitting
- Hybrid retrieval — Combines BM25 (keyword matching) with vector similarity to find the most relevant chunks
- TextRank compression — Applies extractive summarization to keep important parts and drop noise
- Import graph mapping — Traces dependencies so related code stays together
- 2-hop symbol expansion — When working on function A that calls function B, it automatically pulls in B's context
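To make the first and last items in this list more concrete, here is a minimal Python sketch of AST-based chunking followed by 2-hop symbol expansion. It is not taken from the plugin's source. The function names (`chunk_functions`, `call_graph`, `expand`) and the sample module are hypothetical, and it assumes that "2-hop" means following direct calls up to two levels deep within a single file.

```python
import ast
from collections import deque

# Hypothetical sample module: report -> parse -> load, plus one unrelated function.
SOURCE = '''
def load(path):
    return open(path).read()

def parse(path):
    return load(path).splitlines()

def report(path):
    return len(parse(path))

def unrelated():
    return 42
'''

def chunk_functions(source):
    """Split source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    chunks = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks[node.name] = ast.get_source_segment(source, node)
    return chunks

def call_graph(source):
    """Map each top-level function to the plain names it calls directly."""
    tree = ast.parse(source)
    graph = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            graph[node.name] = {
                n.func.id
                for n in ast.walk(node)
                if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
            }
    return graph

def expand(symbol, graph, max_hops=2):
    """Breadth-first expansion: the symbol plus callees within max_hops calls."""
    seen = {symbol}
    queue = deque([(symbol, 0)])
    while queue:
        name, depth = queue.popleft()
        if depth == max_hops:
            continue
        for callee in graph.get(name, ()):
            # Only follow names defined in this module (skips builtins like len).
            if callee in graph and callee not in seen:
                seen.add(callee)
                queue.append((callee, depth + 1))
    return seen

chunks = chunk_functions(SOURCE)
graph = call_graph(SOURCE)
selected = expand("report", graph)
# Keep only the selected chunks, in their original source order.
context = "\n\n".join(chunks[name] for name in chunks if name in selected)
```

Starting from `report`, the expansion reaches `parse` (one hop) and `load` (two hops), while `unrelated` is left out of the context. A real implementation would also need to resolve method calls and cross-file imports, which this sketch ignores.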
Performance and Testing
In testing across Python, TypeScript, and JavaScript repositories, the developer reports a 90-98% reduction in context size without losing code relevant to the task. The developer used Claude itself to iterate on the architecture, starting from a basic chunker and testing against real coding tasks until the compression was tight but still preserved the needed context.
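As a rough illustration of what such a figure measures, the following Python sketch runs a TextRank-style extractive pass over a handful of lines and computes the resulting reduction as 100 × (1 − tokens after / tokens before). It rests on several assumptions that do not come from the plugin: Jaccard word overlap stands in for whatever similarity measure the plugin uses, whitespace-separated words stand in for real tokens, and all function names and sample lines are hypothetical.

```python
import itertools

def similarity(a, b):
    """Word-overlap similarity between two lines (Jaccard index)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def textrank(lines, damping=0.85, iterations=50):
    """Score lines with PageRank over a similarity-weighted graph."""
    n = len(lines)
    weights = [[0.0] * n for _ in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        weights[i][j] = weights[j][i] = similarity(lines[i], lines[j])
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores

def compress(lines, keep):
    """Keep the `keep` highest-scoring lines, in their original order."""
    scores = textrank(lines)
    top = sorted(range(len(lines)), key=lambda i: scores[i], reverse=True)[:keep]
    return [lines[i] for i in sorted(top)]

def reduction_percent(original, compressed):
    """Percentage of whitespace-separated words removed (a crude token proxy)."""
    before = sum(len(line.split()) for line in original)
    after = sum(len(line.split()) for line in compressed)
    return 100.0 * (1 - after / before)

# Hypothetical input: three related lines and one that shares no words with them.
sample = [
    "alpha beta gamma",
    "beta gamma delta",
    "gamma delta epsilon",
    "zeta eta theta",
]
kept = compress(sample, keep=2)
saved = reduction_percent(sample, kept)
```

Lines that share vocabulary with many other lines score highest, so the isolated last line is the first to be dropped. Measuring the reduction this way says nothing about whether the remaining context is sufficient for the task, which is why the developer's testing against real coding tasks matters.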
Installation and Availability
Token Reducer is completely free and MIT licensed. To install:
/plugin marketplace add Madhan230205/token-reducer
The source code is available on GitHub at github.com/Madhan230205/token-reducer. The developer is seeking feedback on where compression helps workflows, cases where important context gets dropped, and which languages or repository structures need better handling.
Technical Details
The plugin runs entirely locally with no cloud APIs and no data leaving your machine. It was packaged as a Claude Code plugin after working reliably on the developer's own projects. The repository is open for contributions, with room to optimize for different languages, add smarter caching, or tune retrieval parameters.
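For readers wondering what "retrieval parameters" might look like in a hybrid BM25-plus-vector setup, here is a self-contained Python sketch. It is an assumption-laden illustration, not the plugin's code: bag-of-words cosine similarity stands in for real embedding vectors, the BM25 formula is one common variant with the usual defaults `k1=1.5` and `b=0.75`, and the blending weight `alpha`, the function names, and the sample chunks are all hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """BM25 score of each document for the query (keyword matching)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for terms in tokenized:
        tf = Counter(terms)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(terms) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def cosine_scores(query, docs):
    """Cosine similarity over word counts (a stand-in for embedding vectors)."""
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    scores = []
    for d in docs:
        c = Counter(tokenize(d))
        dot = sum(q[t] * c[t] for t in q)
        d_norm = math.sqrt(sum(v * v for v in c.values()))
        scores.append(dot / (q_norm * d_norm) if q_norm and d_norm else 0.0)
    return scores

def normalize(scores):
    """Scale scores so the best one is 1.0 (all zeros stay zero)."""
    top = max(scores)
    return [s / top if top > 0 else 0.0 for s in scores]

def hybrid_rank(query, docs, alpha=0.5):
    """Rank document indices by alpha * BM25 + (1 - alpha) * cosine."""
    kw = normalize(bm25_scores(query, docs))
    vec = normalize(cosine_scores(query, docs))
    combined = [alpha * k + (1 - alpha) * v for k, v in zip(kw, vec)]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)

# Hypothetical chunks, already reduced to space-separated identifiers.
chunks = [
    "def parse_config path read yaml config file",
    "def render_template name html template engine",
    "def load_config path parse_config defaults merge config",
]
ranking = hybrid_rank("parse config file", chunks)
```

Here `alpha`, `k1`, and `b` are the kind of knobs a contributor might tune: `alpha=1.0` gives pure keyword ranking and `alpha=0.0` pure similarity ranking. The sketch assumes a non-empty list of chunks.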
📖 Read the full source: r/ClaudeAI