Prompt-caching MCP plugin automatically reduces Claude API costs by identifying stable context

Prompt-caching is an MCP plugin that automatically reduces Claude API costs by leveraging Anthropic's caching feature. When using Claude Code or Cursor/Windsurf/Zed with the Anthropic API, each turn typically re-sends the entire context from scratch, which means thousands of tokens get billed at full rate repeatedly during long debugging sessions.
How it works
Anthropic provides a caching feature that makes repeated reads cost 0.1× instead of 1×, but this requires manually marking what gets cached. The prompt-caching plugin runs in the background, identifies stable parts of your context (system prompts, tool definitions, large file reads), and automatically marks them before each API call.
Performance results
- 20-turn bug fix: 85% cheaper
- 15-turn refactor: 80% cheaper
- 40-turn coding session: 92% cheaper
Installation
For Claude Code users:
/plugin marketplace add https://github.com/flightlesstux/prompt-caching
/plugin install prompt-caching@ercan-ermis
For Cursor/Windsurf/Zed:
npm install -g prompt-caching-mcp
Then point your MCP configuration at it.
The tool is open source under the MIT license and available for free. The repository is at https://github.com/flightlesstux/prompt-caching.
📖 Read the full source: r/ClaudeAI
👀 See Also

OpenRouter Model Pricing and Intelligence-per-Dollar Analysis
A Reddit user compiled OpenRouter API pricing for 16 AI models and calculated intelligence-per-dollar values, identifying MiMo-V2-Flash as best value at $0.09/M tokens and GPT-5.4 as most intelligent at $2.50/M tokens.

Context Routing Layer Reduces Claude Code Token Usage by Tracking Accessed Files
A developer saved approximately $80 per month on Claude Code usage by adding a context routing layer that prevents the AI from re-reading the same repository files on follow-up turns. The tool tracks what files have already been accessed to reduce redundant token consumption.

OpenClaw Setup Assistance Offered by ClawSet
ClawSet provides setup services for OpenClaw, focusing on understanding client needs. The service includes a setup call for $99 and a month of troubleshooting support.

Terminal-Based 3D Renderer Built with Multi-Agent Claude Code System
A developer created tortuise, a pure terminal-based 3D renderer that displays Gaussian splats using Unicode and ASCII symbols, built over 3 days using 70-80 AI agents coordinated through a Claude Code setup with subagents inside subagents.