Prefex: A Local Proxy for Claude Code That Automates Prompt Caching and Session Memory

✍️ OpenClawRadar · 📅 Published: April 15, 2026

Prefex is a local proxy tool designed to reduce API costs when using Claude Code. It targets two sources of unnecessary cost: Anthropic's beta prompt caching feature has to be enabled through a manually injected header, and Claude Code sends the full conversation history with every request.

How It Works

Prefex runs entirely on your local machine as a proxy between Claude Code and Anthropic's API. It automatically injects the header that activates Anthropic's prompt caching feature, which cuts the cost of repeated input tokens by 90%. Without this header, every request, including your CLAUDE.md and project context, is billed at full price.
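
The header injection step can be sketched in a few lines of Python. This is a minimal illustration and not Prefex's actual code: the function name `with_cache_header` is hypothetical, and the flag value is an assumption, since the source does not name the header Prefex injects. The sketch merges the flag into any existing `anthropic-beta` value rather than overwriting it.

```python
# Assumed beta flag; the article does not say which value Prefex injects.
CACHE_BETA_FLAG = "prompt-caching-2024-07-31"


def with_cache_header(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of `headers` with the caching flag merged into anthropic-beta."""
    out = dict(headers)
    # HTTP header names are case-insensitive, so reuse an existing key if present.
    key = next((k for k in out if k.lower() == "anthropic-beta"), "anthropic-beta")
    # The header carries a comma-separated list of beta flags.
    flags = [f.strip() for f in out.get(key, "").split(",") if f.strip()]
    if CACHE_BETA_FLAG not in flags:
        flags.append(CACHE_BETA_FLAG)
    out[key] = ",".join(flags)
    return out


if __name__ == "__main__":
    print(with_cache_header({"x-api-key": "sk-example"}))
```

In Anthropic's API, caching also depends on `cache_control` markers in the request body. Whether Prefex adds those as well is not stated in the source.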

The tool also implements session memory, preventing Claude Code from resending the entire conversation history with each turn. Additionally, it includes a model router that can route simpler queries to cheaper models, though this feature wasn't active during the initial testing period.
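
The source does not describe how the model router decides what counts as a simpler query. One possible heuristic, shown here purely as an assumption, is to route by prompt length and send anything involving tool calls to the largest model. The tier names (`cheap-model`, `mid-model`, `large-model`), the thresholds, and the `pick_model` function are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteRule:
    max_chars: int  # route prompts up to this length...
    model: str      # ...to this (placeholder) model tier


# Hypothetical tiers, ordered from cheapest to most capable.
RULES = [
    RouteRule(max_chars=200, model="cheap-model"),
    RouteRule(max_chars=2000, model="mid-model"),
]
DEFAULT_MODEL = "large-model"


def pick_model(prompt: str, has_tool_calls: bool = False) -> str:
    """Pick a model tier for a request using a simple length heuristic."""
    # Assumption: tool-using turns are treated as complex and never downgraded.
    if has_tool_calls:
        return DEFAULT_MODEL
    for rule in RULES:
        if len(prompt) <= rule.max_chars:
            return rule.model
    return DEFAULT_MODEL


if __name__ == "__main__":
    print(pick_model("What does this function return?"))
```

A real router would likely need a better signal than length, but the shape is the same: classify the request, then rewrite the model field before forwarding.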

Performance and Installation

In a 4-day test with normal usage:

  • 1,338 requests processed
  • $49.60 actual cost with Prefex
  • $348 estimated cost without Prefex
  • 86% savings achieved (with caching only, no model routing)
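
The savings figure follows directly from the two cost numbers above. A quick check in Python, using only the values reported in the test:

```python
num_requests = 1338
cost_with = 49.60      # actual cost with Prefex, in USD
cost_without = 348.00  # estimated cost without Prefex, in USD

# Fraction of the estimated bill that was avoided.
savings = 1 - cost_with / cost_without
print(f"savings: {savings:.1%}")  # 85.7%, which rounds to the reported 86%

# Average cost per request in each scenario.
print(f"per request with Prefex:    ${cost_with / num_requests:.4f}")
print(f"per request without Prefex: ${cost_without / num_requests:.4f}")
```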

The developer provides a benchmark that runs five questions against the karpathy/nanoGPT repository with cold and warm starts; a run costs approximately $0.03. Cost calculations use Anthropic's actual billing fields.
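
As a rough illustration of a cost calculation based on billing fields, the sketch below prices one request from the `usage` object in an Anthropic API response, which reports `input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`, and `output_tokens`. This is not the developer's benchmark code. The function names are hypothetical, the per-million-token prices in the example are illustrative, and the cache multipliers (1.25x for writes, 0.1x for reads) are assumptions that should be checked against current pricing.

```python
def request_cost(usage: dict, input_price: float, output_price: float,
                 cache_write_mult: float = 1.25,
                 cache_read_mult: float = 0.10) -> float:
    """Cost in USD of one request; prices are USD per million tokens."""
    # Cache fields may be missing or null on responses that did not use caching.
    fresh = usage.get("input_tokens") or 0
    written = usage.get("cache_creation_input_tokens") or 0
    read = usage.get("cache_read_input_tokens") or 0
    output = usage.get("output_tokens") or 0
    weighted_input = fresh + written * cache_write_mult + read * cache_read_mult
    return (weighted_input * input_price + output * output_price) / 1_000_000


def uncached_cost(usage: dict, input_price: float, output_price: float) -> float:
    """Same request priced as if every input token were billed at full price."""
    return request_cost(usage, input_price, output_price,
                        cache_write_mult=1.0, cache_read_mult=1.0)


if __name__ == "__main__":
    # Illustrative warm-start request: most of the context is read from cache.
    usage = {"input_tokens": 1_000, "cache_creation_input_tokens": 0,
             "cache_read_input_tokens": 100_000, "output_tokens": 500}
    with_cache = request_cost(usage, input_price=3.0, output_price=15.0)
    without = uncached_cost(usage, input_price=3.0, output_price=15.0)
    print(f"with cache: ${with_cache:.4f}, without: ${without:.4f}")
```

Summing both figures over every request in a session gives an actual-versus-estimated comparison of the kind reported above.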

Installation requires one curl command and one added line in settings.json. The package includes an uninstall script. The tool runs locally with no external servers and no telemetry, and API keys go directly to Anthropic.

📖 Read the full source: r/ClaudeAI
