Context Gateway: An Open-Source Proxy for Compressing AI Agent Context

What Context Gateway Does
Context Gateway is an agentic proxy that sits between an AI coding agent (such as Claude Code, OpenClaw, or Cursor) and the LLM API. When tool outputs such as file reads or grep results dump thousands of tokens into the context window, the proxy compresses that content before it reaches the LLM. The motivation is that accuracy on long-context benchmarks falls steeply as context grows: OpenAI's GPT-5.4 evaluation reportedly drops from 97.2% at 32k tokens to 36.6% at 1M tokens.
How the Compression Works
The system relies on small language models (SLMs), with classifiers trained on model internals to detect which parts of the context carry the most signal. When a tool returns output, the compression is conditioned on the intent of the tool call. For example, if an agent called grep looking for error handling patterns, the SLM keeps the relevant matches and strips the rest. If the model later needs something that was removed, it can call expand() to fetch the original output.
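The compress-then-expand flow described above can be sketched in a few lines of Python. This is only an illustration: a simple keyword overlap stands in for the SLM classifier, and the names ToolOutputStore, compress, and call_id are hypothetical, not part of Context Gateway. Only the idea of an expand() call that returns the original output comes from the project description.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ToolOutputStore:
    """Hypothetical store that keeps originals so compressed output can be expanded."""
    originals: dict = field(default_factory=dict)

    def compress(self, call_id: str, output: str, intent: str) -> str:
        # Remember the full output so expand() can return it later.
        self.originals[call_id] = output
        # Stand-in for the SLM: keep lines that share a word with the intent.
        keywords = {w for w in re.findall(r"[a-z]+", intent.lower()) if len(w) > 3}
        lines = output.splitlines()
        kept = [ln for ln in lines
                if keywords & set(re.findall(r"[a-z]+", ln.lower()))]
        dropped = len(lines) - len(kept)
        kept.append(f"[{dropped} lines omitted; call expand('{call_id}') for the full output]")
        return "\n".join(kept)

    def expand(self, call_id: str) -> str:
        # Return the untouched original tool output.
        return self.originals[call_id]


store = ToolOutputStore()
raw = "\n".join([
    'src/a.go:10: log.Printf("error: %v", err)',
    "src/b.go:3: import fmt",
    "src/c.go:7: // error handling for parser",
])
compressed = store.compress("call-1", raw, intent="find error handling patterns")
```

Here compressed keeps the two lines that mention errors plus a one-line marker, while store.expand("call-1") returns the full grep output unchanged.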
Key Features and Setup
- Background compaction: Triggered at 85% window capacity, with summaries pre-computed so you don't wait for compaction
- Lazy-load tool descriptions: The model only sees tools relevant to the current step
- Spending caps: Control costs with budget limits
- Dashboard: Track running and past sessions
- Slack notifications: Get pinged when an agent is waiting on you
- Supported agents: Claude Code, Cursor, OpenClaw, or custom configurations
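To make the background compaction and spending cap items concrete, here is a minimal sketch of the two checks. The function names and the dollar-based budget are assumptions made for illustration; only the 85% figure comes from the feature list above, and the real proxy may measure usage differently.

```python
def should_compact(used_tokens: int, window_tokens: int, threshold: float = 0.85) -> bool:
    """Return True once usage reaches the given fraction of the context window."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens >= threshold


def within_budget(spent_usd: float, cap_usd: float) -> bool:
    """Return True while the session's spend is still below the cap (hypothetical unit: USD)."""
    return spent_usd < cap_usd


# Example: a 200k-token window with 180k tokens used is past the 85% mark.
needs_compaction = should_compact(180_000, 200_000)
can_continue = within_budget(spent_usd=4.20, cap_usd=5.00)
```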
Getting Started
Install with:
curl -fsSL https://compresr.ai/api/install | sh
Then run context-gateway to launch an interactive TUI wizard that helps you:
- Choose an agent (claude_code, cursor, openclaw, or custom)
- Create/edit configuration including summarizer model and API key
- Enable Slack notifications if needed
- Set trigger threshold for compression (default: 75%)
The tool is open-source, built primarily in Go (90.9%), and maintained by Compresr, a YC-backed company. You can check compaction logs at logs/history_compaction.jsonl to see what's happening under the hood.
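Since the log is a JSONL file, it can be inspected with a few lines of Python. The sketch below assumes only that each non-empty line is one JSON object; the record schema is not described here, so the code makes no assumption about field names. The helper names parse_jsonl and load_compaction_log are hypothetical.

```python
import json
from typing import Iterable


def parse_jsonl(lines: Iterable[str]) -> list:
    """Parse an iterable of JSONL lines into a list of records, skipping blank lines."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


def load_compaction_log(path: str = "logs/history_compaction.jsonl") -> list:
    """Read the compaction log from disk (path taken from the article)."""
    with open(path, encoding="utf-8") as f:
        return parse_jsonl(f)
```

Calling load_compaction_log() from the directory that contains logs/ returns one Python object per compaction event, which you can then print or filter as needed.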
📖 Read the full source: HN LLM Tools