CAL: Open-Source Context Optimization Layer for LLM Agents

What CAL Does
CAL is a Python library that sits between your existing code and your LLM API calls, selecting, compressing, and assembling context for each request. It targets the cost and context-window problems of token-heavy agent setups, which have become more relevant with the recent Claude Pro/Max subscription changes.
Performance Benchmarks
In production with Claude Opus 4 and 103 context chunks:
- Without CAL: Every request sends all 103 chunks (~23,000 tokens) at $0.043 per request
- With CAL: Each request sends ~6 chunks (~4,100 tokens) at $0.008 per request
- Results: 83% reduction in tokens, 81% reduction in cost
CAL was also validated against 5,000 WildChat prompts (an open academic dataset of real LLM conversations across 57 languages), with 97.6% average savings.
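As a quick sanity check on the production figures above, the reductions can be recomputed from the rounded per-request numbers. This is only arithmetic on the quoted values, not CAL's cost engine; the rounded token counts give roughly 82%, slightly under the reported 83%, which presumably reflects the unrounded measurements.

```python
# Rounded per-request figures quoted in the benchmark above.
tokens_before, tokens_after = 23_000, 4_100
cost_before, cost_after = 0.043, 0.008


def reduction(before: float, after: float) -> float:
    """Fractional reduction going from `before` to `after`."""
    return 1 - after / before


token_cut = reduction(tokens_before, tokens_after)
cost_cut = reduction(cost_before, cost_after)

print(f"tokens: {token_cut:.1%}, cost: {cost_cut:.1%}")
# prints "tokens: 82.2%, cost: 81.4%"
```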
Key Features
- Selector: IDF-weighted scoring picks only the chunks relevant to each query. Every prompt combines a stable prefix with dynamic chunks selected per request.
- Tool Stubs: Three-tier lazy tool loading. Tools are exposed as lightweight stubs until the model signals intent to use a specific tool.
- Cost Engine: Provider-aware savings calculator that knows Anthropic's 4 input tiers and Google's cache storage pricing.
- Noise Suppression: An IDF floor and require-any gates stop common words from loading irrelevant chunks on every request.
- Cache-Stable Ordering: Scores decide only which chunks are selected. Selected chunks are then placed in alphabetical order, so their position does not shift with the scores and cache hits are preserved.
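The selector, noise suppression, and cache-stable ordering described above can be sketched together in a few lines. This is a minimal illustration, not CAL's actual API: the function names, the smoothed IDF formula, the `idf_floor` and `top_k` defaults, and the reading of "require-any" as "at least one informative query term must match the chunk" are all assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> set[str]:
    """Lowercase word set; a deliberately simple tokenizer for the sketch."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_idf(chunks: dict[str, str]) -> dict[str, float]:
    """Smoothed IDF: terms that appear in fewer chunks get a higher weight."""
    n = len(chunks)
    df = Counter(term for text in chunks.values() for term in tokenize(text))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def select_chunks(query, chunks, idf, idf_floor=1.5, top_k=2):
    # IDF floor (assumed threshold): drop query terms too common to be informative.
    terms = {t for t in tokenize(query) if idf.get(t, 0.0) >= idf_floor}
    scores = {}
    for name, text in chunks.items():
        matched = terms & tokenize(text)
        if matched:  # require-any gate (assumed meaning): at least one term must match
            scores[name] = sum(idf[t] for t in matched)
    # Scores decide membership only ...
    chosen = sorted(scores, key=scores.get, reverse=True)[:top_k]
    # ... position is alphabetical, so the assembled prompt is stable across requests.
    return sorted(chosen)


# Hypothetical chunk store for illustration.
chunks = {
    "billing": "the billing api issues invoices and refunds",
    "deploy": "the deploy script pushes the docker image",
    "auth": "the auth service issues tokens and checks the password",
}
idf = build_idf(chunks)

print(select_chunks("how do I reset the password and refresh tokens", chunks, idf))
print(select_chunks("refunds for the docker deploy image", chunks, idf))
```

In the second query "deploy" scores higher than "billing", yet "billing" is placed first because position follows the name rather than the score. Words like "the" and "and" fall below the floor, so they never pull in a chunk on their own.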
Technical Details
Multi-turn context handling: Tool stubs are history-aware. If the model used a tool in a previous turn, the full schema stays loaded to maintain conversation continuity.
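A rough sketch of history-aware stubbing, under stated assumptions: the source does not describe the three tiers or how the model signals intent, so this models only two states (stub and full schema) and takes the requested tool names as an input. `Tool`, `render_tools`, and the dictionary layout are hypothetical names, not CAL's API.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str
    summary: str                                 # one-line description shown in the stub
    schema: dict = field(default_factory=dict)   # full parameter schema


def render_tools(tools, used_in_history, requested=frozenset()):
    """Return full definitions for tools already used or newly requested, stubs otherwise."""
    rendered = []
    for tool in tools:
        entry = {"name": tool.name, "description": tool.summary}
        if tool.name in used_in_history or tool.name in requested:
            # Keep the full schema loaded so follow-up calls in later turns still work.
            entry["parameters"] = tool.schema
        rendered.append(entry)
    return rendered


# Hypothetical tools for illustration.
tools = [
    Tool("search", "Search the web", {"type": "object", "properties": {"q": {"type": "string"}}}),
    Tool("run_sql", "Run a SQL query", {"type": "object", "properties": {"sql": {"type": "string"}}}),
]

# The model called `search` in an earlier turn, so only `run_sql` is reduced to a stub.
payload = render_tools(tools, used_in_history={"search"})
print([sorted(entry) for entry in payload])
```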
Provider support: CAL is provider-agnostic and works with any provider that exposes a chat completions endpoint. The cost engine already handles Anthropic's 4 input tiers and Google's cache storage pricing.
Edge cases: Ambiguous queries are handled with IDF floors and noise suppression. Hybrid keyword+semantic scoring is on the roadmap.
Installation and Licensing
pip install cal-context
MIT licensed. PyPI: https://pypi.org/project/cal-context/
GitHub: https://github.com/vjc-lab/context-assembly-layer
📖 Read the full source: r/openclaw