Claude AI Session Compaction Issues and Workarounds

How Compaction Works
Claude sessions are stored as JSONL files at ~/.claude/projects/{encoded-cwd}/sessions/{id}.jsonl. Each conversation turn is a JSON block on its own line. When compaction triggers, the original blocks remain in the file and a new block containing a compressed summary is appended. From that point on, the model works from the summary instead of the full conversation history.
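To make the append-only layout concrete, here is a minimal sketch of reading a session file and finding the most recent summary block. The source does not describe the block schema, so the "type" field and its "summary" value are assumptions made for illustration only; inspect a real session file before relying on them.

```python
import json
from pathlib import Path
from typing import Iterable, Iterator, Optional


def read_blocks(session_path: Path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL file."""
    with session_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def latest_summary(blocks: Iterable[dict],
                   summary_type: str = "summary") -> Optional[dict]:
    """Return the last block whose "type" equals summary_type, else None.

    ASSUMPTION: the "type" key and the "summary" value are hypothetical;
    the real session schema may mark compaction blocks differently.
    """
    found = None
    for block in blocks:
        if block.get("type") == summary_type:
            found = block  # later summaries replace earlier ones
    return found
```

Because compaction appends rather than rewrites, the pre-compaction blocks stay available in the file for offline inspection even though the model no longer sees them.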
Test Results
With a coding project at 90% context fill (before the context window was raised to 1 million tokens), the user asked 10 test questions covering simple recall, 6-hop dependency chains, entity disambiguation, negation chaining, absence detection, and conflict detection.
- Pre-compaction: ~9.75/10 accuracy, with Opus 4.6 locating facts scattered across 418K tokens.
- Post-compaction (default): ~5/10 accuracy from a 3,461-token summary (121x compression). The same session and the same questions now produced hallucinated, incorrect answers.
- Post-compaction (manual, Opus): ~9.75/10 accuracy from a 6,080-token summary (69x compression). A custom compaction prompt run with Opus preserved the important information.
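The compression ratios quoted above follow directly from the token counts. A quick check, treating the "418K" figure as exactly 418,000 tokens (an assumption, since the source gives it rounded):

```python
PRE_COMPACTION_TOKENS = 418_000  # "418K" in the results; assumed exact here


def compression_ratio(original_tokens: int, summary_tokens: int) -> float:
    """How many times smaller the summary is than the original context."""
    return original_tokens / summary_tokens


default_ratio = compression_ratio(PRE_COMPACTION_TOKENS, 3_461)
manual_ratio = compression_ratio(PRE_COMPACTION_TOKENS, 6_080)

# Rounded to whole numbers these match the reported 121x and 69x.
print(round(default_ratio), round(manual_ratio))
```

The manual summary is less than twice the size of the default one, yet it restored accuracy to the pre-compaction level in this test.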
Why the Difference
According to Anthropic's documentation, the API defaults to using the same model for compaction. The user was running Opus 4.6 on medium compute, so default compaction should also have used Opus. Since the model was presumably the same in both cases, the quality difference suggests issues with the summarization prompt, the thinking/compute budget, or both.
Workarounds
Approach 1: Opus Compaction - Turn off auto-compaction and run a background process that measures token counts for each Claude Code instance. When a session nears its limit, trigger compaction using Opus with a custom prompt (potentially after asking the user for authorization).
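A minimal sketch of the monitoring half of Approach 1 is shown here. Everything in it is an assumption for illustration: the characters-per-token heuristic stands in for a real tokenizer, and the 80% trigger threshold and the names CompactionPolicy and sessions_to_compact are hypothetical. The compaction call itself is left out because the source does not specify how it is invoked.

```python
from dataclasses import dataclass
from typing import Dict, List

CHARS_PER_TOKEN = 4  # rough heuristic, NOT a real tokenizer


@dataclass
class CompactionPolicy:
    context_limit: int             # context window size in tokens
    trigger_fraction: float = 0.8  # hypothetical threshold, tune as needed

    def should_compact(self, estimated_tokens: int) -> bool:
        """True once the estimate reaches the trigger share of the window."""
        return estimated_tokens >= self.context_limit * self.trigger_fraction


def estimate_tokens(text: str) -> int:
    """Approximate a token count from the character length."""
    return len(text) // CHARS_PER_TOKEN


def sessions_to_compact(live_context: Dict[str, str],
                        policy: CompactionPolicy) -> List[str]:
    """Return the ids of sessions whose live context crosses the threshold.

    live_context maps a session id to the text the model currently sees,
    not the whole session file (which also holds pre-compaction blocks).
    """
    return [
        session_id
        for session_id, text in live_context.items()
        if policy.should_compact(estimate_tokens(text))
    ]
```

A background loop could call sessions_to_compact on a timer and, for each id returned, prompt the user before starting the custom Opus compaction.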
Approach 2: spaCy NER Pre-seeding - Instead of starting sub-agents with zero context, use spaCy NER to extract proper nouns, numbers, service names, ports, and key identifiers from project files. Inject the result as a lightweight entity briefing (a few hundred tokens) at startup, so agents know which resources already exist without the bloat of a narrative summary.
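Approach 2 could be sketched roughly as follows. The extraction step uses spaCy's standard pipeline API (spacy.load, doc.ents, ent.text, ent.label_) and assumes the en_core_web_sm model is installed. The label filter, the briefing format, and the function names are hypothetical choices, and a stock NER model will not reliably tag ports or service names, so a real setup would likely add custom patterns.

```python
from typing import Dict, Iterable, List, Tuple

Entity = Tuple[str, str]  # (entity text, spaCy label)

# Hypothetical filter: which spaCy labels are worth briefing an agent on.
KEEP_LABELS = {"ORG", "PRODUCT", "PERSON", "GPE", "CARDINAL"}


def extract_entities(text: str,
                     model_name: str = "en_core_web_sm") -> List[Entity]:
    """Run spaCy NER over text and return (text, label) pairs."""
    import spacy  # lazy import so build_briefing works without spaCy

    nlp = spacy.load(model_name)
    return [(ent.text, ent.label_) for ent in nlp(text).ents]


def build_briefing(entities: Iterable[Entity], max_items: int = 50) -> str:
    """Deduplicate entities and format one 'LABEL: a, b' line per label."""
    grouped: Dict[str, List[str]] = {}
    kept = 0
    for text, label in entities:
        if label not in KEEP_LABELS or kept >= max_items:
            continue
        names = grouped.setdefault(label, [])
        if text not in names:
            names.append(text)
            kept += 1
    return "\n".join(
        f"{label}: {', '.join(names)}"
        for label, names in sorted(grouped.items())
    )
```

Calling build_briefing(extract_entities(readme_text)) would yield a compact block that can be prepended to a sub-agent's first prompt; the max_items cap keeps the briefing within the few-hundred-token budget described above.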
📖 Read the full source: r/ClaudeAI