LightMem: Lightweight Memory System for LLM Agents with 10×+ Gains and 100× Lower Cost

LightMem: A Practical Memory Layer for LLM Agents
LightMem is a lightweight, modular memory system for LLM agents. It targets three problems of long, multi-turn interactions: context grows noisy and expensive, models get "lost in the middle," and existing memory systems add latency and token cost of their own.
How LightMem Works
The system maintains compact, topical, and consistent memories through three key mechanisms:
- Pre-compress sensory memory: Filters redundant and low-value tokens before storage
- Topic-aware short-term memory: Clusters turns by topic and summarizes into precise memory units
- Sleep-time long-term consolidation: Performs cheap incremental inserts at runtime and defers high-fidelity updates to an offline phase, so consolidation adds no latency to the live interaction
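The three stages above can be pictured with a toy pipeline. This is only an illustrative sketch, not LightMem's implementation: every name here (`precompress`, `topic_of`, `build_short_term`, `LongTermMemory`) is hypothetical, a stopword filter stands in for the real compressor, keyword overlap stands in for topic clustering, and string joining stands in for LLM summarization.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical stand-in for a learned compressor: a fixed list of filler words.
STOPWORDS = {"the", "a", "an", "um", "uh", "so", "well", "please", "just"}


def precompress(turn: str) -> str:
    """Stage 1 (sensory memory): drop low-value tokens before storage."""
    kept = [w for w in turn.split() if w.lower().strip(".,!?") not in STOPWORDS]
    return " ".join(kept)


def topic_of(turn: str, topics: dict[str, set[str]]) -> str:
    """Pick the topic with the largest keyword overlap, or 'misc' if none match."""
    words = {w.lower().strip(".,!?") for w in turn.split()}
    best = max(topics, key=lambda t: len(words & topics[t]))
    return best if words & topics[best] else "misc"


def build_short_term(turns: list[str], topics: dict[str, set[str]]) -> dict[str, str]:
    """Stage 2 (short-term memory): group turns by topic, one memory unit per topic."""
    buckets: dict[str, list[str]] = defaultdict(list)
    for turn in turns:
        compressed = precompress(turn)
        buckets[topic_of(compressed, topics)].append(compressed)
    # Joining is a placeholder for an LLM-written summary of each topic group.
    return {topic: " | ".join(items) for topic, items in buckets.items()}


@dataclass
class LongTermMemory:
    """Stage 3: cheap online inserts plus a separate offline consolidation pass."""
    entries: dict[str, list[str]] = field(default_factory=lambda: defaultdict(list))

    def insert(self, topic: str, unit: str) -> None:
        # Online path: append only, so the live interaction is never blocked.
        self.entries[topic].append(unit)

    def consolidate(self) -> None:
        # Offline ("sleep-time") path: here, just remove exact duplicates in order.
        for topic, units in self.entries.items():
            self.entries[topic] = list(dict.fromkeys(units))


topics = {"travel": {"flight", "hotel", "paris"}, "food": {"pizza", "vegetarian"}}
turns = [
    "Um so please book a flight to Paris",
    "I am vegetarian",
    "Well the hotel should be near the station",
]

ltm = LongTermMemory()
for _ in range(2):  # store the same session twice to create redundancy
    for topic, unit in build_short_term(turns, topics).items():
        ltm.insert(topic, unit)

ltm.consolidate()  # would run offline, outside the request path
print(dict(ltm.entries))
```

The point of the split between `insert` and `consolidate` is that the expensive work can be scheduled when no user is waiting, which is the intuition behind the "sleep-time" label.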
Performance Results
On the LongMemEval benchmark, the authors report the following best-case results for LightMem relative to baseline memory systems:
- Accuracy improvement: up to 10.9%
- Token reduction: up to 117×
- API call reduction: up to 159×
- Runtime reduction: >12×
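To make the multipliers easier to compare with the percentage figure, a k-fold reduction can be converted into the share of the original cost that is saved, which is 1 - 1/k. The snippet below is plain arithmetic on the numbers listed above; the helper name `fraction_saved` is made up for this example, and since the figures are "up to" values, the results are best-case bounds rather than typical savings.

```python
def fraction_saved(reduction_factor: float) -> float:
    """A k-fold reduction leaves 1/k of the original cost, so 1 - 1/k is saved."""
    if reduction_factor <= 0:
        raise ValueError("reduction factor must be positive")
    return 1.0 - 1.0 / reduction_factor


# Reduction factors as listed in the results above (best-case values).
reported = {"tokens": 117, "API calls": 159, "runtime": 12}

for name, factor in reported.items():
    print(f"{name}: {factor}x reduction = {fraction_saved(factor):.1%} saved")
```

Under this conversion, the 117× token reduction corresponds to about 99.1% fewer tokens, the 159× API call reduction to about 99.4% fewer calls, and a 12× runtime reduction to about 91.7% less runtime.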
Recent Updates and Features
- Baseline evaluation framework across memory systems (Mem0, A-MEM, LangMem) on LoCoMo & LongMemEval
- Demo video and tutorial notebooks for multiple scenarios
- MCP Server integration for multi-tool memory invocation
- Full LoCoMo dataset support
- GLM-4.6 integration with reproducible scripts
- Local deployment via Ollama, vLLM, Transformers with auto-load capability
Positioning and Use Cases
LightMem is designed as a modular memory layer that can integrate with various agent stacks including:
- Long-context agents
- Tool-using agents
- Autonomous workflows
- Conversational systems
The system provides structured memory that scales without exploding token counts. This makes it particularly useful for applied LLM teams and for developers working with agent frameworks, memory/RAG systems, and long-context models.
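One way to picture a memory layer that plugs into different agent stacks is as a small interface the agent calls before building each prompt. The sketch below is an assumption-laden illustration, not LightMem's actual API: `MemoryLayer`, `KeywordMemory`, and `build_prompt` are hypothetical names, and keyword overlap stands in for real retrieval.

```python
from typing import Protocol


class MemoryLayer(Protocol):
    """Hypothetical interface: any backend with these two methods can be swapped in."""

    def add(self, text: str) -> None: ...

    def retrieve(self, query: str, k: int) -> list[str]: ...


class KeywordMemory:
    """Toy backend that ranks stored items by word overlap with the query."""

    def __init__(self) -> None:
        self._items: list[str] = []

    def add(self, text: str) -> None:
        self._items.append(text)

    def retrieve(self, query: str, k: int) -> list[str]:
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(item.lower().split())), item) for item in self._items]
        # Keep only items that share at least one word, best matches first.
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
        return [item for _, item in ranked[:k]]


def build_prompt(memory: MemoryLayer, user_msg: str, k: int = 2) -> str:
    """Inject only the top-k recalled memories instead of the full history."""
    recalled = memory.retrieve(user_msg, k)
    context = "\n".join(f"- {m}" for m in recalled)
    return f"Relevant memory:\n{context}\n\nUser: {user_msg}"


memory = KeywordMemory()
for fact in [
    "user prefers window seats on flights",
    "user is vegetarian",
    "user lives in Berlin",
]:
    memory.add(fact)

prompt = build_prompt(memory, "book flights with good seats", k=1)
print(prompt)
```

Because the prompt carries at most `k` memory units, its size stays bounded as the stored history grows, which is the property the paragraph above describes.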
Availability
Paper: https://arxiv.org/abs/2510.18866
Code: https://github.com/zjunlp/LightMem
Source: r/LocalLLaMA