Open-source memory system for LLM agents achieves high benchmark scores

Memory system for LLM agents with published benchmarks
A developer has built a persistent memory system for Claude Code and OpenClaw that gives LLM agents context continuity across sessions. The developer reports benchmark scores of 90.8% on LoCoMo, which they say beats every published system, and 89.1% on LongMemEval.
Architecture and framework compatibility
The architecture is adapter-based. The current adapters hook into agent lifecycle events, while the core components (storage, retrieval, intelligence) are framework-agnostic. The retrieval pipeline fuses four channels with RRF (reciprocal rank fusion): FTS5 full-text search, Qdrant KNN vector search, recency, and graph walk. The intelligence layer includes intent classification, experience patterns, and RL policy components that could plug into any agent framework.
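To illustrate how reciprocal rank fusion can merge several ranked result lists, here is a minimal sketch. It is not taken from the project's code: the function name `rrfFuse`, the channel keys, the memory ids, and the constant `k = 60` (a commonly used default) are all assumptions for illustration. Each channel contributes `1 / (k + rank)` for every item it returns, so items ranked well by several channels rise to the top.

```typescript
// Hypothetical sketch of reciprocal rank fusion (RRF); names are illustrative.
// Each ranked list holds memory ids, best match first, with no repeated ids.
type RankedList = string[];

function rrfFuse(
  channels: Record<string, RankedList>,
  k = 60, // assumed smoothing constant; the project's actual value is not stated
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranked of Object.values(channels)) {
    ranked.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Made-up results from four channels mirroring those named above.
const fused = rrfFuse({
  fts5: ["m1", "m2", "m3"],
  vector: ["m2", "m1", "m4"],
  recency: ["m4", "m2"],
  graph: ["m2", "m5"],
});
console.log(fused.map((r) => r.id));
```

Because only ranks are used, the channels do not need comparable score scales, which is one reason this kind of fusion suits mixing keyword, vector, recency, and graph signals.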
Setup and tech stack
Quick setup requires:
ollama pull snowflake-arctic-embed2
bun install && bun run build && bun run setup
node dist/angel/index.cjs
The tech stack includes TypeScript, SQLite (better-sqlite3), Qdrant, Ollama, esbuild, and Vitest.
Key design decisions
- Dual-write system with SQLite as truth source and Qdrant for acceleration, with graceful degradation
- Every operation is non-throwing — individual failures never break the pipeline
- Ephemeral hooks (millisecond lifetime) for capture, persistent Angel for reflection
- RL policy models are pure TypeScript (Float32Array math, no PyTorch)
- Content-length-aware embedding backfill in background
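The first two design decisions can be sketched together as follows. This is a simplified illustration under stated assumptions, not the project's implementation: `safely`, `MemoryStore`, and `dualWrite` are hypothetical names, and both stores are synchronous in-memory stand-ins (a real vector database client would typically be asynchronous). The idea is that a failed write to the accelerator is reported as a value rather than thrown, so the write to the truth source still counts.

```typescript
// Hypothetical sketch: non-throwing operations plus dual-write with degradation.
type Result<T> = { ok: true; value: T } | { ok: false; error: string };

// Run an operation and convert any thrown error into a Result value.
function safely<T>(op: () => T): Result<T> {
  try {
    return { ok: true, value: op() };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Assumed minimal store interface; real storage adapters will differ.
interface MemoryStore {
  write(id: string, text: string): void;
}

function dualWrite(
  truth: MemoryStore,
  accelerator: MemoryStore,
  id: string,
  text: string,
): { stored: boolean; indexed: boolean } {
  const stored = safely(() => truth.write(id, text));
  if (!stored.ok) return { stored: false, indexed: false };
  // The accelerator is best effort: a failure degrades search, not storage.
  const indexed = safely(() => accelerator.write(id, text));
  return { stored: true, indexed: indexed.ok };
}

// Stand-ins: a working truth store and an unreachable vector index.
const rows = new Map<string, string>();
const truthStore: MemoryStore = { write: (id, text) => { rows.set(id, text); } };
const downIndex: MemoryStore = {
  write: () => { throw new Error("vector index unreachable"); },
};

const outcome = dualWrite(truthStore, downIndex, "m1", "user prefers bun over npm");
console.log(outcome);
```

In a design like this, items stored but not indexed could be picked up later by a background backfill job, which is consistent with the embedding backfill mentioned in the last bullet.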
The project comprises 29K lines of code and 1,968 tests, and is available under the MIT license at https://github.com/grigorijejakisic/Claudex.
📖 Read the full source: r/openclaw