Memory System for LLM Agents Scores 90.8% on LoCoMo

Memory system for LLM agents with published benchmarks

A developer has built a persistent memory system for Claude Code and OpenClaw that gives LLM agents context continuity across sessions. The developer reports benchmark scores of 90.8% on LoCoMo (higher than any published system, according to the author) and 89.1% on LongMemEval.

Architecture and framework compatibility

The architecture is adapter-based: the current adapters hook into lifecycle events, while the core components (storage, retrieval, intelligence) are framework-agnostic. The retrieval pipeline fuses four channels with Reciprocal Rank Fusion (RRF): FTS5 full-text search, Qdrant KNN, recency, and graph walk. The intelligence layer includes intent classification, experience patterns, and RL policy components that could plug into any agent framework.
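
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over four ranked lists. It is not taken from the project: the function name reciprocalRankFusion, the sample memory ids, and the constant k = 60 (a value commonly used for RRF) are all assumptions for illustration, and the real pipeline may weight or truncate channels differently.

```typescript
// Minimal RRF sketch. Each channel contributes 1 / (k + rank) for every id it returns.
// Assumption: k = 60 and equal channel weights; the project's actual settings are not stated.
type RankedList = string[];

function reciprocalRankFusion(
  channels: RankedList[],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of channels) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first; ties broken by id for a stable order.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}

// Hypothetical per-channel results (best match first in each list).
const ftsHits = ["m1", "m2", "m3"];
const knnHits = ["m2", "m1", "m4"];
const recencyHits = ["m4", "m2"];
const graphHits = ["m2", "m5"];

const fused = reciprocalRankFusion([ftsHits, knnHits, recencyHits, graphHits]);
console.log(fused.map((entry) => entry.id));
```

Because RRF only looks at ranks, the channels do not need comparable score scales, which is one plausible reason to choose it when mixing keyword, vector, recency, and graph signals.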

Setup and tech stack

Quick setup requires:

ollama pull snowflake-arctic-embed2
bun install && bun run build && bun run setup
node dist/angel/index.cjs

Tech stack includes TypeScript, SQLite (better-sqlite3), Qdrant, Ollama, esbuild, and Vitest.

Key design decisions

Dual-write system with SQLite as truth source and Qdrant for acceleration, with graceful degradation
Every operation is non-throwing: individual failures never break the pipeline
Ephemeral hooks (millisecond lifetime) for capture, persistent Angel for reflection
RL policy models are pure TypeScript (Float32Array math, no PyTorch)
Content-length-aware embedding backfill in background
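
The first two decisions above can be sketched together as follows. This is an illustrative outline only: the Store interface, the in-memory stand-ins for SQLite and Qdrant, and the names attempt and dualWrite are hypothetical and do not come from the project's code.

```typescript
// Sketch of a non-throwing dual write with graceful degradation.
// Assumption: all types and names here are hypothetical stand-ins, not the project's API.
interface MemoryRecord { id: string; text: string }
interface Store { put(record: MemoryRecord): void }

// Stand-in for the source-of-truth store (SQLite in the described design).
class InMemoryStore implements Store {
  readonly rows = new Map<string, MemoryRecord>();
  put(record: MemoryRecord): void { this.rows.set(record.id, record); }
}

// Stand-in for an accelerator (Qdrant in the described design) that is currently down.
class FailingStore implements Store {
  put(_record: MemoryRecord): void { throw new Error("vector index unavailable"); }
}

// Runs fn and returns an error message instead of throwing; null means success.
function attempt(fn: () => void): string | null {
  try {
    fn();
    return null;
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}

interface WriteOutcome { stored: boolean; indexed: boolean; errors: string[] }

function dualWrite(truth: Store, accelerator: Store, record: MemoryRecord): WriteOutcome {
  const errors: string[] = [];
  const truthError = attempt(() => truth.put(record));
  if (truthError !== null) {
    // Without the source of truth there is nothing to accelerate; report and stop.
    errors.push(truthError);
    return { stored: false, indexed: false, errors };
  }
  const indexError = attempt(() => accelerator.put(record));
  if (indexError !== null) errors.push(indexError); // degrade: stored but not indexed
  return { stored: true, indexed: indexError === null, errors };
}

const truthStore = new InMemoryStore();
const healthy = dualWrite(truthStore, new InMemoryStore(), { id: "a", text: "hello" });
const degraded = dualWrite(truthStore, new FailingStore(), { id: "b", text: "world" });
console.log(healthy, degraded);
```

In this sketch a failed accelerator write leaves the record stored but unindexed, so a later background pass could re-index it from the source of truth.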

The project comprises 29K lines of code and 1,968 tests, and is available under the MIT license at https://github.com/grigorijejakisic/Claudex.

📖 Read the full source: r/openclaw