Vektori 3-Layer Sentence Graph: Claude-Inspired AI Memory

Memory Architecture Principles

The Claude Code team shared how their memory system works, revealing several key principles. Memory is an index, not storage: MEMORY.md contains just pointers (150 chars per line), with the real knowledge in separate files fetched on demand. Raw transcripts are never loaded, only grepped when needed. There are three layers, each with a different access cost. The sharpest principle: if something is derivable, do not store it. Retrieval is skeptical: memory is a hint, not truth, and the model verifies before using it.
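To make the "index, not storage" idea concrete, here is a minimal sketch of a pointer index. It is not Claude Code's actual implementation: the MemoryIndex class, the make_pointer helper, and the pointer line format are all hypothetical, and an in-memory dict stands in for the separate topic files.

```python
MAX_POINTER_CHARS = 150  # per-line budget mentioned in the post


def make_pointer(topic: str, path: str, hint: str) -> str:
    """Build one index line and cut it to the per-line budget (hypothetical format)."""
    return f"{topic} -> {path}: {hint}"[:MAX_POINTER_CHARS]


class MemoryIndex:
    """Keeps short pointers in the index; topic bodies are read only on demand."""

    def __init__(self, topic_files: dict):
        self._files = topic_files  # assumption: stands in for files on disk
        self._paths = {}           # topic -> path
        self._lines = []           # what the index file would contain

    def add(self, topic: str, path: str, hint: str) -> None:
        self._paths[topic] = path
        self._lines.append(make_pointer(topic, path, hint))

    def index_text(self) -> str:
        """The only text that is always loaded: one short pointer per line."""
        return "\n".join(self._lines)

    def recall(self, topic: str):
        """Fetch the full body for one topic, or None if nothing points to it."""
        path = self._paths.get(topic)
        return None if path is None else self._files.get(path)
```

The design choice illustrated is that the always-loaded text stays small and bounded, while the cost of reading a topic body is paid only when that topic is actually needed.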

Vektori's Implementation

Vektori applies the same principles with a different shape. While Claude uses a file hierarchy, Vektori implements a hierarchical sentence graph with three layers:

FACT LAYER (L0) — Crisp statements serving as the search surface. Cheap and always queryable.
EPISODE LAYER (L1) — Episodes across conversations, auto-discovered.
SENTENCE LAYER (L2) — Raw conversation, only fetched when explicitly needed.

The same access model applies: L0 is your index, L2 is your transcript (grepped, not dumped). You pay only for what you need.
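The three layers and their "pay for what you need" access model could be sketched as follows. This is an illustrative data structure only, not Vektori's real schema: the class names, field names, and the expand method with its depth parameter are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Sentence:  # L2: raw conversation text
    id: str
    text: str


@dataclass
class Episode:  # L1: a grouping discovered across conversations
    id: str
    summary: str


@dataclass
class Fact:  # L0: crisp statement, the search surface
    id: str
    text: str
    episode_ids: list = field(default_factory=list)
    sentence_ids: list = field(default_factory=list)


class SentenceGraph:
    def __init__(self):
        self.facts, self.episodes, self.sentences = {}, {}, {}

    def expand(self, fact_id: str, depth: int = 0) -> dict:
        """Return the fact alone (depth 0), plus episodes (1), plus raw sentences (2)."""
        fact = self.facts[fact_id]
        out = {"fact": fact.text}
        if depth >= 1:
            out["episodes"] = [self.episodes[e].summary for e in fact.episode_ids]
        if depth >= 2:
            out["sentences"] = [self.sentences[s].text for s in fact.sentence_ids]
        return out
```

A caller that only needs the fact never touches the deeper dictionaries, which mirrors the idea that L2 is consulted only when explicitly requested.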

Strict Write Discipline

Nothing enters L0 without passing quality filters: minimum character count, content density check, pronoun ratio. If a sentence is too vague or purely filler, it never becomes a fact. This matches Claude's principle of not storing derivable things.
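A write gate of this kind might look like the sketch below. The three checks come from the description above, but the thresholds, the word lists, and the definition of "content density" (share of tokens that are not stopwords) are assumptions chosen for illustration, not Vektori's actual values.

```python
import re

# Assumed word lists; a real system would likely use something more complete.
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they", "him", "her", "them",
            "this", "that", "these", "those"}
STOPWORDS = PRONOUNS | {"a", "an", "the", "is", "are", "was", "were", "be", "to",
                        "of", "and", "or", "so", "well", "just", "really", "yeah",
                        "ok", "okay", "do", "did", "like"}


def passes_quality_filters(sentence: str,
                           min_chars: int = 20,
                           min_density: float = 0.4,
                           max_pronoun_ratio: float = 0.3) -> bool:
    """Return True only if the sentence is long, dense, and specific enough."""
    tokens = re.findall(r"[a-z0-9']+", sentence.lower())
    if len(sentence.strip()) < min_chars or not tokens:
        return False
    # Content density: fraction of tokens that carry meaning (not stopwords).
    density = sum(t not in STOPWORDS for t in tokens) / len(tokens)
    # Pronoun ratio: a high value suggests the sentence is vague out of context.
    pronoun_ratio = sum(t in PRONOUNS for t in tokens) / len(tokens)
    return density >= min_density and pronoun_ratio <= max_pronoun_ratio
```

Under these assumed thresholds, a sentence like "The user deploys the billing service to AWS every Friday." passes, while "Yeah, it was like that, you know." is rejected as filler.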

Retrieval Mechanics

Retrieval works as Claude describes: scored, thresholded, skeptical. Nothing surfaces below a minimum score of 0.3. Results are ranked by vector similarity plus temporal decay, not retrieved blindly.
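Scored, thresholded retrieval can be sketched as below. The 0.3 cutoff is taken from the text; everything else is an assumption: the source does not say how similarity and decay are combined, so this example multiplies cosine similarity by an exponential decay with a hypothetical 30-day half-life and applies the threshold to the combined score.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_vec, facts, now_days, min_score=0.3,
             half_life_days=30.0, top_k=5):
    """Rank facts by similarity times recency and drop anything under min_score.

    `facts` is a list of (text, vector, created_days) tuples (assumed shape).
    """
    scored = []
    for text, vec, created_days in facts:
        age = max(0.0, now_days - created_days)
        decay = 0.5 ** (age / half_life_days)  # halves every half_life_days
        score = cosine(query_vec, vec) * decay
        if score >= min_score:                 # skeptical: weak matches never surface
            scored.append((score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

With these assumptions, an exact match that is 90 days old scores 0.125 and is filtered out, so an empty result is a normal outcome rather than an error.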

Architectural Divergence on Corrections

Claude's approach optimizes for single-user project contexts where the latest state matters. Vektori, designed for agents working across hundreds of sessions, preserves correction history. When a user changes their mind, the old fact stays in the graph with its sentence links, allowing tracing back to what was said before the change and why it got superseded.
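One way to preserve correction history is to mark old facts as superseded instead of overwriting them. The sketch below is an assumption about how that could be modeled, not Vektori's actual code: VersionedFact, FactStore, the key field, and the reason field are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VersionedFact:
    id: int
    key: str                             # what the fact is about, e.g. "database"
    text: str
    sentence_ids: list                   # links back to the raw sentences
    superseded_by: Optional[int] = None  # id of the fact that replaced this one
    reason: Optional[str] = None         # why it was replaced, if known


class FactStore:
    def __init__(self):
        self.facts = []    # append-only: nothing is ever deleted
        self._latest = {}  # key -> id of the current fact

    def assert_fact(self, key, text, sentence_ids, reason=None) -> VersionedFact:
        """Add a fact; if one exists for this key, link it forward instead of removing it."""
        fact = VersionedFact(len(self.facts), key, text, list(sentence_ids))
        self.facts.append(fact)
        old_id = self._latest.get(key)
        if old_id is not None:
            self.facts[old_id].superseded_by = fact.id
            self.facts[old_id].reason = reason
        self._latest[key] = fact.id
        return fact

    def current(self, key) -> VersionedFact:
        return self.facts[self._latest[key]]

    def history(self, key) -> list:
        """All versions for a key, oldest first."""
        return [f for f in self.facts if f.key == key]
```

Because each superseded fact keeps its sentence links, a caller can follow them back to the original wording and compare it with the statement that replaced it.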

Performance and Future

On LongMemEval-S, Vektori achieved 73% accuracy at L1 depth using BGE-M3 + Gemini 2.5 Flash-Lite. Multi-hop conflict resolution, where you reason about how a fact changed over time, is where triple-based systems (subject-predicate-object) collapse. The next layer involves storing why: causal edges between events ("user corrected X, agent updated Y, user disputed again"), extracted asynchronously and queryable as a graph. Agent trajectories become memory: the agent's own behavior becomes part of what it can reason about.
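Since the causal layer is described as future work, the following is only a speculative sketch of what "causal edges queryable as a graph" could mean. The CausalGraph class, its method names, and the relation labels are invented for illustration.

```python
class CausalGraph:
    """Events as nodes, with directed cause -> effect edges carrying a relation label."""

    def __init__(self):
        self.events = {}   # event id -> description
        self._causes = {}  # effect id -> list of (cause id, relation)

    def add_event(self, event_id: str, description: str) -> None:
        self.events[event_id] = description

    def add_edge(self, cause: str, effect: str, relation: str) -> None:
        self._causes.setdefault(effect, []).append((cause, relation))

    def why(self, event_id: str) -> list:
        """Walk cause edges backwards from an event and return the descriptions.

        For a simple linear chain the result is ordered from root cause to the
        queried event; for branching graphs the order is not guaranteed.
        """
        chain, seen, stack = [], set(), [event_id]
        while stack:
            current = stack.pop()
            if current in seen:
                continue  # guard against cycles such as repeated disputes
            seen.add(current)
            chain.append(self.events[current])
            stack.extend(cause for cause, _ in self._causes.get(current, []))
        return list(reversed(chain))
```

Asking why for the last event in the chain from the example above ("user corrected X", "agent updated Y", "user disputed again") would return all three descriptions, which is the kind of multi-hop trace a flat triple store cannot answer directly.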

📖 Read the full source: r/ClaudeAI