Engram Memory SDK: Graph-Based Memory for AI Agents with Local Models

✍️ OpenClawRadar | 📅 Published: April 14, 2026

Graph Memory SDK for Local AI Models

Engram Memory SDK is an open-source graph memory system for AI agents. It works with local models through its LiteLLM integration. The core architecture separates ingestion from recall: the LLM is called only during ingestion, to extract entities and relationships, while recall runs on vector search, graph traversal, and scoring alone, with no further LLM calls.
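To make the LLM-free recall path concrete, here is a minimal sketch of the three stages named above (vector search, graph traversal, scoring) in plain Python. It is an illustration only, not Engram's implementation: the `recall` function, the `hop_weight` parameter, and the in-memory dictionaries standing in for Neo4j are all assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recall(query_vec, embeddings, edges, top_k=1, hop_weight=0.5):
    """Hypothetical recall: no LLM call anywhere in this function."""
    # 1. Vector search: score every stored node against the query embedding.
    sims = {node: cosine(query_vec, vec) for node, vec in embeddings.items()}
    seeds = sorted(sims, key=sims.get, reverse=True)[:top_k]
    # 2. Graph traversal: pull in one-hop neighbors of the best matches.
    scores = {node: sims[node] for node in seeds}
    for seed in seeds:
        for neighbor in edges.get(seed, ()):
            # 3. Scoring: a neighbor inherits a discounted share of its seed's score.
            scores[neighbor] = max(scores.get(neighbor, 0.0), hop_weight * sims[seed])
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy data: "acme" is unrelated to the query by embedding, but linked to "alice".
embeddings = {"alice": [1.0, 0.0], "acme": [0.0, 1.0], "paris": [0.0, 1.0]}
edges = {"alice": ["acme"]}
results = recall([1.0, 0.0], embeddings, edges)
print(results)  # [('alice', 1.0), ('acme', 0.5)]
```

In this toy run, "acme" is returned only because of its edge to "alice", while "paris" has the same embedding but no edge and is left out. That is the kind of result vector search alone would miss.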

Technical Details

The SDK is written in async Python and uses Neo4j as its backend database. According to the source, ingestion averages ~735 tokens per operation and recall latency is 95 ms. Memory is self-restructuring: decay and clustering run in the background.
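The source does not say how decay is computed. As one plausible reading, the sketch below applies exponential half-life decay to a memory's strength and prunes anything that falls under a threshold. The function names, the one-week half-life, and the 0.1 threshold are all assumptions chosen for illustration; clustering is not shown.

```python
WEEK = 7 * 24 * 3600  # assumed half-life, in seconds

def decayed_strength(strength, age_seconds, half_life_seconds=WEEK):
    """Halve the strength for every half-life that has elapsed."""
    return strength * 0.5 ** (age_seconds / half_life_seconds)

def prune(memories, now, threshold=0.1):
    """Keep only memories whose decayed strength is still at or above threshold.

    memories maps a name to a (strength, last_access_timestamp) pair.
    """
    return {
        name: (strength, last_access)
        for name, (strength, last_access) in memories.items()
        if decayed_strength(strength, now - last_access) >= threshold
    }

now = 100 * WEEK
memories = {"fresh": (1.0, now), "stale": (1.0, now - 10 * WEEK)}
kept = prune(memories, now)
print(sorted(kept))  # ['fresh']
```

A background task could run a pass like `prune` on a timer, which keeps the work out of the recall path.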

Setup and Installation

Installation is straightforward:

pip install engram-memory-sdk

Configuration requires a .env file with these variables:

LLM_MODEL=ollama/llama3 # or any LiteLLM-supported local model
NEO4J_URI=bolt://localhost:7687

The system supports any model available through LiteLLM, including local deployments served by Ollama, vLLM, and text-generation-webui. The key advantage is cost: a small local model handles extraction at ingestion time, and recall consumes no LLM tokens at all, so ongoing recall adds no per-query LLM cost.
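As a rough illustration of how the two settings above might be read at startup, the sketch below parses .env-style text using only the standard library. The `parse_env` helper is hypothetical; the source does not describe how Engram actually loads its configuration. The split of `LLM_MODEL` on the first slash reflects LiteLLM's provider/model naming convention.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split(" #", 1)[0].strip()  # drop a trailing inline comment
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings

ENV_TEXT = """
LLM_MODEL=ollama/llama3 # or any LiteLLM-supported local model
NEO4J_URI=bolt://localhost:7687
"""

settings = parse_env(ENV_TEXT)
provider, model = settings["LLM_MODEL"].split("/", 1)
print(provider, model, settings["NEO4J_URI"])  # ollama llama3 bolt://localhost:7687
```

Switching to a different local backend would then be a one-line change to `LLM_MODEL` in the .env file, with no code edits.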

📖 Read the full source: r/LocalLLaMA
