Reflect MCP Server Implements Reflexion Paper for Persistent Coding Agent Memory

✍️ OpenClawRadar | 📅 Published: April 16, 2026

A developer has implemented the Reflexion paper (Shinn et al., NeurIPS 2023) as an MCP server to address a common problem with local coding agents: lack of persistent memory between sessions. The tool, called reflect-mcp, allows agents to remember and avoid repeating mistakes.

How It Works

The system operates through a structured workflow:

After every test failure, the agent critiques its own work and extracts patterns from the error
These lessons are stored for future reference
Before starting new tasks, the agent recalls past lessons using full-text search
The pattern matching is fully regex-based, so no LLM calls are needed for classification

The developer notes that error messages are predictable enough for deterministic matching to work effectively. The agent writes the critique because it has the context, while the server handles structuring and deduplicating the lessons.
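The division of labor described above (regex classification of the error, agent-written critique, server-side deduplication) can be illustrated with a minimal Python sketch. This is not reflect-mcp's actual code: the pattern table, the labels, the `classify` and `lesson_key` helpers, and the `LessonStore` class are all hypothetical names chosen for illustration, and the real tool may structure lessons quite differently.

```python
import hashlib
import re

# Hypothetical pattern table (label, regex). Illustrative rules only,
# not the rules shipped with reflect-mcp.
ERROR_PATTERNS = [
    ("unwrap_on_none", re.compile(r"called `Option::unwrap\(\)` on a `None` value")),
    ("unwrap_on_err", re.compile(r"called `Result::unwrap\(\)` on an `Err` value")),
    ("timezone_naive", re.compile(
        r"can't (?:compare|subtract) offset-naive and offset-aware datetimes")),
    ("assertion_failed", re.compile(r"assertion.*failed", re.IGNORECASE)),
]


def classify(error_text: str) -> str:
    """Return the label of the first regex that matches; no LLM call involved."""
    for label, pattern in ERROR_PATTERNS:
        if pattern.search(error_text):
            return label
    return "unclassified"


def lesson_key(pattern: str, critique: str) -> str:
    """Build a dedup key from the pattern label and a whitespace/case-normalized critique."""
    normalized = " ".join(critique.lower().split())
    return hashlib.sha256(f"{pattern}:{normalized}".encode("utf-8")).hexdigest()


class LessonStore:
    """In-memory stand-in for the server side: structures and deduplicates lessons."""

    def __init__(self):
        self._lessons = {}

    def record(self, error_text: str, critique: str) -> dict:
        pattern = classify(error_text)
        key = lesson_key(pattern, critique)
        # A repeated lesson bumps a counter instead of creating a duplicate entry.
        entry = self._lessons.setdefault(
            key, {"pattern": pattern, "critique": critique, "count": 0}
        )
        entry["count"] += 1
        return entry
```

In this sketch, two failures that trigger the same regex and carry essentially the same critique collapse into one lesson whose `count` grows, which is one simple way recurring mistakes could become visible.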

Technical Implementation

Built as an MCP (Model Context Protocol) server
Uses SQLite with FTS5 for storage and search
Works with any MCP-compatible client
Install via: cargo install reflect-mcp
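To make the storage and recall step concrete, here is a small Python sketch of storing lessons in an SQLite FTS5 table and searching them before a new task. It assumes the local SQLite build includes FTS5, and the table name `lessons`, its columns, and the `open_store`, `remember`, and `recall` helpers are hypothetical; the actual schema used by reflect-mcp is not described in the source.

```python
import re
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database with a hypothetical FTS5 table for lessons."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS lessons USING fts5(pattern, critique)"
    )
    return conn


def remember(conn: sqlite3.Connection, pattern: str, critique: str) -> None:
    """Store one lesson; both columns are indexed for full-text search."""
    conn.execute(
        "INSERT INTO lessons (pattern, critique) VALUES (?, ?)", (pattern, critique)
    )
    conn.commit()


def recall(conn: sqlite3.Connection, task_description: str, limit: int = 5) -> list:
    """Return lessons matching any word of the task description, best match first."""
    terms = re.findall(r"\w+", task_description)
    if not terms:
        return []
    # Quote every term so words in the task text are never parsed as FTS5 operators.
    query = " OR ".join(f'"{term}"' for term in terms)
    return conn.execute(
        "SELECT pattern, critique FROM lessons "
        "WHERE lessons MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
```

A usage example under the same assumptions: after `remember(conn, "unwrap_on_err", "Handle parse errors on user input instead of calling unwrap")`, a call such as `recall(conn, "Add parsing of user input for the date field")` would surface that lesson before the agent starts the task.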

Results After One Week

The developer reported several improvements in their coding agent's behavior:

Stopped repeating the same unwrap() call on user input
Stopped forgetting timezone handling
Started avoiding previously seen failure patterns automatically
Pattern tracking made recurring mistakes across the project visible

The project is available on GitHub at https://github.com/rohansx/reflect. The developer is seeking feedback from others who have experimented with persistent memory setups for local coding agents.

📖 Read the full source: r/LocalLLaMA
