How to Build a Dual Memory System with Markdown for AI Agents

Core Method: Chatting in Files Instead of the Chat Window

The developer uses Claude Code but avoids the standard chat interface. Instead, they instruct the agent to create a {topic}_LOG.md file where all important discussions are conducted and persisted. In the chat window, they write only /response, which tells Claude to read the current discussion file and reply there. The chat itself is reserved for trivial asides that don't need to persist.

File Structure and Annotation

Responses are typically added at the bottom of the LOG file like normal chat, but comments can also be inserted inline to respond to specific points. This is particularly useful for parallel clarification during project planning. To maintain clarity on rereading, all human comments are marked with C: to distinguish them from Claude's contributions.
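Because human comments carry a fixed C: prefix, they are easy to pull out mechanically. The following is a minimal sketch of that idea, not the author's tooling: the function name `human_comments` and the sample log text are made up for illustration, and it assumes every line that does not start with C: belongs to Claude.

```python
from typing import List, Tuple

HUMAN_PREFIX = "C:"  # marker described in the text; all other lines are assumed to be Claude's


def human_comments(log_text: str) -> List[Tuple[int, str]]:
    """Return (1-based line number, comment text) for every line marked with C:."""
    found = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith(HUMAN_PREFIX):
            found.append((number, stripped[len(HUMAN_PREFIX):].strip()))
    return found


# Invented sample content, only to show the marker in use.
sample_log = """Plan: store sessions in Redis.
C: Why Redis and not Postgres?
Redis gives us TTLs for free.
C: OK, go ahead.
"""
comments = human_comments(sample_log)
```

Keeping the line numbers alongside the text matters here, since the summary file refers back to the LOG by line number.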

Dual Memory System Architecture

In addition to the LOG file, Claude is instructed to create and maintain a {topic}_SUMMARY.md file. This summary contains references to the original LOG with line numbers, since the LOG often becomes too large to fit in the agent's context window. The summary acts as high-level declarative memory, while the LOG serves as detailed procedural memory.
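The source does not say how the line references are written inside the summary. As a sketch under that caveat, suppose each summary entry looks like "subject: LOG lines 100-200, 500-800". That entry format, the `ENTRY` pattern, and `parse_summary` are all assumptions made for this example. Under them, the summary can be turned into a lookup table from subject to line ranges:

```python
import re
from typing import Dict, List, Tuple

# Hypothetical entry format: "<subject>: LOG lines 100-200, 500-800"
ENTRY = re.compile(r"^(?P<subject>[^:]+):\s*LOG lines\s+(?P<ranges>[\d\s,\-]+)$")


def parse_summary(summary_text: str) -> Dict[str, List[Tuple[int, int]]]:
    """Map each subject in the summary to its (start, end) LOG line ranges."""
    index: Dict[str, List[Tuple[int, int]]] = {}
    for line in summary_text.splitlines():
        match = ENTRY.match(line.strip())
        if not match:
            continue  # free-form summary prose is ignored
        ranges = []
        for part in match.group("ranges").split(","):
            part = part.strip()
            if not part:
                continue
            start, end = part.split("-")
            ranges.append((int(start), int(end)))
        index[match.group("subject").strip()] = ranges
    return index


# Invented sample content.
sample_summary = """Caching strategy: LOG lines 100-200, 500-800
Open questions: LOG lines 810-842
"""
index = parse_summary(sample_summary)
```

In practice the agent reads the summary directly, so no parser is strictly needed. A structured format like this mainly helps when scripts or subagents have to check the references.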

Agent Startup and Memory Management

When a new agent starts or after compaction, the process is:

1. User provides context: "Your task is to continue conversation {topic}. We will focus on X."
2. Agent reads {topic}_SUMMARY.md to understand what's important.
3. Summary indicates where X was discussed (e.g., lines 100-200 and 500-800 of the LOG).
4. Agent loads those specific LOG lines plus the last hundred lines for recent context.
5. Agent can decide autonomously when to look up details mentioned in the summary.
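The selective loading step can be sketched as follows. This is an illustration rather than the author's implementation: `select_context` and its parameters are hypothetical names, ranges are treated as 1-based and inclusive, and overlapping ranges are merged so that no line is loaded twice.

```python
from typing import Iterable, List, Tuple


def select_context(
    log_lines: List[str],
    ranges: Iterable[Tuple[int, int]],
    tail: int = 100,
) -> List[Tuple[int, str]]:
    """Return (line number, text) for the referenced ranges plus the last `tail` lines."""
    total = len(log_lines)
    # Start with the most recent lines for recent context.
    wanted = set(range(max(1, total - tail + 1), total + 1))
    # Add each referenced range, clamped to the actual length of the LOG.
    for start, end in ranges:
        wanted.update(range(max(1, start), min(end, total) + 1))
    return [(number, log_lines[number - 1]) for number in sorted(wanted)]


# A stand-in LOG of 1000 lines, using the ranges from the example in the text.
log_lines = [f"entry {n}" for n in range(1, 1001)]
context = select_context(log_lines, [(100, 200), (500, 800)])
```

With the ranges from the example, the agent would load 502 of the 1000 lines: 101 and 301 from the two referenced ranges, plus the final 100.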

Maintenance and Quality Control

Simple subagents periodically scan the summaries to ensure they stay synchronized with their corresponding logs. Summaries of different topics cross-reference each other where appropriate, giving any worker agent a way to look up additional details. Agents also flag any C: comments that were never addressed, so no question gets missed.
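In the source these checks are carried out by subagents, and deciding whether a comment was really addressed takes judgment. Two narrow pieces can nonetheless be checked mechanically, as in this sketch. Both function names are hypothetical. The second check is deliberately conservative: it reports only C: lines at the very end of the LOG, which nothing can have answered yet.

```python
from typing import Iterable, List, Tuple


def stale_references(ranges: Iterable[Tuple[int, int]], log_length: int) -> List[Tuple[int, int]]:
    """Return summary ranges that no longer fit inside the LOG."""
    return [(s, e) for s, e in ranges if s < 1 or e > log_length or s > e]


def trailing_human_comments(log_lines: List[str]) -> List[int]:
    """Return 1-based numbers of C: lines that follow the last non-C: line."""
    pending = []
    for number in range(len(log_lines), 0, -1):
        text = log_lines[number - 1].strip()
        if not text:
            continue  # skip blank lines
        if text.startswith("C:"):
            pending.append(number)
        else:
            break  # reached the agent's last contribution
    return sorted(pending)


# Invented sample content.
log_lines = [
    "Plan drafted.",
    "C: Which database?",
    "Postgres.",
    "C: And the cache?",
    "",
    "C: Also, who owns deploys?",
]
stale = stale_references([(1, 3), (5, 9)], len(log_lines))
pending = trailing_human_comments(log_lines)
```

Inline comments inserted in the middle of the LOG cannot be judged by position alone, so that part would still fall to an agent that reads the surrounding text.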

Fallback and Documentation Benefits

For maximum reliability regardless of token cost, a new agent can be instructed to reread the entire LOG file. This still consumes fewer tokens than the original session did, since the LOG excludes other operations such as reading Python files or browsing the web. As a bonus, the LOG files serve as thorough documentation for other people working on the same project.

📖 Read the full source: r/ClaudeAI