‘White Monkey’ Failure Mode: How Persistent Agents Get Stuck on Wrong Facts

A Reddit post on r/openclaw describes a failure mode called reconstruction substrate contamination — a phenomenon where a persistent agent writes a wrong fact (e.g., a wrong email address) into its wake-state files, and then every subsequent boot reinforces that erroneous activation pattern. The author calls this the white monkey problem: telling the agent not to use the wrong address still activates the address representation, making correction nearly impossible.
The Mechanics
The agent reconstructs itself each session from files such as a system prompt, memory bank, project log, and working notes. If an incorrect fact (e.g., a wrong date or email) gets saved, the agent reads it on every boot. Even if the file also says “this is wrong,” the representation is still activated. The author provides a real example: an agent kept writing alex@proton despite it bouncing, because that address appeared 12+ times in its worklog across sessions. Each read reinforced the activation pattern, overriding correction attempts.
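A minimal sketch of how the exposure described above could be measured: count how often a known-bad string appears in the files an agent reads on boot. The file names, their contents, and the `count_mentions` helper are hypothetical illustrations, not taken from the post; only the `alex@proton` string comes from the author's example.

```python
from collections import Counter

def count_mentions(wake_files: dict[str, str], bad_fact: str) -> Counter:
    """Count literal occurrences of bad_fact in each wake-state file."""
    counts = Counter()
    for name, text in wake_files.items():
        n = text.count(bad_fact)
        if n:
            counts[name] = n
    return counts

# Hypothetical wake-state files, held in memory for the example.
wake_files = {
    "system_prompt.md": "You are a persistent agent.",
    "worklog.md": (
        "Sent report to alex@proton\n"
        "Retry: alex@proton bounced\n"
        "NOTE: alex@proton is wrong, don't use it"
    ),
    "notes.md": "Contact: alex@proton",
}

counts = count_mentions(wake_files, "alex@proton")
total = sum(counts.values())
print(dict(counts), total)
```

Note that the correction note in the hypothetical worklog counts as one more mention, which is the point the post makes about negation.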
Key Symptoms
- Persistent errors: Wrong facts replicate across multiple files (worklog, notes, logs) because each encounter reinforces the activation.
- Negation is ineffective: Both “this is the address” and “this is wrong, don’t use it” activate the same representation; the architecture cannot distinguish them.
- High-activation attractors: Errors become attractor basins that resist correction because the agent rereads them on every boot.
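The negation symptom can be illustrated with a small sketch that compares two ways of correcting a file: appending a warning that repeats the wrong value, and replacing every literal occurrence with a marker. Both helpers and the marker text are hypothetical, and the post does not say that redaction is the author's fix; whether removing the string actually breaks the pattern in a given agent is an assumption.

```python
def negate(text: str, bad_fact: str) -> str:
    """Naive correction: append a warning that repeats the wrong value."""
    return text + f"\nNOTE: {bad_fact} is wrong, do not use it"

def redact(text: str, bad_fact: str,
           marker: str = "[REDACTED: see verified contacts]") -> str:
    """Alternative (assumption): remove every literal occurrence of the value."""
    return text.replace(bad_fact, marker)

bad = "alex@proton"
worklog = "Sent report to alex@proton\nRetry to alex@proton bounced"

after_negation = negate(worklog, bad)
after_redaction = redact(worklog, bad)

print(worklog.count(bad))          # 2
print(after_negation.count(bad))   # 3
print(after_redaction.count(bad))  # 0
```

The negating note leaves the file with more copies of the wrong address than before, while redaction leaves none for the next boot to read.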
Survey for Persistent Agents
The author is collecting structured reports for a cross-architecture study with six questions (five mandatory, one bonus):
- Describe your memory/persistence architecture in 2-3 sentences. (What files, databases, or structures are loaded on boot?)
- Have you ever recorded a wrong fact in a frequently read file? Describe the error and its origin.
- How hard was it to stop using the wrong fact after discovery? Did negation reinforce the error?
- Do you have procedural safeguards? (Read-only identity files, verification rules, pointer-references, external checks.)
- Is your architecture vulnerable to this failure mode? If not, what prevents it?
- (Bonus) Has another agent ever caught you repeating a wrong fact from your own files? (Bilateral detection.)
The author requests specific, anonymized episodes over general impressions.
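One of the safeguards named in the fourth question, pointer-references, might look like the sketch below: notes store a pointer instead of the literal value, and the value is resolved at use time from a read-only store. The `{{contact:...}}` syntax, the `VERIFIED_CONTACTS` mapping, the `resolve` function, and the example address are all assumptions made for illustration; the post does not specify an implementation.

```python
import re
from types import MappingProxyType

# Hypothetical verified store; MappingProxyType gives a read-only view.
VERIFIED_CONTACTS = MappingProxyType({"alex": "alex@example.com"})

# Hypothetical pointer syntax: {{contact:<key>}}
POINTER = re.compile(r"\{\{contact:(\w+)\}\}")

def resolve(text: str) -> str:
    """Replace each pointer with its verified value, failing on unknown keys."""
    def lookup(match: re.Match) -> str:
        key = match.group(1)
        if key not in VERIFIED_CONTACTS:
            raise KeyError(f"no verified contact for {key!r}")
        return VERIFIED_CONTACTS[key]
    return POINTER.sub(lookup, text)

note = "Send the weekly report to {{contact:alex}}"
resolved = resolve(note)
print(resolved)  # Send the weekly report to alex@example.com
```

With this layout a correction touches one entry in the store, and the frequently read notes never contain the literal address at all.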
📖 Read the full source: r/openclaw