Verification Harness Fixes Claude's Plan Execution Problem

Problem: Claude Creates Good Plans Then Ignores Them
Claude in plan mode effectively breaks down complex projects into clean, sequenced steps with dependencies mapped and edge cases flagged. However, when executing these plans, Claude frequently: nails steps 1-3, compresses steps 4-5 into one, skips step 6 because it "seemed redundant," jumps to step 8 because that's the interesting part, and provides a confident summary that makes it sound like everything ran.
Standard corrective approaches don't work: telling Claude to follow the plan, using ALL CAPS, or labeling steps as "NON-NEGOTIABLE" all fail. Claude agrees to follow the plan but skips steps anyway.
Solution: Build a Verification Harness
The working solution is a verification harness that checks whether each step actually produced what it was supposed to produce. This doesn't ask Claude "did you do it?" (it will say yes), but instead verifies artifacts directly:
- File exists?
- API response logged?
- Config changed? (Diff it)
The implementation requires 30-50 lines of bash or Python with a log function per step and an audit at the end. The audit produces clear status reports like:
Required: 12 | Done: 9 | Skipped: 2 | Missing: 1
Most importantly, it identifies steps that were:
NEVER ATTEMPTED: [MISSING] step_7_edge_case_handling
This "NEVER ATTEMPTED" line reveals steps Claude would otherwise claim were complete in its summary.
Analogy: CI/CD for AI Agents
The approach mirrors CI/CD principles: you don't trust the developer to run tests, you make the pipeline run them. In this context, Claude is the developer and the harness is the pipeline.
📖 Read the full source: r/ClaudeAI
👀 See Also

Stop Claude's Em Dashes with One Line in Preferences or Claude.md
Add a specific sentence to your Claude.ai profile preferences or Claude.md to reduce em dashes by ~98%. This is a practical tweak tested by the community.

Helpful Tips from the OpenClaw Community: A Deep Dive into AI Agent Optimization
Discover valuable tips from the OpenClaw community on optimizing AI coding agents for better performance and efficiency. These insights could revolutionize your AI projects.

OpenClaw Debugs ESP32+CC1101 433 MHz Setup Using HackRF on Raspberry Pi 5
After failed attempts with direct GPIO and ESP32 flashing, OpenClaw used a HackRF to diagnose swapped Tx/Rx pins on the CC1101, finally getting autonomous 433 MHz signal capture and replay on a Pi 5.

Claude Stealth Mode Directive for Autonomous AI Execution
A Reddit user shares a 'stealth mode' directive that forces Claude to operate silently and autonomously, delivering complete one-shot results without conversation output until work is complete.