LLMs Leak Reasoning into Structured Output Despite Explicit Instructions

The Problem: LLM Validation Passes Leak Reasoning
A developer building a tool that makes parallel API calls to Claude and parses structured output per call encountered an intermittent issue. Each call returns content inside specific markers like [COVER], [SLIDE 1], [CAPTION], etc. A second LLM pass validates the output against rules and rewrites anything that fails.
The validation prompt explicitly states: "return ONLY the corrected text in the exact same format. No commentary. No reasoning. No violation lists."
Despite this, the validation model occasionally outputs its reasoning before the corrected content. Examples include: "I need to check this text for violations... These sentences form a stacked dramatic pair used purely for effect. Here is the rewrite:" followed by the actual corrected text.
Downstream Consequences
This reasoning text gets passed straight to the parser. The parser expects content starting at [COVER] but instead receives meta-commentary. This causes field misalignment downstream. In one case, the validator's reasoning text ended up inside an image prompt field because the parser consumed the reasoning as body content, shifting everything down by a few lines.
Prompt tightening alone didn't fix the issue. Making the instructions more explicit, adding "your output MUST start with the first content marker," and adding "never include reasoning" reduced the frequency but didn't eliminate it. The model still occasionally ignores the instructions, especially when it finds violations to fix, as if it wants to show its work.
The Solution: Two-Layer Defense
The fix that worked involved two layers:
- Layer 1: Prompt tightening. Still worth doing because it reduces how often the problem occurs.
- Layer 2: A defensive strip function that runs on every validation output before any parsing happens. For structured formats, it anchors to the first recognized marker and throws away everything before it. For plain-text formats, it strips lines matching known validator commentary patterns (things like "Let me check this text" or "This violates the constraint").
The strip-before-parse ordering is key. Every downstream parser operates on already-sanitized output. This avoids maintaining per-field stripping logic or playing whack-a-mole with new reasoning formats.
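The structured-format half of Layer 2 can be sketched as follows. This is a minimal illustration, not the developer's actual code: the function names `strip_preamble` and `parse_sections` and the exact marker regex are assumptions, built only from the markers named above ([COVER], [SLIDE 1], [CAPTION]).

```python
import re

# Assumed marker set, based on the examples in the text; extend as needed.
MARKER_RE = re.compile(r"^\[(?:COVER|SLIDE \d+|CAPTION)\]", re.MULTILINE)


def strip_preamble(raw: str) -> str:
    """Discard everything before the first recognized marker."""
    match = MARKER_RE.search(raw)
    if match is None:
        # No marker at all: return the text unchanged so the caller can
        # decide whether to retry or fail loudly.
        return raw
    return raw[match.start():]


def parse_sections(raw: str) -> dict[str, str]:
    """Sanitize first, then split the text into {marker: body}."""
    clean = strip_preamble(raw)  # strip-before-parse ordering
    matches = list(MARKER_RE.finditer(clean))
    sections: dict[str, str] = {}
    for i, m in enumerate(matches):
        # A section body runs until the next marker or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(clean)
        sections[m.group(0)] = clean[m.end():end].strip()
    return sections


leaky = (
    "I need to check this text for violations...\n"
    "Here is the rewrite:\n"
    "[COVER]\nTitle\n[SLIDE 1]\nBody text\n[CAPTION]\nCap"
)
print(parse_sections(leaky))
```

Because the sanitizing step lives at the top of the parse path, any reasoning format the model invents is dropped as long as it appears before the first marker.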
Implementation Considerations
For plain-text strip patterns, careful design is needed. A regex that catches "This is a violation" could also catch "This is a common mistake" in legitimate content. Patterns should be tightened to match only validator-specific language, like "This violates the/a rule/constraint" rather than broad matches on "This is" or "This uses." Each pattern needs auditing against real content before deployment.
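A rough sketch of the plain-text strip with tightened patterns, plus a small audit helper for checking them against real content. The pattern list and the names `strip_commentary` and `audit` are hypothetical; only the quoted phrases come from the text above, and a real deployment would need its own list.

```python
import re

# Assumed validator-specific phrases; deliberately narrower than "This is ...".
COMMENTARY_PATTERNS = [
    re.compile(r"^\s*Let me check this text\b", re.IGNORECASE),
    re.compile(r"^\s*I need to check this text\b", re.IGNORECASE),
    re.compile(r"^\s*This violates (?:the|a) (?:rule|constraint)\b", re.IGNORECASE),
    re.compile(r"^\s*Here is the rewrite:?\s*$", re.IGNORECASE),
]


def strip_commentary(text: str) -> str:
    """Drop lines that match a known validator-commentary pattern."""
    kept = [
        line
        for line in text.splitlines()
        if not any(p.search(line) for p in COMMENTARY_PATTERNS)
    ]
    return "\n".join(kept).strip()


def audit(samples: list[str]) -> list[tuple[str, str]]:
    """Return (pattern, line) pairs where a pattern hits known-good content."""
    return [
        (p.pattern, line)
        for sample in samples
        for line in sample.splitlines()
        for p in COMMENTARY_PATTERNS
        if p.search(line)
    ]


output = (
    "Let me check this text for violations.\n"
    "This violates the constraint on stacked pairs.\n"
    "Here is the rewrite:\n"
    "This is a common mistake in onboarding flows.\n"
    "This uses plain language."
)
print(strip_commentary(output))
```

Running `audit` over a corpus of legitimate content before deployment should return an empty list; any hit points to a pattern that is still too broad.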
If you're parsing structured output from an LLM, treat prompt instructions as a best-effort first pass and always have a code-level defense before the parser. The model may comply roughly 95% of the time, but the remaining 5% will break downstream logic in ways that are hard to reproduce because the failures are intermittent.
📖 Read the full source: r/ClaudeAI