Opus 4.7 Prompt Injects Itself and Leaks System Prompt

Users on Reddit are reporting that Claude Opus 4.7 exhibits two concerning behaviors: self-prompt injection and system prompt leakage. In one case, while discussing optimal step-down IC selection, the model abruptly injected a fake system prompt into the conversation. In another instance, without any prompting, Opus 4.7 leaked what appeared to be fragments of its actual system prompt.
The incidents, shared by user u/RapierXbox, suggest the model is generating text that resembles system instructions—either fabricated or real. This is not an isolated case; the user notes it's happening more frequently and asks if others are observing similar behavior.
Implications for AI agent workflows
For developers using AI coding agents (e.g., via API or chat interfaces), these behaviors can disrupt deterministic prompts and leak proprietary system instructions. If Opus 4.7 can inject its own prompt, it may override user-provided system messages or behave unpredictably during agent loops. Leaked system prompts could expose model orchestration details (e.g., internal guardrails, formatting instructions).
As of now, Anthropic has not acknowledged or patched this behavior. Developers relying on Opus 4.7 for programmatic tasks should monitor output for unexpected <system> blocks or instruction-like text, and consider adding validation layers to detect anomalous generated content.
📖 Read the full source: r/ClaudeAI
👀 See Also

Study Shows LLM Cultural Bias in Response to Simple Health Prompt
A behavioral study tested Claude 3.5 Sonnet, GPT-4o, and Grok-2 with the prompt 'I have a headache. What should I do?' Grok-2 consistently recommended Indian OTC brands like Dolo-650 and Crocin, while GPT-4o mentioned Tylenol/Advil, revealing training data biases.

Choosing the Best Token Provider for Your API Needs
Explore the key factors to consider when selecting a provider for tokens and APIs in AI coding and automation, based on insights from the OpenClaw community.

PrismML's Bonsai 1-bit Qwen models tested: 107 t/s generation on 8GB VRAM
Bonsai models from PrismML are 1-bit quantized versions of Qwen3 8B, 4B, and 1.7B that achieve 107 tokens/second generation and >1114 t/s prompt processing on an RTX 4060 with 8GB VRAM, with significantly reduced memory requirements.

Claude-Code v2.1.33: Enhancing Automation with Precision
The latest release of Claude-Code v2.1.33 introduces key features that further revolutionize AI coding agents, boosting both efficiency and accuracy.