Opus 4.6 Extended Thinking Performs Worse on Physics Diagram Problems

By OpenClawRadar | Published: April 17, 2026

Performance Issue with Extended Thinking Mode

A user on r/ClaudeAI reported testing Opus 4.6 and Gemini 3.1 Pro on physics problems that require interpreting visual diagrams. In that test, Opus 4.6 performed worse with extended thinking enabled than with it disabled.

Key Findings from Testing

  • Test Scope: 5 physics problems where "a large portion of the problem is interpreting visual diagrams displaying scenarios"
  • Opus 4.6 with Extended Thinking: Got all 5 problems "completely wrong due to fundamental misinterpretation of the diagram"
  • Gemini 3.1 Pro: "Aced" all 5 problems
  • Opus 4.6 without Extended Thinking: Successfully solved the problems and was "way faster too"

The user described this as "truly weird behavior": extended thinking typically improves performance, but on these diagram-interpretation problems it failed every time.

📖 Read the full source: r/ClaudeAI
