Opus 4.6 Extended Thinking Performs Worse on Physics Diagram Problems

✍️ OpenClawRadar · 📅 Published: April 17, 2026

Performance Issue with Extended Thinking Mode

A user on r/ClaudeAI reported testing Opus 4.6 and Gemini 3.1 Pro on physics problems that require interpreting visual diagrams. In that test, Opus 4.6 performed markedly worse with extended thinking mode enabled than with it disabled.

Key Findings from Testing

Test Scope: 5 physics problems where "a large portion of the problem is interpreting visual diagrams displaying scenarios"
Opus 4.6 with Extended Thinking: Got all 5 problems "completely wrong due to fundamental misinterpretation of the diagram"
Gemini 3.1 Pro: "Aced" all 5 problems
Opus 4.6 without Extended Thinking: Successfully solved the problems and was "way faster too"

The user described this as "truly weird behavior": extended thinking typically improves performance, yet on these diagram-interpretation problems it failed consistently.
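The post does not share its prompts, grading method, or code, so the following is only a minimal sketch of how a comparison like the one above could be tallied. Everything here is an assumption for illustration: the `Problem` fields, the `accuracy` and `compare` helpers, the exact-match grading rule, and the idea that each configuration (for example Opus 4.6 with and without extended thinking) is wrapped in a plain callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Problem:
    """One diagram-based problem (all fields are hypothetical)."""
    image_path: str  # path to the diagram image
    question: str    # text of the physics question
    expected: str    # reference answer used for grading


# A solver is any callable that takes a problem and returns an answer string.
# In practice it would wrap a model call; that wrapper is not shown here.
Solver = Callable[[Problem], str]


def accuracy(solver: Solver, problems: List[Problem]) -> float:
    """Fraction of problems answered correctly.

    Grading is a case-insensitive exact match after trimming whitespace,
    which is an assumption; real physics answers usually need looser checks.
    """
    if not problems:
        return 0.0
    correct = sum(
        solver(p).strip().lower() == p.expected.strip().lower()
        for p in problems
    )
    return correct / len(problems)


def compare(solvers: Dict[str, Solver], problems: List[Problem]) -> Dict[str, float]:
    """Run every named solver on the same problem set and report accuracy."""
    return {name: accuracy(fn, problems) for name, fn in solvers.items()}
```

Running every configuration against the same problem set keeps the comparison like-for-like. With only five problems, as in the reported test, each accuracy figure moves in steps of 0.2.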

📖 Read the full source: r/ClaudeAI
