Qwen3.6 Plus benchmark comparison against Western SOTA models

✍️ OpenClawRadar 📅 Published: April 5, 2026

A Reddit post on r/LocalLLaMA compares Qwen3.6 Plus against three Western state-of-the-art models (GPT‑5.4, Claude Opus 4.6, and Gemini 3.1 Pro Preview) on four benchmarks: SWE-bench Verified, GPQA / GPQA Diamond, HLE (no tools), and MMMU-Pro.

Benchmark Results

The post reports the following scores:

Qwen3.6-Plus: SWE-bench Verified 78.8, GPQA / GPQA Diamond 90.4, HLE (no tools) 28.8, MMMU-Pro 78.8
GPT‑5.4 (xhigh): SWE-bench Verified 78.2, GPQA / GPQA Diamond 93.0, HLE (no tools) 39.8, MMMU-Pro 81.2
Claude Opus 4.6 (thinking heavy): SWE-bench Verified 80.8, GPQA / GPQA Diamond 91.3, HLE (no tools) 34.44, MMMU-Pro 77.3
Gemini 3.1 Pro Preview: SWE-bench Verified 80.6, GPQA / GPQA Diamond 94.3, HLE (no tools) 44.7, MMMU-Pro 80.5
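
To make the spread easier to read, the scores above can be turned into per-benchmark ranks and gaps to the leader. The following is a minimal sketch, not part of the original post: the dictionary layout and the helper names (SCORES, leader, rank, gap_to_leader) are illustrative assumptions, and only the numbers come from the list above.

```python
# Scores copied from the list above. The data layout and helper names
# are hypothetical, chosen only for this illustration.
BENCHMARKS = ["SWE-bench Verified", "GPQA / GPQA Diamond", "HLE (no tools)", "MMMU-Pro"]

SCORES = {
    "Qwen3.6-Plus": [78.8, 90.4, 28.8, 78.8],
    "GPT-5.4 (xhigh)": [78.2, 93.0, 39.8, 81.2],
    "Claude Opus 4.6 (thinking heavy)": [80.8, 91.3, 34.44, 77.3],
    "Gemini 3.1 Pro Preview": [80.6, 94.3, 44.7, 80.5],
}


def score(model: str, benchmark: str) -> float:
    """Look up one model's score on one benchmark."""
    return SCORES[model][BENCHMARKS.index(benchmark)]


def leader(benchmark: str) -> str:
    """Return the model with the highest score on the benchmark."""
    return max(SCORES, key=lambda m: score(m, benchmark))


def rank(model: str, benchmark: str) -> int:
    """Return the 1-based rank of the model on the benchmark (1 = best)."""
    ordered = sorted(SCORES, key=lambda m: score(m, benchmark), reverse=True)
    return ordered.index(model) + 1


def gap_to_leader(model: str, benchmark: str) -> float:
    """Return how many points the model trails the best score by."""
    best = max(score(m, benchmark) for m in SCORES)
    return round(best - score(model, benchmark), 2)


if __name__ == "__main__":
    for b in BENCHMARKS:
        print(f"{b}: leader={leader(b)}, "
              f"Qwen rank={rank('Qwen3.6-Plus', b)}, "
              f"gap={gap_to_leader('Qwen3.6-Plus', b)}")
```

On these numbers, Qwen3.6-Plus places third of four on SWE-bench Verified and MMMU-Pro and fourth on GPQA / GPQA Diamond and HLE (no tools). It trails the leader by 2.0 to 3.9 points on three benchmarks and by 15.9 points on HLE.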

The post includes a visual comparison chart available at: https://preview.redd.it/6kq4tt07yrsg1.png?width=714&format=png&auto=webp&s=ad8b207fb13729ae84f5b74cec5fd84a81dcface

User Assessment

The original poster notes that Qwen3.6 Plus is "competitive but not the bench" and states: "Will be my new model given how cheap it is, but whether it's actually good irl will depend more than benchmarks." They also observe that "Opus destroys all others despite being 3rd or 4th on artificalanalysis."

📖 Read the full source: r/LocalLLaMA
