Local Qwen 3.6 vs Frontier Models: Coding a Canvas Animation

A Reddit user on r/LocalLLaMA compared locally run quantized models against web-based frontier models on a single coding task: generating one HTML file containing a full-page canvas animation of a side-view car driving, with parallax scrolling, spinning wheels, and cinematic lighting.

The Prompt

The exact prompt asked for a single HTML file with no libraries, a full-page canvas, realistic side-view car animation, layered parallax scenery, spinning wheels, subtle body motion, smooth looping, and cohesive sky/lighting.
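
Two of the prompt's requirements, layered parallax and spinning wheels, come down to simple arithmetic. The sketch below is illustrative only and is not taken from any model's output; the function names and the depth, tile-width, and radius parameters are hypothetical. It assumes each scenery layer scrolls at a fraction of the car's travelled distance and wraps at its tile width so the loop stays seamless, and that a wheel rolling without slipping turns by distance divided by radius.

```typescript
// Illustrative sketch; names and parameters are assumptions, not from the thread.

/** Horizontal offset of a scenery layer, wrapped into [0, tileWidth). */
function layerOffset(distance: number, depth: number, tileWidth: number): number {
  // depth in (0, 1]: 1 scrolls at road speed, smaller values scroll slower (farther away).
  const raw = distance * depth;
  // The double modulo keeps the result non-negative even if distance is negative.
  return ((raw % tileWidth) + tileWidth) % tileWidth;
}

/** Wheel rotation in radians for a wheel rolling without slipping. */
function wheelAngle(distance: number, radius: number): number {
  return (distance / radius) % (2 * Math.PI);
}
```

In a canvas render loop, one plausible use is to draw each layer twice per frame, at -layerOffset(...) and at -layerOffset(...) + tileWidth, and to pass wheelAngle(...) to the 2D context's rotate() before drawing the wheel spokes.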

Models Tested

Frontier (web-based via Perplexity, tok/s not measured):

Claude Sonnet 4.6 Thinking (used internet for reasoning)
Gemini 3.1 Pro Thinking
GPT 5.4 Thinking
Kimi k2.6 Thinking

Local (Ryzen 5 5600, 24 GB DDR4-3200, RX 5700 XT 8GB):

Qwen3.5 9B Q4_K_M — ~50 tok/s
Qwen3.6-27B (Claude-opus-reasoning-distilled) Q4_K_M — 2.65 tok/s
Qwen3.6-27B Q4_K_M — 2.70 tok/s
Qwen3.6-31B A3B Q4_K_M — 12.13 tok/s
Gemma-4-31b-it — 1.91 tok/s
Qwen3.5 4B Q8 — 60 tok/s (used internet for reasoning)
Qwen3.5 4B Q4_K_M — 80 tok/s (used internet for reasoning)
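
To put these rates in practical terms, the sketch below converts tok/s into wall-clock generation time. The 4000-token output length is an assumption chosen for illustration (the thread does not report output sizes), and the type and function names are hypothetical. It also ignores prompt processing time.

```typescript
// Illustrative arithmetic only; the 4000-token output length is an assumption.

interface Throughput {
  model: string;
  tokensPerSecond: number; // rates quoted in the list above
}

/** Seconds needed to emit `outputTokens` at a steady decode rate. */
function generationSeconds(outputTokens: number, tokensPerSecond: number): number {
  if (tokensPerSecond <= 0) {
    throw new RangeError("tokensPerSecond must be positive");
  }
  return outputTokens / tokensPerSecond;
}

const assumedOutputTokens = 4000; // hypothetical size of one generated response

const measured: Throughput[] = [
  { model: "Qwen3.5 9B Q4_K_M", tokensPerSecond: 50 },
  { model: "Qwen3.6-27B Q4_K_M", tokensPerSecond: 2.7 },
  { model: "Qwen3.6-31B A3B Q4_K_M", tokensPerSecond: 12.13 },
  { model: "Gemma-4-31b-it", tokensPerSecond: 1.91 },
];

// Estimated minutes per response for each model under the assumption above.
const estimatedMinutes = measured.map((m) => ({
  model: m.model,
  minutes: generationSeconds(assumedOutputTokens, m.tokensPerSecond) / 60,
}));
```

Under that assumed length, a response would take roughly 25 minutes at 2.70 tok/s and about 5.5 minutes at 12.13 tok/s. Thinking tokens would lengthen both figures.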

Results & Subjective Ranking

The poster's subjective top three for this task, best first:

Kimi k2.6 Thinking — cleanest overall visual result
Qwen3.6-27B Q4_K_M (local) — stronger than expected; good parallax and road feel
Qwen3.6-27B (Claude-opus-reasoning-distilled) Q4_K_M (local) — close third

In the poster's judgment, the local 27B quant produced more natural motion and layering than some of the frontier outputs on this task. That surprised them: they had expected the frontier models to beat the local quants by a clearer margin.

The only change the poster made to each output was the HTML <title> tag, edited to record which model generated the file. The outputs are shared in the thread along with screenshots and GIFs of the running animations.

📖 Read the full source: r/LocalLLaMA
