Reddit user reports 18.8 tok/s CPU inference with Qwen 3 30B Q4 on Zen 4

By OpenClawRadar. Published: April 15, 2026

A Reddit user shared their experience testing local LLM inference on CPU instead of investing in expensive GPU hardware.

Key Details

The user was considering purchasing GPU hardware for local LLM inference, including:

P40 GPUs
V100 GPUs (almost bought an SXM2 version that doesn't plug into normal motherboards)
RTX 3090s (priced at $800+ due to AI demand)

After being advised to try CPU inference first, they tested:

Model: Qwen 3 30B Q4
Hardware: Zen 4 processor with DDR5 memory
Performance: 18.8 tokens per second on CPU
Expectation vs Reality: Expected 3-5 tok/s, got nearly 19 tok/s

The user noted that "Zen 4 + DDR5 is cracked for inference."
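A rough way to see why the expectation and the result differ: CPU token generation is usually limited by how fast weights can be streamed from RAM, so a ceiling on speed is memory bandwidth divided by the bytes read per token. The sketch below is a back-of-envelope estimate, not a measurement from the post. Every number in it is an assumption: dual-channel DDR5-5600, about 4.5 bits per weight for a Q4 quantization, and two hypothetical cases, a dense model that reads all ~30B parameters per token and a mixture-of-experts model that reads only ~3B. The post does not say which memory speed or model variant was used.

```python
def ddr5_peak_bandwidth_gb_s(mt_per_s, channels=2, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s: transfers/s x channels x bytes per transfer."""
    return mt_per_s * 1e6 * channels * bus_bytes / 1e9


def decode_ceiling_tok_s(params_read_per_token, bits_per_weight, bandwidth_gb_s):
    """Upper bound on tokens/s if each token streams the listed weights once from RAM."""
    bytes_per_token = params_read_per_token * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


# All inputs below are assumptions for illustration, not figures from the post.
BANDWIDTH = ddr5_peak_bandwidth_gb_s(5600)   # dual-channel DDR5-5600
BITS_PER_WEIGHT = 4.5                        # rough average for a Q4 quantization

dense_ceiling = decode_ceiling_tok_s(30e9, BITS_PER_WEIGHT, BANDWIDTH)
moe_ceiling = decode_ceiling_tok_s(3e9, BITS_PER_WEIGHT, BANDWIDTH)

if __name__ == "__main__":
    print(f"peak bandwidth:      {BANDWIDTH:.1f} GB/s")
    print(f"dense 30B ceiling:   {dense_ceiling:.1f} tok/s")
    print(f"3B-active ceiling:   {moe_ceiling:.1f} tok/s")
```

Under these assumptions the dense ceiling is about 5 tok/s, in line with the 3-5 tok/s the user expected, while the 3B-active ceiling is about 53 tok/s. The reported 18.8 tok/s sits between the two, which would be consistent with a model that reads only a fraction of its weights per token and with real-world throughput falling well below the theoretical peak. This is one possible explanation, not something the post confirms.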

Practical Testing Results

The user compared two models on a real coding task:

An 8B model "confidently wrote completely wrong code"
The 30B model "nailed it first try"
They described the 30B model's performance as "basically GPT-4o level for $0"

This suggests that, for at least some coding tasks, a Q4-quantized 30B model running on a modern CPU can produce usable results without the GPU purchase usually assumed for local LLM inference. The GPT-4o comparison is the user's own impression from a single task, not a benchmark result.

📖 Read the full source: r/LocalLLaMA
