Reddit user reports 18.8 tok/s CPU inference with Qwen 3 30B Q4 on Zen 4

A Reddit user shared their experience testing local LLM inference on CPU instead of investing in expensive GPU hardware.
Key Details
The user was considering purchasing GPU hardware for local LLM inference, including:
- P40 GPUs
- V100 GPUs (they almost bought an SXM2 version, which uses a mezzanine socket and does not plug into a standard PCIe motherboard slot)
- RTX 3090s (priced at $800+ due to AI demand)
After being advised to try CPU inference first, they tested:
- Model: Qwen 3 30B Q4
- Hardware: Zen 4 processor with DDR5 memory
- Performance: 18.8 tokens per second on CPU
- Expectation vs Reality: Expected 3-5 tok/s, got nearly 19 tok/s
The user noted that "Zen 4 + DDR5 is cracked for inference."
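One way to see why the result beat the 3-5 tok/s expectation is a back-of-envelope bandwidth estimate. Token generation on CPU is usually limited by how fast weights can be streamed from RAM, so a rough ceiling is memory bandwidth divided by the bytes read per token. The sketch below is illustrative only, and every figure in it is an assumption rather than something stated in the post: dual-channel DDR5-5600, roughly 4.5 bits per weight for a Q4 quantization, and the model being a mixture-of-experts variant that activates about 3B of its 30B parameters per token. The function name `max_tokens_per_second` is hypothetical.

```python
def max_tokens_per_second(bandwidth_gb_s: float,
                          active_params_billion: float,
                          bits_per_weight: float) -> float:
    """Rough ceiling on decode speed, assuming each generated token
    streams every active weight from RAM exactly once."""
    gb_read_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


# Assumption: dual-channel DDR5-5600 (2 channels * 8 bytes * 5600 MT/s).
DDR5_BANDWIDTH_GB_S = 2 * 8 * 5600 / 1000  # 89.6 GB/s theoretical peak

# Assumption: Q4 quantization averages about 4.5 bits per weight.
BITS_PER_WEIGHT = 4.5

# If all 30B parameters were read for every token (dense model).
dense_ceiling = max_tokens_per_second(DDR5_BANDWIDTH_GB_S, 30.0, BITS_PER_WEIGHT)

# Assumption: mixture-of-experts with about 3B active parameters per token.
moe_ceiling = max_tokens_per_second(DDR5_BANDWIDTH_GB_S, 3.0, BITS_PER_WEIGHT)

print(f"dense 30B ceiling: {dense_ceiling:.1f} tok/s")
print(f"MoE ~3B active ceiling: {moe_ceiling:.1f} tok/s")
```

Under these assumptions the dense ceiling lands near the 3-5 tok/s the user expected, while the reported 18.8 tok/s sits between the two ceilings, which is consistent with only a fraction of the weights being read per token. Real throughput falls below the theoretical peak because of cache effects, compute overhead, and achievable (not peak) memory bandwidth.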
Practical Testing Results
The user compared two models on a real coding task:
- An 8B model "confidently wrote completely wrong code"
- The 30B model "nailed it first try"
- They described the 30B model's performance as "basically GPT-4o level for $0"
This is one user's result on one task, but it suggests that for some coding tasks a Q4-quantized 30B model running on a modern CPU with DDR5 memory can produce results comparable to larger cloud-hosted models, without the GPU purchase usually assumed necessary for local LLM inference.
📖 Read the full source: r/LocalLLaMA