Benchmark: Gemma4 12B vs Qwen3 8B quantized on 24GB Mac Mini

Performance comparison of two local models for OpenClaw
A developer ran a head-to-head test of Gemma4 12B and Qwen3:8b-q4_K_M on a 24GB Mac Mini. The test used two prompts: "explain how a carburetor works" and "write a Python function to detect memory leaks." Claude helped write a command that greps the speed measurements out of each run's output.
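The post does not include the grep command itself, so the following is only a sketch of the same idea in Python. It assumes the models were run through Ollama (the `qwen3:8b-q4_K_M` tag suggests this, but the source does not confirm it) and that the statistics look like the `prompt eval rate` and `eval rate` lines printed by `ollama run --verbose`. The function name `extract_rates` and the sample text are hypothetical.

```python
import re

# Assumption: the runner prints lines such as
#   "prompt eval rate:     89.80 tokens/s"
#   "eval rate:            19.60 tokens/s"
# as Ollama does with `ollama run --verbose`. Adjust the pattern otherwise.
RATE_LINE = re.compile(
    r"^\s*(prompt eval rate|eval rate):\s*([\d.]+)\s*tokens/s",
    re.MULTILINE,
)

def extract_rates(output: str) -> dict[str, float]:
    """Map each rate label found in the text to its value in tokens per second."""
    return {label: float(value) for label, value in RATE_LINE.findall(output)}

# Illustrative sample only: the two rates are the Qwen3 carburetor figures
# from the article; the count line is a made-up placeholder that the
# pattern is expected to skip.
sample = """\
eval count:           300 token(s)
prompt eval rate:     89.80 tokens/s
eval rate:            19.60 tokens/s
"""

rates = extract_rates(sample)
print(rates)
```

Here `prompt eval rate` corresponds to the "Prompt eval" figures reported below and `eval rate` to the "Generation" figures.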
Benchmark results
Carburetor explanation task:
- Qwen3:8b-q4_K_M: Prompt eval: 89.8 t/s, Generation: 19.6 t/s
- Gemma4: Prompt eval: 20.8 t/s, Generation: 27.6 t/s
Python coding task:
- Qwen3:8b-q4_K_M: Prompt eval: 133.8 t/s, Generation: 18.7 t/s
- Gemma4: Prompt eval: 26.1 t/s, Generation: 26.1 t/s
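To make the gap between the two models easier to read, the figures above can be turned into speed ratios. This is a small sketch that uses only the numbers listed in the results; the dictionary layout and the `ratios` helper are illustrative choices, not part of the original test.

```python
# Rates in tokens per second, copied from the benchmark results above.
results = {
    "carburetor": {
        "qwen3": {"prompt": 89.8, "gen": 19.6},
        "gemma4": {"prompt": 20.8, "gen": 27.6},
    },
    "python": {
        "qwen3": {"prompt": 133.8, "gen": 18.7},
        "gemma4": {"prompt": 26.1, "gen": 26.1},
    },
}

def ratios(task: str) -> tuple[float, float]:
    """Return (Qwen3 prompt-eval advantage, Gemma4 generation advantage)."""
    qwen3, gemma4 = results[task]["qwen3"], results[task]["gemma4"]
    return qwen3["prompt"] / gemma4["prompt"], gemma4["gen"] / qwen3["gen"]

for task in results:
    prompt_ratio, gen_ratio = ratios(task)
    print(
        f"{task}: Qwen3 prompt eval {prompt_ratio:.1f}x faster, "
        f"Gemma4 generation {gen_ratio:.1f}x faster"
    )
# carburetor: Qwen3 prompt eval 4.3x faster, Gemma4 generation 1.4x faster
# python: Qwen3 prompt eval 5.1x faster, Gemma4 generation 1.4x faster
```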
Key findings
Qwen3 processes prompts 4-5x faster than Gemma4 (about 4.3x on the carburetor task and 5.1x on the Python task), which matters for OpenClaw because of the large context prompts it typically sends. Gemma4 generates output about 40% faster. For many OpenClaw uses, where prompt processing dominates, Qwen3 wins on overall speed. The developer notes that Gemma4, as a 12B model, might produce slightly better output, though output quality wasn't tested.
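Why prompt speed can outweigh generation speed is easier to see with a rough latency estimate: time to read the prompt plus time to write the reply. The sketch below assumes the carburetor-task rates stay constant at larger context sizes, which the benchmark did not measure. The token counts are hypothetical workloads, not figures from the post.

```python
def estimated_seconds(prompt_tokens: int, output_tokens: int,
                      prompt_rate: float, gen_rate: float) -> float:
    """Rough latency: prompt processing time plus generation time."""
    return prompt_tokens / prompt_rate + output_tokens / gen_rate

# (prompt eval t/s, generation t/s) from the carburetor task.
QWEN3 = (89.8, 19.6)
GEMMA4 = (20.8, 27.6)

# Hypothetical workloads: a large-context agent call and a short chat prompt.
workloads = {
    "large context (4000 in, 300 out)": (4000, 300),
    "short prompt (20 in, 500 out)": (20, 500),
}

for name, (prompt_tokens, output_tokens) in workloads.items():
    qwen3_s = estimated_seconds(prompt_tokens, output_tokens, *QWEN3)
    gemma4_s = estimated_seconds(prompt_tokens, output_tokens, *GEMMA4)
    print(f"{name}: Qwen3 ~{qwen3_s:.0f}s, Gemma4 ~{gemma4_s:.0f}s")
```

Under these assumptions Qwen3 comes out ahead whenever the prompt is long relative to the reply, while Gemma4 is quicker for short prompts with long replies.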
The developer runs various tasks on local models, including cron jobs, heartbeat monitoring, and memory indexing, and often has OpenClaw call subagents that run on local models. They are now testing Gemma4 as the local model for all of these tasks and don't expect to notice the speed difference, since the work runs in the background.
📖 Read the full source: r/openclaw