Run Gemma-4 26B-A4B Efficiently on M5 MacBook Air

A developer tested Gemma-4-26B-A4B with Opencode on a 32GB M5 MacBook Air and found it delivers practical performance for local AI coding tasks.

Performance Benchmarks

The specific configuration tested was gemma-4-26B-A4B-it-UD-IQ4_XS running on a 32GB M5 MacBook Air. In low power mode, it achieved:

300 tokens/second prompt processing
12 tokens/second generation
8W power consumption
No heat or fan noise during operation
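
To put these rates in terms of a single agent step, the following is a rough back-of-the-envelope sketch, not a measurement. It assumes the reported throughput and power figures hold constant for the whole request, which is a simplification: throughput usually drops as context grows, and the post does not say whether 8W is package or whole-system power. The function names and the 6,000-in / 240-out request size are hypothetical, chosen only for illustration.

```python
# Rates reported in the post (low power mode). Treated here as constants,
# which is a simplifying assumption rather than something the post claims.
PROMPT_TOKENS_PER_S = 300.0
GEN_TOKENS_PER_S = 12.0
POWER_WATTS = 8.0


def turn_seconds(prompt_tokens: int, output_tokens: int) -> float:
    """Estimated wall-clock time for one request: prompt processing plus generation."""
    return prompt_tokens / PROMPT_TOKENS_PER_S + output_tokens / GEN_TOKENS_PER_S


def turn_watt_hours(prompt_tokens: int, output_tokens: int) -> float:
    """Estimated energy for one request at the reported power draw."""
    return POWER_WATTS * turn_seconds(prompt_tokens, output_tokens) / 3600.0


if __name__ == "__main__":
    # Hypothetical agent step: 6,000 tokens of context in, 240 tokens out.
    secs = turn_seconds(6000, 240)
    print(f"{secs:.0f} s, {turn_watt_hours(6000, 240):.3f} Wh")
```

Under these assumptions the example request splits evenly: 20 seconds reading the prompt and 20 seconds generating. Generation speed, not prompt processing, dominates as soon as responses get long.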

The M5 MacBook Air showed significant improvements over previous hardware:

~25% faster prompt processing than an M1 Max 64GB (even when the Max wasn't in power saving mode)
~6 hours of battery life versus ~2 hours on the M1 Max when running Opencode
The M5 Air managed this despite having a smaller battery (53.8Wh versus 70Wh on the M1 Max)
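
The battery figures above imply an average power draw for each machine. A minimal sketch of that arithmetic follows, assuming each battery was drained from full to empty and that the approximate runtimes are exact; the helper name is hypothetical.

```python
def average_draw_watts(battery_wh: float, runtime_hours: float) -> float:
    """Average power draw implied by draining a full battery over the given runtime."""
    return battery_wh / runtime_hours


# Capacities and runtimes as reported in the post.
m5_air_watts = average_draw_watts(53.8, 6.0)
m1_max_watts = average_draw_watts(70.0, 2.0)

if __name__ == "__main__":
    print(f"M5 Air: {m5_air_watts:.1f} W")
    print(f"M1 Max: {m1_max_watts:.1f} W")
    print(f"Ratio:  {m1_max_watts / m5_air_watts:.1f}x")
```

The implied M5 Air figure of roughly 9W is in line with the 8W reported during inference, while the M1 Max works out to 35W, close to four times as much.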

Practical Use Cases

The developer found this setup "actually usable" for agentic coding on a laptop. Previously, running LLMs on their M1 Max 64GB was limited to "tinkering and toy use cases" and could not handle longer-context tasks effectively. That older setup could create a simple Snake game in Python, but agentic coding or contributing to larger codebases was "a bit janky."

The M5's performance makes it practical for mobile use cases where internet connectivity might be unreliable, such as coffee shops or train commutes.

Comparison to Other Models

The developer compared Gemma-4-26B with Opencode to closed-source alternatives:

In their testing, it does not replace Claude Code or Antigravity
Gemma-4 requires "far more hand-holding than current closed-source frontier models"
The responses are described as "kinda dry" compared to Claude Code or Gemini-3.1-Pro with Antigravity
However, they'd prefer Gemma-4-26B over running out of Gemini-2.5-Pro allowance and being forced to use Gemini-2.5-Flash

The developer notes this represents significant progress, as "this sort of agentic coding was cutting-edge / not even really possible with frontier models back at the end of 2024."

📖 Read the full source: r/LocalLLaMA
