Local LLM Performance Benchmarks on Mac Mini with OpenClaw and LM Studio

✍️ OpenClawRadar · 📅 Published: April 18, 2026

A Reddit user shared concrete performance benchmarks for running a local large language model on a Mac Mini with 32GB RAM. The post addresses the scarcity of specific performance data for this hardware configuration.

Technical Setup Details

The user reported the following configuration and results:

Software versions: OpenClaw 2026.3.8, LM Studio 0.4.6+1
Model: Unsloth gpt-oss-20b-Q4_K_S.gguf
Context size: 26035 tokens
Performance metrics: 34 tokens/second after the first prompt, 0.7-second time to first token
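To put the two reported numbers in context, here is a minimal Python sketch that turns them into a rough response-time estimate. It assumes the generation rate stays constant at the reported 34 tokens/second and that the 0.7-second time to first token applies to every request; the function name and the example output lengths are illustrative, not from the post.

```python
def estimated_response_seconds(
    output_tokens: int,
    tokens_per_second: float = 34.0,    # throughput reported in the post
    time_to_first_token: float = 0.7,   # latency reported in the post
) -> float:
    """Rough wall-clock time: wait for the first token, then stream the rest.

    Assumes a constant generation rate, which real runs will not match exactly.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return time_to_first_token + output_tokens / tokens_per_second


if __name__ == "__main__":
    # Illustrative output lengths (hypothetical, chosen for the example).
    for n in (100, 500, 2000):
        print(f"{n:>5} tokens: ~{estimated_response_seconds(n):.1f} s")
```

Under these assumptions, a 500-token answer would take roughly 15 seconds. Long prompts or a nearly full context window would likely increase both figures.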

Model Configuration

The user listed these model settings, all left at their defaults:

GPU offload = 18
CPU thread pool size = 7
Max concurrents = 4
Number of experts = 4
Flash attention = on

Q4_K_S is a 4-bit quantization format, so this file is a compressed version of the roughly 20-billion-parameter model. Quantization cuts memory requirements substantially while keeping output quality reasonable. The Mac Mini's 32GB of RAM is enough for this model size at the reported context length. The 34 tokens/second figure gives developers a practical reference point when considering similar local LLM setups on Apple Silicon hardware.
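The memory argument can be checked with back-of-the-envelope arithmetic. The Python sketch below estimates the memory taken by the weights alone. It assumes a nominal 20 billion parameters (from the model name) and an average of about 4.5 bits per weight for a 4-bit K-quant; both are approximations rather than figures from the post. The actual GGUF file size will differ, and the context cache needs additional memory on top.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights only, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 20e9               # assumption: nominal size from the model name
ASSUMED_Q4_BITS = 4.5           # assumption: rough average for a 4-bit K-quant
FP16_BITS = 16                  # unquantized half-precision baseline

if __name__ == "__main__":
    q4 = weight_memory_gb(NUM_PARAMS, ASSUMED_Q4_BITS)
    fp16 = weight_memory_gb(NUM_PARAMS, FP16_BITS)
    print(f"4-bit weights:  ~{q4:.1f} GB")
    print(f"16-bit weights: ~{fp16:.1f} GB")
```

With these assumptions, the 4-bit weights come to about 11 GB, which leaves headroom on a 32GB machine. The same model at 16 bits per weight would need about 40 GB and would not fit.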

📖 Read the full source: r/openclaw
