Testing δ-Mem on Apple Silicon: MLX Implementation and Benchmarks

✍️ OpenClawRadar | 📅 Published: May 16, 2026 | 🔗 Source

A Reddit user implemented the δ-mem research paper (arXiv 2605.12357) for Apple Silicon using MLX, with OpenClaw integration. According to the post, the paper's method steers model attention without adding context or using LoRA, with a reported 20% improvement in answers in testing. The implementation runs Qwen3-4B-Instruct via MLX with custom adapters.

Benchmark Results (normalized MLX tests, Qwen3-4B-Instruct on a 64GB Mac mini):

  • Synthetic paper-style: Plain 0.5129, δ-mem 0.5129 (1.00x)
  • LoCoMo-10 mini: Plain 0.0500, δ-mem 0.1833 (3.67x)
  • OpenClaw replay: Plain 0.5701, δ-mem 0.6667 (1.17x)
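
The multipliers in parentheses are the δ-mem score divided by the plain score. The short sketch below recomputes them from the figures listed above and also prints the absolute difference, which helps put a large ratio from a low baseline in context. The names `scores` and `relative_gain` are illustrative and do not come from the original post.

```python
# Scores copied from the benchmark list above: (plain, delta_mem).
scores = {
    "Synthetic paper-style": (0.5129, 0.5129),
    "LoCoMo-10 mini": (0.0500, 0.1833),
    "OpenClaw replay": (0.5701, 0.6667),
}


def relative_gain(plain: float, delta_mem: float) -> float:
    """Return the delta-mem score as a multiple of the plain score."""
    return delta_mem / plain


# Ratio rounded to two decimals, matching the precision used in the list.
gains = {name: round(relative_gain(p, d), 2) for name, (p, d) in scores.items()}

# Absolute change in score, for context next to the ratio.
deltas = {name: round(d - p, 4) for name, (p, d) in scores.items()}

for name in scores:
    print(f"{name}: {gains[name]:.2f}x (absolute change {deltas[name]:+.4f})")
```

Note that the 3.67x on LoCoMo-10 mini comes from a plain baseline of only 0.0500, so the absolute change (0.1333) is a useful companion to the ratio.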

Latency costs (vs plain):

  • Synthetic: 1.013x
  • LoCoMo-10 mini: 1.33x query / 1.50x total
  • OpenClaw replay: 1.30x
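
One way to read the score and latency figures together is to divide each score multiplier by its latency multiplier. This is only an illustrative heuristic, not a metric used in the post or the paper; `gain_per_latency` is a hypothetical name, and the LoCoMo-10 mini row uses the 1.50x total latency rather than the 1.33x query latency.

```python
# (score multiplier, latency multiplier) taken from the two lists above.
# For LoCoMo-10 mini the total latency (1.50x) is used; this is an assumption.
results = {
    "Synthetic": (1.00, 1.013),
    "LoCoMo-10 mini": (3.67, 1.50),
    "OpenClaw replay": (1.17, 1.30),
}


def gain_per_latency(score_mult: float, latency_mult: float) -> float:
    """Hypothetical heuristic: score improvement per unit of added latency."""
    return score_mult / latency_mult


tradeoff = {
    name: gain_per_latency(score, latency)
    for name, (score, latency) in results.items()
}

for name, value in tradeoff.items():
    # Values above 1.0 mean the score grew faster than the latency did.
    print(f"{name}: {value:.2f}")
```

Under this heuristic only LoCoMo-10 mini comes out above 1.0, but whether a 1.30x slowdown is acceptable for the OpenClaw replay gain depends on the workload.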

Takeaways:

  • Synthetic probes were flat (1.00x), but LoCoMo-mini showed strong relative gains (3.67x).
  • OpenClaw-style replay showed a practically meaningful improvement (6/8 → 7/8 probes passed, 1.17x).
  • The user notes that Apple Silicon cannot run CUDA efficiently, so the results are lower than the paper's benchmarks. For reference, the paper (Qwen3-4B-Instruct) reported an average of 1.10x versus the frozen backbone, 1.31x on MemoryAgentBench, and 1.20x on LoCoMo.
  • The user is seeking help (or roughly $6k in funding) to train an adapter for larger models such as Qwen3.6:27B.

Who it's for: Developers running local LLM agents on Apple Silicon who want to experiment with δ-mem weight modulation to improve memory/context performance.

📖 Read the full source: r/LocalLLaMA
