Testing δ-Mem on Apple Silicon: MLX Implementation and Benchmarks

A Reddit user implemented the δ-mem research paper (arXiv 2605.12357) for Apple Silicon using MLX, with OpenClaw integration. The paper describes a way to improve how a model directs its attention without adding context or using LoRA, and reports 20% better answers in its tests. The implementation runs Qwen3-4B-Instruct through MLX with custom adapters.
Benchmark Results (normalized MLX tests, Qwen3-4B-Instruct on a Mac mini with 64GB):
- Synthetic paper-style: Plain 0.5129, δ-mem 0.5129 (1.00x)
- LoCoMo-10 mini: Plain 0.0500, δ-mem 0.1833 (3.67x)
- OpenClaw replay: Plain 0.5701, δ-mem 0.6667 (1.17x)
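The multipliers in parentheses are the δ-mem score divided by the plain score. As a minimal sketch (the helper names `relative_gain` and `summarize` are illustrative and not from the post), the arithmetic can be reproduced from the listed scores, alongside the absolute difference, which helps put a large multiplier on a small baseline in context:

```python
# Scores copied from the benchmark list above: (plain, delta_mem).
SCORES = {
    "Synthetic paper-style": (0.5129, 0.5129),
    "LoCoMo-10 mini": (0.0500, 0.1833),
    "OpenClaw replay": (0.5701, 0.6667),
}


def relative_gain(plain: float, delta_mem: float) -> float:
    """Return delta_mem / plain, the multiplier written as e.g. 3.67x."""
    if plain <= 0:
        raise ValueError("plain score must be positive to form a ratio")
    return delta_mem / plain


def summarize(scores):
    """Map each benchmark name to (ratio, absolute difference)."""
    return {
        name: (relative_gain(plain, dm), dm - plain)
        for name, (plain, dm) in scores.items()
    }


if __name__ == "__main__":
    for name, (ratio, diff) in summarize(SCORES).items():
        print(f"{name}: {ratio:.2f}x ({diff:+.4f} absolute)")
```

Run as a script, this prints one line per benchmark. The LoCoMo-10 mini multiplier is the largest, but it starts from a plain score of 0.0500, so the absolute change is about 0.13.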
Latency costs (vs plain):
- Synthetic: 1.013x
- LoCoMo-10 mini: 1.33x query / 1.50x total
- OpenClaw replay: 1.30x
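The post does not say how these latency multipliers were measured. One common approach, sketched here as an assumption rather than the author's method, is to time repeated runs of each configuration and divide the medians. The functions `time_calls` and `latency_ratio` are hypothetical helpers, and the workloads in the demo are stand-ins, not model calls:

```python
import statistics
import time
from typing import Callable, Sequence


def time_calls(fn: Callable[[], object], repeats: int = 5) -> list[float]:
    """Run fn `repeats` times and return wall-clock seconds for each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


def latency_ratio(plain: Sequence[float], delta_mem: Sequence[float]) -> float:
    """Median delta-mem latency over median plain latency (1.0 = no overhead)."""
    return statistics.median(delta_mem) / statistics.median(plain)


if __name__ == "__main__":
    # Stand-in workloads only; replace with real plain and delta-mem runs.
    plain_times = time_calls(lambda: sum(range(200_000)))
    delta_times = time_calls(lambda: sum(range(260_000)))
    print(f"latency ratio: {latency_ratio(plain_times, delta_times):.2f}x")
```

Using the median rather than the mean keeps a single slow run (for example, a cold first call) from skewing the ratio.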
Key links:
- GitHub repo with adapter: delta-mem-mlx-sidecar-w-openclaw
- MLX adapter on Hugging Face: delta-mem-qwen3-4b-instruct-mlx-adapter
Takeaways:
- Synthetic probes were flat (1.00x), but LoCoMo-mini showed strong relative gains (3.67x).
- OpenClaw-style replay showed a practically meaningful improvement (6/8 → 7/8 probes passed, 1.17x).
- The user notes that Apple Silicon does not run CUDA, so results are lower than the paper's benchmarks. For Qwen3-4B-Instruct, the paper reported an average of 1.10x over the frozen backbone, with 1.31x on MemoryAgentBench and 1.20x on LoCoMo.
- The user is seeking help, or roughly $6k in funding, to train an adapter for larger models such as Qwen3.6:27B.
Who it's for: Developers running local LLM agents on Apple Silicon who want to experiment with δ-mem weight modulation to improve memory/context performance.
📖 Read the full source: r/LocalLLaMA