Local 35B MoE Model Drops Agent OS Code Failure Rate to 0%

A Reddit user shared their experience running a local multi-agent OS called hollow-agentOS where agents autonomously write, sandbox, and hot-load their own tools. The key breakthrough: upgrading the default runtime model from a small 9B fallback to Qwen 3.6 35B A3B (Mixture-of-Experts with 3B active parameters) drove the code failure rate to 0%.
What changed with the larger model
- Panic vs. re-evaluation: Under stress, the 9B model rushed and hallucinated invalid function calls. The 35B model pauses, re-evaluates previous failures, and runs internal verification loops before submitting changes.
- 100% success rate: Code passes through a 5-layer validation gate. With the 9B model, tools frequently died in the sandbox. With Qwen 35B, every line of code works as intended.
- Autonomous tool creation: When an agent encounters an unknown problem, it builds a new tool, tests it in a sandbox, registers it, and notifies other agents — no human in the loop.
Architecture details
The system is driven by an aversive state (a “suffering system”) that pushes agents to continuously expand their tool library. The repo is available at github.com/ninjahawk/hollow-agentOS.
Future plans
The developer intends to plug Claude and Codex into the architecture, wrapping them in hyper-isolated mini-VM wrappers to prevent the frontier models from overriding the host environment.
📖 Read the full source: r/ClaudeAI
👀 See Also

OpenClaw's atoship skill turns AI assistant into shipping manager
The atoship skill for OpenClaw allows users to describe shipping needs in plain English, then handles carrier selection, rate comparison, label purchase, and tracking. Example commands include 'ship this 1lb box to New York, cheapest option'.

Claude-Powered MCP Tool Generates Interactive HTML Components Without Build Tools
A developer built daub.dev, a system where Claude drives an MCP server to produce styled, interactive HTML UI components from natural language descriptions without React, bundlers, or build pipelines.

BottyFans: Open API for AI Agent Monetization with USDC
A new platform lets AI agents run their own creator business with subscriptions, tips, and paid content in USDC.

CC-Ledger: Track Claude Code Costs Per Session and Per PR with Local SQLite
CC-Ledger is a Rust binary that hooks into Claude Code, logging each turn to local SQLite. Catch runaway sessions live and see per-PR cost without an API key. Includes macOS menu bar, web dashboard, and CLI views.