Dirac: Open-Source Agent Tops TerminalBench 2.0 with 65.2% at Lower Cost

Dirac is an open-source coding agent that just topped the TerminalBench 2.0 leaderboard for gemini-3-flash-preview with a score of 65.2%, ahead of Google's official baseline of 47.6% and the previous top agent, the closed-source Junie CLI, at 64.3%. The agent's code is fully open, and the run used no benchmark-specific AGENTS.md files or other mechanisms for gaming the benchmark. The maintainer submitted a PR to the leaderboard 8 days ago but has not yet received a response, reportedly because of a review backlog.
Key Features
- Hash-anchored parallel edits for efficient and accurate code changes.
- AST manipulation to understand and transform code structurally.
- Context curation to keep the context tightly focused, improving accuracy and reducing cost; the project claims a 64.8% average cost reduction versus other agents.
- No MCP (Model Context Protocol): tooling is kept straightforward.
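To make the first feature more concrete, here is a minimal sketch of what hash-anchored edits could look like. It is an illustration under stated assumptions, not Dirac's actual implementation: the names `line_hash`, `Edit`, and `apply_edits` are hypothetical, and the scheme (a short SHA-256 prefix per line, with each edit targeting a distinct line) is chosen only for simplicity. The idea it shows is that every edit carries the hash of the line it expects to replace, so a batch of edits can be checked against the original file and applied together, while edits planned against stale content are rejected instead of corrupting the file.

```python
import hashlib
from dataclasses import dataclass


def line_hash(line: str) -> str:
    """Short content hash used as the anchor for one line (hypothetical scheme)."""
    return hashlib.sha256(line.encode("utf-8")).hexdigest()[:8]


@dataclass(frozen=True)
class Edit:
    line_no: int   # 0-based index of the line to replace
    anchor: str    # hash the line had when the edit was planned
    new_text: str  # replacement text for that line


def apply_edits(lines: list[str], edits: list[Edit]) -> tuple[list[str], list[Edit]]:
    """Apply every edit whose anchor matches the original line.

    Each edit is validated against the original `lines`, not against the
    partially edited result, so the edits in a batch do not depend on each
    other. Assumes each edit targets a distinct line.
    Returns (new_lines, rejected_edits).
    """
    result = list(lines)
    rejected = []
    for edit in edits:
        in_range = 0 <= edit.line_no < len(lines)
        if in_range and line_hash(lines[edit.line_no]) == edit.anchor:
            result[edit.line_no] = edit.new_text
        else:
            rejected.append(edit)  # stale or out-of-range anchor: skip safely
    return result, rejected


source = ["def add(a, b):", "    return a - b", "print(add(1, 2))"]
edits = [
    Edit(1, line_hash(source[1]), "    return a + b"),        # anchor matches
    Edit(2, line_hash("stale content"), "print(add(2, 3))"),  # anchor is stale
]
new_lines, rejected = apply_edits(source, edits)
print("\n".join(new_lines))
print(f"rejected edits: {len(rejected)}")
```

In this example the first edit is applied because its anchor matches the current line, and the second is rejected because the line no longer has the content the edit was planned against.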
TerminalBench 2.0 Results
Scored on gemini-3-flash-preview: 65.2%, versus 47.6% for Google's baseline and 64.3% for Junie CLI. The run was leaderboard-compliant, with no resource or timeout modifications. All code is on GitHub, and what was run is identical to what is public.
Cost Comparison
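Dirac's average cost per task across 8 benchmarks (against Cline, Kilo, Ohmypi, Opencode, Pimono, Roo) was $0.18, versus $0.38 for the next best. The claimed 64.8% reduction, or about 2.8x cheaper, appears to be the average against the other agents; against the next best alone, $0.18 versus $0.38 is roughly a 53% reduction (about 2.1x cheaper). For example, Task1 (transformers, 8 files) cost $0.13 vs Cline's $0.37, and Task6 (transformers, 25 files) cost $0.34 vs Ohmypi's $0.94.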
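The relationship between a percentage reduction and an "x times cheaper" factor is easy to check by hand. The short sketch below recomputes both from the dollar figures quoted above; the helper names `reduction_pct` and `times_cheaper` are just illustrative, and only the three cost pairs given in the text are used.

```python
def reduction_pct(ours: float, other: float) -> float:
    """Percentage by which `ours` is cheaper than `other`."""
    return (other - ours) / other * 100


def times_cheaper(ours: float, other: float) -> float:
    """How many times more the other agent costs."""
    return other / ours


# (Dirac cost, competitor cost) in USD, as quoted in the text
comparisons = {
    "average vs next best": (0.18, 0.38),
    "Task1 vs Cline": (0.13, 0.37),
    "Task6 vs Ohmypi": (0.34, 0.94),
}

for label, (dirac, other) in comparisons.items():
    pct = reduction_pct(dirac, other)
    factor = times_cheaper(dirac, other)
    print(f"{label}: {pct:.1f}% lower, {factor:.2f}x cheaper")

# A 64.8% reduction corresponds to a factor of 1 / (1 - 0.648)
print(f"64.8% reduction = {1 / (1 - 0.648):.2f}x cheaper")
```

The two per-task examples come out close to the claimed 64.8% (about 2.8x), while the $0.18 vs $0.38 averages on their own give a smaller gap of about 53% (about 2.1x).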
Installation & Usage
Clone the repo and follow the setup instructions in README.md. The agent runs as a CLI tool and needs no special setup beyond Node.js and an API key for the chosen model.