Be My Butler: Multi-Agent Pipeline for AI Code Verification

✍️ OpenClawRadar | 📅 Published: March 14, 2026 | 🔗 Source

What Be My Butler Does

Be My Butler (BMB) is a multi-agent pipeline designed to solve a specific problem in AI-assisted coding: AI coding agents that incorrectly report their own code as working. The creator, a materials/mechanical engineer with no programming background, built it after seeing Claude Code agents write code that passed its tests but did not work in practice.

Core Concept

The system implements a peer review model for AI-generated code:

  • One model writes the code
  • A different model reviews it without knowing who wrote it (blind verification)
  • A cross-model council (Claude + GPT + Gemini) votes on whether it actually works
  • An analyst agent tracks patterns in what goes wrong
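
The review flow above can be sketched in a few lines of Python. This is a minimal illustration, not BMB's actual implementation: the `Submission` and `Verdict` types, the `council_vote` function, the stub reviewers, and the strict-majority rule are all assumptions made for the example. The source only says that the council votes and that the reviewer does not know who wrote the code.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A reviewer takes code text and returns True if it judges the code to work.
# In a real pipeline each reviewer would call a different model; these are stubs.
Reviewer = Callable[[str], bool]


@dataclass
class Submission:
    author: str  # which model wrote the code (hidden from reviewers)
    code: str


@dataclass
class Verdict:
    votes: Dict[str, bool]
    approved: bool


def blind(submission: Submission) -> str:
    """Drop authorship so reviewers see only the code itself."""
    return submission.code


def council_vote(submission: Submission, council: Dict[str, Reviewer]) -> Verdict:
    """Collect one vote per reviewer; approve on a strict majority (assumed rule)."""
    code = blind(submission)
    votes = {name: reviewer(code) for name, reviewer in council.items()}
    approved = sum(votes.values()) > len(votes) / 2
    return Verdict(votes=votes, approved=approved)


# Hypothetical stand-ins for three independent model reviewers.
council = {
    "reviewer_a": lambda code: "TODO" not in code,
    "reviewer_b": lambda code: "def " in code,
    "reviewer_c": lambda code: True,  # an overly lenient reviewer
}

good = Submission(author="writer-model", code="def add(a, b):\n    return a + b\n")
stub = Submission(author="writer-model", code="# TODO: implement add\n")

print(council_vote(good, council).approved)  # True: 3 of 3 approve
print(council_vote(stub, council).approved)  # False: only 1 of 3 approves
```

The second case shows why a council helps: one lenient reviewer approves the unfinished stub, but the other two outvote it.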

Performance Metrics

Reported figures from testing:

  • Single-agent self-review catches ~40% of real issues
  • Cross-model blind review catches ~85%
  • Cost overhead: 15-20% more tokens
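
To make the trade-off concrete, the figures above can be applied to a made-up workload. The sketch below assumes 100 real issues and a baseline of 1,000,000 tokens. Both numbers are hypothetical and chosen only to keep the arithmetic easy to follow, and the reported rates are treated as exact even though the source gives them as approximations.

```python
# Reported rates from the article, expressed as whole percentages.
SELF_REVIEW_CATCH_PCT = 40
BLIND_REVIEW_CATCH_PCT = 85
OVERHEAD_PCT_RANGE = (15, 20)


def issues_caught(total_issues: int, catch_pct: int) -> int:
    """Number of issues caught at a given catch rate (integer arithmetic)."""
    return total_issues * catch_pct // 100


def token_budget(baseline_tokens: int) -> tuple:
    """Low and high token totals after adding the reported overhead range."""
    low, high = OVERHEAD_PCT_RANGE
    return (
        baseline_tokens * (100 + low) // 100,
        baseline_tokens * (100 + high) // 100,
    )


# Hypothetical workload, for illustration only.
total_issues = 100
baseline_tokens = 1_000_000

print(issues_caught(total_issues, SELF_REVIEW_CATCH_PCT))   # 40
print(issues_caught(total_issues, BLIND_REVIEW_CATCH_PCT))  # 85
print(token_budget(baseline_tokens))                        # (1150000, 1200000)
```

Under these assumptions, cross-model blind review catches 45 more issues out of 100 in exchange for 150,000 to 200,000 extra tokens.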

v0.2 Features

  • Analytics dashboard to track token usage and costs
  • Analyst agent that automatically tracks patterns in code review findings
  • Consultant agent for architecture decisions
  • Improved tmux-based orchestration
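
As a rough idea of what pattern tracking could look like, the sketch below tallies free-text review findings and reports the most frequent ones. The source does not describe how the analyst agent works, so the `top_failure_patterns` function and the example finding labels are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_failure_patterns(findings: Iterable[str], n: int = 3) -> List[Tuple[str, int]]:
    """Count findings case-insensitively and return the n most common."""
    counts = Counter(label.strip().lower() for label in findings)
    return counts.most_common(n)


# Hypothetical findings collected from several review rounds.
findings = [
    "missing error handling",
    "Test mocks real behavior",
    "missing error handling",
    "unused config",
    "test mocks real behavior",
    "missing error handling",
]

for label, count in top_failure_patterns(findings):
    print(f"{count}x {label}")
```

Normalizing the labels before counting keeps "Test mocks real behavior" and "test mocks real behavior" in the same bucket. A real analyst would likely need fuzzier grouping than exact string matches.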

Installation and Usage

BMB is fully open source under the MIT license. To install and run it:

git clone https://github.com/project820/be-my-butler.git
cd be-my-butler && ./install.sh
bmb "build a REST API with auth"

The tool is particularly useful for "vibe coders": people without traditional coding experience who depend on AI to assess code quality. When you cannot read code to spot issues yourself, having multiple models cross-check each other provides verification that single-agent systems lack.

📖 Read the full source: r/ClaudeAI
