Benchmark Results: GitHub CLI vs MCP Approaches for AI Agents

A Reddit user conducted an independent study comparing different methods for exposing GitHub tools to AI agents. The benchmark tested four approaches: GitHub CLI, MCP (Model Context Protocol), MCP with Tool Search, and MCP with Code Mode, using real data and practical tasks.
Key Findings
- GitHub MCP is 2–3x more expensive to use than GitHub CLI. The source notes there's "almost no practical reason to use their MCP except for some of the different handling of security."
- Tool Search saves upfront tokens but spends them on extra turns. Whether this trade-off pays off depends on task complexity. Tool Search also introduces a new failure mode, because search accuracy is imperfect and the agent may not find the tool it needs.
- Code Mode is the cheapest way to use MCP, but it is still 2x more expensive than CLI and very slow. Code Mode also introduces its own failure mode: the agent may write buggy code or handle errors poorly.
- The benchmark suggests that CLIs can be pushed further, toward higher success rates at the lowest cost and latency, with a principled design approach that treats agent ergonomics as a first-class concern.
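The Tool Search trade-off described above can be made concrete with a small back-of-the-envelope model. This is a minimal sketch, not the benchmark's methodology: the `Run` class, both helper functions, and every number below are hypothetical, and the model deliberately ignores prompt caching and the fact that context is re-sent on each turn.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """Token profile of one agent run. All fields are hypothetical inputs."""
    upfront_tokens: int   # tool definitions loaded before the first turn
    tokens_per_turn: int  # average tokens consumed by each turn
    turns: int            # number of turns the agent needs


def total_tokens(run: Run) -> int:
    """Upfront cost plus per-turn cost (ignores caching and context re-sending)."""
    return run.upfront_tokens + run.tokens_per_turn * run.turns


def break_even_extra_turns(upfront_saved: int, tokens_per_turn: int) -> float:
    """Number of extra turns at which the upfront saving is fully spent."""
    return upfront_saved / tokens_per_turn


# Made-up numbers for illustration only; they are NOT taken from the benchmark.
plain_mcp = Run(upfront_tokens=10_000, tokens_per_turn=1_000, turns=4)
tool_search = Run(upfront_tokens=2_000, tokens_per_turn=1_000, turns=7)

print(total_tokens(plain_mcp))               # 14000
print(total_tokens(tool_search))             # 9000
print(break_even_extra_turns(8_000, 1_000))  # 8.0
```

Under these assumed numbers, searching for tools is cheaper as long as it adds fewer than eight turns. A task that needs many different tools could exceed that and erase the saving, which is one way to read the finding that the result depends on task complexity.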
Open Source Resources
The author has detailed their approach at https://axi.md and open-sourced the benchmark harness, results, and reference implementation of gh-axi at https://github.com/kunchenguid/axi.
📖 Read the full source: r/ClaudeAI