Browser CLI: A Token-Efficient Browser Automation Tool for AI Coding Agents

✍️ OpenClawRadar📅 Published: April 15, 2026🔗 Source
Browser CLI: A Token-Efficient Browser Automation Tool for AI Coding Agents
Ad

What Browser CLI Does

Browser CLI is a browser automation tool built to address token overhead in AI coding agents. The creator noticed browser verification was consuming ~30,000 tokens per session through Playwright MCP protocol overhead, with each browser_navigate + browser_snapshot call costing ~1,500 tokens in JSON schema framing.

The solution is a persistent headless Chromium daemon that you interact with via Bash commands. It uses the same Playwright ARIA snapshot technology underneath but achieves ~50-100 tokens per call instead of ~1,500.

Commands and Usage

Available commands include:

  • browser-cli goto https://example.com - Navigate to URL
  • browser-cli snapshot -i - ARIA tree with @e refs
  • browser-cli click @e3 - Click by ref
  • browser-cli fill @e5 "hello" - Fill input
  • browser-cli css @e3 font-size - Get computed CSS value
  • browser-cli inspect @e3 - Full box model + styles
  • browser-cli screenshot /tmp/page.png - Screenshot
  • browser-cli snapshot -D - Diff: what changed since last snapshot
  • browser-cli responsive /tmp - Screenshots at mobile/tablet/desktop

The server auto-starts on first call (~3s), then subsequent calls are ~100-200ms. It stays alive for 30 minutes, preserving cookies, tabs, and state between commands.

Token Savings

Token comparison:

  • Playwright MCP: ~1,500 tokens per call, ~30,000 tokens for 20 calls
  • Browser CLI: ~75 tokens per call, ~1,500 tokens for 20 calls

That's 95% savings on browser verification. For automated pipelines that do multiple tasks per session, this compounds fast.

Ad

Features Beyond Playwright MCP

  • CSS inspection - css @e3 padding returns computed values. inspect @e3 gives full box model + 16 key styles as JSON.
  • Live style mutation - style @e3 color red with style --undo. Debug CSS without touching source code.
  • Snapshot diffing - snapshot -D compares before/after ARIA trees.
  • Responsive presets - responsive /tmp takes mobile + tablet + desktop screenshots in one command.
  • Auth profiles - handoff opens a visible Chrome for manual SSO/MFA login, resume goes back to headless, auth-save admin encrypts the session (AES-256). Next time: goto-auth https://app.com/dashboard --profile admin — no login needed.
  • Command batching - chain [["goto","url"],["snapshot","-i"],["console"]] runs multiple commands in one call.

Technical Implementation

Architecture: AI Agent → Bash → CLI client (bin/browse.mjs) → HTTP POST (localhost) → Server (src/server.mjs) → Playwright API → Chromium (headless).

Pure Node.js. Playwright is the only dependency. No Bun, no Rust, no MCP overhead.

Claude Code Integration

Install globally:

npm install -g @tuandm/browser-cli

Add to .claude/settings.json:

{
  "permissions": {
    "allow": ["Bash(browser-cli*)"]
  }
}

Add a rule at .claude/rules/browser-cli.md telling Claude to use Browser CLI instead of Playwright MCP. The creator ran 5 eval scenarios and Claude picked the right command every time with the rule loaded.

It also ships as a Claude Code plugin (plugin.json included) for future marketplace distribution.

Inspiration and Technology

Inspired by gstack by Garry Tan, which pioneered the persistent Chromium CLI approach for AI agents. The core insight was that Bash commands are dramatically more token-efficient than MCP for browser automation. The underlying technology is Playwright by Microsoft.

📖 Read the full source: r/ClaudeAI

Ad

👀 See Also