Gemma 4 E2B Tested as Multi-Agent Coordinator in TypeScript Framework

Coordinator Capabilities Tested
The test evaluated whether Gemma 4 E2B could handle the coordinator role in a multi-agent system, specifically: taking a natural language goal, breaking it into a task graph, assigning agents, calling tools, and stitching results together.
Technical Implementation
The framework used was open-multi-agent (TypeScript, open-source) with Ollama via an OpenAI-compatible API. The coordinator receives a goal and agent roster, then outputs a JSON task array with title, description, assignee, and dependencies. Agents execute with tool-calling capabilities including bash and file read/write operations.
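The task array described above can be pictured as a small typed structure with a sanity check before execution. This is only a sketch: the `PlannedTask` interface and `validatePlan` helper are hypothetical names, the field types are inferred from the article's description, and treating dependencies as task titles is an assumption. None of it is taken from the open-multi-agent source.

```typescript
// Hypothetical shape of one coordinator task. Field names follow the article's
// description; the framework's real type definitions may differ.
interface PlannedTask {
  title: string;
  description: string;
  assignee: string;
  dependencies: string[]; // assumed to hold the titles of prerequisite tasks
}

// Returns a list of problems; an empty list means the plan passed these checks.
function validatePlan(plan: unknown, roster: string[]): string[] {
  if (!Array.isArray(plan)) return ["plan is not an array"];

  // Collect every title first so dependencies can point forward or backward.
  const titles = new Set<string>();
  plan.forEach((item) => {
    const t = item as Partial<PlannedTask> | null;
    if (typeof t === "object" && t !== null && typeof t.title === "string") {
      titles.add(t.title);
    }
  });

  const problems: string[] = [];
  plan.forEach((item, i) => {
    const t = item as Partial<PlannedTask> | null;
    if (typeof t !== "object" || t === null) {
      problems.push(`task ${i}: not an object`);
      return;
    }
    if (typeof t.title !== "string" || t.title === "") {
      problems.push(`task ${i}: missing title`);
    }
    if (typeof t.description !== "string") {
      problems.push(`task ${i}: missing description`);
    }
    if (typeof t.assignee !== "string" || !roster.includes(t.assignee)) {
      problems.push(`task ${i}: assignee is not in the roster`);
    }
    if (!Array.isArray(t.dependencies)) {
      problems.push(`task ${i}: dependencies is not an array`);
    } else {
      t.dependencies.forEach((dep) => {
        if (!titles.has(dep)) problems.push(`task ${i}: unknown dependency "${dep}"`);
      });
    }
  });
  return problems;
}
```

A check like this matters more with a small model, since a plan that names an agent outside the roster or a dependency that does not exist would otherwise fail later, during execution.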
Model Details
Gemma 4 E2B ("Effective 2B") has 2.3B effective parameters and 5.1B total parameters. The extra ~2.8B parameters are for the embedding layer supporting 140+ languages and multimodal capabilities.
Test Scenario
The goal provided was: "Check this machine's Node.js version, npm version, and OS info, then write a short Markdown summary report to /tmp/report.md"
E2B correctly:
- Broke it into 2 tasks with a dependency (researcher → summarizer)
- Assigned each to the right agent
- Used bash to run system commands
- Used file_write to save the report
- Synthesized the final output
Both runTasks() (explicit pipeline) and runTeam() (model plans everything autonomously) worked.
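The researcher → summarizer dependency implies that tasks have to be ordered before they run. A minimal sketch of that ordering follows, using a simple topological sort. The `TaskNode` type, the `executionOrder` function, and the example task titles are all hypothetical, and the framework's own scheduler may work differently.

```typescript
// Hypothetical minimal task node: only the fields needed for ordering.
interface TaskNode {
  title: string;
  dependencies: string[]; // assumed to be titles of prerequisite tasks
}

// Repeatedly emits every task whose dependencies have all completed.
// Throws if no task is ready, which means a cycle or an unknown dependency.
function executionOrder(tasks: TaskNode[]): string[] {
  const remaining = new Map<string, Set<string>>();
  tasks.forEach((t) => remaining.set(t.title, new Set(t.dependencies)));

  const order: string[] = [];
  while (remaining.size > 0) {
    const ready = Array.from(remaining.entries())
      .filter(([, deps]) => deps.size === 0)
      .map(([title]) => title);
    if (ready.length === 0) {
      throw new Error("cycle or unknown dependency in task graph");
    }
    ready.forEach((title) => {
      remaining.delete(title);
      order.push(title);
    });
    // Mark the finished tasks as satisfied for everything still waiting.
    remaining.forEach((deps) => ready.forEach((title) => deps.delete(title)));
  }
  return order;
}

// Two tasks mirroring the test scenario, listed out of order on purpose.
const plan: TaskNode[] = [
  { title: "Write report", dependencies: ["Collect system info"] },
  { title: "Collect system info", dependencies: [] },
];
console.log(executionOrder(plan)); // [ 'Collect system info', 'Write report' ]
```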
Performance and Observations
On an M1 with 16GB RAM:
- Full runTeam() takes ~2 minutes
- 6–9 sequential LLM calls under the hood (coordinator planning → researcher multi-turn tool use → summarizer → coordinator synthesis)
- ~10–15 seconds per call on M1
- E2B uses ~3–4 GB RAM with no memory pressure
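As a rough consistency check on these figures, the call count and per-call latency can be multiplied out. The sketch below assumes the calls run strictly one after another with no overhead between them; the `Range` type and `estimateTotalSeconds` function are hypothetical names.

```typescript
// A closed numeric range, used for both call counts and per-call seconds.
interface Range {
  min: number;
  max: number;
}

// Best case pairs the fewest calls with the fastest call; worst case the opposite.
function estimateTotalSeconds(calls: Range, secondsPerCall: Range): Range {
  return {
    min: calls.min * secondsPerCall.min,
    max: calls.max * secondsPerCall.max,
  };
}

const total = estimateTotalSeconds({ min: 6, max: 9 }, { min: 10, max: 15 });
console.log(`${total.min}-${total.max} s`); // 60-135 s
```

The reported ~2 minutes (120 s) falls inside that 60 to 135 second window, so the numbers are mutually consistent.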
What worked well:
- JSON output: The coordinator produced the correct schema for task decomposition. The framework has tolerant parsing that tries fenced blocks first, then falls back to bare array extraction.
- Tool-calling: Works through the OpenAI-compatible endpoint, correctly deciding when to call, parsing arguments, and handling multi-turn results.
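The tolerant parsing mentioned in the JSON output point (fenced block first, then bare array extraction) might look roughly like the sketch below. It illustrates the idea and is not the framework's actual parser: `extractTaskArray` is a hypothetical name, and taking the span from the first `[` to the last `]` is an assumed fallback strategy.

```typescript
// Three backticks, built at runtime so this example does not contain a literal fence.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");

// Tries a fenced block first, then the outermost [...] span.
// Returns the parsed array, or null if neither candidate is a JSON array.
function extractTaskArray(raw: string): unknown[] | null {
  const candidates: string[] = [];

  const fenced = FENCED_BLOCK.exec(raw);
  if (fenced && fenced[1] !== undefined) candidates.push(fenced[1]);

  const start = raw.indexOf("[");
  const end = raw.lastIndexOf("]");
  if (start !== -1 && end > start) candidates.push(raw.slice(start, end + 1));

  for (const candidate of candidates) {
    try {
      const parsed: unknown = JSON.parse(candidate);
      if (Array.isArray(parsed)) return parsed;
    } catch {
      // Not valid JSON; fall through to the next candidate.
    }
  }
  return null;
}
```

Small models often wrap JSON in conversational text, so accepting both forms avoids failing a run over formatting alone.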
Limitations noted:
- Output quality: The prose in the final synthesis is noticeably weaker than what larger models produce. Functional but not polished.
Reproduction Steps
ollama pull gemma4:e2b
git clone https://github.com/JackChen-me/open-multi-agent
cd open-multi-agent && npm install
no_proxy=localhost npx tsx examples/08-gemma4-local.ts
The test file is ~190 lines at examples/08-gemma4-local.ts. The no_proxy=localhost setting is only needed if you have an HTTP proxy configured.
📖 Read the full source: r/LocalLLaMA