TEMM1E v3.0.0: Swarm Intelligence Cuts Costs 3.4x

Swarm Intelligence for AI Agent Runtimes

TEMM1E v3.0.0 introduces "Many Tems" — a swarm intelligence system where multiple AI agent workers coordinate through stigmergy: indirect communication via environmental signals. This approach eliminates the coordination overhead that plagues traditional multi-agent frameworks like AutoGen, CrewAI, and LangGraph, where every coordination message requires an LLM call and costs tokens.

How It Works

The Alpha (coordinator) decomposes tasks into a dependency graph with one LLM call
A Pack of Tems (workers) spawns as real parallel tokio tasks
Each Tem claims a task via an atomic SQLite transaction (no distributed locks)
Tems emit Scent signals (time-decaying pheromones) as they work — "I'm done", "I'm stuck", "this is hard"
Other Tems read these signals to choose their next task — pure arithmetic, zero LLM calls
Results aggregate when all tasks complete
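
The claim step above can be illustrated with a minimal sketch. It is not the project's code: the article says claims go through an atomic SQLite transaction inside tokio tasks, whereas this stand-in uses an in-memory board with a compare-and-swap flag and plain OS threads so that it runs with the standard library alone. The names TaskBoard, claim_next, and run_pack are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical in-memory task board. The real system is described as
/// using SQLite; an atomic flag per task stands in for that here.
struct TaskBoard {
    claimed: Vec<AtomicBool>,
}

impl TaskBoard {
    fn new(tasks: usize) -> Self {
        TaskBoard {
            claimed: (0..tasks).map(|_| AtomicBool::new(false)).collect(),
        }
    }

    /// Try to claim the first unclaimed task. The compare-and-swap
    /// guarantees that exactly one caller wins each task.
    fn claim_next(&self) -> Option<usize> {
        for (id, flag) in self.claimed.iter().enumerate() {
            if flag
                .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
                .is_ok()
            {
                return Some(id);
            }
        }
        None
    }
}

/// Spawn `workers` threads that drain the board. Returns, per worker,
/// the list of task ids that worker claimed.
fn run_pack(workers: usize, tasks: usize) -> Vec<Vec<usize>> {
    let board = Arc::new(TaskBoard::new(tasks));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let board = Arc::clone(&board);
            thread::spawn(move || {
                let mut mine = Vec::new();
                while let Some(id) = board.claim_next() {
                    mine.push(id);
                }
                mine
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Calling run_pack(4, 12) hands out all 12 task ids across the 4 workers with no id claimed twice, which is the property the atomic transaction is meant to provide.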

Technical Details

The key insight addresses context growth: a single agent processing 12 subtasks carries ALL previous outputs in context. By subtask 12, the context has grown 28x. Each additional subtask costs more because the LLM rereads everything that came before, so total input grows quadratically: h*m(m+1)/2, where m is the number of subtasks and h is the history each subtask adds.

Pack workers carry only their task description plus the results of the tasks they depend on. Context stays flat at ~190 bytes regardless of how many total subtasks exist, so total cost grows linearly with the number of subtasks, not quadratically.
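
To make the growth argument concrete, here is a small worked sketch of the two cost models. The function names are hypothetical, and the pack model assumes the simplest case where each worker carries no dependency results, only a flat h units of context.

```rust
/// Total context read by one agent over `m` subtasks when each subtask
/// adds `h` units of history: h + 2h + ... + mh = h * m * (m + 1) / 2.
fn single_agent_context(h: u64, m: u64) -> u64 {
    h * m * (m + 1) / 2
}

/// Total context read by a pack in which every worker sees a flat `h`
/// units (assumption: no dependency results are carried).
fn pack_context(h: u64, m: u64) -> u64 {
    h * m
}
```

With h = 100 and m = 12 this gives 7,800 units for the single agent against 1,200 for the pack. That 6.5x gap is an idealized figure from the formula alone, not a prediction of the measured token ratios reported in the benchmarks.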

Benchmarks

Real Gemini 3 Flash API calls (not simulated):

12 independent functions: single agent 103 seconds, Pack 18 seconds (5.86x faster). 7,379 tokens vs 2,149 tokens (3.4x cheaper). Quality: both pass 12/12 tests.
5 parallel subtasks: single agent 7.9 seconds, Pack 1.7 seconds (4.54x faster). Token use is effectively the same (1.01x ratio), which indicates the Pack adds almost no token overhead.
Simple messages ("hello"): the Pack correctly does NOT activate, so it adds zero overhead and stays invisible.

What Makes This Different

Zero coordination tokens. AutoGen/CrewAI use LLM-to-LLM chat for coordination — every message costs. TEMM1E's scent field is arithmetic (exponential decay, Jaccard similarity, superposition). The math is cheaper than a single token.
Invisible for simple tasks. The classifier (already running on every message) decides. If it says "simple" or "standard" — single agent, zero overhead. Pack only activates for genuinely complex multi-deliverable tasks.
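
The three operations named above (exponential decay, Jaccard similarity, superposition) are all cheap floating-point arithmetic. The sketch below shows one plausible form of each. The half-life parameterization, the use of tag sets for similarity, and the function names are assumptions made for illustration, not details taken from the TEMM1E source.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Intensity of a signal after `age_secs`, assuming a half-life decay
/// model: the intensity halves every `half_life_secs`.
fn decayed(initial: f64, age_secs: f64, half_life_secs: f64) -> f64 {
    initial * 0.5_f64.powf(age_secs / half_life_secs)
}

/// Jaccard similarity |A ∩ B| / |A ∪ B| between two sets
/// (for example, keyword tags of a task and of a worker).
fn jaccard<T: Eq + Hash>(a: &HashSet<T>, b: &HashSet<T>) -> f64 {
    let union = a.union(b).count();
    if union == 0 {
        return 0.0; // two empty sets: define similarity as 0
    }
    a.intersection(b).count() as f64 / union as f64
}

/// Superposition: the field strength at one task is the sum of the
/// decayed intensities of every signal left there.
/// Each signal is an `(initial_intensity, age_secs)` pair.
fn superpose(signals: &[(f64, f64)], half_life_secs: f64) -> f64 {
    signals
        .iter()
        .map(|&(initial, age)| decayed(initial, age, half_life_secs))
        .sum()
}
```

For example, with a 60-second half-life, a fresh signal of strength 1.0 plus a 60-second-old signal of strength 1.0 superpose to 1.5. None of these steps involves a model call.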

Implementation Details

Task selection is 40 lines of arithmetic built around a single scoring equation, not an LLM call:

S = Affinity^2.0 * Urgency^1.5 * (1-Difficulty)^1.0 * (1-Failure)^0.8 * Reward^1.2
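
The equation translates directly into code. The sketch below uses the exponents exactly as given above; the Signals struct, the assumption that every input is normalized to [0, 1], and the pick helper that takes the highest score are hypothetical additions for illustration.

```rust
/// Inputs to the selection score. Assumption: each field is already
/// normalized to the range [0, 1].
struct Signals {
    affinity: f64,
    urgency: f64,
    difficulty: f64,
    failure: f64,
    reward: f64,
}

/// S = Affinity^2.0 * Urgency^1.5 * (1-Difficulty)^1.0
///     * (1-Failure)^0.8 * Reward^1.2
fn score(s: &Signals) -> f64 {
    s.affinity.powf(2.0)
        * s.urgency.powf(1.5)
        * (1.0 - s.difficulty).powf(1.0)
        * (1.0 - s.failure).powf(0.8)
        * s.reward.powf(1.2)
}

/// Hypothetical helper: index of the highest-scoring candidate, if any.
fn pick(candidates: &[Signals]) -> Option<usize> {
    candidates
        .iter()
        .enumerate()
        .max_by(|a, b| score(a.1).total_cmp(&score(b.1)))
        .map(|(index, _)| index)
}
```

Because the terms are multiplied, any factor near zero (no affinity, or a task that has failed repeatedly) suppresses the whole score, and the larger exponent on Affinity makes it the most heavily weighted term.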

The project has 1,535 tests, 71 of them in the swarm crate alone, including two that demonstrate real parallelism (4 workers completing 200ms tasks in ~200ms, not ~800ms).
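
A parallelism check of that kind can be sketched as follows. This is not the project's test: it uses OS threads and thread::sleep from the standard library as a stand-in for tokio tasks, and run_parallel is a hypothetical name.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Run `workers` simulated tasks concurrently, each sleeping for `task`,
/// and return the total wall-clock time.
fn run_parallel(workers: usize, task: Duration) -> Duration {
    let start = Instant::now();
    let handles: Vec<_> = (0..workers)
        .map(|_| thread::spawn(move || thread::sleep(task)))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    start.elapsed()
}
```

If the workers really run in parallel, run_parallel(4, Duration::from_millis(200)) returns a little over 200ms. A result near 800ms would mean the tasks ran one after another.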

Built in Rust. 17 crates. Open source. MIT licensed. The research paper has every benchmark command — you can reproduce every number yourself with an API key.

Limitations and Learnings

The swarm doesn't help for single-turn tasks where the LLM handles "do these 7 things" in one response. There's no history accumulation to eliminate. It helps when tasks involve multiple tool-loop rounds where context grows — which is how real agentic work actually happens.

The team ran benchmarks on Gemini Flash Lite ($0.075 per million input tokens), Gemini Pro, and GPT-5.2. Total experiment cost: $0.04 out of a $30 budget. The full experiment report includes every scenario where the swarm lost, not just where it won.

📖 Read the full source: r/openclaw