Multi-Agent Haiku System Matches Claude Opus on Complex Number Theory Problem at 15x Lower Cost

✍️ OpenClawRadar · 📅 Published: March 7, 2026

Experimental Setup and Results

A Reddit user compared two Claude model configurations on a challenging number theory problem. The task was to prove that, for an odd prime p, the sum 1^(p-1) + 2^(p-1) + ... + (p-1)^(p-1) is congruent to -1 (mod p), using Fermat's Little Theorem and properties of primitive roots.
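The congruence itself is easy to spot-check numerically. The snippet below is a minimal sketch, not part of the original post: the helper names `is_prime` and `power_sum_mod` are illustrative, and the check only covers odd primes below 200, so it is a sanity test rather than a proof.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def power_sum_mod(p: int) -> int:
    """Return (1^(p-1) + 2^(p-1) + ... + (p-1)^(p-1)) mod p."""
    return sum(pow(k, p - 1, p) for k in range(1, p)) % p


# Check the claim for every odd prime below 200.
odd_primes = [p for p in range(3, 200) if is_prime(p)]
results = {p: power_sum_mod(p) for p in odd_primes}

# Python's % operator represents -1 (mod p) as p - 1.
all_match = all(residue == p - 1 for p, residue in results.items())
print(all_match)
```

Using the three-argument `pow(k, p - 1, p)` keeps every intermediate value below p, so the check stays fast even for larger primes.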

Two configurations were tested:

  • Config X (Opus solo): Claude Opus 4.5 with max_tokens: 2048, no auditor
  • Config Y (Haiku multi-agent): a Haiku generator produces the full proof and a second Haiku auditor checks every step, with two passes if the auditor flags anything; max_tokens: 1024 per call
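The control flow of Config Y can be sketched roughly as follows. This is an illustration under stated assumptions, not the poster's code: the function `generate_with_audit`, the callable signatures, and the reading of "two passes" as at most two generate/audit rounds are all hypothetical. The model calls are replaced by local stubs so the sketch runs without any API access.

```python
from typing import Callable, Tuple

# Hypothetical signatures: each callable would wrap one model call.
Generator = Callable[[str, str], str]              # (problem, feedback) -> proof
Auditor = Callable[[str, str], Tuple[bool, str]]   # (problem, proof) -> (verified, notes)


def generate_with_audit(problem: str, generate: Generator, audit: Auditor,
                        max_passes: int = 2) -> Tuple[str, bool, int]:
    """Run generate -> audit, regenerating with the auditor's notes when flagged.

    Returns (proof, verified, number_of_model_calls).
    """
    feedback = ""
    proof = ""
    calls = 0
    for _ in range(max_passes):
        proof = generate(problem, feedback)
        calls += 1
        verified, notes = audit(problem, proof)
        calls += 1
        if verified:
            return proof, True, calls
        feedback = notes  # hand the flagged steps back to the generator
    return proof, False, calls


# Stubs standing in for the two Haiku calls (assumption: no network access).
def stub_generate(problem: str, feedback: str) -> str:
    return "revised proof" if feedback else "first draft"


def stub_audit(problem: str, proof: str) -> Tuple[bool, str]:
    if proof == "first draft":
        return False, "step 2 is unjustified"
    return True, ""


proof, verified, calls = generate_with_audit(
    "sum of k^(p-1) mod p", stub_generate, stub_audit
)
print(proof, verified, calls)
```

Passing the generator and auditor in as plain callables keeps the loop independent of any particular client library, and the returned call count makes the extra cost of a second pass visible.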

Scoring and Performance

Both configurations scored 4/4 using this rubric:

  • Correctly invokes Fermat's Little Theorem
  • Correctly handles primitive root argument
  • Summation over complete residue system valid
  • Congruence conclusion follows correctly

The Haiku auditor returned VERIFIED with no disagreement. Performance metrics:

  • Opus solo: ~8.7 seconds, score 4/4
  • Haiku + auditor: ~10.9 seconds, score 4/4

Cost Analysis

The post's per-query cost estimates:

  • Opus solo: $0.075/1000 tokens × ~800 tokens = ~$0.06 per query
  • Haiku + Haiku: $0.0025/1000 tokens × ~1600 tokens = ~$0.004 per query

That is roughly 15x lower cost for the same 4/4 rubric score on this one problem. The post described the problem as "genuinely hard" and not training-data-obvious in the way simpler proofs are.
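The arithmetic behind the 15x figure can be reproduced in a few lines. This is a sketch using only the numbers quoted in the post; the per-1000-token rates are the post's own estimates rather than verified pricing, and the helper name `cost_per_query` is illustrative.

```python
# Figures as quoted in the post (assumption: rates are the poster's estimates).
OPUS_RATE_PER_1K = 0.075     # USD per 1000 tokens
OPUS_TOKENS = 800            # approximate tokens per query
HAIKU_RATE_PER_1K = 0.0025   # USD per 1000 tokens
HAIKU_TOKENS = 1600          # generator + auditor combined, approximate


def cost_per_query(rate_per_1k: float, tokens: int) -> float:
    """Cost in USD for one query at a flat per-1000-token rate."""
    return rate_per_1k * tokens / 1000


opus_cost = cost_per_query(OPUS_RATE_PER_1K, OPUS_TOKENS)
haiku_cost = cost_per_query(HAIKU_RATE_PER_1K, HAIKU_TOKENS)
ratio = opus_cost / haiku_cost

print(f"Opus solo:     ${opus_cost:.3f}")
print(f"Haiku + Haiku: ${haiku_cost:.3f}")
print(f"Cost ratio:    {ratio:.1f}x")
```

The two-agent setup uses about twice as many tokens, so the saving comes entirely from the 30x gap in the quoted per-token rates.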

The source notes that on clean problems where Fermat's Little Theorem does the heavy lifting (each a^(p-1) ≡ 1, summing gives (p-1) ones, and p-1 ≡ -1), the auditor pattern adds a time tax, which the post puts at about 17%, to confirm correctness. The latencies reported above (~8.7 seconds versus ~10.9 seconds) work out to roughly 25% more wall-clock time. The pattern is particularly valuable for problems where the generator might stumble with quantization stutter or hallucinated algebra.
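Written out, the short argument in the parentheses above is the following two-line derivation. It is a reconstruction from the article's summary, not the model output from the post, and it uses only Fermat's Little Theorem; the primitive-root route mentioned in the problem statement is not needed for the exponent p-1.

```latex
\begin{align*}
a^{p-1} &\equiv 1 \pmod{p}
  && \text{for } 1 \le a \le p-1 \text{, since } p \nmid a \\
\sum_{a=1}^{p-1} a^{p-1} &\equiv \sum_{a=1}^{p-1} 1 = p - 1 \equiv -1 \pmod{p}
  && \text{summing the } p-1 \text{ congruences}
\end{align*}
```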

📖 Read the full source: r/ClaudeAI
