Caveman vs 'be brief' prompt: benchmarking compression prompts for Claude

✍️ OpenClawRadar · 📅 Published: April 29, 2026

A developer benchmarked caveman (the popular shorthand compression prompt) against the simple prompt 'be brief.' to see if the extra complexity actually pays off. The test ran 24 dev prompts across 6 categories, comparing 5 arms: baseline, 'be brief.', caveman lite, caveman full, and caveman ultra. Outputs were judged by a separate Claude instance using per-prompt rubrics.
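
The setup described above (several prompt arms, a shared set of dev prompts, a rubric-based judge) can be sketched roughly as follows. This is a minimal illustration, not the author's harness: `run_benchmark`, the `generate` and `judge` callables, and their signatures are all hypothetical, and the actual caveman prompt texts are not reproduced here.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Hypothetical interfaces (assumptions, not taken from the real harness):
#   generate(system_prompt, user_prompt) -> (output_text, token_count)
#   judge(user_prompt, rubric, output_text) -> score between 0 and 1
Generate = Callable[[str, str], Tuple[str, int]]
Judge = Callable[[str, str, str], float]


def run_benchmark(
    arms: Dict[str, str],            # arm name -> system prompt for that arm
    cases: List[Tuple[str, str]],    # (dev prompt, rubric for that prompt)
    generate: Generate,
    judge: Judge,
) -> Dict[str, Dict[str, float]]:
    """Run every arm over every case and report mean score and mean tokens."""
    results = {}
    for arm, system_prompt in arms.items():
        scores, tokens = [], []
        for user_prompt, rubric in cases:
            output, n_tokens = generate(system_prompt, user_prompt)
            # The judge is a separate model call that sees the per-prompt rubric.
            scores.append(judge(user_prompt, rubric, output))
            tokens.append(n_tokens)
        results[arm] = {"mean_score": mean(scores), "mean_tokens": mean(tokens)}
    return results
```

Passing `generate` and `judge` in as plain callables keeps the sketch independent of any particular model client, so stubs can stand in for real API calls when testing the loop itself.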

Benchmark results

  • Baseline: mean score 0.985, mean tokens 636
  • 'be brief.': mean score 0.985, mean tokens 419
  • Caveman lite: mean score 0.976, mean tokens 401
  • Caveman full: mean score 0.975, mean tokens 404
  • Caveman ultra: mean score 0.970, mean tokens 449
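
To make the compression comparison concrete, the figures above can be turned into token savings relative to the baseline. The snippet below is a small illustrative calculation using only the numbers listed; the `token_reduction` helper is a hypothetical name, not part of the author's harness.

```python
# (mean score, mean tokens) per arm, copied from the results list above.
results = {
    "baseline": (0.985, 636),
    "'be brief.'": (0.985, 419),
    "caveman lite": (0.976, 401),
    "caveman full": (0.975, 404),
    "caveman ultra": (0.970, 449),
}


def token_reduction(results, baseline="baseline"):
    """Return the fraction of tokens saved by each arm relative to the baseline arm."""
    base_tokens = results[baseline][1]
    return {arm: 1 - tokens / base_tokens for arm, (_, tokens) in results.items()}


reductions = token_reduction(results)
for arm, saved in reductions.items():
    score = results[arm][0]
    print(f"{arm:14s} score={score:.3f} tokens saved={saved:.1%}")
```

On these numbers, 'be brief.' saves about 34% of tokens, while the caveman arms range from roughly 29% (ultra) to 37% (lite).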

The two-word prompt matched caveman on both compression and quality: 'be brief.' cut mean tokens from 636 to 419 with no drop in mean score, while the caveman arms landed between 401 and 449 tokens at slightly lower scores (0.970 to 0.976). Caveman's value lies elsewhere: consistent output structure, mode switching, and a safety escape on destructive operations. That safety escape, however, introduced significant variance in output quality, which may be a concern for some use cases.

Full breakdown with per-category data and variance findings on safety questions is available at the author's site. The benchmark harness is open source on GitHub.

📖 Read the full source: r/ClaudeAI
