Leanstral: Open-Source Code Agent for Lean 4 and Formal Proof Engineering

What Leanstral Is
Leanstral is an open-source code agent designed specifically for Lean 4, a proof assistant capable of expressing complex mathematical objects and software specifications. Unlike existing proving systems that wrap large generalist models, Leanstral is trained to operate in realistic formal repositories, and it does so with only 6B active parameters.
Key Technical Details
The model uses a highly sparse architecture optimized for proof engineering tasks. It leverages parallel inference with Lean as a verifier, making it both performant and cost-efficient. Leanstral supports arbitrary MCPs through Mistral Vibe and was specifically trained to achieve maximal performance with the frequently used lean-lsp-mcp.
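The idea of parallel inference with a verifier can be sketched in a few lines of Python. This is a minimal illustration only, not Leanstral's actual pipeline: `first_verified`, `toy_generate`, and `toy_verify` are hypothetical names, and the toy verifier merely rejects any candidate containing `sorry`, whereas a real setup would have Lean check each candidate.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def first_verified(
    generate: Callable[[int], str],
    verify: Callable[[str], bool],
    k: int,
) -> Optional[str]:
    """Sample k candidates in parallel and return the first one the verifier accepts."""
    with ThreadPoolExecutor(max_workers=k) as pool:
        # pool.map preserves input order, so results line up with seeds 0..k-1.
        candidates = list(pool.map(generate, range(k)))
    for candidate in candidates:
        if verify(candidate):
            return candidate
    return None


# Hypothetical stand-ins: a real generator would call the model, and a real
# verifier would ask Lean to check the candidate proof.
def toy_generate(seed: int) -> str:
    if seed == 1:
        return "theorem t : 1 + 1 = 2 := rfl"
    return "theorem t : 1 + 1 = 2 := sorry"


def toy_verify(proof: str) -> bool:
    return "sorry" not in proof
```

Because the verifier gives a definite accept or reject, extra samples can only help: one accepted candidate among k is enough, which is why scores are reported per number of passes.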
Performance Benchmarks
Leanstral was evaluated using FLTEval, a new evaluation suite focused on realistic proof engineering scenarios rather than isolated mathematical problems. The benchmark measures how well a model completes formal proofs and correctly defines new mathematical concepts in pull requests to the FLT project.
Against Open-Source Models
- Leanstral-120B-A6B achieves a score of 26.3 with pass@2 (2 inference passes)
- GLM5-744B-A40B caps at approximately 16.6
- Kimi-K2.5-1T-32B caps at approximately 20.1
- Qwen3.5-397B-A17B requires 4 passes to reach 25.4
- Leanstral keeps improving with additional passes, reaching 29.3 at pass@4 and 31.9 at pass@16
Against Claude Family
- Leanstral pass@2 (score 26.3) beats Sonnet (23.7) by 2.6 points
- Cost: Leanstral $36 vs. Sonnet $549
- Leanstral pass@16 reaches 31.9, beating Sonnet by 8.2 points
- Claude Opus 4.6 leads with 39.6 but costs $1,650 (92× the cost of a single Leanstral pass at roughly $18, or about 46× the $36 pass@2 figure)
- Haiku scores 23.0 at $184
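The score and cost figures above can be cross-checked with a short calculation. The sketch below uses only the numbers quoted in this section and takes Leanstral pass@2 (score 26.3, $36) as the baseline; the helper names `cost_ratio` and `score_gap` are illustrative.

```python
# (score, cost in USD) as quoted in the comparison above.
results = {
    "Leanstral pass@2": (26.3, 36),
    "Haiku": (23.0, 184),
    "Sonnet": (23.7, 549),
    "Opus 4.6": (39.6, 1650),
}

base_score, base_cost = results["Leanstral pass@2"]


def cost_ratio(name: str) -> float:
    """How many times more expensive a model is than Leanstral pass@2."""
    return results[name][1] / base_cost


def score_gap(name: str) -> float:
    """Score difference relative to Leanstral pass@2 (negative means lower)."""
    return round(results[name][0] - base_score, 1)


for name in results:
    print(f"{name}: {cost_ratio(name):.1f}x cost, {score_gap(name):+.1f} points")
```

On these figures, Sonnet costs about 15 times as much as Leanstral pass@2 while scoring 2.6 points lower, and Opus 4.6 scores 13.3 points higher at roughly 46 times the cost.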
Case Study Example
When presented with a real-world question from Proof Assistants Stack Exchange about a script that stopped compiling in Lean 4.29.0-rc6, Leanstral built test code to reproduce the failing environment. It diagnosed that the definition def T2 := List Bool was preventing the rw tactic from matching patterns: a plain def is not unfolded automatically during matching, so terms that were only definitionally equal to List Bool did not match. The proposed fix was to replace def with abbrev, since abbrev creates a transparent alias.
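The difference between def and abbrev can be seen in a small Lean 4 sketch. This is not the script from the Stack Exchange question, which is not reproduced here. It shows the same reducibility distinction through instance search, where the effect is easy to observe. The name T2' is introduced only for this illustration.

```lean
-- Plain definition: not unfolded at reducible transparency.
def T2 := List Bool

-- Reducible alias: unfolded automatically by instance search and by
-- tactics that only unfold reducible definitions.
abbrev T2' := List Bool

-- Both are definitionally equal to List Bool at default transparency.
example : T2 = List Bool := rfl
example : T2' = List Bool := rfl

-- Instance search sees through the abbrev and finds the List instance.
example : Inhabited T2' := inferInstance

-- Expected to fail if uncommented, because T2 is not unfolded:
-- example : Inhabited T2 := inferInstance
```

Switching a type alias from def to abbrev is therefore a small change with a wide effect: everything that relies on reducible unfolding starts treating the alias as the underlying type.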
Availability
Leanstral's weights are released under the Apache 2.0 license. The model is also available in agent mode within Mistral Vibe and through a free API endpoint. A tech report detailing the training approach will also be released.
📖 Read the full source: HN AI Agents