100K Lines of Rust with AI: Contracts, Spec-Driven Dev, and Performance

Cheng Huang spent ~6 weeks building a Rust-based multi-Paxos consensus engine designed to modernize Azure's Replicated State Library (RSL). The project comprises over 130K lines of Rust code (~100K written by AI agents in 4 weeks, followed by 3 weeks of optimization), and the optimization work raised throughput from 23K to 300K operations per second.
Huang used multiple AI coding agents: GitHub Copilot, Claude Code, Codex CLI, Augment Code, Kiro, and Trae. His primary setup now is Claude Code + Codex CLI from the terminal, with VS Code only for diffs and minor edits. He maintains two ChatGPT subscriptions to handle rate limits (one Mon-Wed, one Thu-Sun).
Code Contracts — Written by AI
The core correctness strategy is AI-generated code contracts: preconditions, postconditions, and invariants specified for critical functions and converted into runtime asserts during testing. Huang found that GPT-5 High writes excellent contracts; Opus 4.1 is good but requires more review. For example, the process_2a method (which handles Paxos phase 2a messages) has 16 contracts. The contracts are then used to generate targeted test cases and property-based tests that explore randomized inputs. One contract caught a subtle Paxos safety violation that could have caused replication consistency issues.
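To make the contract idea concrete, here is a minimal sketch of contracts as runtime asserts plus a randomized check. It is an illustration under stated assumptions, not Huang's code: the `Acceptor` type, its fields, the three contracts shown, and the `check_random_2a` helper are all hypothetical, and the real process_2a reportedly carries 16 contracts. A hand-rolled xorshift generator stands in for a property-testing library so the block needs no external crates.

```rust
use std::collections::BTreeMap;

/// Hypothetical, heavily simplified acceptor state (not the RSL data model).
#[derive(Debug, Default)]
pub struct Acceptor {
    /// Highest ballot this acceptor has promised to honor.
    promised: u64,
    /// Log slot -> (ballot, value) most recently accepted for that slot.
    accepted: BTreeMap<u64, (u64, u64)>,
}

impl Acceptor {
    /// Invariant: no accepted ballot is higher than the promised ballot.
    pub fn invariant_holds(&self) -> bool {
        self.accepted.values().all(|(b, _)| *b <= self.promised)
    }

    /// Handles a phase 2a (accept) request; returns true if it was accepted.
    pub fn process_2a(&mut self, ballot: u64, slot: u64, value: u64) -> bool {
        // Preconditions (assumed for this sketch): ballot 0 means "no ballot".
        debug_assert!(ballot > 0, "pre: ballot must be non-zero");
        debug_assert!(self.invariant_holds(), "pre: invariant must hold on entry");
        let old_promised = self.promised;

        let accept = ballot >= self.promised;
        if accept {
            self.promised = ballot;
            self.accepted.insert(slot, (ballot, value));
        }

        // Postconditions: checked in debug and test builds only.
        debug_assert!(self.promised >= old_promised, "post: promise never decreases");
        debug_assert!(
            !accept || self.accepted.get(&slot) == Some(&(ballot, value)),
            "post: an accepted request is recorded for its slot"
        );
        debug_assert!(self.invariant_holds(), "post: invariant must hold on exit");
        accept
    }
}

/// Tiny deterministic xorshift64 generator, used instead of an external crate.
fn next_rand(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

/// Feeds random 2a messages to a fresh acceptor and re-checks the contract
/// properties explicitly, so the check also works when debug asserts are off.
pub fn check_random_2a(seed: u64, steps: usize) -> Result<(), String> {
    let mut rng = seed.max(1); // xorshift must not start from zero
    let mut acceptor = Acceptor::default();
    for step in 0..steps {
        let ballot = next_rand(&mut rng) % 8 + 1;
        let slot = next_rand(&mut rng) % 4;
        let value = next_rand(&mut rng);
        let before = acceptor.promised;
        let accepted = acceptor.process_2a(ballot, slot, value);
        if accepted != (ballot >= before) {
            return Err(format!("step {}: wrong accept decision", step));
        }
        if acceptor.promised < before {
            return Err(format!("step {}: promised ballot decreased", step));
        }
        if !acceptor.invariant_holds() {
            return Err(format!("step {}: invariant violated", step));
        }
    }
    Ok(())
}
```

Calling `check_random_2a(42, 1000)` from a unit test exercises the contracts across many interleavings of ballots and slots. Using `debug_assert!` keeps the checks out of release builds, which matches the "runtime asserts during testing" framing; a real project might instead use a contracts or property-testing crate.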
Lightweight Spec-Driven Development
Huang initially tried a rigid spec-driven approach: requirement markdown → design markdown → task list markdown. He found it too inflexible for iterative changes. He now uses a lighter-touch form of spec-driven development (SDD): start with a concise spec, let AI generate code, then refine contracts and tests iteratively. The full system includes 1,300+ tests, spanning unit tests, integration tests, and multi-replica failure injection tests.
Performance Optimization
The optimization phase (3 weeks) boosted throughput from 23K to 300K ops/sec. The key architectural changes were pipelining (new requests no longer wait for in-flight votes to complete), support for non-volatile memory (NVM) to reduce commit time, and RDMA awareness for modern Azure datacenter hardware.
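The pipelining change can be pictured with a small sketch. This is a hypothetical leader-side bookkeeping structure written for illustration, not the engine's actual design: the `Pipeline` type, its fields, and its methods are assumptions. It shows the core idea that proposals get a log slot immediately while votes for earlier slots are still outstanding, and that the commit index advances only over a contiguous prefix of slots that have reached quorum. For brevity it counts votes without deduplicating voters.

```rust
use std::collections::BTreeMap;

/// Hypothetical leader-side view of in-flight log slots.
pub struct Pipeline {
    /// Votes needed to commit a slot (majority of replicas).
    quorum: usize,
    /// Next log slot to hand out.
    next_slot: u64,
    /// Every slot below this index is committed.
    commit_index: u64,
    /// In-flight slot -> number of votes received so far.
    votes: BTreeMap<u64, usize>,
}

impl Pipeline {
    pub fn new(replicas: usize) -> Self {
        Pipeline {
            quorum: replicas / 2 + 1,
            next_slot: 0,
            commit_index: 0,
            votes: BTreeMap::new(),
        }
    }

    /// Assigns a slot and returns at once; the caller can send the 2a
    /// message without waiting for earlier slots to collect their votes.
    pub fn propose(&mut self) -> u64 {
        let slot = self.next_slot;
        self.next_slot += 1;
        self.votes.insert(slot, 1); // the leader counts its own vote
        slot
    }

    /// Records one follower vote and returns the updated commit index.
    /// Simplification: does not check that each voter is counted once.
    pub fn on_vote(&mut self, slot: u64) -> u64 {
        if let Some(count) = self.votes.get_mut(&slot) {
            *count += 1;
        }
        // Commit in log order: stop at the first slot still short of quorum.
        while self
            .votes
            .get(&self.commit_index)
            .map_or(false, |count| *count >= self.quorum)
        {
            self.votes.remove(&self.commit_index);
            self.commit_index += 1;
        }
        self.commit_index
    }

    /// Number of proposed slots that are not yet committed.
    pub fn in_flight(&self) -> usize {
        self.votes.len()
    }
}
```

With three replicas, proposing slots 0, 1, and 2 back to back and then receiving a vote for slot 1 before slot 0 leaves the commit index at 0; the later vote for slot 0 commits both slots in one step. Votes can arrive out of order while commits stay in log order.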
What's Next
Huang would like better AI support for generating property-based tests from contracts, and smoother handling of breaking changes in codebases above 100K lines.
📖 Read the full source: HN AI Agents