Deterministic vs Probabilistic Code Generation: Why Bun's Vibe-Coded Rust Conversion Raises Red Flags

Noah Hall, writing for The Tech Enabler, draws a sharp line between deterministic and probabilistic code generation. He uses Bun's recent vibe-coded conversion of a million-line codebase from Zig to Rust as a cautionary tale. His core argument: deterministic systems produce consistent, reviewable results; LLMs introduce uncertainty that makes code review impossible at scale.
Deterministic Code Generation
Hall points to established deterministic tooling: Python's 2to3 for Python 2→3 migration, and transpilers for languages like Elm, PureScript, and TypeScript that always produce the same JavaScript. His own languages follow the same model: Derw can output JavaScript, TypeScript, or English; Tegan outputs JavaScript or Go; Mojie targets JavaScript, Python, or English. All are based on AST-to-AST transformation, so the same input always yields the same output. That consistency is what makes bugs tractable: "If a bug is consistent, we can fix it. If a bug is inconsistent, it becomes exponentially more difficult to fix."
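To make the "same input, same output" property concrete, here is a minimal sketch of a deterministic code generator. The tiny AST, `emitExpr`, and `emitJavaScript` are hypothetical names invented for illustration; they are not taken from Derw, Tegan, Mojie, or any other tool mentioned above.

```typescript
// Hypothetical mini-AST for illustration only.
type Expr =
  | { kind: "num"; value: number }
  | { kind: "var"; name: string }
  | { kind: "add"; left: Expr; right: Expr };

interface FunctionDecl {
  name: string;
  params: string[];
  body: Expr;
}

// A pure function of the AST: no randomness, no clock, no external state.
function emitExpr(expr: Expr): string {
  switch (expr.kind) {
    case "num":
      return String(expr.value);
    case "var":
      return expr.name;
    case "add":
      return `(${emitExpr(expr.left)} + ${emitExpr(expr.right)})`;
  }
}

function emitJavaScript(decl: FunctionDecl): string {
  const params = decl.params.join(", ");
  return `function ${decl.name}(${params}) { return ${emitExpr(decl.body)}; }`;
}

const addDecl: FunctionDecl = {
  name: "add",
  params: ["a", "b"],
  body: {
    kind: "add",
    left: { kind: "var", name: "a" },
    right: { kind: "var", name: "b" },
  },
};

// Generating twice from the same AST gives byte-identical output.
const firstRun = emitJavaScript(addDecl);
const secondRun = emitJavaScript(addDecl);
console.log(firstRun); // function add(a, b) { return (a + b); }
console.log(firstRun === secondRun); // true
```

Because the output depends only on the input tree, a bug in `emitExpr` would show up the same way on every run, which is the property Hall's quote relies on.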
Probabilistic Code Generation
LLMs vary output each run — sometimes A, sometimes B. Hall created neuro-lingo three years ago as a parody: humans write only function signatures and comments, and LLMs generate the implementation fresh each compilation. An example:
function add(a: number, b: number): number {
  // Add two numbers together
}
function main() {
  // Print "Hello World" to the console
  // Print the result of add(2, 3)
}
"Every time neuro-lingo is compiled, the code is generated from fresh by the LLMs. It's slightly different each time. Sometimes it introduces bugs. Sometimes it's clean and simple. Sometimes it's chaotic." Hall argues that fully AI-driven code flows are doing exactly this, but shipping to production with human accountability.
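The run-to-run variation can be imitated with a small sketch. This is not how neuro-lingo or any LLM works internally: `candidateBodies`, `generateAddBody`, and `compileAdd` are hypothetical, and a random pick from a fixed list stands in for model sampling, purely to show that the same signature and comment can compile to different behaviour.

```typescript
// Hypothetical stand-in for an LLM: each "compilation" picks one body.
// A real model samples tokens; this only imitates run-to-run variation.
const candidateBodies: string[] = [
  "return a + b;",
  "const sum = a + b; return sum;",
  "return a - b;", // plausible-looking but buggy candidate
];

// The random source is injectable so the behaviour can be pinned in tests.
function generateAddBody(random: () => number = Math.random): string {
  const index = Math.floor(random() * candidateBodies.length);
  return candidateBodies[index];
}

function compileAdd(random?: () => number): (a: number, b: number) => number {
  const body = generateAddBody(random);
  return new Function("a", "b", body) as (a: number, b: number) => number;
}

// Ten "compilations" of the same signature and comment.
const observed = new Set<number>();
for (let i = 0; i < 10; i++) {
  observed.add(compileAdd()(2, 3));
}
console.log(Array.from(observed)); // contents vary between runs
```

Some runs will report only 5, others will also include -1. A test suite that happened to run against a correct sample says nothing about the next one, which is why inconsistent bugs are so much harder to pin down.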
The "There Are Tests" Fallacy
Tests alone can't guarantee quality. Hall cites SQLite as the most tested codebase: 155.8 KSLOC of C code vs. 92,053.1 KSLOC of test code (about 590 times as much). Despite 100% branch coverage, millions of test cases, and extensive harnesses, SQLite still relies on human review. Hall holds Bun's merge to the same standard: "It is not possible for a human to review 1 million lines of changes in 9 days. Bun has not reviewed the code they have merged to master."
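The figures in this section can be checked with a few lines of arithmetic. The KSLOC counts, the 1 million lines, and the 9 days come from the text above; the 8-hour review day is an assumption added here only to give a sense of scale.

```typescript
// Figures quoted in the article summary.
const sqliteCodeKsloc = 155.8;
const sqliteTestKsloc = 92053.1;
const testToCodeRatio = sqliteTestKsloc / sqliteCodeKsloc;

const changedLines = 1000000;
const reviewDays = 9;
const linesPerDay = changedLines / reviewDays;

// Assumption for illustration only: one reviewer reading 8 hours a day.
const assumedHoursPerDay = 8;
const linesPerHour = linesPerDay / assumedHoursPerDay;

console.log(Math.floor(testToCodeRatio)); // 590
console.log(Math.round(linesPerDay)); // 111111
console.log(Math.round(linesPerHour)); // 13889
```

Under that assumption a single reviewer would need to read roughly 13,900 lines every hour for nine days straight, which illustrates why Hall treats the merge as unreviewed.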
Hall concludes that deterministic code generation still needs validation, and probabilistic generation creates risk that scales with line count. The source article goes deeper on each example.
📖 Read the full source: HN AI Agents