AI Agents Are Killing Code Review — The Principal-Agent Problem Explained

The industry-standard code review process — review-then-commit, popularized by GitHub PRs — was designed for low-trust collaboration. A human makes a change, another human reviews it, iterations occur, and the change lands. This worked because reviewers could cheaply infer effort and understanding from reading the code. AI agents break that completely.
The agent-in-the-middle disaster
The best case with AI agents: a human prompts a machine to write code, the human reviews it, then sends it to a second human for traditional review. That doubles the review burden. Worse, agents increase total change volume. The result: review bandwidth is exhausted before even a fraction of the agent productivity gains materialize.
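The arithmetic behind that claim can be made concrete with a back-of-envelope model. The sketch below is illustrative only: the function name, the change counts, and the 20-minute review time are assumptions made for the example, not figures from the source.

```python
def review_minutes(changes_per_week: int, review_passes: int, minutes_per_pass: int) -> int:
    """Total human review time per week, in minutes."""
    return changes_per_week * review_passes * minutes_per_pass


# Hypothetical baseline: 10 human-written changes, each reviewed once by a peer.
baseline = review_minutes(changes_per_week=10, review_passes=1, minutes_per_pass=20)

# Hypothetical agent workflow: the driver reviews, then a peer reviews (2 passes),
# and the agent triples the number of changes landing each week.
with_agents = review_minutes(changes_per_week=30, review_passes=2, minutes_per_pass=20)

print(baseline, with_agents, with_agents / baseline)  # 200 1200 6.0
```

Under these assumed numbers, doubling the review passes and tripling the change volume multiplies the review load by six, while the number of available reviewers stays the same.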
But reality is worse. The actual pattern is: a human types a short prompt, lightly QAs the output, packages it as a PR, and then funnels reviewer comments back to the agent for fixes. This is a textbook principal-agent problem: the reviewer (principal) can no longer infer effort or understanding from the code, because the code was generated by a machine. The human driving the agent has no incentive to actually read the code or think critically about reviewer feedback. They spend 5 minutes and generate serious review load for another engineer.
This is what's killing open source — "slop PRs" from people who have no understanding of the project, its constraints, or its tools.
A way forward for small teams
For small high-trust teams, there is a simpler process: human prompts agent → human reviews the code → human deploys directly (no second reviewer). The human driving the machine takes full responsibility by owning the deployment. The principal-agent problem disappears because the human is both the driver and the deployer.
At exe.dev, a team of nine uses this approach successfully. Key practices: write far more integration and e2e tests, build agent-based workflows to analyze commits for safety/performance/usability bugs, and ensure a human is always accountable for the final deploy.
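The accountability rule in particular lends itself to an automated gate. The following is a minimal sketch under stated assumptions, not exe.dev's actual tooling: the `Change` record, its fields, and the `can_deploy` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    """Hypothetical record describing one agent-generated change."""
    commit: str
    driver: Optional[str]   # human who prompted the agent, if recorded
    driver_reviewed: bool   # the driver confirms they read the diff
    tests_passed: bool      # result of the integration / e2e suite


def can_deploy(change: Change, deployer: str) -> bool:
    """Allow a deploy only when the human who drove the agent also
    reviewed the code, the tests pass, and that same human deploys."""
    return (
        change.driver is not None
        and change.driver == deployer
        and change.driver_reviewed
        and change.tests_passed
    )


ok = Change("abc123", driver="alice", driver_reviewed=True, tests_passed=True)
print(can_deploy(ok, deployer="alice"))  # True
print(can_deploy(ok, deployer="bob"))    # False
```

The design choice here is that the gate checks identity, not just approval: a change cannot ship unless the person deploying it is the same person who drove the agent and read the output.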
The traditional code review model is not salvageable with agents. Small teams can adapt; large organizations and open source projects face a harder structural problem.
📖 Read the full source: HN AI Agents