Stanford Study: Law Professors Prefer AI Answers Over Peers 75% of the Time

A Stanford Law School study led by Professor Julian Nyarko found that law professors overwhelmingly prefer AI-generated answers to student questions over responses written by fellow instructors. In a blind evaluation of nearly 3,000 anonymized comparisons across 16 U.S. law schools, AI responses won 75% of head-to-head matchups against peer-written answers.
Study Design & Results
The study, titled Law Professors Prefer AI Over Peer Answers, focused on contract law. Participants created 40 representative questions that students might ask after class or during office hours. Professors wrote their own answers, then evaluated responses without knowing whether they came from AI or other professors. The AI systems performed comparably to the best human instructor in the study.
Key findings:
- AI won 75% of head-to-head comparisons against peer answers
- AI responses were flagged as pedagogically harmful only 3.5% of the time
- Peer-written answers were flagged as harmful 12% of the time
- Evaluations focused on nuanced legal reasoning, not factual recall
Implications for Legal Education
“This study challenges important assumptions about AI’s role in legal education,” Nyarko said. “We focused on law precisely because it requires judgment, nuanced reasoning, and the ability to navigate ambiguity—not just factual recall.”
The research also examined specific AI models including commercial tutoring systems and Google’s NotebookLM, finding varying levels of performance. Even when context limitations affected AI responses, professors still frequently preferred them to human-written alternatives.
Co-author Sarath Sanga from Yale Law School noted: “In most fields where AI gets tested, there’s a right answer. In law, there often isn’t. Two opposing arguments can both be good.”
The study is notable because most previous AI evaluations focused on subjects with clear right-or-wrong answers, whereas legal reasoning demands careful analysis of competing arguments and defensible conclusions.
Cautions & Open Questions
Nyarko cautioned against wholesale adoption: “How to implement these tools to most effectively improve student learning is still an open question.” The study evaluated answer quality but noted that implementation challenges such as hallucinations, overreliance, and erosion of critical thinking skills remain.
📖 Read the full source: HN AI Agents