
Law Professors Rate AI Answers Higher in Blinded Study
According to a Stanford Law School press release and a working-paper draft by Julian Nyarko et al., a blinded study of short-answer tutoring in contracts courses found that answers from large language models beat peer-written answers in head-to-head comparisons. Sixteen contracts professors from fourteen U.S. law schools authored 40 representative questions and judged 2,918 anonymized pairwise comparisons. The paper draft quoted in Reason reports an average LLM win rate of 75.33%, which Stanford summarized as AI winning 75% of matchups. The draft also reports that professors flagged LLM responses as pedagogically harmful in 3.53% of cases, versus 12.06% for peer-written answers. "We were frankly surprised by the magnitude of the results," Julian Nyarko said in Stanford's press release.

Editorial analysis: This experiment tests LLMs on open-ended legal reasoning rather than single-answer tasks, which raises practical questions for legal pedagogy and evaluation methods.
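For readers unfamiliar with the metric, a win rate over pairwise comparisons is simply the share of matchups in which one side's answer was preferred. The sketch below is illustrative only: the `win_rate` function, the label names, and the choice to exclude ties are assumptions made for this example, not the method described in the working paper, and the toy labels are made up rather than taken from the study.

```python
from collections import Counter


def win_rate(judgments, side="llm"):
    """Return the share of decided comparisons won by `side`.

    Assumption for this sketch: comparisons labeled "tie" are dropped
    from the denominator. The study may handle ties differently.
    """
    counts = Counter(judgments)
    decided = sum(n for label, n in counts.items() if label != "tie")
    if decided == 0:
        raise ValueError("no decided comparisons")
    return counts[side] / decided


# Made-up labels for illustration only; not data from the study.
toy = ["llm", "llm", "peer", "llm", "tie", "peer", "llm", "llm"]
rate = win_rate(toy)
print(f"{rate:.2%}")  # 5 wins out of 7 decided comparisons -> 71.43%
```

An average win rate like the one reported in the draft could also be computed per model or per judge and then averaged, which can give a slightly different figure than pooling all comparisons.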
