Generative AI Changes Constructed Response Scoring Validity

Researchers led by Jodi Casabianca submitted a paper on March 1, 2026, examining the use of generative AI for scoring constructed responses in high-stakes testing. Comparing human ratings, feature-based NLP scoring, and generative models, they find that generative AI requires more extensive validity evidence because of concerns about its opacity and consistency. The study analyzes a large corpus of argumentative essays from grades 6–12 and proposes best practices for collecting that evidence.
Scoring Rationale
The paper provides timely, actionable guidance for high-stakes automated scoring. Its strength is the empirical corpus analysis; its main limitation is that it is a preprint still pending peer review.
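One common piece of validity evidence when comparing automated scores against human ratings (the paper itself may use different metrics; this is a standard choice in automated essay scoring, not a claim about the study's method) is quadratic weighted kappa, which measures agreement on an ordinal rubric while penalizing disagreements by squared distance. A minimal sketch, assuming integer scores on a known rubric range:

```python
from collections import Counter

def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Agreement between two score sequences on an ordinal scale,
    penalizing disagreements by squared distance between score levels."""
    n_levels = max_rating - min_rating + 1
    n = len(rater_a)
    # Observed confusion matrix of score pairs
    observed = [[0.0] * n_levels for _ in range(n_levels)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1
    # Marginal score histograms, used for chance-expected agreement
    hist_a = Counter(a - min_rating for a in rater_a)
    hist_b = Counter(b - min_rating for b in rater_b)
    numerator = 0.0
    denominator = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            weight = (i - j) ** 2 / (n_levels - 1) ** 2
            expected = hist_a[i] * hist_b[j] / n  # chance co-occurrence
            numerator += weight * observed[i][j]
            denominator += weight * expected
    return 1.0 - numerator / denominator

# Hypothetical human vs. machine scores on a 1-4 rubric
human = [1, 2, 2, 3, 4, 3, 2, 1]
machine = [1, 2, 3, 3, 4, 2, 2, 1]
print(round(quadratic_weighted_kappa(human, machine, 1, 4), 3))  # → 0.867
```

A value of 1.0 indicates perfect agreement and 0.0 indicates chance-level agreement; testing programs often treat human-machine kappa at or near human-human kappa as supporting evidence, though the paper argues generative models need validity evidence beyond such agreement statistics.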


