Centaur Model Exhibits Overfitting, Fails Instruction Understanding

Zhejiang University researchers published a study in National Science Open (2025) reporting that the AI model Centaur, presented earlier in Nature (July 2025) as simulating human cognition across 160 tasks, likely achieved its high performance through overfitting. When they overrode the task prompts with the instruction "Please choose option A," Centaur ignored the instruction and reproduced the answer patterns it had been trained on. The result points to a language-understanding bottleneck and cautions against equating LLM performance with genuine cognitive simulation.
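The override test can be illustrated with a small compliance check. This is a minimal sketch, not the researchers' code: the `compliance_rate` function, the two toy models, and the choice to replace the whole task text with the instruction are all assumptions made for illustration.

```python
from typing import Callable, Iterable

# Instruction quoted in the summary; everything else here is hypothetical.
OVERRIDE_INSTRUCTION = "Please choose option A"


def compliance_rate(model: Callable[[str], str], task_prompts: Iterable[str]) -> float:
    """Return the fraction of trials in which the model answers "A"
    after each task prompt is swapped for the override instruction."""
    total = 0
    followed = 0
    for _ in task_prompts:
        # Assumption: the original task text is discarded entirely and
        # only the override instruction is sent to the model.
        answer = model(OVERRIDE_INSTRUCTION).strip().upper()
        if answer == "A":
            followed += 1
        total += 1
    if total == 0:
        raise ValueError("no task prompts supplied")
    return followed / total


def obedient_model(prompt: str) -> str:
    # Toy model that reads the prompt and follows the instruction.
    return "A" if "choose option A" in prompt else "B"


def memorizing_model(prompt: str) -> str:
    # Toy model that ignores the prompt and repeats a memorized answer.
    return "B"


if __name__ == "__main__":
    tasks = ["task 1 text", "task 2 text", "task 3 text"]
    print("obedient:", compliance_rate(obedient_model, tasks))
    print("memorizing:", compliance_rate(memorizing_model, tasks))
```

A model that actually reads its prompt should score near 1.0 on such a check. A score near 0.0, with answers that still match the original task's response pattern, is the kind of behavior the study attributes to Centaur.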
Scoring Rationale
Highlights a critical evaluation flaw in high-profile cognitive LLMs, but findings are limited to cognitive-modeling research contexts.

