Large Language Models Misinterpret Human Probability Terms

Researchers published a study in npj Complexity showing that large language models often diverge from humans in how they interpret uncertainty words such as "maybe" and "likely." Agreement holds at the extremes but breaks down sharply on hedge terms, and the models' interpretations vary with gendered prompts and with language (English vs. Chinese). This miscalibration could affect high-stakes domains such as healthcare and policy, where verbal probability terms convey critical risk information.
Scoring Rationale
Peer-reviewed empirical evidence of widespread miscalibration, but limited immediate tooling or prescriptive mitigation guidance for practitioners.


