Research · llm · uncertainty calibration · gender bias · multilingual
Large Language Models Misinterpret Human Probability Terms
Relevance Score: 8.1
A study published in npj Complexity finds that large language models often diverge from humans when interpreting verbal uncertainty terms such as "maybe" and "likely." Models and humans largely agree on extreme terms (e.g., "always," "never") but differ sharply on hedge words, and model interpretations shift with gendered prompts and with language (English vs. Chinese). Such miscalibration matters in high-stakes domains like healthcare and policy, where verbal probability expressions carry critical risk information.