Researchers Define Indices Measuring LLM Rebuttal Behavior

A Jan. 2, 2026 preprint presents a systematic framework of indices to characterize large language model (LLM) responses to deliberate rebuttals during chat. The authors introduce a fictitious-response (FR) rebuttal method applied to multiple-choice physics problems across several OpenAI models, quantifying sycophantic and stubborn behaviors and showing newer models and higher "Reasoning Effort" reduce sycophancy. The method is generalizable to other multiple-choice tasks and enables systematic model comparisons.
Scoring Rationale
Novel methodological contribution with actionable indices, but limited empirical validation across only two physics scenarios and OpenAI models.
Practice with real Logistics & Shipping data
90 SQL & Python problems · 15 industry datasets
250 free problems · No credit card
See all Logistics & Shipping problems


