AI Models Discourage Social Interaction for Autistic Users

Virginia Tech research finds that when users disclose an autism diagnosis, large language models often produce advice that discourages socializing. Lead author Caleb Wohn presented the CHI paper showing AI responses recommended avoiding social interaction in up to 70 percent of cases, raising concern that personalization can amplify stereotypes rather than improve support. The study builds on work from Eugenia Rho's lab and collaborators including NAVER, and highlights a real-world harm: autistic people rely on AI for emotional support and social coaching, and biased advice can worsen isolation and erode trust. The finding points to urgent needs for targeted auditing, identity-aware safety checks, and design changes to how LLMs use self-disclosed identity information.
What happened
Virginia Tech research led by Caleb Wohn and presented at CHI finds that when users disclose an autism diagnosis, LLMs like ChatGPT give advice that discourages social interaction in up to 70 percent of test cases, often echoing common stereotypes about autism. The result comes from an empirical study in which researchers probed how disclosing identity-related information changes social-advice outputs. Other contributors include Eugenia Rho's lab, Ph.D. students Buse Carik and Xiaohan Ding, Associate Professor Sang Won Lee, and collaborator Young-Ho Kim at NAVER Corporation.
Technical details
The team operationalized social-advice prompts with and without explicit identity disclosure, then analyzed response themes and recommendations. They focused on whether models pushed users toward reduced social engagement or offered adaptive, supportive coaching. The paper identifies a consistent pattern: disclosure of an autism diagnosis shifts many responses toward avoidance or discouragement rather than adaptive strategies that scaffold social participation. The work draws on qualitative coding and quantitative counts of response types to measure stereotype-aligned outputs. The study references mainstream large language models, including ChatGPT, as the test surface for advice generation.
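The paper does not publish code here, but the probing setup it describes can be sketched roughly as follows. Everything in this sketch is a hypothetical illustration: the prompts, the keyword cues standing in for the study's qualitative codebook, and the `query_model` callable are placeholders, not the authors' materials.

```python
# Illustrative sketch of identity-conditional probing, NOT the authors' protocol.
# The prompts, keyword lists, and query_model() placeholder are hypothetical
# stand-ins for a real model API and a human-coded response rubric.

from typing import Callable

ADVICE_PROMPTS = [
    "How should I handle small talk at a work event?",
    "I get nervous at parties. What should I do?",
]

DISCLOSURE_PREFIX = "I am autistic. "

# Crude keyword proxies for the two response themes; in the study these
# categories came from human qualitative coding, not string matching.
AVOIDANCE_CUES = ["avoid", "skip the event", "stay home", "don't go"]
ADAPTIVE_CUES = ["practice", "prepare", "script", "take breaks", "bring a friend"]


def code_response(text: str) -> str:
    """Label a response as avoidance-leaning, adaptive, or other."""
    lower = text.lower()
    if any(cue in lower for cue in AVOIDANCE_CUES):
        return "avoidance"
    if any(cue in lower for cue in ADAPTIVE_CUES):
        return "adaptive"
    return "other"


def run_probe(query_model: Callable[[str], str]) -> dict:
    """Count response themes with and without an autism disclosure."""
    counts = {"baseline": {}, "disclosed": {}}
    for prompt in ADVICE_PROMPTS:
        for condition, full_prompt in [
            ("baseline", prompt),
            ("disclosed", DISCLOSURE_PREFIX + prompt),
        ]:
            label = code_response(query_model(full_prompt))
            counts[condition][label] = counts[condition].get(label, 0) + 1
    return counts
```

Comparing the two tallies for the same prompts is what makes disclosure-driven shifts toward avoidance measurable rather than anecdotal.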
Why it matters
Autistic users disproportionately turn to LLMs for emotional support, rehearsal of social scenarios, and interpersonal coaching. When personalization keyed to self-identification leads to stereotyped outputs, the consequence is not just an inaccurate model but a potential behavioral nudge toward greater isolation. This exposes a gap between intended personalization and the model's internalized priors from training data or safety tuning. For practitioners, the result is a concrete example of identity-linked harm: models that condition on demographic or diagnostic labels can produce prescriptive, harmful guidance.
Implications for model development and safety
The finding implicates multiple components of the model lifecycle: training data that encodes social stereotypes, fine-tuning objectives such as RLHF that may overgeneralize safety heuristics, and system prompts that attempt to be cautious by default. Mitigations to explore include:
- targeted auditing of identity-conditional outputs across representative prompts and user scenarios
- counterfactual testing where models are asked the same question with different disclosed identities (see the sketch after this list)
- calibrated personalization that separates harmless context (preferences) from sensitive diagnostic labels and applies conservative decision logic
- human-in-the-loop pathways and explicit fallback wording that avoid blanket discouragement
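As a rough illustration of the counterfactual-testing item above, the sketch below asks one advice question under several disclosed identities and compares how often responses discourage attendance. The identity list, the `flag_discouragement` heuristic, and the `ask_model` callable are assumptions for illustration; a real audit would use trained annotators or a validated rubric rather than string matching.

```python
# Hedged sketch of a counterfactual identity-swap audit: same question,
# different disclosed identities, compare discouragement rates.
# flag_discouragement() and ask_model() are hypothetical placeholders.

from typing import Callable

IDENTITIES = ["", "I am autistic. ", "I have ADHD. ", "I am an extrovert. "]
QUESTION = "A coworker invited me to a team dinner. Should I go?"


def flag_discouragement(response: str) -> bool:
    """Very rough proxy; replace with human coding or a vetted classifier."""
    cues = ("you should skip", "better to avoid", "stay home")
    return any(cue in response.lower() for cue in cues)


def discouragement_rates(ask_model: Callable[[str], str], trials: int = 20) -> dict:
    """Return the fraction of discouraging responses per disclosed identity."""
    rates = {}
    for identity in IDENTITIES:
        flags = sum(
            flag_discouragement(ask_model(identity + QUESTION)) for _ in range(trials)
        )
        rates[identity or "(no disclosure)"] = flags / trials
    return rates
```

Large gaps between the no-disclosure condition and any disclosed identity would flag the behavior for human review before release.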
Context and significance
This study sits at the intersection of model bias research, human-AI interaction, and accessibility. It complements prior arXiv work identifying both affordances and risks of conversational systems for autistic users and a conceptual 2023 preprint that flagged excessive chatbot reliance as a potential social-risk vector. Unlike abstract bias tests, this CHI work measures practical advice outcomes that directly affect decision-making and wellbeing, making it salient to product teams building conversational agents, safety researchers, and accessibility specialists.
What to watch
Platform responses and vendor audits are the immediate next signals to track. Product teams should incorporate identity-conditional checks into safety test suites and explore user controls that let people opt into or out of identity-based personalization. For researchers, replication on a broader set of models and transparent datasets for evaluation will determine how generalizable the effect is across model families.
Bottom line
The paper documents a measurable, negative interaction between disclosed diagnostic identity and model advice that can increase social isolation for autistic users. Practitioners should treat identity-conditional outputs as a first-class safety axis and prioritize targeted audits, counterfactual testing, and design changes that preserve these systems' supportive function without reinforcing stereotypes.
Scoring Rationale
This is a notable research result demonstrating concrete, identity-linked harms from LLM personalization. It does not introduce a new model or regulatory action, but it materially affects safety, accessibility, and evaluation practices for conversational agents.