
Script Gap Study Reveals Romanisation Reduces Triage

December 15, 2025 | By LDS Team

Relevance Score: 10.0


Khullar et al. (2025) analyze maternal and newborn care chats across six Indian languages and English and find that romanised inputs lower LLM triage F1 by 5–12 points. The models tested (GPT‑4o, Claude 4.5, LLaMA 4, Qwen, and others) can often paraphrase romanised queries yet still label them "insufficient information". Automatic transliteration back to native scripts restores performance. The issue affects 56% of user messages in the study.

Key Points

1. Romanised inputs cause 5–12 point F1 drops across LLMs; Kannada falls from 83.7 to 57.3.
2. Orthographic noise and tokenisation instability, not semantic loss, drive misclassification of romanised text.
3. Automatic transliteration or normalization to native scripts is recommended to restore accuracy and fairness.
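
The normalization step recommended above can be sketched roughly as follows. This is a minimal illustration, not the study's pipeline: it assumes the caller already knows the user's language, treats an all-Latin message from a non-English user as a romanisation candidate, and hands it to a `transliterate` callable. The names `script_counts`, `looks_romanised`, `normalise_for_triage`, and the `transliterate` parameter are all hypothetical, and the script list is illustrative rather than exhaustive.

```python
import unicodedata
from typing import Callable, Dict

# Unicode character-name prefixes for some Indic scripts (illustrative, not exhaustive).
INDIC_PREFIXES = (
    "DEVANAGARI", "KANNADA", "TAMIL", "TELUGU",
    "BENGALI", "GUJARATI", "MALAYALAM", "GURMUKHI",
)


def script_counts(text: str) -> Dict[str, int]:
    """Count characters whose Unicode names mark them as Latin or Indic."""
    counts = {"latin": 0, "indic": 0}
    for ch in text:
        name = unicodedata.name(ch, "")  # "" for unnamed characters
        if name.startswith("LATIN"):
            counts["latin"] += 1
        elif name.startswith(INDIC_PREFIXES):
            counts["indic"] += 1
    return counts


def looks_romanised(text: str, user_language: str) -> bool:
    """Heuristic (assumption): a non-English user writing only in Latin letters."""
    if user_language == "en":
        return False
    counts = script_counts(text)
    return counts["latin"] > 0 and counts["indic"] == 0


def normalise_for_triage(
    text: str,
    user_language: str,
    transliterate: Callable[[str, str], str],
) -> str:
    """Return native-script text where possible, before the triage model sees it.

    `transliterate(text, language_code)` is a hypothetical hook; any real
    transliteration tool would be plugged in by the caller.
    """
    if looks_romanised(text, user_language):
        return transliterate(text, user_language)
    return text


if __name__ == "__main__":
    # Toy stand-in for a real transliterator, for demonstration only.
    toy = {"namaste": "नमस्ते"}
    print(normalise_for_triage("namaste", "hi", lambda t, lang: toy.get(t, t)))
```

Keeping the transliterator behind a plain callable lets the detection heuristic be tested on its own, and messages already in native script or in English pass through unchanged.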

Scoring Rationale

Strong empirical finding with cross-model evidence and actionable normalization fix; limited to studied languages and settings.

Sources

Public references used for this report.

1. thehindubusinessline.com, ‘Scripting’ ideal AI output


