Models & Research | llm, electronic health records, drug adherence, estonia

LLMs Extract Drug Discontinuations From Estonian EHRs

June 17, 2026 | By LDS Team

Relevance Score: 5.8


Per a JMIR preprint by Suvalov et al., researchers combined prescription records with free-text anamneses from a 10% sample of the Estonian population (2012-2019) to identify drug discontinuation events and their reasons. The study applied Llama-3.1-70B and GPT-4o to extract discontinuation phrases, map them into a clinician-developed taxonomy, and label who initiated the stoppage; performance was evaluated on 100 randomly selected cases per drug group (statins and antidiabetic medications), according to the preprint. This work demonstrates a practical application of LLMs to a low-resource language for pharmacoepidemiology, highlighting both potential gains for large-scale adherence research and the need for careful validation on clinical free text.

What happened

Per a JMIR preprint by Suvalov et al., the authors merged prescription data with free-text clinical anamneses from a 10% sample of the Estonian population covering 2012-2019. The study targeted discontinuations for statins and antidiabetic medications and applied two large language models, Llama-3.1-70B and GPT-4o, to:

• extract discontinuation phrases
• classify reasons using a clinician-developed taxonomy
• identify whether the patient or clinician initiated the discontinuation

Performance was measured on 100 randomly chosen cases per drug group, as reported in the preprint.
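The three tasks listed above can be sketched as a single structured-output request. This is a minimal illustration, not the authors' prompt or code: the label sets `REASONS` and `INITIATORS`, the prompt wording, and the JSON reply shape are all assumptions, and the preprint's clinician-developed taxonomy is not reproduced here. No model is called; the sketch only builds the prompt and validates a reply.

```python
import json
from dataclasses import dataclass

# Hypothetical label sets for illustration only; the preprint's
# clinician-developed taxonomy is not reproduced here.
REASONS = ["side_effect", "inefficacy", "access_barrier", "other", "not_stated"]
INITIATORS = ["patient", "clinician", "unknown"]


@dataclass
class Discontinuation:
    phrase: str     # verbatim span copied from the note
    reason: str     # one of REASONS
    initiator: str  # one of INITIATORS


def build_prompt(note: str, drug_group: str) -> str:
    """Assemble one instruction that covers all three tasks (assumed wording)."""
    return (
        f"The clinical note below is written in Estonian. For {drug_group}:\n"
        "1. Quote the phrase that states the drug was stopped.\n"
        f"2. Pick one reason from: {', '.join(REASONS)}.\n"
        f"3. Pick who initiated the stop from: {', '.join(INITIATORS)}.\n"
        'Reply with JSON: {"phrase": "...", "reason": "...", "initiator": "..."}\n\n'
        f"Note:\n{note}"
    )


def parse_reply(reply: str) -> Discontinuation:
    """Parse a model reply and reject labels outside the allowed sets."""
    data = json.loads(reply)
    if data["reason"] not in REASONS:
        raise ValueError(f"unknown reason: {data['reason']}")
    if data["initiator"] not in INITIATORS:
        raise ValueError(f"unknown initiator: {data['initiator']}")
    return Discontinuation(data["phrase"], data["reason"], data["initiator"])
```

Rejecting out-of-taxonomy labels at parse time keeps free-form model output from leaking into downstream counts, which matters when the same pipeline is run with more than one model.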

Technical details

The preprint describes using Llama-3.1-70B and GPT-4o for information extraction and classification from Estonian-language clinical notes. The authors developed a taxonomy of discontinuation reasons with clinician input and applied the models to link free-text evidence to structured prescription records. The manuscript validates the approach on 100 randomly selected cases per drug group; exact performance metrics are reported in the preprint.
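Drawing a fixed number of random cases per drug group is a simple stratified sample. The sketch below shows one way to do it reproducibly; the `(case_id, drug_group)` pair layout, the function name, and the seed are assumptions for illustration, not details from the preprint.

```python
import random
from collections import defaultdict


def sample_validation_cases(cases, per_group=100, seed=0):
    """Draw up to `per_group` cases at random from each drug group.

    `cases` is an iterable of (case_id, drug_group) pairs; this layout
    is hypothetical.
    """
    by_group = defaultdict(list)
    for case_id, group in cases:
        by_group[group].append(case_id)

    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    sample = {}
    for group, ids in sorted(by_group.items()):
        k = min(per_group, len(ids))  # small groups are taken whole
        sample[group] = rng.sample(ids, k)
    return sample
```

Sampling within each group, rather than from the pooled cases, guarantees that a less common drug group still receives its full quota of reviewed cases.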

Context and significance

Applying LLMs to extract clinically relevant events from free text addresses a long-standing barrier in pharmacoepidemiology: the rationale for a discontinuation is frequently recorded only in narrative notes. Systems that successfully pair prescriptions with extracted reasons can enable higher-fidelity signal detection for side effects, inefficacy, or access barriers. A concurrent Harvard / Brigham and Women's Hospital preprint (arXiv 2506.11137) covers the same problem on English EHR datasets, demonstrating that LLM-based medication status extraction scales without human annotation, which reinforces the broader applicability of this approach.
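One plausible way to pair prescriptions with narrative notes is to look for notes written shortly after a patient's last prescription in a drug group. The sketch below is an assumption about how such linking could work, not the procedure used in the preprint: the tuple layouts, the function name, and the 180-day window are all hypothetical.

```python
from datetime import date, timedelta


def notes_near_last_prescription(prescriptions, notes, window_days=180):
    """Pair each patient's last prescription with notes dated soon after it.

    prescriptions: list of (patient_id, drug_group, issued_on)
    notes:         list of (patient_id, written_on, text)
    Both layouts and the 180-day window are assumptions.
    """
    # Find the most recent prescription per patient and drug group.
    last_rx = {}
    for patient_id, drug_group, issued_on in prescriptions:
        key = (patient_id, drug_group)
        if key not in last_rx or issued_on > last_rx[key]:
            last_rx[key] = issued_on

    # Keep notes written within the window after that prescription.
    window = timedelta(days=window_days)
    pairs = []
    for (patient_id, drug_group), issued_on in last_rx.items():
        for note_patient, written_on, text in notes:
            if note_patient == patient_id and issued_on <= written_on <= issued_on + window:
                pairs.append((patient_id, drug_group, issued_on, text))
    return pairs
```

The candidate notes returned here would then be passed to the extraction step; the window length trades recall (late-documented stops) against noise from unrelated visits.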

What to watch

Three things to watch: the peer-reviewed JMIR publication, which should carry full performance metrics and error analysis; replication in other languages or EHR systems; and whether the authors publish the taxonomy, annotation guidelines, or evaluation code to enable reproducibility. External replication and transparent error breakdowns (false positives versus false negatives, initiator misclassification) will determine practical utility for downstream clinical research.
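The kind of error breakdown mentioned above can be computed from reviewed cases with a few lines of code. The following is a sketch under assumed inputs: parallel lists of clinician ("gold") and model labels, with boolean labels for whether a discontinuation is present and string labels for the initiator. The function names are illustrative.

```python
from collections import Counter


def extraction_errors(gold, predicted):
    """Count detection errors for boolean 'discontinuation present' labels."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)  # false positives
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


def initiator_confusion(gold, predicted):
    """Tally (gold, predicted) initiator pairs; mismatched pairs are errors."""
    return Counter(zip(gold, predicted))
```

Reporting false positives and false negatives separately, rather than a single accuracy figure, shows whether a model tends to invent discontinuations or to miss them, and the two failure modes bias adherence estimates in opposite directions.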

Key Points

1. Per the JMIR preprint, LLMs (Llama-3.1-70B, GPT-4o) can extract discontinuation reasons from Estonian EHR free text paired with prescriptions.
2. Using a clinician-developed taxonomy and 100-case validation per drug group helps quantify model performance and label consistency for pharmacoepidemiology.
3. For practitioners: validation in low-resource languages broadens population-level adherence research but requires transparent metrics and replication before operational use.

Scoring Rationale

A solid niche preprint demonstrating LLM application to pharmacoepidemiology in a low-resource (Estonian) language, using population-scale prescription and free-text EHR data. Relevant to clinical NLP and pharmacoepidemiology practitioners but limited by single-country scope, small evaluation set (100 cases per drug group), and preprint status pending peer review.
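To see why 100 cases per drug group counts as a small evaluation set, it helps to look at the uncertainty around a proportion estimated from that many cases. The sketch below computes a Wilson score interval, a standard choice for binomial proportions; the function name and the example numbers are illustrative and are not results from the preprint.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half
```

For a hypothetical 90 correct out of 100, `wilson_interval(90, 100)` gives an interval of roughly 0.83 to 0.94, so two models whose scores differ by a few points on such a sample may not be distinguishable.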

Sources

Public references used for this report.

3 sources

jmir.org: Extracting and Classifying Drug Discontinuations From Estonian Electronic Health Records: Development and Validation Study

preprints.jmir.org: Extracting and Classifying Drug Discontinuations from Estonian Electronic Health Records: Development and Validation Study (preprint)

arxiv.org: Scalable Medication Extraction and Discontinuation Identification from Electronic Health Records Using Large Language Models
