Swedish AI decodes 17th-century Borg Cipher

Researchers affiliated with Stockholm University have built an AI pipeline that sharply speeds up decoding of historical ciphers, combining character recognition, pattern reconstruction, and language modeling, according to AzerNEWS. AzerNEWS reports that the system, led by Professor Beata Megyesi, deciphered the Borg Cipher - a roughly 400-page 17th-century manuscript that uses 34 symbols and records medical and pharmaceutical recipes - in about 28 minutes, a task that previously took weeks of manual work. Stockholm University materials describe the broader DECRYPT project and list tools such as TranscripTool and CrypTool plus datasets including DECODE and HistCorp. Stockholm University reports releasing more than 7,000 encrypted sources, while AzerNEWS reports the project database holds over 20,000 encrypted documents. The team emphasizes that AI assists, but does not replace, historians and philologists.
What happened
According to AzerNEWS, researchers in Sweden developed an AI pipeline that substantially shortens the time needed to decode historical ciphers. AzerNEWS reports that, per project leader Professor Beata Megyesi, the system deciphered the Borg Cipher - a roughly 400-page 17th-century manuscript that uses 34 distinct symbols and records medical and pharmaceutical knowledge - in about 28 minutes, compared with prior manual efforts measured in weeks. Stockholm University's pages document the longer-running, interdisciplinary DECRYPT project led by Megyesi and the data and tools associated with it.
Technical details
Per AzerNEWS, the system combines character recognition, pattern reconstruction, and language modeling to reconstruct plaintext from cipher symbols. Stockholm University materials describe TranscripTool for turning cipher images into machine-readable transcriptions and CrypTool for assisting decipherment workflows, alongside language resources and diplomatic transcriptions used for evaluation. The project collections DECODE and HistCorp provide training and benchmark data.
Industry context
Editorial analysis
the project exemplifies applying computer vision, computational linguistics, and statistical or neural sequence models to archival and humanities problems. Comparable efforts typically pair OCR-like symbol recognition with probabilistic or neural language models and human-in-the-loop validation to handle noise, sparse training data, and diverse scripts. Recurring challenges for practitioners include domain-specific symbol inventories, limited parallel corpora, and error propagation between transcription and decipherment stages.
What to watch
For digital-humanities and NLP practitioners, reproducible pipelines and shared datasets matter more than single-case runtimes. Observers should look for releases of transcription and alignment data, code and model checkpoints for TranscripTool and CrypTool, benchmark evaluations across multiple ciphers and languages, and documentation of human-in-the-loop steps and uncertainty estimates. Note that source figures differ - Stockholm University reports 7,000+ encrypted items while AzerNEWS cites a 20,000+-document database - a discrepancy worth flagging.
Key Points
- 1An AI pipeline reportedly decoded the 17th-century Borg Cipher in about 28 minutes, versus weeks of manual work, per AzerNEWS.
- 2Stockholm University's DECRYPT project provides tools (TranscripTool, CrypTool) and datasets (DECODE, HistCorp), with 7,000+ encrypted sources released; the reported database size is 20,000+.
- 3Editorial analysis: modular, reproducible pipelines plus open data and benchmarks are the key levers for generalizing cipher-decoding models to low-resource historical scripts.
Scoring Rationale
A notable, domain-specific application of OCR and language modeling that accelerates a historically manual task and is backed by released datasets and tools useful to NLP and digital-humanities practitioners. It is an applied result rather than a frontier-model release, and the headline runtime is single-case and press-reported, placing it in the solid-to-notable band.
Sources
Public references used for this report.
Practice with real FinTech & Trading data
90 SQL & Python problems · 15 industry datasets
250 free problems · No credit card
See all FinTech & Trading problems
