Retro Computers Train Tiny Transformer Efficiently

On March 30, 2026, hobbyist Damien trained a transformer on a 1960s IBM machine and a PDP-11 to reverse lists of digits, achieving 100% accuracy in 350 steps. The experiment fit within a 32KB memory limit by using fixed-point arithmetic and lookup tables, and hand-tuned stochastic gradient descent (SGD) cut training time to roughly five minutes. The project illustrates that transformers can be trained under extreme resource constraints.
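To make the fixed-point and lookup-table idea concrete, here is a minimal Python sketch. The report does not give the project's number format or table layout, so the Q8.8 format, the 128-entry table, and every name below (`to_fixed`, `fx_mul`, `fx_exp_neg`, `EXP_TABLE`) are illustrative assumptions, not the author's code.

```python
import math

# Assumed Q8.8 format: 8 fractional bits. The original project's actual
# format is not stated in the report.
FRAC_BITS = 8
ONE = 1 << FRAC_BITS  # the value 1.0 in Q8.8


def to_fixed(x: float) -> int:
    """Convert a float to Q8.8 by scaling and rounding."""
    return int(round(x * ONE))


def to_float(q: int) -> float:
    """Convert a Q8.8 integer back to a float (for inspection only)."""
    return q / ONE


def fx_mul(a: int, b: int) -> int:
    """Multiply two Q8.8 values.

    The raw integer product carries 16 fractional bits, so it is
    shifted right by FRAC_BITS to return to Q8.8.
    """
    return (a * b) >> FRAC_BITS


# Hypothetical lookup table for exp(-x), x in [0, 8) with a step of 1/16.
# A table like this replaces a costly exp() call, e.g. inside softmax.
STEP_BITS = 4
EXP_TABLE = [to_fixed(math.exp(-i / (1 << STEP_BITS))) for i in range(128)]


def fx_exp_neg(q: int) -> int:
    """Approximate exp(-q) for a non-negative Q8.8 value q by table lookup."""
    idx = q >> (FRAC_BITS - STEP_BITS)  # drop the bits finer than the table step
    return EXP_TABLE[idx] if idx < len(EXP_TABLE) else 0  # tail underflows to 0
```

On a real 1960s or 1970s machine the table would be precomputed and stored as constants; `math.exp` is used here only to build it. The table costs 128 words of memory, a trade of space for time that has to fit inside the 32KB budget.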
Key Points
- Trains a transformer on a 1960s IBM machine and a PDP-11 to reverse digit lists, achieving 100% accuracy.
- Fits within 32KB of memory by using fixed-point arithmetic and lookup tables.
- Shows that training optimizations cut training time from hours or days to roughly five minutes on legacy hardware.
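The digit-reversal task and the accuracy figure can be sketched as follows. This is a hedged illustration only: the report does not state the sequence length or how accuracy was measured, so the length of 8, the exact-match metric, and the names `make_example` and `exact_match_accuracy` are assumptions.

```python
import random


def make_example(rng: random.Random, length: int = 8):
    """Return (input digits, target digits); the target is the input reversed.

    The default length of 8 is an assumption, not a detail from the report.
    """
    digits = [rng.randrange(10) for _ in range(length)]
    return digits, digits[::-1]


def exact_match_accuracy(predict, examples) -> float:
    """Fraction of examples whose whole predicted sequence equals the target."""
    correct = sum(1 for inputs, target in examples if predict(inputs) == target)
    return correct / len(examples)


rng = random.Random(0)
examples = [make_example(rng) for _ in range(100)]


# A stand-in for a trained model: a function that reverses its input.
def oracle(inputs):
    return list(reversed(inputs))


print(exact_match_accuracy(oracle, examples))  # 1.0
```

Under an exact-match metric, a single wrong digit makes the whole sequence count as incorrect, so 100% is a strict bar even for a small task like this.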
Scoring Rationale
Fresh same-day experimental project with practical techniques for low-memory training. Scored for relevance and actionable methods, but reduced for narrow scope and single-source maker reporting.
