AlphaFold Consortium Releases Millions of Protein Complexes

A collaboration between EMBL-EBI, Google DeepMind, NVIDIA, and Seoul National University has made millions of AI-predicted protein complex structures openly available through the AlphaFold Database. The dataset prioritises proteins important for human health and disease. It adds 1.7 million high-confidence homodimer entries to the database and provides bulk access to another 18 million lower-confidence homodimers; heterodimers remain under analysis.
Key Points
- Released 1.7 million high-confidence homodimer structures plus bulk access to 18 million lower-confidence predictions
- Prioritised proteins from 20 species and WHO bacterial priority pathogens for immediate global health relevance
- Enables researchers to explore interactomes without spending ~17 million GPU hours on their own predictions, accelerating drug and biological discovery workflows
Scoring Rationale
Comprehensive, official dataset release with immediate, industry-wide utility and direct usability; high credibility from EMBL-EBI/DeepMind/NVIDIA collaboration.

