Uber Migrates Michelangelo Platform To Kubernetes Foundation
Uber reengineered its Michelangelo ML platform in 2026, moving from a monolithic stack to a cloud-native Kubernetes foundation to overcome scaling limits. Engineers introduced 100+ CRDs with transparent MySQL-backed persistence, a federation layer achieving 99.9% scheduling success, Python-native Uniflow workflows, and a multi-cloud compute mesh. The platform now serves over 30 million predictions per second and supports 40 million daily trips, making it a reference point for MLOps patterns at large scale.
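To make the MySQL-backed persistence idea concrete, here is a minimal sketch, not Uber's implementation. It mirrors Kubernetes-style custom resources into relational tables so that cross-resource questions become ordinary SQL joins. The resource kinds (`Model`, `Pipeline`), their fields, the table layout, and the helper names are all hypothetical, and the standard-library `sqlite3` module stands in for MySQL so the example runs anywhere.

```python
import sqlite3

# Hypothetical custom resources shaped like Kubernetes objects.
# Kinds, names, and spec fields are illustrative assumptions.
RESOURCES = [
    {"kind": "Model", "metadata": {"name": "eta-v2", "namespace": "pricing"},
     "spec": {"pipeline": "train-eta"}},
    {"kind": "Pipeline", "metadata": {"name": "train-eta", "namespace": "pricing"},
     "spec": {"schedule": "daily"}},
    {"kind": "Model", "metadata": {"name": "fraud-v1", "namespace": "risk"},
     "spec": {"pipeline": "train-fraud"}},
]


def open_store():
    """Create an in-memory relational store (sqlite3 as a stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE models (namespace TEXT, name TEXT, pipeline TEXT, "
        "PRIMARY KEY (namespace, name))"
    )
    conn.execute(
        "CREATE TABLE pipelines (namespace TEXT, name TEXT, schedule TEXT, "
        "PRIMARY KEY (namespace, name))"
    )
    return conn


def sync(conn, resource):
    """Upsert one resource into its table; re-syncing the same object is idempotent."""
    meta = resource["metadata"]
    if resource["kind"] == "Model":
        conn.execute(
            "INSERT OR REPLACE INTO models VALUES (?, ?, ?)",
            (meta["namespace"], meta["name"], resource["spec"]["pipeline"]),
        )
    elif resource["kind"] == "Pipeline":
        conn.execute(
            "INSERT OR REPLACE INTO pipelines VALUES (?, ?, ?)",
            (meta["namespace"], meta["name"], resource["spec"]["schedule"]),
        )


def models_with_schedule(conn):
    """Join models to the pipelines they reference, a query etcd cannot express directly."""
    rows = conn.execute(
        "SELECT m.name, p.schedule FROM models m "
        "JOIN pipelines p ON p.namespace = m.namespace AND p.name = m.pipeline "
        "ORDER BY m.name"
    )
    return rows.fetchall()


if __name__ == "__main__":
    store = open_store()
    for res in RESOURCES:
        sync(store, res)
    print(models_with_schedule(store))
```

In a real controller the `sync` step would presumably be driven by watch events from the Kubernetes API server rather than a static list; that wiring is omitted here.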
Key Points
- Define 100+ purpose-built CRDs to represent the ML lifecycle and enable Kubernetes-native control
- Synchronize metadata to a scalable MySQL backend to bypass etcd limits, enabling millisecond joins and queries
- Implement federation and Virtual Regional Clusters for 99.9% scheduling success and efficient capacity utilization
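The federation point can be illustrated with a small sketch, written under stated assumptions rather than taken from the source. A federation layer sees free capacity across several clusters and places each job on one that can fit it. The `Cluster` and `Job` types, the GPU-only capacity model, and the "most free capacity first" rule are all hypothetical; the report does not describe Uber's actual placement policy, and Virtual Regional Clusters are not modeled here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Cluster:
    name: str
    free_gpus: int  # simplified capacity model (assumption)


@dataclass
class Job:
    name: str
    gpus: int


def place(job: Job, clusters: List[Cluster]) -> Optional[str]:
    """Place a job on the fitting cluster with the most free GPUs.

    Returns the chosen cluster name, or None when no cluster can fit the job.
    """
    candidates = [c for c in clusters if c.free_gpus >= job.gpus]
    if not candidates:
        return None
    target = max(candidates, key=lambda c: c.free_gpus)
    target.free_gpus -= job.gpus  # reserve capacity on the chosen cluster
    return target.name


def schedule_all(jobs: List[Job], clusters: List[Cluster]) -> Dict[str, Optional[str]]:
    """Schedule jobs in order and record where each one landed."""
    return {job.name: place(job, clusters) for job in jobs}


def success_rate(placements: Dict[str, Optional[str]]) -> float:
    """Fraction of jobs that found a cluster."""
    if not placements:
        return 1.0
    placed = sum(1 for target in placements.values() if target is not None)
    return placed / len(placements)


if __name__ == "__main__":
    clusters = [Cluster("zone-a", 4), Cluster("zone-b", 8)]
    jobs = [Job("train-1", 6), Job("train-2", 4), Job("train-3", 4)]
    result = schedule_all(jobs, clusters)
    print(result, success_rate(result))
```

A single cluster with 8 free GPUs could place only one of these three jobs, while the federated view places two. Pooling capacity in this way is one plausible route to a higher scheduling success rate.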
Scoring Rationale
High operational scale and actionable platform patterns, offering credible industry lessons but limited academic novelty beyond implementation specifics.
