DoorDash Adopts Multi‑Armed Bandits For Experimentation

DoorDash engineers Caixia Huang and Alex Weinstein adopt a multi-armed bandit (MAB) approach to optimize product experiments, using Thompson sampling to adaptively shift traffic toward better-performing variants and reduce opportunity cost. They report that MAB accelerates learning and lowers regret compared with fixed-split A/B tests, but that it complicates metric inference and can create inconsistent user experiences. To mitigate these limitations, DoorDash plans to pursue contextual bandits, Bayesian optimization, and sticky user assignment.
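To make the traffic-allocation idea concrete, here is a minimal sketch of Thompson sampling for a binary (converted / did not convert) metric. It is illustrative only and is not DoorDash's implementation: the function name `thompson_sampling`, the uniform Beta(1, 1) priors, and the simulated conversion rates are all assumptions made for this example.

```python
import random


def thompson_sampling(true_rates, n_rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling over hypothetical variants.

    true_rates: assumed (simulated) conversion rate of each variant.
    Returns per-variant counts of pulls, successes, and failures.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one plausible conversion rate per variant from its posterior,
        # Beta(successes + 1, failures + 1), i.e. a uniform Beta(1, 1) prior.
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        # Serve the variant whose sampled rate is highest.
        arm = max(range(n_arms), key=samples.__getitem__)

        # Simulate the user's binary outcome (assumption: Bernoulli reward).
        reward = rng.random() < true_rates[arm]
        pulls[arm] += 1
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return pulls, successes, failures


if __name__ == "__main__":
    pulls, successes, failures = thompson_sampling([0.05, 0.5], n_rounds=2000)
    print("traffic per variant:", pulls)
```

Because each variant is served in proportion to the posterior probability that it is the best one, traffic drifts toward the stronger variant as evidence accumulates. That is the source of the lower regret relative to a fixed split. It is also why the unequal, time-varying sample sizes make standard metric inference harder.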
Scoring Rationale
A practical, credible company implementation that yields actionable guidance, but one that offers limited novelty beyond established multi-armed bandit techniques.
