Dropbox Uses LLMs To Improve Search

Dropbox engineers describe using large language models to scale human relevance labeling for Dash search. They calibrate LLM evaluators against a small human-labeled set, then use those evaluators to produce hundreds of thousands to millions of labels, amplifying human effort roughly 100×. They report that the method improves retrieval ranking, the bottleneck in retrieval-augmented generation, by combining automated LLM judgments with human oversight and hardest-mistake analysis.
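To make the calibration step concrete, here is a minimal Python sketch of how LLM judgments might be compared against a human-labeled set, with the largest disagreements surfaced for review. It is an illustration only, not Dropbox's implementation: the 1-5 relevance scale, the `LabeledPair` type, the `calibration_report` function, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LabeledPair:
    query: str
    doc_id: str
    human_score: int  # hypothetical 1-5 relevance scale (assumption)
    llm_score: int    # score from the LLM evaluator on the same scale


def calibration_report(pairs, top_k=3):
    """Compare LLM judge scores against human labels.

    Returns the exact-agreement rate, the mean absolute error, and the
    top_k pairs with the largest disagreement (the "hardest mistakes").
    """
    if not pairs:
        raise ValueError("need at least one labeled pair")
    errors = [abs(p.llm_score - p.human_score) for p in pairs]
    exact = sum(e == 0 for e in errors) / len(pairs)
    mae = sum(errors) / len(pairs)
    hardest = sorted(
        pairs,
        key=lambda p: abs(p.llm_score - p.human_score),
        reverse=True,
    )[:top_k]
    return {"exact_agreement": exact, "mean_abs_error": mae, "hardest": hardest}


# Made-up sample data, for illustration only.
sample = [
    LabeledPair("q3 budget", "doc-a", human_score=5, llm_score=5),
    LabeledPair("q3 budget", "doc-b", human_score=2, llm_score=3),
    LabeledPair("onboarding guide", "doc-c", human_score=1, llm_score=4),
    LabeledPair("onboarding guide", "doc-d", human_score=4, llm_score=4),
]

report = calibration_report(sample, top_k=1)
print(report["exact_agreement"], report["mean_abs_error"])
print(report["hardest"][0].doc_id)
```

In a workflow like the one described, the pairs with the largest disagreement would be the ones to hand back to human reviewers, and the agreement metrics would indicate whether the evaluator is trustworthy enough to label at scale.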
Scoring Rationale
Strong practical impact from scalable, human-calibrated LLM labeling; slightly limited by incremental novelty over existing RAG practices.

