Let's Data Science
LEARN • BUILD • STAY AHEAD

Curated Collections

Hand-curated learning roadmaps that take you from foundations to staff-level interview readiness. Every problem is deliberately chosen and ordered to build on what you just learned.

FAANG Interview Loop

Amazon-Style SQL 30

A round-by-round simulator of Amazon's data interview loop, from the OA Sprint through the Bar Raiser. Schemas mirror what Amazon's teams actually work on day-to-day (retail, marketplace, logistics), and every harder problem carries the kind of probing follow-up an L6 interviewer would press on — scale, cost, ownership. Not affiliated with Amazon; built from publicly reported 2025–2026 loops.

30 Problems · ~10h · 6 Stages

Stage 1: Baseline Fluency Check (5 problems)

Stage 3: First Technical Screen (6 problems)

FAANG Interview Loop

Meta-Style SQL 30

A round-by-round walk through Meta's data interview loop, including the Analytical Execution round that no other FAANG splits out into its own slot. Schemas mirror Meta's social-graph and ad surfaces, and every Hard and Expert problem carries a probing-depth section for IC6 candidates who'll be asked to scale the same query to 3B users. Not affiliated with Meta; built from publicly reported 2025–2026 loops.

30 Problems · ~10h · 6 Stages

Stage 1: Baseline Fluency Check (5 problems)

Stage 2: Technical Screen (6 problems)

Stage 3: Technical Skills (7 problems)

FAANG Interview Loop

Google-Style SQL 30

A round-by-round simulator of Google's data interview loop, built around the patterns Google actually tests: sessionization on event streams, cohort retention math, A/B test reads, and a dedicated statistics round (the only FAANG that still has one). Every Hard and Expert problem carries the L5/L6 probing follow-ups Google interviewers ask about BigQuery cost, partitioning, and metric ownership at scale. Not affiliated with Google; built from publicly reported 2025–2026 DSA / DSP / DE loops.

30 Problems · ~11h · 6 Stages

Stage 1: Baseline Fluency Check (5 problems)

Stage 2: Technical Phone Screen (6 problems)

FAANG Interview Loop

Netflix-Style SQL 30

A round-by-round simulator of Netflix's data interview loop, built around the patterns Netflix actually tests: streaming engagement metrics, weekly retention curves, A/B test reads on watch behavior, subscriber LTV, and a dedicated Experimentation & Causal Inference round (Netflix's signature — no other FAANG has a separately-titled causal inference DS track at scale). Every Hard and Expert problem carries the causal-thinking follow-ups Netflix interviewers ask about ratio metrics, CUPED variance reduction, and ambiguity tolerance. Not affiliated with Netflix; built from publicly reported 2025–2026 DS-Analytics / DS-Inference / DS-Algorithms / DE / Analytics Engineer loops.

30 Problems · ~11h · 6 Stages

Stage 1: Recruiter & Hiring Manager Screens (5 problems)

Stage 2: Technical Phone Screen (6 problems)

Curated Collection

SQL 50

A 5-stage progression that takes you from basic WHERE filters through window functions to staff-level multi-CTE scorecards, structured across 15 production-grade schemas so the same pattern shows up in different domains until it sticks. The closest LDS has to a "Blind 75" for SQL — start here if you only do one collection.

50 Problems · ~20h · 5 Stages

Stage 2: Joins & Relationships (10 problems)

Stage 3: Analytical Thinking (12 problems)

Curated Collection

Python 50

A 5-stage pandas progression from boolean indexing through groupby + rolling windows to staff-level multi-table scorecards, structured across 15 production-grade schemas — payments, lodging, streaming, social — so the same pattern repeats across different domains until it sticks. The Python companion to LDS SQL 50.

50 Problems · ~22h · 5 Stages

Stage 2: Merging & Joining (10 problems)

Stage 3: Aggregation & Window (12 problems)

Data Analyst Interview Prep 75

A 7-stage SQL roadmap built around the patterns that actually decide data analyst loops: multi-table joins, conditional aggregation, window functions, the multi-CTE scorecards that show up in final-round take-homes, and the behavioral "write a query that…" phrasing interviewers actually use. Seventy-five problems across 15 production-grade schemas modeled after Amazon, Stripe, Airbnb, Meta, and Netflix.

75 Problems · ~30h · 7 Stages

Stage 2: Joins & Relationships (12 problems)

Stage 3: Aggregation & Analytics (15 problems)

Data Scientist Interview Prep 75

A 6-round simulator of the modern DS onsite: SQL screen, SQL analytics, A/B testing & experiment analysis, pandas coding, statistical reasoning, and an ML feature-engineering capstone. Seventy-five problems across SQL and Python on 15 production-grade schemas — including a dedicated A/B testing round with Welch's t-test, chi-square, ANOVA, and CUPED, drilled the way interviewers actually ask it.

75 Problems · ~32h · 6 Stages

Stage 3: A/B Testing & Experiment Analysis (10 problems)

Topic Deep Dive

SQL Window Functions 30

A 6-stage progression through ROW_NUMBER, RANK, LAG, running totals, moving averages, Top-N per partition, and NTILE bucketing — composed across 15 production-grade schemas. The pattern that separates intermediate from senior SQL candidates in onsite rounds, and the one most people can fake their way around until they can't.

30 Problems · ~12h · 6 Stages

Curated Collection

SQL Joins & Multi-Table 30

A 5-stage progression from clean two-table INNER joins through 3-table chains and stars, LEFT joins and anti-joins, same-table set logic (self-join, INTERSECT, EXISTS), up to 6-table multi-CTE scorecards. Thirty problems on 15 production-grade schemas — the join muscle memory real interview rounds actually exercise.

30 Problems · ~12h · 5 Stages

Stage 1: 2-Table INNER JOIN Foundation (5 problems)

Stage 2: 3-Table JOINs (7 problems)

Stage 3: LEFT JOIN & Anti-Join (6 problems)

Topic Deep Dive

SQL CTE & Subqueries 25

A 5-stage progression from scalar subqueries and EXISTS / NOT EXISTS through single-CTE reference-value patterns up to the multi-CTE business scorecard you'll write at staff level. Twenty-five problems on 15 production-grade schemas — the query shape that turns a raw fact table into a board-ready metric in one statement.

25 Problems · ~10h · 5 Stages

Stage 1: Scalar Subqueries (5 problems)

Stage 2: EXISTS / NOT EXISTS / NOT IN (5 problems)

Stage 3: CTE Fundamentals (5 problems)

Curated Collection

SQL Date & Time 20

A 5-stage progression from date filters and arithmetic (`DATE`, `JULIANDAY`, `strftime`) through monthly bucketing and aging tiers, up to time-series window functions and a multi-CTE LTV capstone. Twenty problems on 15 production-grade schemas — the thinnest SQL skill most candidates show up with, and the fastest one to fix.

20 Problems · ~9h · 5 Stages

Stage 1: Date Foundations (4 problems)

Stage 2: Date Arithmetic & Diff (4 problems)

Stage 3: Date Extraction & Bucketing (4 problems)
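
To give a feel for the date patterns named above, here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its columns, and the sample rows are invented for illustration and are not taken from any LDS schema. It filters with `DATE`, computes a day difference with `JULIANDAY`, and buckets by month with `strftime`.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, ordered_at TEXT, shipped_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2024-01-05", "2024-01-08"),
        (2, "2024-01-20", "2024-01-21"),
        (3, "2024-02-02", "2024-02-09"),
    ],
)

# DATE(...) filters on a calendar date, JULIANDAY(b) - JULIANDAY(a) gives the
# number of days between two dates, and strftime('%Y-%m', ...) buckets by month.
rows = conn.execute(
    """
    SELECT strftime('%Y-%m', ordered_at)                      AS order_month,
           COUNT(*)                                           AS orders,
           AVG(JULIANDAY(shipped_at) - JULIANDAY(ordered_at)) AS avg_days_to_ship
    FROM orders
    WHERE DATE(ordered_at) >= DATE('2024-01-01')
    GROUP BY order_month
    ORDER BY order_month
    """
).fetchall()

for row in rows:
    print(row)
```

These function names are specific to SQLite; other engines spell the same operations differently (for example `DATE_TRUNC` or `DATEDIFF`), so treat this as one dialect's version of the pattern.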

Topic Deep Dive

Pandas Merge 25

A 5-stage progression on pandas merge: 2-table inner merges, chained 2-3-4 table pipelines, anti-merges with `indicator=True`, and the 4-5 merge feature-matrix pipelines that stitch a production DataFrame together. Twenty-five problems on 15 industry-grade schemas — the pandas equivalent of getting SQL joins right.

25 Problems · ~10h · 5 Stages

Stage 1: Basic Inner Merge (5 problems)

Stage 2: Two-Merge Pipelines (5 problems)

Stage 3: Multi-Table 3+ Merges (5 problems)
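
As a rough illustration of the anti-merge pattern mentioned above, the sketch below finds customers with no orders using `indicator=True`. The `customers` and `orders` frames and their columns are hypothetical, made up for this example only.

```python
import pandas as pd

# Hypothetical frames, invented for illustration.
customers = pd.DataFrame(
    {"customer_id": [1, 2, 3, 4], "name": ["Ana", "Ben", "Cy", "Di"]}
)
orders = pd.DataFrame({"order_id": [10, 11, 12], "customer_id": [1, 1, 3]})

# Left-merge against the distinct order keys; indicator=True adds a `_merge`
# column whose value is "left_only" for rows with no match on the right.
merged = customers.merge(
    orders[["customer_id"]].drop_duplicates(),
    on="customer_id",
    how="left",
    indicator=True,
)

# Keep only the unmatched customers: the pandas analog of a SQL anti-join.
no_orders = merged.loc[
    merged["_merge"] == "left_only", ["customer_id", "name"]
].reset_index(drop=True)

print(no_orders)
```

De-duplicating the right-hand keys before merging keeps the left table from fanning out when a customer has several orders, which matters if the result is later counted or joined again.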

Topic Deep Dive

Pandas Window & Aggregation 25

A 5-stage progression through every pandas window/aggregation pattern — groupby with named aggregation, within-group rank, qcut quartile bucketing, IQR transforms, cumulative running totals, shift period-over-period, rolling 7-day means (both flat-daily and per-entity), and 2D pivot capstones. Twenty-five problems on 15 production-grade schemas — the pandas analog of SQL window functions, where naive solutions hit the for-loop trap.

25 Problems · ~10h · 5 Stages

Stage 1: GroupBy Foundations (5 problems)

Stage 2: Multi-Key GroupBy + Named Aggregation (5 problems)

Stage 3: Rank, Qcut, Transform (5 problems)
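
A few of the patterns listed above can be sketched without any explicit loop. The `sales` frame below is hypothetical, and the rolling window is 2 rows rather than 7 days only to keep the sample data small.

```python
import pandas as pd

# Hypothetical daily revenue per store, invented for illustration.
sales = pd.DataFrame(
    {
        "store": ["A", "A", "A", "B", "B", "B"],
        "day": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"] * 2),
        "revenue": [100.0, 150.0, 120.0, 80.0, 60.0, 90.0],
    }
)
sales = sales.sort_values(["store", "day"]).reset_index(drop=True)

by_store = sales.groupby("store")["revenue"]

# Cumulative running total within each store.
sales["running_total"] = by_store.cumsum()
# Previous day's revenue within each store (period-over-period base).
sales["prev_day"] = by_store.shift(1)
# Within-group rank, highest revenue ranked 1.
sales["rank_in_store"] = by_store.rank(method="dense", ascending=False)
# Per-entity rolling mean over the last 2 rows (a 7-row window works the same way).
sales["rolling_2d_mean"] = by_store.transform(
    lambda s: s.rolling(2, min_periods=1).mean()
)

print(sales)
```

Each column is computed with a vectorized groupby operation, which is the usual way around iterating over groups in a Python `for` loop.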

Topic Deep Dive

Pandas Cleaning & Reshape 25

A 5-stage progression on the pandas work most courses skip: filling missing data, normalizing units and free-text categories, pivoting long DataFrames into wide cross-tabulations, and assembling production-grade feature matrices from messy multi-table pipelines. Twenty-five problems on 15 industry schemas — the practice nobody gives you before your first dirty-data interview.

25 Problems · ~10h · 5 Stages

Stage 2: Normalize & Clean (5 problems)

Stage 3: Standardize Categories (5 problems)
