Machine Learning Fundamentals (scikit-learn)

The supervised-learning loop with scikit-learn — linear & logistic regression, regularization, classification, metrics, cross-validation, and leak-free pipelines.

8 modules · Module 1 is free; Modules 2 through 8 require Pro.

What this course covers

A module-by-module concept outline. Open the course to learn each topic with animated explanations, in-browser code, practice challenges, and a knowledge check.

Module 1. The ML Frame & the Lendly Dataset

Free
Topics
Supervised vs Unsupervised vs Reinforcement · Features, Targets, and the Hypothesis Class · Regression vs Classification · The Train / Test Split · Meet the Lendly Dataset
Sections
  1. What Machine Learning Actually Is
  2. Features, Targets, and the Hypothesis Class
  3. Regression vs Classification: Same Frame, Different Output
  4. Train, Validate, Test: Why You Need Three Sets
  5. Meet Lendly: Your Running Dataset for All 8 Modules

Module 2. Linear Regression: From Line-Fitting to Loss

Pro
Topics
The Linear Model y = Xβ + ε · Squared Error and Why · The Normal Equation (Closed Form) · Gradient Descent: The Iterative Cousin · Diagnostics: R², Residuals, Coefficients · Fitting Lendly Interest Rates
Sections
  1. The Linear Model: A Weighted Sum of Features
  2. Mean Squared Error: Why Squared, Why Mean
  3. The Normal Equation: Closed-Form Solution
  4. Gradient Descent: When Closed Form Won't Scale
  5. Reading the Output: Coefficients, R², Residuals
  6. Lendly Case: Predicting Interest Rate from Borrower Features

Module 3. Regularization & the Bias–Variance Tradeoff

Pro
Topics
Overfitting: The Symptom · The Bias–Variance Decomposition · Ridge (L2): Shrink Coefficients · Lasso (L1): Zero Out Features · Elastic Net: The Compromise · Standardization Matters Here
Sections
  1. Overfitting in Lendly: The Symptom
  2. Bias and Variance: The Two Failure Modes
  3. Ridge Regression: Squeeze the Coefficients
  4. Lasso: Squeeze Some Coefficients to Zero
  5. Picking Alpha and Why Standardization Is Non-Negotiable
  6. Lendly Case: Regularized Rate Prediction

Module 4. Classification: Logistic, kNN, Decision Tree

Pro
Topics
From Regression to Classification · Logistic Regression & The Sigmoid · k-Nearest Neighbors: The Lazy Learner · Decision Trees: Recursive Splits · Inductive Biases · Lendly Default Prediction
Sections
  1. The Classification Setup: Predicting Default on Lendly
  2. Logistic Regression: Sigmoid, Log-Odds, and Why
  3. k-Nearest Neighbors: Geometric, Lazy, Powerful
  4. Decision Trees: Recursive Yes/No Splits
  5. Three Models, One Dataset: How They Disagree and Why
  6. Lendly Case: Default Prediction Bake-Off

Module 5. Classification Metrics & Threshold Choice

Pro
Topics
Confusion Matrix: Four Numbers That Matter · Precision, Recall, F1, Accuracy · ROC and PR Curves · Threshold Choice & Cost Asymmetry · Class Imbalance Traps · Lendly Approval Threshold
Sections
  1. The Confusion Matrix: Where Every Other Metric Comes From
  2. Precision, Recall, F1: Tradeoffs, Not Synonyms
  3. ROC and PR Curves: Whole-Model Reports
  4. Threshold Choice: The Most Underrated Lever in ML
  5. Class Imbalance: When Accuracy Lies
  6. Lendly Case: Picking a Default-Risk Threshold

Module 6. Cross-Validation, Leakage & Stratification

Pro
Topics
Why One Split Is Not Enough · K-Fold Cross-Validation · Stratified K-Fold · Data Leakage: Six Common Forms · Time-Series CV · CV with Lendly
Sections
  1. The Problem with One Train/Test Split
  2. K-Fold Cross-Validation: One Idea, Many Variants
  3. Stratified K-Fold: Keep Class Balance Across Folds
  4. Data Leakage: The Silent Model Killer
  5. Time-Series CV: Why Random Folds Are Wrong for Temporal Data
  6. Lendly Case: 5-Fold CV the Right Way

Module 7. Pipelines, Preprocessing & GridSearchCV

Pro
Topics
Why Pipelines: The Anti-Leakage Pattern · ColumnTransformer for Mixed Feature Types · OneHotEncoder, StandardScaler, SimpleImputer · GridSearchCV: Systematic Tuning · RandomizedSearchCV: When Grid Is Too Big · Saving the Best Pipeline
Sections
  1. The Anti-Leakage Pattern: Why Pipelines Exist
  2. ColumnTransformer: One Transformer per Column Type
  3. Imputation, Scaling, Encoding: The Trio
  4. GridSearchCV: Searching the Hyperparameter Space
  5. RandomizedSearchCV: Smarter When Grids Explode
  6. Lendly Case: A Production-Style Pipeline

Module 8. Capstone: End-to-End on Lendly

Pro
Topics
The Full ML Loop · EDA in 15 Minutes · Splitting, Pipelining, Tuning · Picking the Threshold · Writing the Model Card · Where to Go Next
Sections
  1. The Full Loop: From CSV to Decision
  2. Fast EDA: What to Look for in 15 Minutes
  3. Build, Tune, Evaluate: The Capstone Pipeline
  4. Threshold and Calibration: The Last Mile
  5. The Model Card: How to Hand Off a Model
  6. Where to Go Next: A/B Testing, LLMs, and Beyond
