Module 5: Multiple Linear Regression
In this module, we will take a deep dive into linear regression.
I have split this into three parts: introduction, variable screening, and outliers.
Lecture Videos: Introduction
Lecture Notes
Lecture notes displayed in the lectures can always be found at the lecture notes website.
Prediction with 2 Separate Regression Models
Combining 2 Predictors into One Model
Interpreting Coefficients
Inferences
Confidence and Prediction Intervals
Residuals Again!
Three Variables and an F-test
Transformations… Again!
Lecture Videos: Variable Screening
PHE Data
Model for the PHE Data
Basics of Multicollinearity
Diagnosing Multicollinearity
Addressing Multicollinearity
Beware the \(R^2\)
Even more \(R^2\)
Information Criteria
Introducing Stepwise (Don’t do it!)
Prespecify Complexity
Sample Size Requirements
Basics of Data Reduction
Example of Redundancy & Data Reduction Analysis
Lecture Videos: Outliers
Reviewing the PHE Model Residuals
Leverage and Influence
DFFITS
DFBETAS
Filtering Outliers with PHE data
More Visualizing of Outliers
Removing an Outlier
Removing Another Outlier
The Neverending Story: Model Building
📚 Recommended Reading & Other Content
Blogpost: Statisticians Hate Stepwise
Online textbook on “regression modeling strategies”: https://hbiostat.org/rmsc/multivar
- Highly recommend the section on variable selection
- The audio recordings of Professor Harrell are fantastic and informative!
Bonus: advanced lectures
Here are more lectures by Professor Pelleriti. The first goes into some interesting topics we won’t cover in this course; the second does a really good job of explaining the bias-variance tradeoff!