Statistical Learning for Data Mining

About the Class

This course applies multiple regression techniques to the increasingly important study of very large data sets. Those techniques include linear and logistic model fitting, inference, and diagnostics. Methods especially suited to Big Data, such as lasso and ridge regression, will be emphasized. Issues of model complexity, the bias-variance tradeoff, and model validation will be studied in the context of large data sets. Approaches that rely less on distributional assumptions will also be introduced, including cross-validation, bootstrap resampling, and non-parametric methods.

Course Contents

Forthcoming