MSSC 6250 Statistical Machine Learning
So what is (Statistical) Machine Learning?
“A computer program is said to learn from experience (E) with respect to some class of tasks (T) and performance measure (P) if its performance at tasks in T, as measured by P, improves with experience E.” – Tom Mitchell, Professor of ML at Carnegie Mellon University
computer programs = computing software and systems
experience = data (objective and/or subjective)
tasks = problems being solved by the computing system
ML algorithms are mainly used for predictive modeling problems, and many of them, such as linear regression, are borrowed from statistics.
ML is a CS perspective on modeling data with a focus on algorithmic methods.
“Statistical learning refers to a set of tools for modeling and understanding complex datasets.” – An Introduction to Statistical Learning
…to extract important patterns and trends, and understand “what the data says.” We call this learning from data. – The Elements of Statistical Learning
tools = mathematics, computing hardware/software/architecture, programming languages, algorithms, etc.
Statistical learning is a mathematical perspective on modeling data with a focus on goodness of fit.
Machine learning emphasizes algorithms and automation.
Statistical learning emphasizes modeling, interpretability, and uncertainty.
Source: http://statweb.stanford.edu/~tibs/stat315a/glossary.pdf
Source: All of Statistics
Source: https://towardsdatascience.com/machine-learning-algorithms-in-laymans-terms-part-1-d0368d769a7b
Response Y (output, outcome, target, label, dependent/endogenous variable)
Vector of p predictors X = (X_1, X_2, \dots, X_p) (inputs, features, regressors, covariates, explanatory/exogenous/independent variables).
Regression: Y is numeric (e.g., price, blood pressure). If p = 1, this is simple regression; if p > 1, multiple regression.
Classification: Y is categorical (e.g., survived/died, digit 0-9 (MNIST), cancer class of tissue sample).
Goal: Use training data (E) to train our model for better (w.r.t. P) inference/prediction (T) on the response.
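The regression and classification setups above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy data, the function names, and the choice of a nearest-centroid rule for classification are all hypothetical and do not come from the course material. Here E is the training data, T is predicting the response, and P is mean squared error for regression and accuracy for classification.

```python
# Hypothetical toy training data (E): one predictor (p = 1), numeric response.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [2.1, 3.9, 6.2, 7.8, 10.1]

def fit_simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x (simple regression)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def mean_squared_error(y_true, y_pred):
    """Performance measure (P) for a numeric response."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

intercept, slope = fit_simple_regression(x_train, y_train)
y_fitted = [intercept + slope * xi for xi in x_train]
train_mse = mean_squared_error(y_train, y_fitted)

# Hypothetical toy classification data: one predictor, categorical response.
x_cls = [0.5, 1.0, 1.5, 4.0, 4.5, 5.0]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

def fit_nearest_centroid(x, y):
    """Store the mean of the predictor within each class (assumed toy rule)."""
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(yi, []).append(xi)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}

def predict_nearest_centroid(centroids, x_new):
    """Predict the class whose centroid is closest to x_new."""
    return min(centroids, key=lambda label: abs(centroids[label] - x_new))

centroids = fit_nearest_centroid(x_cls, labels)
predictions = [predict_nearest_centroid(centroids, xi) for xi in x_cls]

# Performance measure (P) for a categorical response: share of correct labels.
accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, train MSE={train_mse:.4f}")
print(f"centroids={centroids}, train accuracy={accuracy:.2f}")
```

Both measures here are computed on the training data only to keep the sketch short; in practice performance is usually assessed on data not used for fitting.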
Source: ISL Fig. 1.1
Source: https://towardsdatascience.com/the-actual-difference-between-statistics-and-machine-learning-64b49f07ea3
Source: http://penplusbytes.org/strategies-for-dealing-with-e-mail-spam/
Based on the training data (E) we’d like to:
Source: https://www.datacamp.com/community/tutorials/introduction-customer-segmentation-python
A neural network with several hidden layers is called a deep neural network; fitting such networks is known as deep learning.
Source: ISL Ch 10
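To make "several hidden layers" concrete, the following is a minimal sketch of a forward pass through a small network with two hidden layers. The weights, layer sizes, ReLU activation, and function names are hypothetical choices for illustration and are not taken from ISL or the course; no training is performed.

```python
def relu(values):
    """Elementwise ReLU activation: max(0, z)."""
    return [max(0.0, z) for z in values]

def dense(weights, biases, inputs):
    """One fully connected layer: each row of weights produces one output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def forward(layers, inputs):
    """Apply ReLU after every hidden layer; the output layer stays linear."""
    hidden = inputs
    for weights, biases in layers[:-1]:
        hidden = relu(dense(weights, biases, hidden))
    out_weights, out_biases = layers[-1]
    return dense(out_weights, out_biases, hidden)

# Hypothetical fixed weights: 2 inputs -> 3 hidden -> 2 hidden -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]], [0.0, 0.0, 1.0]),  # hidden layer 1
    ([[1.0, 1.0, 0.0], [0.0, -1.0, 0.5]], [0.0, 0.0]),          # hidden layer 2
    ([[2.0, -1.0]], [0.5]),                                     # output layer
]

n_hidden_layers = len(layers) - 1
output = forward(layers, [1.0, 2.0])
print(n_hidden_layers, output)  # 2 [3.0]
```

Adding more `(weights, biases)` pairs before the output layer makes the network deeper without changing `forward`.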