General Ideas

(Video of some derivations from Khan Academy)

$$ MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i-(mx_i + b))^2 $$

where $(x_i, y_i)$ is the $i^{th}$ of $n$ observations and $mx_i + b$ is the predicted value of $y_i$.
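
To make the formula concrete, here is a minimal sketch in plain Python. The function name `mean_squared_error` and the toy data are made up for illustration. The sketch evaluates the MSE for two candidate lines so the effect of a worse intercept is visible.

```python
def mean_squared_error(xs, ys, m, b):
    """Average squared vertical distance between each y_i and the line m*x_i + b."""
    n = len(xs)
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys)) / n


# Hypothetical toy data: the points lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

# The true line leaves no residuals, so its MSE is zero.
mse_true_line = mean_squared_error(xs, ys, m=2.0, b=1.0)

# Shifting the intercept down by 1 makes every residual equal to 1.
mse_shifted_line = mean_squared_error(xs, ys, m=2.0, b=0.0)

print(mse_true_line, mse_shifted_line)
```

Fitting a regression line amounts to choosing the $m$ and $b$ that make this quantity as small as possible.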

Assumptions of Linear Regression

R-squared (Coefficient of Determination)

(Video from Khan Academy)

Question: What percentage of the variation in $y$ is described by the variation in $x$?

Alternate question: If we fit a line on the data points, how good is the fit of that line?

$$ SST = \sum_{i=1}^{n}(y_i - \bar{y})^2 $$
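
As a rough sketch of how SST answers the questions above, the snippet below computes SST and compares it with the squared error left over after fitting a line. It assumes the standard definition $R^2 = 1 - SSE/SST$, where $SSE$ is the sum of squared residuals around the line. The function names, the data, and the candidate line are all hypothetical, and the line is not claimed to be the least-squares fit.

```python
def total_sum_of_squares(ys):
    """SST: squared deviations of each y_i from the mean of y."""
    y_bar = sum(ys) / len(ys)
    return sum((y - y_bar) ** 2 for y in ys)


def r_squared(xs, ys, m, b):
    """1 - SSE/SST for the line y = m*x + b (assumed standard definition)."""
    sse = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - sse / total_sum_of_squares(ys)


# Hypothetical toy data; the mean of ys is 5.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 9.0]

sst = total_sum_of_squares(ys)

# A candidate line y = 2x, chosen by eye for illustration.
r2_line = r_squared(xs, ys, m=2.0, b=0.0)

# A flat line at the mean of y explains none of the variation.
r2_mean_only = r_squared(xs, ys, m=0.0, b=5.0)

print(sst, r2_line, r2_mean_only)
```

The flat line at $\bar{y}$ serves as the baseline: its squared error equals SST, so any line that does better leaves a smaller share of the variation unexplained.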