Machine Learning No.4: Regularization

1. Underfit = High bias

　 Overfit = High variance

2. Addressing overfitting:

　　(1) Reduce the number of features.

　　　　Manually select which features to keep.

　　　　Model selection algorithm

　　　　Disadvantage: throws out some useful information.

　　(2) Regularization

　　　　Keep all the features, but reduce magnitude/values of parameters θ_j

　　　　Works well when we have a lot of features, each of which contributes a bit to predicting y.

3. Regularization

If λ is extremely large, θ_1, ..., θ_n ≈ 0, so h_θ(x) ≈ θ_0 and the model will underfit.
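
For reference, the regularized cost function J(θ) for linear regression is usually written as below. This is a sketch that assumes these notes follow the standard formulation: h_θ is the hypothesis, m the number of training examples, n the number of features, and θ_0 is left out of the penalty. None of these details are spelled out in the notes themselves.

```latex
J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2
          + \lambda\sum_{j=1}^{n}\theta_j^2\right]
```

With a very large λ the penalty term dominates the squared-error term, so the minimizer gives up fit on the training data in order to keep every θ_j small.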

4. Gradient descent

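Repeat {

　　 θ_0 := θ_0 − α (1/m) Σ_{i=1..m} (h_θ(x^(i)) − y^(i)) x_0^(i)

　　 θ_j := θ_j − α [ (1/m) Σ_{i=1..m} (h_θ(x^(i)) − y^(i)) x_j^(i) + (λ/m) θ_j ]　　 (j = 1, 2 ... n)

}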
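
A minimal sketch of regularized gradient descent for linear regression. It assumes a learning rate `alpha` and an iteration count `iters` (neither appears in the notes) and the common convention that θ_0 is not penalized; the function name and the toy data are illustrative only. Each θ_j with j ≥ 1 is effectively shrunk by a factor (1 − αλ/m) before the usual gradient step.

```python
import numpy as np

def regularized_gradient_descent(X, y, lam, alpha=0.1, iters=2000):
    """Batch gradient descent for L2-regularized linear regression.

    X has shape (m, n+1) with a leading column of ones (x_0 = 1); y has shape (m,).
    alpha and iters are assumed hyperparameters, not values from the notes.
    """
    m, n_plus_1 = X.shape
    theta = np.zeros(n_plus_1)
    for _ in range(iters):
        error = X @ theta - y            # h_theta(x^(i)) - y^(i) for every example
        grad = (X.T @ error) / m         # gradient of the unregularized cost
        penalty = (lam / m) * theta      # gradient of the penalty term
        penalty[0] = 0.0                 # theta_0 is not penalized
        theta = theta - alpha * (grad + penalty)  # update all theta_j simultaneously
    return theta

# Toy data generated from y = 1 + 2x (hypothetical, for illustration).
x = np.linspace(-1.0, 1.0, 50)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

theta_plain = regularized_gradient_descent(X, y, lam=0.0)
theta_reg = regularized_gradient_descent(X, y, lam=50.0)
```

On this toy data `theta_plain` recovers the intercept and slope used to generate y, while `theta_reg` keeps the intercept but has a noticeably smaller slope, which is the shrinking effect of λ.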

5. Normal equation

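if λ > 0:

　　θ = (X^T X + λL)^(-1) X^T y, where L is the (n+1)×(n+1) identity matrix with its top-left entry set to 0, so θ_0 is not penalized.

if m <= n:

　　X^T X is non-invertible/singular,

　　but using regularization (λ > 0) avoids this problem: X^T X + λL is invertible.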
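
A small sketch of the regularized normal equation, assuming the usual form θ = (X^T X + λL)^(-1) X^T y, where L is the identity matrix with its top-left entry zeroed so that θ_0 is not penalized. The helper names and the 3×5 data matrix are made up for illustration. The example has m = 3 examples and n = 4 features, so X^T X is singular, yet the regularized system can still be solved.

```python
import numpy as np

def penalty_matrix(size):
    """Identity matrix with the top-left entry set to 0 (theta_0 is not penalized)."""
    L = np.eye(size)
    L[0, 0] = 0.0
    return L

def regularized_normal_equation(X, y, lam):
    """Solve (X^T X + lam * L) theta = X^T y for theta."""
    A = X.T @ X + lam * penalty_matrix(X.shape[1])
    # Solving the linear system is numerically preferable to forming the inverse.
    return np.linalg.solve(A, X.T @ y)

# m = 3 examples, n = 4 features (m <= n); the first column is x_0 = 1.
X = np.array([
    [1.0, 2.0, 0.5, 1.0, 3.0],
    [1.0, 1.0, 1.5, 0.0, 2.0],
    [1.0, 3.0, 2.5, 2.0, 1.0],
])
y = np.array([1.0, 2.0, 3.0])

lam = 1.0
rank_plain = np.linalg.matrix_rank(X.T @ X)                         # at most m, so below 5
rank_reg = np.linalg.matrix_rank(X.T @ X + lam * penalty_matrix(5))  # full rank
theta = regularized_normal_equation(X, y, lam)
```

Here `rank_plain` is below the matrix size of 5, confirming that X^T X alone cannot be inverted, while `rank_reg` is 5 and `theta` is a well-defined solution.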