2.1 Image Classification: K-Nearest Neighbors

Hyperparameters: K

In general, the larger K is, the smoother the decision boundary becomes.
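The effect of K can be seen in a minimal sketch of a K-nearest-neighbor classifier with majority voting. The function names (`l1_distance`, `knn_predict`), the toy data, and the choice of L1 distance are assumptions made for illustration only. One mislabeled point sits inside the "a" cluster: with K = 1 it decides the prediction on its own, while with K = 3 its neighbors outvote it.

```python
from collections import Counter

def l1_distance(a, b):
    # Sum of absolute coordinate differences (assumed metric for this sketch).
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_x, train_y, query, k):
    # Rank training points by distance to the query and keep the k closest.
    order = sorted(range(len(train_x)),
                   key=lambda i: l1_distance(train_x[i], query))
    # Majority vote among the labels of those k neighbors.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy data: the last point is a "b" outlier inside the "a" cluster.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (1, 1)]
train_y = ["a", "a", "a", "b", "b", "b"]

query = (1.2, 1.2)
pred_k1 = knn_predict(train_x, train_y, query, k=1)  # follows the outlier
pred_k3 = knn_predict(train_x, train_y, query, k=3)  # outlier is outvoted
print(pred_k1, pred_k3)
```

A larger K averages over more neighbors, so single noisy points stop carving out their own islands in the decision regions.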

Hyperparameters: Distance Metric

L1 (Manhattan) distance = $\sum_p |I_1^p - I_2^p|$
L2 (Euclidean) distance = $\sqrt{\sum_p (I_1^p - I_2^p)^2}$
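PS: L1 distance changes when you rotate the coordinate frame, while L2 distance does not. So if the individual entries of your feature vector carry important meaning of their own, L1 is usually preferred; if it is just a generic vector, use L2.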
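The rotation remark can be checked numerically. The sketch below rotates two 2-D points by 45 degrees and compares both distances before and after; the helper names and the example points are assumptions chosen for illustration.

```python
import math

def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rotate_2d(v, theta):
    # Rotate a 2-D point counter-clockwise by theta radians about the origin.
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

a, b = (1.0, 0.0), (0.0, 0.0)
theta = math.pi / 4
a_rot, b_rot = rotate_2d(a, theta), rotate_2d(b, theta)

l1_before, l1_after = l1_distance(a, b), l1_distance(a_rot, b_rot)
l2_before, l2_after = l2_distance(a, b), l2_distance(a_rot, b_rot)

print(f"L1: {l1_before:.3f} -> {l1_after:.3f}")  # changes under rotation
print(f"L2: {l2_before:.3f} -> {l2_after:.3f}")  # unchanged under rotation
```

The L1 distance grows from 1 to about 1.414 because it depends on how the difference vector lines up with the axes, while the L2 distance depends only on its length.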

Setting Hyperparameters

Idea #3: Split data into train, val, and test sets; choose hyperparameters on val and evaluate on test.
Idea #4: Cross-validation: split data into folds, try each fold as the validation set, and average the results. (Useful for small datasets, but not used too frequently in deep learning.)
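The fold loop in Idea #4 might look like the sketch below, which scores several values of K for a 1-D nearest-neighbor classifier. Everything here is an illustrative assumption: the function names, the interleaved fold assignment, and the toy data. A held-out test set would be split off before this step and touched only once at the end.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    # 1-D K-nearest-neighbor prediction by majority vote.
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]

def cross_validate(xs, ys, k, num_folds):
    # Each fold serves once as validation; the rest is used for training.
    n = len(xs)
    accuracies = []
    for f in range(num_folds):
        val_idx = set(range(f, n, num_folds))  # interleaved fold assignment
        tr_x = [xs[i] for i in range(n) if i not in val_idx]
        tr_y = [ys[i] for i in range(n) if i not in val_idx]
        correct = sum(knn_predict(tr_x, tr_y, xs[i], k) == ys[i]
                      for i in val_idx)
        accuracies.append(correct / len(val_idx))
    # Average validation accuracy across all folds.
    return sum(accuracies) / num_folds

# Toy data: two well-separated 1-D clusters.
xs = [0.0, 0.1, 0.2, 0.3, 0.4, 1.0, 1.1, 1.2, 1.3, 1.4]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

scores = {k: cross_validate(xs, ys, k, num_folds=5) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Every candidate K costs `num_folds` rounds of evaluation, which is why this approach suits small datasets better than large deep learning workloads.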

Weakness on images

Very slow at test time
Distance metrics on pixels are not informative
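Curse of dimensionality (to keep the training samples densely and uniformly covering the feature space, the number of training samples required grows exponentially with the dimension)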
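The exponential growth behind the curse of dimensionality can be made concrete with a small calculation. Assuming we want a fixed number of evenly spaced samples along every axis (the value 4 and the function name below are illustrative assumptions), the total count is that number raised to the power of the dimension.

```python
def samples_needed(points_per_axis, dims):
    # A regular grid with the same density on every axis has
    # points_per_axis ** dims cells to fill.
    return points_per_axis ** dims

counts = {d: samples_needed(4, d) for d in (1, 2, 3, 10)}
for d, n in counts.items():
    print(f"{d} dims: {n} samples")
```

Going from 3 to 10 dimensions already pushes the count from 64 to over a million, and raw image pixels live in spaces with thousands of dimensions.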