20110317

RMS error

Given an x, the regression line predicts an average y value. To measure how the actual y values spread around that predicted average, we define the root-mean-square error (r.m.s. error).

$$\text{RMS error} = \sqrt{\frac{\sum_{i=1}^n (\hat{y}_i - y_i)^2}{n}}$$

ref: http://www-stat.stanford.edu/~susan/courses/s60/split/node60.html
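As a quick check of the formula, here is a minimal sketch in Python. The function name rms_error and the sample numbers are made up for illustration; they do not come from the course or the reference.

```python
import math

def rms_error(y_pred, y_true):
    """Root-mean-square error between predicted and observed values."""
    if len(y_pred) != len(y_true) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # Squared residual for each (prediction, observation) pair.
    squared = [(p - t) ** 2 for p, t in zip(y_pred, y_true)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up data: predictions from some regression line vs. observed y values.
y_hat = [2.0, 4.0, 6.0, 8.0]
y_obs = [1.0, 5.0, 6.0, 10.0]
print(rms_error(y_hat, y_obs))
```

The result is in the same units as y, which is what makes it easy to read as a "typical" distance between a point and the regression line.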

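Today's machine learning class mainly covered classification. There are three traditional approaches to classification: memorization, rule-based expert systems, and learning from training data & prior knowledge. For me the most important part was how to evaluate the prediction error on the training data. The lecture built up, from simple to general, a generic "evaluation function" that usually has two parts: one term is the loss function (or error on data), and the other term is regularization. The regularization term is mainly there to control over-fitting (the higher the degree of the fit, the heavier the penalty). In general, the form of the regularization term is fairly fixed.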
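The two-term structure can be sketched in Python as follows. This is only an illustration under assumptions of my own: a squared loss for the data term, an L2 penalty on polynomial coefficients for the regularization term, and a weight called lam. The notes do not say which loss or penalty the lecture actually used.

```python
def squared_loss(w, xs, ys):
    """Mean squared error of the polynomial with coefficients w on the data."""
    total = 0.0
    for x, y in zip(xs, ys):
        # w[k] is the coefficient of x**k.
        y_pred = sum(c * x ** k for k, c in enumerate(w))
        total += (y_pred - y) ** 2
    return total / len(xs)

def l2_penalty(w):
    """Sum of squared coefficients, skipping the constant term w[0]."""
    return sum(c ** 2 for c in w[1:])

def objective(w, xs, ys, lam):
    """Error on data plus lam times the regularization term."""
    return squared_loss(w, xs, ys) + lam * l2_penalty(w)

# Made-up data that lies exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]
w_line = [1.0, 2.0]
print(objective(w_line, xs, ys, lam=0.0))  # data term only
print(objective(w_line, xs, ys, lam=0.1))  # data term plus penalty
```

Raising lam makes large coefficients more expensive, so a high-degree fit has to reduce the data error by enough to pay for its extra coefficients.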

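Two examples followed: the error functions of logistic regression and of SVM. The main takeaway is that hinge loss corresponds to SVM and logistic loss corresponds to logistic regression (logistic loss also appears in boosting variants such as LogitBoost).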
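The two losses mentioned here can be written as functions of the margin m = y * f(x), with labels y in {-1, +1}. The sketch below uses the standard textbook definitions; the function names and the sample margins are my own choices for illustration.

```python
import math

def hinge_loss(margin):
    """Hinge loss: max(0, 1 - m). Zero once the margin reaches 1."""
    return max(0.0, 1.0 - margin)

def logistic_loss(margin):
    """Logistic loss: log(1 + exp(-m)). Always positive, decays smoothly."""
    return math.log1p(math.exp(-margin))

# Negative margins are misclassified points, positive ones are correct.
for m in (-2.0, 0.0, 1.0, 3.0):
    print(m, hinge_loss(m), logistic_loss(m))
```

Both losses grow roughly linearly for badly misclassified points. The difference is on the correct side: hinge loss is exactly zero for margins of 1 or more, while logistic loss never quite reaches zero.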

Finally, we learned an iterative method for finding the minimum of a convex function: gradient descent.

http://en.wikipedia.org/wiki/Gradient_descent

Here is a Python example. Gradient descent is applied to find a local minimum of the function f(x)=x⁴-3x³+2, with derivative f'(x)=4x³-9x².

# From calculation, we expect that the local minimum occurs at x=9/4

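xOld = 0
xNew = 6  # The algorithm starts at x=6
eps = 0.01  # step size
precision = 0.00001  # stop when successive iterates differ by less than this

def f_prime(x):
    # Derivative of f(x) = x**4 - 3*x**3 + 2
    return 4 * x**3 - 9 * x**2

while abs(xNew - xOld) > precision:
    xOld = xNew
    # Move against the gradient, scaled by the step size
    xNew = xOld - eps * f_prime(xNew)

print("Local minimum occurs at ", xNew)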

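Today I also read up on the concept of optimization. I did not understand what an energy function was, so I looked up some related material. It is really just the problem of finding the parameter values that maximize or minimize the value of a function. Optimization is mostly studied in applied mathematics. Some algorithms only find a local optimum, while others can find a global optimum.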
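To make the local vs. global distinction concrete, here is a small sketch that runs plain gradient descent from two starting points. The function f(x)=x⁴-4x²+x is one I picked because it has two valleys of different depth; it is not from the course. The step size, stopping rule, and max_iter cap are also my own choices.

```python
def f(x):
    return x ** 4 - 4 * x ** 2 + x

def f_prime(x):
    return 4 * x ** 3 - 8 * x + 1

def gradient_descent(start, step=0.01, precision=1e-6, max_iter=100000):
    """Plain gradient descent; stops when the update is smaller than precision."""
    x = start
    for _ in range(max_iter):
        x_next = x - step * f_prime(x)
        if abs(x_next - x) < precision:
            return x_next
        x = x_next
    return x  # iteration cap reached without meeting the precision

x_right = gradient_descent(2.0)   # ends in the shallower valley
x_left = gradient_descent(-2.0)   # ends in the deeper valley
print(x_right, f(x_right))
print(x_left, f(x_left))
```

Both runs stop at a point where the derivative is close to zero, but only the run started on the left reaches the lower of the two values. Gradient descent by itself gives a local optimum; which one depends on where it starts.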

(Ugh, Bregman iteration is so hard... I just can't get it...)