machine learning(11) -- classification: advanced optimization methods for minimizing the cost function

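Besides gradient descent, there are other methods for finding the minimum of the cost function. They are faster than gradient descent and are widely used in some settings.
For a large machine learning problem, these advanced optimization algorithms are generally used instead of gradient descent.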

Conjugate gradient, BFGS, and L-BFGS are quite complex, but they can be applied without understanding their detailed theory (by using a software library).

They can be used directly through the function libraries of Octave and MATLAB; the built-in libraries of these packages already implement these algorithms well.

When using these algorithms from another language, such as C, C++, or Java, make sure you use a good library that implements them, because different implementations differ greatly in performance.

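initialTheta must have dimension >= 2; it cannot be a single real number (one-dimensional).
optimset sets the options passed to the optimizer.
'GradObj', 'on' tells fminunc that the cost function supplies the gradient; 'on' means the gradient is provided.
MaxIter is the maximum number of iterations; 100 sets that maximum to 100.
initialTheta is the initial value of theta, here set to 0.
[optTheta, functionVal, exitFlag] are the return values. optTheta: the value of theta at the optimum found; functionVal: the final value of the cost function; exitFlag: whether the iteration converged (1 means it converged, 0 means it did not).
fminunc is an advanced optimization function in Octave.
@costFunction is a function handle, that is, a pointer to costFunction.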
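
The notes above describe a call to fminunc, so a minimal Octave sketch of such a call may help. The quadratic cost used here, J(theta) = (theta1 - 5)^2 + (theta2 - 5)^2, is a hypothetical example chosen only because its minimum is known to be at theta = [5; 5]; it does not come from these notes. The names costFunction, initialTheta, optTheta, functionVal and exitFlag follow the notes.

```octave
1;  % leading statement so Octave treats this file as a script, not a function file

% Hypothetical example cost (an assumption for illustration):
% J(theta) = (theta1 - 5)^2 + (theta2 - 5)^2
function [jVal, gradient] = costFunction(theta)
  jVal = (theta(1) - 5)^2 + (theta(2) - 5)^2;  % cost at theta
  gradient = zeros(2, 1);
  gradient(1) = 2 * (theta(1) - 5);            % partial derivative w.r.t. theta1
  gradient(2) = 2 * (theta(2) - 5);            % partial derivative w.r.t. theta2
end

% 'GradObj', 'on': costFunction returns the gradient as its second output.
% 'MaxIter', 100: stop after at most 100 iterations.
options = optimset('GradObj', 'on', 'MaxIter', 100);

initialTheta = zeros(2, 1);  % start from theta = [0; 0]

[optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);
```

Since the minimum of this particular cost is at [5; 5], optTheta should end up close to that point and functionVal close to 0. For a real problem, costFunction would instead compute the model's cost and its partial derivatives with respect to each element of theta.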