Statistical Learning

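Naive Bayes

Naive Bayes uses the training data set to learn the joint probability distribution P(X,Y). It does so by learning the prior distribution and the conditional distribution, whose product gives the joint distribution.

P(x,y) = P(y) * P(x|y)

Because naive Bayes in effect learns the mechanism that generates the data, it is a generative model.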

Equation (4.5)

Equation (4.6)

Naive Bayes algorithm (p. 50)

Bayesian estimation (p. 51)
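To make the factorization P(x,y) = P(y) * P(x|y) and the idea of Bayesian estimation concrete, here is a minimal Python sketch. It assumes discrete features and additive smoothing with a parameter lam (lam = 1 is Laplace smoothing). The function names, the toy data set, and the query point are illustrative assumptions, not taken from the notes.

```python
from collections import Counter

def train_naive_bayes(samples, labels, lam=1.0):
    """Estimate P(Y) and P(X_j | Y) with additive smoothing (lam is an assumed parameter)."""
    n = len(labels)
    classes = sorted(set(labels))
    n_features = len(samples[0])
    # Distinct values seen for each feature, needed for the smoothing denominator.
    feature_values = [sorted({x[j] for x in samples}) for j in range(n_features)]
    class_count = Counter(labels)
    # Smoothed prior: (count(y) + lam) / (N + K * lam), K = number of classes.
    prior = {c: (class_count[c] + lam) / (n + len(classes) * lam) for c in classes}
    # joint_count[(j, v, c)] = number of samples with x_j = v and y = c.
    joint_count = Counter()
    for x, y in zip(samples, labels):
        for j, v in enumerate(x):
            joint_count[(j, v, y)] += 1
    # Smoothed conditional: (count(x_j = v, y = c) + lam) / (count(y = c) + S_j * lam).
    cond = {}
    for c in classes:
        for j in range(n_features):
            s_j = len(feature_values[j])
            for v in feature_values[j]:
                cond[(j, v, c)] = (joint_count[(j, v, c)] + lam) / (class_count[c] + s_j * lam)
    return classes, prior, cond

def predict(x, classes, prior, cond):
    """Score each class by P(y) * prod_j P(x_j | y) and return the best one."""
    scores = {}
    for c in classes:
        score = prior[c]
        for j, v in enumerate(x):
            score *= cond[(j, v, c)]
        scores[c] = score
    return max(scores, key=scores.get), scores

# Toy data set (illustrative): two discrete features, labels in {1, -1}.
X1 = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
X2 = ["S", "M", "M", "S", "S", "S", "M", "M", "L", "L", "L", "M", "M", "L", "L"]
Y = [-1, -1, 1, 1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, -1]

samples = list(zip(X1, X2))
classes, prior, cond = train_naive_bayes(samples, Y, lam=1.0)
label, scores = predict((2, "S"), classes, prior, cond)
print(label, scores)
```

With lam = 1 the unnormalized scores for the query (2, "S") are 5/153 for class 1 and 28/459 for class -1, so the sketch picks class -1. Setting lam = 0 falls back to plain maximum likelihood estimates, which can produce zero probabilities for unseen feature values.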

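Logistic Regression and the Maximum Entropy Model

Both are log-linear models.

The odds of an event are the ratio of the probability that the event occurs to the probability that it does not occur.

Equations (6.5) and (6.6) (p. 78)

The log odds of the output Y=1 is a linear function of the input x.

Logistic regression is usually trained with gradient descent or a quasi-Newton method.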
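The two statements above (the log odds is linear in x, and the model is fitted by gradient descent) can be checked with a small Python sketch. It assumes plain batch gradient descent on the average negative log-likelihood; the learning rate, epoch count, and toy data are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """P(Y=1 | x) for weight vector w and bias b."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Batch gradient descent on the average negative log-likelihood."""
    w = [0.0] * len(xs[0])
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            # Derivative of the per-sample loss with respect to the linear score.
            err = predict_proba(w, b, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Toy one-dimensional data with labels in {0, 1} (illustrative, not separable).
xs = [[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
w, b = fit_logistic(xs, ys)

x_new = [2.2]
p = predict_proba(w, b, x_new)
log_odds = math.log(p / (1.0 - p))      # log of P(Y=1|x) / P(Y=0|x)
linear_score = w[0] * x_new[0] + b      # w . x + b
print(log_odds, linear_score)           # the two values agree up to rounding
```

A quasi-Newton method such as BFGS would replace the fixed-step update inside the loop; the model and the log-odds identity stay the same.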

Maximum Entropy Model

The goal of learning is to use the maximum entropy principle to select the best classification model.
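As a rough illustration of the maximum entropy principle, the Python sketch below compares the entropy of a few candidate distributions over five outcomes. The specific constraint (the first two outcomes together have probability 3/10) and the candidate distributions are assumptions chosen for the example.

```python
import math

def entropy(p):
    """Shannon entropy H(P) = -sum p_i * log(p_i), using the natural logarithm."""
    return -sum(q * math.log(q) for q in p if q > 0)

# No constraint beyond summing to 1: the uniform distribution has the largest entropy.
uniform = [1 / 5] * 5

# Assumed constraint: P(first) + P(second) = 3/10.
# Spreading probability evenly inside each group keeps the entropy as large as possible.
constrained_even = [3 / 20, 3 / 20, 7 / 30, 7 / 30, 7 / 30]
# Another distribution that satisfies the same constraint but is less even.
constrained_skewed = [1 / 10, 2 / 10, 1 / 10, 3 / 10, 3 / 10]

h_uniform = entropy(uniform)
h_even = entropy(constrained_even)
h_skewed = entropy(constrained_skewed)
print(h_uniform, h_even, h_skewed)
```

Among the distributions that satisfy the constraint, the evenly spread one has the higher entropy, which is the sense in which the principle prefers the model that assumes the least beyond the known constraints.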

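Learning a maximum entropy model means solving for that model, and it can be formalized as a constrained optimization problem. The constraint is that two expectations are equal (p. 83, Equation 6.10): the expectation of each feature function under the empirical distribution and its expectation under the model. Using the Lagrangian function, the constrained primal problem is converted into an unconstrained dual problem.

Maximizing the dual function in maximum entropy learning is equivalent to maximum likelihood estimation of the maximum entropy model.

Several optimization methods apply: iterative scaling, gradient descent, Newton's method, and quasi-Newton methods. Newton and quasi-Newton methods generally converge faster.

Improved iterative scaling (IIS)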

P88:  

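Support Vector Machines

A support vector machine (SVM) is a binary classification model. Its basic form is the maximum-margin linear classifier defined on the feature space; the maximum margin is what distinguishes it from the perceptron. SVMs also include the kernel trick, which makes them in essence nonlinear classifiers. The learning strategy is margin maximization, which can be formalized as a convex quadratic programming problem and is equivalent to minimizing a regularized hinge loss. The learning algorithm of an SVM is therefore an optimization algorithm for convex quadratic programming. SVMs come in three kinds: the linearly separable SVM, the linear SVM, and the nonlinear SVM. When the training data are linearly separable, hard margin maximization learns a linear classifier, the linearly separable SVM. When the training data are approximately linearly separable, soft margin maximization also learns a linear classifier, the linear SVM (the soft margin SVM). When the training data are not linearly separable, the kernel trick together with soft margin maximization learns a nonlinear SVM.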
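The regularized hinge loss mentioned above can be written down in a few lines of Python. This is only a sketch of the objective, sum_i max(0, 1 - y_i (w . x_i + b)) + lam * ||w||^2, not a solver; the three data points, the weights, and the value of lam are illustrative assumptions.

```python
def decision_value(w, b, x):
    """Linear score w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def hinge_loss(w, b, x, y):
    """max(0, 1 - y * (w . x + b)); zero once the point is on or beyond the margin."""
    return max(0.0, 1.0 - y * decision_value(w, b, x))

def svm_objective(w, b, xs, ys, lam):
    """Sum of hinge losses plus the L2 regularization term lam * ||w||^2."""
    data_term = sum(hinge_loss(w, b, x, y) for x, y in zip(xs, ys))
    reg_term = lam * sum(wj * wj for wj in w)
    return data_term + reg_term

# Toy linearly separable data (illustrative) with labels in {+1, -1}.
xs = [(3.0, 3.0), (4.0, 3.0), (1.0, 1.0)]
ys = [1, 1, -1]
w, b = (0.5, 0.5), -2.0

objective = svm_objective(w, b, xs, ys, lam=0.5)
loss_inside_margin = hinge_loss(w, b, (2.0, 2.0), 1)  # a point lying on the hyperplane
print(objective, loss_inside_margin)
```

For this choice of w and b every training point satisfies y (w . x + b) >= 1, so only the regularization term contributes to the objective. A point on the separating hyperplane itself would incur a hinge loss of 1.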

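When the input space is a Euclidean space or a discrete set and the feature space is a Hilbert space, the kernel function is the inner product of the feature vectors obtained by mapping inputs from the input space into the feature space. Using a kernel function is equivalent to implicitly learning a linear SVM in a high-dimensional feature space; this approach is called the kernel trick. Kernel methods are a more general machine learning technique than SVMs.

The perceptron finds a separating hyperplane by minimizing misclassification, and there are infinitely many such solutions. The linearly separable SVM instead finds the optimal separating hyperplane by maximizing the margin (the geometric margin), and that solution is unique.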

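Functional margin (p. 97, Equation 7.3)

The functional margin expresses both the correctness and the confidence of a classification prediction. However, scaling w and b by the same factor changes the functional margin without changing the hyperplane, so some constraint, such as the normalization ||w|| = 1, must be placed on the normal vector w of the separating hyperplane.

Geometric margin (p. 98)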
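The difference between the two margins is easy to see numerically. The Python sketch below assumes the usual definitions, functional margin y (w . x + b) and geometric margin y (w . x + b) / ||w||; the point, label, and hyperplane are illustrative.

```python
import math

def functional_margin(w, b, x, y):
    """y * (w . x + b): sign gives correctness, magnitude gives confidence."""
    return y * (sum(wj * xj for wj, xj in zip(w, x)) + b)

def geometric_margin(w, b, x, y):
    """Functional margin divided by ||w||: the signed distance to the hyperplane."""
    norm_w = math.sqrt(sum(wj * wj for wj in w))
    return functional_margin(w, b, x, y) / norm_w

x, y = (3.0, 3.0), 1
w, b = (0.5, 0.5), -2.0

# Rescale w and b by the same factor: the hyperplane itself does not move.
w2, b2 = tuple(2 * wj for wj in w), 2 * b

fm, fm_scaled = functional_margin(w, b, x, y), functional_margin(w2, b2, x, y)
gm, gm_scaled = geometric_margin(w, b, x, y), geometric_margin(w2, b2, x, y)
print(fm, fm_scaled)  # the functional margin doubles
print(gm, gm_scaled)  # the geometric margin is unchanged
```

This is why margin maximization is stated in terms of the geometric margin: it is invariant to rescaling of (w, b).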

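A convex optimization problem is a constrained optimization problem in which the objective function and the inequality constraint functions are convex and the equality constraint functions are affine (p. 100).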

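Nonlinear SVM: first use a transformation to map the data from the original space to a new space, then use a linear classification method in the new space to learn a classification model from the training data.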

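Kernel function: let X be the input space (a subset of Euclidean space or a discrete set) and H the feature space (a Hilbert space). If there exists a mapping from X to H, φ(x): X -> H,

such that for all x, z in X the function K(x,z) satisfies K(x,z) = φ(x)·φ(z),

then K(x,z) is called a kernel function, φ(x) is the mapping function, and φ(x)·φ(z) is the inner product of φ(x) and φ(z).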
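A small worked case may help here. The Python sketch below assumes the polynomial kernel K(x,z) = (x·z)^2 on two-dimensional inputs, for which one valid mapping is φ(x) = (x1^2, sqrt(2)*x1*x2, x2^2). The kernel choice, the mapping, and the sample points are illustrative assumptions; other mappings give the same kernel.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, z):
    """K(x, z) = (x . z)^2, computed directly in the input space."""
    return dot(x, z) ** 2

def phi(x):
    """One explicit feature map for this kernel on 2-D inputs (not the only one)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

x, z = (1.0, 2.0), (3.0, 4.0)
kernel_value = poly_kernel(x, z)        # works in the 2-D input space
feature_value = dot(phi(x), phi(z))     # works in the 3-D feature space
print(kernel_value, feature_value)      # both equal 121 up to rounding
```

The kernel gives the feature-space inner product without ever building φ(x), which is the whole point when the feature space is high-dimensional or infinite-dimensional.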

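Each concrete input is an instance, usually represented by a feature vector. The space in which all feature vectors live is called the feature space.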

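Learning is implicit: neither the feature space nor the mapping function needs to be defined explicitly. This technique is called the kernel trick.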

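What does it mean for a matrix to be positive semidefinite? A symmetric real matrix A is positive semidefinite when c^T A c >= 0 for every real vector c.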

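Hilbert space:

Given an inner product <x,y> on a real or complex vector space H, the norm it induces is ||x|| = sqrt(<x,x>).

Every finite-dimensional inner product space is a Hilbert space.

An inner product space that is complete as a normed vector space is a Hilbert space.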

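Necessary and sufficient condition for a positive definite kernel:

Let K: X×X -> R be a symmetric function. Then K(x,z) is a positive definite kernel if and only if, for any x_i in X, i = 1, 2, ..., m, the Gram matrix of K(x,z), K = [K(x_i, x_j)]_{m×m}, is positive semidefinite.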
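The condition can be probed numerically. The Python sketch below builds the Gram matrix of a Gaussian kernel on a few points and evaluates the quadratic form c^T K c for random vectors c. This is a spot check under stated assumptions (a necessary test on sampled vectors, not a proof); the points, the bandwidth sigma, and the number of trials are illustrative.

```python
import math
import random

def gaussian_kernel(x, z, sigma=1.0):
    """K(x, z) = exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def gram_matrix(points, kernel):
    """K = [K(x_i, x_j)], an m x m matrix for m points."""
    return [[kernel(xi, xj) for xj in points] for xi in points]

def quadratic_form(matrix, c):
    """c^T A c; positive semidefinite means this is >= 0 for every c."""
    n = len(c)
    return sum(c[i] * matrix[i][j] * c[j] for i in range(n) for j in range(n))

random.seed(0)
points = [(0.0, 0.0), (1.0, 0.5), (2.0, -1.0), (0.5, 2.0)]
K = gram_matrix(points, gaussian_kernel)

# Spot check: the quadratic form should never go negative for a valid kernel.
trial_vectors = [[random.uniform(-1, 1) for _ in points] for _ in range(1000)]
min_form = min(quadratic_form(K, c) for c in trial_vectors)

# Counterexample: a symmetric matrix that is not positive semidefinite.
not_psd = [[1.0, 2.0], [2.0, 1.0]]
negative_form = quadratic_form(not_psd, [1.0, -1.0])
print(min_form, negative_form)
```

The second matrix is symmetric but gives c^T A c = -2 for c = (1, -1), so it could not be the Gram matrix of any positive definite kernel.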

Commonly used kernel functions (p. 122)

When the training set is large, general-purpose convex quadratic programming algorithms become very inefficient.

Fast implementation: SMO (sequential minimal optimization)

KKT (Karush-Kuhn-Tucker) conditions

Original article: https://www.cnblogs.com/huicpc0212/p/4185992.html