从数学角度看最大期望(EM)算法 II

【转载请注明出处】http://www.cnblogs.com/mashiqi

2015/3/13

对于隐变量只有有限个取值(比如$N$个)的情况,我们可以将隐变量表示为${z_j} = [{z_{j1}},{z_{j2}}, cdots ,{z_{jN}}]$,其中${z_{jk}} in { 0,1} $且${z_{j1}} + {z_{j2}} +  cdots  + {z_{jN}} = 1$。这样表示的目的主要是为了使后面的计算方便。如果:

$$left{ matrix{
p({z_{jk}} = 1) = {pi _k}cr
p({p_j}|{z_{jk}} = 1; heta ) = {f_k}({p_j}; heta ) cr} ight.$$

则我们可以把$p({p_j},{z_j}; heta )$表示为:

$$p({p_j},{z_j}; heta ) = mathop prod limits_{k = 1}^N {[{pi _k}{f_k}({p_j}; heta )]^{{z_{jk}}}}$$     

下面,我们看看怎么得到complete-data log-likelihood:

$$eqalign{
L( heta ) &= mathop sum limits_{j = 1}^M ln p({p_j}; heta ) = mathop sum limits_{j = 1}^M ln [mathop sum limits_k^{} p({p_j},{z_{jk}} = 1; heta )] cr
&= mathop sum limits_{j = 1}^M ln [mathop sum limits_k^{} p({p_j},{z_{jk}} = 1; heta ){{p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})} over {p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})}}] cr
&= mathop sum limits_{j = 1}^M ln [mathop sum limits_k^{} p({z_{jk}} = 1|{p_j};{ heta ^{(n)}}){{p({p_j},{z_{jk}} = 1; heta )} over {p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})}}] cr
&ge mathop sum limits_{j = 1}^M mathop sum limits_k^{} p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})ln [{{p({p_j},{z_{jk}} = 1; heta )} over {p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})}}]{kern 1pt} {kern 1pt} (Jensen's) cr
&= mathop sum limits_{j = 1}^M mathop sum limits_k^{} p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})ln [{{p({p_j},{z_{jk}} = 1; heta )} over {p({p_j},{z_{jk}} = 1;{ heta ^{(n)}})}}p({p_j};{ heta ^{(n)}})] cr
&= mathop sum limits_{j = 1}^M mathop sum limits_k^{} p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})ln [{{p({p_j},{z_{jk}} = 1; heta )} over {p({p_j},{z_{jk}} = 1;{ heta ^{(n)}})}}] cr
&+ mathop sum limits_{j = 1}^M mathop sum limits_k^{} p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})ln [p({p_j};{ heta ^{(n)}})] cr
&= mathop sum limits_{j = 1}^M mathop sum limits_k^{} p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})ln [{{p({p_j},{z_{jk}} = 1; heta )} over {p({p_j},{z_{jk}} = 1;{ heta ^{(n)}})}}] + mathop sum limits_{j = 1}^M ln p({p_j};{ heta ^{(n)}}) cr
&= mathop sum limits_{j = 1}^M mathop sum limits_k^{} p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})ln [{{p({p_j},{z_{jk}} = 1; heta )} over {p({p_j},{z_{jk}} = 1;{ heta ^{(n)}})}}] + L({ heta ^{(n)}}) cr} $$

因此,记$l( heta ) = mathop sum limits_{j = 1}^M mathop sum limits_k^{} p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})ln [{{p({p_j},{z_{jk}} = 1; heta )} over {p({p_j},{z_{jk}} = 1;{ heta ^{(n)}})}}]$,我们可以得到:

$$left{ matrix{
l({ heta ^{(n)}}) = 0 cr
L( heta ) ge l( heta ) + L({ heta ^{(n)}}) cr} ight.$$

如果我们能求得$l( heta )$的极大值点$ heta^{*}$,则一定有

$$L({ heta ^*}) ge L({ heta ^{(n)}})$$

我们就可以把$ heta^{*}$当作$ heta^{(n+1)}$

由于

$$eqalign{
l( heta ) &= mathop sum limits_{j = 1}^M mathop sum limits_k^{} p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})ln [{{p({p_j},{z_{jk}} = 1; heta )} over {p({p_j},{z_{jk}} = 1;{ heta ^{(n)}})}}] cr
&= mathop sum limits_{j = 1}^M mathop sum limits_k^{} p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})ln p({p_j},{z_{jk}} = 1; heta ) + const cr
&= {cal Q}( heta ,{ heta ^{(n)}}) + const cr
{cal Q}( heta ,{ heta ^{(n)}}) &= mathop sum limits_{j = 1}^M mathop sum limits_k^{} p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})ln p({p_j},{z_{jk}} = 1; heta ) cr} $$

因此,通常情况下我们优化$l( heta )$的前面这一项${cal Q}( heta ,{ heta ^{(n)}})$就行了,许多介绍EM算法的资料也就是直接优化${cal Q}( heta ,{ heta ^{(n)}})$这一项。在这一项里面:

$$eqalign{
p({p_j},{z_{jk}} = 1; heta ) &= p({z_{jk}} = 1; heta )p({p_j}|{z_{jk}} = 1; heta ) cr
&= {pi _k}{f_k}({p_j}; heta ) cr} $$

带入式可得:

$$eqalign{
{cal Q}( heta ,{ heta ^{(n)}}) &= mathop sum limits_{j = 1}^M mathop sum limits_k^{} p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})ln [{pi _k}{f_k}({p_j}; heta )] cr
&= mathop sum limits_{j = 1}^M mathop sum limits_k^{} p({z_{jk}} = 1|{p_j};{ heta ^{(n)}})[ln {pi _k} + ln {f_k}({p_j}; heta )] cr} $$

为此我们需要计算这个后验概率:

$$eqalign{
p({z_{jk}} = 1|{p_j};{ heta ^{(n)}}) &= {{p({p_j},{z_{jk}} = 1;{ heta ^{(n)}})} over {p({p_j};{ heta ^{(n)}})}} = {{p({p_j},{z_{jk}} = 1;{ heta ^{(n)}})} over {mathop sum limits_K^{} p({p_j},{z_{jK}} = 1;{ heta ^{(n)}})}} cr
&= {{p({z_{jk}} = 1;{ heta ^{(n)}})p({p_j}|{z_{jk}} = 1;{ heta ^{(n)}})} over {mathop sum limits_K^{} p({z_{jK}} = 1;{ heta ^{(n)}})p({p_j}|{z_{jK}} = 1;{ heta ^{(n)}})}} cr
&= {{pi _K^{(n)}{f_k}({p_j};{ heta ^{(n)}})} over {mathop sum limits_K^{} pi _K^{(n)}{f_K}({p_j};{ heta ^{(n)}})}} cr} $$

因此,

$${cal Q}( heta ,{ heta ^{(n)}}) = mathop sum limits_{j = 1}^M mathop sum limits_k^{} {{pi _K^{(n)}{f_k}({p_j};{ heta ^{(n)}})} over {mathop sum limits_K^{} pi _K^{(n)}{f_K}({p_j};{ heta ^{(n)}})}}[ln {pi _k} + ln {f_k}({p_j}; heta )]$$

我们求最优化问题:

$$[{pi ^{(n + 1)}},{ heta ^{(n + 1)}}] = mathop {arg max }limits_{pi , heta } {cal Q}( heta ,{ heta ^{(n)}})$$

就可以得到新一轮的迭代结果。

原文地址:https://www.cnblogs.com/mashiqi/p/4335100.html