Common Derivative Formulas in Machine Learning

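In class, the teacher introduced several formulas for partial derivatives with respect to a vector, but it was not obvious to me why they come out the way they do. Working through a concrete example after class makes them much easier to understand.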

  1. \(\frac{\partial \beta^{\mathrm{T}} \mathrm{x}}{\partial \mathrm{x}}=\beta\)

  2. \(\frac{\partial \mathrm{x}^{\mathrm{T}} \mathrm{x}}{\partial \mathrm{x}}=2 \mathrm{x}\)

  3. \(\frac{\partial \mathrm{x}^{\mathrm{T}} \mathrm{Ax}}{\partial \mathrm{x}}=\left(\mathrm{A}+\mathrm{A}^{\mathrm{T}}\right) \mathrm{x}\)

To verify the three formulas above, we work through the differentiation on a concrete example. Let:

\[\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix}, \quad \mathrm{x}= \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad A = \begin{bmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \]

The first formula

\[\beta^{\mathrm{T}}\mathrm{x} = \begin{bmatrix} \beta_1 & \beta_2 & \beta_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} =\beta_1x_1+\beta_2x_2+\beta_3x_3 \]

\[\frac{\partial \beta^{\mathrm{T}} \mathrm{x}}{\partial \mathrm{x}} = \begin{bmatrix} \frac{\partial (\beta_1x_1+\beta_2x_2+\beta_3x_3)}{\partial x_1} \\ \frac{\partial (\beta_1x_1+\beta_2x_2+\beta_3x_3)}{\partial x_2} \\ \frac{\partial (\beta_1x_1+\beta_2x_2+\beta_3x_3)}{\partial x_3} \end{bmatrix} = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix} = \beta \]

The second formula

\[\mathrm{x}^{\mathrm{T}}\mathrm{x} = \begin{bmatrix} x_1&x_2&x_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = x_1^2+x_2^2+x_3^2 \]

\[\frac{\partial \mathrm{x}^{\mathrm{T}} \mathrm{x}}{\partial \mathrm{x}} = \begin{bmatrix} \frac{\partial (x_1^2+x_2^2+x_3^2)}{\partial x_1} \\ \frac{\partial (x_1^2+x_2^2+x_3^2)}{\partial x_2} \\ \frac{\partial (x_1^2+x_2^2+x_3^2)}{\partial x_3} \end{bmatrix} = \begin{bmatrix} 2x_1 \\ 2x_2 \\ 2x_3 \end{bmatrix} = 2 \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = 2\mathrm{x} \]

The third formula

\[\begin{aligned}
\mathrm{x}^{\mathrm{T}} A \mathrm{x} &=\left[\begin{array}{lll} x_{1} & x_{2} & x_{3} \end{array}\right]\left[\begin{array}{lll} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{array}\right]\left[\begin{array}{l} x_{1} \\ x_{2} \\ x_{3} \end{array}\right] \\
&=a_{11} x_{1}^{2}+a_{21} x_{2} x_{1}+a_{31} x_{3} x_{1}+a_{12} x_{1} x_{2}+a_{22} x_{2}^{2}+a_{32} x_{3} x_{2}+a_{13} x_{1} x_{3}+a_{23} x_{2} x_{3}+a_{33} x_{3}^{2}
\end{aligned} \]

\[\begin{aligned}
\frac{\partial \mathrm{x}^{\mathrm{T}} \mathrm{Ax}}{\partial \mathrm{x}}
=&\left[\begin{array}{l}
\frac{\partial \mathrm{x}^{\mathrm{T}} \mathrm{Ax}}{\partial x_{1}} \\
\frac{\partial \mathrm{x}^{\mathrm{T}} \mathrm{Ax}}{\partial x_{2}} \\
\frac{\partial \mathrm{x}^{\mathrm{T}} \mathrm{Ax}}{\partial x_{3}}
\end{array}\right] \\
=&\left[\begin{array}{l}
2 a_{11} x_{1}+a_{21} x_{2}+a_{31} x_{3}+a_{12} x_{2}+a_{13} x_{3} \\
a_{21} x_{1}+a_{12} x_{1}+2 a_{22} x_{2}+a_{32} x_{3}+a_{23} x_{3} \\
a_{31} x_{1}+a_{32} x_{2}+a_{13} x_{1}+a_{23} x_{2}+2 a_{33} x_{3}
\end{array}\right] \\
=&\left[\begin{array}{l}
2 a_{11} x_{1}+\left(a_{21}+a_{12}\right) x_{2}+\left(a_{31}+a_{13}\right) x_{3} \\
\left(a_{21}+a_{12}\right) x_{1}+2 a_{22} x_{2}+\left(a_{32}+a_{23}\right) x_{3} \\
\left(a_{31}+a_{13}\right) x_{1}+\left(a_{32}+a_{23}\right) x_{2}+2 a_{33} x_{3}
\end{array}\right] \\
=&\left[\begin{array}{ccc}
2 a_{11} & a_{21}+a_{12} & a_{31}+a_{13} \\
a_{21}+a_{12} & 2 a_{22} & a_{32}+a_{23} \\
a_{31}+a_{13} & a_{32}+a_{23} & 2 a_{33}
\end{array}\right]\left[\begin{array}{l}
x_{1} \\
x_{2} \\
x_{3}
\end{array}\right] \\
=&\left(A+A^{\mathrm{T}}\right) \mathrm{x}
\end{aligned} \]
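
The 3×3 calculation is not special to three dimensions. As a sketch, the same argument can be written in index notation for a general \(n \times n\) matrix; the dimension \(n\) and the indices \(i, j, k\) are notation introduced here for illustration and do not appear in the original derivation.

\[\begin{aligned}
\mathrm{x}^{\mathrm{T}} A \mathrm{x} &= \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_{i} x_{j} \\
\frac{\partial \mathrm{x}^{\mathrm{T}} A \mathrm{x}}{\partial x_{k}} &= \sum_{j=1}^{n} a_{kj} x_{j}+\sum_{i=1}^{n} a_{ik} x_{i} = \left(A \mathrm{x}\right)_{k}+\left(A^{\mathrm{T}} \mathrm{x}\right)_{k}
\end{aligned} \]

Stacking the \(n\) components gives \(\left(A+A^{\mathrm{T}}\right) \mathrm{x}\) again. Two special cases follow directly: taking \(A\) to be the identity matrix recovers the second formula, \(2\mathrm{x}\), and for a symmetric \(A\) the result simplifies to \(2A\mathrm{x}\).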

Original source: https://www.cnblogs.com/xxmmqg/p/13647186.html