【350】Matrix Derivatives: Linear Algebra in Machine Learning

Reference: Matrix Derivatives in Linear Algebra for Machine Learning (机器学习中的线性代数之矩阵求导)

Reference: Matrix calculus - Wikipedia

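The matrix derivative, also loosely referred to as the matrix differential, is used frequently in formula derivations in machine learning, image processing, optimization, and related fields.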

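Layout: matrix derivatives can be written in two layouts, the denominator layout and the numerator layout. The differentiation rules differ between the two, so results written in one layout cannot be mixed with results written in the other.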

My own understanding:

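Numerator Layout: the result is arranged according to the numerator. If the numerator y is a column vector with m entries, the result has m rows, one per entry of y. The denominator is arranged the opposite way: if the denominator x is a column vector with n entries, the result has n columns, one per entry of x. This layout is fairly commonly used.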

Denominator Layout: the exact opposite of the above. The result is the transpose of the numerator-layout result.
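To make the two layouts concrete, here is a minimal sketch in plain Python. The map `f` from R^3 to R^2, the helper names, and the step size `eps` are all assumptions chosen for illustration and do not come from the referenced articles. The sketch approximates the derivative by central differences, stores it in numerator layout (rows follow y, columns follow x), and obtains the denominator layout by transposing.

```python
def f(x):
    """Hypothetical example map from R^3 to R^2, used only for illustration."""
    x1, x2, x3 = x
    return [x1 * x2, x2 + 3.0 * x3]


def jacobian_numerator(func, x, eps=1e-6):
    """Central-difference Jacobian in numerator layout: jac[i][j] ~ d y_i / d x_j."""
    m = len(func(x))  # number of outputs (numerator size)
    n = len(x)        # number of inputs (denominator size)
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        y_plus = func(x_plus)
        y_minus = func(x_minus)
        for i in range(m):
            jac[i][j] = (y_plus[i] - y_minus[i]) / (2.0 * eps)
    return jac


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(row) for row in zip(*matrix)]


x0 = [1.0, 2.0, 3.0]
J_num = jacobian_numerator(f, x0)  # numerator layout: m x n = 2 x 3
J_den = transpose(J_num)           # denominator layout: n x m = 3 x 2

print(len(J_num), len(J_num[0]))   # 2 3
print(len(J_den), len(J_den[0]))   # 3 2
```

The only difference between the two results is the orientation: `J_den[j][i]` holds the same number as `J_num[i][j]`.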

Numerator-layout notation

Using numerator-layout notation, we have:

$
{\displaystyle {\frac {\partial y}{\partial \mathbf {x} }}={\begin{bmatrix}{\frac {\partial y}{\partial x_{1}}}&{\frac {\partial y}{\partial x_{2}}}&\cdots &{\frac {\partial y}{\partial x_{n}}}\end{bmatrix}}.}
$

$
{\displaystyle {\frac {\partial \mathbf {y} }{\partial x}}={\begin{bmatrix}{\frac {\partial y_{1}}{\partial x}}\\{\frac {\partial y_{2}}{\partial x}}\\\vdots \\{\frac {\partial y_{m}}{\partial x}}\end{bmatrix}}.}
$

$
{\displaystyle {\frac {\partial \mathbf {y} }{\partial \mathbf {x} }}={\begin{bmatrix}{\frac {\partial y_{1}}{\partial x_{1}}}&{\frac {\partial y_{1}}{\partial x_{2}}}&\cdots &{\frac {\partial y_{1}}{\partial x_{n}}}\\{\frac {\partial y_{2}}{\partial x_{1}}}&{\frac {\partial y_{2}}{\partial x_{2}}}&\cdots &{\frac {\partial y_{2}}{\partial x_{n}}}\\\vdots &\vdots &\ddots &\vdots \\{\frac {\partial y_{m}}{\partial x_{1}}}&{\frac {\partial y_{m}}{\partial x_{2}}}&\cdots &{\frac {\partial y_{m}}{\partial x_{n}}}\end{bmatrix}}.}
$
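
As a small worked example of the vector-by-vector case (the constant matrix $\mathbf{A}$ is an assumption introduced here for illustration), take $\mathbf{y} = \mathbf{A}\mathbf{x}$ with $\mathbf{A}$ of size $m \times n$. Since $y_i = \sum_{j} a_{ij} x_j$, each entry is $\partial y_i / \partial x_j = a_{ij}$, so in numerator layout:

$
{\displaystyle {\frac {\partial (\mathbf {A} \mathbf {x} )}{\partial \mathbf {x} }}={\begin{bmatrix}a_{11}&a_{12}&\cdots &a_{1n}\\a_{21}&a_{22}&\cdots &a_{2n}\\\vdots &\vdots &\ddots &\vdots \\a_{m1}&a_{m2}&\cdots &a_{mn}\end{bmatrix}}=\mathbf {A} .}
$

In denominator layout the same derivative would be written as $\mathbf{A}^{\top}$.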

$
\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{21}} & \cdots & \frac{\partial y}{\partial x_{p1}}\\ \frac{\partial y}{\partial x_{12}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{p2}}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial y}{\partial x_{1q}} & \frac{\partial y}{\partial x_{2q}} & \cdots & \frac{\partial y}{\partial x_{pq}}\\ \end{bmatrix}.
$
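
Note that for a $p \times q$ matrix $\mathbf{X}$ this result is $q \times p$: the entry in row $j$, column $i$ is $\partial y / \partial x_{ij}$. As a hedged illustration (the constant $q \times p$ matrix $\mathbf{A}$ is an assumption introduced here), take $y = \operatorname{tr}(\mathbf{A}\mathbf{X})$:

$
y = \operatorname{tr}(\mathbf{A}\mathbf{X}) = \sum_{j=1}^{q} \sum_{i=1}^{p} a_{ji} x_{ij}, \qquad \frac{\partial y}{\partial x_{ij}} = a_{ji}, \qquad \frac{\partial \operatorname{tr}(\mathbf{A}\mathbf{X})}{\partial \mathbf{X}} = \mathbf{A}.
$

In denominator layout the same derivative would be written as $\mathbf{A}^{\top}$.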

The following definitions are only provided in numerator-layout notation:

$
\frac{\partial \mathbf{Y}}{\partial x} = \begin{bmatrix} \frac{\partial y_{11}}{\partial x} & \frac{\partial y_{12}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x}\\ \frac{\partial y_{21}}{\partial x} & \frac{\partial y_{22}}{\partial x} & \cdots & \frac{\partial y_{2n}}{\partial x}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial y_{m1}}{\partial x} & \frac{\partial y_{m2}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x}\\ \end{bmatrix}.
$

$
d\mathbf{X} = \begin{bmatrix} dx_{11} & dx_{12} & \cdots & dx_{1n}\\ dx_{21} & dx_{22} & \cdots & dx_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ dx_{m1} & dx_{m2} & \cdots & dx_{mn}\\ \end{bmatrix}.
$


Original article: https://www.cnblogs.com/alex-bn-lee/p/10292729.html