$$ x^*=\lim_{k\to \infty}x^{(k)} $$
$$ \lim_{k\to\infty}\|A-A_{k}\|=0 $$
A sequence of vectors (or numbers) converges if and only if it converges componentwise.
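Matrix convergence: the powers $A^k$ converge to the zero matrix if and only if the spectral radius satisfies $\rho(A)<1$. A spectral norm $\|A\|_2<1$ is sufficient but not necessary, since $\rho(A)\le\|A\|_2$. If $A^k\to 0$, then the matrix power series $I+A+A^2+\cdots$ converges to $(I-A)^{-1}$.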
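Jordan canonical form: every square matrix is similar (over $\mathbb C$) to a block-diagonal matrix of Jordan blocks. The $k$-th power of a Jordan block is upper triangular, and for $\lambda\ne 0$ its upper triangle fills in with nonzero entries of the form $\binom{k}{j}\lambda^{k-j}$ ⇒ $\lim_{k\to\infty}A^k=0 \iff$ the spectral radius is less than 1.
For example, in 【matrix iterative analysis】 the matrix powers play the role of a geometric sequence: all eigenvalues of $A$ satisfy $|\lambda_i(A)|<1$ ↔ the spectral radius is less than 1 ↔ $A^k$ converges to the zero matrix.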
$$ I+A+A^2+\cdots\to (I-A)^{-1} $$
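The convergence statements above can be checked numerically. The following is a minimal NumPy sketch, not part of the original notes; the matrices `A` and `B` are hypothetical examples chosen for illustration. `A` has spectral radius below 1, so the partial sums of $I+A+A^2+\cdots$ should approach $(I-A)^{-1}$. `B` has spectral norm above 1 but spectral radius 0.5, and its powers still vanish.

```python
import numpy as np

# Hypothetical example matrix (not from the notes), chosen so that rho(A) < 1.
A = np.array([[0.5, 0.4],
              [0.1, 0.3]])
rho_A = max(abs(np.linalg.eigvals(A)))      # spectral radius of A

# Partial sums S_k = I + A + ... + A^k of the matrix power series.
S = np.eye(2)
P = np.eye(2)                               # P holds the current power A^k
for _ in range(200):
    P = P @ A
    S = S + P

# Largest entrywise gap between the partial sum and (I - A)^{-1}.
neumann_error = np.max(np.abs(S - np.linalg.inv(np.eye(2) - A)))

# Second hypothetical matrix: spectral norm > 1, yet rho(B) = 0.5 < 1.
B = np.array([[0.5, 2.0],
              [0.0, 0.5]])
rho_B = max(abs(np.linalg.eigvals(B)))
norm_B = np.linalg.norm(B, 2)               # spectral norm (largest singular value)
B_power = np.linalg.matrix_power(B, 200)    # B^200, expected to be numerically zero

print(rho_A, neumann_error, rho_B, norm_B, np.max(np.abs(B_power)))
```

The loop count of 200 is arbitrary; any count large enough for $\rho(A)^k$ to fall below machine precision gives the same picture.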
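Differentiability of a vector-valued function $f:\Omega\subset \mathbb R^n\to \mathbb R^m$? ⇒ affine approximation of the function (the first-order Taylor approximation): $\mathcal A(x)=\mathcal L(x-x_0)+f(x_0)$, where $\mathcal L$ is a linear map; the gradient is defined from this.
Differentiability in this sense means: a univariate function has a derivative, a multivariate function has a gradient, and a vector-valued function has a Jacobian matrix.
The directional derivative of a multivariate function is $\frac{\partial f}{\partial d}=\nabla f(x)^Td$, so among unit vectors $d$ the directional derivative is largest along the gradient direction.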
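A small numerical check of the directional-derivative formula, as a sketch only: the test function `f`, the point `x`, and the direction `d` are hypothetical choices, and the derivative along `d` is approximated by a central difference. The second part samples unit directions on the circle and looks for the one with the largest directional derivative.

```python
import numpy as np

def f(x):
    # Hypothetical smooth test function f(x) = x1^2 + 3*x1*x2 + sin(x2).
    return x[0] ** 2 + 3 * x[0] * x[1] + np.sin(x[1])

def grad_f(x):
    # Gradient of the test function, computed by hand.
    return np.array([2 * x[0] + 3 * x[1], 3 * x[0] + np.cos(x[1])])

def directional_derivative(func, x, d, t=1e-6):
    # Central-difference approximation of the derivative of func along d.
    return (func(x + t * d) - func(x - t * d)) / (2 * t)

x = np.array([1.0, 0.5])
g = grad_f(x)

d = np.array([0.6, 0.8])                    # a unit vector
numeric = directional_derivative(f, x, d)
analytic = g @ d                            # grad f(x)^T d

# Sample unit directions and find the one maximizing grad f(x)^T d.
angles = np.linspace(0.0, 2 * np.pi, 3600, endpoint=False)
dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
values = dirs @ g
best = dirs[np.argmax(values)]

print(numeric, analytic, best, g / np.linalg.norm(g))
```

Up to the angular sampling resolution, `best` should coincide with the normalized gradient, and no sampled value exceeds $\|\nabla f(x)\|$.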
Gradient vector
$$ \nabla f(\boldsymbol{x})=\begin{bmatrix}\frac{\partial f}{\partial x_1}(\boldsymbol{x})\\\vdots\\\frac{\partial f}{\partial x_n}(\boldsymbol{x})\end{bmatrix}=Df(\boldsymbol{x})^\top $$
<aside>
$$ Df(x)=\frac d{dx}\otimes f=\frac{d f(x)}{dx} $$
$$ Df(\vec x)=(\partial_{x_1},\partial_{x_2},\cdots,\partial_{x_m})\otimes f(\vec x)= \left(\nabla f(\vec x)\right)^T,\vec x\in\mathbb R^m $$
$$ Df(\bm A)= \begin{pmatrix} \partial_{ij} \end{pmatrix}_{ij}\otimes f(\bm A)= \begin{pmatrix} \partial_{ij}f(\bm A) \end{pmatrix}_{ij},\bm A\in\mathbb R^{m\times n} $$
$$ D\vec f(\vec x)= (\partial_{x_1},\partial_{x_2},\cdots,\partial_{x_m})\otimes \vec f(\vec x)= \begin{pmatrix} \partial_j f_i \end{pmatrix}_{ij}= \begin{pmatrix} (\nabla f_i)^T \end{pmatrix}_{i}=\bm J\left(\vec f\right)
$$
Fréchet derivative https://en.wikipedia.org/wiki/Fréchet_derivative
$$ \lim_{\|h\|_{V} \to 0} \frac{\| f(x+h) - f(x) - A h \|_{W}}{\|h\|_{V}} = 0. $$
Gateaux derivative https://en.wikipedia.org/wiki/Gateaux_derivative
$$ g(v) = \lim_{t \to 0} \frac{f(x + t v) - f(x)}{t}. $$
commutation matrix https://en.wikipedia.org/wiki/Commutation_matrix
$$ \bm K^{(m,n)}\operatorname{vec}(\bm A) = \operatorname{vec}(\bm A^T) . $$
$$ [\operatorname{vec}(\bm A)]_{i+m(j-1)}=a_{ij},\quad \operatorname{vec}(\bm A)\in V^{mn\times 1},\ \bm A\in V^{m\times n} $$
permutation $P_\pi,\pi(i+m(j-1))=j+n(i-1)$
$$ \operatorname{vec}(\bm ABC)= (\bm C^T\otimes \bm A)\operatorname{vec}(B) $$
→ $\bm K^{(r,m)}(A\otimes B)\operatorname{vec} X= (B\otimes A)\bm K^{(q,n)}\operatorname{vec} X$
$$ \bm K^{(r,m)}(A\otimes B)\bm K^{(n,q)}= B\otimes A,\quad A\in V^{m\times n} ,B\in V^{r\times q}\\ \bm K^{(r,m)}(A\otimes B)= (B\otimes A)\bm K^{(q,n)} $$
$$ DF(A)= \begin{pmatrix} \partial_{A_{ij}} \end{pmatrix}_{ij} \otimes F= \left(\frac{\partial F_{pq}}{\partial A_{ij}}\right) $$
</aside>
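The vec and commutation-matrix identities from the aside can be verified on random matrices. This is a sketch under stated assumptions: `vec` uses column-major stacking, `commutation_matrix` is a hypothetical helper written for this check (NumPy has no built-in for it), and the sizes `m, n, r, q` are arbitrary.

```python
import numpy as np

def vec(M):
    # Column-major stacking: entry a_ij lands at 0-based position i + m*j.
    return M.reshape(-1, order="F")

def commutation_matrix(m, n):
    # Hypothetical helper: builds K^{(m,n)} with K vec(A) = vec(A^T) for A in R^{m x n}.
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # a_ij sits at i + m*j in vec(A) and at j + n*i in vec(A^T) (0-based).
            K[j + n * i, i + m * j] = 1.0
    return K

rng = np.random.default_rng(0)
m, n, r, q = 2, 3, 4, 5
A = rng.standard_normal((m, n))
B = rng.standard_normal((r, q))
Y = rng.standard_normal((n, r))

# K^{(m,n)} vec(A) = vec(A^T)
ok_transpose = np.allclose(commutation_matrix(m, n) @ vec(A), vec(A.T))

# vec(A Y B) = (B^T kron A) vec(Y)
ok_vec = np.allclose(vec(A @ Y @ B), np.kron(B.T, A) @ vec(Y))

# K^{(r,m)} (A kron B) K^{(n,q)} = B kron A
ok_swap = np.allclose(
    commutation_matrix(r, m) @ np.kron(A, B) @ commutation_matrix(n, q),
    np.kron(B, A),
)

print(ok_transpose, ok_vec, ok_swap)
```

The dense construction is only meant for small sizes; for large matrices one would apply the permutation by reshaping instead of forming $\bm K$ explicitly.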
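【Jacobian matrix】
Each row is the transposed gradient $Df_i(\boldsymbol x_0)$ of one scalar-valued multivariate function, namely one of the $m$ components of the vector-valued function $\bm f:\mathbb R^n\to \mathbb R^m$.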
$$ \mathbf{J}=\begin{bmatrix}\dfrac{\partial f_1}{\partial x_1}(\boldsymbol{x}_0)&\cdots&\dfrac{\partial f_1}{\partial x_n}(\boldsymbol{x}_0)\\\vdots&&\vdots\\\dfrac{\partial f_m}{\partial x_1}(\boldsymbol{x}_0)&\cdots&\dfrac{\partial f_m}{\partial x_n}(\boldsymbol{x}_0)\end{bmatrix}=\begin{bmatrix} \nabla f_1(\boldsymbol{x}_0)^T\\ \nabla f_2(\boldsymbol{x}_0)^T \\ \vdots \\ \nabla f_m(\boldsymbol{x}_0)^T \end{bmatrix}=\begin{bmatrix}\frac{\partial\mathbf{f}}{\partial x_1}(\boldsymbol{x}_0)&\cdots&\frac{\partial\mathbf{f}}{\partial x_n}(\boldsymbol{x}_0)\end{bmatrix} $$
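To make the row and column structure of the Jacobian concrete, here is a minimal finite-difference sketch. The map `f` from $\mathbb R^2$ to $\mathbb R^3$ and the helper `jacobian_fd` are hypothetical, written only for this illustration. Each column of the numerical Jacobian is a central difference of $\mathbf f$ along one coordinate, matching the column form $\frac{\partial\mathbf f}{\partial x_j}$ above.

```python
import numpy as np

def f(x):
    # Hypothetical map f: R^2 -> R^3 used only for illustration.
    return np.array([x[0] * x[1], np.sin(x[0]), x[1] ** 2])

def jacobian_exact(x):
    # Hand-computed Jacobian: row i is the transposed gradient of f_i.
    return np.array([[x[1], x[0]],
                     [np.cos(x[0]), 0.0],
                     [0.0, 2 * x[1]]])

def jacobian_fd(func, x, h=1e-6):
    # Central differences: column j approximates the partial derivative of func w.r.t. x_j.
    x = np.asarray(x, dtype=float)
    J = np.zeros((func(x).size, x.size))
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        J[:, j] = (func(x + e) - func(x - e)) / (2 * h)
    return J

x0 = np.array([0.3, -1.2])
J_num = jacobian_fd(f, x0)
J_ref = jacobian_exact(x0)
print(J_num.shape, np.max(np.abs(J_num - J_ref)))
```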
Differentiability at $\boldsymbol x_0$ means that the error of the affine approximation vanishes faster than $\|\boldsymbol x-\boldsymbol x_0\|$:
$$ \lim_{\boldsymbol x\to \boldsymbol x_0,\,\boldsymbol x\in\Omega}\frac{\|\boldsymbol f(\boldsymbol{x})-(\mathcal{L}(\boldsymbol{x}-\boldsymbol{x}_0)+\boldsymbol{f}(\boldsymbol{x}_0))\|}{\|\boldsymbol{x}-\boldsymbol{x}_0\|}=0 $$
Hessian matrix: if $\nabla f$ is differentiable, then $f$ is twice differentiable. Each row is the transposed gradient of one scalar function $\partial_i f$ (one component of the gradient of the multivariate function $f$): $D(\nabla f)=\begin{bmatrix} \nabla (\partial_1 f)(\boldsymbol{x}_0)^T\\ \nabla (\partial_2 f)(\boldsymbol{x}_0)^T \\ \vdots \\ \nabla (\partial_n f)(\boldsymbol{x}_0)^T \end{bmatrix}$
$$ D^2f=\begin{bmatrix}\frac{\partial^2f}{\partial x_1^2}&\frac{\partial^2f}{\partial x_2\partial x_1}&\cdots&\frac{\partial^2f}{\partial x_n\partial x_1}\\\frac{\partial^2f}{\partial x_1\partial x_2}&\frac{\partial^2f}{\partial x_2^2}&\cdots&\frac{\partial^2f}{\partial x_n\partial x_2}\\\vdots&\vdots&\ddots&\vdots\\\frac{\partial^2f}{\partial x_1\partial x_n}&\frac{\partial^2f}{\partial x_2\partial x_n}&\cdots&\frac{\partial^2f}{\partial x_n^2}\end{bmatrix} $$
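The Hessian can be read as the Jacobian of the gradient. The sketch below illustrates this with a hypothetical test function $f(x)=x_1^2x_2+x_2^3$, whose gradient and Hessian are computed by hand; the helper `hessian_fd` is an assumption of this example and simply differentiates the gradient by central differences.

```python
import numpy as np

def grad_f(x):
    # Gradient of the hypothetical test function f(x) = x1^2 * x2 + x2^3.
    return np.array([2 * x[0] * x[1], x[0] ** 2 + 3 * x[1] ** 2])

def hessian_exact(x):
    # Hand-computed Hessian of the same function.
    return np.array([[2 * x[1], 2 * x[0]],
                     [2 * x[0], 6 * x[1]]])

def hessian_fd(grad, x, h=1e-6):
    # Jacobian of the gradient: column j is the derivative of grad w.r.t. x_j,
    # so row i holds the transposed gradient of the i-th partial derivative.
    n = x.size
    H = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        H[:, j] = (grad(x + e) - grad(x - e)) / (2 * h)
    return H

x0 = np.array([0.7, -0.4])
H_num = hessian_fd(grad_f, x0)
H_ref = hessian_exact(x0)
print(np.max(np.abs(H_num - H_ref)), np.max(np.abs(H_num - H_num.T)))
```

For this twice continuously differentiable example the numerical Hessian is symmetric up to rounding, as expected from the equality of mixed partial derivatives.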
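Continuously differentiable functions: $f\in\mathcal C^n$ means that every component of $f$ has continuous partial derivatives up to order $n$.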
Chain rule: the factors do not commute, because they are Jacobian matrices and vectors. For $h(t)=g(\boldsymbol f(t))$:
$$ h'(t)=Dg(\boldsymbol{f}(t))D\boldsymbol{f}(t)=\nabla g(\boldsymbol{f}(t))^\top\begin{bmatrix}f'_1(t)\\\vdots\\f'_n(t)\end{bmatrix} $$
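A quick numerical check of this chain rule, as a sketch: the inner curve `f`, the outer function `g`, and the evaluation point `t0` are hypothetical choices, with derivatives written out by hand and compared against a central difference of the composition.

```python
import numpy as np

def f(t):
    # Hypothetical curve f: R -> R^2.
    return np.array([np.cos(t), t ** 2])

def df(t):
    # Componentwise derivative [f1'(t), f2'(t)].
    return np.array([-np.sin(t), 2 * t])

def g(y):
    # Hypothetical scalar function g: R^2 -> R.
    return y[0] ** 2 + y[0] * y[1]

def grad_g(y):
    return np.array([2 * y[0] + y[1], y[0]])

t0 = 0.7
chain = grad_g(f(t0)) @ df(t0)              # grad g(f(t))^T f'(t)

step = 1e-6
numeric = (g(f(t0 + step)) - g(f(t0 - step))) / (2 * step)
print(chain, numeric)
```

The order of the factors matters in general: $Dg(\boldsymbol f(t))$ is a $1\times n$ row and $D\boldsymbol f(t)$ an $n\times 1$ column, so only this order yields a scalar.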
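Examples: using the associativity of matrix multiplication, the derivatives can be written in the forms below; other forms are computed componentwise.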
| $f(x)\in\mathbb R,\ x\in\mathbb R^n$ | $Df(x)=(\nabla f)^T\in\mathbb R^{1\times n},\\ D\mathbf f(x)=J\mathbf f\in \mathbb R^{m\times n},\\ F(x)=D^2f(x)=J(\nabla f)(x)\in\mathbb R^{n\times n}$ |
|---|---|
| $x,x\in \mathbb R^n$ | $I$ |
| $Ax+b,A\in \mathbb R^{m\times n}$ | $A$ |
| $a^Tx,a\in\mathbb R^n$ | $a^T$ |
| $x^Tx,x\in\mathbb R^n$ | $2x^T$ |
| $x^TAx,A\in \mathbb R^{n\times n}$ | Exercise: $x^T(A+A^T)$ |
| $\lVert x\rVert_2,x\in\mathbb R^n$ | $\frac{x^T}{\lVert x\rVert_2},x\ne 0$ |
| $\mathbf f(x)\in\mathbb R^m,\ x\in\mathbb R^n$ | |
| $\operatorname{tr}(A^TX)$ (with respect to $X$) | $A$ |
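Several entries of the table can be checked against central differences. This is a sketch only: `numeric_derivative` is a hypothetical helper, the data are random, and the norm entry is assumed to be the Euclidean norm. NumPy 1-D arrays carry no row or column orientation, so a row vector such as $x^T(A+A^T)$ is compared as a plain 1-D array.

```python
import numpy as np

def numeric_derivative(func, x, h=1e-6):
    # Central-difference derivative of a scalar function with respect to
    # every entry of x; the result has the same shape as x.
    x = np.asarray(x, dtype=float)
    D = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        e = np.zeros_like(x)
        e[idx] = h
        D[idx] = (func(x + e) - func(x - e)) / (2 * h)
    return D

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
a = rng.standard_normal(n)
x = rng.standard_normal(n)
M = rng.standard_normal((3, 2))             # plays the role of A in tr(A^T X)
X = rng.standard_normal((3, 2))

# Pairs of (numerical derivative, closed form from the table).
checks = {
    "a^T x": (numeric_derivative(lambda v: a @ v, x), a),
    "x^T x": (numeric_derivative(lambda v: v @ v, x), 2 * x),
    "x^T A x": (numeric_derivative(lambda v: v @ A @ v, x), x @ (A + A.T)),
    "||x||_2": (numeric_derivative(np.linalg.norm, x), x / np.linalg.norm(x)),
    "tr(M^T X)": (numeric_derivative(lambda Z: np.trace(M.T @ Z), X), M),
}
results = {name: bool(np.allclose(num, ref, atol=1e-6))
           for name, (num, ref) in checks.items()}
print(results)
```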
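Level curves (more precisely, their tangent lines) are orthogonal to the gradient, which points in the direction of increasing function value.
The implicit function theorem (via partial derivatives) gives the slope of the tangent line.
Near a nondegenerate extremum, the level curves are approximately ellipses.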
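The orthogonality and tangent-slope statements can be illustrated on a function whose level curves are exact ellipses. The sketch below uses the hypothetical example $f(x,y)=x^2+2y^2$ with the level curve $f=c$ parameterized by an angle `theta`; it compares the tangent slope of the curve with the implicit-function-theorem value $-f_x/f_y$.

```python
import numpy as np

def grad_f(p):
    # Gradient of the hypothetical function f(x, y) = x^2 + 2*y^2.
    return np.array([2 * p[0], 4 * p[1]])

c = 3.0        # level value, f(x, y) = c is an ellipse
theta = 0.9    # arbitrary parameter value on the ellipse

# Point on the level curve and the tangent vector d(point)/d(theta).
point = np.array([np.sqrt(c) * np.cos(theta), np.sqrt(c / 2) * np.sin(theta)])
tangent = np.array([-np.sqrt(c) * np.sin(theta), np.sqrt(c / 2) * np.cos(theta)])

g = grad_f(point)
dot = g @ tangent                           # expected to be zero (orthogonality)

slope_curve = tangent[1] / tangent[0]       # dy/dx along the level curve
slope_ift = -g[0] / g[1]                    # -f_x / f_y from the implicit function theorem
print(dot, slope_curve, slope_ift)
```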