<aside> 📲
Real scalar functions, complex scalar functions
Real-valued vector functions, complex-valued vector functions
Real-valued tensor functions, complex-valued tensor functions
Perturbations of convex optimization problems with real parameters, and with complex parameters
</aside>
<aside>
$$ Df(x)=\frac d{dx}\otimes f=\frac{d f(x)}{dx} $$
$$ Df(\vec x)=(\partial_{x_1},\partial_{x_2},\cdots,\partial_{x_m})\otimes f(\vec x)= \left(\nabla f(\vec x)\right)^T,\vec x\in\mathbb R^m $$
$$ Df(\bm A)= \begin{pmatrix} \partial_{ij}\\ \end{pmatrix}_{ij}\otimes f(\bm A)= \begin{pmatrix} \partial_{ij}f(\bm A)\\ \end{pmatrix}_{ij},\quad\bm A\in\mathbb R^{m\times n} $$
$$ D\vec f(\vec x)= (\partial_{x_1},\partial_{x_2},\cdots,\partial_{x_m})\otimes \vec f(\vec x)= \begin{pmatrix} \partial_j f_i\\ \end{pmatrix}_{ij}= \begin{pmatrix} \nabla f_i\\ \end{pmatrix}^T=\bm J\left(\vec f\right) $$
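As a numerical check of this row-of-gradients convention, a minimal sketch using torch.autograd.functional.jacobian (the function f is made up for illustration):

```python
import torch

# f: R^3 -> R^2, chosen arbitrarily for illustration
def f(x):
    return torch.stack([x[0] * x[1], torch.sin(x[2])])

x = torch.tensor([1.0, 2.0, 3.0])
J = torch.autograd.functional.jacobian(f, x)
# J has shape (2, 3) with J[i, j] = df_i/dx_j,
# i.e. row i is (grad f_i)^T as in the formula above
print(J)
```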
Fréchet derivative https://en.wikipedia.org/wiki/Fréchet_derivative
$$ \lim_{\|h\|_{V} \to 0} \frac{\| f(x+h) - f(x) - A h \|_{W}}{\|h\|_{V}} = 0. $$
Gateaux derivative https://en.wikipedia.org/wiki/Gateaux_derivative
$$ g(v) = \lim_{t \to 0} \frac{f(x + t v) - f(x)}{t}. $$
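Since the Gateaux derivative is a directional derivative, g(v) can be sanity-checked against a finite-difference quotient; a minimal sketch (example function assumed), where the same value also equals J v:

```python
import torch

def f(x):
    return torch.stack([x[0] ** 2, x[0] * x[1]])

x = torch.tensor([1.0, 2.0], dtype=torch.double)
v = torch.tensor([0.5, -1.0], dtype=torch.double)

# difference quotient (f(x + t v) - f(x)) / t for small t
t = 1e-7
g_fd = (f(x + t * v) - f(x)) / t

# for Fréchet-differentiable f this is also J v
J = torch.autograd.functional.jacobian(f, x)
print(g_fd, J @ v)  # agree up to finite-difference error
```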
commutation matrix https://en.wikipedia.org/wiki/Commutation_matrix
$$ \bm K^{(m,n)}\operatorname{vec}(\bm A) = \operatorname{vec}(\bm A^T) . $$
$$ \operatorname{vec}(\bm A)= (a_{i+m(j-1)})_{mn\times 1},\bm A\in V^{m\times n} $$
i.e. the permutation matrix $P_\pi$ of the permutation $\pi(i+m(j-1))=j+n(i-1)$
$$ \operatorname{vec}(\bm A\bm B\bm C)= (\bm C^T\otimes \bm A)\operatorname{vec}(\bm B) $$
→ $\bm K^{(r,m)}(\bm A\otimes \bm B)\operatorname{vec} \bm X= (\bm B\otimes \bm A)\bm K^{(q,n)}\operatorname{vec} \bm X,\bm X\in V^{q\times n}$
$$ \bm K^{(r,m)}(\bm A\otimes \bm B)\bm K^{(n,q)}= \bm B\otimes \bm A,\quad\bm A\in V^{m\times n},\bm B\in V^{r\times q}\\ \bm K^{(r,m)}(\bm A\otimes \bm B)= (\bm B\otimes \bm A)\bm K^{(q,n)} $$
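A NumPy sketch verifying the identities above; the helper commutation builds $\bm K^{(m,n)}$ directly from the permutation $\pi(i+m(j-1))=j+n(i-1)$, it is not a library routine:

```python
import numpy as np

def vec(A):
    # column-major stacking: vec(A)[i + m*(j-1)] = a_ij (1-based)
    return A.reshape(-1, order="F")

def commutation(m, n):
    # permutation matrix with K @ vec(A) = vec(A.T) for A of shape (m, n)
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            K[j + n * i, i + m * j] = 1.0  # 0-based version of pi
    return K

m, n, p, r, q = 2, 3, 4, 3, 2
A = np.random.randn(m, n)
assert np.allclose(commutation(m, n) @ vec(A), vec(A.T))

B, C = np.random.randn(n, p), np.random.randn(p, q)
assert np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B))

Bq = np.random.randn(r, q)
lhs = commutation(r, m) @ np.kron(A, Bq) @ commutation(n, q)
assert np.allclose(lhs, np.kron(Bq, A))  # K^(r,m) (A x B) K^(n,q) = B x A
print("identities verified")
```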
$$ DF(\bm A)= \begin{pmatrix} \partial_{A_{ij}}\\ \end{pmatrix}_{ij} \otimes F= \begin{pmatrix} \frac{\partial F_{pq}}{\partial A_{ij}}\\ \end{pmatrix} $$
</aside>
https://en.wikipedia.org/wiki/Dual_number#Automatic_differentiation
$$ f(a+b\varepsilon )=\sum _{n=0}^{\infty }{\frac {f^{(n)}(a)b^{n}\varepsilon ^{n}}{n!}}=f(a)+bf'(a)\varepsilon $$
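Because $\varepsilon^2=0$ truncates the Taylor series after the first-order term, pushing (value, derivative) pairs through each elementary operation yields exact derivatives; a minimal dual-number sketch in plain Python (illustrative only, not PyTorch's implementation):

```python
class Dual:
    """a + b*eps with eps**2 == 0; the b slot carries the derivative."""

    def __init__(self, a, b=0.0):
        self.a, self.b = a, b

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a + other.a, self.b + other.b)

    __radd__ = __add__

    def __mul__(self, other):
        # (a1 + b1 eps)(a2 + b2 eps) = a1 a2 + (a1 b2 + b1 a2) eps
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __rmul__ = __mul__

# f(x) = 3x^2 + 2x, seeded with tangent b = 1 to read off f'(x)
x = Dual(2.0, 1.0)
y = 3 * x * x + 2 * x
print(y.a, y.b)  # 16.0 and f'(2) = 6*2 + 2 = 14.0
```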
Hence it suffices to describe every operation in the construction of the function with dual tensors; once the function has been built, its derivative can be read off directly. https://docs.pytorch.org/tutorials/intermediate/forward_ad_usage.html#usage-with-modules
This scheme takes twice the memory for the weights, so it suits functions whose input dimension is small (the variables occupy little memory) and whose output dimension is large.
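A minimal sketch of the dual-tensor workflow from the linked tutorial, using torch.autograd.forward_ad:

```python
import torch
import torch.autograd.forward_ad as fwAD

primal = torch.randn(3)
tangent = torch.randn(3)  # the direction v of the Jacobian-vector product

with fwAD.dual_level():
    # a dual tensor plays the role of a + b*eps
    dual = fwAD.make_dual(primal, tangent)
    out = (dual ** 2).sum() + dual.sin().sum()
    # the tangent of the output is J v, computed alongside the value
    value, jvp = fwAD.unpack_dual(out)

print(value, jvp)
```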
register the jvp() static method for a custom autograd Function
$$ \bm J \vec v $$
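A sketch of that registration, following the custom-Function pattern in the linked tutorial (Exp is an illustrative example; the forward result is stashed on ctx for reuse in jvp):

```python
import torch
import torch.autograd.forward_ad as fwAD

class Exp(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        result = torch.exp(x)
        ctx.result = result  # reused by jvp below
        return result

    @staticmethod
    def jvp(ctx, x_t):
        # directional derivative of exp at x in direction x_t
        return ctx.result * x_t

primal = torch.randn(4)
tangent = torch.randn(4)
with fwAD.dual_level():
    out = Exp.apply(fwAD.make_dual(primal, tangent))
    print(fwAD.unpack_dual(out).tangent)  # equals exp(primal) * tangent
```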
<aside> 📲
A similar method works for polynomials of n variables, using the Exterior Algebra https://en.wikipedia.org/wiki/Exterior_algebra of an n-dimensional vector space. The Grassmann algebra $Gr(n,\mathbb C)$ → an extension of the dual numbers; see also https://en.wikipedia.org/wiki/Quadratic_algebra
</aside>
vjp()
$$ g(\vec v)=\vec v^{\,T} \bm J $$
The chain rule for the Jacobian thus becomes a chain rule for g, and since g is a vector-valued function, the Jacobian never has to be computed explicitly (closed-form rules can be used instead).
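A sketch using torch.autograd.functional.vjp, which returns $\vec v^{\,T}\bm J$ without ever materializing $\bm J$ (the function f is made up for illustration):

```python
import torch

def f(x):
    return torch.stack([x[0] * x[1], x[1] ** 2, x[0].sin()])

x = torch.tensor([1.0, 2.0])
v = torch.tensor([1.0, 0.5, -1.0])  # the row vector multiplying J from the left

# returns (f(x), v^T J); J itself is never formed
out, vjp = torch.autograd.functional.vjp(f, x, v)
print(vjp)
```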
<aside>
Differentiating optimization problems
The derivative of the solution map of a convex cone program, when it exists.
Given a perturbation of the cone program coefficients, compute the resulting change in the solution; and compute the gradient of a function of the solution with respect to the coefficients (the adjoint).
The number of coefficients can run into the millions ($\sim 10^6$). </aside>
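A sketch of differentiating through a cone program with the diffcp package, which implements this solution-map derivative; the toy box-constrained LP below is made up (and chosen so the instance is always solvable):

```python
import numpy as np
import scipy.sparse as sparse
import diffcp

# minimize c^T x  s.t.  Ax + s = b, s in R^{2n}_+  (the box -1 <= x <= 1)
n = 3
rng = np.random.default_rng(0)
c = rng.standard_normal(n)
A = sparse.vstack([sparse.eye(n), -sparse.eye(n)], format="csc")
b = np.ones(2 * n)
cone_dict = {"l": 2 * n}  # nonnegative orthant of dimension 2n

# D is the derivative of the solution map, DT its adjoint
x, y, s, D, DT = diffcp.solve_and_derivative(A, b, c, cone_dict)

# push a perturbation of the coefficients through D
dA = sparse.csc_matrix(A.shape)
db = np.zeros(2 * n)
dc = 1e-2 * rng.standard_normal(n)
dx, dy, ds = D(dA, db, dc)
print(x, dx)  # the solution and its first-order change
```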
residual map
the homogeneous self-dual problem
for Linear Programming
Ye et al. (1994), "An O(√nL)-Iteration Homogeneous and Self-Dual Linear Programming Algorithm"
for cone programming
homogeneous self-dual / symmetric embedding
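For reference, a sketch of the embedding for the SCS-style primal form $\min\ \vec c^T\vec x$ s.t. $\bm A\vec x+\vec s=\vec b$, $\vec s\in\mathcal K$: the skew-symmetry of $\bm Q$ makes the system self-dual, and the zero right-hand side makes it homogeneous.

$$ \text{find }(u,v):\ v=\bm Q u,\quad \bm Q=\begin{pmatrix} 0 & \bm A^T & \vec c\\ -\bm A & 0 & \vec b\\ -\vec c^{\,T} & -\vec b^{\,T} & 0 \end{pmatrix},\quad u=\begin{pmatrix}\vec x\\ \vec y\\ \tau\end{pmatrix}\in\mathbb R^n\times\mathcal K^*\times\mathbb R_+,\ v=\begin{pmatrix}\vec 0\\ \vec s\\ \kappa\end{pmatrix}\in\{0\}^n\times\mathcal K\times\mathbb R_+ $$

If a nonzero solution has $\tau>0$, then $(\vec x/\tau,\vec y/\tau,\vec s/\tau)$ is primal-dual optimal; if $\kappa>0$, it certifies primal or dual infeasibility.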