1. Sherman-Morrison-Woodbury Formula
- \(\displaystyle (I-\mathbf{u}\mathbf{v}^\top)^{-1}=I+\frac{\mathbf{u}\mathbf{v}^\top}{1-\mathbf{v}^\top\mathbf{u}}\), valid when \(\mathbf{v}^\top\mathbf{u}\neq 1\).
\(\square\)
- \((I_n-UV^\top)^{-1}=I_n+U(I_k-V^\top U)^{-1}V^\top\), where \(U\) and \(V\) are \(n \times k\) matrices.
\(\square\)
- \((A-UV^\top)^{-1}=A^{-1}+A^{-1}U(I-V^\top A^{-1}U)^{-1}V^\top A^{-1}\).
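- A quick numerical check of the rank-\(k\) identity (a sketch in NumPy with made-up random test matrices; the diagonal shift just keeps \(A\) safely invertible):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2
A = rng.standard_normal((n, n)) + n * np.eye(n)  # shift makes A safely invertible
U = rng.standard_normal((n, k))
V = rng.standard_normal((n, k))

Ainv = np.linalg.inv(A)
lhs = np.linalg.inv(A - U @ V.T)
rhs = Ainv + Ainv @ U @ np.linalg.inv(np.eye(k) - V.T @ Ainv @ U) @ V.T @ Ainv
print(np.allclose(lhs, rhs))  # expect True
```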
1.1. Application: Solving a Perturbation Problem
- Suppose \(A\mathbf{w}=\mathbf{b}\) has already been solved for \(\mathbf{w}\), and we now want to solve the perturbed system \((A-\mathbf{u}\mathbf{v}^\top)\mathbf{x}=\mathbf{b}\). First solve \(A\mathbf{z}=\mathbf{u}\); then \[\mathbf{x}=\mathbf{w}+\frac{\mathbf{v}^\top\mathbf{w}}{1-\mathbf{v}^\top\mathbf{z}}\,\mathbf{z}.\]
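- A sketch of the two-solve recipe (NumPy, random test data; in practice one would reuse a single factorization of \(A\), e.g. SciPy's `lu_factor`/`lu_solve`, so the extra solve per perturbation is cheap):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)
u = rng.standard_normal(n)
v = rng.standard_normal(n)
b = rng.standard_normal(n)

w = np.linalg.solve(A, b)          # old solution: A w = b
z = np.linalg.solve(A, u)          # one extra solve: A z = u
x = w + (v @ w) / (1 - v @ z) * z  # Sherman-Morrison update

print(np.allclose((A - np.outer(u, v)) @ x, b))  # expect True
```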
1.2. Application: Recursive Least Squares
- Suppose a new measurement arrives in a least-squares problem: \(\begin{bmatrix} A \\ \mathbf{v} \end{bmatrix}\mathbf{x}=\begin{bmatrix} \mathbf{b} \\ b_\text{new} \end{bmatrix}\), where \(\mathbf{v}\) is the new \(1 \times n\) measurement row. The new normal equation is \[\begin{bmatrix} A^\top & \mathbf{v}^\top \end{bmatrix}\begin{bmatrix} A \\ \mathbf{v} \end{bmatrix}\widehat{\mathbf{x}}_\text{new}=\begin{bmatrix} A^\top & \mathbf{v}^\top \end{bmatrix}\begin{bmatrix} \mathbf{b} \\ b_\text{new} \end{bmatrix},\] i.e., \((A^\top A+\mathbf{v}^\top\mathbf{v})\widehat{\mathbf{x}}_\text{new}=A^\top\mathbf{b}+\mathbf{v}^\top b_\text{new}\). By Sherman-Morrison, the update formula is \[(A^\top A+\mathbf{v}^\top\mathbf{v})^{-1}=(A^\top A)^{-1}-c(A^\top A)^{-1}\mathbf{v}^\top\mathbf{v}(A^\top A)^{-1}\] with \(\displaystyle c=\frac{1}{1+\mathbf{v}(A^\top A)^{-1}\mathbf{v}^\top}\). To find \(c\), we only need to solve the old system \((A^\top A)\mathbf{y}=\mathbf{v}^\top\); no new factorization is required. The same idea applies when several new rows are appended at once, using the Woodbury form with \(k>1\).
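- A sketch of the recursive update (NumPy, random test data; the rank-one answer `x_new` is checked against a from-scratch least-squares solve on the enlarged system):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 20, 4
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
v = rng.standard_normal(n)   # new measurement row (1 x n)
b_new = rng.standard_normal()

G = A.T @ A                  # old Gram matrix A^T A
rhs = A.T @ b + v * b_new    # new right-hand side A^T b + v^T b_new
y = np.linalg.solve(G, v)    # old-system solve: (A^T A) y = v^T
c = 1.0 / (1.0 + v @ y)
t = np.linalg.solve(G, rhs)
x_new = t - c * (v @ t) * y  # Sherman-Morrison applied to the normal equation

# Compare with solving the enlarged least-squares problem from scratch.
x_direct = np.linalg.lstsq(np.vstack([A, v]), np.append(b, b_new), rcond=None)[0]
print(np.allclose(x_new, x_direct))  # expect True
```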
1.3. Application: Kalman Filter
- The same update idea also applies to dynamic least squares, where the state changes between measurements; this recursive update leads to the Kalman filter.
2. Derivative of \(A^{-1}\)
- Suppose \(A\) and \(B\) are invertible. \(B^{-1}-A^{-1}=B^{-1}(A-B)A^{-1}\).
\(\square\)
- Let \(\Delta A=B-A\) and \(\Delta(A^{-1})=B^{-1}-A^{-1}\); then \(\displaystyle \frac{\Delta(A^{-1})}{\Delta t}=(A+\Delta A)^{-1}\left(-\frac{\Delta A}{\Delta t}\right)A^{-1}\). Therefore, as \(\Delta t \to 0\) (so \(\Delta A \to 0\)), \[\frac{\text{d}A^{-1}}{\text{d}t}=-A^{-1}\frac{\text{d}A}{\text{d}t}A^{-1}.\]
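- A finite-difference check of this derivative (a sketch with a made-up matrix family \(A(t)=A_0+t\,\Delta\)):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A0 = rng.standard_normal((n, n)) + n * np.eye(n)
dA = rng.standard_normal((n, n))  # plays the role of dA/dt

h = 1e-6
numeric = (np.linalg.inv(A0 + h * dA) - np.linalg.inv(A0)) / h
exact = -np.linalg.inv(A0) @ dA @ np.linalg.inv(A0)
print(np.max(np.abs(numeric - exact)))  # small, O(h)
```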
3. Change in Eigenvalues and Singular Values
- Setup: \(A(t)\mathbf{x}(t)=\lambda(t)\mathbf{x}(t)\), \(\mathbf{y}^\top(t)A(t)=\lambda(t)\mathbf{y}^\top(t)\), and \(\mathbf{y}^\top(t)\mathbf{x}(t)=1\). In matrix notation, \(AX=X\Lambda\), \(Y^\top A=\Lambda Y^\top\), and \(Y^\top X=I\).
- \(\lambda(t)=\mathbf{y}^\top(t)A(t)\mathbf{x}(t)\), by multiplying \(A\mathbf{x}=\lambda\mathbf{x}\) on the left by \(\mathbf{y}^\top\) and using \(\mathbf{y}^\top\mathbf{x}=1\).
\(\square\)
- \(\displaystyle \frac{\text{d}\lambda}{\text{d}t}=\mathbf{y}^\top(t)\frac{\text{d}A}{\text{d}t}\mathbf{x}(t)\): differentiate \(\lambda=\mathbf{y}^\top A\mathbf{x}\) by the product rule; the terms involving \(\frac{\text{d}\mathbf{x}}{\text{d}t}\) and \(\frac{\text{d}\mathbf{y}}{\text{d}t}\) cancel because \(\frac{\text{d}}{\text{d}t}(\mathbf{y}^\top\mathbf{x})=0\).
\(\square\)
- Similarly, \(\displaystyle \frac{\text{d}\sigma}{\text{d}t}=\mathbf{u}^\top(t)\frac{\text{d}A}{\text{d}t}\mathbf{v}(t)\), where \(\mathbf{u}(t)\) and \(\mathbf{v}(t)\) are the unit left and right singular vectors belonging to \(\sigma(t)\).
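- A finite-difference check of the eigenvalue formula (a sketch on a made-up symmetric family \(S_0+tS_1\), so \(\mathbf{y}=\mathbf{x}\) and \(\mathbf{y}^\top\mathbf{x}=1\) holds automatically; it assumes the top eigenvalue is simple so it stays on top under a small perturbation):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
S0 = rng.standard_normal((n, n))
S0 = S0 + S0.T                     # symmetric, so y = x
S1 = rng.standard_normal((n, n))
S1 = S1 + S1.T                     # plays the role of dA/dt

vals, vecs = np.linalg.eigh(S0)
x = vecs[:, -1]                    # unit eigenvector of the top eigenvalue
h = 1e-6
numeric = (np.linalg.eigvalsh(S0 + h * S1)[-1] - vals[-1]) / h
exact = x @ S1 @ x                 # y^T (dA/dt) x with y = x
print(numeric, exact)              # agree to about O(h)
```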
4. Interlacing
- Suppose the eigenvalues of a symmetric matrix \(S\) are \(\mu_1 \geq \cdots \geq \mu_n\) and the eigenvalues of \(S+\mathbf{u}\mathbf{u}^\top\) are \(\lambda_1 \geq \cdots \geq \lambda_n\). Then \(\lambda_1 \geq \mu_1 \geq \lambda_2 \geq \mu_2 \geq \cdots \geq \lambda_n \geq \mu_n\).
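- A numerical illustration of the interlacing inequalities (a sketch with a random symmetric \(S\) and a random \(\mathbf{u}\)):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5
S = rng.standard_normal((n, n))
S = S + S.T                        # symmetric test matrix
u = rng.standard_normal(n)

# eigvalsh returns eigenvalues in ascending order; reverse to get decreasing.
mu = np.linalg.eigvalsh(S)[::-1]                    # mu_1 >= ... >= mu_n
lam = np.linalg.eigvalsh(S + np.outer(u, u))[::-1]  # lam_1 >= ... >= lam_n
print(np.all(lam >= mu))           # lam_i >= mu_i
print(np.all(mu[:-1] >= lam[1:]))  # mu_i >= lam_{i+1}
```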