The Davidson diagonalization method
As initially proposed by Davidson in 1975 (Davidson 1975), the diagonalization method applies to any symmetric, diagonally dominant matrix of dimension \(N\). It is a method to solve the associated eigenvalue problem for this matrix (which can have huge dimensions, as large as can still be stored in the RAM of a modern computer), finding its lowest eigenvalue and associated eigenvector.
In summary, the algorithm takes an initial guess for the eigenvector belonging to the lowest eigenvalue and produces new estimates by performing the diagonalization in an ever-increasing subspace of previous estimates. Starting from that initial guess, the algorithm goes as follows.
Let \(\require{physics} \vb{t}\) be an initial guess vector. Calculate the first subspace vector \(\vb{v}_1\) as \[\begin{equation} \vb{v}_1 = \frac{\vb{t}}{|| \vb{t} ||} \thinspace . \end{equation}\]
Calculate the (expensive) matrix-vector product: \[\begin{equation} \vb{v}^{\vb{A}}_1 = \vb{A} \vb{v}_1 \thinspace . \end{equation}\]
Expand the subspace \(\vb{V}\) (initially it is empty) to \[\begin{equation} \vb{V}_1 = \begin{pmatrix} \vb{v}_1 \end{pmatrix} \end{equation}\] and expand (again: initially \(\vb{V}^{\vb{A}}\) is empty) \[\begin{equation} \vb{V}^{\vb{A}}_1 = \begin{pmatrix} \vb{v}^{\vb{A}}_1 \end{pmatrix} \thinspace . \end{equation}\]
Calculate the Rayleigh quotient: \[\begin{align} \theta &= \vb{v}^{\text{T}}_1 \vb{A} \vb{v}_1 \\ &= \vb{v}^{\text{T}}_1 \vb{v}^{\vb{A}}_1 \thinspace . \end{align}\]
Calculate the associated residual vector: \[\begin{equation} \vb{r}_1 = \vb{v}^{\vb{A}}_1 - \theta \thinspace \vb{v}_1 \thinspace . \end{equation}\]
If the norm of the residual vector is lower than a specified threshold, the algorithm has already converged (but let’s not get our hopes up: this doesn’t happen) and we have found the lowest eigenpair of \(\vb{A}\), being \((\theta, \vb{v}_1)\).
Approximately solve the residual correction equation. This is where the ‘Davidson’ algorithm got its name. By using a diagonal approximation \(\vb{A}'\) of the matrix \(\vb{A}\), we get a simple expression for a new \(\vb{t}\)-vector \[\begin{equation} \vb{t} = (\theta \thinspace \vb{I} - \vb{A}')^{-1} \thinspace \vb{r}_1 \thinspace , \end{equation}\] or, written coefficient-wise: \[\begin{equation} t_i = \frac{r_i}{\theta - A_{ii}} \thinspace . \end{equation}\] Davidson originally suggested that if \(|\theta - A_{ii}|\) is smaller than a threshold, the components \(t_i\) are set to zero.
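To make this first iteration concrete, here is a minimal sketch in Python with NumPy. It assumes \(\vb{A}\) is available as a dense symmetric array purely for illustration (in practice only the matrix-vector product would be implemented); the function name `first_davidson_iteration` and the threshold `eps` are illustrative choices, not part of the original algorithm description.

```python
import numpy as np

def first_davidson_iteration(A, t, eps=1e-12):
    """Sketch of the first Davidson iteration for a symmetric,
    diagonally dominant matrix A and an initial guess vector t."""
    v1 = t / np.linalg.norm(t)        # first subspace vector
    vA1 = A @ v1                      # the expensive matrix-vector product
    theta = v1 @ vA1                  # Rayleigh quotient
    r1 = vA1 - theta * v1             # residual vector

    # Diagonally preconditioned correction equation, with Davidson's
    # suggestion to zero components whose denominator is too small.
    denom = theta - np.diag(A)
    t_new = np.zeros_like(r1)
    mask = np.abs(denom) > eps
    t_new[mask] = r1[mask] / denom[mask]
    return theta, v1, vA1, r1, t_new
```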
We are now in the position to phrase the algorithm for all iterations \(k>1\).
Project \(\vb{t}\) onto the orthogonal complement of \(\vb{V}_{k-1}\): \[\begin{align} \vb{t}^{\perp} &= (\vb{I} - \vb{V}_{k-1} \vb{V}_{k-1}^{\text{T}}) \vb{t} \\ &= \vb{t} - \sum_i^{\dim{V_{k-1}}} (\vb{v}_i^{\text{T}} \vb{t}) \vb{v}_i \thinspace . \end{align}\]
Calculate the new subspace vector: \[\begin{equation} \vb{v}_k = \frac{\vb{t}^\perp}{|| \vb{t}^\perp ||} \thinspace . \end{equation}\]
Calculate the (expensive) matrix-vector product: \[\begin{equation} \vb{v}^{\vb{A}}_k = \vb{A} \vb{v}_k \thinspace . \end{equation}\]
Expand the subspace \(\vb{V}\) to \[\begin{equation} \vb{V}_k = \begin{pmatrix} \vb{V}_{k-1} & \vb{v}_k \end{pmatrix} \end{equation}\] and expand \[\begin{equation} \vb{V}^{\vb{A}}_k = \begin{pmatrix} \vb{V}^{\vb{A}}_{k-1} & \vb{v}^{\vb{A}}_k \end{pmatrix} \thinspace . \end{equation}\]
Calculate the subspace matrix: \[\begin{equation} \vb{M}_k = \vb{V}^{\text{T}}_k \vb{A} \vb{V}_k \thinspace . \end{equation}\]
Since in iteration \(k\) we have already calculated the upper-left \(\dim{V_{k-1}} \times \dim{V_{k-1}}\) block of \(\vb{M}_k\), we actually only need to calculate \[\begin{equation} M_{ik} = \vb{v}_i^{\text{T}} \vb{v}_k^{\vb{A}} = M_{ki} \end{equation}\] for \(i = 1, \dots, k\), which is the same as calculating the \(k\)-th column (and, by symmetry, the \(k\)-th row) of \(\vb{M}_k\): \[\begin{align} \vb{m}_k &= \vb{V}^{\text{T}}_k (\vb{A} \vb{v}_k) \\ &= \vb{V}^{\text{T}}_k \vb{v}^{\vb{A}}_k \thinspace . \end{align}\]
Solve the subspace eigenvalue problem \[\begin{equation} \vb{M}_k \vb{s} = \theta \thinspace \vb{s} \thinspace , \end{equation}\] taking the lowest eigenpair \((\theta, \vb{s})\).
Calculate the new residual vector by first calculating \[\begin{equation} \vb{u} = \vb{V}_k \vb{s} \end{equation}\] and \[\begin{equation} \vb{u}^{\vb{A}} = \vb{V}_k^{\vb{A}} \vb{s} \thinspace , \end{equation}\] and subsequently calculating the actual residual vector \[\begin{align} \vb{r} &= (\vb{A} - \theta \thinspace \vb{I}) \vb{V}_k \vb{s} \\ &= \vb{u}^{\vb{A}} - \theta \thinspace \vb{u} \thinspace . \end{align}\]
If the norm of the residual vector is lower than a specified threshold, the algorithm has converged and we have found the lowest eigenpair of \(\vb{A}\), being \((\theta, \vb{u})\).
Approximately solve the residual correction equation: \[\begin{equation} \vb{t} = (\theta \thinspace \vb{I} - \vb{A}')^{-1} \vb{r} \thinspace , \end{equation}\] or, written coefficient-wise: \[\begin{equation} t_i = \frac{r_i}{\theta - A_{ii}} \thinspace . \end{equation}\] Davidson originally suggested that if \(|\theta - A_{ii}|\) is smaller than a threshold, the components \(t_i\) are set to zero.
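Putting the steps for \(k > 1\) together, the following is a minimal Python/NumPy sketch of the full iteration loop (without the restart discussed below). It again assumes a dense \(\vb{A}\) for clarity; the function name `davidson`, the tolerances `tol` and `eps`, and the iteration cap `max_iter` are illustrative choices.

```python
import numpy as np

def davidson(A, t, tol=1e-8, max_iter=100, eps=1e-12):
    """Sketch of the Davidson loop for the lowest eigenpair of a
    symmetric, diagonally dominant matrix A, starting from the guess t."""
    n = A.shape[0]
    diag_A = np.diag(A)

    # First iteration (k = 1).
    v = t / np.linalg.norm(t)
    vA = A @ v
    V = v[:, None]                      # subspace V_1
    VA = vA[:, None]                    # matrix-vector products V^A_1
    M = np.array([[v @ vA]])            # subspace matrix M_1
    theta, u, uA = M[0, 0], v, vA
    r = uA - theta * u

    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break                       # converged: (theta, u) is the lowest eigenpair

        # Approximately solve the residual correction equation.
        denom = theta - diag_A
        tvec = np.zeros(n)
        mask = np.abs(denom) > eps
        tvec[mask] = r[mask] / denom[mask]

        # Project onto the orthogonal complement of the current subspace.
        t_perp = tvec - V @ (V.T @ tvec)
        v = t_perp / np.linalg.norm(t_perp)
        vA = A @ v

        # Expand V and V^A; only the new row/column of M has to be computed.
        m = np.concatenate([V.T @ vA, [v @ vA]])
        V = np.hstack([V, v[:, None]])
        VA = np.hstack([VA, vA[:, None]])
        k = V.shape[1]
        M_new = np.empty((k, k))
        M_new[:k - 1, :k - 1] = M
        M_new[k - 1, :] = m
        M_new[:, k - 1] = m
        M = M_new

        # Solve the subspace eigenvalue problem and take the lowest eigenpair.
        w, S = np.linalg.eigh(M)
        theta, s = w[0], S[:, 0]

        # New residual from u = V_k s and u^A = V^A_k s.
        u = V @ s
        uA = VA @ s
        r = uA - theta * u

    return theta, u
```

On a small test matrix, the result can be checked against the lowest eigenpair returned by `numpy.linalg.eigh(A)`.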
If the dimension of the subspace grows too large (larger than a predetermined maximum subspace dimension \(D\) of about \(10\) to \(15\)), the algorithm should restart:
- Pulay wrote that keeping just the two latest approximations is enough to prevent convergence instabilities.
- We can therefore take two linear combinations \[\begin{equation} \vb{v}^{'(j)} = \vb{V}_k \vb{s}^{(j)} \thinspace , \end{equation}\] in which \(\vb{V}_k\) is the current subspace, \(j\) is \(1\) or \(2\), and \(\vb{s}^{(j)}\) is the lowest or second-lowest eigenvector of the subspace matrix \(\vb{M}_k\). These two vectors then span the subspace with which the iterations are restarted.
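A possible way to implement this restart, sketched in the same Python/NumPy style (the function name `collapse_subspace` and the default maximum dimension are illustrative choices):

```python
import numpy as np

def collapse_subspace(V, VA, M, D=12):
    """Restart sketch: once the subspace holds D or more vectors, collapse it
    onto the two linear combinations V_k s^(1) and V_k s^(2) built from the
    lowest and second-lowest eigenvectors of the subspace matrix M."""
    if V.shape[1] < D:
        return V, VA, M

    w, S = np.linalg.eigh(M)       # eigenvalues in ascending order
    S2 = S[:, :2]                  # lowest and second-lowest eigenvectors
    V_new = V @ S2                 # the two new (orthonormal) subspace vectors
    VA_new = VA @ S2               # their matrix-vector products come for free
    M_new = V_new.T @ VA_new       # restarted 2x2 subspace matrix
    return V_new, VA_new, M_new
```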