The Davidson diagonalization method

As originally proposed by Davidson in 1975 (Davidson 1975), this diagonalization method applies to any symmetric, diagonally dominant matrix of dimension \(N\). It is a method to solve the associated eigenvalue problem for such a matrix (which can have enormous dimensions - up to the largest sizes that still fit in the RAM of a modern computer), finding its lowest eigenvalue and associated eigenvector.

In summary, the algorithm takes an initial guess for the lowest-eigenvalue eigenvector and produces new estimates by solving the eigenvalue problem in an ever-growing subspace built from the previous estimates. Starting from that initial guess, the first iteration goes as follows.

  1. Let \(\require{physics} \vb{t}\) be an initial guess vector. Calculate \(\vb{v}_1\), being a new subspace vector, as \[\begin{equation} \vb{v}_1 = \frac{\vb{t}}{|| \vb{t} ||} \thinspace . \end{equation}\]

  2. Calculate the (expensive) matrix-vector product: \[\begin{equation} \vb{v}^{\vb{A}}_1 = \vb{A} \vb{v}_1 \thinspace . \end{equation}\]

  3. Expand the subspace \(\vb{V}\) (initially it is empty) to \[\begin{equation} \vb{V}_1 = \begin{pmatrix} \vb{v}_1 \end{pmatrix} \end{equation}\] and expand (again: initially \(\vb{V}^{\vb{A}}\) is empty) \[\begin{equation} \vb{V}^{\vb{A}}_1 = \begin{pmatrix} \vb{v}^{\vb{A}}_1 \end{pmatrix} \thinspace . \end{equation}\]

  4. Calculate the Rayleigh quotient: \[\begin{align} \theta &= \vb{v}^{\text{T}}_1 \vb{A} \vb{v}_1 \\ &= \vb{v}^{\text{T}}_1 \vb{v}^{\vb{A}}_1 \thinspace . \end{align}\]

  5. Calculate the associated residual vector: \[\begin{equation} \vb{r}_1 = \vb{v}^{\vb{A}}_1 - \theta \thinspace \vb{v}_1 \thinspace . \end{equation}\]

  6. If the norm of the residual vector is lower than a specified threshold, the algorithm has already converged (but let’s not get our hopes up: this doesn’t happen) and we have found the lowest eigenpair of \(\vb{A}\), being \((\theta, \vb{v}_1)\).

  7. Approximately solve the residual correction equation. This is where the ‘Davidson’ algorithm got its name. By using a diagonal approximation \(\vb{A}'\) of the matrix \(\vb{A}\), we get a clear and easy expression for a new \(\vb{t}\)-vector: \[\begin{equation} \vb{t} = (\theta \thinspace \vb{I} - \vb{A}')^{-1} \thinspace \vb{r} \thinspace , \end{equation}\] or, written coefficient-wise: \[\begin{equation} t_i = \frac{r_i}{\theta - A_{ii}} \thinspace . \end{equation}\] Davidson originally suggested that if \(|\theta - A_{ii}|\) is smaller than a threshold, the corresponding components \(t_i\) are set to zero. A minimal numpy sketch of this first iteration is given below.
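
The following is a minimal numpy sketch of this first iteration, using a small, randomly generated diagonally dominant test matrix. All names and thresholds here are illustrative choices, not prescribed by the original paper.

```python
import numpy as np

# Hypothetical test problem: a small, symmetric, diagonally dominant matrix.
n = 100
A = np.diag(np.arange(1.0, n + 1)) + 1e-3 * np.random.rand(n, n)
A = 0.5 * (A + A.T)              # enforce exact symmetry

t = np.zeros(n)                  # initial guess vector ...
t[0] = 1.0                       # ... here simply the first unit vector

# Steps 1-3: normalize the guess, form the matrix-vector product, start the subspaces.
v1 = t / np.linalg.norm(t)
vA1 = A @ v1
V = v1.reshape(-1, 1)            # V_1
VA = vA1.reshape(-1, 1)          # V^A_1

# Steps 4-5: Rayleigh quotient and residual vector.
theta = v1 @ vA1
r = vA1 - theta * v1

# Step 6: convergence check.
converged = np.linalg.norm(r) < 1e-8

# Step 7: Davidson's diagonal residual correction, zeroing near-singular components.
denom = theta - np.diag(A)
t = np.divide(r, denom, out=np.zeros_like(r), where=np.abs(denom) > 1e-12)
```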

We are now in a position to state the algorithm for all iterations \(k > 1\).

  1. Project \(\vb{t}\) onto the orthogonal complement of \(\vb{V}_{k-1}\): \[\begin{align} \vb{t}^{\perp} &= (\vb{I} - \vb{V}_{k-1} \vb{V}_{k-1}^{\text{T}}) \vb{t} \\ &= \vb{t} - \sum_i^{\dim{V_{k-1}}} (\vb{v}_i^{\text{T}} \vb{t}) \vb{v}_i \end{align}\]

  2. Calculate the new subspace vector: \[\begin{equation} \vb{v}_k = \frac{\vb{t}^\perp}{|| \vb{t}^\perp ||} \thinspace . \end{equation}\]

  3. Calculate the (expensive) matrix-vector product: \[\begin{equation} \vb{v}^{\vb{A}}_k = \vb{A} \vb{v}_k \thinspace . \end{equation}\]

  4. Expand the subspace \(\vb{V}\) to \[\begin{equation} \vb{V}_k = \begin{pmatrix} \vb{V}_{k-1} & \vb{v}_k \end{pmatrix} \end{equation}\] and expand \[\begin{equation} \vb{V}^{\vb{A}}_k = \begin{pmatrix} \vb{V}^{\vb{A}}_{k-1} & \vb{v}^{\vb{A}}_k \end{pmatrix} \thinspace . \end{equation}\]

  5. Calculate the subspace matrix: \[\begin{equation} \vb{M}_k = \vb{V}^{\text{T}}_k \vb{A} \vb{V}_k \thinspace . \end{equation}\]

Since in iteration \(k\) we have already calculated the upper-left \(\dim{V_{k-1}} \times \dim{V_{k-1}}\) block of \(\vb{M}_k\), we actually only need to calculate \[\begin{equation} M_{ik} = \vb{v}_i^{\text{T}} \vb{v}_k^{\vb{A}} = M_{ki} \end{equation}\] for \(i = 1, \dots, k\), which is the same as calculating the \(k\)-th column (and, by symmetry, the \(k\)-th row) of \(\vb{M}_k\): \[\begin{align} \vb{m}_k &= \vb{V}^{\text{T}}_k (\vb{A} \vb{v}_k) \\ &= \vb{V}^{\text{T}}_k \vb{v}^{\vb{A}}_k \thinspace . \end{align}\]

  6. Solve the subspace eigenvalue problem for its lowest eigenpair \((\theta, \vb{s})\): \[\begin{equation} \vb{M}_k \vb{s} = \theta \thinspace \vb{s} \thinspace . \end{equation}\]

  7. Calculate the new residual vector by first calculating \[\begin{equation} \vb{u} = \vb{V}_k \vb{s} \end{equation}\] and \[\begin{equation} \vb{u}^{\vb{A}} = \vb{V}^{\vb{A}}_k \vb{s} \thinspace , \end{equation}\] and subsequently calculating the actual residual vector: \[\begin{align} \vb{r} &= (\vb{A} - \theta \thinspace \vb{I}) \vb{V}_k \vb{s} \\ &= \vb{u}^{\vb{A}} - \theta \thinspace \vb{u} \thinspace . \end{align}\]

  8. If the norm of the residual vector is lower than a specified threshold, the algorithm has converged and we have found the lowest eigenpair of \(\vb{A}\), being \((\theta, \vb{u})\).

  9. Approximately solve the residual correction equation: \[\begin{equation} \vb{t} = (\theta \thinspace \vb{I} - \vb{A}')^{-1} \vb{r} \thinspace , \end{equation}\] or, written coefficient-wise: \[\begin{equation} t_i = \frac{r_i}{\theta - A_{ii}} \thinspace . \end{equation}\] Davidson originally suggested that if \(|\theta - A_{ii}|\) is smaller than a threshold, the corresponding components \(t_i\) are set to zero. A compact code sketch that puts all of these iteration steps together is given below.
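
Putting these steps together, a deliberately simplified numpy sketch of the iteration loop could look as follows. The function name `davidson`, the tolerances, and the maximum number of iterations are assumptions made for illustration, and the subspace matrix is simply rebuilt in every iteration instead of being updated row by row as described above.

```python
import numpy as np


def davidson(A, t, tol=1e-8, max_iter=100):
    """Sketch of the Davidson iterations for the lowest eigenpair of a
    symmetric, diagonally dominant matrix A, starting from a guess vector t.

    Parameter names and defaults are illustrative, not from the original reference.
    """
    n = A.shape[0]
    diag_A = np.diag(A)

    V = np.zeros((n, 0))     # subspace vectors v_1, ..., v_k as columns
    VA = np.zeros((n, 0))    # the corresponding matrix-vector products A v_i

    for k in range(max_iter):
        # Project t onto the orthogonal complement of the current subspace
        # (in the first iteration V is empty, so t is left untouched).
        t_perp = t - V @ (V.T @ t)

        # New normalized subspace vector and its (expensive) matrix-vector product.
        v = t_perp / np.linalg.norm(t_perp)
        vA = A @ v
        V = np.column_stack([V, v])
        VA = np.column_stack([VA, vA])

        # Subspace matrix; rebuilt completely here for simplicity,
        # instead of only appending the new row/column as described above.
        M = V.T @ VA

        # Lowest eigenpair of the (small) subspace matrix.
        eigvals, eigvecs = np.linalg.eigh(M)
        theta, s = eigvals[0], eigvecs[:, 0]

        # Current eigenvector estimate and its residual vector.
        u = V @ s
        uA = VA @ s
        r = uA - theta * u

        if np.linalg.norm(r) < tol:
            return theta, u    # converged: lowest eigenpair of A

        # Davidson's diagonal approximation to the residual correction equation;
        # components with a near-singular denominator are set to zero.
        denom = theta - diag_A
        t = np.divide(r, denom, out=np.zeros_like(r),
                      where=np.abs(denom) > 1e-12)

    raise RuntimeError("Davidson did not converge within max_iter iterations")
```

Called as `davidson(A, t)` with the test matrix and guess vector from the first sketch, this returns an approximation to the lowest eigenvalue of `A` together with the associated eigenvector.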

In case the dimension of the subspace grows too large (larger than a predetermined maximum subspace dimension \(D\), typically \(10\) to \(15\)), the algorithm should restart:

  • Pulay wrote that keeping only the two latest estimates is enough to prevent convergence instabilities.
  • We can therefore collapse the subspace to the two linear combinations \[\begin{equation} \vb{v}^{'(j)} = \vb{V}_k \vb{s}^{(j)} \thinspace , \end{equation}\] in which \(\vb{V}_k\) is the current subspace, \(j\) is \(1\) or \(2\), and \(\vb{s}^{(j)}\) is either the lowest or second-lowest eigenvector of the subspace matrix \(\vb{M}_k\); a short sketch of this collapse is given below.
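
Under the same assumptions as the sketches above (illustrative names; numpy arrays `V`, `VA`, and `M` holding the current subspace, its matrix-vector products, and the subspace matrix), such a collapse could be sketched as:

```python
import numpy as np


def collapse_subspace(V, VA, M, n_keep=2):
    """Restart sketch: keep only the n_keep lowest Ritz vectors of the
    current subspace (illustrative helper, not from the original reference)."""
    eigvals, eigvecs = np.linalg.eigh(M)
    S = eigvecs[:, :n_keep]      # s^(1), s^(2): lowest and second-lowest eigenvectors
    V_new = V @ S                # v'^(j) = V_k s^(j)
    VA_new = VA @ S              # the stored products A v transform in the same way
    return V_new, VA_new
```

The two collapsed vectors remain orthonormal (since the columns of `S` are orthonormal) and form the starting subspace for the next batch of iterations.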

References

Davidson, Ernest R. 1975. “The iterative calculation of a few of the lowest eigenvalues and corresponding eigenvectors of large real-symmetric matrices.” Journal of Computational Physics 17 (1): 87–94. https://doi.org/10.1016/0021-9991(75)90065-0.