Orbital optimization through conjugate pairs

If we rewrite the orbital rotation generator, we can write the energy function as: \[\begin{align} \require{physics} E ( \boldsymbol{\kappa}_{\geq}, \boldsymbol{\kappa}^*_{\geq} ) = &\ev{ \hat{\mathcal{H}} }{\Psi} + \sum_{P \geq Q}^M \kappa_{PQ} \ev*{ \comm*{ \hat{E}_{PQ} }{ \hat{\mathcal{H}} } }{\Psi} - \sum_{P>Q}^M \kappa^*_{PQ} \ev*{ \comm*{ \hat{E}_{QP} }{ \hat{\mathcal{H}} } }{\Psi} \\ &+ \frac{1}{2} \sum_{P \geq Q}^M \sum_{R \geq S}^M \kappa_{PQ} \kappa_{RS} \ev*{ \comm*{ \hat{E}_{PQ} }{ \comm*{ \hat{E}_{RS} }{ \hat{\mathcal{H}}} } }{\Psi} \\ &- \frac{1}{2} \sum_{P \geq Q}^M \sum_{R>S}^M \kappa_{PQ} \kappa^*_{RS} \ev*{ \comm*{ \hat{E}_{PQ} }{ \comm*{ \hat{E}_{SR} }{ \hat{\mathcal{H}}} } }{\Psi} \\ &- \frac{1}{2} \sum_{P > Q}^M \sum_{R \geq S}^M \kappa^*_{PQ} \kappa_{RS} \ev*{ \comm*{ \hat{E}_{QP} }{ \comm*{ \hat{E}_{SR} }{ \hat{\mathcal{H}}} } }{\Psi} \\ &+ \frac{1}{2} \sum_{P>Q}^M \sum_{R>S}^M \kappa^*_{PQ} \kappa^*_{RS} \ev*{ \comm*{ \hat{E}_{QP} }{ \comm*{ \hat{E}_{SR} }{ \hat{\mathcal{H}}} } }{\Psi} \thinspace , \end{align}\]

or, when collecting terms: \[\begin{align} E ( \boldsymbol{\kappa}_{\geq}, \boldsymbol{\kappa}^*_{\geq} ) = &\ev{ \hat{\mathcal{H}} }{\Psi} + \sum_{P \geq Q}^M \kappa_{PQ} \ev*{ \comm*{ \hat{E}_{PQ} }{ \hat{\mathcal{H}} } }{\Psi} - \sum_{P>Q}^M \kappa^*_{PQ} \ev*{ \comm*{ \hat{E}_{QP} }{ \hat{\mathcal{H}} } }{\Psi} \\ &+ \frac{1}{2} \sum_{P \geq Q}^M \sum_{R \geq S}^M \kappa_{PQ} \kappa_{RS} \ev*{ \comm*{ \hat{E}_{PQ} }{ \comm*{ \hat{E}_{RS} }{ \hat{\mathcal{H}}} } }{\Psi} \\ &- \frac{1}{2} \sum_{P \geq Q}^M \sum_{R>S}^M \kappa_{PQ} \kappa^*_{RS} (1 + P_{PQ,SR}) \ev*{ \comm*{ \hat{E}_{PQ} }{ \comm*{ \hat{E}_{SR} }{ \hat{\mathcal{H}}} } }{\Psi} \\ &+ \frac{1}{2} \sum_{P>Q}^M \sum_{R>S}^M \kappa^*_{PQ} \kappa^*_{RS} \ev*{ \comm*{ \hat{E}_{QP} }{ \comm*{ \hat{E}_{SR} }{ \hat{\mathcal{H}}} } }{\Psi} \thinspace . \end{align}\]

If we define the pair of conjugate orbital rotation generators \[\begin{equation} \vb{k}_{\geq} = \begin{pmatrix} \boldsymbol{\kappa}_{\geq} \\ \boldsymbol{\kappa}^*_{\geq} \end{pmatrix} \thinspace , \end{equation}\] we can rewrite the energy in terms of matrix-matrix multiplications as \[\begin{equation} E ( \boldsymbol{\kappa}_{\geq}, \boldsymbol{\kappa}^*_{\geq} ) = \ev{ \hat{\mathcal{H}} }{\Psi} + \begin{pmatrix} \vb{a} & \vb{a}^* \end{pmatrix} \begin{pmatrix} \boldsymbol{\kappa}_{\geq} \\ \boldsymbol{\kappa}^*_{\geq} \end{pmatrix} + \frac{1}{2} \begin{pmatrix} \boldsymbol{\kappa}_{\geq} \\ \boldsymbol{\kappa}^*_{\geq} \end{pmatrix}^\dagger \begin{pmatrix} \vb{A} & \vb{B} \\ \vb{B^*} & \vb{A^*} \end{pmatrix} \begin{pmatrix} \boldsymbol{\kappa}_{\geq} \\ \boldsymbol{\kappa}^*_{\geq} \end{pmatrix} \thinspace , \end{equation}\] by compounding (for example, in a column-major way) the matrix indices \(PQ\) into a vector index, say \(i\). We should point our attention to the fact that in the second-order term, a Hermitian adjoint appears, so the first block of the resulting row vector will actually be \(\boldsymbol{\kappa}^\dagger_{\geq}\) instead of \(\boldsymbol{\kappa}_{\geq}\) that appears as the first block of the column vector to the right.

The first-order term \(\vb{a}\) that appears in the energy expression is of course related to the first derivative. We find (\(P \geq Q\)): \[\begin{equation} a_{PQ} = \eval{ \pdv{ E( \boldsymbol{\kappa}_{\geq}, \boldsymbol{\kappa}^*_{\geq} ) }{\boldsymbol{\kappa}_{\geq}} }_{\boldsymbol{\kappa}_0} \end{equation}\] such that stationarity at the ‘current orbitals’ \(\boldsymbol{\kappa}_0\) is achieved when this vector of first derivatives (i.e. the orbital gradient) is zero: \[\begin{align} \eval{ \pdv{ E( \boldsymbol{\kappa}_{\geq}, \boldsymbol{\kappa}^*_{\geq} ) }{\kappa_{PQ}} }_{\boldsymbol{\kappa}_0, \thinspace \boldsymbol{\kappa}^*_0} &= \ev*{ \comm*{ \hat{E}_{PQ} }{ \hat{\mathcal{H}} } }{\Psi} \\ &= \mathscr{F}_{PQ}( \hat{\mathcal{H}} ) - \mathscr{F}^*_{QP}( \hat{\mathcal{H}} ) \\ &= 0 \thinspace , \end{align}\] which means that the Fockian matrix must be a Hermitian matrix.

For the second-order partial derivatives, we can calculate (\(P>Q\), \(R>S\)): \[\begin{align} \eval{ \pdv{ E(\boldsymbol{\kappa}) }{\kappa^*_{PQ}}{\kappa^*_{RS}} }_{\boldsymbol{\kappa}_0} &= B_{PQ,RS} \\ &= \frac{1}{2} (1 + P_{PQ,RS}) \ev*{ \comm*{ \hat{E}_{QP} }{ \comm*{ \hat{E}_{SR} }{ \hat{\mathcal{H}}} } }{\Psi} \\ &= \frac{1}{2} (1 + P_{PQ,RS}) \qty[ \mathscr{G}_{QPSR}(\hat{\mathcal{H}}) + \mathscr{G}^*_{PQRS}(\hat{\mathcal{H}}) ] \thinspace . \end{align}\] and (\(P \geq Q\), \(R \geq S\)): \[\begin{align} A_{PQ,RS} &= - \ev*{ \comm*{ \hat{E}_{QP} }{ \comm*{ \hat{E}_{RS} }{ \hat{\mathcal{H}} } } }{\Psi} \\ &= - \qty[ \mathscr{G}_{QPRS}(\hat{\mathcal{H}}) + \mathscr{G}^*_{PQSR}(\hat{\mathcal{H}}) ] \thinspace , \end{align}\] such that the mixed derivative is given by (\(P \geq Q\), \(R > S\)): \[\begin{equation} \eval{ \pdv{ E(\boldsymbol{\kappa}) }{\kappa_{PQ}}{\kappa^*_{RS}} }_{\boldsymbol{\kappa}_0} =\frac{1}{2} (1 + P_{PQ,SR}) A_{QP,SR} \thinspace . \end{equation}\] The matrix \[\begin{equation} \vb{M} = \begin{pmatrix} \vb{A} & \vb{B} \\ \vb{B^*} & \vb{A^*} \end{pmatrix} \end{equation}\] is then referred to as the orbital Hessian, and can be used in a second-order optimization.

Note that we are abusing the notation in this equation, because the actual dimensions of the \(\vb{A}\) and \(\vb{A}^*\) do not match: \(\vb{A}\) is related to \(\boldsymbol{\kappa}_{\geq}\), while \(\vb{A}^*\) is actually related to \(\boldsymbol{\kappa}_{>}^*\). This knowde should be revised to rewrite the orbital gradient as a block vector of \(3\) constituent blocks and the orbital Hessian as a block matrix of \(9\) constituent blocks.

For the case of the Hartree-Fock wave function model, these formulas simplify considerably.