Improving performance and numerical stability for solving Toeplitz systems. #26015

tillahoffmann · 2025-01-21T20:28:11Z

Summary

The covariance matrix of a Gaussian process often has Toeplitz structure, i.e., the elements on each diagonal are constant¹. This structure can be exploited, e.g., using Levinson-Durbin recursion, to evaluate the log likelihood of Gaussian process realizations more efficiently than by solving a general linear system. Running variational inference through numpyro, I get about a 5x speedup in training time using the implementation below (although I'm struggling to reproduce the performance improvement in simple benchmarks). It would be great to get your opinion on two questions.

1. Are there further performance improvements for the implementation below? My translation from an indexing-based algorithm to a JAX implementation that relies on constant-shaped arrays in the body of jax.lax.scan likely has room for improvement.
2. Are there opportunities to improve the numerical stability, e.g., by reordering operations, grouping summations, etc.? Levinson-Durbin recursion is not particularly stable for matrices with a high condition number, but there might still be room for improvement for the implementation below.

Details

The log-likelihood for a realization $\mathbf x$ of a Gaussian process with zero mean and covariance matrix $\mathbf C$ is (up to constants)

$$- \left(\log\left\vert\mathbf C\right\vert + \mathbf{x}^\intercal\mathbf{C}^{-1}\mathbf{x}\right)/2.$$

If $\mathbf{C}$ is Toeplitz, we can solve $\mathbf{C}\mathbf{z} = \mathbf{x}$ for $\mathbf{z}$ and evaluate the log absolute determinant in one pass to get

$$- \left(\log\left\vert\mathbf C\right\vert + \mathbf{x}^\intercal\mathbf{z}\right)/2.$$
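
For cross-checking, the two expressions above can be evaluated with a dense reference that ignores the Toeplitz structure. The following is a minimal NumPy sketch, not part of the original implementation; the names `toeplitz_from_first_row` and `dense_log_likelihood` are hypothetical, and it assumes a symmetric positive-definite $\mathbf{C}$ specified by its first row.

```python
import numpy as np


def toeplitz_from_first_row(a):
    """Build the dense symmetric Toeplitz matrix with first row `a`."""
    a = np.asarray(a, dtype=float)
    idx = np.arange(a.shape[0])
    # Element (i, j) only depends on the lag |i - j|.
    return a[np.abs(idx[:, None] - idx[None, :])]


def dense_log_likelihood(a, x):
    """Evaluate -(log|C| + x^T C^{-1} x) / 2 using a Cholesky factorization."""
    x = np.asarray(x, dtype=float)
    C = toeplitz_from_first_row(a)
    L = np.linalg.cholesky(C)
    # Solve L w = x so that x^T C^{-1} x = w^T w. A general solver is used here
    # for brevity; it does not exploit the triangular structure of L.
    w = np.linalg.solve(L, x)
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (logdet + w @ w)
```

This costs $O(n^3)$ operations, so it is only meant as a ground truth for small $n$ when validating a structured solver.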

We also don't need to evaluate the full matrix $\mathbf{C}$ but can get away with only evaluating the first row because all other elements are specified by the first row for a symmetric Toeplitz matrix. The following function implements the solver and evaluation of log absolute determinant.

```python
import jax
import jax.numpy as jnp


def solve_toeplitz(a, b, return_aux=False):
    """
    Solve :math:`A x = b` for a symmetric positive-definite Toeplitz matrix :math:`A`.

    Args:
        a: First row of :math:`A`.
        b: Right-hand side of the equation.
        return_aux: If True, also return intermediate values from each iteration
            (for debugging only).

    Returns:
        Solution :math:`x` and the log absolute determinant of :math:`A`.
    """
    a, b = jnp.broadcast_arrays(a, b)
    n = b.shape[-1]
    assert n > 1

    # Initialize state vectors and scalar for log abs det. Entries beyond the current
    # iteration must be zero for the rolled updates in the scan body to be correct
    # (`jnp.empty_like` also returns zeros in JAX; `zeros_like` makes this explicit).
    a0 = a[..., 0]
    x = jnp.zeros_like(b).at[..., 0].set(b[..., 0] / a0)
    g = jnp.zeros_like(b).at[..., 0].set(a[..., 1] / a0)
    logabsdet = jnp.log(jnp.abs(a0))

    # Roll the vector which simplifies indexing in the body of the scan. Then
    # pre-compute a flipped version of `a` which we'll roll through in the body of the
    # scan.
    a = jnp.roll(a, -1, axis=-1)
    a_rev = jnp.flip(a, axis=-1)

    def _body(carry, m):
        # Unpack the carry and initialize dictionary to track auxiliary information
        # (this is only for debugging and inspecting intermediate results during
        # development).
        x, logabsdet, g, a_rev, mask = carry
        aux = {}

        # Roll arrays.
        g_rev = jnp.roll(jnp.flip(g, axis=-1), m, axis=-1)
        a_rev = jnp.roll(a_rev, 1, axis=-1)

        # Compute common denominator and numerators for new x and g.
        mask = aux.setdefault("mask", mask.at[m - 1].set(True))
        denom = jnp.sum(aux.setdefault("a * g", a * g), axis=-1, where=mask) - a0
        x_num = jnp.sum(aux.setdefault("x * a_rev", x * a_rev), axis=-1, where=mask) - b[..., m]
        g_num = jnp.sum(aux.setdefault("g * a_rev", g * a_rev), axis=-1, where=mask) - a[..., m]

        # Update with new values.
        x_new = x_num / denom
        g_new = g_num / denom
        x = x.at[..., m].set(x_new)
        g = g.at[..., m].set(g_new)
        x -= x_new[..., None] * g_rev
        g -= g_new[..., None] * g_rev

        # Finally update the determinant.
        logabsdet += jnp.log(jnp.abs(denom))

        return (x, logabsdet, g, a_rev, mask), aux

    mask = jnp.zeros(n, dtype=bool)
    carry, aux = jax.lax.scan(_body, (x, logabsdet, g, a_rev, mask), jnp.arange(1, n))
    x, logabsdet, *_ = carry

    if return_aux:
        return x, logabsdet, aux
    else:
        return x, logabsdet
```
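
For comparing intermediate values, it can help to have an index-based version of the same recursion. The sketch below is reconstructed from the function above and is not the original indexing-based algorithm it was translated from; the name `levinson_reference` is hypothetical, and it assumes unbatched one-dimensional inputs for a symmetric Toeplitz matrix.

```python
import numpy as np


def levinson_reference(a, b):
    """Solve A x = b for symmetric Toeplitz A with first row `a` (index-based)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = b.shape[0]
    x = np.zeros(n)
    g = np.zeros(n)
    x[0] = b[0] / a[0]
    if n > 1:
        g[0] = a[1] / a[0]
    logabsdet = np.log(abs(a[0]))

    for m in range(1, n):
        r = a[1:m + 1]  # a_1, ..., a_m
        # Schur complement of the leading m x m block; g[:m] solves A_m g = r.
        denom = a[0] - r @ g[:m]
        x_new = (b[m] - r[::-1] @ x[:m]) / denom
        # Copy so that updating g in place does not alias the reversed view.
        g_rev = g[:m][::-1].copy()
        x[:m] -= x_new * g_rev
        x[m] = x_new
        # The auxiliary vector g is not needed after the final iteration.
        if m < n - 1:
            g_new = (a[m + 1] - r[::-1] @ g[:m]) / denom
            g[:m] -= g_new * g_rev
            g[m] = g_new
        logabsdet += np.log(abs(denom))

    return x, logabsdet
```

Unlike the scan-based function, this version uses slices whose length changes with `m`, which is the part that does not carry over directly to `jax.lax.scan`.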

Notes Regarding Performance

The reason I'm interested in further optimizing this function is that the bulk of the time for evaluating the loss function is due to the Gaussian process likelihood.
Keeping the mask in the carry and updating it seems to be more performant than creating a mask = jnp.arange(n) < m in _body.
Flipping and rolling g might be a bit of a time hog, but I couldn't figure out a better way without indexing (and thus creating intermediate arrays with variable shape).
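
The two masking strategies mentioned above can be isolated from the solver and timed separately. The following is a minimal sketch under the assumption that the masked reduction is the only thing of interest; `masked_sums_carry`, `masked_sums_arange`, and `time_jitted` are hypothetical helper names, and the timings will depend on backend and array size.

```python
import time

import jax
import jax.numpy as jnp


def masked_sums_carry(v):
    """Sum the first m elements of `v` for m = 1, ..., n - 1 with a carried mask."""
    n = v.shape[-1]

    def body(mask, m):
        mask = mask.at[m - 1].set(True)
        return mask, jnp.sum(v, where=mask)

    _, out = jax.lax.scan(body, jnp.zeros(n, dtype=bool), jnp.arange(1, n))
    return out


def masked_sums_arange(v):
    """Same reduction, but the mask is rebuilt in every iteration."""
    n = v.shape[-1]
    idx = jnp.arange(n)

    def body(carry, m):
        return carry, jnp.sum(v, where=idx < m)

    _, out = jax.lax.scan(body, None, jnp.arange(1, n))
    return out


def time_jitted(fn, *args, repeats=10):
    """Average wall time per call of the jitted function, excluding compilation."""
    jitted = jax.jit(fn)
    # The first call triggers compilation; block until the result is ready
    # because JAX dispatches work asynchronously.
    jax.block_until_ready(jitted(*args))
    start = time.perf_counter()
    for _ in range(repeats):
        jax.block_until_ready(jitted(*args))
    return (time.perf_counter() - start) / repeats
```

Excluding the compilation call and blocking on the result are common pitfalls when small benchmarks fail to reflect the speedup seen in a full training loop.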

Notes Regarding Numerical Stability

Covariance matrices for Gaussian processes with relatively long correlation lengths have a high condition number, which can degrade the accuracy of the above algorithm. Below, I've shown an example of the first row of the covariance matrix and reconstruction error using different methods for a random vector from a standard normal distribution. The solve_toeplitz method exhibits relatively strong oscillatory behavior around index 5.
This same oscillatory behavior is observed in some of the working memory of the algorithm. I've kept track of some of the working variables and plotted them below (each row corresponds to one iteration of the loop over m in the implementation).
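
To make the conditioning claim in the first note concrete, the sketch below computes the condition number of a Toeplitz covariance matrix for different correlation lengths. The squared-exponential kernel on a unit-spaced grid is an assumption chosen for illustration (the kernel used in the plots is not stated), and the function names are hypothetical.

```python
import numpy as np


def toeplitz_from_first_row(a):
    """Build the dense symmetric Toeplitz matrix with first row `a`."""
    a = np.asarray(a, dtype=float)
    idx = np.arange(a.shape[0])
    return a[np.abs(idx[:, None] - idx[None, :])]


def sq_exp_first_row(n, length_scale):
    """First row of a squared-exponential covariance on a unit-spaced grid."""
    lags = np.arange(n, dtype=float)
    return np.exp(-0.5 * (lags / length_scale) ** 2)


def condition_number(n, length_scale):
    """2-norm condition number of the dense covariance matrix."""
    C = toeplitz_from_first_row(sq_exp_first_row(n, length_scale))
    return np.linalg.cond(C)


if __name__ == "__main__":
    for length_scale in (0.5, 1.0, 2.0, 3.0):
        print(length_scale, condition_number(16, length_scale))
```

Adding a small diagonal term (jitter) to the first element of the row is a common way to lower the condition number, at the cost of changing the model slightly.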

Squinting at these, maybe the oscillatory errors have something to do with rounding errors in the summation of a_rev * x?
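
One way to probe this hypothesis is to compare a single-precision dot product against an exactly rounded reference. The sketch below assumes the solver runs in float32 (the JAX default unless 64-bit mode is enabled; the post does not state the precision), and the helper names `exact_dot` and `float32_dot_error` are hypothetical.

```python
import math

import numpy as np


def exact_dot(u, v):
    """Correctly rounded dot product of two vectors of float32 values."""
    u64 = np.asarray(u, dtype=np.float32).astype(np.float64)
    v64 = np.asarray(v, dtype=np.float32).astype(np.float64)
    # The product of two float32 values is exactly representable in float64,
    # and math.fsum accumulates the products without intermediate rounding.
    return math.fsum((u64 * v64).tolist())


def float32_dot_error(u, v):
    """Absolute error of a float32 dot product relative to the exact result."""
    u32 = np.asarray(u, dtype=np.float32)
    v32 = np.asarray(v, dtype=np.float32)
    return abs(float(np.dot(u32, v32)) - exact_dot(u32, v32))
```

Applying `float32_dot_error` to the rows of the recorded "x * a_rev" working variables (restricted by the mask) would show whether the summation error is large enough to explain the oscillations. NumPy's accumulation order may differ from the one XLA uses, so this is only an indication.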

Thanks for making it this far and any input you might have!

¹ That's the case on a regularly spaced grid in one dimension for a stationary covariance kernel.
