Abstract
The conjugate gradient (CG) method is one of the most widely applied iterative methods for solving positive definite linear systems. In the worst case, the method converges at a rate depending on the square root of the system's condition number, matching, e.g., the rate of accelerated gradient methods. In practice, however, CG often converges much faster. This can be explained by well-known instance-optimality bounds, which show that CG's error is bounded by the error of the best polynomial approximation to the matrix inverse on the specific input instance. This instance-dependent bound is typically much stronger than the worst-case guarantee, explaining CG's very fast convergence in practice. The conjugate gradient method is a special case of the more general Lanczos method, which can be used to approximate arbitrary matrix functions, such as the matrix square root and the matrix exponential. For such matrix functions, worst-case guarantees on the accuracy of Lanczos can be proven. However, as with linear systems, in practice the method typically far outperforms these worst-case bounds. Surprisingly, there is little theoretical understanding of this gap; in particular, very few instance-optimal approximation bounds are known for general matrix functions. In this talk I will discuss recent progress towards proving instance-optimality of Lanczos for rational matrix functions and several other central classes of functions, building on techniques used to analyze CG. I will also discuss some of the many open questions that remain, as well as the possibility of giving even faster methods in some regimes via other approaches, such as stochastic gradient methods.

Joint work with Noah Amsel, Tyler Chen, Anne Greenbaum, and Christopher Musco.
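For reference, here is a minimal sketch of the standard bounds the abstract alludes to, stated in common textbook notation (the symbols $x_k$, $x_\ast$, $\kappa$, $\mathcal{P}_k$, and $\mathrm{lan}_k$ are notational choices for this sketch, not taken from the talk). For a positive definite system $Ax_\ast = b$ with condition number $\kappa$, the CG iterate $x_k$ satisfies the worst-case bound and the instance-optimality characterization

\[
\|x_k - x_\ast\|_A \;\le\; 2\left(\frac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1}\right)^{k}\|x_0 - x_\ast\|_A,
\qquad
\|x_k - x_\ast\|_A \;=\; \min_{\substack{p \in \mathcal{P}_k \\ p(0)=1}} \|p(A)(x_0 - x_\ast)\|_A,
\]

where $\mathcal{P}_k$ denotes polynomials of degree at most $k$. The instance-optimal quantity depends on the eigenvalues of $A$ and on $x_0 - x_\ast$, and is often far smaller than the worst-case rate. Analogously, the standard a priori guarantee for the Lanczos approximation of a matrix function $f(A)b$ after $k$ iterations is a worst-case polynomial-approximation bound,

\[
\|f(A)b - \mathrm{lan}_k(f,A,b)\|_2 \;\le\; 2\,\|b\|_2 \min_{p \in \mathcal{P}_{k-1}} \max_{x \in [\lambda_{\min}(A),\, \lambda_{\max}(A)]} |f(x) - p(x)|,
\]

and the gap discussed in the talk is between this worst-case bound and the instance-dependent behavior observed in practice.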