Incorporating Preconditioning into Accelerated Approaches: Theoretical Guarantees and Practical Improvement

Trifonov S. D., Levin L. I., Chezhegov S. A., Beznosikov A. N.

UDC 519.853.62
DOI: 10.33048/semi.2025.22.C09  
MSC 68T07


Abstract:

Machine learning and deep learning are widely researched fields that provide solutions to many modern problems. Due to the growing complexity of these problems and the size of modern datasets, efficient approaches are essential. In optimization theory, the Heavy Ball and Nesterov methods use momentum in their updates of model weights. At the same time, the minimization problems considered may be poorly conditioned, which affects the applicability and effectiveness of the aforementioned techniques. One solution to this issue is preconditioning, which has already been investigated in approaches such as AdaGrad, RMSProp, and Adam. Despite this, momentum acceleration and preconditioning have not been fully explored together. Therefore, we propose the Preconditioned Heavy Ball (PHB) and Preconditioned Nesterov (PN) methods with theoretical guarantees of convergence under a unified assumption on the scaling matrix. Furthermore, we provide numerical experiments that demonstrate superior performance compared to the unscaled techniques in terms of iteration and oracle complexity.

Keywords: adaptive optimization, preconditioning, accelerated methods
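
To illustrate how a diagonal preconditioner can be combined with heavy-ball momentum, the following sketch shows one possible update rule. It is a minimal illustration that assumes an RMSProp-style scaling matrix; the exact form of the PHB method and its scaling matrix analyzed in the paper may differ.

```python
import numpy as np

def phb_step(x, x_prev, grad, v, lr=1e-2, momentum=0.9, beta=0.99, eps=1e-8):
    """One preconditioned heavy-ball step (illustrative sketch only).

    x, x_prev : current and previous iterates
    grad      : gradient at x
    v         : running average of squared gradients (diagonal preconditioner state)
    """
    # Update the diagonal scaling (RMSProp-style; an assumption, not the paper's PHB)
    v = beta * v + (1 - beta) * grad ** 2
    # Scale the gradient by the inverse square root of the preconditioner
    precond_grad = grad / (np.sqrt(v) + eps)
    # Heavy-ball update: preconditioned gradient step plus momentum term
    x_next = x - lr * precond_grad + momentum * (x - x_prev)
    return x_next, x, v

# Run a few preconditioned heavy-ball steps on a poorly conditioned quadratic
# f(x) = 0.5 * x^T diag(d) x, whose gradient is d * x.
d = np.array([1.0, 100.0])
x, x_prev, v = np.ones(2), np.ones(2), np.zeros(2)
for _ in range(200):
    x, x_prev, v = phb_step(x, x_prev, d * x, v)
```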