Learning Optimal Algorithms for Parametric Optimization Problems




Citable link (URI): http://hdl.handle.net/10900/168623
http://nbn-resolving.org/urn:nbn:de:bsz:21-dspace-1686237
http://dx.doi.org/10.15496/publikation-109950
Document type: Dissertation
Date of publication: 2025-07-31
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Mathematics
Reviewer: Ochs, Peter (Prof. Dr.)
Date of oral examination: 2025-06-27
DDC classification: 510 - Mathematics
Keywords:
Machine Learning
Optimization
Learning-to-Optimize
PAC-Bayes
License: http://tobias-lib.uni-tuebingen.de/doku/lic_ohne_pod.php?la=de http://tobias-lib.uni-tuebingen.de/doku/lic_ohne_pod.php?la=en

Abstract:

From an abstract point of view, the problem of optimally selecting a variable or arranging things in the best possible way occurs frequently and in a wide variety of situations. It is therefore not surprising that so-called optimization problems constantly have to be solved in science and industry. At the same time, these problems are becoming increasingly complex and their solution more time- and energy-consuming, so there is a constant need for ever more efficient optimization algorithms. "Learning-to-optimize" is a recent line of research that leverages machine learning techniques to automatically build and accelerate optimization algorithms. While the empirical results can be impressive, theoretical guarantees are mostly lacking, so that the application of learned optimization methods remains questionable and they are thus not widely adopted. This applies in particular to safety-critical applications with their need for guarantees, not least because conventional algorithms already exist that can provably solve these problems. One reason why the theoretical understanding of learned optimization algorithms lags so far behind is that the established mathematical language and the usual chains of reasoning in mathematical optimization and in machine learning are quite different: mathematical optimization most often relies on geometric arguments and deductive reasoning, whereas machine learning models learn from observational data, so that the results are data-dependent and the guarantees are of a statistical nature. As a consequence, most known proof strategies fail sooner or later or, worse, are not applicable at all. Therefore, this thesis provides a series of contributions towards new guarantees and a better theoretical understanding of learned optimization algorithms, as well as a way of bridging the gap between conventional optimization theory and learning-to-optimize.
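To make the learning-to-optimize idea concrete, the following is a minimal toy sketch (not taken from the thesis): a single scalar step size for gradient descent is "learned" by minimizing the average final objective over a sampled family of one-dimensional quadratics, for which the unrolled iterates have the closed form x_K = (1 - alpha*a)^K * x_0. All names, the problem family, and the grid-search "training" are illustrative assumptions.

```python
import random

# Toy learning-to-optimize setup (illustrative, not the thesis's method):
# learn one step size alpha for gradient descent on the parametric family
# f_a(x) = 0.5 * a * x**2 with curvature a ~ Uniform[0.5, 2.0].
random.seed(0)
TRAIN_CURVATURES = [random.uniform(0.5, 2.0) for _ in range(200)]
K, X0 = 10, 1.0  # number of unrolled iterations and common starting point

def meta_loss(alpha, curvatures):
    """Average final objective after K gradient steps with step size alpha.
    For f_a, gradient descent gives x_K = (1 - alpha*a)**K * x0 in closed form."""
    total = 0.0
    for a in curvatures:
        x_k = (1.0 - alpha * a) ** K * X0
        total += 0.5 * a * x_k ** 2
    return total / len(curvatures)

# "Learning" here is simply minimizing the meta-loss over alpha by grid search.
candidates = [i / 100 for i in range(1, 100)]
alpha_star = min(candidates, key=lambda al: meta_loss(al, TRAIN_CURVATURES))

# Generalization question: does the learned alpha also work on unseen problems
# from the same distribution, and does it beat a naive default step size?
test_curvatures = [random.uniform(0.5, 2.0) for _ in range(200)]
print(alpha_star, meta_loss(alpha_star, test_curvatures), meta_loss(0.1, test_curvatures))
```

Even this toy example exhibits the central issue the thesis addresses: the learned alpha depends on the sampled training problems, so any guarantee about its performance on new problems is necessarily statistical rather than deductive.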
The first part considers a simplified model of an optimization algorithm and provides generalization guarantees for its performance based on the observations made during training. Furthermore, it introduces a corresponding learning procedure as well as several design choices for learning optimization algorithms. Since many limitations encountered in the first part stem from the inadequacy of the underlying model, the second part introduces a more faithful model of optimization algorithms and of how practitioners use them. This construction follows a bottom-up approach, which allows for minimal assumptions and wide applicability, and effectively solves the problems of the first part all at once. Moreover, this model makes it possible to derive generalization guarantees for most conceivable performance measures rather easily. Then, based on this new model, the third part provides a new proof strategy with which the gap between conventional optimization theory and learning-to-optimize can be closed to a large extent. The underlying idea is exemplified for the problem of proving convergence to stationary points of non-smooth and non-convex loss functions. This results in a generalization guarantee for the learned optimization algorithm which certifies that, roughly speaking, the probability that the learned algorithm converges to a stationary point of the loss function is lower-bounded by corresponding quantities that are observable during training. Finally, the last part summarizes the contributions of this thesis and discusses its limitations. Further, it shows how the remaining problems can be explained based on the proposed model, which of these problems can potentially be solved with statistical approaches, and which will probably always have to be solved in the conventional way. Lastly, some potential directions for future research are discussed.
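The guarantee described for the third part — a convergence probability lower-bounded by quantities observable during training — can be illustrated in a much simpler form than the thesis's PAC-Bayesian analysis. The sketch below uses a plain Hoeffding-style lower confidence bound on a success probability estimated from independently sampled training problems; the function name and the numbers are hypothetical.

```python
import math

# Illustrative only (a Hoeffding bound, not the thesis's PAC-Bayesian result):
# lower-bound the probability that a *fixed* learned algorithm reaches a
# stationary point, from its empirical success frequency on n sampled problems.

def convergence_lower_bound(successes, n, delta=0.05):
    """With probability >= 1 - delta over the n i.i.d. sampled problems, the
    true convergence probability is at least the empirical frequency minus
    the Hoeffding slack sqrt(log(1/delta) / (2n))."""
    emp = successes / n
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return max(0.0, emp - slack)

# Hypothetical example: the learned optimizer reached a stationary point on
# 980 of 1000 sampled training problems; certify a high-probability lower
# bound on its success rate for unseen problems from the same distribution.
print(convergence_lower_bound(980, 1000))  # ~0.941
```

The bound only holds for an algorithm chosen independently of the evaluation sample; handling the data-dependence introduced by the learning step itself is exactly where PAC-Bayesian arguments, as used in the thesis, become necessary.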
Taken together, these results contribute to a better understanding of learning-to-optimize and to the overarching objective of obtaining more efficient optimization algorithms through machine learning while retaining the necessary theoretical guarantees.
