Graphs in Unsupervised Machine Learning


Citable link (URI): http://hdl.handle.net/10900/161963
http://nbn-resolving.org/urn:nbn:de:bsz:21-dspace-1619638
http://dx.doi.org/10.15496/publikation-103295
Document type: Dissertation
Date of publication: 2025-02-14
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Computer Science
Reviewer: Luxburg, Ulrike von (Prof. Dr.)
Date of oral examination: 2025-01-21
DDC classification: 004 - Computer Science
Keywords: Graph, Machine Learning
License: http://tobias-lib.uni-tuebingen.de/doku/lic_ohne_pod.php?la=de http://tobias-lib.uni-tuebingen.de/doku/lic_ohne_pod.php?la=en

Abstract:

Graphs are versatile data structures that represent individual data points and the intricate relationships between them, making them invaluable for modelling complex real-world systems. Their flexibility allows them to be applied to a wide range of problems, and with appropriate algorithms, they have become crucial in many domains. In particular, unsupervised machine learning, where patterns are discovered without labelled data, has greatly benefited from integrating graphs. This thesis presents an introduction to unsupervised machine learning on graphs, focusing on clustering and representation learning. We explore the strengths and challenges of applying graphs to machine learning, emphasizing these two specific tasks. Our contributions span two distinct perspectives within unsupervised learning on graphs.

In the first part, we propose a novel clustering method inspired by the concept of tangles from mathematical graph theory. Tangles are used to detect densely connected subsets within a graph, and we adapt this idea to cluster data across various domains. Our approach yields a soft clustering hierarchy that is often inherently interpretable. We provide theoretical performance guarantees across different data types and validate the method's effectiveness through extensive empirical evaluations.

The second part of this thesis delves into representation learning on graphs. We investigate the inductive biases of graph auto-encoders, connecting their non-linear architecture to a linear model. Our empirical findings show that linear models can match or surpass the performance of non-linear graph auto-encoders when node features are effectively leveraged. Additionally, we analyze the constraints imposed by features on the solution space, offering theoretical and empirical evidence that these features are the key to achieving strong performance in this setting.
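To make the comparison in the second part concrete, the following is a minimal illustrative sketch (not the thesis's actual models or experiments) of a *linear* graph auto-encoder: node embeddings are produced by a single linear propagation step over a normalised adjacency matrix, and edges are reconstructed from inner products of embeddings. The toy graph, the GCN-style symmetric normalisation, and the random (untrained) weight matrix `W` are all assumptions made for the example.

```python
import numpy as np

# Toy graph: a 4-node path (illustrative only, not data from the thesis).
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
X = np.eye(4)  # node features; identity = "featureless" one-hot baseline

# Symmetric normalisation with self-loops (GCN-style propagation, assumed here).
A_hat = A + np.eye(4)
d = A_hat.sum(axis=1)
A_norm = A_hat / np.sqrt(np.outer(d, d))  # D^{-1/2} (A+I) D^{-1/2}

# Linear encoder: Z = A_norm X W, with no non-linear activation.
# W is a random, untrained weight matrix purely for illustration.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 2))
Z = A_norm @ X @ W

# Inner-product decoder: reconstructed edge probabilities sigmoid(Z Z^T).
A_rec = 1.0 / (1.0 + np.exp(-(Z @ Z.T)))
```

A non-linear graph auto-encoder would insert an activation (e.g. ReLU) between stacked propagation layers; the point sketched here is that the linear variant keeps the same encode-propagate-decode structure, so differences in performance can be attributed to the features and the propagation rather than the non-linearity.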
