Abstract:
Protein-ligand interactions play a crucial role in many biological processes, including signal transduction, gene regulation, and enzymatic reactions. Understanding these interactions is fundamental yet challenging because they are numerous and highly diverse. To address this, abstract and interactive visualizations are required that enable visual analysis and facilitate both an intuitive understanding of and access to complex protein-ligand data. Due to their inherent three-dimensional nature, the interaction data and the protein and ligand structures are particularly well suited for spatial visualization. However, to answer certain research questions, derived abstract 3D and 2D representations are used as well. An interactive depiction of the complex protein-ligand interplay provides insights into various research areas. It enhances the understanding of conformational changes of proteins, biochemical processes in and between cells, and disease mechanisms, which is pivotal in medical research. Furthermore, it supports the investigation of ligand transport processes, such as how a ligand approaches the active site of an enzyme and which conditions must be fulfilled for certain interaction types to occur. This also provides valuable information for protein engineers and ligand designers, enabling targeted mutations of amino acids to improve the binding affinity of protein-ligand complexes or to increase the catalytic rates of enzymes, both of which are essential for advancements in biotechnology and drug development.
This thesis focuses on developing interactive visualization methods for studying protein-ligand interactions, facilitating the exploration and visual analysis of large and complex data sets. Existing free and commercial tools provide visualizations of interactions and the molecular structures involved. However, they lack specialized functionality, often offering no or only basic capabilities for evaluation and visual analysis, e.g., of results from virtual screenings or molecular dynamics simulations. Furthermore, these solutions are not designed to handle large data sets effectively, and they typically specialize in a detailed ligand-centered analysis, frequently neglecting the protein perspective. A comprehensive approach, however, also needs to include the investigation and analysis of protein surfaces and their physico-chemical properties, since not only the shape but also the chemical composition determines possible interactions.
In this thesis, various methods are discussed that enable the interactive visual analysis of protein-ligand interactions and elucidate the field from different points of view. Individual methods use machine learning to extract specific structural features of proteins for classification. In addition, the protein structure is considered together with its physico-chemical properties, which enables the search for functionally similar proteins. These methods make it possible to derive common features of interactions and to learn the conditions that are crucial for the formation of certain interaction types. Subsequently, the thesis focuses on dynamic data from computational chemistry, particularly on approaches for visualizing molecular dynamics simulations. One application uses a highly GPU-accelerated implementation of a protein surface visualization method, allowing exploratory analysis on current consumer computers. A further method uses an aggregation approach to create a dashboard-like presentation of entire molecular dynamics simulations. It supports identifying binding sites and the amino acids that interact most, highlighted in a novel sequence diagram. Furthermore, molecular docking is considered as a means of studying interactions. The visual analysis application InVADo was designed to enable a ligand-centered, comprehensive analysis and evaluation of docking results. It guides scientists to areas of interest and supports users in verifying existing hypotheses by enriching the results through post-docking analysis.
In addition, the applicability and usefulness of the approaches were supported by expert evaluations. The methods developed for the interactive visual analysis of protein-ligand interactions contribute to gaining new insights into the functions of the molecular machinery of life and to acquiring a deeper understanding of the underlying mechanisms, which is crucial for systems biology, biotechnology, and pharmaceutical research.