Abstract:
The main objective of this thesis is to classify the relations that hold between the elements of adjective-noun combinations. The classification distinguishes between lexically free and lexically restricted phrases. For instance, in the free phrase green dress, the relation between the adjective green and the noun dress is color; in light package, it is weight. In a lexically restricted phrase such as heavy smoker, the relation is intensity; in old friend, the relation is temporal, more specifically, duration. This thesis claims that an adequate inventory of such relations can only be achieved through a synthesis of different theoretical frameworks. The investigation focuses on adjective-noun pairs because they have received considerably less attention in previous research than verbal constructions. The empirical basis for the research consists of German data. The choice of language is motivated by the availability of rich linguistic resources for German that are suitable for data collection.
The applicability of the following inventories to this task is investigated on a sample of lexically varied data: Lexical Functions, Qualia Roles, Frame Elements, and attributes. The analysis reveals that each of these approaches alone is insufficient, each for a different reason. Standard Lexical Functions can only be applied to a certain type of lexically restricted phrase, and Non-Standard Lexical Functions are too specific and fine-grained; Qualia Roles are coarse-grained and limited to describing the semantics of concrete nouns. The inventory offered by Frame Elements is semantically broad and can accommodate nouns from various semantic fields. However, it has the disadvantage of being very fine-grained and, in part, inconsistent. Attributes overcome these issues, but they do not offer a specific inventory of labels that could be used for modeling. The semantic classification of adjectives in the German wordnet GermaNet is used as the basis for developing an adequate inventory of attributes.
The annotation scheme is developed in two stages: first, the data are manually annotated with the attributes from GermaNet; second, the labels are modified for better generalization, drawing on insights from the frameworks described above. The resulting scheme, consisting of 54 attributes, is not exhaustive and can be extended depending on the data. In order to assess the adequacy of the scheme, inter-annotator agreement is calculated, a qualitative analysis of the dataset is performed, and a series of machine learning experiments is conducted. Additionally, the scheme is successfully applied to an English dataset of adjective-noun phrases, showing that the proposed inventory of relations is general and stable enough to describe different and more diverse data. Finally, a sample of asymmetric adjective-noun pairs is compiled and the inventory of attributes is applied to it.