Linking Dutch Wikipedia Categories to EuroWordNet : SA-OT accounts for pronoun resolution in child language

Author: Bouma, Gosse
Document type: Part of book or chapter of book
Publication date: 2009
Keywords: Taalwetenschap (Linguistics)
Language: English
Permalink: https://search.fid-benelux.de/Record/base-26680273
Data source: BASE; original catalog
Powered By: BASE
Link(s) : https://dspace.library.uu.nl/handle/1874/297138

Wikipedia provides category information for a large number of named entities, but the category structure of Wikipedia is associative and not always suitable for linguistic applications. For this reason, a merger of Wikipedia and WordNet has been proposed. In this paper, we address the word sense disambiguation problem that needs to be solved when linking Dutch Wikipedia categories to polysemous Dutch EuroWordNet literals. We show that a method based on automatically acquired predominant word senses outperforms a method based on word overlap between Wikipedia supercategories and WordNet hypernyms. We compare the coverage of the resulting categorization with that of a corpus-based system that uses automatically acquired category labels.
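
To make the two linking strategies named in the abstract concrete, the following is a minimal sketch and not the paper's implementation. It assumes a simple token-overlap score between a category's supercategories and each sense's hypernyms, and a precomputed table of predominant senses. The function names, sense identifiers, and the toy data for the polysemous literal "bank" are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def tokens(labels: List[str]) -> Set[str]:
    """Lowercase a list of labels and split them into a set of word tokens."""
    return {word for label in labels for word in label.lower().split()}


def link_by_overlap(
    supercategories: List[str],
    sense_hypernyms: Dict[str, List[str]],
) -> Optional[str]:
    """Pick the sense whose hypernyms share the most tokens with the
    supercategories. Returns None when no sense shares any token.
    (Assumed scoring; the paper may define overlap differently.)"""
    category_tokens = tokens(supercategories)
    best_sense, best_score = None, 0
    for sense_id, hypernyms in sense_hypernyms.items():
        score = len(category_tokens & tokens(hypernyms))
        if score > best_score:
            best_sense, best_score = sense_id, score
    return best_sense


def link_by_predominant_sense(
    literal: str,
    predominant: Dict[str, str],
) -> Optional[str]:
    """Ignore the category context and return the literal's predominant sense.
    The table is assumed to have been acquired automatically beforehand."""
    return predominant.get(literal)


# Hypothetical toy data: "bank" as a piece of furniture or as an institution.
SENSES = {
    "bank:1": ["meubel", "zitmeubel"],
    "bank:2": ["instelling", "organisatie"],
}
PREDOMINANT = {"bank": "bank:2"}

overlap_choice = link_by_overlap(["Nederlands bedrijf", "Instelling"], SENSES)
predominant_choice = link_by_predominant_sense("bank", PREDOMINANT)
```

In this toy case both strategies select `bank:2`. The overlap method returns `None` whenever the supercategories and hypernyms share no words, whereas the predominant-sense lookup still returns an answer for any literal present in its table.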