Automatic Identification of Synonym Relations in the Dutch Parliament’s Thesaurus

Authors: Aga, Rosa Tsegaye
Wartena, Christian
Lange, Otto
Aders, Nelleke
Document type: Article
Publication date: 2017
Publisher: Hannover : Hochschule Hannover
Keywords: Synonym / Automatic identification / Thesaurus / ddc:020
Language: English
Permalink: https://search.fid-benelux.de/Record/base-27410006
Data source: BASE; original catalog
Link(s): https://serwiss.bib.hs-hannover.de/frontdoor/index/index/docId/1145

The Dutch Parliament uses a specialized thesaurus to index archived documents. Good results in full-text retrieval and automatic classification turn out to depend on adding more synonyms to the existing thesaurus terms. In the present work we investigate ways of finding synonyms for terms of the parliament's thesaurus automatically. We propose to use distributional similarity (DS). In an experiment with pairs of synonyms and non-synonyms, we train and test a classifier using distributional similarity and string similarity. Using ten-fold cross-validation, we were able to classify 75% of a set of 6000 word pairs correctly.
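
As a rough illustration of the two kinds of features named in the abstract, the sketch below computes a distributional similarity and a string similarity for a word pair, and splits a pair set into ten folds. The abstract does not specify the actual features, corpus, or classifier, so everything here is an assumption: the window-based co-occurrence counts, the cosine measure, the use of `difflib.SequenceMatcher` for string similarity, the helper names (`context_vector`, `cosine`, `pair_features`, `k_folds`), and the toy English sentences.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt


def context_vector(term, sentences, window=2):
    """Count the words that occur within `window` tokens of `term`."""
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok == term:
                lo, hi = max(0, i - window), i + window + 1
                # Add every neighbour in the window except the term itself.
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], lo) if j != i
                )
    return counts


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def pair_features(a, b, sentences):
    """Return (distributional similarity, string similarity) for a word pair."""
    ds = cosine(context_vector(a, sentences), context_vector(b, sentences))
    ss = SequenceMatcher(None, a, b).ratio()
    return ds, ss


def k_folds(n_items, k=10):
    """Yield (train_indices, test_indices) for k-fold cross-validation."""
    for fold in range(k):
        test = [i for i in range(n_items) if i % k == fold]
        train = [i for i in range(n_items) if i % k != fold]
        yield train, test


# Toy corpus (made up for illustration; not data from the paper).
sentences = [
    ["the", "car", "drives", "fast"],
    ["the", "automobile", "drives", "fast"],
    ["the", "minister", "answers", "questions"],
]

synonym_features = pair_features("car", "automobile", sentences)
non_synonym_features = pair_features("car", "minister", sentences)
folds = list(k_folds(6000, k=10))
```

In a setup like this, the feature tuples for labelled synonym and non-synonym pairs would be passed to a classifier of one's choice, trained on each fold's training indices and scored on its test indices; accuracy is then averaged over the ten folds.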