Dutch Adjective-Noun Phrase Dataset for Compositionality Tests (nld-adj-n) ...
Author: | |
---|---|
Document type: | Dataset |
Publication date: | 2019 |
Publisher: | University of Tübingen |
Keywords: | linguistics |
Language: | unknown |
Permalink: | https://search.fid-benelux.de/Record/base-28981773 |
Data source: | BASE; original catalog |
Powered by: | BASE |
Link(s): | https://dx.doi.org/10.57754/fdat.n659d-e8r84 |
If you want to use this dataset for research purposes, please refer to the following sources:

- Gertjan Van Noord, Gosse Bouma, Frank Van Eynde, Daniël De Kok, Jelmer Van der Linde, Ineke Schuurman, Erik Tjong Kim Sang, and Vincent Vandeghinste. 2013. Large Scale Syntactic Annotation of Written Dutch: Lassy. In Essential Speech and Language Technology for Dutch, pages 147–164. Springer.
- Corina Dima, Daniël de Kok, Neele Witte, Erhard Hinrichs. 2019. No word is an island — a transformation weighting model for semantic composition. Transactions of the Association for Computational Linguistics.

The dataset is distributed under the Creative Commons Attribution NonCommercial (CC-BY-NC) license.

The 83,392 Dutch adjective-noun phrases (58,347 train, 16,669 test, 8,376 dev) from this dataset were extracted from the Lassy Large treebank (Van Noord et al., 2013), which consists of written texts (Wikipedia, newspapers) and texts of the medical domain. The train/test/dev files have the following format, the single ...
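
The split sizes quoted in the description can be checked with a few lines of arithmetic. The sketch below only uses the counts stated in the record; the helper name `split_shares` is a hypothetical name chosen for illustration, and nothing here reads the actual dataset files, whose format is not shown in this record.

```python
# Split sizes exactly as stated in the dataset description.
SPLITS = {"train": 58_347, "test": 16_669, "dev": 8_376}
STATED_TOTAL = 83_392


def split_shares(splits):
    """Return each split's share of the total as a fraction (hypothetical helper)."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}


if __name__ == "__main__":
    # The three splits should add up to the stated number of phrases.
    assert sum(SPLITS.values()) == STATED_TOTAL

    for name, share in split_shares(SPLITS).items():
        print(f"{name}: {SPLITS[name]} phrases ({share:.1%})")
```

The counts add up to the stated total and correspond to roughly a 70/20/10 train/test/dev split.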