TwNC: a Multifaceted Dutch News Corpus
This contribution describes the Twente News Corpus (TwNC), a multifaceted corpus for Dutch that is being deployed in a number of NLP research projects, including tracks within the Dutch national research programme MultimediaN, the NWO programme CATCH, and the Dutch-Flemish programme STEVIN. Development of the corpus started in 1998 within the predecessor project DRUID, and the corpus currently contains 530M words. The text part has been built from texts of four different sources: Dutch national newspapers, television subtitles, teleprompter (autocue) files, and both manually and automatically ge...
Author(s): | |
---|---|
Document type: | article / Letter to editor |
Publication date: | 2007 |
Publisher/Ed.: | ELRA |
Language: | unknown |
Permalink: | https://search.fid-benelux.de/Record/base-27066351 |
Data source: | BASE; original catalog |
Link(s): | http://purl.utwente.nl/publications/68090 |