Sprekend Nederland: a heterogeneous speech data collection

Authors: Hinskens, F.L.M.P.
van Leeuwen, David A.
Martinovic, Borja
van Hessen, Arjan
Grondelaers, Stefan
Orr, Rosemary
Document type: Article
Publication date: 2016
Series/Journal: Hinskens, F. L. M. P., van Leeuwen, D. A., Martinovic, B., van Hessen, A., Grondelaers, S. & Orr, R. 2016, 'Sprekend Nederland: a heterogeneous speech data collection', Computational Linguistics in the Netherlands Journal, vol. 6 (2016), pp. 21-38. <http://www.clinjournal.org/sites/clinjournal.org/files/VanLeeuwen2016.pdf>
Language: English
Permalink: https://search.fid-benelux.de/Record/base-28710528
Data source: BASE; original catalogue
Link(s) : https://pure.knaw.nl/portal/en/publications/66ce6ae9-ddaa-40bf-82f4-5b0be025f384

Sprekend Nederland is a large-scale effort to document the variability of Dutch as spoken in the Netherlands in 2016. A smartphone app was created to record the speech of as many speakers of Dutch as possible, as well as their attitudes (perceptions and evaluations) towards other participants' speech. Initiated by the national broadcast organisation NTR, Sprekend Nederland relies on both traditional and modern media to recruit participants. At this point, about halfway through the project, over 7000 participants have recorded over 200 000 utterances, totalling about 375 hours of speech data, and over a million attitude judgements have been given. In this paper we report on the design and implementation of the data collection, present some preliminary statistics and demographics, and outline a number of research possibilities that this data collection offers.