Clearing the Transcription Hurdle in Dialect Corpus Building: The Corpus of Southern Dutch Dialects as Case Study

This paper discusses how the transcription hurdle in dialect corpus building can be cleared. While corpus analysis has strongly gained in popularity in linguistic research, dialect corpora are still relatively scarce. This scarcity can be attributed to several factors, one of which is the challenging nature of transcribing dialects, given a lack of both orthographic norms for many dialects and speech technological tools trained on dialect data. This paper addresses the questions (i) how dialects can be transcribed efficiently and (ii) whether speech technological tools can lighten the transcri...

Authors: Anne-Sophie Ghyselen
Anne Breitbarth
Melissa Farasyn
Jacques Van Keymeulen
Arjan van Hessen
Document type: Article
Publication date: 2020
Series/Periodical: Frontiers in Artificial Intelligence, Vol 3 (2020)
Publisher: Frontiers Media S.A.
Keywords: dialect / transcription / corpus research / ASR / respeaking / forced alignment / Electronic computers. Computer science / QA75.5-76.95
Language: English
Permalink: https://search.fid-benelux.de/Record/base-29406132
Data source: BASE; original catalogue
Link(s): https://doi.org/10.3389/frai.2020.00010