BeCoS corpus : Belgian Covid-19 Sign language corpus : a corpus for training sign language recognition and translation

Authors: Vandeghinste, Vincent
Van Dyck, Bob
De Coster, Mathieu
Goddefroy, Maud
Dambre, Joni
Document type: journal article
Publication date: 2022
Keywords: Languages and Literatures
Language: English
Permalink: https://search.fid-benelux.de/Record/base-28878239
Data source: BASE; original catalog
Link(s): https://biblio.ugent.be/publication/01GP37D6W8299E7AE1YVZGDV5S

We present the Belgian Federal COVID-19 corpus, nicknamed the BeCoS (Belgian Covid Sign language) corpus. It consists of the entire archive of official press conferences held by the Belgian Federal Government on the COVID-19 pandemic. The speakers mostly speak Dutch or French and occasionally German, and nearly all speech is accompanied by a deaf signer who interprets live what is being said. We have preprocessed the corpus with speaker diarisation, Belgian Dutch ASR, and post-ASR language identification and punctuation prediction, as well as signer diarisation, sign language identification and sign language keypoint recognition. The corpus is publicly available.