Word level language identification in online multilingual communication

Multilingual speakers switch between languages in online and spoken communication. Analyses of large scale multilingual data require automatic language identification at the word level. For our experiments with multilingual online discussions, we first tag the language of individual words using language models and dictionaries. Secondly, we incorporate context to improve the performance. We achieve an accuracy of 98%. Besides word level accuracy, we use two new metrics to evaluate this task.
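
As a rough illustration of the first step described in the abstract (tagging individual words with dictionaries and language models), the following is a minimal sketch and not the authors' implementation. It assumes character bigram models with add-one smoothing and a simple dictionary lookup that takes precedence when a word appears in exactly one language's word list. The tiny Dutch and Turkish word lists, the class `CharBigramModel`, and the function `identify_word` are all hypothetical names and data chosen for this example. The context-based second step is not covered.

```python
import math
from collections import Counter

# Hypothetical toy word lists; a real system would use large corpora.
TRAINING_WORDS = {
    "nl": ["huis", "straat", "school", "vandaag", "goed", "mooi", "lekker", "gezellig"],
    "tr": ["ev", "sokak", "okul", "bugün", "güzel", "iyi", "çok", "teşekkürler"],
}


class CharBigramModel:
    """Character bigram model with add-one smoothing (an assumed design)."""

    def __init__(self, words, alphabet_size):
        self.bigrams = Counter()
        self.contexts = Counter()
        self.alphabet_size = alphabet_size
        for word in words:
            padded = "^" + word.lower() + "$"  # mark word start and end
            for first, second in zip(padded, padded[1:]):
                self.bigrams[first + second] += 1
                self.contexts[first] += 1

    def log_prob(self, word):
        """Return the smoothed log-probability of the word under this model."""
        padded = "^" + word.lower() + "$"
        total = 0.0
        for first, second in zip(padded, padded[1:]):
            numerator = self.bigrams[first + second] + 1
            denominator = self.contexts[first] + self.alphabet_size
            total += math.log(numerator / denominator)
        return total


# Shared alphabet size so that smoothing is comparable across languages.
ALPHABET = set("^$")
for word_list in TRAINING_WORDS.values():
    for word in word_list:
        ALPHABET.update(word.lower())

MODELS = {
    lang: CharBigramModel(words, len(ALPHABET))
    for lang, words in TRAINING_WORDS.items()
}
DICTIONARIES = {lang: set(words) for lang, words in TRAINING_WORDS.items()}


def identify_word(word):
    """Tag one word: dictionary lookup first, then the language models."""
    token = word.lower()
    hits = [lang for lang, vocab in DICTIONARIES.items() if token in vocab]
    if len(hits) == 1:  # unambiguous dictionary match
        return hits[0]
    # Otherwise pick the language whose model scores the word highest.
    return max(MODELS, key=lambda lang: MODELS[lang].log_prob(token))


if __name__ == "__main__":
    for token in ["huis", "iyi", "güzeldi", "straten"]:
        print(token, identify_word(token))
```

With word lists this small the model scores are only indicative. Because each word is scored in isolation, short or shared words can remain ambiguous, which is the kind of case the context step mentioned in the abstract would address.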

Authors: Nguyen, Dong
Doğruöz, A. Seza
Document type: conference
Publication date: 2013
Publisher: Association for Computational Linguistics (ACL)
Keywords: Languages and Literatures / multilingual / Turkish / Dutch / code-switching / automatic language identification / social media / Lt3
Language: English
Permalink: https://search.fid-benelux.de/Record/base-27063372
Data source: BASE; original catalog
Link(s): https://biblio.ugent.be/publication/8694791