Can Internet Data Help to Uncover Developing Preferred Multilingual Usage Patterns? An Exploration of Data from Turkish-Dutch Bilingual Internet Fora

Author: Dorleijn, Margreet
Document type: Article
Publication date: 2016
Series/Periodical: Journal of Language Contact; volume 9, issue 1, pages 130-162; ISSN 1877-4091, 1955-2629
Publisher/Editor: Brill
Keywords: Linguistics and Language / Language and Linguistics
Language: unknown
Permalink: https://search.fid-benelux.de/Record/base-26651943
Data source: BASE; original catalogue
Link(s): http://dx.doi.org/10.1163/19552629-00901006

This paper discusses the extent to which two characteristics of digital data make such data suitable for detecting preference patterns in code switching: the absence of paralinguistic disambiguation cues and the extra-linguistic ‘context-freeness’ of the data. It reports on the exploration of a 219,536-word Dutch-Turkish digital data corpus compiled from bilingual internet fora. It describes both macro-sociolinguistic patterns of language choice and micro-linguistic contact features in bilingual data, comparing the macro and micro results with what is known from the sociolinguistic literature in general, and from the literature on Turkish-Dutch code switching and contact linguistics in particular. The data are analysed qualitatively and quantitatively. The focus is on the analysis of densely bilingual data of the type that has been called in the literature ‘mixed language’ (Auer, 1999), ‘intimate switching’ (Poplack, 1980), or ‘unmarked switching’ (Myers-Scotton, 1983; 1993b). It is argued that data showing this type of intensive language mixing should display a certain degree of predictability, since its users generally perceive it as the most effortless way of speaking. It is demonstrated that recurring patterns can be found in the data, both on the macro level of language choice and on the micro level of lexical choice, as well as in code switching patterns and lexico-semantic choices, and it is argued that principles of transparency and frequency of exposure may be explanatory factors in these patterns.