Lexicon or grammar? Using memory-based learning to investigate the syntactic relationship between Belgian and Netherlandic Dutch

Authors: De Troij, Robbert
Grondelaers, Stefan
Speelman, Dirk
van den Bosch, A.
Document type: Article
Publication date: 2021
Series/Periodical: De Troij, R, Grondelaers, S, Speelman, D & van den Bosch, A 2021, 'Lexicon or grammar? Using memory-based learning to investigate the syntactic relationship between Belgian and Netherlandic Dutch', Natural Language Engineering, pp. 1-19. https://doi.org/10.1017/s1351324921000097
Keywords: syntactic variation / Dutch / existential constructions / memory-based learning / national variation
Language: English
Permalink: https://search.fid-benelux.de/Record/base-26502586
Data source: BASE; original catalog
Link(s) : https://pure.knaw.nl/portal/en/publications/8ae2ba67-5004-4e26-ac01-288c6fffa0b9

This article builds on computational tools to investigate the syntactic relationship between the highly related European national varieties of Dutch, viz. Belgian Dutch (BD) and Netherlandic Dutch (ND). It reports on a series of memory-based learning analyses of the post-verbal distribution of er “there” in adjunct-initial existential constructions like Op het dak staat (er) een schoorsteen “On the roof (there) is a chimney”, which has been claimed to be among the most notoriously difficult variables in Dutch. On the basis of balanced datasets extracted from Flemish and Dutch newspaper corpora, it is shown that er’s distribution in both national varieties can be learned to a considerable extent from bare lexical input which is not assigned to higher-level categories. However, whereas this yields good results for ND, BD scores are consistently lower, suggesting that BD cannot do with lexical features alone to attain accuracy scores comparable to ND. This ties in with earlier findings that the more advanced standardization of ND materializes in a higher lexical collocability, whereas Flemish speakers need additional higher-level linguistic information to insert er.