Middle Dutch syllabified words

Author: Haverals, Wouter
Document type: other
Publication date: 2018
Publisher: Zenodo
Keywords: Middle Dutch / syllabification / syllabifier
Language: Dutch, Middle (ca. 1050-1350)
Permalink: https://search.fid-benelux.de/Record/base-28640152
Data source: BASE; original catalog
Link(s): https://doi.org/10.5281/zenodo.2274921

Specifics of the data: Text file containing 43,710 syllabified Middle Dutch words, taken from the Corpus Van Reenen-Mulder. This corpus, created by Pieter van Reenen and Maaike Mulder at the Free University Amsterdam, contains about 2,500 Middle Dutch charters and has about 750,000 tokens. The charters were written in the Netherlands and Flanders between 1300 and 1400. The 43,710 syllabified words in this list are the total number of unique words from the Corpus Van Reenen-Mulder. Some tokens from this corpus were, however, excluded when assembling the data set because they contained diacritic symbols indicating abbreviations, clitics, or unclear parts in the original charter. A dash symbol (-) is used as the syllable separator. Apart from the entire data set, this DOI also includes a PDF file visualizing the data set and the splits used for the automatic syllabification experiment by Haverals, Kestemont & Karsdorp (2018).
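
Since the description says that a dash separates the syllables, the entries could be processed roughly as in the following Python sketch. It assumes one syllabified word per line, which the record does not state. The function names and the sample entries are hypothetical and made up for illustration; they are not taken from the data set.

```python
from collections import Counter
from typing import Iterable, List


def split_syllables(entry: str, separator: str = "-") -> List[str]:
    """Split one syllabified entry into its syllables.

    Assumes the dash described in the record is the only separator.
    Empty parts (e.g. from a stray leading dash) are dropped.
    """
    return [part for part in entry.strip().split(separator) if part]


def syllable_count_distribution(entries: Iterable[str]) -> Counter:
    """Count how many entries have 1, 2, 3, ... syllables, skipping blank lines."""
    return Counter(len(split_syllables(e)) for e in entries if e.strip())


# Made-up entries for illustration only; NOT taken from the data set.
sample_entries = ["ghe-ge-ven", "sce-pe-nen", "dat"]

if __name__ == "__main__":
    for entry in sample_entries:
        print(entry, split_syllables(entry))
    print(syllable_count_distribution(sample_entries))
```

To run this on the actual file, the sample list could be replaced by the lines of the downloaded text file, provided the one-word-per-line assumption holds for it.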