Dataset of Middle Dutch lexical stress patterns and syllabifications

Author: Haverals, Wouter
Document type: other
Publication date: 2019
Keywords: Middle Dutch / lexical stress / syllabification
Language: Dutch, Middle (ca. 1050-1350)
Permalink: https://search.fid-benelux.de/Record/base-27466136
Data source: BASE; original catalog
Link(s): https://zenodo.org/record/2582976

This dataset consists of 48,219 Middle Dutch words taken from a total of 205 rhymed texts on the CD-ROM Middelnederlands (1998). Every word has been assigned a syllabification and a lexical stress pattern. For example, proevede is syllabified as proe-ve-de and has a stress index of -3, which means that, counting from the rightmost syllable, the third syllable receives stress.

This upload contains the following files:

- The JSON file (compressed), which was used as input data for a machine learning algorithm trained for the automatic syllabification and stress assignment of Middle Dutch polysyllabic words (for the code of this experiment, see GitHub)
- An Excel file, containing the same data as the JSON (for more convenient reference)
- A split file (compressed), used in the training process of the above-mentioned experiment
- A PDF file with some insightful illustrations about the contents of the dataset

This dataset is part of the research of Wouter Haverals (FWO, University of Antwerp), carried out under the supervision of Prof. Mike Kestemont and Prof. em. Frank Willaert.
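
The stress index convention described above (a negative number counted from the rightmost syllable) can be illustrated with a short Python sketch. It is only an illustration: the function names `stressed_syllable` and `mark_stress` are hypothetical, and the sketch assumes the syllabification is available as a hyphen-separated string such as proe-ve-de. It does not read the JSON file, whose field names are not given in this record.

```python
def _split_checked(syllabified: str, stress_index: int) -> list[str]:
    """Split a hyphen-separated syllabification and validate the stress index.

    Assumption: stress_index is negative and counts from the rightmost
    syllable, so -1 is the last syllable and -3 the third from the end.
    """
    syllables = syllabified.split("-")
    if not -len(syllables) <= stress_index <= -1:
        raise ValueError(
            f"stress index {stress_index} is out of range for {syllabified!r}"
        )
    return syllables


def stressed_syllable(syllabified: str, stress_index: int) -> str:
    """Return the syllable that carries lexical stress."""
    syllables = _split_checked(syllabified, stress_index)
    # Python's negative indexing also counts from the right,
    # so the stress index can be used directly.
    return syllables[stress_index]


def mark_stress(syllabified: str, stress_index: int) -> str:
    """Return the syllabification with the stressed syllable in upper case."""
    syllables = _split_checked(syllabified, stress_index)
    syllables[stress_index] = syllables[stress_index].upper()
    return "-".join(syllables)


if __name__ == "__main__":
    print(stressed_syllable("proe-ve-de", -3))  # proe
    print(mark_stress("proe-ve-de", -3))        # PROE-ve-de
```

For the record's own example, a stress index of -3 on proe-ve-de selects the first syllable, proe, because the word has exactly three syllables.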