The Early Modern Dutch Mediascape. Detecting Media Mentions in Chronicles Using Word Embeddings and CRF

Authors: Lassche, Alie; Morante, Roser
Document type: contributionToPeriodical
Publication date: 2021
Publisher: Association for Computational Linguistics (ACL)
Keywords: SDG 4 - Quality Education
Language: English
Permalink: https://search.fid-benelux.de/Record/base-29045443
Data source: BASE; original catalog
Link(s): https://research.vu.nl/en/publications/4f9b98fa-15d1-460f-97bf-14ff921bd0de

While the production of information in the European early modern period is a well-researched topic, the question of how people engaged with the information explosion that occurred in early modern Europe remains underexplored. This paper presents the annotations and experiments aimed at exploring whether we can automatically extract media-related information (source, perception, and receiver) from a corpus of early modern Dutch chronicles in order to gain insight into the mediascape of early modern middle-class people from a historical perspective. In a number of classification experiments with Conditional Random Fields (CRF), three categories of features are tested: (i) raw and binary word embedding features, (ii) lexicon features, and (iii) character features. Overall, the classifier that uses raw embeddings performs slightly better than the others. However, given that the best F-scores are around 0.60, we conclude that the machine learning approach needs to be combined with a close reading approach for the results to be useful in answering historical research questions.
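
To make the three feature categories more concrete, the following is a minimal sketch of how per-token features of each kind might be assembled before being passed to a CRF. It is not the authors' implementation: the lexicon entries, the three-dimensional toy embeddings, the feature names, and the thresholding rule for "binary" embeddings (value greater than zero) are all assumptions made for illustration. The output, one feature dictionary per token, is the general shape that common CRF toolkits accept.

```python
from typing import Dict, List, Sequence, Set

# Hypothetical media lexicon; the lexicon used in the paper is not given here.
MEDIA_LEXICON: Set[str] = {"courant", "brief", "tijding", "gazette"}

# Toy 3-dimensional embeddings, invented purely for illustration.
TOY_EMBEDDINGS: Dict[str, Sequence[float]] = {
    "courant": (0.8, -0.1, 0.3),
    "brief": (0.5, 0.2, -0.4),
}


def token_features(tokens: List[str], i: int, binary: bool = False) -> Dict[str, float]:
    """Build a feature dictionary for the token at position i."""
    word = tokens[i]
    lower = word.lower()
    feats: Dict[str, float] = {"bias": 1.0}

    # (iii) Character features: simple surface cues of the token.
    feats["is_capitalized"] = float(word[:1].isupper())
    feats["has_digit"] = float(any(c.isdigit() for c in word))
    feats["suffix3=" + lower[-3:]] = 1.0

    # (ii) Lexicon feature: is the token a known media term?
    feats["in_media_lexicon"] = float(lower in MEDIA_LEXICON)

    # (i) Embedding features: one feature per dimension, either the raw
    # value or a binarized version (assumed rule: 1.0 if value > 0).
    for d, value in enumerate(TOY_EMBEDDINGS.get(lower, ())):
        feats[f"emb_{d}"] = float(value > 0) if binary else float(value)

    # Minimal context: lexicon membership of the preceding token.
    if i > 0:
        feats["prev_in_media_lexicon"] = float(tokens[i - 1].lower() in MEDIA_LEXICON)
    return feats


def sentence_features(tokens: List[str], binary: bool = False) -> List[Dict[str, float]]:
    """Feature dictionaries for every token in a sentence."""
    return [token_features(tokens, i, binary) for i in range(len(tokens))]


if __name__ == "__main__":
    example = ["In", "de", "Courant", "van", "1672"]
    for token, feats in zip(example, sentence_features(example)):
        print(token, feats)
```

Switching `binary=True` replaces each raw embedding value with a 0/1 indicator, which mirrors the raw-versus-binary contrast in the experiments only in spirit; the actual binarization scheme would need to be taken from the paper itself.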