Accessing the Republic. Entity extraction from the resolutions of the Dutch States-General. ...

Authors: Koolen, Marijn
Renkema, Esger
Groskamp, Nienke
Smit, Frank
Reinders, Jirsi
Sluijter, Ronald
Hoekstra, Rik
Oddens, Joris
Document type: article-journal
Publication date: 2024
Publisher: Zenodo
Keywords: named entity recognition / digital history / digital source publication / political history / data curation
Language: English
Permalink: https://search.fid-benelux.de/Record/base-28980686
Data source: BASE; original catalog
Link(s): https://dx.doi.org/10.5281/zenodo.11499311

This repository contains the abstract and presentation of our paper presented at the Digital Humanities in the BeNeLux 2024 (DH Benelux 2024) conference, held 4-7 June at KU Leuven in Leuven, Belgium. In this paper, we report on our approach to extracting entities from the REPUBLIC corpus of the resolutions of the States General of the Dutch Republic, 1576-1796. We describe 1) the construction of ground truth data for different types of entities, 2) the evaluation of NER taggers based on various types of embeddings for historical Dutch, 3) our findings from curating millions of occurrences of the different entity types, and 4) how the curation gives insights into the characteristics of the corpus. ...