The Datafication of Early Modern Ordinances

The Entangled Histories project worked with early modern printed normative texts. Software previously had significant difficulty reading Dutch Gothic print, which is used in the vast majority of the sources. Using the Handwritten Text Recognition suite Transkribus (v.1.07-v.1.10), we reprocessed the original scans that had poor-quality OCR and obtained a Character Error Rate (CER) well below our initial expectation of <5%. This result is a significant improvement that makes it possible to search 75,000 pages of printed normative texts from the seventeen provinces, also known a...
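
For readers unfamiliar with the metric, the Character Error Rate is conventionally the character-level edit distance between a recognised text and its ground-truth transcription, divided by the length of the ground truth. The following is a minimal sketch of that standard definition, not the evaluation code used by Transkribus or by the authors; the function names and the sample strings are illustrative assumptions.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of character insertions,
    deletions and substitutions turning `hypothesis` into `reference`."""
    # previous[j] holds the distance between the reference prefix
    # processed so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # character missing from hypothesis
                current[j - 1] + 1,      # extra character in hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: one misread character in an 11-character word.
    ground_truth = "ordonnantie"
    recognised = "ordonnautie"
    print(f"CER: {character_error_rate(ground_truth, recognised):.2%}")
```

Under this definition, a CER below 5% means fewer than one character edit per twenty ground-truth characters on average.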

Authors: Romein, C.A.
de Gruijter, Michel
Veldhoen, Sara Floor
Document type: Article
Publication date: 2020
Series/Periodical: Romein, C A, de Gruijter, M & Veldhoen, S F 2020, 'The Datafication of Early Modern Ordinances', DH Benelux Journal, vol. 2. <http://journal.dhbenelux.org/journal/issues/002/article-23-romein/article-23-romein.pdf>
Keywords: Early Modern Printed Ordinances / Text recognition / Text segmentation / Dutch Gothic Print / Transkribus / Annif / Machine Learning / Categorisation
Language: English
Permalink: https://search.fid-benelux.de/Record/base-26635359
Data source: BASE; original catalogue
Link(s): https://pure.knaw.nl/portal/en/publications/97a53282-c1e9-448c-8565-bb28c8ef27fe