NIOD_WarLet_1945-1950_NoBasemodel: A public Transkribus HTR model for mid-twentieth-century handwritten Dutch

Authors: Lange, van, Milan
Nispen, van, Annelies
Keijzer, Carlijn
Bouman, Muriël
Document type: other
Publication date: 2023
Publisher: Transkribus
Keywords: Handwritten text recognition / HTR / Transkribus / automatic transcription / historical documents / egodocuments / war letters / computer model
Language: Dutch
Permalink: https://search.fid-benelux.de/Record/base-28995463
Data source: BASE; original catalog
Link(s) : https://pure.knaw.nl/portal/en/publications/819d39e5-a49c-434e-bd84-cd6df5eab58e

The HTR model ‘NIOD_WarLet_1935-1950_NoBasemodel’ was trained on 968 ‘Ground Truth’ transcriptions of high-resolution scans of handwritten letters. The letters are all written in Dutch and date from the period 1935-1950. The training set contains personal correspondence from a wide variety of letter writers (e.g., children, soldiers, Jewish people in hiding). All of this correspondence belongs to the archival collection known as ‘247 Correspondentie’, held by the NIOD Institute for War, Holocaust, and Genocide Studies in Amsterdam.

The model was created as part of the project ‘First-Hand Accounts of War: War letters (1935-1950) from NIOD digitised’. All documents used for training and validation were scanned and transcribed within this project, which ran from 2020 to 2023 and was funded by the Mondriaan Fund, the Dutch Ministry of Health, Welfare, and Sport, and the NIOD Institute for War, Holocaust, and Genocide Studies in Amsterdam. The ‘Ground Truth’ training set was created by project members Annelies van Nispen, Carlijn Keijzer, and Milan van Lange. Additional transcription and correction of ‘Ground Truth’ transcriptions was performed, under the supervision of Muriël Bouman, by citizen scientists Hillebrand Verkroost, Bart Cohen, Evelien Bachrach, Marjo Janssens, and Cocky Sietses.

The validation set contains a sample of 17 ‘Ground Truth’ transcriptions from various writers and sub-collections. The model was trained with PyLaia HTR, with a maximum of 500 epochs (of which 321 were trained) and a learning rate of 0.0003. The character error rate (CER) on the validation set is 5.40%. No base model was used.
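
To make the reported CER easier to interpret: under the usual definition, the character error rate is the number of character-level edits (insertions, deletions, substitutions) needed to turn the model output into the ‘Ground Truth’, divided by the number of characters in the ‘Ground Truth’. A CER of 5.40% then corresponds to roughly five character errors per hundred characters. The sketch below illustrates that definition only; it assumes plain Levenshtein distance pooled over all lines, and the exact computation used by Transkribus (for example, its text normalisation) may differ. The function names and the sample lines are hypothetical and are not taken from the NIOD collection.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character edits turning hyp into ref."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # character of ref missing from hyp
                curr[j - 1] + 1,     # extra character in hyp
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Pooled character error rate: total edits / total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


# Invented example lines (assumption: not real letters from the collection).
ground_truth = ["Lieve moeder,", "Hoe gaat het met u?"]
model_output = ["Lieve moedor,", "Hoe gaat het met u"]

# One substitution plus one missing character over 32 reference characters.
print(f"CER: {cer(ground_truth, model_output):.2%}")
```

Pooling edits over all lines, rather than averaging per-line rates, keeps short lines from dominating the score; this is a design choice of the sketch, not a documented property of the model's evaluation.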