Building a VOCabulary: the uses and challenges of thesauri for working with early modern recognized entities

Authors: Nijman, Brecht
Pepping, Kay
Document type: conferencePaper
Publication date: 2023
Publisher: Zenodo
Keywords: Dutch East India Company (VOC) / Hierarchical reference thesaurus / Unique Resource Identifiers (URIs) / Linked Open Data (LOD) / Simple Knowledge Organization System (SKOS) / Overgekomen Brieven en Papieren (OBP) / Reference datasets
Language: English
Permalink: https://search.fid-benelux.de/Record/base-28640617
Data source: BASE; original catalogue
Link(s) : https://doi.org/10.5281/zenodo.7973695

The GLOBALISE project aims to unlock the extensive archives of the Dutch East India Company (VOC) using Handwritten Text Recognition and Natural Language Processing techniques. By creating a hierarchical reference thesaurus, the project makes the VOC archives researchable and enhances search capabilities. The reference data provides unique identifiers for terms, catalogues their variations and tries to link them to existing definitions where possible. Providing such a contextually appropriate and expandable vocabulary allows researchers to explore the archive with the help of synonyms and related concepts.