A Cross-Language Approach to Historic Document Retrieval ...

Authors: Kamps, Jaap
Koolen, Marijn
Adriaans, Frans
de Rijke, Maarten
Document type: Journal article
Publication date: 2007
Publisher/Editor: Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Keywords: Historic Documents / Information Retrieval / Spelling Variation / Modernizing Spelling / 17th Century Dutch
Language: English
Permalink: https://search.fid-benelux.de/Record/base-29396171
Data source: BASE; original catalog
Link(s): https://dx.doi.org/10.4230/dagsemproc.06491.3

Our cultural heritage, as preserved in libraries, archives and museums, is made up of documents written many centuries ago. Large-scale digitization initiatives, like DigiCULT, make these documents available to non-expert users through digital libraries and vertical search engines. For a user, however, querying a historic document collection may be a disappointing experience. Natural languages evolve over time: pronunciation and spelling change, new words are introduced continuously, and older words may fall out of everyday use. For these reasons, queries involving modern words may not be very effective for retrieving documents that contain many historic terms. Although reading a 300-year-old document might not be problematic because the words are still recognizable, the changes in vocabulary and spelling can make it difficult to use a search engine to find relevant documents. To illustrate this, consider the following example from our collection of 17th-century Dutch law texts. Looking for ...