Annotation-preserving machine translation of English corpora to validate Dutch clinical concept extraction tools

Objective: To explore the feasibility of validating Dutch concept extraction tools using annotated corpora translated from English, focusing on preserving annotations during translation and addressing the scarcity of non-English annotated clinical corpora.
Materials and Methods: Three annotated corpora were standardized and translated from English to Dutch using 2 machine translation services, Google Translate and OpenAI GPT-4, with annotations preserved through a proposed method of embedding annotations in the text before translation. The performance of 2 concept extraction tools, MedSpaCy and...
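
The idea of embedding annotations in the text before translation can be illustrated with a minimal sketch. The abstract does not specify the markup used, so the numbered inline tags (`<e1>...</e1>`), the function names `embed_annotations` and `extract_annotations`, and the example sentence are all assumptions made for illustration. No translation service is called; the Dutch string is a hand-written stand-in for machine translation output.

```python
import re
from typing import Dict, List, Tuple

Annotation = Tuple[int, int, str]  # (start, end, label), end exclusive

# Hypothetical tag format: <eN>mention</eN>; not taken from the paper.
TAG = re.compile(r"<e(\d+)>(.*?)</e\1>", re.DOTALL)


def embed_annotations(text: str, annotations: List[Annotation]) -> Tuple[str, Dict[int, str]]:
    """Wrap each non-overlapping annotated span in a numbered inline tag."""
    parts: List[str] = []
    labels: Dict[int, str] = {}
    cursor = 0
    for idx, (start, end, label) in enumerate(sorted(annotations), start=1):
        parts.append(text[cursor:start])
        parts.append(f"<e{idx}>{text[start:end]}</e{idx}>")
        labels[idx] = label
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts), labels


def extract_annotations(tagged: str, labels: Dict[int, str]) -> Tuple[str, List[Annotation]]:
    """Strip the tags from (translated) text and recompute character offsets."""
    parts: List[str] = []
    spans: List[Annotation] = []
    cursor = 0
    length = 0  # length of the plain text emitted so far
    for match in TAG.finditer(tagged):
        before = tagged[cursor:match.start()]
        parts.append(before)
        length += len(before)
        mention = match.group(2)
        spans.append((length, length + len(mention), labels[int(match.group(1))]))
        parts.append(mention)
        length += len(mention)
        cursor = match.end()
    parts.append(tagged[cursor:])
    return "".join(parts), spans


english = "Patient reports chest pain and fever."
tagged, labels = embed_annotations(english, [(16, 26, "PROBLEM"), (31, 36, "PROBLEM")])

# Hand-written stand-in for the output of a translation service.
translated = "Patiënt meldt <e1>pijn op de borst</e1> en <e2>koorts</e2>."
dutch, dutch_spans = extract_annotations(translated, labels)
for start, end, label in dutch_spans:
    print(dutch[start:end], label)
```

A sketch like this only works if the translation service returns the tags intact and correctly paired; mentions whose tags are dropped or altered would go unrecovered and would need separate handling.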

Authors: Seinen, Tom M.
Kors, Jan A.
van Mulligen, Erik M.
Rijnbeek, Peter R.
Document type: Article
Publication date: 2024
Series/Journal: Seinen, T M, Kors, J A, van Mulligen, E M & Rijnbeek, P R 2024, 'Annotation-preserving machine translation of English corpora to validate Dutch clinical concept extraction tools', Journal of the American Medical Informatics Association, vol. 31, no. 8, ocae159, pp. 1725-1734. https://doi.org/10.1093/jamia/ocae159
Language: English
Permalink: https://search.fid-benelux.de/Record/base-29043926
Data source: BASE; original catalog
Link(s) : https://pure.eur.nl/en/publications/73b7f6d4-172e-44ce-8405-d281fd007bb6