Modeling Dutch Medical Texts for Detecting Functional Categories and Levels of COVID-19 Patients

Authors: Kim, J.
Verkijk, S.
Vossen, P.
Geleijn, E.
van der Leeden, M.
Meskers, C.
van der Veen, S.
Widdershoven, G.
Document type: contributionToPeriodical
Publication date: 2022
Publisher/Editor: European Language Resources Association (ELRA)
Keywords: /dk/atira/pure/sustainabledevelopmentgoals/good_health_and_well_being / name=SDG 3 - Good Health and Well-being
Language: English
Permalink: https://search.fid-benelux.de/Record/base-27462320
Data source: BASE; original catalog
Link(s): https://research.vu.nl/en/publications/418279a3-c634-4972-b77f-5a70f909bbda

Electronic Health Records contain a great deal of natural-language information that is not captured in the structured clinical data. Especially for new diseases such as COVID-19, this information is crucial for better understanding patient recovery patterns and the factors that may influence them. However, the language in these records differs markedly from standard language, so generic natural language processing tools cannot easily be applied out of the box. In this paper, we present a fine-tuned Dutch language model, developed specifically for the language of these health records, that can determine the functional level of patients according to a standard coding framework from the World Health Organization. We provide evidence that our classification performs well enough (F1-score above 80% for the main categories and error rates of less than 1 level on a 5-point Likert scale for levels) to generate patient recovery patterns that can be used to analyse factors contributing to the rehabilitation of COVID-19 patients and to predict individual patient recovery of functioning.
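
The two kinds of figures quoted in the abstract can be illustrated with a minimal sketch. It assumes that category detection is scored as a binary F1 per category and that the level error is a mean absolute error; the abstract does not state the exact error measure, so this choice is an assumption. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def f1_score(gold: Sequence[bool], pred: Sequence[bool]) -> float:
    """Binary F1 for one category: harmonic mean of precision and recall."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_absolute_error(gold: Sequence[float], pred: Sequence[float]) -> float:
    """Average distance, in scale points, between gold and predicted levels."""
    if len(gold) != len(pred) or len(gold) == 0:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(abs(g - p) for g, p in zip(gold, pred)) / len(gold)


# Toy data (hypothetical): does a sentence mention a given functional category?
gold_category = [True, True, False, True, False]
pred_category = [True, False, False, True, True]

# Toy data (hypothetical): functional levels on a 5-point scale.
gold_level = [0, 2, 4, 3]
pred_level = [1, 2, 3, 3]

category_f1 = f1_score(gold_category, pred_category)
level_mae = mean_absolute_error(gold_level, pred_level)
print(f"F1 = {category_f1:.3f}, MAE = {level_mae:.2f} levels")
```

Under this reading, an "F1-score above 80%" would correspond to `category_f1 > 0.8`, and "less than 1 level" to `level_mae < 1.0`.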