Modeling Dutch Medical Texts for Detecting Functional Categories and Levels of COVID-19 Patients
Author: | |
---|---|
Document type: | contributionToPeriodical |
Publication date: | 2022 |
Publisher: | European Language Resources Association (ELRA) |
Keywords: | SDG 3 - Good Health and Well-being |
Language: | English |
Permalink: | https://search.fid-benelux.de/Record/base-27075688 |
Data source: | BASE; original catalog |
Link(s) : | https://research.vu.nl/en/publications/418279a3-c634-4972-b77f-5a70f909bbda |
Electronic Health Records contain a great deal of natural-language information that is not captured in the structured clinical data. Especially for new diseases such as COVID-19, this information is crucial for understanding patient recovery patterns and the factors that may influence them. However, the language in these records differs substantially from standard language, and generic natural language processing tools cannot easily be applied out of the box. In this paper, we present a fine-tuned Dutch language model, developed specifically for the language of these health records, that determines the functional level of patients according to a standard coding framework from the World Health Organization. We provide evidence that our classification performs well enough (F1-score above 80% for the main categories, and errors of less than 1 level on a 5-point Likert scale for levels) to generate patient recovery patterns. These patterns can be used to analyse factors that contribute to the rehabilitation of COVID-19 patients and to predict individual patients' recovery of functioning.
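The two figures quoted in the abstract correspond to two common kinds of metric: an F1-score for deciding whether a category applies, and an error measured in scale points for the predicted level. The sketch below shows how such figures could be computed. It is not the authors' evaluation code: the helper names and the toy data are invented for illustration, and reading "error of less than 1 level" as a mean absolute error is an assumption.

```python
from typing import Sequence


def f1_score(gold: Sequence[bool], pred: Sequence[bool]) -> float:
    """F1 for one binary category: harmonic mean of precision and recall."""
    tp = sum(g and p for g, p in zip(gold, pred))
    fp = sum((not g) and p for g, p in zip(gold, pred))
    fn = sum(g and (not p) for g, p in zip(gold, pred))
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_absolute_error(gold: Sequence[float], pred: Sequence[float]) -> float:
    """Average distance, in scale points, between predicted and gold levels."""
    return sum(abs(g - p) for g, p in zip(gold, pred)) / len(gold)


# Toy data, invented for illustration only (not taken from the paper).
# Each entry says whether a hypothetical category applies to a sentence.
gold_has_category = [True, True, False, True, False]
pred_has_category = [True, False, False, True, True]

# Hypothetical functional levels on a 5-point scale.
gold_levels = [4, 2, 5, 3]
pred_levels = [4, 3, 4, 3]

category_f1 = f1_score(gold_has_category, pred_has_category)
level_mae = mean_absolute_error(gold_levels, pred_levels)

print(f"F1 for the category: {category_f1:.2f}")
print(f"Mean absolute error for levels: {level_mae:.2f}")
```

On real data, the same two numbers would be computed per category over a held-out annotated test set rather than over a handful of hand-written values.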