Extracting patient lifestyle characteristics from Dutch clinical text with BERT models ...

Authors: Muizelaar, Hielke
Haas, Marcel
van Dortmont, Koert
van der Putten, Peter
Spruit, Marco
Document type: Data source
Publication date: 2024
Publisher: figshare
Keywords: Medicine / Biotechnology / Sociology / FOS: Sociology / Biological Sciences not elsewhere classified / Information Systems not elsewhere classified / Cancer / Mental Health
Language: unknown
Permalink: https://search.fid-benelux.de/Record/base-28984092
Data source: BASE; original catalog
Link(s) : https://dx.doi.org/10.6084/m9.figshare.c.7267427.v1

Abstract

Background: BERT models have seen widespread use on unstructured text within the clinical domain. However, little to no research has been conducted into classifying unstructured clinical notes on the basis of patient lifestyle indicators, especially in Dutch. This article aims to test the feasibility of deep BERT models on the task of patient lifestyle classification and to introduce an experimental framework that is easily reproducible in future research.

Methods: This study uses unstructured general patient text data from HagaZiekenhuis, a large hospital in The Netherlands. Over 148 000 notes were provided to us, each automatically labelled on the basis of the respective patients’ smoking, alcohol usage and drug usage statuses. In this paper we test the feasibility of automatically assigning labels, and justify it using hand-labelled input. Ultimately, we compare macro F1-scores of string matching, SGD and several BERT models on the task of classifying smoking, alcohol and drug ...