Natural Language Processing in Dutch Free Text Radiology Reports: Challenges in a Small Language Area Staging Pulmonary Oncology

Authors: Nobel, J. Martijn
Puts, Sander
Bakers, Frans C. H.
Robben, Simon G. F.
Dekker, Andre L. A. J.
Document type: Article
Publication date: 2020
Series/Journal: Nobel, J. M., Puts, S., Bakers, F. C. H., Robben, S. G. F. & Dekker, A. L. A. J. 2020, 'Natural Language Processing in Dutch Free Text Radiology Reports: Challenges in a Small Language Area Staging Pulmonary Oncology', Journal of Digital Imaging, vol. 33, no. 4, pp. 1002-1008. https://doi.org/10.1007/s10278-020-00327-z
Keywords: Radiology / Reporting / Natural language processing / Free text / Classification system / Machine learning / Classification
Language: English
Permalink: https://search.fid-benelux.de/Record/base-29021213
Data source: BASE; original catalog
Link(s) : https://cris.maastrichtuniversity.nl/en/publications/86b646d7-60cf-401b-b7fd-bd7c93514845

Reports are the standard means of communication between the radiologist and the referring clinician. Efforts are being made to improve this communication, for instance by introducing standardization and structured reporting. Natural Language Processing (NLP) is another promising tool that can improve and enhance the radiological report by processing free text. NLP adds structure to the report and exposes its information, which can in turn be used for further analysis. This paper describes pre-processing and processing steps and highlights important challenges to overcome in order to successfully implement a free text mining algorithm using NLP tools and machine learning in a small language area, such as Dutch. A rule-based algorithm was constructed to classify the T-stage of pulmonary oncology from the original free text radiological report, based on the items tumor size, presence and involvement according to the 8th TNM classification system. PyContextNLP, spaCy and regular expressions were used as tools to extract the correct information and process the free text. Overall accuracy of the algorithm for evaluating T-stage was 0.83 in the training set and 0.87 in the validation set, which shows that the approach in this pilot study is promising. Future research with larger datasets and external validation is needed to be able to introduce more machine learning approaches and perhaps to reduce the required input of domain-specific knowledge. However, a hybrid NLP approach will probably achieve the best results.
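
As an illustration of the rule-based idea described in the abstract, the following is a minimal sketch of size-based T-staging with regular expressions. It is not the authors' algorithm: the pattern, the function names and the Dutch example sentence are hypothetical, only tumor size is considered (presence and involvement are ignored), sub-stages such as T1a or T2b are not distinguished, and negation or context handling of the kind PyContextNLP provides is left out. The size thresholds follow the main categories of the 8th TNM edition for lung cancer (up to 3 cm, up to 5 cm, up to 7 cm, larger than 7 cm).

```python
import re
from typing import List, Optional

# Hypothetical pattern: a number with a decimal comma or point, followed by mm or cm.
SIZE_PATTERN = re.compile(r"(\d+(?:[.,]\d+)?)\s*(mm|cm)\b", re.IGNORECASE)


def extract_sizes_cm(report: str) -> List[float]:
    """Return every size mentioned in the report, converted to centimeters."""
    sizes = []
    for value, unit in SIZE_PATTERN.findall(report):
        number = float(value.replace(",", "."))  # Dutch reports use a decimal comma
        sizes.append(number / 10 if unit.lower() == "mm" else number)
    return sizes


def t_stage_from_size(size_cm: float) -> str:
    """Map the greatest tumor dimension to a main T category (size criterion only)."""
    if size_cm <= 3:
        return "T1"
    if size_cm <= 5:
        return "T2"
    if size_cm <= 7:
        return "T3"
    return "T4"


def classify_report(report: str) -> Optional[str]:
    """Simplifying assumption: the largest size in the text is the tumor size."""
    sizes = extract_sizes_cm(report)
    if not sizes:
        return None  # no measurement found, so no size-based stage
    return t_stage_from_size(max(sizes))


if __name__ == "__main__":
    example = "Tumor in de rechter bovenkwab van 4,2 cm."
    print(classify_report(example))
```

Taking the largest measurement is a deliberate simplification; a real report may also contain lymph node or cyst measurements, which is one reason the abstract points to context-aware tools and domain-specific knowledge.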