Automatic Assignment of Radiology Examination Protocols Using Pre-trained Language Models with Knowledge Distillation
Abstract
Selecting a radiology examination protocol is a repetitive and time-consuming process. In this paper, we present a deep learning approach that automatically assigns protocols to computed tomography examinations by pre-training a domain-specific BERT model ($BERT_{rad}$). To handle the high data imbalance across exam protocols, we used a knowledge distillation approach that up-sampled the minority classes through data augmentation. We compared the classification performance of this approach with statistical n-gram models using Support Vector Machine (SVM), Gradient Boosting Machine (GBM), and Random Forest (RF) classifiers, as well as Google's $BERT_{base}$ model. SVM, GBM, and RF achieved macro-averaged F1 scores of 0.45, 0.45, and 0.6, respectively, while $BERT_{base}$ and $BERT_{rad}$ achieved 0.61 and 0.63. Knowledge distillation improved overall performance on the minority classes, achieving an F1 score of 0.66.
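The macro-averaged F1 scores reported above weight every protocol class equally, which is why they are sensitive to performance on minority classes. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the protocol labels `"head"` and `"chest"` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("macro_f1 needs at least one label")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced example: a classifier that always predicts the
# majority protocol is right 8 times out of 10 but never finds the minority class.
y_true = ["head"] * 8 + ["chest"] * 2
y_pred = ["head"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
```

In this toy case the accuracy is high while the macro-averaged F1 is much lower, because the missed minority class contributes an F1 of zero to the average. This is the kind of gap that up-sampling minority classes is meant to narrow.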
- Publication: arXiv e-prints
- Pub Date: September 2020
- DOI: 10.48550/arXiv.2009.00694
- arXiv: arXiv:2009.00694
- Bibcode: 2020arXiv200900694L
- Keywords: Computer Science - Computation and Language
- E-Print: accepted at American Medical Informatics Association symposium 2021