A System for Medical Information Extraction and Verification from Unstructured Text

Authors

Damir Juric, Babylon Health
Giorgos Stoilos, Huawei Technologies
Andre Melo, Huawei Technologies
Jonathan Moore, Babylon Health
Mohammad Khodadadi, Babylon Health

DOI:

https://doi.org/10.1609/aaai.v34i08.7042

Abstract

A wealth of medical knowledge has been encoded in terminologies like SNOMED CT, NCI, FMA, and more. However, these resources usually lack information such as relations between diseases, symptoms, and risk factors, which prevents their use in diagnostic or other decision-making applications. In this paper we present a pipeline for extracting such information from unstructured text and enriching medical knowledge bases. Our approach uses Semantic Role Labelling (SRL) and is unsupervised. We show how we dealt with several deficiencies of SRL-based extraction, like copula verbs, relations expressed through nouns, and assigning scores to extracted triples. The system has so far extracted about 120K relations, and in-house doctors have verified about 5K relationships. We compared the output of the system with a manually constructed network of diseases, symptoms, and risk factors built by doctors over the course of a year. Our results show that our pipeline extracts good-quality, precise relations and speeds up the knowledge acquisition process considerably.

Published

2020-04-03

How to Cite

Juric, D., Stoilos, G., Melo, A., Moore, J., & Khodadadi, M. (2020). A System for Medical Information Extraction and Verification from Unstructured Text. Proceedings of the AAAI Conference on Artificial Intelligence, 34(08), 13314-13319. https://doi.org/10.1609/aaai.v34i08.7042

Issue

Vol. 34 No. 08: AAAI-20 / IAAI-20 Technical Tracks

Section

IAAI Technical Track: Emerging Papers