DRR: Global Context-Aware Neural Network Using Disease Relationship Reasoning and Attention-Based Feature Fusion
Figure 1. Difference between previous methods, such as CGL [10] and Chet [11], and our method, DRR. CGL and Chet focus only on the relationship between high-risk diseases and diagnosed diseases; DRR also considers the relationship between low-risk diseases and diagnosed diseases.
Figure 2. An overview of the proposed DRR model. The model uses all EHRs to construct a global graph, with nodes representing diseases diagnosed in patients and edges reflecting disease co-occurrence frequencies. In the global graph, the node corresponding to the disease diagnosed in patient i at time t is termed the “diagnosed disease node”. Nodes linked to it are “high-risk nodes”, while nodes connected to high-risk nodes but not to the diagnosed disease node are “low-risk nodes”. The model extracts three types of subgraph encodings for each patient at time T, using a GCN for feature extraction. An attention mechanism rebuilds the relationships between the diagnosed disease at time T and the low-risk diseases, the high-risk diseases at time T, and the high-risk diseases at time T − 1. These features are then processed by a GRU to extract temporal features. Finally, the model integrates the temporal, high-risk, and diagnosed disease features to predict the patient’s diagnoses at time T + 1.
Figure 3. Visualization analysis. Features were extracted just before the classifier and reduced in dimensionality with t-SNE. Red dots denote the features of patients diagnosed with heart failure, and blue dots denote those of patients without the diagnosis. (a) shows features extracted without model training, where the features of diagnosed and undiagnosed patients are interwoven. In contrast, (b) shows the features after model training, with a clear separation between heart failure patients and the rest, which indicates that the model learns discriminative features.
Abstract
1. Introduction
- The primary motivation for modeling diseases as a graph is a clinical observation: if a patient has had disease A for a prolonged period, the probability that the patient later develops disease B increases significantly. It is therefore reasonable to connect currently diagnosed diseases with high-risk diseases in a graph. Deep learning models can learn the relationships between high-risk diseases and currently diagnosed diseases, which aligns with clinical practice and can assist disease prediction. However, this clinical experience frequently neglects the possibility that low-risk diseases will be diagnosed in the future [14]. Using high-risk diseases to forecast future diseases can improve model performance, but ignoring low-risk diseases creates a bottleneck. As illustrated in Figure 1, previous methods focus on the relationship between high-risk diseases and diagnosed diseases; our method considers both that relationship and the relationship between low-risk diseases and diagnosed diseases.
- Once the correlations between currently diagnosed diseases and high-risk or low-risk diseases have been represented as a graph, RNNs are frequently used to extract temporal information from a patient’s historical diagnostic records. This approach aligns with clinical experience, since it predicts disease progression from the patient’s prior medical history. However, RNNs tend to forget earlier information when processing long sequences. A model that relies only on co-occurrence connections between diseases can overlook global information and treat a disease as low-risk once it appears to have improved. Such diseases often have a significant likelihood of recurrence, so this also creates a bottleneck in the model.
- We propose the DRR model, which reconstructs the relationships among diagnosed, high-risk, and low-risk diseases, removing the bottleneck caused by existing models’ over-reliance on diagnosed and high-risk diseases.
- We mitigate the GRU’s forgetting of global features in disease prediction by fusing the features of high-risk diseases at different time steps with the features of diagnosed diseases.
2. Related Work
2.1. RNN-Type in Health Event Prediction
2.2. Graph Method in Health Event Prediction
2.3. NLP Method in Health Event Prediction
3. Method
3.1. Problem Formulation
3.2. Global Graph Definition
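The Figure 2 overview describes a global graph whose nodes are diseases and whose edges reflect co-occurrence frequencies, with nodes split into diagnosed, high-risk, and low-risk groups. The sketch below illustrates that partition on toy data. It is not the authors' implementation: the function names, the choice to count co-occurrence within a single visit, and the sample records are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_global_graph(visits):
    """Count how often each unordered pair of diseases appears in the same visit."""
    edge_weights = Counter()
    for visit in visits:
        for a, b in combinations(sorted(set(visit)), 2):
            edge_weights[(a, b)] += 1
    return edge_weights


def neighbors(edge_weights, node):
    """Return every disease that shares at least one edge with `node`."""
    linked = set()
    for a, b in edge_weights:
        if a == node:
            linked.add(b)
        elif b == node:
            linked.add(a)
    return linked


def split_nodes(edge_weights, diagnosed):
    """Partition the graph around the diseases diagnosed in one visit."""
    diagnosed = set(diagnosed)
    # High-risk: directly linked to a diagnosed disease.
    high_risk = set()
    for d in diagnosed:
        high_risk |= neighbors(edge_weights, d)
    high_risk -= diagnosed
    # Low-risk: linked to a high-risk disease but not to a diagnosed one.
    low_risk = set()
    for h in high_risk:
        low_risk |= neighbors(edge_weights, h)
    low_risk -= diagnosed | high_risk
    return diagnosed, high_risk, low_risk


# Hypothetical toy records: each inner list is one visit's diagnosis codes.
all_visits = [["A", "B"], ["B", "C"], ["C", "D"], ["A", "B"]]
graph = build_global_graph(all_visits)
diagnosed, high_risk, low_risk = split_nodes(graph, ["A"])
```

In this toy graph, B is high-risk for a patient diagnosed with A, C is low-risk, and D is more than two hops away and falls into neither group.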
3.3. Disease Relationship Reasoning Module
3.4. Global Graph-Based Feature Fusion Module
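According to the Figure 2 overview, an attention mechanism relates the diagnosed-disease features at time T to the low-risk features, the high-risk features at time T, and the high-risk features at time T − 1. The paper's exact formulation is not reproduced here. As a rough illustration only, the sketch below uses generic scaled dot-product attention, with the diagnosed-disease vector as the query; the function names and the 2-dimensional toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(query, candidates):
    """Fuse candidate vectors, weighted by scaled dot-product similarity to `query`."""
    scale = math.sqrt(len(query))
    scores = [
        sum(q * c for q, c in zip(query, cand)) / scale for cand in candidates
    ]
    weights = softmax(scores)
    fused = [
        sum(w * cand[i] for w, cand in zip(weights, candidates))
        for i in range(len(query))
    ]
    return fused, weights


# Hypothetical 2-d features standing in for GCN subgraph encodings.
diagnosed_T = [1.0, 0.0]
low_risk_T = [0.0, 1.0]
high_risk_T = [1.0, 0.0]
high_risk_prev = [0.5, 0.5]

fused, weights = attention_fuse(
    diagnosed_T, [low_risk_T, high_risk_T, high_risk_prev]
)
```

The weights sum to one, and the candidate most similar to the query (here the high-risk features at time T) receives the largest share of the fused vector.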
4. Experiments
4.1. Experimental Setups
- Disease prediction. This task predicts all diseases that a patient may be diagnosed with at time T + 1, based on the patient’s previous T confirmed disease records. It is a multi-label classification task.
- Heart failure prediction. This task predicts whether a patient will be diagnosed with heart failure at time T + 1, based on the patient’s previous T confirmed disease records. It is a binary classification task.
- Common disease prediction. We collected data on several common diseases diagnosed in the MIMIC-IV dataset, including hypertension, diabetes, and others. This task predicts whether a patient will be diagnosed with each of these common diseases at time T + 1, based on the patient’s previous T confirmed disease records. It is a binary classification task.
- CNN-based model: Deepr [9].
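The prediction tasks above differ only in how the target is built from a patient's visit sequence. A minimal sketch follows, assuming the target is the visit that comes after the T observed ones (time T + 1 in the model overview). The helper names, the integer code indices, and the sample patient are hypothetical and not taken from the paper or from MIMIC.

```python
def make_multilabel_sample(visits, num_codes):
    """Split T + 1 visits into T input visits and a multi-hot target vector."""
    history, target_visit = visits[:-1], visits[-1]
    target = [0] * num_codes
    for code in target_visit:
        target[code] = 1
    return history, target


def make_binary_sample(visits, positive_codes):
    """Same split, but the target is 1 if any code of interest appears at T + 1."""
    history, target_visit = visits[:-1], visits[-1]
    label = int(any(code in positive_codes for code in target_visit))
    return history, label


# Hypothetical patient with three visits; codes are integer indices 0..4.
patient = [[0, 2], [2, 3], [1, 4]]
history, target = make_multilabel_sample(patient, num_codes=5)
_, hf_label = make_binary_sample(patient, positive_codes={4})
```

Here the first two visits form the input history, the multi-label target marks codes 1 and 4, and the binary label is positive because code 4 (standing in for the disease of interest) appears in the final visit.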
4.2. Comparative Experiments
4.3. Ablation Study
4.4. Visualization Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Liang, H.; Tsui, B.Y.; Ni, H.; Valentim, C.C.; Baxter, S.L.; Liu, G.; Cai, W.; Kermany, D.S.; Sun, X.; Chen, J.; et al. Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence. Nat. Med. 2019, 25, 433–438.
- Henry, J.; Pylypchuk, Y.; Searcy, T.; Patel, V. Adoption of electronic health record systems among US non-federal acute care hospitals: 2008–2015. ONC Data Brief 2016, 35, 2008–2015.
- Meystre, S.M.; Savova, G.K.; Kipper-Schuler, K.C.; Hurdle, J.F. Extracting information from textual documents in the electronic health record: A review of recent research. Yearb. Med. Inform. 2008, 17, 128–144.
- Yang, J.; Lian, J.W.; Chin, Y.P.H.; Wang, L.; Lian, A.; Murphy, G.F.; Zhou, L. Assessing the prognostic significance of tumor-infiltrating lymphocytes in patients with melanoma using pathologic features identified by natural language processing. JAMA Netw. Open 2021, 4, e2126337.
- Yang, X.; Chen, A.; PourNejatian, N.; Shin, H.C.; Smith, K.E.; Parisien, C.; Compas, C.; Martin, C.; Costa, A.B.; Flores, M.G.; et al. A large language model for electronic health records. NPJ Digit. Med. 2022, 5, 194.
- Patel, R.; Wee, S.N.; Ramaswamy, R.; Thadani, S.; Guruswamy, G.; Garg, R.; Calvanese, N.; Valko, M.; Rush, A.; Rentería, M.; et al. NeuroBlu: A natural language processing (NLP) electronic health record (EHR) data analytic tool to generate real-world evidence in mental healthcare. Eur. Psychiatry 2022, 65, S99–S100.
- Choi, E.; Bahadori, M.T.; Sun, J.; Kulas, J.; Schuetz, A.; Stewart, W. Retain: An interpretable predictive model for healthcare using reverse time attention mechanism. In Proceedings of the 30th Annual Conference on Neural Information Processing Systems, NIPS 2016, Barcelona, Spain, 5–10 December 2016.
- Ma, F.; Chitta, R.; Zhou, J.; You, Q.; Sun, T.; Gao, J. Dipole: Diagnosis prediction in healthcare via attention-based bidirectional recurrent neural networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 1903–1911.
- Wickramasinghe, N. A convolutional net for medical records. IEEE J. Biomed. Health Inform. 2017, 21, 22–30.
- Lu, C.; Reddy, C.K.; Chakraborty, P.; Kleinberg, S.; Ning, Y. Collaborative graph learning with auxiliary text for temporal event prediction in healthcare. arXiv 2021, arXiv:2105.07542.
- Lu, C.; Han, T.; Ning, Y. Context-aware health event prediction via transition functions on dynamic disease graphs. Proc. AAAI Conf. Artif. Intell. 2022, 36, 4567–4574.
- Shang, J.; Ma, T.; Xiao, C.; Sun, J. Pre-training of graph augmented transformers for medication recommendation. arXiv 2019, arXiv:1906.00346.
- Choi, E.; Bahadori, M.T.; Song, L.; Stewart, W.F.; Sun, J. GRAM: Graph-based attention model for healthcare representation learning. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 787–795.
- Schiff, G.D.; Volodarskaya, M.; Ruan, E.; Lim, A.; Wright, A.; Singh, H.; Nieva, H.R. Characteristics of disease-specific and generic diagnostic pitfalls: A qualitative study. JAMA Netw. Open 2022, 5, e2144531.
- Johnson, A.E.; Pollard, T.J.; Shen, L.; Lehman, L.W.H.; Feng, M.; Ghassemi, M.; Moody, B.; Szolovits, P.; Anthony Celi, L.; Mark, R.G. MIMIC-III, a freely accessible critical care database. Sci. Data 2016, 3, 160035.
- Johnson, A.E.W.; Bulgarelli, L.; Shen, L.; Gayles, A.; Shammout, A.; Horng, S.; Pollard, T.J.; Moody, B.; Gow, B.; Lehman, L.-W.H. MIMIC-IV, a freely accessible electronic health record dataset. Sci. Data 2023, 10, 1.
- Symeonidis, P.; Kostoulas, T.; Danilatou, V.; Andras, C.; Chairistanidis, S. Mortality Prediction and Safe Drug Recommendation for Critically-ill Patients. In Proceedings of the 2022 IEEE 22nd International Conference on Bioinformatics and Bioengineering (BIBE), Taichung, Taiwan, 7–9 November 2022; pp. 79–84.
- Li, Y.; Chen, C.; Duan, M.; Zeng, Z.; Li, K. Attention-aware encoder–decoder neural networks for heterogeneous graphs of things. IEEE Trans. Ind. Inform. 2020, 17, 2890–2898.
- Zou, X.; Li, K.; Chen, C. Multilevel attention based u-shape graph neural network for point clouds learning. IEEE Trans. Ind. Inform. 2020, 18, 448–456.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
- Xu, B.; Wang, N.; Chen, T.; Li, M. Empirical evaluation of rectified activations in convolutional network. arXiv 2015, arXiv:1505.00853.
- Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
- Bai, T.; Zhang, S.; Egleston, B.L.; Vucetic, S. Interpretable representation learning for healthcare via capturing disease progression through time. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 43–51.
- Choi, E.; Bahadori, M.T.; Schuetz, A.; Stewart, W.F.; Sun, J. Doctor ai: Predicting clinical events via recurrent neural networks. In Proceedings of the Machine Learning for Healthcare Conference, PMLR, Los Angeles, CA, USA, 19–20 August 2016; pp. 301–318.
- Luo, J.; Ye, M.; Xiao, C.; Ma, F. Hitanet: Hierarchical time-aware attention networks for risk prediction on electronic health records. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual Event, CA, USA, 6–10 July 2020; pp. 647–656.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
- Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
Diagnosis Prediction | MIMIC-III | | | |
---|---|---|---|---|
Models | w-F1 (%) | (%) | (%) | Params (M) |
RETAIN | 20.69 | 26.13 | 35.08 | 2.90 |
Deepr | 18.87 | 24.74 | 33.47 | 1.16 |
GRAM | 21.52 | 26.51 | 35.80 | 1.59 |
Dipole | 19.35 | 24.98 | 34.02 | 2.18 |
Timeline | 20.46 | 25.75 | 34.83 | 1.23 |
G-BERT | 19.88 | 25.86 | 35.31 | 6.15 |
HiTANet | 21.15 | 26.02 | 35.97 | 3.33 |
CGL | 21.92 | 26.64 | 36.72 | 1.5 |
Chet | 22.63 | 28.64 | 37.87 | 2.12 |
DRR | 24.69 | 28.31 | 37.43 | 2.34 |
Diagnosis Prediction | MIMIC-IV | | | |
---|---|---|---|---|
Models | w-F1 (%) | (%) | (%) | Params (M) |
RETAIN | 24.71 | 28.02 | 34.46 | 3.56 |
Deepr | 24.08 | 26.29 | 33.93 | 1.44 |
GRAM | 23.50 | 27.29 | 36.36 | 1.67 |
Dipole | 23.69 | 27.38 | 35.58 | 2.51 |
Timeline | 25.26 | 29.00 | 37.13 | 1.52 |
G-BERT | 24.49 | 27.16 | 35.86 | 7.53 |
HiTANet | 24.92 | 27.45 | 36.37 | 3.93 |
CGL | 25.41 | 28.52 | 37.15 | 1.83 |
Chet | 26.35 | 30.28 | 38.69 | 2.59 |
DRR | 29.30 | 30.73 | 39.65 | 2.32 |
Heart Failure | MIMIC-III | | | MIMIC-IV | | |
---|---|---|---|---|---|---|
Models | (%) | (%) | Params (M) | (%) | (%) | Params (M) |
RETAIN | 83.21 | 71.32 | 1.67 | 89.02 | 67.38 | 1.99 |
Deepr | 81.36 | 69.54 | 0.53 | 88.43 | 61.36 | 0.65 |
GRAM | 83.55 | 71.78 | 0.96 | 89.61 | 68.94 | 0.88 |
Dipole | 82.08 | 70.35 | 1.41 | 88.69 | 68.94 | 0.88 |
Timeline | 83.34 | 71.03 | 0.95 | 87.53 | 66.07 | 0.73 |
G-BERT | 81.50 | 71.18 | 3.58 | 87.26 | 68.04 | 3.95 |
HiTANet | 82.77 | 71.93 | 2.08 | 88.10 | 68.21 | 3.95 |
CGL | 84.19 | 71.77 | 0.55 | 89.05 | 69.36 | 0.60 |
Chet | 86.14 | 73.08 | 0.68 | 90.83 | 74.14 | 0.88 |
DRR | 86.33 | 72.35 | 0.85 | 94.30 | 81.57 | 1.00 |
Disease Prediction | Chet | | DRR | |
---|---|---|---|---|
Disease Name | (%) | (%) | (%) | (%) |
Diabetes | 83.98 | 74.55 | 95.13 | 87.15 |
Heart Attack | 91.13 | 61.94 | 94.11 | 63.58 |
Hypertension | 84.32 | 75.22 | 87.52 | 77.22 |
Cardiac Arrhythmia | 85.34 | 32.43 | 90.03 | 79.37 |
Model Name | w-F1 (%) |
---|---|
| | 26.35 |
| | 28.95 |
| | 28.43 |
DRR | 29.30 |
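The weighted metric reported in the table above is taken here to be a support-weighted F1 score, which is an assumption about the column. Under that reading, one common way to compute it for multi-label predictions is sketched below: each label's F1 is weighted by how many true instances of that label exist. The function name and the toy labels are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-label F1 over rows of multi-hot labels."""
    num_labels = len(y_true[0])
    total_support = 0
    weighted_sum = 0.0
    for j in range(num_labels):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        support = tp + fn  # number of true instances of label j
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        weighted_sum += support * f1
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


# Hypothetical toy labels: 3 patients, 3 disease codes.
y_true = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [1, 0, 0], [0, 0, 1]]
score = weighted_f1(y_true, y_pred)
```

Weighting by support means frequent diseases dominate the score, so a model that misses only rare codes is penalized less than under an unweighted (macro) average.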
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ding, Z.; Li, Z.; Li, X.; Li, H. DRR: Global Context-Aware Neural Network Using Disease Relationship Reasoning and Attention-Based Feature Fusion. Mathematics 2024, 12, 488. https://doi.org/10.3390/math12030488