CN114098638B - Interpretable dynamic disease severity prediction method - Google Patents
- Publication number
- CN114098638B (application CN202111338917.9A)
- Authority
- CN
- China
- Prior art keywords
- sofa
- patient
- drug
- sofa score
- medicine
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7271—Specific aspects of physiological measurement analysis
- A61B5/7275—Determining trends in physiological measurement data; Predicting development of a medical condition based on physiological measurements, e.g. determining a risk factor
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/48—Other medical applications
- A61B5/4848—Monitoring or testing the effects of treatment, e.g. of medication
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Heart & Thoracic Surgery (AREA)
- Medical Informatics (AREA)
- Molecular Biology (AREA)
- Physics & Mathematics (AREA)
- Animal Behavior & Ethology (AREA)
- Pathology (AREA)
- Public Health (AREA)
- Veterinary Medicine (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physiology (AREA)
- Psychiatry (AREA)
- Signal Processing (AREA)
- Medical Treatment And Welfare Office Work (AREA)
- Measuring And Recording Apparatus For Diagnosis (AREA)
Abstract
The invention discloses an interpretable dynamic disease severity prediction method comprising the following steps: extracting SOFA scores, patient states and drug use information; linking the drug use information to UMLS standard terms to construct a drug-related knowledge graph; embedding the drug-related knowledge graph into a low-dimensional vector space to obtain the embeddings of the drug entities; determining the category i to which the SOFA change value at the current moment belongs, and multiplying the patient state and the drug entity embeddings by the i-th row of the corresponding weight matrix; concatenating the SOFA score, the weighted patient state and the weighted drug entity embeddings into time series data, inputting the data into a TCN prediction model, outputting a predicted SOFA score trend, and training and updating the weight matrices; and explaining the predicted SOFA score trend. The invention is more sensitive to the change trend of the SOFA score, predicts the SOFA score more accurately, and explains the prediction result.
Description
Technical Field
The invention relates to the technical field of disease severity prediction, in particular to an interpretable dynamic disease severity prediction method.
Background
With the continuous development of database technology, hospitals have accumulated large volumes of electronic medical records, and how to mine knowledge from such massive real-world data has increasingly attracted the attention of researchers. Knowledge discovery and machine learning methods can be used to discover new patterns in patient data, as well as for classification and prediction tasks such as outcome or risk assessment. Real-time disease severity is a major concern for caregivers in intensive care units (ICUs) and is critical to saving patients' lives. Learning from the rich information in electronic medical records to support clinical decision-making in the ICU would be a significant contribution to clinical practice. Since the 1990s, the Sequential Organ Failure Assessment (SOFA) score has been incorporated into many aspects of critical care. It comprehensively reflects the function of six organ systems: respiratory, cardiovascular, renal, nervous, hepatic and hematological. The higher the score, the more severe the patient's disease. In the disease severity prediction task, dynamically predicting the SOFA score trend of ICU patients can help clinicians respond better to the patient's condition and make more appropriate clinical decisions. Many disease severity prediction models based on electronic medical record data mining already exist; however, these methods are not sensitive enough to the change trend of the SOFA score, their prediction accuracy is insufficient, and their prediction results are not adequately explained.
Disclosure of Invention
(I) Technical problem to be solved
In view of the above problems, the invention provides an interpretable dynamic disease severity prediction method that is more sensitive to the change trend of the SOFA score and predicts the SOFA score more accurately.
(II) Technical solution
Based on the technical problems, the invention provides an interpretable dynamic disease severity prediction method, which comprises the following steps:
S1, extracting SOFA scores, patient states and drug use information from the MIMIC-III database, processing them into a time series format, and preprocessing them;
S2, according to the drug use information, constructing a drug-related knowledge graph by linking the names of the drugs used to drug entities, relations and corresponding medical entities in the UMLS term library;
S3, embedding the drug-related knowledge graph into a low-dimensional continuous vector space using a knowledge graph embedding model to obtain the embeddings of all drug entities;
S4, obtaining the SOFA change value at the current moment from the SOFA score, determining the category i (i = 1, 2, ..., 7) to which the SOFA change value at the current moment belongs, and weighting the patient state and the drug entity embeddings by the i-th row of the patient state weight matrix and of the drug weight matrix, respectively; the value θ_ij or ω_ij in row i and column j of the respective weight matrix represents the influence weight of the j-th patient state or the j-th drug on the SOFA change value under the i-th class, N_1 denotes the total number of patient states, and N_2 denotes the total number of drugs;
S5, concatenating the SOFA score, the weighted patient state and the weighted drug entity embeddings into time series data, inputting the data into a TCN prediction model, and outputting a predicted SOFA score trend; obtaining the predicted category from the predicted SOFA score trend, and training and updating the patient state weight matrix and the drug weight matrix, respectively, through an SGD learning model.
Further, the method also comprises the following step:
S6, explaining the predicted SOFA score trend: taking the variables whose weight values under a given class of the patient state weight matrix or the drug weight matrix exceed a weight threshold as important patient states or important drugs, respectively; indicating the change trend of the patient's disease severity from the important patient states; and obtaining the direct causes of an increase in the SOFA score by combining the important drugs with the drug-related knowledge graph.
Further, in step S1, the method for extracting the SOFA score comprises: looking up the names of the items related to the SOFA score in the D_ITEMS and D_LABITEMS tables of MIMIC-III, and using the corresponding item IDs to query the values and times of those items in the CHARTEVENTS and LABEVENTS tables; mapping the item values onto the SOFA score according to the definition of the SOFA score; and looking up the start time of the ICU stay in the ICUSTAYS table using the unique number of the ICU stay, and subtracting that start time from the time of the item to obtain the time corresponding to the SOFA score.
Further, in step S1, the patient state includes the following features: demographic data of the patient, physiological parameters, laboratory test results, and complications.
Further, in step S1, the drug use information is extracted from the INPUTEVENTS_MV table of MIMIC-III, with 1 indicating that the drug is used at the current time and 0 indicating that it is not.
Further, in step S1, the preprocessing includes data cleaning and missing value filling.
Further, the drug-related knowledge graph contains 38,117 medical entities, 154 relations and 186,840 triples, wherein the medical entities include the 157 drug entities corresponding to the drug use information.
Further, the SOFA change value is the SOFA score at the current time minus the SOFA score at admission; it is an integer and is divided into 7 classes: the first class when the SOFA change value is less than or equal to zero, the second class when it equals 1, the third class when it equals 2, the fourth class when it equals 3, the fifth class when it equals 4, the sixth class when it equals 5, and the seventh class when it is greater than or equal to 6.
Further, the knowledge graph embedding model is the TransE model.
Further, the SOFA score input to the TCN prediction model is first embedded into a 1 × 80 dimensional vector at the embedding layer; the depth of the TCN prediction model is 6, and the convolution kernel size is 2.
(III) Beneficial effects
The technical scheme of the invention has the following advantages:
(1) The invention uses a TCN time series prediction model as the base model for SOFA score trend prediction and effectively fuses the patient state, drug use and SOFA score, which strengthens the model's ability to predict the SOFA score trend. A drug-related knowledge graph is constructed from an existing knowledge base according to the drugs used by the patient, fusing medical background knowledge about drug use and yielding more medical entities related to drug use. The embedded representations of the drug entities, which carry this medical background knowledge, are input into the SOFA score trend prediction model, so that the model predicts well even when the SOFA score changes sharply; although the amount of computation increases, the SOFA score trend prediction result is more accurate;
(2) The invention processes the inputs of the prediction model through the patient state weight matrix and the drug weight matrix, which reflects their influence on the SOFA trend under different SOFA categories and further improves the accuracy of the SOFA trend prediction result;
(3) The invention provides an explanation of the prediction result: the patient state weight matrix and the drug weight matrix explain the predicted SOFA score trend and show how the patient state affects changes in the SOFA score; and the constructed drug-related knowledge graph, which fuses medical background knowledge, relates drug use to SOFA score changes and thereby explains how the drugs used affect SOFA score fluctuations to different degrees, helping clinicians respond better to the patient's condition and make more appropriate clinical decisions.
Drawings
The features and advantages of the present invention will be more clearly understood by reference to the accompanying drawings, which are illustrative and should not be construed as limiting the invention in any way, in which:
FIG. 1 is a flow chart illustrating an exemplary method for predicting severity of a dynamic disease in accordance with an embodiment of the present invention;
FIG. 2 is a schematic diagram of a TCN prediction model according to an embodiment of the present invention;
FIG. 3 is a graph illustrating the weight impact of patient status on SOFA score according to an embodiment of the present invention;
FIG. 4 is a graph comparing the trend of patient status and SOFA score according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of part of the drug knowledge graph containing important drugs according to an embodiment of the present invention.
Detailed Description
The following describes in further detail the embodiments of the present invention with reference to the drawings and examples. The following examples are illustrative of the invention and are not intended to limit the scope of the invention.
An interpretable dynamic disease severity prediction method, as shown in fig. 1, comprises the steps of:
S1, extracting SOFA scores, patient states and drug use information from the MIMIC-III database, processing them into a time series format, and preprocessing them;
MIMIC-III (Medical Information Mart for Intensive Care III) is a large electronic medical record database and is the source of the clinical electronic medical record data in this embodiment. MIMIC-III contains health-related data for more than forty thousand patients who stayed in the intensive care units of Beth Israel Deaconess Medical Center between 2001 and 2012, including demographics, clinical vital sign measurements (about 1 data point per hour), laboratory test results, procedures, medications, care records, imaging reports, and mortality (in hospital and after discharge). This information is spread across the 26 tables of MIMIC-III, which are linked by defined keys. The information needed by the present invention therefore has to be integrated across multiple tables.
S1.1, extracting SOFA scores, patient states and drug use information from the MIMIC-III database, and processing them into a time series format;
The invention installs the MIMIC-III database in PostgreSQL and extracts the data using SQL. According to the definition of the SOFA score, the names of the items related to the SOFA score are looked up in the D_ITEMS and D_LABITEMS tables, and the corresponding item IDs are then used to query the values and times of those items in the CHARTEVENTS and LABEVENTS tables; the item values are mapped onto SOFA scores according to the definition of the SOFA score; the time, originally precise to the second, is mapped to the t-th hour of the ICU stay by looking up the start time of the stay in the ICUSTAYS table using the unique number of the ICU stay and subtracting that start time from the time of the item.
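The time mapping described above can be sketched as follows. This is a minimal illustration, not the patent's code: the function name `icu_hour` and the zero-based hour convention are assumptions, and the timestamps are made up.

```python
from datetime import datetime


def icu_hour(event_time: datetime, icu_intime: datetime) -> int:
    """Map a second-precision timestamp to the hour of the ICU stay it falls in.

    Hours are zero-based here (assumption): hour 0 covers the first 60 minutes
    after the ICU start time taken from the ICUSTAYS table.
    """
    elapsed_seconds = (event_time - icu_intime).total_seconds()
    return int(elapsed_seconds // 3600)


# Hypothetical stay starting at 08:30:00; the measurement is charted at 10:45:12.
intime = datetime(2101, 5, 1, 8, 30, 0)
print(icu_hour(datetime(2101, 5, 1, 10, 45, 12), intime))  # 2
```

Flooring keeps every measurement inside exactly one hourly bin, which is what the later time series steps rely on.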
Based on reported factors and clinical experience, the present invention selects other relevant covariates recorded during hospitalization as the patient state. The patient state consists of a series of features, including patient demographics, physiological parameters, laboratory test results and complications. The static features, which include demographic data such as age and sex as well as complications, come from the PATIENTS table of MIMIC-III; the dynamic features, which include physiological parameters such as heart rate and respiratory rate and laboratory test results such as platelets and white blood cells, are extracted in the same way as the items related to the SOFA score.
Drug use information is extracted from the INPUTEVENTS_MV table of MIMIC-III. The table contains the item name and the time interval, i.e. the start time and end time, of each drug administered to the patient. When drug use is processed into a time series, 1 indicates that the drug was used at the current time and 0 indicates that it was not. Specifically, the present invention compares the interval covered by each hour after the patient enters the ICU with the interval of each drug administration: if the two overlap, the hour is marked 1; otherwise it is marked 0.
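The interval comparison can be sketched as below. The function name `drug_indicator` and the sample times are hypothetical; the only logic taken from the text is that an hour is marked 1 when it overlaps an administration interval and 0 otherwise. Treating hours as half-open intervals is an assumption.

```python
from datetime import datetime, timedelta


def drug_indicator(icu_intime, n_hours, administrations):
    """Return one 0/1 flag per ICU hour for a single drug.

    administrations: list of (start_time, end_time) pairs for that drug.
    Hour t covers the half-open interval [intime + t h, intime + (t + 1) h).
    """
    series = [0] * n_hours
    for t in range(n_hours):
        hour_start = icu_intime + timedelta(hours=t)
        hour_end = hour_start + timedelta(hours=1)
        for start, end in administrations:
            # Two intervals overlap when each one starts before the other ends.
            if start < hour_end and end > hour_start:
                series[t] = 1
                break
    return series


intime = datetime(2101, 5, 1, 8, 0, 0)
infusion = [(datetime(2101, 5, 1, 9, 30), datetime(2101, 5, 1, 11, 10))]
print(drug_indicator(intime, 5, infusion))  # [0, 1, 1, 1, 0]
```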
At this point, all the information needed from the MIMIC-III database has been extracted and processed into a time series format.
S1.2, preprocessing the extracted data, including data cleaning and missing value filling;
MIMIC-III contains ICU records for patients of many ages; because the invention studies the adult population, records of patients younger than 18 years are deleted. The invention also deletes erroneous data present in MIMIC-III, such as ICU records of a 300-year-old patient. In addition, although dynamic information representing the patient state, such as vital signs, can in theory be measured once per hour, many values are missing in practice, so the invention removes ICU records that lack the six vital signs: heart rate, systolic pressure, diastolic pressure, mean arterial pressure, respiratory rate and body temperature. For the drug use sequences, the invention merges drug names that refer to the same drug, under the guidance of a professional physician.
In the patient state time series, some feature values are missing. The present invention adopts the forward-fill imputation strategy and adapts it to its data. The specific filling strategy is as follows:
a. within an ICU stay, if a feature is missing at a given time, the most recent non-missing value of that feature before that time is used to fill it;
b. if all values of the feature before the missing time are also missing, the nearest non-missing value after that time is used to fill it;
c. if the feature was never measured during the ICU stay, the mean of the feature over all data is used to fill it.
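Rules a to c can be sketched for one feature of one ICU stay as follows. This is an illustrative reading of the strategy, with `None` standing for a missing value; the function name `fill_missing` and the numbers are made up.

```python
def fill_missing(series, global_mean):
    """Fill missing values (None) in one feature's hourly series for one ICU stay."""
    # Rule c: the feature was never measured in this stay, so use the global mean.
    if all(v is None for v in series):
        return [global_mean] * len(series)

    filled = list(series)

    # Rule a: forward fill with the most recent non-missing value.
    last_seen = None
    for i, value in enumerate(filled):
        if value is None:
            if last_seen is not None:
                filled[i] = last_seen
        else:
            last_seen = value

    # Rule b: only leading gaps remain; fill them with the first measured value.
    first_measured = next(v for v in filled if v is not None)
    for i, value in enumerate(filled):
        if value is not None:
            break
        filled[i] = first_measured
    return filled


print(fill_missing([None, None, 3.0, None, 5.0, None], global_mean=9.9))
# [3.0, 3.0, 3.0, 3.0, 5.0, 5.0]
```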
S2, according to the drug use information, constructing a drug-related knowledge graph by linking the names of the drugs used to drug entities, relations and corresponding medical entities in the UMLS term library;
The present invention links each drug name to a UMLS standard term using the REST (Representational State Transfer) API (Application Programming Interface) of UMLS (Unified Medical Language System). Specifically, the invention searches the UMLS term library by drug name; when there are multiple search results, the first result is selected as the drug entity linked to that drug name. Starting from these drug entities, the invention then uses the API to retrieve their atom information and uses each atom to retrieve its relations and the corresponding medical entities. The resulting drug-related knowledge graph contains 38,117 medical entities, 154 relations and 186,840 triples, where each triple has the form (head entity, relation, tail entity), written (h, r, t).
S3, embedding the drug-related knowledge graph into a low-dimensional continuous vector space using a knowledge graph embedding model to obtain the embeddings of all drug entities;
The invention uses the existing TransE graph embedding model, which represents a triple (h, r, t) through distributed vectors of the entities and the relation, and obtains the embeddings of the 38,117 medical entities from the drug-related knowledge graph. Each medical entity embedding is a 1 × 80 dimensional vector, which is equivalent to describing each medical entity with 80 features. The present invention uses only the embeddings of the 157 drug entities actually used.
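For readers unfamiliar with TransE, the sketch below shows its standard scoring idea: a triple is plausible when h + r lies close to t, and training pushes true triples closer than corrupted ones by a margin. The helper names, the L2 norm, the margin value and the 3-dimensional toy vectors are illustrative assumptions; the patent states only that TransE is used with 80-dimensional embeddings.

```python
import math


def transe_distance(h, r, t):
    """L2 distance ||h + r - t||; smaller means the triple is more plausible."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def margin_loss(positive, negative, margin=1.0):
    """Margin ranking loss for one true triple and one corrupted triple."""
    return max(0.0, margin + transe_distance(*positive) - transe_distance(*negative))


# Toy 3-dimensional vectors (the patent uses 80 dimensions).
h, r = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
t_true, t_corrupt = [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]

print(transe_distance(h, r, t_true))                   # 0.0
print(margin_loss((h, r, t_true), (h, r, t_corrupt)))  # 0.0
```

A zero loss here means the true triple already beats the corrupted one by more than the margin, so no update would be needed for this pair.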
According to the drugs used by the patient, a drug-related knowledge graph is constructed from an existing knowledge base, fusing medical background knowledge about drug use and yielding more medical entities related to the drugs; these medical entities fall into 125 categories. The embedded representations of the 157 drug entities, which carry this medical background knowledge, are input into the SOFA trend prediction model. Because each drug entity carries more information after knowledge graph embedding, the model predicts well even when the SOFA score changes sharply; the amount of computation increases, but the SOFA trend prediction result is more accurate.
Cases in which the SOFA score changes sharply are rare in the training data, so the raw data alone are not sufficient for the model to capture the pattern of such changes. Drugs act as external events, and once the information in the knowledge graph is fused in, the model can be guided to find the change trend more reliably. For example, drug A and drug B may interact and cause an adverse reaction that rapidly aggravates the patient's condition, yet the training data may contain few or no instances in which only these two drugs were used and the condition suddenly worsened. Without the information in the knowledge graph, the model can hardly learn from so few examples that combining the two drugs causes a sharp change in the patient's condition. After drug embedding, certain features of the embedding can represent this knowledge, so the model can also better capture changes caused by external factors.
S4, obtaining the SOFA change value at the current moment from the SOFA score, determining the category i (i = 1, 2, ..., 7) to which the SOFA change value at the current moment belongs, and weighting the patient state and the drug entity embeddings by the i-th row of the patient state weight matrix and of the drug weight matrix, respectively; the patient state weight matrix, written Θ here, has 7 rows and N_1 columns, and the drug weight matrix, written Ω here, has 7 rows and N_2 columns:
$$\Theta = \begin{pmatrix} \theta_{11} & \cdots & \theta_{1N_1} \\ \vdots & \ddots & \vdots \\ \theta_{71} & \cdots & \theta_{7N_1} \end{pmatrix}, \qquad \Omega = \begin{pmatrix} \omega_{11} & \cdots & \omega_{1N_2} \\ \vdots & \ddots & \vdots \\ \omega_{71} & \cdots & \omega_{7N_2} \end{pmatrix}$$
where the value θ_ij or ω_ij in row i and column j of the respective weight matrix represents the influence weight of the j-th patient state or the j-th drug on the SOFA change value under the i-th class, each row corresponds to a category of the SOFA change value, N_1 denotes the total number of patient states, and N_2 denotes the total number of drugs. The SOFA change value is the SOFA score at the current moment minus the SOFA score at admission; it is an integer and is divided into 7 classes: the first class when the SOFA change value is less than or equal to zero, the second class when it equals 1, the third class when it equals 2, the fourth class when it equals 3, the fifth class when it equals 4, the sixth class when it equals 5, and the seventh class when it is greater than or equal to 6. The smaller the SOFA change value, the lower the disease risk.
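The class assignment and row weighting of step S4 can be sketched as follows. The class boundaries follow the text; the element-wise multiplication of the patient state vector by the selected row is one plausible reading of "multiplying by the i-th row", and the function names and toy matrix are hypothetical.

```python
def sofa_change_class(current_sofa, admission_sofa):
    """Return the class (1 to 7) of the SOFA change relative to admission."""
    delta = current_sofa - admission_sofa
    if delta <= 0:
        return 1          # first class: change <= 0
    if delta >= 6:
        return 7          # seventh class: change >= 6
    return delta + 1      # changes 1..5 map to classes 2..6


def apply_class_weights(features, weight_matrix, cls):
    """Scale each feature by the matching weight in row cls of a 7-row matrix.

    Element-wise scaling is an assumption about how the row is applied.
    """
    row = weight_matrix[cls - 1]
    return [w * x for w, x in zip(row, features)]


# Toy 7 x 2 patient state weight matrix (N_1 = 2 here; values are made up).
theta = [[float(i), 0.5] for i in range(1, 8)]
cls = sofa_change_class(current_sofa=8, admission_sofa=7)
print(cls)                                          # 2
print(apply_class_weights([2.0, 4.0], theta, cls))  # [4.0, 2.0]
```

The same row lookup would apply to the drug weight matrix, where ω_ij would scale the embedding of the j-th drug.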
S5, concatenating the SOFA score, the weighted patient state and the weighted drug entity embeddings into time series data, inputting the data into a TCN prediction model, and outputting a predicted SOFA score trend; obtaining the predicted category from the predicted SOFA score trend, and training and updating the patient state weight matrix and the drug weight matrix, respectively, through an SGD learning model;
The present invention selects the existing TCN model to predict the SOFA score trend, as shown in FIG. 2. The input of the model is time series data: the data at each time step are the concatenation of the SOFA score, the weighted patient state and the weighted drug entity embeddings, and the data of different patients are separated by a separator; the output of the model is the SOFA score trend. The input SOFA score is a single value and is embedded into a 1 × 80 dimensional vector at the embedding layer of the prediction model; the input patient state is a 1 × 79 dimensional vector in which each dimension represents one feature, such as heart rate; the input drug entity embeddings are n vectors of dimension 1 × 80, where n is the number of drug types used by the patient in that hour, and the n vectors are summed into a single 1 × 80 dimensional vector at the embedding layer of the model. That is, the data fed into the network for each hour are the concatenation of vectors of dimension 1 × 80, 1 × 79 and 1 × 80. In the figure, the subscripts of v_0 to v_{T-1} denote time, and the data length at each time step is determined by the sizes of the three kinds of collected data. In this embodiment, the depth of the TCN model is 6, the convolution kernel size is 2, and the dilation factors are d = [2^0, 2^1, 2^2, 2^3, 2^4, 2^5].
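To make the stated hyperparameters concrete, the sketch below implements a single-channel causal dilated convolution in plain Python and measures how far back six stacked layers can see. It is a toy, not the patent's network: a real TCN uses many channels, learned kernels and residual blocks, and the assumption of one convolution per level affects the receptive field (with two convolutions per level the same formula gives 127).

```python
def causal_dilated_conv(x, kernel, dilation):
    """y[t] = sum_k kernel[k] * x[t - k * dilation], treating x as zero before t = 0."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # causal: only current and past time steps contribute
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations, convs_per_level=1):
    """Number of input time steps that can influence one output time step."""
    return 1 + convs_per_level * (kernel_size - 1) * sum(dilations)


KERNEL_SIZE = 2
DILATIONS = [2 ** level for level in range(6)]  # [1, 2, 4, 8, 16, 32]
INPUT_WIDTH = 80 + 79 + 80                      # SOFA + patient state + drugs

# Push a unit impulse at hour 0 through six all-ones layers and count
# how many later hours it reaches.
signal = [1.0] + [0.0] * 99
for d in DILATIONS:
    signal = causal_dilated_conv(signal, [1.0] * KERNEL_SIZE, d)
reached = sum(1 for v in signal if v != 0.0)

print(INPUT_WIDTH)                              # 239
print(receptive_field(KERNEL_SIZE, DILATIONS))  # 64
print(reached)                                  # 64
```

Under these assumptions each prediction can draw on roughly the previous 64 hours of the stay.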
After the predicted SOFA scoring trend is obtained, on one hand, parameters of a TCN prediction model are trained and optimized, on the other hand, the category of the prediction is obtained according to the predicted SOFA scoring trend, and the state weight matrix and the drug weight matrix of the patient are respectively trained and updated through the existing SGD learning model, so that the prediction result is continuously optimized.
S6, explaining the predicted SOFA scoring trend: and taking variables corresponding to weight values higher than a weight threshold under a certain class of the patient state weight matrix or the medicine weight matrix as important patient states or important medicines respectively, prompting the change trend of the disease severity of the patient according to the important patient states, and obtaining the direct cause of the increase of the SOFA score according to the combination of the important medicines and the medicine related knowledge graph.
When the SOFA change value falls in the first class, the SOFA score has decreased or remained unchanged; when it falls in the second to seventh classes, the SOFA score has increased. Therefore, the weight values in the first row (the first class) of the patient state weight matrix, together with the corresponding patient states, are used to analyze the influence of the patient state on the reduction of the SOFA score. As shown in Fig. 3, magnesium has a weight of -0.95 in the first class and has the greatest influence on the reduction of the SOFA score. The disease is most severe when the SOFA change value falls in the seventh class, so the weight values in the seventh row (the seventh class) of the patient state weight matrix, together with the corresponding patient states, are used to analyze the influence of the patient state on the rise of the SOFA score. As shown in Fig. 3, hemoglobin has the greatest influence on the rise of the SOFA score in the seventh class.
These important features can indicate to the physician which patient states deserve closer attention in real-time care. For example, in the case of Fig. 4, when hemoglobin decreases, oxygen saturation decreases, and urine nitrogen increases, the patient's SOFA score is about to trend upward, i.e., the severity of the disease is about to increase.
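The selection rule of step S6 can be sketched as below. The function name `important_variables`, the feature names, and all weight values are hypothetical placeholders chosen for illustration; only the rule itself (keep variables whose weight in a given class row is higher than a threshold) comes from the text.

```python
import numpy as np

def important_variables(weights, names, category, threshold):
    """Return (name, weight) pairs in one class row whose weight exceeds threshold.

    weights:  array of shape (7, N), one row per SOFA change class.
    category: 1-based class index, as in the text.
    """
    row = weights[category - 1]
    picked = [(n, float(w)) for n, w in zip(names, row) if w > threshold]
    # Sort so that the most influential variable comes first.
    return sorted(picked, key=lambda p: p[1], reverse=True)

names = ["heart_rate", "hemoglobin", "magnesium"]  # hypothetical subset of features
weights = np.zeros((7, 3))
weights[6] = [0.10, 0.90, 0.20]                    # made-up values for the seventh class
print(important_variables(weights, names, category=7, threshold=0.15))
# [('hemoglobin', 0.9), ('magnesium', 0.2)]
```

The text reports a weight of -0.95 for magnesium as the most influential value in the first class. If negative weights should also qualify, comparing `abs(w)` with the threshold would be one option, but the text does not specify this.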
Regarding drug use, once the important drugs have been found, the drug knowledge graph constructed by the invention can be used to give physicians a basis for verification. Fig. 5 shows the mechanism of action of some of the important drugs: important drugs are shown in ellipses, direct causes of a rising SOFA score are shown in rectangles, relationships are shown on the edges, and intermediate entities are shown without an enclosing shape.
In summary, the interpretable dynamic disease severity prediction method has the following beneficial effects:
(1) The invention uses a TCN time series prediction model as the base model for SOFA score trend prediction and effectively fuses the patient state, drug use, and SOFA score, which strengthens the model's ability to predict the SOFA score trend. A drug-related knowledge graph is constructed from an existing knowledge base according to the drugs used by the patient, incorporating medical background knowledge about drug use and yielding more medical entities related to drug use. The embedded representations of the medical entities, which incorporate this medical background knowledge, are input into the SOFA score trend prediction model, so that the prediction capability remains good even when the SOFA score changes substantially; the amount of computation increases, and the SOFA score trend prediction result is more accurate;
(2) The invention processes the inputs of the prediction model through the patient state weight matrix and the drug weight matrix, which reflects their influence on the SOFA trend under different SOFA classes and further improves the accuracy of the SOFA trend prediction result;
(3) The invention can explain its prediction results: the predicted SOFA score trend is interpreted through the patient state weight matrix and the drug weight matrix, which explains how the patient state affects the change in the SOFA score. In addition, the constructed drug-related knowledge graph, which incorporates medical background knowledge, links drug use to changes in the SOFA score and thereby explains how the drugs used affect fluctuations of the SOFA score to different degrees, helping clinicians respond better to the patient's condition and make more appropriate clinical decisions.
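The class-dependent weighting summarized in (2) can be sketched as follows. This is a rough illustration, not the implementation of the invention: the function names, the placeholder weight values, and in particular the choice to scale each drug embedding by a single scalar weight are assumptions. The class boundaries follow the seven classes defined in claim 8.

```python
import numpy as np

def sofa_change_class(change):
    """Map an integer SOFA change value to a class 1..7.

    <= 0 -> class 1, 1..5 -> classes 2..6, >= 6 -> class 7.
    """
    if change <= 0:
        return 1
    return min(change + 1, 7)

def weight_state(state_vec, theta, cls):
    """Scale each patient state feature by its weight in row `cls` (1-based)."""
    return state_vec * theta[cls - 1]

def weight_drugs(drug_embs, drug_ids, omega, cls):
    """Scale each drug embedding by that drug's scalar weight in row `cls`.

    ASSUMPTION: one scalar weight per drug multiplies its whole embedding.
    """
    return drug_embs * omega[cls - 1, drug_ids][:, None]

theta = np.full((7, 79), 0.5)    # placeholder patient state weight matrix (7 x N_1)
omega = np.full((7, 157), 2.0)   # placeholder drug weight matrix (7 x N_2)

cls = sofa_change_class(8 - 5)   # current SOFA 8, admission SOFA 5 -> change 3
weighted_state = weight_state(np.ones(79), theta, cls)
weighted_drugs = weight_drugs(np.ones((2, 80)), [0, 42], omega, cls)
print(cls, weighted_state.shape, weighted_drugs.shape)  # 4 (79,) (2, 80)
```

Selecting a single row per time step keeps the weighting cheap, and the same row can later be inspected to interpret the prediction for that class.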
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although embodiments of the present invention have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope of the invention as defined by the appended claims.
Claims (10)
1. An interpretable dynamic disease severity prediction method, comprising the following steps:
S1, extracting SOFA scores, patient states, and drug use information from the MIMIC-III database, processing them into a time series format, and preprocessing them;
S2, constructing a drug-related knowledge graph, according to the drug use information, by linking the names of the drugs used to drug entities, relations, and corresponding medical entities in the UMLS terminology library;
S3, embedding the drug-related knowledge graph into a low-dimensional continuous vector space using a knowledge graph embedding model to obtain the embeddings of all drug entities;
S4, obtaining the SOFA change value at the current time from the SOFA score, determining the class i to which the SOFA change value at the current time belongs, and performing weight processing by multiplying the patient state and the drug entity embeddings by the i-th row of the patient state weight matrix and the i-th row of the drug weight matrix, respectively, where $i = 1, 2, \ldots, 7$; the value $\theta_{ij}$ in row i, column j of the patient state weight matrix represents the weight of the influence of the j-th patient state on the SOFA change value under the i-th class, the value $\omega_{ij}$ in row i, column j of the drug weight matrix represents the weight of the influence of the j-th drug on the SOFA change value under the i-th class, $N_1$ represents the total number of patient states, and $N_2$ represents the total number of drugs, so that the patient state weight matrix has size $7 \times N_1$ and the drug weight matrix has size $7 \times N_2$;
S5, inputting the time series data formed by concatenating the SOFA score, the weighted patient state, and the weighted drug entity embeddings into a TCN prediction model, and outputting a predicted SOFA score trend; and obtaining the predicted class from the predicted SOFA score trend, and training and updating the patient state weight matrix and the drug weight matrix, respectively, through an SGD learning model.
2. The method of claim 1, further comprising:
s6, explaining the predicted SOFA scoring trend: and taking variables corresponding to weight values higher than a weight threshold under a certain class of the patient state weight matrix or the medicine weight matrix as important patient states or important medicines respectively, prompting the change trend of the disease severity of the patient according to the important patient states, and obtaining the direct cause of the increase of the SOFA score according to the combination of the important medicines and the medicine related knowledge graph.
3. The method of claim 1, wherein in step S1, the SOFA score is extracted as follows: the names of the items related to the SOFA score are looked up in the D_ITEMS and D_LABITEMS tables of MIMIC-III, and the corresponding item IDs are used to query the values and times of those items in the CHARTEVENTS and LABEVENTS tables; the item values are mapped onto the SOFA score according to the definition of the SOFA score; the start time of the ICU stay is queried from the ICUSTAYS table using the unique identifier of the ICU stay, and the time corresponding to the SOFA score is calculated by subtracting this start time from the time of the item.
4. The method of claim 1, wherein in step S1, the patient status comprises the following characteristics: demographic data of the patient, physiological parameters, laboratory test results, complications.
5. The method according to claim 1, wherein in step S1, the drug use information is extracted from the INPUTEVENTS_MV table of MIMIC-III, where 1 indicates that the drug is used at the current time and 0 indicates that it is not.
6. The method of claim 1, wherein in step S1, the preprocessing includes data cleaning and missing-value filling.
7. The method of claim 1, wherein the drug-related knowledge graph comprises 38,117 medical entities, 154 relationships, and 186,840 triplets, wherein the medical entities include 157 drug entities corresponding to the drug use information.
8. The method according to claim 1, wherein the SOFA change value is the SOFA score at the current time minus the SOFA score at the time of admission, is an integer, and is divided into 7 classes: the first class when the SOFA change value is zero or less, the second class when it is 1, the third class when it is 2, the fourth class when it is 3, the fifth class when it is 4, the sixth class when it is 5, and the seventh class when it is 6 or more.
9. The method of claim 1, wherein the knowledge graph embedding model uses a transition model.
10. The method of claim 1, wherein the input SOFA score of the TCN prediction model is embedded as a 1 × 80 vector at the embedding layer before being input, and the TCN prediction model has a depth of 6 and a convolution kernel size of 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111338917.9A CN114098638B (en) | 2021-11-12 | 2021-11-12 | Interpretable dynamic disease severity prediction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114098638A (en) | 2022-03-01 |
CN114098638B (en) | 2023-09-08 |
Family
ID=80378999
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111338917.9A Active CN114098638B (en) | 2021-11-12 | 2021-11-12 | Interpretable dynamic disease severity prediction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114098638B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117079762B (en) * | 2023-09-25 | 2024-01-23 | 腾讯科技(深圳)有限公司 | Drug effect prediction model training method, drug effect prediction method and device thereof |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH10295647A (en) * | 1996-06-17 | 1998-11-10 | Smithkline Beecham Corp | Method and system for identifying patients at risk of being diagnosed with congestive heart failure |
CN104115150A (en) * | 2012-02-17 | 2014-10-22 | 皇家飞利浦有限公司 | Acute lung injury (ALI)/acute respiratory distress syndrome (ARDS) assessment and monitoring |
KR101450784B1 (en) * | 2013-07-02 | 2014-10-23 | 아주대학교산학협력단 | Systematic identification method of novel drug indications using electronic medical records in network frame method |
WO2020172607A1 (en) * | 2019-02-22 | 2020-08-27 | University Of Florida Research Foundation, Incorporated | Systems and methods for using deep learning to generate acuity scores for critically ill or injured patients |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6835176B2 (en) * | 2003-05-08 | 2004-12-28 | Cerner Innovation, Inc. | Computerized system and method for predicting mortality risk using a lyapunov stability classifier |
Non-Patent Citations (1)
Title |
---|
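Clinical application value of the vasoactive drug score in critically ill children; Ren Jie; Liu Chengjun; Journal of Pediatric Pharmacy (No. 03); full text *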
Also Published As
Publication number | Publication date |
---|---|
CN114098638A (en) | 2022-03-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||