CN110827993A - Early death risk assessment model establishing method and device based on ensemble learning - Google Patents

Info

Publication number
CN110827993A
Authority
CN
China
Prior art keywords
model
data
risk assessment
score
early
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911146419.7A
Other languages
Chinese (zh)
Inventor
李德玉
刘晓莉
张弛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
CERNET Corp
Original Assignee
Beihang University
CERNET Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University and CERNET Corp
Priority to CN201911146419.7A
Publication of CN110827993A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

This application proposes a method and device for establishing an early death risk assessment model based on ensemble learning. The model assesses a patient's risk of death during hospitalization from the diagnosis and treatment data recorded on the first day after an elderly patient with multiple organ failure enters the intensive care unit (ICU). The method comprises dataset construction, data processing, and model construction and evaluation. Three demographic items, five vital signs, five laboratory test indicators, and two clinical indicators from the patient's first day in the ICU are obtained and input into the risk assessment device. After internal data preprocessing, feature calculation, and model computation, the device outputs an early prediction of the patient's risk of an adverse in-hospital outcome, helping doctors intervene and treat patients early.

Description

Method and device for establishing an early death risk assessment model based on ensemble learning

Technical Field

The invention relates to a death risk assessment method and device, and in particular to a method and device for establishing an early death risk assessment model, based on ensemble learning, for multiple organ failure in the elderly.

Background

Multiple organ failure (multiple organ dysfunction syndrome, MODS) is characterized by progressive physiological dysfunction of two or more organs (organ systems), triggered by some precipitating factor. Multiple organ failure in the elderly (MODSE) is a clinical syndrome with insidious onset, multiple etiologies, and complex pathogenesis that is easily overlooked by clinicians, and it is an important cause of death in critically ill elderly patients. According to the literature, when three or more organs fail, mortality is 50% to 100%. Treatment and nursing costs are also very high: the average medical cost per patient is reported to be about 22,000 US dollars, and total national expenditure about 16.7 billion US dollars per year, which places a heavy burden on families, hospitals, and society. Early diagnosis and timely intervention are important ways to reduce mortality.

SOFA (Sepsis-related Organ Failure Assessment) is a score commonly used to assess organ failure associated with systemic infection; it covers six organ systems: respiratory, neurological, cardiovascular, hepatic, coagulation, and renal. The multiple organ dysfunction score (MODS) is used to assess daily changes in organ failure after a patient is admitted to the intensive care unit (ICU). The Simplified Acute Physiology Score (SAPS) is commonly used to assess the severity of critical illness. The Acute Physiology and Chronic Health Evaluation scoring system (APACHE-IV) is currently the most widely used and most authoritative system for evaluating critical illness in the ICU. These scores have several limitations, however. First, they are usually built with logistic regression, which does not account for the relationships among organ functions and assumes a linear, additive relationship between predictors and outcome, whereas the real complexity of disease deterioration and evolution is far beyond what a simple linear model can express and quantify. Second, the weights or coefficients assigned to the sub-scores are highly subjective: an expert panel selects variables and assigns weights according to its own understanding of their relevance to mortality prediction, and the weights have not been verified against large amounts of experimental data. Third, elderly patients have more underlying diseases, more complications, organ aging, and functional decline, so assessment criteria designed for adults in general do not apply well to them, and few of these scores offer a more detailed assessment scheme for elderly patients.

An electronic health record (EHR) is a systematic collection of patient and population health information in digital form. It contains detailed diagnosis and treatment information such as patient demographics, medical history, medications and allergies, laboratory test results, vital signs, and billing. In recent years, as EHRs have become widespread in medical administration and big data mining techniques have matured, using EHRs to improve the quality of care, increase patient safety, and optimize clinical workflows is no longer just an aspiration. EHR-based predictive analysis of disease severity and adverse outcomes has become a research focus in both academia and industry. In 2014, Pirracchio R et al. developed the non-parametric ensemble learning algorithm Super ICU Learner, which assesses ICU patients' mortality risk earlier and more accurately; the model's AUC reached 0.88. In 2015, Henry K.E. et al. developed a "targeted real-time early warning score" (TREWScore) for early prediction of septic shock; using a Cox proportional hazards regression model, the score predicts septic shock a mean of 28.2 hours before onset, with an AUC of 0.83 and a sensitivity of 0.85. The 'Previse' automatic diagnosis algorithm developed by Dascena in 2017 can predict acute kidney injury a full day before patients meet the clinical diagnostic criteria, giving clinicians enough time to intervene and prevent long-term damage; when predicting 48 hours in advance, the algorithm has an accuracy of 84%.

XGBoost (eXtreme Gradient Boosting) is a boosting algorithm whose core idea is to combine many weak classifiers (tree models) into one strong classifier (a boosted tree model). The model achieves high accuracy on prediction problems, runs fast, resists overfitting, and can model sparse data. It is therefore widely used in data competitions, research and teaching, and industrial applications.
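The boosting idea described above can be illustrated with a minimal sketch. This is plain gradient boosting with one-split regression stumps under squared error, written with the standard library only; it is not XGBoost's regularized, second-order implementation, and the names `fit_stump`, `boost`, and `predict` are hypothetical.

```python
from statistics import mean
from typing import List, Sequence, Tuple

Stump = Tuple[float, float, float]  # (threshold, value if x <= threshold, value otherwise)


def fit_stump(xs: Sequence[float], residuals: Sequence[float]) -> Stump:
    """Fit a one-split regression tree (a weak learner) to the current residuals."""
    best = None
    # Skip the largest x so that the right-hand side of the split is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((r - left_mean) ** 2 for r in left) + sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: fall back to a constant prediction
        constant = mean(residuals)
        return (float("inf"), constant, constant)
    return best[1:]


def boost(xs: Sequence[float], ys: Sequence[float], rounds: int,
          learning_rate: float) -> Tuple[float, List[Stump]]:
    """Add stumps one at a time, each fitted to what the ensemble still gets wrong."""
    base = mean(ys)
    preds = [base] * len(xs)
    stumps: List[Stump] = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        preds = [p + learning_rate * (left_value if x <= threshold else right_value)
                 for x, p in zip(xs, preds)]
    return base, stumps


def predict(base: float, stumps: Sequence[Stump], x: float, learning_rate: float) -> float:
    """Sum the base value and the shrunken contribution of every stump."""
    return base + sum(learning_rate * (lv if x <= t else rv) for t, lv, rv in stumps)
```

Each round fits a weak learner to the residuals of the current ensemble, so the combined prediction improves step by step even though every individual stump is crude.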

Summary of the Invention

This application aims to provide a method and device for establishing an early death risk assessment model based on ensemble learning, which assesses a patient's risk of death during hospitalization from the diagnosis and treatment data recorded on the first day after an elderly patient with multiple organ failure enters the intensive care unit.

In the method of this application for establishing an early death risk assessment model based on ensemble learning, the early death risk assessment model is used to assess a patient's risk of death during hospitalization from the diagnosis and treatment data recorded on the first day after an elderly patient with multiple organ failure enters the intensive care unit; the assessment model is an XGBoost model; and the method comprises:

A dataset construction step: using a dataset construction module, patients with multiple organ failure in a first database are identified according to the clinical diagnostic definition, and the study population and the four categories of features included in the study are determined, forming a first dataset;

A data processing step: using the data extraction and processing unit of a data processing module, data are extracted from the first dataset and processed to form a second dataset, which is divided into a training set, a validation set, and a test set; using the feature construction unit of the data processing module, features are constructed from the data to be input into the risk assessment model so that they meet the model's input requirements; and using the patient outcome labeling unit of the data processing module, patient outcomes are labeled;

A model construction and evaluation step: the XGBoost model is trained as the early prediction model. An early stopping mechanism is used, with AUC as the performance metric and a set number of early stopping rounds, and the model's performance is monitored on the validation set; when performance on the validation set has not improved for the predetermined number of consecutive iterations, training is terminated and the optimal training parameters are obtained; the predictive performance of the trained model is then evaluated on the test set to guard against overfitting;

The final model is used as the early death risk assessment model.
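The early stopping rule in the model construction step can be sketched as follows. This is a minimal standard-library illustration, not the patent's implementation: `auc` computes the area under the ROC curve by pairwise comparison, and `best_round_with_early_stopping` applies the "stop after a set number of rounds without improvement" rule to a list of per-round validation AUC values. Both function names and the `patience` parameter are hypothetical. In the XGBoost Python package the same behaviour is usually requested through the `early_stopping_rounds` setting with `auc` as the evaluation metric.

```python
from typing import Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve: the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # A tie between a positive and a negative score counts as half a correct pair.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_round_with_early_stopping(val_auc_per_round: Sequence[float],
                                   patience: int) -> Tuple[int, float]:
    """Return (best_round, best_auc); stop once `patience` rounds pass without improvement."""
    best_round, best_auc = -1, float("-inf")
    for rnd, value in enumerate(val_auc_per_round):
        if value > best_auc:
            best_round, best_auc = rnd, value
        elif rnd - best_round >= patience:
            break  # the validation AUC has stalled for `patience` consecutive rounds
    return best_round, best_auc
```

The model kept at the best round is then scored once on the held-out test set, so the reported performance is not inflated by the data used to choose when to stop.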

Preferably, the four categories of features included in the study are:

Demographic features: admission type, age, body mass index (BMI), ethnicity, ICU admission type, sex, height, and weight;

Vital sign features: central venous pressure, diastolic blood pressure, heart rate, mean arterial pressure, respiratory rate, systolic blood pressure, shock index, blood oxygen saturation, and body temperature;

Laboratory test features: albumin, alkaline phosphatase, alanine aminotransferase, aspartate aminotransferase, base excess, bicarbonate, bilirubin, B-type natriuretic peptide, blood urea nitrogen, chloride, creatinine, fibrinogen, fraction of inspired oxygen, blood glucose, hematocrit, hemoglobin, international normalized ratio, lactate, lymphocytes, magnesium, neutrophils, arterial partial pressure of carbon dioxide, oxygenation index, partial pressure of oxygen, pH, platelets, potassium, prothrombin time, thromboplastin time, serum sodium, troponin, and white blood cell count;

Clinical features: Glasgow Coma Scale score, systemic inflammatory response syndrome score, the neurological and respiratory sub-scores of the SOFA score, whether mechanical ventilation was performed, whether continuous renal replacement therapy was performed, total urine output, and norepinephrine infusion rate.

Preferably, the data extraction and processing performed by the data extraction and processing unit of the data processing module comprise data cleaning, data sampling, and data interpolation;

The feature construction performed by the feature construction unit of the data processing module comprises: the original values of the demographic features; the maximum, minimum, and mean of the vital sign features; the maximum, minimum, and original values of the laboratory test features; and the maximum, minimum, mean, sum, and original values of the clinical features.

Preferably, model performance is evaluated through both internal validation and external validation. The performance measures are AUC, specificity, sensitivity, accuracy, F1 score, robustness, and generalizability, and the reference standards for the evaluation are baseline models and clinical scores; the final model can produce a ranking of the risk factors associated with death based on a feature ranking function.

Preferably, the baseline models are selected from: logistic regression (LR), support vector machine (SVM), neural network (NN), random forest (RT), and naive Bayes (NB) models;

The clinical scores are selected from: the Oxford Acute Severity of Illness Score (OASIS), the Acute Physiology and Chronic Health Evaluation (APACHE-IV), the Acute Physiology Score III (APS III), the multiple organ dysfunction score (MODS), the Sepsis-related Organ Failure Assessment (SOFA), the Simplified Acute Physiology Score (SAPS), the Modified Early Warning Score (MEWS), the systemic inflammatory response syndrome score (SIRS), the quick Sequential Organ Failure Assessment (qSOFA), and the Charlson Comorbidity Index.
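The threshold-based measures named above (sensitivity, specificity, accuracy, F1) can all be derived from one confusion matrix. The sketch below is illustrative only; the function name `binary_metrics` and the default threshold of 0.5 are assumptions, since the source does not state how predicted probabilities are converted to class labels.

```python
from typing import Dict, Sequence


def binary_metrics(labels: Sequence[int], probs: Sequence[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Compute sensitivity, specificity, accuracy, and F1 for death (1) vs survival (0)."""
    if not labels or len(labels) != len(probs):
        raise ValueError("labels and probs must be non-empty and of equal length")
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0   # recall on patients who died
    specificity = tn / (tn + fp) if tn + fp else 0.0   # recall on patients who survived
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if precision + sensitivity else 0.0)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / len(labels),
        "f1": f1,
    }
```

Because in-hospital death is the minority class in such cohorts, accuracy alone can look good for a model that rarely predicts death, which is one reason to report sensitivity and F1 alongside it.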

Preferably, robustness and generalizability are evaluated in the following three ways:

Internal validation: using the validation set and test set of the second dataset, the changes in AUC, specificity, sensitivity, accuracy, and F1 score are compared in order to obtain the best-performing model parameters and to prevent overfitting;

External validation: a multi-center, large-sample dataset is processed in the same way to obtain the corresponding dataset from a second database; the changes in AUC, specificity, sensitivity, accuracy, and F1 score are compared and set against the internal validation results;

Reduced input features: the number of features input to the final model is reduced, and the change in AUC relative to using all features is evaluated in both internal and external validation and compared with the clinical scores.
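The reduced-input check can be sketched as selecting the highest-ranked features and restricting each patient's input to them before re-evaluating the model. The helper names and the importance values in the usage note are hypothetical; how the importance scores are obtained (for example from the trained model's feature ranking function) is outside this sketch.

```python
from typing import Dict, List


def top_k_features(importance: Dict[str, float], k: int) -> List[str]:
    """Return the k most important feature names, breaking ties alphabetically."""
    ranked = sorted(importance.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


def reduce_row(row: Dict[str, float], keep: List[str]) -> Dict[str, float]:
    """Keep only the selected features of one patient's feature vector."""
    return {name: row[name] for name in keep}
```

For example, `top_k_features({"gcs_mean": 0.30, "urine_sum": 0.25, "age": 0.20, "height": 0.01}, 3)` keeps the three highest-ranked names, and the reduced model's AUC can then be compared with the full model's AUC on the same validation data.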

Preferably, the diagnosis and treatment data from the first day after the elderly patient with multiple organ failure enters the intensive care unit comprise:

Demographic features: age, weight, and BMI;

Vital sign features: minimum systolic blood pressure, mean systolic blood pressure, maximum shock index, minimum shock index, mean respiratory rate, minimum blood oxygen saturation, and maximum body temperature;

Laboratory test indicators: minimum blood urea nitrogen, minimum blood glucose, maximum chloride, maximum partial pressure of oxygen, and minimum white blood cell count;

Clinical features: total urine output, minimum urine output, maximum Glasgow Coma Scale score, and mean Glasgow Coma Scale score.

The early death risk assessment device of this application, based on ensemble learning, assesses a patient's risk of death during hospitalization from the diagnosis and treatment data recorded on the first day after an elderly patient with multiple organ failure enters the intensive care unit. It comprises a data preprocessing module, a feature calculation module, and a model computation module;

The data preprocessing module is implemented by a computer and is used to extract and process the diagnosis and treatment data to obtain preprocessed data;

The feature calculation module is implemented by a computer and is used to construct features from the preprocessed data, producing data suitable for input to the model computation module;

The model computation module is implemented by a computer and contains the early death risk assessment model. This model is obtained by training, tuning, and evaluating an XGBoost model on datasets from a database, and its performance has been validated and evaluated on a multi-center, large-sample dataset. The feature-constructed data are input to the model computation module, which outputs the risk prediction for the patient's hospital stay.

Preferably, the model contained in the model computation module is an early death risk assessment model established by the ensemble-learning-based method for establishing an early death risk assessment model according to any one of claims 1-7.

Preferably, the extraction and processing performed by the data preprocessing module comprise cleaning, sampling, and interpolation of the diagnosis and treatment data.

The method and device of this application for establishing an early death risk assessment model based on ensemble learning have the following advantages:

(1) The probability of an adverse in-hospital outcome for MODSE patients can be predicted early, helping doctors intervene and treat patients early;

(2) Validated on both internal and external (large-sample, multi-center) datasets, the model performs well and outperforms the baseline models and existing clinical scores. With the number of model input parameters reduced to 20, predictive performance remains good. The prediction model can be readily deployed in hospital information systems;

(3) It identifies the important factors associated with adverse outcomes, helping doctors understand how the disease develops;

(4) With only 15 items of a patient's clinical information as input, the risk prediction device automatically outputs an early assessment of the patient's risk of an adverse in-hospital outcome (death).

Description of the Drawings

Figure 1 shows the execution flow of the method of this application for establishing an early death risk assessment model based on ensemble learning;

Figure 2 shows the inclusion flow for elderly patients with multiple organ failure in MIMIC-III;

Figure 3 visualizes the disease trajectories of deceased and surviving MODSE patients during their ICU stay;

Figure 4 shows the ROC curve of the model for early prediction of patient mortality risk;

Figure 5 compares the performance of the prediction model with the baseline models;

Figure 6 compares the performance of the prediction model with commonly used clinical scores;

Figure 7 shows the importance ranking of the top 30 features of the prediction model;

Figure 8 shows the ROC curve of the prediction model using the top 20 features;

Figure 9 compares the performance of the prediction model using the top 20 features with the clinical scores;

Figure 10 shows the MODSE inclusion flow in the eICU multi-center database used to validate the prediction model;

Figure 11 shows the ROC curve of the prediction model validated on eICU;

Figure 12 compares the performance of the prediction model validated on eICU with commonly used clinical scores;

Figure 13 shows the ROC curve of the prediction model using the top 20 features, validated on eICU;

Figure 14 compares the performance of the prediction model using the top 20 features, validated on eICU, with the clinical scores;

Figure 15 is a schematic structural diagram of the ensemble-learning-based early death risk assessment device for multiple organ failure in the elderly;

Figure 16 shows a gradient boosted decision tree of the prediction model.

Detailed Description

The method and device of this application for establishing an early death risk assessment model based on ensemble learning are described in detail below with reference to the accompanying drawings.

The disease/outcome prediction model and device proposed by the invention, developed from electronic health records, are mainly used for early prediction of the probability (risk) that an elderly patient with multiple organ failure will have an adverse outcome during hospitalization. The aim is to develop a risk assessment model that has been validated on large-sample datasets and can be deployed clinically, so that disease severity in elderly patients with multiple organ failure is assessed early and fully automatically, helping doctors intervene and treat patients at risk of deterioration as early as possible. The invention uses large-sample data accumulated over many years in electronic health records, so the model can be developed quickly, effectively, and at low cost, which avoids the time, labor, and expense of randomized controlled clinical studies. The method was validated on a separate multi-center intensive care dataset, which confirmed its generalizability and robustness; it maintains good predictive performance with fewer input parameters; and it is packaged so that a patient's risk (probability) of an adverse outcome is calculated fully automatically.

The method proposed in the invention comprises three main modules: (1) dataset construction; (2) data processing, in which data are extracted, processed, and turned into features according to the study population and study features determined in step (1); and (3) model construction and evaluation, in which the model is trained and validated on the dataset obtained in step (2). Step (1) labels the MODSE patients in the MIMIC-III database using the clinical criteria for assessing whether a patient has multiple organ failure. Step (2) retrieves the required information from the MIMIC-III database according to the conditions and scope given in step (1); statistical features are extracted so that the model can compute efficiently and learn from richer information. Step (3) trains, optimizes, and internally validates the model on the data obtained in step (2), and then externally validates the model on the eICU database.

The EHR-based method proposed in the invention for assessing the early death risk of multiple organ failure in the elderly has better predictive performance than the baseline models and commonly used clinical scores, and so provides a more accurate way to assess a patient's risk of deterioration. Its generalizability and robustness have been validated on what is, to our knowledge, the largest sample dataset to date. With reduced input the model still performs well and outperforms the clinical scores, so its generalizability and robustness are confirmed from several angles. The method also yields a ranking of the risk factors associated with disease deterioration, which helps doctors understand the development of MODSE in more depth. Finally, the method has built-in parallel computing and obtains a patient's early in-hospital death risk automatically and quickly.

本发明提出的一种基于电子健康档案的老年多器官功能衰竭早期死亡风险评估方法的具体的实现如图1所示,包括以下步骤:The specific implementation of an electronic health record-based early death risk assessment method for multiple organ failure in the elderly proposed by the present invention is shown in Figure 1, including the following steps:

一、本发明的中的数据集构建模块过程如下:One, the data set building module process in the present invention is as follows:

First, MODSE patients (elderly patients with multiple organ failure) in the MIMIC-III database are identified using the clinical criterion for diagnosing multiple organ failure, SOFA ≥ 2 with abnormalities in at least two organ systems, together with the international definition of elderly (age ≥ 65 years). Next, the study population is selected from the database following the MODSE patient inclusion process shown in Figure 2. The inclusion criteria are: ① age ≥ 65 years; ② ICU length of stay ≥ 24 hours; ③ first hospitalization and first ICU admission; ④ MODS occurred during hospitalization (SOFA ≥ 2, with failure of at least two organ systems); ⑤ each of the following was measured at least once during the first day in the ICU: heart rate (HR), respiratory rate (RR), mean arterial pressure (MAP), Glasgow Coma Scale score (GCS), body temperature (T), and blood oxygen saturation (SpO2). Criteria ① to ⑤ yield a final cohort of 15,804 patients, of whom 2,353 died. Finally, the study features are determined. Drawing on clinicians' experience and the literature, four categories of data are included: demographics, vital signs (with the shock index, defined as pulse divided by systolic blood pressure, added), laboratory tests, and clinical data (scores, medical interventions, and output); their contents are detailed in Section 2. In addition, seven indicators of clinical importance are selected to visualize the disease trajectories of MODSE patients who died in hospital and of those who survived (Figure 3).
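The inclusion criteria ① to ⑤ and the shock index can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Stay` record, its field names, and the helpers `is_eligible` and `shock_index` are hypothetical and do not come from the patent or from the MIMIC-III schema.

```python
from dataclasses import dataclass

# Vital signs that must each be measured at least once on ICU day 1 (criterion 5).
REQUIRED_VITALS = {"HR", "RR", "MAP", "GCS", "T", "SpO2"}


@dataclass
class Stay:
    """Hypothetical per-patient summary used only for this sketch."""
    age: float              # years
    icu_hours: float        # ICU length of stay in hours
    first_admission: bool   # first hospitalization and first ICU admission
    sofa: int               # SOFA score during hospitalization
    failing_systems: int    # number of organ systems in failure
    day1_vitals: set        # names of vitals measured on ICU day 1


def is_eligible(s: Stay) -> bool:
    """Apply inclusion criteria 1 to 5 as listed in the text."""
    return (
        s.age >= 65
        and s.icu_hours >= 24
        and s.first_admission
        and s.sofa >= 2
        and s.failing_systems >= 2
        and REQUIRED_VITALS <= s.day1_vitals
    )


def shock_index(pulse: float, systolic_bp: float) -> float:
    """Shock index as defined in the text: pulse divided by systolic pressure."""
    return pulse / systolic_bp
```

For example, a 70-year-old first-admission patient with a 30-hour ICU stay, SOFA 4, two failing systems, and all six vitals recorded passes the filter, while the same patient aged 64 does not.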

2. The data processing module of the present invention works as follows:

First, based on the study population and features determined in Section 1, PostgreSQL queries are written to extract from the MIMIC-III database all relevant information from each patient's first 24 hours in the ICU. This is done in three steps: ① The data format is standardized (for example, glucose has itemid 50809 or 50931, and heart rate has itemid 211 or 220045); entries that mix strings and numeric values are cleaned; and outliers are removed according to the physiological limits that clinicians specify for each indicator. ② The laboratory tests, vital signs, and clinical indicators listed in Section 1 are downsampled to hourly resolution: when several values fall within one hour their mean is taken, except for urine output, for which the hourly sum is used. ③ The downsampled data are then imputed. If a feature is missing for ≤ 30% of the study population, missing values are filled with the population median. If it is missing for > 30%, missing values are filled with the population mean and a flag variable is added for that feature (a 0/1 label meaning not measured/measured);
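The three processing steps above can be sketched as small Python helpers. This is a minimal sketch, not the patent's code: the function names are hypothetical, the physiological limits passed to `drop_outliers` are whatever clinicians supply, and `impute_feature` assumes at least one patient has an observed value.

```python
from statistics import mean, median

MISSING_THRESHOLD = 0.30  # 30% cut-off stated in the text


def drop_outliers(readings, low, high):
    """Step 1: keep only readings inside the physiological range [low, high]."""
    return [r for r in readings if low <= r <= high]


def hourly_value(readings, use_sum=False):
    """Step 2: collapse one hour of readings (mean, or sum for urine output)."""
    return sum(readings) if use_sum else mean(readings)


def impute_feature(values):
    """Step 3: impute one feature across the population.

    `values` holds one entry per patient, None where the feature is missing.
    Returns (imputed_values, flags); flags is None when the missing ratio
    is at most 30%, otherwise a 0/1 list (0 = not measured, 1 = measured).
    """
    observed = [v for v in values if v is not None]
    missing_ratio = 1 - len(observed) / len(values)
    if missing_ratio <= MISSING_THRESHOLD:
        fill, flags = median(observed), None
    else:
        fill = mean(observed)
        flags = [0 if v is None else 1 for v in values]
    return [fill if v is None else v for v in values], flags
```

With values `[1.0, 2.0, 9.0, None]` (25% missing) the gap is filled with the median 2.0 and no flag is produced; with `[1.0, None, 3.0, None]` (50% missing) the gaps are filled with the mean 2.0 and the flag column is `[1, 0, 1, 0]`.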

Next, features are constructed from the processed data over the observation window shown in Figure 3 (the first 24 hours after ICU admission). The information in each patient's observation window is summarized as follows: ① demographic features (8): original values are kept; ② vital-sign statistics (25): maximum, minimum, and mean; ③ laboratory-test statistics: maximum, minimum, and a flag (for example, lactate_flag indicates whether lactate was measured, 0/1); ④ clinical features (16): maximum, minimum, mean, sum (urine output), and a flag (for example, ventilation_flag indicates whether mechanical ventilation was given, 0/1). Table 1 summarizes all features;
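The window statistics described above reduce each time series to a handful of named features. The sketch below assumes the `<variable>_<statistic>` naming that appears in the text (for example `gcs_max`, `uo_sum`, `lactate_flag`); the helper names `window_features` and `measured_flag` are hypothetical.

```python
from statistics import mean

# Statistic name -> aggregation function over the 24-hour observation window.
_STATS = {"max": max, "min": min, "mean": mean, "sum": sum}


def window_features(name, readings, stats=("max", "min", "mean")):
    """Summarize one variable's hourly readings into named statistics."""
    return {f"{name}_{s}": _STATS[s](readings) for s in stats}


def measured_flag(name, readings):
    """0/1 flag recording whether the variable was measured at all."""
    return {f"{name}_flag": 1 if readings else 0}
```

Vital signs would use the default statistics, while urine output would be called with `stats=("sum", "mean")` to obtain `uo_sum` and `uo_mean`.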

Finally, to prepare for model training, each patient's outcome is labeled (0: survived, 1: died).

Table 1. Features included in the prediction model and outcome labels

[Table 1 is reproduced as an image in the original publication: Figure BDA0002282327790000091]

3. The model construction and evaluation module of the present invention works as follows:

First, the processed data from Section 2 are fed into the selected model, XGBoost, an ensemble model that is computationally fast and offers some interpretability. The dataset is split into 80% for training and validation and 20% for testing; the 80% portion is further split into 80% training data and 20% validation data, and cross-validation is used to tune the XGBoost hyperparameters;
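The nested 80/20 split described above can be illustrated as follows. This is a sketch under assumptions: the function name `nested_split`, the fixed seed, and plain random shuffling are illustrative choices, and the sketch shows a single split rather than the cross-validation used for hyperparameter tuning.

```python
import random


def nested_split(ids, test_frac=0.2, val_frac=0.2, seed=0):
    """Split ids into train/validation/test.

    First 20% is held out as the test set; the remaining 80% is split
    again into 80% training and 20% validation.
    """
    ids = list(ids)
    random.Random(seed).shuffle(ids)  # reproducible shuffle
    n_test = int(len(ids) * test_frac)
    test, dev = ids[:n_test], ids[n_test:]
    n_val = int(len(dev) * val_frac)
    val, train = dev[:n_val], dev[n_val:]
    return train, val, test
```

For 100 patients this gives 64 training, 16 validation, and 20 test records, that is, 64%, 16%, and 20% of the whole dataset.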

Next, the performance of the XGBoost prediction model is evaluated. The evaluation metrics are AUC, specificity, sensitivity, accuracy, F1 score, and robustness. The reference standards are the baseline models (LR, NN, SVM, RF, NB) and the scores commonly used in clinical practice to assess the severity of illness in critically ill patients (OASIS, APACHE-IV, MODS, SOFA, SAPS, MEWS, SIRS, qSOFA, Charlson Comorbidity). Evaluation uses internal validation (the model is trained on the MIMIC-III dataset and evaluated on the held-out 20% test set) and external validation (the model developed on MIMIC-III is evaluated on the entire study population of the eICU dataset). The process is described in more detail below, together with the prediction results:

1) Interpretation of the internal validation results of the XGBoost prediction model

Figure 4 shows the ROC curve of the XGBoost prediction model using all 107 features; the AUC is 0.866. Figure 5 compares the ROC curves of this method and the five baseline models; XGBoost performs best. Table 2 gives the detailed metrics: XGBoost achieves an AUC of 0.866, a sensitivity of 0.774, a specificity of 0.808, an accuracy of 0.877, and an F1 score of 0.680. Figure 6 compares the ROC curves of this method and 10 commonly used clinical scores; XGBoost again performs best. Table 3 gives the detailed metrics for the comparison with the clinical scores. The XGBoost model can also rank its features by importance, which makes it interpretable: the ranking tells clinicians which indicators deserve attention early in the deterioration of MODSE patients. Figure 7 shows the 30 most important features. The top 10 are the maximum Glasgow score (gcs_max), mean respiratory rate (rr_mean), mean Glasgow score (gcs_mean), mean systolic blood pressure (sbp_mean), total urine output (uo_sum), weight (weight), minimum blood urea nitrogen (bun_min), age (age), mean urine output (uo_mean), and maximum shock index (si_max). Table 4 lists the names and importance values of the top 30 features. In addition, to test whether the model can be applied in institutions with different levels of medical resources, the number of input features is limited to 20 (@20). Figure 8 shows the ROC curve of this reduced model, with an AUC of 0.857. Figure 9 compares its ROC curve with those of two well-performing clinical scores (OASIS, AUC = 0.752; APACHE-IV, AUC = 0.704). Although the reduced model predicts slightly less well than the full-feature model, it still far outperforms the clinical scores.
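Selecting the reduced input set (@20) amounts to sorting features by importance and keeping the first k. The sketch below is illustrative only: `top_k_features` is a hypothetical helper, and the importance values in the usage example are placeholders, not the values reported in Table 4.

```python
def top_k_features(importance, k=20):
    """Return the names of the k features with the highest importance."""
    ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Placeholder scores for illustration; only the feature names come from the text.
example_importance = {
    "rr_mean": 0.3,
    "gcs_max": 0.4,
    "sbp_mean": 0.1,
    "gcs_mean": 0.2,
}
top_two = top_k_features(example_importance, k=2)
```

The resulting list would then be used to subset the feature table before retraining the model on the reduced inputs.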

Table 2. Performance comparison between the XGBoost prediction model and the baseline models

Model     AUC    Sensitivity  Specificity  Accuracy  F1 score
XGBoost   0.866  0.774        0.808        0.877     0.680
NN        0.848  0.802        0.743        0.867     0.601
LR        0.847  0.841        0.703        0.871     0.674
SVM       0.834  0.763        0.773        0.853     0.460
RF        0.795  0.634        0.819        0.859     0.597
NB        0.778  0.696        0.742        0.819     0.664
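The metrics reported in Table 2 follow standard definitions, which the following sketch makes explicit. It is not the patent's evaluation code: the function names are hypothetical, labels use the document's convention (1 = died, 0 = survived), both classes are assumed to be present, and AUC is computed by direct pairwise comparison, which is only practical for small samples.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy, and F1 from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


def auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Sensitivity, specificity, accuracy, and F1 depend on the probability threshold used to turn scores into predictions, whereas AUC is threshold-free.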

Table 3. Performance comparison between the XGBoost prediction model and commonly used clinical scores

[Table 3 is reproduced as images in the original publication: Figure BDA0002282327790000101, Figure BDA0002282327790000111]

Table 4. The 30 most important features of the XGBoost prediction model

[Table 4 is reproduced as images in the original publication: Figure BDA0002282327790000112, Figure BDA0002282327790000121]

2) Interpretation of the external validation results of the XGBoost prediction model

This part uses eICU, a large multi-center dataset, to verify the generalizability and robustness of the MODSE early death risk assessment model developed on the MIMIC-III dataset. The first stage is dataset construction: the study population is selected from the eICU dataset by the same process used for MIMIC-III (see Figure 10 for the inclusion process). The final population comprises 34,523 MODSE patients, of whom 3,966 died, and the same study features are included as for MIMIC-III. The second stage is data processing, which follows the same procedure as for MIMIC-III. The last stage is external validation of the model on the eICU dataset. The ROC curve is shown in Figure 11, with an AUC of 0.837, which supports the generalizability and robustness of the model. Figure 12 compares the ROC curves of this method and the clinical scores (OASIS, APACHE-IV); the method again clearly outperforms the clinical scores (OASIS, AUC = 0.746; APACHE-IV, AUC = 0.742). A reduced set of input features (the 20 most important) is also used to assess generalizability and robustness. Figure 13 shows the external validation result of the reduced-input model on the eICU dataset, with an AUC of 0.828. Figure 14 compares its ROC curve with those of the clinical scores. Taken together, Figures 12 and 14 show that although performance drops slightly with fewer input features, the model still performs well.

Table 5 summarizes the performance of the prediction model with all input features, the prediction model with a subset of input features (@20), and the 10 clinical scores.

Table 5. Validation of the prediction model on the eICU dataset, compared with clinical scores

Name                  AUC    Sensitivity  Specificity
XGBoost_all_features  0.837  0.763        0.752
XGBoost_20_features   0.828  0.733        0.769
MODS                  0.763  0.719        0.667
OASIS                 0.746  0.673        0.694
APACHE-IV             0.742  0.635        0.751
APSIII                0.736  0.663        0.716
SOFA                  0.731  0.684        0.673
MEWS                  0.719  0.614        0.712
SAPS                  0.693  0.569        0.717
SIRS                  0.670  0.734        0.531
qSOFA                 0.675  0.597        0.710
Charlson Comorbidity  0.553  0.5229       0.569

The fully automatic risk assessment device proposed by the present invention is shown in Figure 15. Balancing predictive performance (specificity, sensitivity, accuracy, and so on) against applicability, the 20 features with the highest importance (see Table 4) are selected as the inputs of the final prediction model. The model's hyperparameters are set to {'base_score':0.5,'booster':'gbtree','colsample_bylevel':1,'colsample_bytree':1,'gamma':0,'learning_rate':0.1,'max_delta_step':0,'max_depth':3,'min_child_weight':1,'missing':None,'n_estimators':100,'n_jobs':-1,'nthread':None,'objective':'binary:logistic','random_state':0,'reg_alpha':0,'reg_lambda':1,'scale_pos_weight':1,'seed':None,'silent':1,'subsample':1,'eval_metric':'auc','tree_method':'approx'}.

To make the model easier to understand, the gradient-boosted decision tree of the prediction model is plotted in Figure 16.

Industrial applicability

The ensemble-learning-based method and device of the present application for establishing an early death risk assessment model fully automate data preprocessing, feature calculation, and model computation. Once the required diagnosis and treatment data for one or more patients are received, the device automatically and quickly predicts whether each patient will have an adverse outcome during hospitalization. This helps physicians assess disease severity more comprehensively and provide earlier care and treatment to high-risk patients.

Claims (10)

1. A method for establishing an early death risk assessment model based on ensemble learning, wherein the early death risk assessment model is used to assess the risk of death during hospitalization of an elderly patient with multiple organ failure, based on diagnosis and treatment data from the patient's first day in an intensive care unit; the assessment model is an XGBoost model; the method comprises the following steps:
a dataset construction step: using a dataset construction module, identifying multiple organ failure patients in a first database according to a clinical diagnostic definition, and determining the study population and the four types of features to be included in the study, to form a first dataset;
a data processing step: using a data extraction and processing unit of a data processing module, extracting and processing data from the first dataset to form a second dataset, wherein the second dataset is divided into a training set, a validation set, and a test set; using a feature construction unit of the data processing module, constructing features from the data to be input into the risk assessment model so that the data meet the input requirements of the risk assessment model; and using a patient outcome labeling unit of the data processing module, labeling the patient outcomes;
a model construction and evaluation step: training an XGBoost model as the early prediction model, using an early stopping mechanism with a set number of early stopping rounds and AUC as the performance metric, and testing model performance on the validation set; terminating training when performance on the validation set has not improved for a preset number of consecutive iterations, thereby obtaining the optimal parameters of model training; and evaluating the predictive performance of the trained model on the test set data to guard against overfitting;
using the final model as the early death risk assessment model.
2. The ensemble learning-based early mortality risk assessment model building method of claim 1, wherein:
the four types of features determined for inclusion in the study are:
demographic features, comprising: admission type, age, BMI, race, type of intensive care unit admitted to, sex, height, weight;
vital sign features, comprising: central venous pressure, diastolic pressure, heart rate, mean arterial pressure, respiratory rate, systolic pressure, shock index, blood oxygen saturation, body temperature;
laboratory test features, comprising: albumin, alkaline phosphatase, glutamic-pyruvic transaminase, glutamic-oxalacetic transaminase, base excess, bicarbonate, bilirubin, B-type natriuretic peptide, blood urea nitrogen, chloride, creatinine, fibrinogen, fraction of inspired oxygen, blood glucose, hematocrit, hemoglobin, international normalized ratio, lactate, lymphocytes, magnesium, neutrophils, arterial partial pressure of carbon dioxide, oxygenation index, partial pressure of oxygen, pH, platelets, potassium, prothrombin time, thromboplastin time, blood sodium, troponin, white blood cell count;
clinical features, comprising: Glasgow score, systemic inflammatory response syndrome score, the nervous system and respiratory system components of the sepsis-related organ failure assessment (SOFA) score, whether mechanical ventilation was performed, whether continuous renal replacement therapy was performed, total urine output, norepinephrine use rate.
3. The ensemble learning-based early mortality risk assessment model building method according to claim 2, wherein: the data processing step comprises:
the data extraction and processing performed by the data extraction and processing unit of the data processing module comprises: data cleaning, data sampling and data interpolation;
the feature construction performed by the feature construction unit of the data processing module includes: raw values of the demographic features; maximum, minimum, and mean of the vital sign features; maximum, minimum, and raw value of the laboratory test features; and maximum, minimum, mean, sum, and raw value of the clinical features.
4. The ensemble learning-based early mortality risk assessment model building method according to claim 3, wherein:
during performance evaluation, the performance of the model is evaluated through internal validation and external validation; the performance metrics are AUC, specificity, sensitivity, accuracy, F1 score, robustness, and generalizability; the reference standards for the performance evaluation are baseline models and clinical scores; and the final model can produce a ranking of death-related risk factors based on its feature ranking function.
5. The ensemble learning-based early mortality risk assessment model building method of claim 4, wherein:
the baseline model is selected from: logistic regression (LR), support vector machine (SVM), neural network (NN), random forest (RF), and naive Bayes (NB) models;
the clinical score is selected from: Oxford Acute Severity of Illness Score (OASIS), Acute Physiology and Chronic Health Evaluation (APACHE-IV), Acute Physiology Score (APSIII), multiple organ dysfunction syndrome score (MODS), sepsis-related organ failure assessment (SOFA), Simplified Acute Physiology Score (SAPS), Modified Early Warning Score (MEWS), systemic inflammatory response syndrome score (SIRS), quick sequential organ failure assessment (qSOFA), and Charlson Comorbidity Index (Charlson Comorbidity).
6. The ensemble learning-based early mortality risk assessment model building method of claim 4, wherein:
robustness and generalizability are evaluated in the following three respects:
internal validation: comparing the changes in AUC, specificity, sensitivity, accuracy, and F1 score on the validation set and the test set of the second dataset, to obtain the model parameters with optimal performance and to prevent overfitting;
external validation: using a large multi-center dataset, obtaining a corresponding dataset from a second database by the same data processing procedure, comparing the changes in AUC, specificity, sensitivity, accuracy, and F1 score, and comparing them with the internal validation results;
reducing input features: reducing the number of features input into the final model, evaluating the change in AUC of the final model in internal and external validation relative to using all features, and comparing the result with the clinical scores.
7. The ensemble learning-based early mortality risk assessment model building method of claim 1, wherein:
the diagnosis and treatment data from the first day after the elderly multiple organ failure patient enters the intensive care unit comprise:
demographic features: age, weight, BMI;
vital sign features: minimum systolic pressure, mean systolic pressure, maximum shock index, minimum shock index, mean respiratory rate, minimum blood oxygen saturation, and maximum body temperature;
laboratory test indicators: minimum blood urea nitrogen, minimum blood glucose, maximum chloride, maximum partial pressure of oxygen, minimum white blood cell count;
clinical features: total urine output, minimum urine output, maximum Glasgow score, mean Glasgow score.
8. An ensemble learning-based early death risk assessment device for assessing the risk of death during hospitalization of an elderly patient with multiple organ failure, based on diagnosis and treatment data from the patient's first day in an intensive care unit; the device comprises a data preprocessing module, a feature calculation module, and a model operation module;
the data preprocessing module is implemented by a computer and is used to extract and process the diagnosis and treatment data to obtain preprocessed data;
the feature calculation module is implemented by a computer and is used to construct features from the preprocessed data, forming data suitable for input into the model operation module;
the model operation module is implemented by a computer and comprises an early death risk assessment model; the early death risk assessment model is based on an XGBoost model and is obtained by training, tuning, and evaluation using a dataset in a database, and its performance is validated and evaluated on a large multi-center dataset; the data produced by feature construction are input into the model operation module, and the model operation module outputs a prediction of the patient's risk during hospitalization.
9. The ensemble learning-based early mortality risk assessment apparatus according to claim 8, wherein:
the model operation module comprises a model which is an early death risk assessment model established by the ensemble learning-based early death risk assessment model establishing method of any one of claims 1-7.
10. The ensemble learning-based early mortality risk assessment apparatus according to claim 8, wherein:
the extraction and processing comprises cleaning, sampling and interpolation of the diagnosis and treatment data.
CN201911146419.7A 2019-11-21 2019-11-21 Early death risk assessment model establishing method and device based on ensemble learning Pending CN110827993A (en)


Publications (1)

Publication Number Publication Date
CN110827993A true CN110827993A (en) 2020-02-21

Family

ID=69557584

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911146419.7A Pending CN110827993A (en) 2019-11-21 2019-11-21 Early death risk assessment model establishing method and device based on ensemble learning

Country Status (1)

Country Link
CN (1) CN110827993A (en)

Cited By (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111584087A (en) * 2020-05-22 2020-08-25 肾泰网健康科技(南京)有限公司 Method and model for prognosis prediction model of AAV (adeno-associated Virus) different treatment schemes based on AI (AI) technology
CN111599466A (en) * 2020-05-13 2020-08-28 上海森亿医疗科技有限公司 Method, device, terminal and medium for predicting septic shock of child hematologic tumor patient
CN111599465A (en) * 2020-05-13 2020-08-28 上海森亿医疗科技有限公司 Method, device, terminal and medium for predicting etiology type of children community acquired pneumonia
CN111612278A (en) * 2020-06-01 2020-09-01 戴松世 Life state prediction method, device, electronic device and storage medium
CN111714135A (en) * 2020-06-05 2020-09-29 安徽华米信息科技有限公司 Method and device for determining blood oxygen saturation
CN111816319A (en) * 2020-07-16 2020-10-23 山东大学 A step-by-step screening method for the determination of critical disease indicators of the urinary system and a risk prediction system
CN111816307A (en) * 2020-04-15 2020-10-23 浙江大学 Method and evaluation method of constructing biological age evaluation model of Chinese population based on clinical markers
CN112185560A (en) * 2020-09-27 2021-01-05 江苏省人民医院(南京医科大学第一附属医院) Early prediction method and system for prognosis risk degree of patient infected with COVID-19
CN112205965A (en) * 2020-08-28 2021-01-12 北京大学 Health risk key event detection method and system based on time window cutting
CN112216382A (en) * 2020-10-14 2021-01-12 天津医科大学总医院 Biochemical index information processing system, method, equipment, storage medium and detector
CN112270478A (en) * 2020-10-30 2021-01-26 重庆富民银行股份有限公司 Management method and platform for wind control model competition
CN112420196A (en) * 2020-11-20 2021-02-26 长沙市弘源心血管健康研究院 Prediction method and system for survival rate of acute myocardial infarction patient within 5 years
CN112466469A (en) * 2020-12-08 2021-03-09 杭州脉兴医疗科技有限公司 Major crisis and death risk prediction method
CN112820368A (en) * 2021-01-13 2021-05-18 中国人民解放军国防科技大学 Method, system, device and storage medium for constructing critical patient data set
CN112837826A (en) * 2020-12-30 2021-05-25 浙江大学温州研究院 A method and system for scoring severe sequential organ failure based on machine learning
CN112908480A (en) * 2021-03-17 2021-06-04 上海电气集团股份有限公司 Organ failure early warning method and system, electronic equipment and storage medium
CN112967803A (en) * 2021-01-29 2021-06-15 成都一尧科技有限公司 Early mortality prediction method and system for emergency patients based on integrated model
CN113057589A (en) * 2021-03-17 2021-07-02 上海电气集团股份有限公司 Method and system for predicting organ failure infection diseases and training prediction model
CN113160986A (en) * 2021-04-23 2021-07-23 桥恩(北京)生物科技有限公司 Model construction method and system for predicting development of systemic inflammatory response syndrome
CN113409947A (en) * 2021-07-29 2021-09-17 四川大学华西医院 New coronary pneumonia severe change prediction model and system, and establishment method and prediction method thereof
CN113436743A (en) * 2021-06-30 2021-09-24 平安科技(深圳)有限公司 Multi-outcome efficacy prediction method and device based on expression learning and storage medium
CN113593708A (en) * 2021-07-12 2021-11-02 杭州电子科技大学 Sepsis prognosis prediction method based on integrated learning algorithm
CN113658693A (en) * 2021-05-17 2021-11-16 娄底市中心医院 Model system for predicting exacerbation risk of new coronary pneumonia patient
CN113707295A (en) * 2021-08-24 2021-11-26 中山大学附属第三医院(中山大学肝脏病医院) Prediction method and system for senile postoperative systemic inflammatory response syndrome
CN113782197A (en) * 2021-08-03 2021-12-10 中国人民解放军总医院第一医学中心 Novel coronary pneumonia patient outcome prediction method based on interpretable machine learning algorithm
CN113838577A (en) * 2021-11-08 2021-12-24 北京航空航天大学 Convenient layered old people MODS early death risk assessment model, device and establishment method
CN113871006A (en) * 2021-09-03 2021-12-31 华中科技大学 Method and system for scoring survival probability based on sepsis patient detection information
CN114023441A (en) * 2021-11-08 2022-02-08 中国人民解放军总医院 Severe AKI early risk assessment model and device based on interpretable machine learning model and development method thereof
CN114023440A (en) * 2021-11-08 2022-02-08 中国人民解放军总医院 An interpretable stratified model for early mortality risk assessment in elderly MODS, a device and its establishment method
CN114188014A (en) * 2021-09-28 2022-03-15 中国医学科学院阜外医院 A method, system and application for constructing a prediction model for poor prognosis in a patient's hospital
CN114520054A (en) * 2020-11-18 2022-05-20 英业达科技有限公司 Heart failure prediction module and heart failure prediction method
CN114913982A (en) * 2022-07-18 2022-08-16 之江实验室 End-stage renal disease complication risk prediction system based on contrast learning
CN114927230A (en) * 2022-04-11 2022-08-19 四川大学华西医院 Machine learning-based severe heart failure patient prognosis decision support system and method
WO2022226890A1 (en) * 2021-04-29 2022-11-03 京东方科技集团股份有限公司 Disease prediction method and apparatus, electronic device, and computer-readable storage medium
CN115810425A (en) * 2022-11-30 2023-03-17 广州中医药大学第一附属医院 Method and device for predicting mortality risk level of septic shock patient
CN116580836A (en) * 2023-05-11 2023-08-11 中国人民解放军总医院 A device, system and medium for assessing the risk of adverse in-hospital outcomes in elderly patients
CN117497182A (en) * 2023-08-02 2024-02-02 上海长征医院 Traumatic brain injury ending prediction system based on machine learning and physical sign time sequence
CN117672495A (en) * 2023-11-30 2024-03-08 北京医院 Artificial intelligence-based prediction method for long-term mortality in patients with atrial fibrillation and coronary heart disease
CN118016285A (en) * 2023-12-18 2024-05-10 中国医学科学院北京协和医院 Method and system for predicting hospitalization time of intensive care patient based on artificial intelligence
CN118538408A (en) * 2024-05-16 2024-08-23 深圳市人民医院 Method, apparatus, device and storage medium for assessing risk of sarcopenia

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190080057A1 (en) * 2017-09-12 2019-03-14 Michael Stanley Toxicity or adverse effect of a substance predicting automated system and method of training thereof

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190080057A1 (en) * 2017-09-12 2019-03-14 Michael Stanley Toxicity or adverse effect of a substance predicting automated system and method of training thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
蔺轲 等: "基于XGBoost算法的ICU脓毒症患者住院死亡风险预测研究", 《中国卫生信息管理杂志》 *
虎磐 等: "基于集成机器学习的ICU老年多器官功能不全早期死亡风险预测模型", 《解放军医学院学报》 *

Cited By (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111816307A (en) * 2020-04-15 2020-10-23 浙江大学 Method and evaluation method of constructing biological age evaluation model of Chinese population based on clinical markers
CN111599466A (en) * 2020-05-13 2020-08-28 上海森亿医疗科技有限公司 Method, device, terminal and medium for predicting septic shock of child hematologic tumor patient
CN111599465A (en) * 2020-05-13 2020-08-28 上海森亿医疗科技有限公司 Method, device, terminal and medium for predicting etiology type of children community acquired pneumonia
CN111584087B (en) * 2020-05-22 2023-03-10 肾泰网健康科技(南京)有限公司 AI technology-based method and model for predicting prognosis of AAV (adeno-associated virus) in different treatment schemes
CN111584087A (en) * 2020-05-22 2020-08-25 肾泰网健康科技(南京)有限公司 AI technology-based method and model for predicting prognosis of AAV (adeno-associated virus) in different treatment schemes
CN111612278A (en) * 2020-06-01 2020-09-01 戴松世 Life state prediction method, device, electronic device and storage medium
CN111714135B (en) * 2020-06-05 2022-06-10 合肥华米微电子有限公司 Method and device for determining blood oxygen saturation
CN111714135A (en) * 2020-06-05 2020-09-29 安徽华米信息科技有限公司 Method and device for determining blood oxygen saturation
CN111816319A (en) * 2020-07-16 2020-10-23 山东大学 A step-by-step screening method for the determination of critical disease indicators of the urinary system and a risk prediction system
CN112205965A (en) * 2020-08-28 2021-01-12 北京大学 Health risk key event detection method and system based on time window cutting
CN112185560A (en) * 2020-09-27 2021-01-05 江苏省人民医院(南京医科大学第一附属医院) Early prediction method and system for prognosis risk degree of patient infected with COVID-19
CN112216382A (en) * 2020-10-14 2021-01-12 天津医科大学总医院 Biochemical index information processing system, method, equipment, storage medium and detector
CN112270478A (en) * 2020-10-30 2021-01-26 重庆富民银行股份有限公司 Management method and platform for risk control model competition
CN112270478B (en) * 2020-10-30 2023-06-09 重庆富民银行股份有限公司 Management method and platform for risk control model competition
CN114520054A (en) * 2020-11-18 2022-05-20 英业达科技有限公司 Heart failure prediction module and heart failure prediction method
CN112420196A (en) * 2020-11-20 2021-02-26 长沙市弘源心血管健康研究院 Prediction method and system for survival rate of acute myocardial infarction patient within 5 years
CN112466469A (en) * 2020-12-08 2021-03-09 杭州脉兴医疗科技有限公司 Major crisis and death risk prediction method
CN112837826A (en) * 2020-12-30 2021-05-25 浙江大学温州研究院 A method and system for scoring severe sequential organ failure based on machine learning
CN112820368A (en) * 2021-01-13 2021-05-18 中国人民解放军国防科技大学 Method, system, device and storage medium for constructing critical patient data set
CN112967803A (en) * 2021-01-29 2021-06-15 成都一尧科技有限公司 Early mortality prediction method and system for emergency patients based on integrated model
CN113057589A (en) * 2021-03-17 2021-07-02 上海电气集团股份有限公司 Method and system for predicting organ failure infection diseases and training prediction model
CN112908480A (en) * 2021-03-17 2021-06-04 上海电气集团股份有限公司 Organ failure early warning method and system, electronic equipment and storage medium
CN113160986A (en) * 2021-04-23 2021-07-23 桥恩(北京)生物科技有限公司 Model construction method and system for predicting development of systemic inflammatory response syndrome
CN113160986B (en) * 2021-04-23 2023-12-15 桥恩(北京)生物科技有限公司 Model construction method and system for predicting development of systemic inflammatory response syndrome
WO2022226890A1 (en) * 2021-04-29 2022-11-03 京东方科技集团股份有限公司 Disease prediction method and apparatus, electronic device, and computer-readable storage medium
CN113658693A (en) * 2021-05-17 2021-11-16 娄底市中心医院 Model system for predicting exacerbation risk of COVID-19 patients
CN113436743B (en) * 2021-06-30 2023-06-23 平安科技(深圳)有限公司 Representation learning-based multi-outcome efficacy prediction method, device and storage medium
CN113436743A (en) * 2021-06-30 2021-09-24 平安科技(深圳)有限公司 Representation learning-based multi-outcome efficacy prediction method, device and storage medium
CN113593708A (en) * 2021-07-12 2021-11-02 杭州电子科技大学 Sepsis prognosis prediction method based on ensemble learning algorithm
CN113409947A (en) * 2021-07-29 2021-09-17 四川大学华西医院 COVID-19 severe progression prediction model and system, and establishment method and prediction method thereof
CN113782197A (en) * 2021-08-03 2021-12-10 中国人民解放军总医院第一医学中心 COVID-19 patient outcome prediction method based on interpretable machine learning algorithm
CN113707295A (en) * 2021-08-24 2021-11-26 中山大学附属第三医院(中山大学肝脏病医院) Prediction method and system for senile postoperative systemic inflammatory response syndrome
CN113871006B (en) * 2021-09-03 2024-09-10 华中科技大学 Method and system for scoring survival probability based on sepsis patient detection information
CN113871006A (en) * 2021-09-03 2021-12-31 华中科技大学 Method and system for scoring survival probability based on sepsis patient detection information
CN114188014A (en) * 2021-09-28 2022-03-15 中国医学科学院阜外医院 A method, system and application for constructing a prediction model for poor in-hospital prognosis of patients
CN114023441A (en) * 2021-11-08 2022-02-08 中国人民解放军总医院 Severe AKI early risk assessment model and device based on interpretable machine learning model and development method thereof
CN114023440A (en) * 2021-11-08 2022-02-08 中国人民解放军总医院 An interpretable stratified model for early mortality risk assessment in elderly MODS, a device and its establishment method
CN113838577A (en) * 2021-11-08 2021-12-24 北京航空航天大学 Convenient layered old people MODS early death risk assessment model, device and establishment method
CN114927230A (en) * 2022-04-11 2022-08-19 四川大学华西医院 Machine learning-based severe heart failure patient prognosis decision support system and method
CN114927230B (en) * 2022-04-11 2023-05-23 四川大学华西医院 Prognosis decision support system and method for severe heart failure patient based on machine learning
CN114913982B (en) * 2022-07-18 2022-10-11 之江实验室 A risk prediction system for end-stage renal disease complications based on contrastive learning
CN114913982A (en) * 2022-07-18 2022-08-16 之江实验室 End-stage renal disease complication risk prediction system based on contrastive learning
US11875882B1 (en) 2022-07-18 2024-01-16 Zhejiang Lab System for predicting end-stage renal disease complication risk based on contrastive learning
CN115810425B (en) * 2022-11-30 2023-12-08 广州中医药大学第一附属医院 Method and device for predicting mortality risk level of septic shock patient
CN115810425A (en) * 2022-11-30 2023-03-17 广州中医药大学第一附属医院 Method and device for predicting mortality risk level of septic shock patient
CN116580836A (en) * 2023-05-11 2023-08-11 中国人民解放军总医院 A device, system and medium for assessing the risk of adverse in-hospital outcomes in elderly patients
CN117497182A (en) * 2023-08-02 2024-02-02 上海长征医院 Traumatic brain injury outcome prediction system based on machine learning and physical sign time series
CN117672495A (en) * 2023-11-30 2024-03-08 北京医院 Artificial intelligence-based prediction method for long-term mortality in patients with atrial fibrillation and coronary heart disease
CN117672495B (en) * 2023-11-30 2024-05-14 北京医院 Atrial fibrillation combined coronary heart disease patient long-term mortality prediction method based on artificial intelligence
CN118016285A (en) * 2023-12-18 2024-05-10 中国医学科学院北京协和医院 Method and system for predicting hospitalization time of intensive care patient based on artificial intelligence
CN118538408A (en) * 2024-05-16 2024-08-23 深圳市人民医院 Method, apparatus, device and storage medium for assessing risk of sarcopenia

Similar Documents

Publication Publication Date Title
CN110827993A (en) Early death risk assessment model establishing method and device based on ensemble learning
Desautels et al. Prediction of sepsis in the intensive care unit with minimal electronic health record data: a machine learning approach
García-Gallo et al. A machine learning-based model for 1-year mortality prediction in patients admitted to an Intensive Care Unit with a diagnosis of sepsis
Maas et al. Predicting outcome after traumatic brain injury
JP6049620B2 (en) Medical scoring system and method
CN110051324B (en) Method and system for predicting death rate of acute respiratory distress syndrome
CN114023441A (en) Severe AKI early risk assessment model and device based on interpretable machine learning model and development method thereof
CN115527678A (en) Nomogram ICU (intensive care unit) elderly disease risk scoring model and device fusing medical history texts and establishing method thereof
CN111297329B (en) Method and system for predicting dynamic risk of cardiovascular complications in diabetic patients
Hu et al. Explainable machine-learning model for prediction of in-hospital mortality in septic patients requiring intensive care unit readmission
Ding et al. Mortality prediction for ICU patients combining just-in-time learning and extreme learning machine
CN116543902A (en) An interpretable death risk assessment model, device and establishment method for critically ill children
CN117198532A (en) A machine learning-based sepsis risk prediction method and system for ICU patients
WO2017165693A1 (en) Use of clinical parameters for the prediction of sirs
CN114023440A (en) An interpretable stratified model for early mortality risk assessment in elderly MODS, a device and its establishment method
CN115101199A (en) Interpretable fair early death risk assessment model and device for critically ill elderly patients and establishment method thereof
Wang et al. Predictive classification of ICU readmission using weight decay random forest
CN113128654A (en) Improved random forest model for coronary heart disease pre-diagnosis and pre-diagnosis system thereof
Scholz et al. Outcome prediction in critical care: physicians’ prognoses vs. scoring systems
Liu et al. A machine learning method for predicting the probability of MODS using only non-invasive parameters
US20220068492A1 (en) System and method for selecting required parameters for predicting or detecting a medical condition of a patient
Zhou et al. An early sepsis prediction model utilizing machine learning and unbalanced data processing in a clinical context
US20240266062A1 (en) Disease risk evaluation method, disease risk evaluation system, and health information processing device
Wang et al. Method of non-invasive parameters for predicting the probability of early in-hospital death of patients in intensive care unit
CN116580836A (en) A device, system and medium for assessing the risk of adverse in-hospital outcomes in elderly patients

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200221
