CN118398211A - Kidney stone risk prediction method and system based on CT radiomics
- Publication number
- CN118398211A (application CN202410513073.4A)
- Authority
- CN
- China
- Prior art keywords
- radiomics
- risk
- features
- target
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/12—Edge-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/136—Segmentation; Edge detection involving thresholding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/766—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
- G16H30/40—ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20092—Interactive image processing based on input by user
- G06T2207/20104—Interactive definition of region of interest [ROI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30084—Kidney; Renal
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Public Health (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Epidemiology (AREA)
- Primary Health Care (AREA)
- Artificial Intelligence (AREA)
- Quality & Reliability (AREA)
- Biomedical Technology (AREA)
- Data Mining & Analysis (AREA)
- Pathology (AREA)
- Image Analysis (AREA)
Abstract
The present invention provides a method and system for predicting kidney stone risk based on CT radiomics. Multi-planar CT images of multiple patients are collected and delineated to obtain regions of interest; the most relevant target radiomics features are selected with LASSO; candidate models are built with algorithms such as logistic regression and random forest, and the algorithm with the best predictive performance is chosen to build the prediction model. Clinical variables of the patients are collected, and independent risk factors are selected from them by univariate analysis followed by multivariate regression analysis. The target radiomics features are combined with the independent risk factors to obtain combined features, and the prediction model outputs a risk classification result for the combined features. A nomogram built from the combined features is used to explain the classification result. The invention can improve both prediction accuracy and interpretability.
Description
Technical Field
The present invention relates to the technical field of image processing, and in particular to a method and system for predicting kidney stone risk based on CT radiomics.
Background
Urinary stones are the most common disease in urology, with high incidence and recurrence rates, and kidney stones are the most common type. At present, most kidney stone patients either seek care because of symptoms such as renal colic and hematuria, or are found incidentally during physical examinations or the diagnosis and treatment of other diseases. Diagnosis therefore tends to come late, which hinders treatment; studies have reported that about 50% of patients with symptomatic stones require surgical intervention. Prevention is thus crucial to the management of kidney stone disease. Screening high-risk groups is the key to prevention: it enables early intervention, such as dietary adjustment, to reduce incidence, and it enables timely detection of stones so that patients can be treated promptly, which helps control the disease and limit surgical injury.
Computed tomography (CT) is the gold standard for diagnosing kidney stones and the preferred imaging evaluation for patients with renal colic. However, some patients with small stones are missed because of the CT slice thickness or oversight by the radiologist. Abroad, about 2 million people visit emergency departments each year for renal colic or stone-related back pain; about half of them undergo CT, and only 20% are diagnosed with kidney stones. An effective method to assist in screening high-risk groups is therefore urgently needed. Researchers have tried to identify people at high risk of kidney stones through medical imaging, for example by monitoring renal papillary CT values and tracking the data over time.
However, when CT is used to detect kidney stones, a stone whose diameter is smaller than the scan slice thickness or the reconstruction slice thickness cannot be found, and CT alone cannot identify people at high risk of kidney stones. Measuring the renal papillary CT value helps identify high-risk people, but existing studies only show that an elevated renal papillary CT value increases kidney stone risk; they cannot predict the probability of developing stones, which makes the prediction of whether a patient will have kidney stones hard to interpret. In addition, measuring only the renal papillary CT value may omit information and gives low accuracy.
Summary of the Invention
The purpose of the present invention is to address the above deficiencies of the prior art by proposing a kidney stone risk prediction method and system based on CT radiomics that can improve both the accuracy and the interpretability of kidney stone prediction.
In a first aspect, the present invention provides a method for predicting kidney stone risk based on CT radiomics, comprising:
collecting multi-planar CT images of multiple patients, delineating regions of interest on the CT images to obtain region-of-interest results, and extracting target radiomics features from the region-of-interest results;
collecting clinical variables of multiple patients; using a univariate analysis model to obtain, for each clinical variable, the independent risk with which it affects the occurrence of kidney stones; and then using a multivariate analysis model to control for the cross-influencing factors among these independent risks, to obtain independent risk factors;
combining the target radiomics features with the independent risk factors to obtain combined features, and obtaining, based on a prediction model, the risk classification result corresponding to the combined features;
constructing a nomogram from the combined features, visualizing the classification result as a probability on the nomogram to obtain a risk probability, and explaining the classification result according to the risk description corresponding to the risk probability.
By acquiring multi-planar CT images of renal tissue, the present invention examines the renal tissue from all directions and avoids omitting useful CT image information, which improves the accuracy of kidney stone risk prediction. A univariate analysis model is first applied to the clinical variables to obtain the independent risks affecting kidney stone occurrence, and a multivariate analysis model is then used to control for the cross-influencing factors among those risks, yielding more precise independent risk factors and further improving prediction accuracy. Risk prediction is then performed by combining the target radiomics features of the CT images with the independent risk factors; a nomogram is used to obtain the risk probability, and the classification result is explained according to the risk description corresponding to that probability, which improves the interpretability of kidney stone risk prediction.
Further, using the univariate analysis model to obtain the independent risk with which each clinical variable affects the occurrence of kidney stones, and then using the multivariate analysis model to control for the cross-influencing factors among the independent risks to obtain the independent risk factors, includes:
inputting the clinical variables one by one into the univariate analysis model, which outputs for each a first significant-difference index of that single factor's influence on kidney stone occurrence; inputting the clinical variables whose first significant-difference index meets a preset index threshold into the multivariate analysis model, which controls for the cross-influencing factors among the single factors and outputs a second significant-difference index; and selecting the independent risk factors from the clinical variables according to the second significant-difference index.
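As an illustration only, the two-stage screening described above might be sketched as follows. The sketch assumes that both analysis models are logistic regressions, that the significant-difference index is a Wald p-value, and that the preset threshold is 0.05; none of these choices is stated in the text, and the function names are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def hessian_and_gradient(X, y, beta):
    """Gradient and (negated) Hessian of the logistic log-likelihood at beta."""
    k = len(beta)
    p = [1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, xi)))) for xi in X]
    grad = [sum((yi - pi) * xi[j] for xi, yi, pi in zip(X, y, p)) for j in range(k)]
    hess = [[sum(pi * (1 - pi) * xi[j] * xi[l] for xi, pi in zip(X, p))
             for l in range(k)] for j in range(k)]
    return hess, grad

def wald_p_values(columns, y, iters=25):
    """Fit a logistic regression by Newton's method; return one Wald p-value per column."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        hess, grad = hessian_and_gradient(X, y, beta)
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    hess, _ = hessian_and_gradient(X, y, beta)
    p_values = []
    for j in range(1, k):  # index 0 is the intercept
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        variance = solve(hess, unit)[j]  # j-th diagonal entry of the inverse Hessian
        z = beta[j] / math.sqrt(variance)
        p_values.append(math.erfc(abs(z) / math.sqrt(2)))
    return p_values

def select_independent_factors(variables, y, alpha=0.05):
    """Stage 1: one model per variable. Stage 2: one joint model over the survivors."""
    stage1 = [n for n in variables if wald_p_values([variables[n]], y)[0] < alpha]
    if not stage1:
        return []
    stage2 = wald_p_values([variables[n] for n in stage1], y)
    return [n for n, p in zip(stage1, stage2) if p < alpha]
```

Here `variables` maps each clinical variable name to its list of per-patient values and `y` holds the 0/1 kidney stone labels. The sketch does not handle perfectly separated data, for which the Newton iteration diverges.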
Further, constructing the nomogram from the combined features and visualizing the classification result as a probability on the nomogram to obtain the risk probability includes:
obtaining the scores of the target radiomics features and of the independent risk factors on the nomogram, giving a target radiomics feature score and an independent risk factor score respectively; summing the target radiomics feature score and the independent risk factor score to obtain a total score; and obtaining the corresponding risk probability from the total score.
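The point-and-total-score reading of a nomogram can be sketched as below. This is a minimal sketch that assumes the nomogram is derived from a logistic model, with the widest-ranging predictor scaled to 100 points; the coefficients, value ranges and function names are hypothetical and not taken from the text.

```python
import math

def build_nomogram(intercept, betas, ranges):
    """betas: {name: coefficient}; ranges: {name: (low, high)} of each predictor."""
    spans = {n: abs(betas[n]) * (ranges[n][1] - ranges[n][0]) for n in betas}
    scale = max(spans.values()) / 100.0  # linear-predictor units per point
    # Reference value: the end of each range that contributes the least risk (0 points).
    ref = {n: ranges[n][0] if betas[n] >= 0 else ranges[n][1] for n in betas}
    base = intercept + sum(betas[n] * ref[n] for n in betas)
    return {"betas": betas, "ref": ref, "scale": scale, "base": base}

def points(nomogram, name, value):
    """Points contributed by one predictor at the given value."""
    beta, ref = nomogram["betas"][name], nomogram["ref"][name]
    return beta * (value - ref) / nomogram["scale"]

def risk_probability(nomogram, values):
    """Return (total points, risk probability) for one patient."""
    total = sum(points(nomogram, n, v) for n, v in values.items())
    linear_predictor = nomogram["base"] + total * nomogram["scale"]
    return total, 1.0 / (1.0 + math.exp(-linear_predictor))
```

For example, `risk_probability(nomogram, {"rad_score": 1.0, "factor": 1.0})` returns the summed points and the probability read off the total-points axis.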
Further, obtaining the target radiomics feature score includes:
weighting the target radiomics features by their coefficients to obtain the target radiomics feature score;
wherein the target radiomics feature score is expressed as:
Radiomics score = a1*F1 + a2*F2 + ... + ai*Fi + b;
where a1 to ai are the coefficients of the target radiomics features, F1 to Fi are the target radiomics features, i is the number of target radiomics features, and b is the intercept.
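The weighted sum above translates directly into code. The sketch below is illustrative only; the function name and the example numbers are hypothetical.

```python
def radiomics_score(coefficients, features, intercept):
    """Radiomics score = a1*F1 + a2*F2 + ... + ai*Fi + b."""
    if len(coefficients) != len(features):
        raise ValueError("one coefficient is needed per feature")
    return sum(a * f for a, f in zip(coefficients, features)) + intercept

# Two hypothetical features with coefficients 0.5 and -1.0 and intercept 0.25.
score = radiomics_score([0.5, -1.0], [2.0, 3.0], 0.25)
```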
Further, obtaining, based on the prediction model, the risk classification result corresponding to the combined features includes:
inputting the combined features into a radiomics model based on the prediction model, which outputs the risk classification result corresponding to the combined features; the radiomics model is obtained by training on the target radiomics features and the intercept corresponding to the target radiomics features.
Further, obtaining the radiomics model by training on the target radiomics features and the intercept corresponding to the target radiomics features includes:
obtaining a training set and a validation set of the target radiomics features and their corresponding intercept; training and validating multiple different machine learning algorithm models on the training set and the validation set respectively; and selecting as the radiomics model the algorithm model with the smallest difference between its training AUC and its validation AUC.
By selecting the final radiomics model according to the AUC value and the smallest AUC difference between the training set and the validation set, the present invention can ensure the stability and generalization ability of the radiomics model and thereby improve the accuracy of risk prediction.
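The selection rule can be sketched as follows, assuming each candidate model has already been scored on both sets. Breaking ties by the higher validation AUC is an added assumption, and the model names and AUC values in the usage line are made up for illustration.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def pick_most_stable(results):
    """results: {model_name: (training AUC, validation AUC)}.

    Returns the model with the smallest train/validation AUC gap;
    ties are broken in favour of the higher validation AUC (an assumption).
    """
    return min(results, key=lambda m: (abs(results[m][0] - results[m][1]), -results[m][1]))

# Hypothetical AUC values for three candidate algorithms.
best = pick_most_stable({
    "logistic_regression": (0.80, 0.78),
    "random_forest": (0.99, 0.75),
    "svm": (0.85, 0.70),
})
```

A large gap between training and validation AUC, as for the hypothetical random forest here, is a common sign of overfitting, which is why the gap is used as a stability criterion.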
Further, extracting the target radiomics features from the region-of-interest results includes:
extracting multiple radiomics features of the region of interest, with two experts each extracting the same radiomics features to give two sets of extraction results; and evaluating the intra-class consistency of the two sets of extraction results to obtain the target radiomics features. The multiple radiomics features include first-order features, shape features and texture features.
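A consistency check between the two experts' extractions could look like the sketch below. It assumes the two-way random-effects, absolute-agreement, single-measurement form ICC(2,1) and a threshold of 0.75; the text specifies neither, and the function names are hypothetical.

```python
def icc_2_1(rater_a, rater_b):
    """ICC(2,1) for two raters measuring the same feature on the same patients."""
    n, k = len(rater_a), 2
    grand = (sum(rater_a) + sum(rater_b)) / (n * k)
    row_means = [(a + b) / k for a, b in zip(rater_a, rater_b)]
    col_means = [sum(rater_a) / n, sum(rater_b) / n]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in list(rater_a) + list(rater_b))
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denom = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    # A feature that is constant for both raters has no variance to agree on.
    return (ms_rows - ms_error) / denom if denom else 0.0

def stable_features(extraction_a, extraction_b, threshold=0.75):
    """Keep the features whose ICC between the two extractions reaches the threshold."""
    return [name for name in extraction_a
            if icc_2_1(extraction_a[name], extraction_b[name]) >= threshold]
```

Each extraction is a mapping from feature name to the list of values one expert obtained across patients, in the same patient order.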
Further, evaluating the intra-class consistency of the two sets of extraction results to obtain the target radiomics features includes:
retaining the radiomics features whose intraclass correlation coefficient is not less than a preset threshold, to obtain multiple candidate radiomics features; and applying dimensionality reduction or parameter shrinkage to the candidate radiomics features to select the multiple target radiomics features most strongly correlated with kidney stones.
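The abstract names LASSO as the shrinkage method. A minimal sketch of LASSO feature selection by coordinate descent is given below; for brevity it uses a least-squares loss on the 0/1 label rather than a logistic loss, which is a simplification, and the penalty `lam` and the function names are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam, returning exactly 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_select(columns, y, lam, sweeps=200):
    """Return the names of features whose LASSO coefficient is non-zero.

    columns: {name: per-patient values}; assumes no column is constant.
    Minimizes (1/2n)*||y - X b||^2 + lam*||b||_1 on standardized features.
    """
    n = len(y)
    standardized = {}
    for name, col in columns.items():
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
        standardized[name] = [(v - mean) / sd for v in col]
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]
    beta = {name: 0.0 for name in columns}
    for _ in range(sweeps):
        for name, x in standardized.items():
            # Correlation of this feature with the residual that excludes its own effect.
            rho = sum(xi * (ri + xi * beta[name]) for xi, ri in zip(x, residual)) / n
            new = soft_threshold(rho, lam)
            delta = new - beta[name]
            if delta:
                residual = [ri - xi * delta for xi, ri in zip(x, residual)]
                beta[name] = new
    return [name for name in columns if beta[name] != 0.0]
```

Larger values of `lam` shrink more coefficients to exactly zero, so fewer features survive; in practice the penalty is usually chosen by cross-validation.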
Further, collecting multi-planar CT images of multiple patients and delineating regions of interest on the CT images to obtain the region-of-interest results includes:
collecting multi-planar CT images of the renal tissue of multiple patients, the CT images including transverse, coronal and sagittal CT images of the renal tissue;
using a preset image threshold to mask, simultaneously in the transverse, coronal and sagittal CT images, the air and adipose tissue surrounding the renal tissue, and delineating the renal tissue in each plane to obtain the region of interest.
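The threshold masking step might be sketched as follows, assuming the images are given in Hounsfield units (HU). Air lies near -1000 HU and fat roughly around -100 HU, so a single lower threshold can exclude both; the value of -30 HU used here is an assumption, not the preset threshold of the invention, and the function names are hypothetical.

```python
def mask_air_and_fat(slice_hu, threshold_hu=-30):
    """Return a binary mask: 1 where a pixel is kept, 0 where it is air or fat."""
    return [[1 if value >= threshold_hu else 0 for value in row] for row in slice_hu]

def mask_all_planes(planes, threshold_hu=-30):
    """Apply the same threshold to the transverse, coronal and sagittal images."""
    return {name: mask_air_and_fat(image, threshold_hu) for name, image in planes.items()}

# A tiny hypothetical 2x3 patch: air, fat and soft-tissue values in HU.
patch = [[-1000, -100, 35], [40, -90, 30]]
mask = mask_air_and_fat(patch)
```

The mask only removes low-density surroundings; the renal tissue itself still has to be delineated within the remaining pixels.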
In a second aspect, the present invention provides a kidney stone risk prediction system based on CT radiomics, comprising: a target radiomics feature extraction unit, an independent risk factor extraction unit, a classification result acquisition unit and an interpretation unit; wherein
the target radiomics feature extraction unit is configured to collect multi-planar CT images of multiple patients, delineate regions of interest on the CT images to obtain region-of-interest results, and extract target radiomics features from the region-of-interest results;
the independent risk factor extraction unit is configured to collect clinical variables of multiple patients, use a univariate analysis model to obtain the independent risk with which each clinical variable affects the occurrence of kidney stones, and then use a multivariate analysis model to control for the cross-influencing factors among the independent risks, to obtain independent risk factors;
the classification result acquisition unit is configured to combine the target radiomics features with the independent risk factors to obtain combined features, and to obtain, based on a prediction model, the risk classification result corresponding to the combined features;
the interpretation unit is configured to construct a nomogram from the combined features, visualize the classification result as a probability on the nomogram to obtain a risk probability, and explain the classification result according to the risk description corresponding to the risk probability.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a kidney stone risk prediction method based on CT radiomics provided in this embodiment;
FIG. 2 is a schematic diagram of region-of-interest extraction provided in this embodiment;
FIG. 3 is a schematic diagram of the AUC of the training and validation results of different machine learning algorithm models provided in this embodiment;
FIG. 4 is a schematic diagram of the AUC of the prediction models for the independent risk factors and the target radiomics features provided in this embodiment;
FIG. 5 is a schematic diagram of the calibration curve and the clinical decision curve of the radiomics-based model provided in this embodiment;
FIG. 6 is a schematic diagram of nomogram I provided in this embodiment;
FIG. 7 is a schematic diagram of nomogram II provided in this embodiment;
FIG. 8 is a flowchart of another kidney stone risk prediction method based on CT radiomics provided in this embodiment;
FIG. 9 is a schematic structural diagram of a kidney stone risk prediction system based on CT radiomics provided in this embodiment.
具体实施方式Detailed ways
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。The following will be combined with the drawings in the embodiments of the present invention to clearly and completely describe the technical solutions in the embodiments of the present invention. Obviously, the described embodiments are only part of the embodiments of the present invention, not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by ordinary technicians in this field without creative work are within the scope of protection of the present invention.
值得说明的是,现有的相关技术采用肾乳头CT值来预测肾结石风险,但是无法预测肾结石风险概率,导致难以对是否患肾结石进行解释;而采用肾乳头CT值的相关技术当无法扫描较小的肾结石时,就无法预测出肾结石风险,且仅依赖与肾乳头CT值进行风险预测,遗漏了其他特征信息,导致准确率低。It is worth noting that the existing related technologies use renal papillary CT values to predict the risk of kidney stones, but they cannot predict the probability of kidney stone risk, which makes it difficult to interpret whether one has kidney stones. When the related technologies using renal papillary CT values cannot scan smaller kidney stones, they cannot predict the risk of kidney stones, and they only rely on renal papillary CT values for risk prediction, omitting other characteristic information, resulting in low accuracy.
基于此,本发明提供了一种基于CT影像组学的肾结石风险预测方法及系统,通过获取肾组织多轴面的CT图像来全方面检测肾组织,避免有用的CT图像信息遗漏,从而能够提高肾结石风险预测的准确度;并且,采用单因素分析模型对临床变量先获取影响患肾结石的独立风险,然后再基于多因素分析模型,对影响独立风险的交叉影响因素进行控制,以获取更加精确的独立风险因素,从而能够进一步提高肾结石风险预测的准确度;接着,结合CT图像的目标影像组学特征和独立风险因素进行风险预测,由于风险预测的分类结果是基于端到端的结果,要么有患病风险,要么没有患病风险,其可解释性差,因此,本发明在风险预测之后,采用列线图获取风险概率,根据风险概率对应的风险描述对分类结果进行解释,即对为什么有患病风险或为什么没有患病风险进行解释,从而提高肾结石风险预测的可解释性。Based on this, the present invention provides a method and system for predicting kidney stone risk based on CT imaging genomics, which comprehensively detects kidney tissue by acquiring CT images of multi-axial surfaces of kidney tissue, avoiding the omission of useful CT image information, thereby improving the accuracy of kidney stone risk prediction; and, using a univariate analysis model to first obtain the independent risk affecting kidney stones from clinical variables, and then based on a multivariate analysis model, control the cross-influencing factors affecting the independent risk to obtain more accurate independent risk factors, thereby further improving the accuracy of kidney stone risk prediction; then, combining the target imaging genomics features of the CT image and the independent risk factors to predict risk, since the classification result of risk prediction is based on an end-to-end result, either there is a risk of disease or there is no risk of disease, its interpretability is poor, therefore, after risk prediction, the present invention uses a nomogram to obtain the risk probability, and interprets the classification result according to the risk description corresponding to the risk probability, that is, explains why there is a risk of disease or why there is no risk of disease, thereby improving the interpretability of kidney stone risk prediction.
In order to better illustrate the technical solution of the present invention, it is described in detail through the following embodiments.
Example 1
Referring to FIG. 1, which is a schematic flowchart of a method for predicting kidney stone risk based on CT radiomics provided in this embodiment, the method includes steps S11 to S14, specifically:
Step S11: Acquire multi-planar CT images of multiple patients, delineate a region of interest on the CT images to obtain a region-of-interest result, and extract target radiomics features from the region-of-interest result.
In some embodiments, acquiring multi-planar CT images of multiple patients and delineating a region of interest on the CT images to obtain a region-of-interest result includes: acquiring multi-planar CT images of the renal tissue of multiple patients, the CT images including transverse, coronal and sagittal CT images of the renal tissue; and using a preset image threshold to simultaneously mask the air and adipose tissue surrounding the renal tissue in the transverse, coronal and sagittal CT images, then delineating the renal tissue in each plane to obtain the region of interest.
In some embodiments, patients who presented between 2020 and 2022 were collected retrospectively, comprising 222 patients with kidney stones and 291 without (513 in total). Cases were preliminarily screened through a scientific research big data platform.
In some embodiments, CT images are acquired through the RUIKE system and saved in DCM format. The images are obtained on a Siemens SOMATOM Definition AS+ 128-slice CT or a 64-slice CT scanner (Siemens, Erlangen, Germany). The CT acquisition parameters are: tube voltage 120 kV; automatic tube current; ref mAs 200-500; matrix 512×512; reconstruction slice thickness mostly 2 mm, with 5 mm in a few cases.
In some embodiments, two experts delineate the region of interest (ROI) using 3D Slicer software; the ROI is the kidney, and the segmentation mode is manual. The segmentation margin is adjusted to include as much renal tissue as possible while excluding stones and perirenal tissue.
In some embodiments, to make ROI segmentation simpler and faster, the Threshold function of 3D Slicer is used to mask the image with an image threshold of 0. This allows the kidney boundary to be segmented clearly without having to outline the air, adipose tissue and similar structures around the kidney.
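The masking step can be illustrated with a minimal sketch. It assumes voxel values are expressed in Hounsfield units, so that air and adipose tissue fall below the threshold of 0 while renal parenchyma does not; the function name and the sample values are hypothetical, and this is not the 3D Slicer implementation.

```python
def mask_below_threshold(slice_hu, threshold=0):
    """Return a boolean mask that keeps voxels at or above the threshold.

    slice_hu is a 2D list of voxel values (assumed to be Hounsfield units).
    Voxels below the threshold (for example air or fat) are masked out.
    """
    return [[value >= threshold for value in row] for row in slice_hu]


# Hypothetical 2x3 slice: air (-1000), fat (-100), soft tissue (30, 35, 40).
example_slice = [
    [-1000, -100, 30],
    [-100, 35, 40],
]
example_mask = mask_below_threshold(example_slice, threshold=0)
```

Only the voxels marked True would remain visible for manual delineation of the kidney.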
In some embodiments, refer to FIG. 2, which is a schematic diagram of region-of-interest extraction provided in this embodiment. When delineating the region of interest as in FIG. 2, the following image segmentation principles must be followed: 1) the ROI is segmented on non-contrast (plain) CT images, and stones and perirenal tissue are avoided during segmentation; 2) segmentation is full-slice, that is, every CT slice containing renal tissue is segmented and included in the ROI. Following these principles, full-slice delineation is performed on the transverse, coronal and sagittal planes, and the results are combined to obtain a three-dimensional delineation of the renal tissue.
It is worth noting that radiomics refers to the high-throughput extraction of radiomics features from medical images using computer image processing techniques based on medical image analysis; these features include first-order features, texture features and shape features.
In some embodiments, radiomics features are extracted through a plug-in of 3D Slicer, and include first-order features, shape features and texture features.
In some embodiments, radiomics features are extracted through a plug-in of 3D Slicer, and include: first-order features (18 features), shape features (14 features), gray-level co-occurrence matrix (GLCM) features (24 features), gray-level size zone matrix (GLSZM) features (16 features), gray-level run length matrix (GLRLM) features (16 features), neighboring gray-tone difference matrix (NGTDM) features (5 features) and gray-level dependence matrix (GLDM) features (14 features). In total, 107 features in 7 classes are extracted; the detailed feature names are listed in Table 1, the radiomics feature table.
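As a quick consistency check, the per-class counts stated above can be tallied. The dictionary keys are shorthand labels chosen for this sketch; only the counts come from the text.

```python
# Feature counts per class as stated in this embodiment.
feature_counts = {
    "first_order": 18,
    "shape": 14,
    "GLCM": 24,
    "GLSZM": 16,
    "GLRLM": 16,
    "NGTDM": 5,
    "GLDM": 14,
}

total_classes = len(feature_counts)
total_features = sum(feature_counts.values())
```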
Table 1 Radiomics features
In some embodiments, extracting target radiomics features from the region-of-interest result includes: after multiple radiomics features of the region of interest have been extracted, having two experts extract the same radiomics features to obtain two sets of extraction results, and evaluating the intra-group consistency of the two sets to obtain the target radiomics features. The multiple radiomics features include first-order features, shape features and texture features.
In some embodiments, evaluating the intra-group consistency of the two sets of extraction results to obtain the target radiomics features includes: retaining the radiomics features whose intraclass correlation coefficient is not less than a preset threshold to obtain multiple candidate radiomics features, and applying data dimension reduction or parameter compression to the candidates to screen out the target radiomics features most strongly correlated with kidney stone occurrence.
In some embodiments, the intraclass correlation coefficient (ICC) is used to evaluate the intra-group and inter-group consistency of feature extraction. Expert A and expert B each extract the radiomics features of the same CT image to evaluate inter-group consistency; at least one month later, one of the experts repeats the extraction of the radiomics features extracted by the other expert, and intra-group consistency is then evaluated.
In some embodiments, radiomics features with an ICC below the preset threshold of 0.75 are discarded, and the remaining features are taken as the candidate radiomics features.
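The ICC screening can be sketched as follows. The text does not state which ICC form is used, so this sketch assumes the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1); the feature names and rating values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per rater."""
    n = len(ratings)       # number of subjects (e.g. CT images)
    k = len(ratings[0])    # number of raters (e.g. two experts)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def keep_reproducible(feature_ratings, threshold=0.75):
    """Keep features whose ICC is not less than the threshold."""
    return [name for name, table in feature_ratings.items()
            if icc_2_1(table) >= threshold]


# Hypothetical values: each row is one image, columns are expert A and expert B.
feature_ratings = {
    "stable_feature": [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
    "unstable_feature": [[1.0, 3.0], [2.0, 1.0], [3.0, 2.0]],
}
candidates = keep_reproducible(feature_ratings)
```

Features on which the two experts agree closely pass the 0.75 threshold; features whose values vary between experts are dropped before any further selection.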
In some embodiments, the least absolute shrinkage and selection operator (LASSO) is used to perform parameter compression on the candidate radiomics features, screening out the target radiomics features most strongly correlated with kidney stone occurrence.
In some embodiments, principal component analysis is used to perform data dimension reduction on the candidate radiomics features, screening out the target radiomics features most strongly correlated with kidney stone occurrence.
In some embodiments, the target radiomics features are listed in Table 2, which gives the target radiomics features provided in this embodiment and their corresponding coefficients.
Table 2 Target radiomics features and corresponding coefficients
Step S12: Collect the clinical variables of the multiple patients; based on a univariate analysis model, obtain the independent risk of each clinical variable affecting kidney stone occurrence; then, based on a multivariate analysis model, control the cross-influencing factors of the independent risks to obtain the independent risk factors.
In some embodiments, the clinical variables include baseline information and past medical history. The baseline information includes age, sex, height and weight; the past medical history includes at least one of hypertension, diabetes or urolithiasis.
In some embodiments, the patients' clinical variables are collected through the CDMS digital medical record system.
In some embodiments, obtaining the independent risk of each clinical variable based on the univariate analysis model and then controlling the cross-influencing factors based on the multivariate analysis model to obtain the independent risk factors includes: inputting the clinical variables one by one into the univariate analysis model, which outputs for each a first significance indicator of that single factor's effect on kidney stone occurrence; inputting the clinical variables whose first significance indicator meets a preset indicator threshold into the multivariate analysis model, which controls the cross-influence among the single factors and outputs a second significance indicator; and selecting the independent risk factors from the clinical variables according to the second significance indicator.
It is worth noting that the first significance indicator is the significance indicator output when all clinical variables are input to the univariate analysis model, and the clinical variables whose first significance indicator meets the preset indicator threshold are selected. To analyze the mutual influence among the single factors screened from all clinical variables, these selected clinical variables are used as the input of the multivariate analysis model, which outputs the second significance indicator. That is, the second significance indicator is the significance indicator output by the multivariate analysis model for the clinical variables that passed the univariate screening.
In other words, the univariate analysis model screens out, from the clinical variables, the single factors that affect kidney stone occurrence, and the multivariate analysis model then controls for the mutual influence among the screened factors, so that the independent risk factors are obtained with that influence taken into account.
In some embodiments, during training, when the univariate analysis model and the multivariate analysis model use the same linear regression algorithm model, the univariate analysis model based on that algorithm determines the effect of each single clinical variable on kidney stone risk. To bring the estimated effects closer to the complexity of the real world, a multivariate analysis model based on the same algorithm is then used to control the cross-influence of the other factors.
In some embodiments, the first significance indicator and the second significance indicator are P values. It is worth noting that, in statistics, the P value is the probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true.
In some embodiments, refer to Table 3, which gives the analysis results of the univariate and multivariate analysis models based on the logistic regression algorithm model provided in this embodiment. In Table 3, the univariate analysis shows that the P values of sex and body mass index (BMI) are below 0.05 and therefore statistically significant. Sex and BMI are then used as the input of the multivariate analysis model, which outputs their P values, and the clinical variables whose P values meet the multivariate preset threshold are selected as the final independent risk factors.
In some embodiments, after all clinical variables have been input to the univariate analysis model and their P values output, all clinical variables are also input to the multivariate analysis model, which outputs a P value for each. The clinical variables whose univariate P value meets the preset threshold are selected, the clinical variables whose multivariate P value meets the multivariate preset threshold are selected, and the intersection of the two sets is taken as the final independent risk factors. In Table 3, the results show that among the P values output by the univariate analysis model, those of sex and BMI are below 0.05 and statistically significant, and among the P values output by the multivariate analysis model, those of sex and BMI are likewise below 0.05 and statistically significant. The intersection of the variables meeting the two thresholds therefore contains only sex and BMI, so the independent risk factors screened out are sex and BMI.
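The intersection-based selection can be sketched as below. The P values in the example are made up purely for illustration and are not the values of Table 3; the function name and the 0.05 default for the multivariate threshold are assumptions.

```python
def select_independent_risk_factors(uni_p, multi_p,
                                    uni_threshold=0.05,
                                    multi_threshold=0.05):
    """Return variables significant in both univariate and multivariate analysis.

    uni_p and multi_p map each clinical variable to its P value.
    """
    uni_significant = {v for v, p in uni_p.items() if p < uni_threshold}
    multi_significant = {v for v, p in multi_p.items() if p < multi_threshold}
    return sorted(uni_significant & multi_significant)


# Illustrative, made-up P values (NOT taken from Table 3).
univariate_p = {"age": 0.40, "sex": 0.01, "BMI": 0.03, "hypertension": 0.20}
multivariate_p = {"age": 0.55, "sex": 0.02, "BMI": 0.04, "hypertension": 0.04}

selected = select_independent_risk_factors(univariate_p, multivariate_p)
```

With these placeholder values, hypertension is significant only in the multivariate analysis and is therefore excluded by the intersection.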
Table 3 Analysis results of the univariate and multivariate analysis models based on the regression algorithm model
Step S13: Combine the target radiomics features with the independent risk factors to obtain combined features, and, based on a prediction model, obtain the classification result of the risk prediction corresponding to the combined features.
It is worth noting that the classification result is the result output by the prediction model for the combined features. In some embodiments, a radiomics model based on the logistic regression algorithm model takes the combined features as input and outputs the classification result.
In some embodiments, obtaining the classification result of the risk prediction corresponding to the combined features based on the prediction model includes: inputting the combined features into a radiomics model based on the prediction model, and outputting the classification result of the risk prediction corresponding to the combined features. The radiomics model is obtained by training on the target radiomics features and the intercept corresponding to the target radiomics features.
In some embodiments, the patients are divided into training data and validation data in a 7:3 ratio, and the target radiomics features of the corresponding patients are assigned to the training set and the validation set accordingly.
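A minimal sketch of a patient-level 7:3 split follows. The text does not say whether the split is random or stratified, nor how fractional counts are rounded; this sketch assumes a seeded random shuffle and truncation, and the patient identifiers are hypothetical.

```python
import random


def split_patients(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle patient IDs with a fixed seed and split them into two lists."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_fraction)  # truncation is an assumption
    return ids[:n_train], ids[n_train:]


# 513 hypothetical patient IDs, matching the cohort size of 222 + 291.
all_patients = [f"patient_{i:03d}" for i in range(513)]
train_ids, validation_ids = split_patients(all_patients)
```

Splitting by patient rather than by image keeps all features of one patient in the same set.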
In some embodiments, obtaining the radiomics model by training on the target radiomics features and their corresponding intercept includes: obtaining a training set and a validation set of the target radiomics features and the corresponding intercept; training and validating multiple different machine learning algorithm models on the training set and the validation set, respectively; and selecting, as the radiomics model, the algorithm model with the smallest difference between its training and validation AUC (area under the ROC curve).
In some embodiments, the multiple different machine learning algorithm models include: a logistic regression algorithm model, an Adaboost algorithm model, a decision tree algorithm model and a support vector machine algorithm model.
In some embodiments, the logistic regression, Adaboost, decision tree and support vector machine algorithm models are trained and validated; refer to FIG. 3, which is a schematic diagram of the training and validation AUC of the different machine learning algorithm models provided in this embodiment. In FIG. 3(a), the AUC of the logistic regression prediction model on the training set is 0.858 (95% CI 0.819-0.897), that is, the area under the curve is estimated at 0.858 and, with 95% confidence, the true AUC lies between 0.819 and 0.897; the AUC on the validation set is 0.806 (95% CI 0.734-0.877).
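For readers unfamiliar with the AUC metric, the following sketch computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores are hypothetical, and the confidence interval calculation is not shown.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = kidney stone) and predicted scores.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```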
In FIG. 3, the models based on the decision tree algorithm (FIG. 3b), the support vector machine algorithm (FIG. 3c) and the Adaboost algorithm (FIG. 3d) have training-set AUC values of 0.873 (95% CI 0.834-0.912), 0.933 (95% CI 0.908-0.958) and 0.799 (95% CI 0.757-0.841), respectively, and validation-set AUC values of 0.770 (95% CI 0.696-0.845), 0.834 (95% CI 0.772-0.896) and 0.654 (95% CI 0.578-0.730), respectively.
Among the four algorithms, the support vector machine shows the highest AUC values, 0.933 in the training set and 0.834 in the validation set, but its performance differs considerably between the two sets, which suggests possible overfitting. Therefore, to ensure the stability and generalization ability of the radiomics model, the final prediction model is selected according to both the AUC value and the AUC difference between the training and validation sets. The differences between training and validation AUC for the four algorithms are: logistic regression (0.052), decision tree (0.103), support vector machine (0.099) and Adaboost (0.145). Logistic regression is therefore selected to build the radiomics model.
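The selection rule can be reproduced from the AUC values reported for FIG. 3. The sketch below only implements the smallest-gap criterion; the dictionary layout and function name are choices made for this illustration.

```python
# (training AUC, validation AUC) as reported for FIG. 3(a) to 3(d).
reported_auc = {
    "logistic regression": (0.858, 0.806),
    "decision tree": (0.873, 0.770),
    "support vector machine": (0.933, 0.834),
    "Adaboost": (0.799, 0.654),
}


def auc_gaps(auc_table):
    """Training AUC minus validation AUC for each algorithm, rounded to 3 decimals."""
    return {name: round(train - val, 3) for name, (train, val) in auc_table.items()}


gaps = auc_gaps(reported_auc)
selected_algorithm = min(gaps, key=gaps.get)
```

The computed gaps match the differences listed in the text, and the smallest gap belongs to logistic regression.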
In some embodiments, a prediction model based on the logistic regression algorithm model is built from the independent risk factors and compared with the prediction model based on the logistic regression algorithm model built from the target radiomics features; refer to FIG. 4, which is a schematic diagram of the AUC of the two prediction models provided in this embodiment. FIG. 4(a) compares the AUC of the prediction models on the training set, where the AUC of the independent-risk-factor model is 0.617. FIG. 4(b) compares the AUC of the prediction models on the validation set, where the AUC of the independent-risk-factor model is 0.619. These values are far lower than those of the radiomics-feature model (0.858 in the training set and 0.806 in the validation set), which further demonstrates the superiority of the radiomics prediction model.
In some embodiments, the independent risk factors are combined with the target radiomics features to obtain combined features, and a radiomics model based on the logistic regression algorithm model is built on the combined features to improve predictive performance. The results show that incorporating the independent risk factors slightly improves the predictive performance of the radiomics model; notably, the radiomics features carry the larger weight. The independent risk factors are determined through the univariate and multivariate analysis models and combined with the radiomics features to build the radiomics model, which outputs the classification result of the kidney stone risk prediction. Because the classification result is simply at risk or not at risk, its interpretability is poor. A nomogram is therefore drawn to visualize the classification result: the independent risk factors and target radiomics features input to the radiomics model are placed in one chart, each is assigned a score according to its coefficient, and the scores are summed to obtain the probability of developing the disease. The model's prediction can thus be understood without reading statistical tables or equations, which simplifies use of the model and makes it more intuitive.
In some embodiments, to further verify the predictive performance and practical value of the model, the radiomics model is evaluated using a calibration curve and a clinical decision curve.
In some embodiments, refer to FIG. 5, which is a schematic diagram of the calibration curve and clinical decision curve of the radiomics model provided in this embodiment. In the calibration curve, the 45° dashed diagonal represents the probability in the ideal case; the closer the prediction curve is to this ideal line, the better the predictive performance. The calibration curve shows that the predictions of the radiomics model are close to the ideal probabilities, indicating good predictive performance. In the clinical decision curve, the "None" line represents intervening for no patients, the gray curve represents intervening for all patients, and the area under the curve represents the net benefit. The clinical decision curve shows that, for threshold probabilities of 0.1-1.0, the net benefit obtained with the radiomics model is higher than that of intervening for all or for none, indicating that the constructed radiomics model has clinical application value.
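The net benefit plotted in a decision curve is commonly computed as TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability. The sketch below assumes this standard definition, which the text does not spell out; the labels and probabilities are hypothetical.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of intervening on patients whose predicted risk >= threshold."""
    n = len(labels)
    true_pos = sum(1 for y, p in zip(labels, probabilities)
                   if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(labels, probabilities)
                    if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = kidney stone) and model probabilities.
labels = [1, 1, 0, 0]
model_probs = [0.9, 0.6, 0.4, 0.1]

nb_model = net_benefit(labels, model_probs, threshold=0.5)
# "Intervene for all" is modelled by giving every patient a probability of 1.
nb_treat_all = net_benefit(labels, [1.0] * len(labels), threshold=0.5)
nb_treat_none = 0.0  # no interventions means no true or false positives
```

A model is useful at a given threshold when its net benefit exceeds both reference strategies, which is the comparison the decision curve displays.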
Step S14: Construct a nomogram from the combined features, use the nomogram to visualize the classification result as a probability to obtain the risk probability, and interpret the classification result according to the risk description corresponding to the risk probability.
In some embodiments, constructing a nomogram from the combined features and using it to visualize the classification result as a probability to obtain the risk probability includes: obtaining the scores of the target radiomics features and of the independent risk factors in the nomogram, giving a target radiomics feature score and independent risk factor scores; summing these into a total score; and obtaining the corresponding risk probability from the total score. Refer to FIG. 6, which is a schematic diagram of nomogram I provided in this embodiment.
In some embodiments, obtaining the target radiomics feature score includes: weighting the target radiomics features by their coefficients to obtain the target radiomics feature score.
In some embodiments, the target radiomics feature score is expressed as:
Radiomics score = a1*F1 + a2*F2 + ... + ai*Fi + b;
where a1 to ai are the coefficients of the target radiomics features, F1 to Fi are the target radiomics features, i is the number of target radiomics features, and b is the intercept.
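The formula above is a weighted sum plus an intercept and can be sketched directly. The coefficient and feature values in the example are placeholders, not the values of Table 2.

```python
def radiomics_score(coefficients, features, intercept):
    """Compute a1*F1 + a2*F2 + ... + ai*Fi + b."""
    if len(coefficients) != len(features):
        raise ValueError("coefficients and features must have the same length")
    return sum(a * f for a, f in zip(coefficients, features)) + intercept


# Placeholder values for illustration only (not from Table 2).
example_score = radiomics_score([0.5, -1.0], [2.0, 1.0], intercept=0.25)
```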
By way of example, refer to FIG. 7, which is a schematic diagram of nomogram II provided in this embodiment. Consider a female patient with a height of 168 cm and a weight of 60 kg. Her sex is female, so her score on the nomogram is point A, corresponding to a sex value of 0. Her BMI is 60 / 1.68² ≈ 21.26, which lies between 18.5 and 23.9, so her BMI variable is 2 and her score on the nomogram is point B, corresponding to BMI variable 2. Her target radiomics features are extracted and, using the corresponding coefficients in Table 2, substituted into the target radiomics score formula to calculate the target radiomics score. Assuming a target radiomics score of -2 corresponds to point C, her final total score is A + B + C ≈ 4.02 points, which gives a probability of developing kidney stones of < 0.1; she is therefore described as belonging to the low-risk population.
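The BMI step of this worked example can be checked with a short sketch. Only the 18.5 to 23.9 band is taken from the text; the function names are hypothetical, and the mapping from nomogram points to probability depends on FIG. 7 and is not reproduced here.

```python
def body_mass_index(weight_kg, height_cm):
    """BMI = weight in kilograms divided by the square of height in metres."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def in_band(bmi, lower=18.5, upper=23.9):
    """True when the BMI falls in the band that the example codes as variable 2."""
    return lower <= bmi <= upper


patient_bmi = body_mass_index(weight_kg=60, height_cm=168)
patient_in_band = in_band(patient_bmi)
```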
In some embodiments, refer to FIG. 8, which is a flowchart of another method for predicting kidney stone risk based on CT radiomics provided in this embodiment. In FIG. 8, the ROI is first delineated on the CT images to obtain the ROI result, and the target radiomics features and their corresponding intercept are extracted from the ROI result. The patients are divided in a 7:3 ratio, the target radiomics features and corresponding intercept of each patient are assigned to the training set or the validation set, and the optimal radiomics model is selected using the two sets. In parallel, the patients' clinical variables are collected and the independent risk factors are extracted from them; the radiomics model is then built from the target radiomics features and the independent risk factors, and the classification result of the kidney stone risk prediction is obtained from it. Finally, a nomogram is drawn to obtain an explanatory description of the classification result, and the classification result together with its description forms the final result analysis.
Compared with the prior art, this embodiment uses radiomics to extract radiomics features from kidney images at high throughput. It is not limited to the renal papillary CT value: first-order features (including renal CT values), shape features and texture features are extracted and converted into quantitative data, and the extracted radiomics features are screened to avoid overfitting of the constructed radiomics model. Algorithms such as logistic regression, decision tree, support vector machine and Adaboost are compared in order to build the most stable radiomics model, which is more accurate than current approaches. Furthermore, a nomogram is drawn on the basis of the constructed radiomics model, which simplifies use of the prediction model and visualizes the predicted probability. A patient's risk of developing kidney stones can thus be shown more intuitively, which facilitates communication with patients and improves their compliance.
Example 2
FIG. 9 is a schematic structural diagram of a CT radiomics-based kidney stone risk prediction system provided by this embodiment. The system comprises a target radiomics feature extraction unit 21, an independent risk factor extraction unit 22, a classification result acquisition unit 23, and an interpretation unit 24.
It should be noted that the target radiomics feature extraction unit 21 outputs the target radiomics features from the collected CT images and transmits them to the classification result acquisition unit 23. The independent risk factor extraction unit 22 outputs the independent risk factors from the collected clinical variables and transmits them to the classification result acquisition unit 23. After obtaining the target radiomics features and the independent risk factors, the classification result acquisition unit 23 outputs the classification result and transmits it to the interpretation unit 24. After receiving the classification result, the interpretation unit 24 outputs a description of the classification result.
The target radiomics feature extraction unit 21 is configured to collect multi-planar CT images of a plurality of patients, delineate regions of interest on the CT images to obtain region-of-interest results, and extract target radiomics features from the region-of-interest results.
In some embodiments, collecting multi-planar CT images of a plurality of patients and delineating regions of interest on the CT images to obtain region-of-interest results includes: collecting multi-planar CT images of the renal tissue of each patient, the CT images including axial, coronal, and sagittal images of the renal tissue; using a preset image threshold to mask the air and fat tissue surrounding the renal tissue in the axial, coronal, and sagittal CT images at the same time; and delineating the renal tissue in each plane to obtain the regions of interest.
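The threshold-based masking step can be illustrated with a small sketch. The text only states that a preset image threshold is used; the cut-off of -30 HU below is an assumption chosen because air (about -1000 HU) and fat (roughly -100 to -50 HU) both fall under it, and the function name and the tiny sample slices are hypothetical.

```python
def mask_air_and_fat(slice_hu, lower_hu=-30.0):
    """Return a binary mask: 1 where a voxel is at or above the threshold, else 0."""
    return [[1 if value >= lower_hu else 0 for value in row] for row in slice_hu]

# Hypothetical 2x3 slices in Hounsfield units, one per imaging plane.
planes = {
    "axial":    [[-1000, -95, 35], [-80, 40, 45]],
    "coronal":  [[30, -1000, -60], [38, 42, -90]],
    "sagittal": [[-70, 33, 36], [-1000, -1000, 41]],
}

# The same preset threshold is applied to all three planes.
masks = {name: mask_air_and_fat(hu) for name, hu in planes.items()}
```

The renal tissue would then be delineated inside the voxels that survive the mask; that manual step is not modeled here.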
In some embodiments, extracting target radiomics features from the region-of-interest results includes: extracting a plurality of radiomics features from the regions of interest, with two experts each extracting the same radiomics features so that two sets of extraction results are obtained; and evaluating the intraclass consistency of the two sets of extraction results to obtain the target radiomics features. The plurality of radiomics features includes first-order features, shape features, and texture features.
In some embodiments, evaluating the intraclass consistency of the two sets of extraction results to obtain the target radiomics features includes: retaining the radiomics features whose intraclass correlation coefficient (ICC) is not less than a preset threshold to obtain a plurality of candidate radiomics features; and performing data dimensionality reduction or parameter compression on the candidate radiomics features to select the target radiomics features most strongly correlated with kidney stone occurrence.
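The ICC-based retention step might look like the sketch below. The text does not say which ICC form or threshold is used; the two-way random-effects, absolute-agreement, single-measure form ICC(2,1) and the 0.75 cut-off are assumptions, and the feature names and ratings are invented for illustration.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a list of per-subject tuples holding one value per rater."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(r) for r in ratings) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def keep_reproducible(features, threshold=0.75):
    """Keep features whose ICC is not less than the preset threshold."""
    return [name for name, ratings in features.items() if icc_2_1(ratings) >= threshold]

# Each feature maps to (expert 1 value, expert 2 value) pairs, one pair per patient.
features = {
    "feature_a": [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)],   # the two experts agree exactly
    "feature_b": [(1.0, 3.0), (2.0, 1.0), (3.0, 2.0)],   # the two experts disagree
}
candidates = keep_reproducible(features)
```

The subsequent dimensionality reduction or parameter compression would operate only on the retained candidates.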
The independent risk factor extraction unit 22 is configured to collect the clinical variables of a plurality of patients, obtain from a univariate analysis model the independent risk with which each clinical variable affects kidney stone occurrence, and then, based on a multivariate analysis model, control for the cross-influencing factors among these independent risks to obtain the independent risk factors.
In some embodiments, obtaining the independent risks from the univariate analysis model and then controlling for their cross-influencing factors with the multivariate analysis model to obtain the independent risk factors includes: inputting the clinical variables one by one into the univariate analysis model, which outputs for each variable a first significance index of its effect on kidney stone occurrence; inputting the clinical variables whose first significance index meets a preset index threshold into the multivariate analysis model, which controls for the cross-influence between the individual factors and outputs a second significance index; and selecting the independent risk factors from the clinical variables according to the second significance index.
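The two-stage screening can be sketched as control flow only. The sketch assumes that both significance indices are p-values and that the preset threshold is 0.05; neither is stated in the text. The model fits themselves are passed in as callables, and the variable names and p-values below are placeholders invented to exercise the logic.

```python
def select_independent_factors(variables, univariate_p, multivariate_p, alpha=0.05):
    """Two-stage screening: univariate filter, then multivariate adjustment."""
    # Stage 1: keep variables whose univariate significance index meets the threshold.
    candidates = [v for v in variables if univariate_p(v) < alpha]
    if not candidates:
        return []
    # Stage 2: fit the candidates jointly so each one is adjusted for the others.
    adjusted = multivariate_p(candidates)
    return [v for v in candidates if adjusted[v] < alpha]

# Placeholder significance values, not results from any real analysis.
uni = {"var_a": 0.01, "var_b": 0.03, "var_c": 0.40}
multi = {"var_a": 0.02, "var_b": 0.20}
selected = select_independent_factors(
    list(uni), uni.get, lambda cands: {v: multi[v] for v in cands}
)
```

Here `var_c` is dropped at the univariate stage and `var_b` loses significance once adjusted for `var_a`, so only `var_a` remains.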
The classification result acquisition unit 23 is configured to combine the target radiomics features with the independent risk factors to obtain combined features and, based on a prediction model, obtain the risk prediction classification result corresponding to the combined features.
In some embodiments, obtaining the risk prediction classification result corresponding to the combined features based on the prediction model includes: inputting the combined features into a radiomics model based on the prediction model, which outputs the risk prediction classification result corresponding to the combined features. The radiomics model is obtained by training on the target radiomics features and the intercept corresponding to the target radiomics features.
In some embodiments, obtaining the radiomics model by training on the target radiomics features and their corresponding intercept includes: obtaining a training set and a validation set of the target radiomics features and their corresponding intercept; training and validating a plurality of different machine learning algorithm models on the training set and the validation set, respectively; and selecting as the radiomics model the algorithm model with the smallest difference between its training AUC and its validation AUC.
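The selection rule above (smallest gap between training and validation AUC) can be sketched as follows. AUC is computed here with the pairwise rank definition; the AUC values in `auc_by_model` are illustrative placeholders, not results reported in the patent, and the model names simply echo the algorithms mentioned in the description.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def pick_most_stable(auc_by_model):
    """auc_by_model maps a model name to (training AUC, validation AUC)."""
    return min(auc_by_model, key=lambda m: abs(auc_by_model[m][0] - auc_by_model[m][1]))

example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])

# Placeholder AUC pairs chosen only to show the selection rule.
auc_by_model = {
    "logistic_regression": (0.85, 0.83),
    "random_forest": (0.99, 0.80),
}
chosen = pick_most_stable(auc_by_model)
```

A small gap indicates that the model generalizes from the training set to the validation set, which is the sense of "most stable" used in the description.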
The interpretation unit 24 is configured to construct a nomogram from the combined features, visualize the probability of the classification result with the nomogram to obtain a risk probability, and interpret the classification result according to the risk description corresponding to the risk probability.
In some embodiments, constructing the nomogram from the combined features and visualizing the probability of the classification result with the nomogram to obtain the risk probability includes: obtaining the scores of the target radiomics features and of the independent risk factors on the nomogram, yielding a target radiomics feature score and an independent risk factor score; summing the target radiomics feature score and the independent risk factor score to obtain a total score; and obtaining the corresponding risk probability from the total score.
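One common way to realize the score-to-probability mapping is the standard logistic-regression nomogram construction, sketched below. The patent does not give the point scale, so the 0 to 100 axis, the coefficient values, the predictor ranges, and the names `radiomics_score` and `clinical_factor` are all assumptions made for illustration.

```python
import math

def nomogram_probability(coefs, ranges, values, intercept):
    """Map predictor values to nomogram points, then total points to a risk probability."""
    # Reference value per predictor: the end of its range that scores 0 points.
    ref = {k: (ranges[k][0] if coefs[k] >= 0 else ranges[k][1]) for k in coefs}
    # The predictor with the largest possible contribution spans the full 0-100 axis.
    max_effect = max(abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs)
    scale = 100.0 / max_effect
    points = {k: coefs[k] * (values[k] - ref[k]) * scale for k in coefs}
    total = sum(points.values())
    # Convert total points back to the linear predictor and apply the logistic function.
    linear = intercept + sum(coefs[k] * ref[k] for k in coefs) + total / scale
    return points, total, 1.0 / (1.0 + math.exp(-linear))

# Hypothetical model: one radiomics score plus one binary clinical factor.
coefs = {"radiomics_score": 1.2, "clinical_factor": 0.8}
ranges = {"radiomics_score": (-2.0, 2.0), "clinical_factor": (0.0, 1.0)}
values = {"radiomics_score": 0.5, "clinical_factor": 1.0}
points, total, risk = nomogram_probability(coefs, ranges, values, intercept=-0.5)
```

Because the points are a rescaling of each predictor's contribution to the linear predictor, the probability read from the total score equals the probability the underlying logistic model would output directly.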
In some embodiments, obtaining the target radiomics feature score includes: weighting the target radiomics features by their coefficients to obtain the target radiomics feature score.
In some embodiments, the target radiomics feature score is expressed as:
Radiomics score = a1*F1 + a2*F2 + ... + ai*Fi + b
where a1 to ai are the coefficients of the target radiomics features, F1 to Fi are the target radiomics features, i is the number of target radiomics features, and b is the intercept.
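The formula is a weighted sum plus an intercept, which can be written directly in code. The function name and the numeric values below are illustrative assumptions; the patent does not list concrete coefficients in this section.

```python
def radiomics_score(coefficients, features, intercept):
    """Compute a1*F1 + a2*F2 + ... + ai*Fi + b."""
    if len(coefficients) != len(features):
        raise ValueError("each feature needs exactly one coefficient")
    return sum(a * f for a, f in zip(coefficients, features)) + intercept

# Two hypothetical features with made-up coefficients and intercept.
score = radiomics_score([0.5, -1.0], [2.0, 3.0], intercept=0.25)
```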
In this embodiment, the target radiomics feature extraction unit 21 acquires multi-planar CT images of the renal tissue so that the renal tissue is examined from all directions and no useful CT image information is missed, which improves the accuracy of kidney stone risk prediction. The independent risk factor extraction unit 22 first uses the univariate analysis model to obtain the independent risk with which each clinical variable affects kidney stone occurrence, and then uses the multivariate analysis model to control for the cross-influencing factors among these risks, yielding more precise independent risk factors and further improving prediction accuracy. The classification result acquisition unit 23 then combines the target radiomics features of the CT images with the independent risk factors to perform the risk prediction, and the interpretation unit 24 constructs the nomogram, obtains the risk probability, and interprets the classification result according to the risk description corresponding to the risk probability, which improves the interpretability of kidney stone risk prediction.
Those skilled in the art will appreciate that embodiments of the present application may also be provided as a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above describes only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make further improvements and modifications without departing from the technical principles of the present invention, and such improvements and modifications shall also fall within the scope of protection of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410513073.4A CN118398211A (en) | 2024-04-26 | 2024-04-26 | Kidney stone risk prediction method and system based on CT image histology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118398211A (en) | 2024-07-26 |
Family
ID=91998812
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410513073.4A (Pending) CN118398211A (en) | 2024-04-26 | 2024-04-26 | Kidney stone risk prediction method and system based on CT image histology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118398211A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118899090A (en) * | 2024-09-30 | 2024-11-05 | 华中科技大学协和深圳医院 | A probability prediction method and prediction device for cholangitis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||