CN109875546B - Depth model classification result visualization method for electrocardiogram data - Google Patents
- Publication number: CN109875546B
- Application number: CN201910067724.0A
- Authority: CN (China)
- Prior art keywords
- model
- result
- value
- interval
- heartbeat
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The invention discloses a method for visualizing the classification results of a deep model on electrocardiogram (ECG) data, comprising: inputting an ECG sequence into a trained deep model to obtain a baseline result; erasing the information of a selected heartbeat interval with an occlusion interval, comparing the deep model's output without that heartbeat interval against the baseline result, and computing the influence factor ΔO of each heartbeat on the deep model; and rendering the influence factor ΔO of each heartbeat with a gradient color band, thereby visualizing the classification result. By analyzing how the ECG data affects the model output at both macroscopic and microscopic granularity, the invention exposes the key evidence behind the model's classification result and improves the interpretability of that result.
Description
Technical Field
The invention belongs to the technical field of visualizing deep model classification results, and in particular relates to a method for visualizing deep model classification results on electrocardiogram (ECG) data.
Background
According to Wikipedia's definition, ECG data is a transthoracic recording over time of the heart's electrophysiological activity, captured by electrodes placed on the skin. In practice, to improve efficiency and reduce physicians' workload, deep-learning models have been applied to feature extraction and classification of ECG data. Existing models, however, output only the final classification and cannot explain the basis for it. A prediction without a clear explanation is hard to accept and apply in practice, which severely limits the application scenarios and makes it harder for physicians to use the model's classification output.
In summary, a method for visualizing deep model classification results on ECG data is urgently needed.
Summary of the Invention
The object of the invention is to provide a method for visualizing deep model classification results on ECG data that solves the technical problem described above. The invention exposes the key evidence behind the final result and improves the interpretability of the model's classification output.
To achieve this object, the invention adopts the following technical solution:
A method for visualizing deep model classification results on ECG data, comprising the following steps:
Step 1: process the collected ECG data into an ECG sequence and input the sequence into a trained deep model to obtain a baseline result;
Step 2: taking the heartbeat interval as the basic unit, dynamically adjust the occlusion interval according to the heartbeat information in the ECG data; erase the information of the selected heartbeat interval with the occlusion interval; compare the deep model's output without that heartbeat interval against the baseline output obtained with it; and compute the influence factor ΔO of each heartbeat on the deep model;
Step 3: render the influence factor ΔO of each heartbeat with a gradient color band, thereby visualizing the deep model's classification result.
Further, the method also comprises:
Step 4: set a movable occlusion interval and occlude each point of the ECG data in turn; compare the deep model's output for each occluded position against the baseline result to obtain the influence factor of each point of the ECG data on the model output;
Step 5: visualize the influence factor of each point obtained in Step 4.
Further, Step 2 specifically comprises:
Step 2.1: obtain the length of each heartbeat interval from the original ECG data, set the occlusion interval dynamically according to that length, and occlude each heartbeat interval in turn;
Step 2.2: input each ECG sequence vector with an occlusion interval applied into the deep model to obtain a new model output;
Step 2.3: compute the difference between each new model output obtained in Step 2.2 and the baseline result obtained in Step 1, which gives the influence factor of each heartbeat interval on the model output.
Further, Step 3 specifically comprises:
Step 3.1: encode the ΔO value of each heartbeat interval to obtain a corresponding color sequence. The rule is: when ΔO > 0, encode it as one preset color, with a deeper shade for larger values; when ΔO < 0, encode it as a different preset color, with a deeper shade for smaller (more negative) values;
Step 3.2: taking the length of each heartbeat interval as the rectangle width and the height of the tallest R peak on the ECG as the rectangle height, divide the ECG data sequence into rectangles, each containing one heartbeat interval; fill each rectangle with the color generated in Step 3.1 for its heartbeat interval;
Step 3.3: overlay the colored rectangles obtained in Step 3.2 on the ECG data as background, thereby visualizing the deep model's classification result.
Further, in Step 3.2, the center of each rectangle is made transparent and its two ends are filled with the color, so that the rectangle becomes a gradient color band.
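The encoding rule of Step 3.1 can be sketched as below. This is only an illustration under stated assumptions: red for ΔO > 0 follows the detailed description, while blue for ΔO < 0, the normalization by the largest |ΔO|, and the function name `encode_colors` are hypothetical choices, since the text only requires two different preset colors whose depth grows with |ΔO|.

```python
from typing import List, Sequence, Tuple

RGBA = Tuple[int, int, int, float]


def encode_colors(
    deltas: Sequence[float],
    pos_rgb: Tuple[int, int, int] = (255, 0, 0),  # red for ΔO > 0 (per the embodiment)
    neg_rgb: Tuple[int, int, int] = (0, 0, 255),  # blue for ΔO < 0 (assumed color)
) -> List[RGBA]:
    """Map each influence factor ΔO to an RGBA color.

    The hue encodes the sign of ΔO and the opacity encodes |ΔO|,
    scaled by the largest |ΔO| in the sequence (an assumed normalization).
    """
    peak = max((abs(d) for d in deltas), default=0.0)
    colors: List[RGBA] = []
    for d in deltas:
        alpha = abs(d) / peak if peak > 0 else 0.0
        rgb = pos_rgb if d > 0 else neg_rgb
        # ΔO == 0 ends up fully transparent, so its hue does not matter.
        colors.append((rgb[0], rgb[1], rgb[2], round(alpha, 3)))
    return colors
```

Each returned tuple could then be used as the fill color of the rectangle covering the corresponding heartbeat interval.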
Further, Step 1 specifically comprises:
After processing, the ECG data is represented as the ECG sequence
$$S = [s_1, s_2, \dots, s_i, \dots, s_n]$$
where $S$ is an n-dimensional vector, $i = 1, 2, \dots, n$, and $s_i$ is the data at the i-th point of the sequence.
Inputting the ECG sequence into the trained deep model yields a result of the form
$$Y = [y_1, y_2, \dots, y_j, \dots, y_N]$$
where $Y$ is an N-dimensional vector, $N$ is the number of labels the model classifies over, $j = 1, 2, \dots, N$, and $y_j$ is the model's classification value for label $j$, with $0 \le y_j \le 1$.
The label at which $y_j$ attains its maximum is the deep model's predicted class. The value $y_j$ for that label is taken as the baseline value $O$ and its label index is denoted $I$. The baseline value is
$$O = \max\{y_1, y_2, \dots, y_j, \dots, y_N\}$$
where $y_j$ is the model's classification value for label $j$, with $0 \le y_j \le 1$.
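As a minimal sketch of Step 1, the baseline value $O$ and its label index $I$ can be read off the model output as follows. The names `baseline` and `toy_model` are hypothetical; `toy_model` is only a stand-in for a trained deep model so that the example runs, and indices are 0-based here whereas the text counts labels from 1.

```python
from typing import Callable, List, Sequence, Tuple

Model = Callable[[Sequence[float]], List[float]]


def toy_model(s: Sequence[float]) -> List[float]:
    """Hypothetical two-label stand-in for a trained model (values in [-1, 1] assumed)."""
    p = 0.5 + sum(s) / (2 * len(s))
    return [1.0 - p, p]


def baseline(model: Model, s: Sequence[float]) -> Tuple[int, float, List[float]]:
    """Return (I, O, Y): predicted label index, baseline value, full output vector."""
    y = list(model(s))
    label = max(range(len(y)), key=lambda j: y[j])  # index of the largest y_j
    return label, y[label], y
```

The returned index `I` is kept because every later influence factor is measured on that same label.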
Further, Step 2.1: dynamically determine the length of the occlusion interval.
The R-peak position label of every heartbeat is obtained from the original ECG data, and the span between two adjacent R peaks is treated as the RR interval of one heartbeat. The length of the k-th occlusion interval is set to
$$\mathrm{Length}_k = x_{k+1} - x_k$$
where $\mathrm{Length}_k$ is the length of the occlusion interval placed on the k-th RR interval, $x_k$ is the abscissa of the k-th R peak, $0 \le x_k \le \mathrm{Len}$, and $\mathrm{Len}$ is the total length of the ECG sequence;
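The interval lengths of Step 2.1 reduce to differences of consecutive R-peak positions. The sketch below assumes the R peaks are already available as sorted sample indices; the function name `rr_lengths` is hypothetical.

```python
from typing import List, Sequence


def rr_lengths(r_peaks: Sequence[int]) -> List[int]:
    """Length_k = x_{k+1} - x_k for each pair of consecutive R-peak positions."""
    return [x_next - x_k for x_k, x_next in zip(r_peaks, r_peaks[1:])]
```

With R peaks at samples 10, 110, 230 and 330, this gives three occlusion lengths, one per RR interval; the last R peak only closes the final interval.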
Step 2.2: compute the influence factor of each heartbeat interval on the deep model's output.
Step 2.2.1: align the start of the occlusion interval with the R peak of the k-th heartbeat and set its length to $\mathrm{Length}_k$, so that the occlusion interval covers the information of the k-th heartbeat interval;
Step 2.2.2: set every vector value inside the occlusion interval to 0 and leave the values at all other positions unchanged. The modified ECG sequence is
$$S_k = [s_1, s_2, \dots, 0, \dots, 0, \dots, s_n]$$
where $s_i$ is the data at the i-th point of the sequence, and the zeroed region starts at the R peak of the k-th heartbeat and has length $\mathrm{Length}_k$;
Step 2.2.3: input the occluded ECG sequence vector $S_k$ into the deep model to obtain a new model output $Y_k$, an N-dimensional vector
$$Y_k = [y'_1, y'_2, \dots, y'_N]$$
where $y'_1, y'_2, \dots, y'_N$ are the output values for labels $1, 2, \dots, N$;
Step 2.2.4: compute the difference $\Delta O_k$ between the baseline result and the deep-model result $O_k$ obtained with the k-th heartbeat interval occluded. $\Delta O_k$ is the influence factor of the k-th heartbeat interval:
$$\Delta O_k = y_I - y'_I$$
where $I$ is the label index of the baseline value $O$ computed in Step 1, and $y_I$ and $y'_I$ are the deep model's outputs for that label without and with occlusion. $\Delta O_k > 0$ means the heartbeat interval has a positive influence on the classification result and is supporting evidence for the model; the larger the value, the better the interval agrees with the classification result. $\Delta O_k < 0$ means the heartbeat interval has a negative influence on the final classification and is opposing evidence; the more negative the value, the more the interval deviates from the classification result;
The value of ΔO thus distinguishes how different heartbeat intervals affect the model's classification result, which provides the explanation of that result.
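Steps 2.2.1 to 2.2.4 can be sketched as a loop over RR intervals. This is an illustration, not a definitive implementation: `toy_model` is a hypothetical stand-in for the trained deep model, the function names are invented for this example, and indices are 0-based.

```python
from typing import Callable, List, Sequence, Tuple

Model = Callable[[Sequence[float]], List[float]]


def toy_model(s: Sequence[float]) -> List[float]:
    """Hypothetical two-label stand-in for a trained model (values in [-1, 1] assumed)."""
    p = 0.5 + sum(s) / (2 * len(s))
    return [1.0 - p, p]


def occlude(s: Sequence[float], start: int, length: int) -> List[float]:
    """Return a copy of s with `length` samples from `start` set to 0."""
    out = list(s)
    for i in range(start, min(start + length, len(out))):
        out[i] = 0.0
    return out


def heartbeat_influence(
    model: Model, s: Sequence[float], r_peaks: Sequence[int]
) -> Tuple[int, List[float]]:
    """Return (I, [ΔO_1, ΔO_2, ...]) with ΔO_k = y_I - y'_I for each RR interval."""
    y = model(s)
    label = max(range(len(y)), key=lambda j: y[j])  # baseline label index I
    deltas = []
    for x_k, x_next in zip(r_peaks, r_peaks[1:]):
        y_occ = model(occlude(s, x_k, x_next - x_k))  # output for S_k
        deltas.append(y[label] - y_occ[label])
    return label, deltas
```

With the toy model, the sequence `[1, 1, 1, 1, -1, -1, 0, 0]` and R peaks at `[0, 4, 8]`, the first interval yields ΔO = 0.25 (supporting evidence) and the second yields ΔO = -0.125 (opposing evidence).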
Further, Step 4 specifically comprises:
Step 4.1: starting from the first element of the ECG sequence vector $S$, set the following $L$ vector values to 0 and leave the values at all other positions unchanged, forming an occlusion interval. The occlusion interval starts at the first element and moves one position forward each time until all the data in the ECG sequence has been traversed;
The ECG sequence vector $S_m$ with the occlusion interval applied in the m-th iteration is
$$S_m = [s_1, s_2, \dots, s_{m-1}, 0, 0, \dots, 0, s_{m+L}, \dots, s_n]$$
where $s_1, s_2, \dots, s_n$ are the individual data points of the ECG sequence. As the formula shows, $s_m, s_{m+1}, \dots, s_{m+L-1}$ fall inside the occlusion interval and are all set to 0;
Step 4.2: point by point, compute the difference between the baseline result and the deep model's output after the occlusion interval is applied, which gives the influence factor ΔO for every point on the ECG;
The specific steps are:
Step 4.2.1: input the ECG sequence vector $S_m$ occluded in the m-th iteration into the deep model to obtain the model output $Y_m$:
$$Y_m = [y'_1, y'_2, \dots, y'_N]$$
where $y'_1, y'_2, \dots, y'_N$ are the output values for labels $1, 2, \dots, N$;
Step 4.2.2: compute the difference $\Delta O_m$ between the model's baseline result and the new model output obtained in Step 4.2.1. This value reflects the influence of an individual point on the model output:
$$\Delta O_m = y_I - y'_I$$
where $I$ is the label index of the computed baseline value $O$, and $y_I$ and $y'_I$ are the deep model's outputs for that label without and with occlusion. $\Delta O_m$ is the influence factor of the m-th data point of the sequence on the deep model's output. $\Delta O_m > 0$ means the point has a positive influence on the final classification and is supporting evidence for the model; the larger the value, the better the point agrees with the model's final result. $\Delta O_m < 0$ means the point has a negative influence on the final classification and is opposing evidence; the more negative the value, the more the point deviates from the final result;
The value of ΔO thus gives the influence factor of every point on the ECG on the model's classification result, which explains the fine-grained detail in the ECG data.
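The sliding occlusion of Step 4 can be sketched as follows. Two points are assumptions rather than statements from the text: the window is truncated when it reaches the end of the sequence, and `toy_model` is a hypothetical stand-in for the trained deep model. The default `window=10` sits at the lower end of the range given for L.

```python
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], List[float]]


def toy_model(s: Sequence[float]) -> List[float]:
    """Hypothetical two-label stand-in for a trained model (values in [-1, 1] assumed)."""
    p = 0.5 + sum(s) / (2 * len(s))
    return [1.0 - p, p]


def pointwise_influence(model: Model, s: Sequence[float], window: int = 10) -> List[float]:
    """Return ΔO_m = y_I - y'_I for an occlusion window starting at each position m."""
    y = model(s)
    label = max(range(len(y)), key=lambda j: y[j])  # baseline label index I
    deltas = []
    for m in range(len(s)):
        occluded = list(s)
        # Zero s_m .. s_{m+L-1}; the window is cut short at the tail (assumption).
        for i in range(m, min(m + window, len(s))):
            occluded[i] = 0.0
        deltas.append(y[label] - model(occluded)[label])
    return deltas
```

Note the cost: one forward pass of the model per sample, so a sequence of n points needs n + 1 model evaluations.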
Further, Step 5 specifically comprises:
Step 5.1: encode the ΔO value of each point obtained in Step 4.2 as a height, and use the point's position and height to determine a point P on the ECG plane. If ΔO > 0, P lies in the region above the zero axis and the corresponding point on the ECG is shown in one preset color; if ΔO = 0, P falls on the zero axis and the corresponding point on the ECG is shown in a second preset color; if ΔO < 0, P lies in the region below the zero axis and the corresponding point on the ECG is shown in a third preset color. The three preset colors are all different;
Step 5.2: connect the points formed with the sequence index as abscissa and the ΔO value as ordinate using a smooth curve, which together with the zero axis encloses a number of regions. The height of the curve reflects the magnitude of |ΔO|, and the peaks and troughs of the curve mark the key evidence that supports or contradicts the model result;
Step 5.3: fill the regions enclosed by the curve of Step 5.2 with different preset colors, thereby visualizing the deep model's classification result.
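Before filling, the regions of Step 5.2 have to be identified. One simple way, sketched below, is to group consecutive points by the sign of ΔO. This is a simplification: it works on the raw points and skips the smooth-curve interpolation described in the text, and the function name `sign_regions` is hypothetical.

```python
from typing import List, Sequence, Tuple


def sign_regions(deltas: Sequence[float]) -> List[Tuple[int, int, int]]:
    """Group consecutive points with the same sign of ΔO.

    Returns (start_index, end_index, sign) tuples with inclusive indices,
    where sign is +1 (above the zero axis), 0 (on it) or -1 (below it).
    """
    regions: List[Tuple[int, int, int]] = []
    for m, d in enumerate(deltas):
        sign = (d > 0) - (d < 0)
        if regions and regions[-1][2] == sign:
            start, _, _ = regions[-1]
            regions[-1] = (start, m, sign)  # extend the current region
        else:
            regions.append((m, m, sign))    # start a new region
    return regions
```

Each region can then be filled with the preset color for its sign, for example one color for `+1` regions and another for `-1` regions.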
Further, in Step 4, the length $L$ of the occlusion interval lies in the range $10 \le L \le 20$.
Compared with the prior art, the invention has the following beneficial effects:
The method of the invention designs a visualization process that goes from the global view down to the details, and can fully present the key evidence that drives the model's result. The method first inputs the collected original ECG data into the deep model to obtain the model's output, determines the predicted class from that output, and saves the output as the baseline result for the subsequent comparisons that yield the influence factors. It then sets the occlusion interval parameters dynamically from the heartbeat information, derives the influence of every heartbeat interval on the model's final prediction, and displays it visually. It further uses a movable occlusion interval to compute the deviation of every point from the baseline and overlays that value on the original ECG data, so that peaks and region areas reveal the fine-grained features of the ECG data and make abnormal detail regions easy to find. By using occlusion intervals to measure the influence of a specific region on the final result, and by analyzing the influence of the ECG data at both macroscopic and microscopic granularity, the invention exposes the key evidence behind the model's final result and improves the interpretability of the model result.
The visualization method of the invention improves the interpretability of the model result. With conventional methods the model result is a single classification label, and there is no way to explain the basis for it, so such a result is hard to adopt and use. The method of the invention explains the model result, finds the evidence that supports and opposes it, and shows how each detail affects the final result, which greatly improves interpretability.
The invention visualizes the explanation from both the macroscopic and the microscopic perspective. With conventional methods the ECG data is cluttered and lengthy, and picking out the key information is time-consuming and laborious. Regions with a large influence on the model result are likely to be the critical abnormal regions, for example regions where the P wave disappears. The method of the invention finds such regions in the ECG data and displays them at both granularities through visual elements such as color and height, which makes the model's behavior more intuitive and further improves the interpretability of the model result.
The method of the invention applies to a wide range of deep-learning models and is highly extensible. With conventional methods, explaining a model result requires reference to the model's structure and cannot be carried over to other models. The method of the invention does not depend on a specific model: the classification results of any deep model applicable to ECG data can be explained and displayed with it, and it extends readily to the improved models that continue to appear.
Brief Description of the Drawings
Fig. 1 is a schematic flow block diagram of the method of the invention for visualizing deep model classification results on ECG data;
Fig. 2 is a schematic flow block diagram of the heartbeat-interval influence visualization within the method of the invention;
Fig. 3 is a schematic flow block diagram of the point-by-point influence visualization within the method of the invention;
Fig. 4 is a schematic diagram of the heartbeat-interval influence visualization result of the method of the invention;
Fig. 5 is a schematic diagram of the point-by-point influence visualization result of the method of the invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and specific embodiments.
The method of the invention for visualizing deep model classification results on ECG data specifically comprises the following steps:
Step 1: collect a preset number of already diagnosed ECG records, process each record into an ECG sequence, and input each ECG sequence into the selected trained deep model to obtain the model output. This output is taken as the deep model's baseline result.
After processing, the original ECG data is represented as the ECG sequence
$$S = [s_1, s_2, \dots, s_i, \dots, s_n]$$
where $S$ is an n-dimensional vector, $i = 1, 2, \dots, n$, and $s_i$ is the data at the i-th point of the sequence. Inputting this data sequence into the preset trained deep model yields a result of the form
$$Y = [y_1, y_2, \dots, y_j, \dots, y_N]$$
where $Y$ is an N-dimensional vector, $N$ is the number of labels the model classifies over, $j = 1, 2, \dots, N$, and $y_j$ is the model's classification value for label $j$, with $0 \le y_j \le 1$. The label at which $y_j$ attains its maximum is the deep model's predicted class. The value $y_j$ for that label is taken as the baseline value $O$ and its label index is denoted $I$. The baseline value is
$$O = \max\{y_1, y_2, \dots, y_j, \dots, y_N\}$$
where $y_j$ is the model's classification value for label $j$, with $0 \le y_j \le 1$.
步骤2,宏观上展示不同心跳区间对于深度模型输出结果的影响。Step 2: Macroscopically display the impact of different heartbeat intervals on the output results of the depth model.
把心跳间隔作为基本单位,根据心电图数据中的心跳信息动态调整遮挡区间,并计算每一次心跳区间对于最终模型的影响因子。然后采用渐变色带将这种影响可视化表示出来。Taking the heartbeat interval as the basic unit, the occlusion interval is dynamically adjusted according to the heartbeat information in the ECG data, and the impact factor of each heartbeat interval on the final model is calculated. This effect is then visualized using a gradient color ramp.
步骤2具体包括以下步骤:Step 2 specifically includes the following steps:
步骤2.1,动态确定遮挡区间长度。Step 2.1, dynamically determine the length of the occlusion interval.
从原始心电图数据中,可以得到每一个心跳的R峰位置标签,两个R峰之间认为是一次心跳的RR区间。因此设置第k次心跳的遮挡区间长度为:From the original ECG data, the R-peak position label of each heartbeat can be obtained, and the interval between two R-peaks is considered to be the RR interval of a heartbeat. Therefore, the length of the occlusion interval of the kth heartbeat is set as:
Lengthk=xk+1-xk Length k = x k+1 -x k
式中,Lengthk表示第k个RR区间上设置的遮挡区间的长度,xk表示第k个R峰位置的横坐标,0≤xk≤Len,Len表示心电图序列的总长度;In the formula, Length k represents the length of the occlusion interval set on the kth RR interval, x k represents the abscissa of the kth R peak position, 0≤xk ≤Len, and Len represents the total length of the ECG sequence;
步骤2.2,计算每一个心跳区间对于深度模型输出结果的影响。Step 2.2: Calculate the impact of each heartbeat interval on the output of the deep model.
从步骤2.1中获得了第k次心跳区间的长度,接下来根据该心跳区间的长度动态设置遮挡区间;具体步骤包括:The length of the kth heartbeat interval is obtained from step 2.1, and then the occlusion interval is dynamically set according to the length of the heartbeat interval; the specific steps include:
步骤2.2.1,将遮挡区间开始位置与第k次心跳的R峰位置对齐,遮挡区间长度设置为Lengthk,使得遮挡区间恰好覆盖第k次心跳RR区间信息;Step 2.2.1, align the start position of the occlusion interval with the R peak position of the kth heartbeat, and set the length of the occlusion interval to Length k , so that the occlusion interval just covers the information of the RR interval of the kth heartbeat;
步骤2.2.2,将遮挡区间内的向量值统一赋值为0,其余位置的向量值保持不变,修改后的心电图序列为:Step 2.2.2, uniformly assign the vector values in the occlusion interval to 0, and keep the vector values in other positions unchanged. The modified ECG sequence is:
Sk=[s1,s2,…,0,…,0,…,sn]S k =[s 1 ,s 2 ,…,0,…,0,…,s n ]
其中,si表示序列中第i个点的数据,赋值为0的区域从第k次心跳的R峰开始,长度为Lengthk;Wherein, si represents the data of the ith point in the sequence, and the area assigned as 0 starts from the R peak of the kth heartbeat, and the length is Length k ;
步骤2.2.3,在步骤2.2.2中我们在第k次心跳区间上设置了遮挡区间,现在将添加了遮挡区间的心电图序列Sk向量输入到深度模型中,得到新的深度模型输出结果Yk,Yk是N维向量,表达式为:Step 2.2.3, in step 2.2.2, we set the occlusion interval on the kth heartbeat interval, and now the ECG sequence S k vector with the occlusion interval added is input into the depth model, and the new output result Y of the depth model is obtained. k , Y k is an N-dimensional vector whose expression is:
Yk=[y′1,y′2,…,y′N]Y k =[y′ 1 ,y′ 2 ,…,y′ N ]
式中,y′1,y′2,…,y′N分别表示在1,2,…,N标签上的输出值;In the formula, y′ 1 , y′ 2 ,…,y′ N represent the output values on the labels 1, 2,…,N, respectively;
步骤2.2.4,计算获得遮挡第k次心跳区间信息的深度模型结果Ok与基准结果的差值ΔOk;ΔOk为第k个心跳区间的影响因子,表达式为:Step 2.2.4, calculate and obtain the difference ΔO k between the depth model result O k that blocks the information of the kth heartbeat interval and the reference result; ΔOk is the influence factor of the kth heartbeat interval, and the expression is:
ΔOk=yI-y′I ΔO k =y I -y' I
式中,I表示步骤1中计算得到的基准值O的标签序号,yI和y′I表示在该标签序号上的深度模型输出值;ΔOk表示第k个心跳区间对于深度模型输出结果的影响因子;ΔOk>0表示该心跳区间对模型分类结果具有正面影响,是模型的支持证据,该值越大,表示与模型分类结果越契合;ΔOk<0表示该心跳区间对最终分类结果具有负面影响,是模型的反对证据,该值为负值,值越小表示与模型分类结果越背离;In the formula, I represents the label number of the reference value O calculated in step 1, y I and y′ I represent the depth model output value on the label number; ΔO k represents the kth heartbeat interval for the depth model output result. Impact factor; ΔO k >0 indicates that the heartbeat interval has a positive impact on the model classification result, which is the supporting evidence of the model. The larger the value, the better the model classification result; ΔO k <0 indicates that the heartbeat interval has a positive impact on the final classification result. It has a negative impact and is the evidence against the model. The value is negative, and the smaller the value, the greater the deviation from the model classification result;
The value of $\Delta O$ thus distinguishes how different heartbeat intervals influence the classification result, which provides an explanation of that result.
Step 2.2.5: move the occlusion interval and repeat the above process until the influence factor of every heartbeat interval has been computed.
The purpose of the occlusion interval is to erase the information of one heartbeat. Comparing the model output without that heartbeat against the output with it gives the heartbeat's influence factor on the model. Repeating this operation yields the influence factor of every heartbeat on the model result.
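The per-heartbeat procedure of steps 2.2.2 to 2.2.5 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: `model` is a hypothetical callable that maps an ECG sequence to a list of $N$ label scores, `r_peaks` holds 0-based R-peak sample indices (the text uses 1-based indices), and the function name is illustrative.

```python
from typing import Callable, List, Sequence

# Hypothetical model interface: ECG sequence in, one score per label out.
Model = Callable[[Sequence[float]], List[float]]


def heartbeat_influence(signal: Sequence[float],
                        r_peaks: Sequence[int],
                        model: Model) -> List[float]:
    """Return one influence factor (delta O_k) per RR interval."""
    baseline = model(signal)
    # Index I of the predicted label, i.e. the label with the largest score.
    label = max(range(len(baseline)), key=baseline.__getitem__)
    factors = []
    for start, end in zip(r_peaks, r_peaks[1:]):
        occluded = list(signal)
        # Zero the samples from the k-th R peak up to the next R peak.
        occluded[start:end] = [0.0] * (end - start)
        # delta O_k = y_I - y'_I
        factors.append(baseline[label] - model(occluded)[label])
    return factors
```

Each RR interval costs one extra forward pass of the model, so a recording with $K$ detected R peaks needs $K - 1$ passes in addition to the baseline pass.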
Step 2.3: visualize the influence factor of each heartbeat.
The difference $\Delta O$ obtained in step 2.2 represents the influence of a heartbeat interval on the final model result. However, ECG recordings are long and contain many heartbeat intervals, so raw numbers are not intuitive and a matching visualization method is needed. Mapping each value to the color of a rectangle shows the behavior of every heartbeat interval directly on the ECG.
Step 2.3 proceeds as follows:
(1) Encode $\Delta O$ as a color; each heartbeat interval has one $\Delta O$ value, so encoding yields a color sequence.
To show the meaning of $\Delta O$ intuitively, the present invention encodes it as a color according to the following rules:
when $\Delta O > 0$, it is encoded as red, and the larger the value, the deeper the red;
when $\Delta O < 0$, it is encoded as blue, and the smaller the value, the deeper the blue.
(2) Generate gradient rectangles
Taking the length of each heartbeat interval as the rectangle width and the height of the highest R peak on the ECG as the rectangle height, the ECG data sequence is divided into rectangles, each containing one heartbeat interval. Each rectangle is filled with the color encoded from its heartbeat interval. So as not to obscure the ECG trace, the center of the rectangle is made transparent and its two ends carry the fill color, turning the rectangle into a gradient color band.
(3) Overlay the rectangles on the ECG background
Overlaying the gradient rectangle of each heartbeat interval on the ECG background produces the visualization.
The color over each heartbeat interval identifies supporting and opposing evidence for the model, and the color depth indicates how strongly that evidence influences the final classification result; in this way the method of the present invention explains the model's classification result at the heartbeat level.
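The color rule in item (1) can be sketched in Python as follows. The text fixes only the hue (red or blue) and says that depth grows with the magnitude of $\Delta O$; the linear scaling by the largest absolute value, the RGBA representation, and the function names are assumptions made for this sketch. The transparent-center gradient of item (2) is not covered here.

```python
from typing import List, Sequence, Tuple

RGBA = Tuple[float, float, float, float]


def delta_to_rgba(delta: float, max_abs: float) -> RGBA:
    """Map one influence factor to a colour; alpha stands in for colour depth."""
    if max_abs <= 0 or delta == 0:
        return (0.0, 0.0, 0.0, 0.0)  # no influence: fully transparent
    # Assumed scaling: depth is proportional to |delta O|, capped at 1.
    strength = min(abs(delta) / max_abs, 1.0)
    if delta > 0:
        return (1.0, 0.0, 0.0, strength)  # supporting evidence: red
    return (0.0, 0.0, 1.0, strength)      # opposing evidence: blue


def encode_heartbeats(deltas: Sequence[float]) -> List[RGBA]:
    """Turn the per-heartbeat delta O values into a colour sequence."""
    max_abs = max((abs(d) for d in deltas), default=0.0)
    return [delta_to_rgba(d, max_abs) for d in deltas]
```

For example, `encode_heartbeats([0.5, -0.25, 0.0])` gives a fully opaque red, a half-strength blue, and a transparent entry.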
Step 3: show, at the micro level, how details of the ECG data influence the model result.
Step 2 took the heartbeat as the unit, found the influence of each heartbeat on the model result, and gave a first explanation of the classification result. Details inside a heartbeat interval, however, can also strongly influence the classification result, and they are lost if left unprocessed. The details within the ECG sequence therefore also need to be visualized, to explain the basis of the classification in finer detail and strengthen the model's interpretability.
Step 3 comprises the following steps:
Step 3.1: set a movable occlusion interval.
Because a point-by-point influence factor is required, the occlusion interval starts at the first point and advances by one point each time. The length $L$ of the occlusion interval lies in the range $10 \le L \le 20$; this patent uses $L = 15$, an empirical value obtained by comparing the results of multiple experiments. If the interval is too short, the change in the model output is too small to reflect the influence of a single point on the overall result; if it is too long, the influences of different points are blurred together. The present invention treats the influence of the whole occlusion interval on the model result as the influence factor of the first point in the interval, so moving the interval point by point gives the influence of every individual point on the model result.
Step 3.2: compute the differences point by point.
After step 3.1 the length of the occlusion interval is fixed. The occlusion interval is then used to compute the point-by-point differences; step 3.2 proceeds as follows:
Step 3.2.1: starting from the first data point of the ECG sequence vector $S$, set $L$ consecutive values to 0 and leave the values at all other positions unchanged, forming the occlusion interval; the interval starts at the first data point and advances by one point each time until every data point in the ECG data has been traversed;
In the $m$-th iteration, the ECG sequence $S_m$ with the occlusion interval applied is:
$S_m = [s_1, s_2, \ldots, s_{m-1}, 0, 0, \ldots, 0, s_{m+L}, \ldots, s_n]$
where $s_1, s_2, \ldots, s_n$ are the individual data points of the ECG sequence; as the formula shows, the occlusion interval covers $s_m, s_{m+1}, \ldots, s_{m+L-1}$, and all values inside it are set to 0;
Step 3.2.2: feed the occluded ECG sequence $S_m$ of the $m$-th iteration into the deep model to obtain the model output $Y_m$:
$Y_m = [y'_1, y'_2, \ldots, y'_N]$
where $y'_1, y'_2, \ldots, y'_N$ are the output values for labels $1, 2, \ldots, N$, respectively;
Step 3.2.3: compute the difference $\Delta O_m$ between the new model output obtained in step 3.2.2 and the baseline model result; this value reflects the influence of an individual point on the model output:
$\Delta O_m = y_I - y'_I$
where $I$ is the label index of the baseline value $O$ computed in step 1, and $y_I$ and $y'_I$ are the deep model outputs for that label without and with the occlusion; $\Delta O_m$ is the influence factor of the $m$-th data point in the sequence on the deep model output. $\Delta O_m > 0$ means the point has a positive influence on the final classification result and is supporting evidence for the model; the larger the value, the more closely the point agrees with the final result. $\Delta O_m < 0$ means the point has a negative influence on the final classification result and is evidence against the model; the more negative the value, the further the point departs from the final result;
The values of $\Delta O$ thus give the influence factor of every point on the ECG on the classification result, which explains the detailed information in the ECG data.
Step 3.2.4: advance the occlusion interval by one point and repeat the above process until the last point has been computed. This yields the $\Delta O$ value of every point on the ECG.
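The sliding occlusion of steps 3.2.1 to 3.2.4 can be sketched in Python as follows. This is a minimal sketch, not the patent's implementation: `model` is a hypothetical callable returning one score per label, indices are 0-based, and truncating the window at the end of the sequence is an assumption, since the text does not say how the last $L - 1$ points are handled.

```python
from typing import Callable, List, Sequence

# Hypothetical model interface: ECG sequence in, one score per label out.
Model = Callable[[Sequence[float]], List[float]]


def pointwise_influence(signal: Sequence[float],
                        model: Model,
                        window: int = 15) -> List[float]:
    """Return delta O_m for every sample, using a sliding zeroed window."""
    baseline = model(signal)
    # Index I of the predicted label.
    label = max(range(len(baseline)), key=baseline.__getitem__)
    deltas = []
    for m in range(len(signal)):
        occluded = list(signal)
        # Assumption: the window is cut short when it would run past the end.
        end = min(m + window, len(signal))
        occluded[m:end] = [0.0] * (end - m)
        # The effect of the whole window is attributed to its first sample.
        deltas.append(baseline[label] - model(occluded)[label])
    return deltas
```

The loop runs the model once per sample, so a sequence of $n$ points needs $n$ forward passes in addition to the baseline pass.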
Step 3.3: visualize the point-by-point contributions.
Step 3.2 computed the $\Delta O$ value of every point, which reflects the influence of each individual point on the final classification result. Reading the value of every point is not intuitive, so a point-wise visualization method is also needed. Point values differ from heartbeat-interval values: the color of a single point is hard to see, so the visualization of the previous stage cannot be reused and the display must suit the characteristics of point-wise data.
Step 3.3 proceeds as follows:
Step 3.3.1: encode the $\Delta O$ value of each point as a height.
After step 3.2, every data point in the ECG data sequence has a $\Delta O$ value. That value is encoded as a height, and the abscissa of the data point together with the height encoded from $\Delta O$ determines a point $P$ on the ECG plane: if $\Delta O > 0$, $P$ lies in the upper region of the ECG and the corresponding point on the ECG is drawn in red; if $\Delta O = 0$, $P$ lies on the zero axis and the corresponding point is drawn in black; if $\Delta O < 0$, $P$ lies in the lower region of the ECG and the corresponding point is drawn in blue.
Every point on the ECG is thereby assigned a color that shows its contribution to the classification result, and every data point in the ECG data sequence has a corresponding point $P$ generated from $\Delta O$.
Step 3.3.2: connect the points $P$ of all data points in the ECG data sequence with a smooth curve.
The points $P$ are too dense for their color and height to convey information directly, so they are joined by a smooth curve that, together with the zero axis, encloses a number of regions. The height of the curve reflects the absolute value of $\Delta O$, and its peaks and troughs mark the key evidence supporting and contradicting the model result.
Step 3.3.3: fill the regions enclosed by the curve with color.
To make the information of local detail regions more intuitive, the regions formed in step 3.3.2 are filled with color so that their nature is more evident. Regions above the zero axis are filled red, meaning the local region supports the model's final classification result; regions below the zero axis are filled blue, meaning the local region contradicts it. The original ECG curve is thus divided into segments drawn in different colors. The filled regions near the zero axis also reveal local details of the ECG: the larger the region and the higher the peak, the more likely an abnormality is present and the greater the region's influence on the final model result. Visualizing the details of the ECG data further shows the basis on which the classification result was formed and enhances the interpretability of the model.
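One way to find the segments that steps 3.3.1 and 3.3.3 color is to group consecutive points by the sign of $\Delta O$. The Python sketch below does only that grouping; the function name and the half-open `(start, end, color)` tuples with 0-based indices are assumptions for illustration, and the smooth curve of step 3.3.2 is not reproduced.

```python
from typing import List, Sequence, Tuple

Region = Tuple[int, int, str]  # (start index, end index exclusive, colour)


def signed_regions(deltas: Sequence[float]) -> List[Region]:
    """Group consecutive points with the same sign of delta O into runs."""
    def colour(delta: float) -> str:
        if delta > 0:
            return "red"    # supports the classification result
        if delta < 0:
            return "blue"   # contradicts the classification result
        return "black"      # lies on the zero axis

    regions: List[Region] = []
    for index, delta in enumerate(deltas):
        current = colour(delta)
        if regions and regions[-1][2] == current:
            # Same colour as the previous point: extend the open run.
            start, _, _ = regions[-1]
            regions[-1] = (start, index + 1, current)
        else:
            regions.append((index, index + 1, current))
    return regions
```

Each returned run can then be drawn as one colored ECG segment and one filled area between the curve and the zero axis.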
In summary, the present invention provides a method for visualizing the classification results of deep models for ECG data, which addresses the drawback that existing deep model results are terse, abstract, and insufficiently interpretable. It uses occlusion intervals to compute the influence of specific regions on the final result, and designs separate macro-level and micro-level schemes to visualize that influence. Compared with the prior art, the present invention enhances the interpretability of model results: with conventional methods the model result is a single classification label with no way to explain the basis for it, and such a result is difficult for doctors to accept in the medical field. The present method explains the model result, finds the supporting and opposing evidence behind it, and shows the influence of every detail on the final result, which greatly improves the interpretability of the model result. The present invention visualizes the explanation at both the macro and micro levels: ECG data are cluttered and lengthy, and picking out key information with conventional methods is time-consuming and laborious. Regions with a large influence on the model result are likely to be critical abnormal regions, for example regions where the P wave has disappeared.
The present method finds such regions in the ECG data and displays them at both macro and micro granularity through visual elements such as color and height, which makes the model's operation more intuitive and improves the interpretability of the model result. The method of the present invention applies to many kinds of models and is highly extensible: conventional explanation methods need to refer to the model structure and cannot be extended to other models. The present method does not depend on a specific model; the classification results of any deep model applicable to ECG data can be explained and displayed with it, and it extends readily to the improved models that continue to appear.
Embodiment
Referring to Fig. 1, the visualization method of the present invention comprises the following steps to achieve the final visualization:
S101: determine the baseline result.
In this embodiment, the raw ECG data, once processed into an ECG sequence, are represented as:
$S = [s_1, s_2, \ldots, s_i, \ldots, s_n]$
where $S$ is an $n$-dimensional vector, $i = 1, 2, \ldots, n$, and $s_i$ is the value of the $i$-th point in the sequence. Feeding this sequence into the pre-trained deep model gives a result of the form:
$Y = [y_1, y_2, \ldots, y_j, \ldots, y_N]$
where $Y$ is an $N$-dimensional vector and $N$ is the number of labels the model classifies; $j = 1, 2, \ldots, N$, and $y_j$ is the model's classification value for label $j$, with $0 \le y_j \le 1$. The label with the largest $y_j$ is the deep model's predicted class; its $y_j$ value is taken as the baseline value $O$ and its label index is denoted $I$. The baseline value $O$ is:
$O = \max\{y_1, y_2, \ldots, y_j, \ldots, y_N\}$
where $y_j$ is the model's classification value for label $j$, $0 \le y_j \le 1$;
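Selecting the baseline value $O$ and its label index $I$ amounts to an argmax over the model output. A minimal Python sketch follows; the function name is illustrative and the returned index is 0-based, whereas the text counts labels from 1.

```python
from typing import Sequence, Tuple


def baseline_result(scores: Sequence[float]) -> Tuple[int, float]:
    """Return (I, O): the index of the largest score and the score itself."""
    # On ties, the first label with the maximum score is chosen.
    label = max(range(len(scores)), key=scores.__getitem__)
    return label, scores[label]
```

For the three-label output `[0.1215, 0.9877, 0.1010]` used in this embodiment, the call returns index 1 and the baseline value 0.9877.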
S102: design the visualization of the influence of heartbeat intervals on the model result.
Referring to Fig. 2, the visualization of the influence of heartbeat intervals on the model result comprises the following steps:
1) Dynamically determine the occlusion interval length.
The raw ECG data provide the R-peak position label of every heartbeat, and the span between two adjacent R peaks is regarded as the RR interval of one heartbeat. The occlusion interval length for the $k$-th heartbeat is therefore set to:
$Length_k = x_{k+1} - x_k$
where $Length_k$ is the length of the occlusion interval set on the $k$-th RR interval, $x_k$ is the abscissa of the $k$-th R peak, $0 \le x_k \le Len$, and $Len$ is the total length of the ECG sequence;
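The interval lengths follow directly from the sorted R-peak positions. In the short Python sketch below, the function name and the sample positions in the usage note are hypothetical and do not come from the patent's data.

```python
from typing import List, Sequence


def occlusion_lengths(r_peaks: Sequence[int]) -> List[int]:
    """Length_k = x_{k+1} - x_k for each pair of adjacent R peaks."""
    return [later - earlier for earlier, later in zip(r_peaks, r_peaks[1:])]
```

With hypothetical R peaks at samples 10, 250, 470, and 730, the occlusion lengths are 240, 220, and 260, so the interval adapts to an irregular rhythm instead of using a fixed width.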
2) Compute the influence of each heartbeat on the model result.
The previous step gave the length of the $k$-th heartbeat interval; the occlusion interval is now set according to that length.
S1: align the start of the occlusion interval with the R peak of the $k$-th heartbeat and set its length to $Length_k$, so that the occlusion interval exactly covers the $k$-th heartbeat.
S2: set every vector value inside the occlusion interval to 0 and leave the values at all other positions unchanged. The modified ECG sequence is:
$S_k = [s_1, s_2, \ldots, 0, \ldots, 0, \ldots, s_n]$
where $s_i$ denotes the value of the $i$-th point in the sequence; the zeroed region starts at the R peak of the $k$-th heartbeat and has length $Length_k$;
S3: S2 placed an occlusion interval over the $k$-th heartbeat interval. The occluded ECG sequence $S_k$ is now fed into the deep model to obtain a new model output $Y_k$, an $N$-dimensional vector:
$Y_k = [y'_1, y'_2, \ldots, y'_N]$
where $y'_1, y'_2, \ldots, y'_N$ are the output values for labels $1, 2, \ldots, N$, respectively;
S4: compute the difference $\Delta O_k$ between the deep model result $O_k$ obtained with the $k$-th heartbeat interval occluded and the baseline result; $\Delta O_k$ is the influence factor of the $k$-th heartbeat interval:
$\Delta O_k = y_I - y'_I$
where $I$ is the label index of the baseline value $O$ computed in step 1 (S101), and $y_I$ and $y'_I$ are the deep model outputs for that label without and with the occlusion; $\Delta O_k$ is the influence factor of the $k$-th heartbeat interval on the deep model output. $\Delta O_k > 0$ means the heartbeat interval has a positive influence on the classification result and is supporting evidence for the model; the larger the value, the more closely the interval agrees with the classification result. $\Delta O_k < 0$ means the heartbeat interval has a negative influence on the final classification result and is evidence against the model; the more negative the value, the further the interval departs from the classification result;
The value of $\Delta O$ thus distinguishes how different heartbeat intervals influence the classification result, which provides an explanation of that result.
S5: move the occlusion interval and repeat the above process to compute the influence of the $(k+1)$-th heartbeat on the result.
In this embodiment of the invention, the purpose of the occlusion interval is to erase the information of one heartbeat. Comparing the model output without that heartbeat against the output with it gives the value of the heartbeat's influence on the model. Repeating this operation yields the influence value of every heartbeat on the model result.
3) Visualize the influence of each heartbeat.
The difference $\Delta O$ obtained in step 2) represents the influence of a heartbeat interval on the final model result. However, ECG recordings are long and contain many heartbeat intervals, so raw numbers are not intuitive and a matching visualization method is needed. Mapping each value to the color of a rectangle shows the behavior of every heartbeat interval directly on the ECG. The procedure is as follows:
S1: encode $\Delta O$ as a color. To show the meaning of $\Delta O$ intuitively, it is encoded as a color according to the following rules: when $\Delta O > 0$, it is encoded as red, and the larger the value, the deeper the red; when $\Delta O < 0$, it is encoded as blue, and the smaller the value, the deeper the blue. Each heartbeat interval has one $\Delta O$ value, so encoding yields a color sequence.
S2: generate gradient rectangles.
Taking the length of each heartbeat interval as the rectangle width and the height of the highest R peak on the ECG as the rectangle height, the ECG is divided into rectangles, each containing one heartbeat interval. Each rectangle is filled with the color encoded from its heartbeat interval.
So as not to obscure the ECG trace, the center of the rectangle is made transparent and its two ends carry the fill color, turning the rectangle into a gradient color band.
S3: overlay the rectangles on the ECG background.
Finally, overlaying the gradient rectangle of each heartbeat interval on the ECG background produces the visualization. The color over each heartbeat interval identifies the supporting and opposing evidence for the model and so explains the model's classification result.
This embodiment uses real ECG data donated by AliveCor to illustrate how the method is carried out. Note that only one data fragment is listed here to illustrate the procedure; real ECG recordings are far longer than the listed fragment.
In this embodiment of the invention, the ECG data fragment is:
S = [... 0bff 02ff fbfe f7fe f4fe f4fe f5fe f7fe f9fe fcfe 00ff 03ff 07ff 09ff 0bff 0dff ...];
To obtain the baseline value, S is fed into the model, which gives the classification result:
$Y = [0.1215, 0.9877, 0.1010]$;
In this result the classification value for the AF label, 0.9877, is the largest of all classification values, so the model's classification result is AF, which stands for atrial fibrillation. By the earlier definition, the baseline value is $y_I = 0.9877$;
The labels of two ECG R peaks are obtained from the data, and the data within the RR interval are set to 0 to form the occlusion interval. The modified S is:
S = [... 0000 0000 0000 0000 0000 0000 f5fe f7fe f9fe fcfe 00ff 03ff 07ff 09ff 0bff 0dff ...]
Feeding it into the model again gives the new classification result:
$Y = [0.2011, 0.6856, 0.1317]$
The classification value for label AF is now $y'_I = 0.6856$, and the formula gives the influence factor $\Delta O = y_I - y'_I = 0.9877 - 0.6856 = 0.3021$.
Since $\Delta O > 0$, occluding this heartbeat segment lowers the classification value of the predicted label. The heartbeat interval can therefore be regarded as supporting the classification result: it is positive evidence for the model reaching that result.
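The arithmetic of this worked example can be checked with a few lines of Python. The two score vectors are the ones quoted in this embodiment; the variable names and the 0-based label index are choices made for this sketch.

```python
# Model outputs quoted in the embodiment: before and after occluding one RR interval.
baseline_scores = [0.1215, 0.9877, 0.1010]
occluded_scores = [0.2011, 0.6856, 0.1317]

# Label index I of the baseline prediction (0-based; index 1 is the AF label here).
label = max(range(len(baseline_scores)), key=baseline_scores.__getitem__)

# Influence factor of the occluded heartbeat: delta O = y_I - y'_I.
delta = baseline_scores[label] - occluded_scores[label]

# The predicted label after occlusion, to see whether the decision itself changed.
occluded_label = max(range(len(occluded_scores)), key=occluded_scores.__getitem__)

print(f"label index I = {label}, delta O = {delta:.4f}")
```

In this example the occlusion lowers the AF score by 0.3021 but AF remains the highest-scoring label, so the heartbeat counts as supporting evidence without being the sole basis for the classification.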
The above process is repeated to compute the influence factor of every heartbeat interval. The influence factor values are then encoded as colors, and gradient rectangles are generated and overlaid on the ECG waveform.
The resulting visualization is shown in Fig. 4. For each heartbeat interval, the color of the gradient rectangle shows the interval's influence on the final model result: red marks support for the classification result, blue marks opposition to it, and the color depth reflects the size of the influence. The visualization explains the role each heartbeat interval plays in the model's final classification result.
S103: design the visualization of the point-by-point influence on the model result.
Referring to Fig. 3, which shows the workflow, the visualization of the influence of individual points on the classification result comprises the following steps:
1) Set a movable occlusion interval.
Because point-by-point differences are required, the occlusion interval starts at the first point and advances by one point each time. The length of the occlusion interval is $L = 15$.
2) Compute the differences point by point.
After the first step the length of the occlusion interval is fixed. The occlusion interval is then used to compute the point-by-point differences, as follows:
S1: starting from the first data point of the ECG sequence vector $S$, set $L$ consecutive values to 0 and leave the values at all other positions unchanged, forming the occlusion interval; the interval starts at the first data point and advances by one point each time until every data point in the ECG data has been traversed;
In the $m$-th iteration, the ECG sequence $S_m$ with the occlusion interval applied is:
$S_m = [s_1, s_2, \ldots, s_{m-1}, 0, 0, \ldots, 0, s_{m+L}, \ldots, s_n]$
where $s_1, s_2, \ldots, s_n$ are the individual data points of the ECG sequence; as the formula shows, the occlusion interval covers $s_m, s_{m+1}, \ldots, s_{m+L-1}$, and all values inside it are set to 0;
S2: feed the occluded ECG sequence $S_m$ of the $m$-th iteration into the deep model to obtain the model output $Y_m$:
$Y_m = [y'_1, y'_2, \ldots, y'_N]$
where $y'_1, y'_2, \ldots, y'_N$ are the output values for labels $1, 2, \ldots, N$, respectively;
S3: compute the difference $\Delta O_m$ between the new model output obtained in S2 and the baseline model result; this value reflects the influence of an individual point on the model output:
$\Delta O_m = y_I - y'_I$
where $I$ is the label index of the baseline value $O$ computed in step 1 (S101), and $y_I$ and $y'_I$ are the deep model outputs for that label without and with the occlusion; $\Delta O_m$ is the influence factor of the $m$-th data point in the sequence on the deep model output. $\Delta O_m > 0$ means the point has a positive influence on the final classification result and is supporting evidence for the model; the larger the value, the more closely the point agrees with the final result. $\Delta O_m < 0$ means the point has a negative influence on the final classification result and is evidence against the model; the more negative the value, the further the point departs from the final result;
The values of $\Delta O$ thus give the influence factor of every point on the ECG on the classification result, which explains the detailed information in the ECG data.
S4: advance the occlusion interval by one point and repeat the above process, which finally yields the $\Delta O$ value of every point on the ECG.
3) Visualize the point-by-point contributions.
The previous step yields a ΔO value for every point, which reflects the influence of that individual point on the model's final classification result. Reading the raw values point by point is not intuitive, however, so a dedicated point-by-point visualization is needed. Point values differ from heartbeat-interval values: the color of a single isolated point is hard to perceive, so the visualization used in the previous stage cannot be reused, and the display must be tailored to the characteristics of point-wise data. The specific steps are as follows:
S1: Encode the ΔO value of each point as a height.
After the previous step, every datum in the ECG data sequence has an associated ΔO value. This value is encoded as a height, and the datum's abscissa together with that height defines a point P on the ECG plane. If ΔO > 0, P lies in the upper region of the ECG and the corresponding point on the ECG is drawn in red; if ΔO = 0, P falls on the zero axis and the corresponding point on the ECG is drawn in black; if ΔO < 0, P lies in the lower region of the ECG and the corresponding point on the ECG is drawn in blue.
In this way every point on the ECG is assigned a color that shows its contribution to the model's classification result, and every datum in the ECG data sequence has a corresponding point P generated from ΔO.
S2: Connect the points P corresponding to the data in the ECG sequence with a smooth curve.
Because the points are too dense for their color and height to convey information directly, they are joined by a smooth curve, which together with the zero axis encloses a number of regions. The height of the curve reflects the absolute value of ΔO, and its peaks and troughs mark the key evidence for and against the model's result.
S3: Fill the regions enclosed by the curve with color.
To make the information in local detail regions easier to read, the regions formed in the previous step are filled with color so that their nature is more apparent. Regions above the zero axis are filled with red, meaning the local region supports the model's final classification result; regions below the zero axis are filled with blue, meaning the local region contradicts it. The original ECG curve has already been divided into several segments, each drawn in its own color. In addition, the filled regions near the zero axis reveal local detail in the ECG: the larger the region and the higher the peak, the more likely an abnormality is present there and the greater its influence on the model's final result. This visualization of ECG detail further explains the basis on which the model's classification result was formed and strengthens the model's interpretability.
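The color rule of S1 and the region filling of S3 can be sketched as follows. This is a simplified illustration, not the method as claimed: it skips the smooth curve of S2 and approximates each enclosed region by a run of consecutive samples with the same sign of ΔO. The function names `point_color` and `fill_regions` are hypothetical.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def point_color(delta: float) -> str:
    """Color of an ECG point: red supports, blue opposes, black is neutral."""
    if delta > 0:
        return "red"
    if delta < 0:
        return "blue"
    return "black"


def fill_regions(deltas: Sequence[float]) -> List[Tuple[int, int, str]]:
    """Group consecutive same-sign points into (start, end, color) runs.

    `end` is inclusive. Points with delta == 0 lie on the zero axis and
    enclose no area, so they produce no filled region.
    """
    regions = []
    index = 0
    for color, run in groupby(deltas, key=point_color):
        length = len(list(run))
        if color != "black":
            regions.append((index, index + length - 1, color))
        index += length
    return regions


regions = fill_regions([0.1, 0.3, 0.0, -0.2, -0.1, 0.4])
print(regions)
```

Each returned run could then be handed to a plotting library as the horizontal extent of one filled area between the ΔO curve and the zero axis.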
This embodiment continues the example given in S102. The difference is that in S102, in order to obtain the influence of each heartbeat interval on the classification result, the length and step of the occlusion window were adjusted dynamically to the length of the heartbeat interval, so that the window covered exactly one heartbeat interval. In the present step, in order to obtain the influence of each point on the classification result, the occlusion interval has a fixed length and advances one point at a time until the influence factors of all points have been computed.
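The contrast between the two occlusion schedules can be made concrete with a small sketch. It assumes that heartbeat intervals are described by a list of start indices (`beat_starts`, a hypothetical input that would come from whatever beat segmentation is used) and that windows are half-open `(start, stop)` index pairs; neither detail is specified in the text.

```python
from typing import List, Sequence, Tuple


def beat_windows(beat_starts: Sequence[int], n_samples: int) -> List[Tuple[int, int]]:
    """S102-style schedule: one window per heartbeat interval.

    Window length and step both follow the interval length, so each
    window covers exactly one heartbeat. `stop` is exclusive.
    """
    edges = list(beat_starts) + [n_samples]
    return [(edges[i], edges[i + 1]) for i in range(len(beat_starts))]


def point_windows(n_samples: int, length: int) -> List[Tuple[int, int]]:
    """Point-wise schedule: fixed-length window, advanced one sample at a time."""
    return [(m, min(m + length, n_samples)) for m in range(n_samples)]


print(beat_windows([0, 4, 9], 12))
print(point_windows(4, 2))
```

The first schedule produces one model evaluation per heartbeat, the second one per sample, which is why the point-wise pass is the more expensive of the two.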
For example, a fragment of real ECG data is:
S=[...2cff 2dff 2eff 2fff 32ff 35ff 37ff 3aff...];
Set the occlusion interval length to 15 and the step to 1. To obtain the influence factor of the first point, the occlusion interval is placed starting at that point:
S=[...0000 0000 0000 000f 32ff 35ff 37ff 3aff...]
The data with the occlusion interval applied is input into the model, and the new model output gives the new classification value $y'_I = 0.9903$.
From the formula, the influence factor of the first point is $\Delta O = y_I - y'_I = -0.0026$.
Because ΔO < 0, this point can be regarded as opposing the model's classification result, that is, as negative evidence for that result.
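The two numbers in this example also determine the baseline output for label I. The baseline value is not quoted in this passage, so the figure below is inferred by rearranging the definition of ΔO rather than taken from the text:

$$y_I = y'_I + \Delta O = 0.9903 + (-0.0026) = 0.9877$$

Under that inference, occluding the interval raised the model's output for label I from 0.9877 to 0.9903, which is consistent with reading the point as evidence against the classification.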
Next, with the occlusion interval length unchanged, the interval is advanced by one point:
S=[...2000 0000 0000 0000 32ff 35ff 37ff 3aff...];
Repeating this process yields the influence factor of every point.
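The two occluded fragments above can be reproduced with a short sketch. It rests on an inference from the printed values, not on anything the text states: the fragment appears to be treated as a sequence of hexadecimal digits, and an interval of length 15 zeroes 15 consecutive digits. Only the visible part of the fragment is used (the elided `...` on either side is left out), and `occlude_hex` is a hypothetical helper name.

```python
def occlude_hex(fragment: str, start: int, length: int = 15) -> str:
    """Zero `length` hex digits of `fragment`, beginning at digit `start`."""
    digits = fragment.replace(" ", "")
    stop = min(start + length, len(digits))
    masked = digits[:start] + "0" * (stop - start) + digits[stop:]
    # Regroup into 4-digit words to match the notation used in the text.
    return " ".join(masked[i:i + 4] for i in range(0, len(masked), 4))


sample = "2cff 2dff 2eff 2fff 32ff 35ff 37ff 3aff"
first = occlude_hex(sample, 0)   # interval starts at the first point
second = occlude_hex(sample, 1)  # interval advanced by one point
print(first)
print(second)
```

Both results match the occluded fragments shown in the worked example, which supports the digit-level reading of the interval length.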
The points on the ECG plane are then generated according to the visualization method and connected by a curve, which together with the zero axis forms a number of enclosed regions. Each enclosed region is filled with the color that corresponds to the sign of its influence factor.
Figure 5 shows the resulting visualization. The original ECG waveform is divided into red, blue, and black segments, which indicate the influence of each segment on the model's final classification result. The points determined by the influence factors are connected into a curve, which together with the zero axis encloses a number of regions; these regions explain in detail the basis on which the model reached its final classification result. For example, the boxed portion of Figure 5 contains an upward peak, suggesting that an abnormality may be present in that region. Medically, this location shows an abnormal disappearance of the P wave, and it is because the model attended to this detail that it arrived at the classification result of atrial fibrillation (AF). The region is hard to spot in an ordinary ECG, but the visualization method of the present invention produces a pronounced peak there. Compared with ordinary methods, the method therefore explains the deep model's classification result more intuitively, that is, it improves the interpretability of that result.
S104: Form the final visualization result.
After the steps above, the macro-level and detail-level visualizations are superimposed to build a combined visualization that runs from the macro level down to the details. The macro-level effect is shown in Figure 4 and the detail-level effect in Figure 5. The visualization explains the model's classification result, highlights the key evidence behind it, and strengthens its interpretability. In practice, a physician can use the macro-level information to identify heartbeat intervals that may be abnormal and jump directly to a specific interval to inspect its details, or use the detail-level information to judge which abnormality may be present and find the key information in the waveform details, which improves diagnostic efficiency.
In summary, the present invention discloses a method for visualizing the classification results of a deep model on electrocardiogram data, comprising the following steps. First, the original ECG data is input into the model to obtain the model's original output, the final prediction is analyzed, and the original output is saved as a baseline for later comparison. Next, the occlusion interval is set dynamically according to the heartbeat intervals to obtain the influence factor of each heartbeat on the final classification result; the influence factor is encoded as a color, and a gradient rectangle is generated and superimposed on the original ECG, so that the influence of each heartbeat interval on the model's final classification result is shown visually. The parameters of the movable occlusion interval are then reset, the interval is moved along the signal, the deviation of each point from the baseline is computed, and this value is superimposed on the original data; peaks and region areas expose the detailed features of the ECG, show the influence of small regions on the model's classification result, and reveal the key basis on which the model reached that result.
By showing the influence of the ECG data on the final classification result at both macro and detail granularity, the present invention explains the model's classification result, presents the key evidence behind it, and addresses the insufficient interpretability of model results. The visualization also mines the key information in the ECG data and presents the model's behavior in an intuitive form, further improving the interpretability of the deep model's classification results.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the above embodiments, a person of ordinary skill in the art may still modify the specific embodiments or replace them with equivalents, and any such modification or equivalent replacement that does not depart from the spirit and scope of the present invention falls within the scope of protection of the pending claims of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910067724.0A CN109875546B (en) | 2019-01-24 | 2019-01-24 | Depth model classification result visualization method for electrocardiogram data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910067724.0A CN109875546B (en) | 2019-01-24 | 2019-01-24 | Depth model classification result visualization method for electrocardiogram data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109875546A CN109875546A (en) | 2019-06-14 |
CN109875546B (en) | 2020-07-28 |
Family
ID=66926716
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910067724.0A Active CN109875546B (en) | 2019-01-24 | 2019-01-24 | Depth model classification result visualization method for electrocardiogram data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109875546B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111161789B (en) * | 2019-12-11 | 2023-10-31 | 深圳先进技术研究院 | Analysis method and device for key areas of model prediction |
CN112587148B (en) * | 2020-12-01 | 2023-02-17 | 上海数创医疗科技有限公司 | Template generation method and device comprising fuzzification similarity measurement method |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004032741A1 (en) * | 2002-10-09 | 2004-04-22 | Bang & Olufsen Medicom A/S | A procedure for extracting information from a heart sound signal |
CN101263510A (en) * | 2004-11-08 | 2008-09-10 | 依德西亚有限公司 | Method and apparatus for electro-biometric identity recognition |
CN102542283A (en) * | 2010-12-31 | 2012-07-04 | 北京工业大学 | Optimal electrode assembly automatic selecting method of brain-machine interface |
CN105960200A (en) * | 2014-02-25 | 2016-09-21 | 圣犹达医疗用品心脏病学部门有限公司 | Systems and methods for using electrophysiology properties for classifying arrhythmia sources |
CN108478209A (en) * | 2018-02-24 | 2018-09-04 | 乐普(北京)医疗器械股份有限公司 | Ecg information dynamic monitor method and dynamic monitor system |
CN108937912A (en) * | 2018-05-12 | 2018-12-07 | 鲁东大学 | A kind of automatic arrhythmia analysis method based on deep neural network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6669455B2 (en) * | 2015-09-10 | 2020-03-18 | 日本光電工業株式会社 | Electrocardiogram analysis method, electrocardiogram analyzer, electrocardiogram analysis program, and computer-readable storage medium storing electrocardiogram analysis program |
-
2019
- 2019-01-24 CN CN201910067724.0A patent/CN109875546B/en active Active
Non-Patent Citations (2)
Title |
---|
Nonparametric models for characterizing the topical communities in social network;Ziqi Liu et al.;《Neurocomputing》;20161205;439-450 * |
面向网络舆情数据的异常行为识别;郝亚洲等;《计算机研究与发展》;20160315;611-620 * |
Also Published As
Publication number | Publication date |
---|---|
CN109875546A (en) | 2019-06-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111772619B (en) | Heart beat identification method based on deep learning, terminal equipment and storage medium | |
US10398344B2 (en) | Apparatus, methods and articles for four dimensional (4D) flow magnetic resonance imaging | |
JP2022523741A (en) | ECG processing system for depiction and classification | |
CN111248883B (en) | Blood pressure prediction method and device | |
US8989467B2 (en) | Image processing apparatus, method and computer readable recording medium for detecting abnormal area based on difference between pixel value and reference surface | |
CN108403105B (en) | Display method and display device for electrocardio scatter points | |
CN109875546B (en) | Depth model classification result visualization method for electrocardiogram data | |
JP2018171177A (en) | Fundus image processing device | |
CN115470828A (en) | Multi-lead electrocardiogram classification and recognition method based on convolution and self-attention mechanism | |
JP6288676B2 (en) | Visualization device, visualization method, and visualization program | |
CN110236520A (en) | ECG type recognition methods and device based on double convolutional neural networks | |
Tabei et al. | A novel diversity method for smartphone camera-based heart rhythm signals in the presence of motion and noise artifacts | |
CN109770891B (en) | Electrocardiosignal preprocessing method and preprocessing device | |
Nouman et al. | Neuro-TransUNet: Segmentation of stroke lesion in MRI using transformers | |
CN114881975A (en) | System, method, electronic device, and medium for predicting aneurysm rupture potential | |
IT201800001656A1 (en) | Image processing method and method for determining the quantitative and / or qualitative characteristics of objects reproduced in an image, image processing system and system for determining the quantitative and / or qualitative characteristics of the objects reproduced in an image | |
Hudyma et al. | Computer-aided detecting of early strokes and its evaluation on the base of CT images | |
CN111105872A (en) | Interaction method and device of medical image artificial intelligence auxiliary diagnosis system and PACS | |
CN116628489A (en) | Semi-supervised condition migration learning method for medical feature generation model | |
CN111767931A (en) | Automatic report generation system based on medical images | |
Kim et al. | Effective Segmentation of Post-Treatment Gliomas Using Simple Approaches: Artificial Sequence Generation and Ensemble Models | |
Hellar et al. | Manifold approximating graph interpolation of cardiac local activation time | |
CN116784862A (en) | Processing method and device of dynamic electrocardiographic scatter diagram, medical equipment and storage medium | |
KR102503723B1 (en) | Method of searching for similar medical image and similar image search device for performing method | |
Birla et al. | HR-TRACK: An rPPG Method for Heartrate Monitoring Using Temporal Convolution Networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||