CN112633416A

CN112633416A - Brain CT image classification method fusing multi-scale superpixels

Info

Publication number: CN112633416A
Application number: CN202110058684.0A
Authority: CN
Inventors: 冀俊忠; 张梦隆; 张晓丹
Original assignee: Beijing University of Technology
Current assignee: Beijing University of Technology
Priority date: 2021-01-16
Filing date: 2021-01-16
Publication date: 2021-04-09

Abstract

A brain CT image classification method fused with multi-scale superpixels belongs to the field of medical image research. The method has the following characteristics: 1) The fusion of multi-scale superpixels and brain CT images removes redundant image information and reduces the grayscale similarity between lesions and surrounding brain tissue pixels. 2) A region- and boundary-based multi-scale superpixel encoder is designed to effectively extract the low-level information of lesions contained in multi-scale superpixels. 3) A multi-scale superpixel feature fusion model is designed, which comprehensively utilizes the high-level features extracted by the residual neural network and the low-level features of multi-scale superpixels to realize the classification of brain CT. 4) Compared with the traditional deep learning algorithm, the method of the present invention can effectively utilize the lesion information contained in the multi-scale superpixels, so as to classify the diseases existing in the brain CT images more accurately, and the method is reasonable and reliable, and can be used for The classification of brain CT images provides powerful help.

Description

A brain CT image classification method fused with multi-scale superpixels

技术领域technical field

本发明属于医学图像研究领域，具体地说，本发明涉及一种融合多尺度超像素的脑CT图像分类方法。The invention belongs to the field of medical image research, in particular, the invention relates to a brain CT image classification method fused with multi-scale superpixels.

背景技术Background technique

临床急诊中脑部损伤的诊断是极其紧迫的，即使短时间的延误也可能导致患者病情恶化。电子计算机断层扫描(Computed Tomography，CT)是最常用的诊断工具之一，具有成像快、成本低、适用范围广、病变检查率高等特点。尽管脑CT能检测出颅内出血、颅内压升高和颅骨骨折等关键且时效性强的异常，但是传统的疾病分类方法通常需要放射科医生目测出血面积大小、估计出中线偏移等信息，这个过程是相对耗时的。近年来，伴随着医学影像技术的进步和发展，脑CT图像数量呈现出几何形式的增长，但放射科医生数量的增长速度却相对缓慢，而且培养一名合格的放射科医生成本高、周期长，致使在职放射科医生的工作任务与日俱增，间接导致了看病难等社会问题。因此，脑CT自动分类方法能够辅助放射科医生工作，提升诊断效率，减少误诊、漏诊率，具有十分重要的现实意义。The diagnosis of brain injury in the clinical emergency department is extremely urgent, and even a short delay can lead to a worsening of the patient's condition. Computed Tomography (CT) is one of the most commonly used diagnostic tools. It has the characteristics of fast imaging, low cost, wide application range, and high detection rate of lesions. Although brain CT can detect critical and time-sensitive abnormalities such as intracranial hemorrhage, elevated intracranial pressure, and skull fractures, traditional disease classification methods often require radiologists to visually measure the size of the hemorrhage and estimate midline offset. This process is relatively time-consuming. In recent years, with the progress and development of medical imaging technology, the number of brain CT images has increased geometrically, but the growth rate of the number of radiologists is relatively slow, and the cost of training a qualified radiologist is high and the cycle is long. , resulting in increasing work tasks of in-service radiologists, which indirectly leads to social problems such as difficulty in seeing a doctor. Therefore, the automatic classification method of brain CT can assist radiologists in their work, improve the diagnostic efficiency, and reduce the rate of misdiagnosis and missed diagnosis, which is of great practical significance.

近年来，深度学习(Deep learning,DL)在计算机视觉领域的巨大成功，也促进了医疗图像分析技术的快速发展，卷积神经网络(Convolutional Neural Networks,CNN)是一种传统的深度学习模型，能够捕获图像的局部区域信息、提取高层语义特征，被广泛用于图像的特征提取与分类任务。如今CNN已经广泛应用在医学图像的识别和处理上,并且经过不断的迭代优化，已经构成很多以CNN为架构的分类器。In recent years, the great success of deep learning (DL) in the field of computer vision has also promoted the rapid development of medical image analysis technology. Convolutional Neural Networks (CNN) is a traditional deep learning model. It can capture local area information of images and extract high-level semantic features, and is widely used in image feature extraction and classification tasks. Nowadays, CNN has been widely used in the recognition and processing of medical images, and after continuous iterative optimization, many classifiers based on CNN have been formed.

然而，现有工作在使用传统CNN提取脑CT图像特征时，并没有考虑脑CT图像和自然光学图像具有的差异：首先，脑CT图像空间分辨率低、对比度低；缺少亮度、颜色、纹理等易于识别的自然视觉特征；区域之间的分界线并不清晰,纹理差异也不大；图像会因患者个体差异和成像体位不同而显著不同；图像还具有位移产生的伪影、容积效应产生误差、设备原因产生噪声的诸多不稳定因素。且考虑到医学图像大多属于医院私有数据，隐私保护条例会阻碍医学图像的共享，数据集的数量会直接影响深度学习的效果。研究表明，无监督生成的超像素由一系列特征相似像素点组成，能够保留原图像的局部细节并且能够突出局部特征，无需，有利于图像高级特征的提取与表达，超像素的数量远小于原图像像素的数量，用超像素代替像素作为图像处理的基元，能够大幅度降低后续图像处理的计算复杂度，提高图像处理算法的效率。However, the existing work does not consider the differences between brain CT images and natural optical images when using traditional CNN to extract brain CT image features: First, brain CT images have low spatial resolution and low contrast; lack of brightness, color, texture, etc. Easy-to-recognize natural visual features; the demarcation line between regions is not clear, and the texture difference is not large; the image will be significantly different due to individual differences in patients and different imaging positions; the image also has artifacts caused by displacement and errors caused by volume effects , Many unstable factors that cause noise from equipment. And considering that medical images are mostly private data of hospitals, privacy protection regulations will hinder the sharing of medical images, and the number of datasets will directly affect the effect of deep learning. Studies have shown that unsupervised generated superpixels are composed of a series of pixel points with similar features, which can retain the local details of the original image and highlight local features. The number of image pixels, using superpixels instead of pixels as the primitives of image processing can greatly reduce the computational complexity of subsequent image processing and improve the efficiency of image processing algorithms.

发明内容SUMMARY OF THE INVENTION

本发明针对上述现有方法忽略了脑CT图像视觉特性的问题，提出一种融合多尺度超像素(multi-scale superpixel fusion,MSF)的脑CT图像分类方法。本发明所述方法能够通过多尺度超像素优化图像，并无监督的提取病灶区域的信息，以此加强残差神经网络生成特征的表达性，提升分类任务的准确率。Aiming at the problem that the above-mentioned existing methods ignore the visual characteristics of brain CT images, the present invention proposes a brain CT image classification method fused with multi-scale superpixel fusion (MSF). The method of the invention can optimize the image through multi-scale superpixels, and extract the information of the lesion area unsupervised, thereby enhancing the expressivity of the features generated by the residual neural network and improving the accuracy of the classification task.

为实现上述目的，本发明采用的技术方案为一种融合多尺度超像素的脑CT图像分类方法。本发明的流程如图1所示，包含以下几个步骤。1)首先，构建数据集并进行预处理，得到多尺度超像素；其次，通过多尺度超像素图像融合进行数据增强，获得优化的融合图像；然后采用基于区域和边界信息的多尺度超像素特征编码算法，得到多尺度超像素低层次特征；最后利用多尺度超像素特征融合分类模型，来对脑CT图像进行分类；In order to achieve the above object, the technical solution adopted in the present invention is a brain CT image classification method fused with multi-scale superpixels. The flow of the present invention is shown in FIG. 1 and includes the following steps. 1) First, construct a dataset and perform preprocessing to obtain multi-scale superpixels; secondly, perform data enhancement through multi-scale superpixel image fusion to obtain an optimized fused image; then use multi-scale superpixel features based on region and boundary information coding algorithm to obtain multi-scale superpixel low-level features; finally, the multi-scale superpixel feature fusion classification model is used to classify brain CT images;

步骤(1)获取数据并预处理：Step (1) Acquire data and preprocess:

步骤(1.1)数据：采集脑CT图像构建数据集，每一个患者数据包含其脑CT图像生成的RGB矩阵

与脑CT分类标签向量Y＝[Y₁，Y₂，…Y_T]，Y_i∈{0，1}，其中N表示图像像素尺寸，T表示采集的疾病类别个数。Step (1.1) Data: collect brain CT images to construct a data set, each patient data contains the RGB matrix generated by its brain CT images

and brain CT classification label vector _Y =[ _Y ₁ , Y ₂ , .

步骤(1.2)将所有患者数据划分为训练集、验证集和测试集。其中，训练集用于学习神经网络的参数；验证集用于确定网络结构和超参数；测试集用于验证神经网络分类效果。Step (1.2) divides all patient data into training set, validation set and test set. Among them, the training set is used to learn the parameters of the neural network; the validation set is used to determine the network structure and hyperparameters; the test set is used to verify the classification effect of the neural network.

步骤(1.3)数据预处理：基于超层次分割算法(Super Hierarchy，SH)，对给定的脑CT图像I和设定的超像素的分割尺度{scale¹，scale²，…scale^s}，其中S表示设定超像素的数量，生成第s个分割尺度下超像素图

的计算过程如下：Step (1.3) Data preprocessing: Based on the super-hierarchy segmentation algorithm (Super Hierarchy, SH), for the given brain CT image I and the set superpixel segmentation scale {scale ¹ , scale ² , ... scale ^s }, where S represents the number of set superpixels, and generates a superpixel map at the sth segmentation scale

The calculation process is as follows:

P^s＝SH(I，scale^s)P ^s =SH(I, scale ^s )

其中，s∈{1，2…S}，scale^s为第s个分割尺度，对每个分割尺度进行计算得到

表示包含S个不同尺度超像素图的多尺度超像素。Among them, s∈{1, 2…S}, scale ^s is the s-th segmentation scale, and each segmentation scale is calculated to get

Represents a multi-scale superpixel containing S superpixel maps of different scales.

步骤(2)多尺度超像素图像融合模型：对给定的脑CT图像I及其多尺度超像素

融合图像I′计算过程如下：Step (2) Multi-scale superpixel image fusion model: for a given brain CT image I and its multi-scale superpixels

The calculation process of the fusion image I' is as follows:

其中，⊙表示点积，f(·)为SoftMax函数，

表示训练的权重，P^s表不

中第s个元素，W实现了各个尺度比重的自适应分配。Among them, ⊙ represents the dot product, f( ) is the SoftMax function,

Indicates the weight of training, P ^s represents

The s-th element in W realizes the adaptive allocation of the proportions of each scale.

步骤(3)多尺度超像素特征编码：Step (3) Multi-scale superpixel feature encoding:

步骤(3.1)对脑CT图像I的多尺度超像素

中分割尺度为scale^s的超像素图

生成其像素值集合

对

中每个像素生成映射矩阵集合

其中，第k个映射矩阵

中第i，j个元素

的计算方式如下：Step (3.1) Multi-scale superpixels for brain CT image I

A superpixel map with a median segmentation scale of scale ^s

generate a collection of its pixel values

right

Generates a set of mapping matrices for each pixel in

Among them, the kth mapping matrix

The i,jth element in

is calculated as follows:

其中，k∈{1，2，…scale^s}，其中

表示P^s中第k个超像素的像素值，M^s，k表示像素值为

的超像素区域映射。where k∈{1, 2,…scale ^s }, where

represents the pixel value of the kth superpixel in P ^s , M ^{s, k} represents the pixel value

superpixel region map.

步骤(3.2)基于面积和边界信息对集合M^s中每个映射矩阵进行编码，得到的超像素图P^s编码结果

计算过程如下：Step (3.2) encodes each mapping matrix in the set M ^s based on the area and boundary information, and obtains the encoding result of the superpixel map P ^s

The calculation process is as follows:

其中N²表示超像素图所含像素点个数，s_k表示第k个超像素区域包含的的像素点个数。Among them, N ² represents the number of pixels contained in the superpixel map, and _sk represents the number of pixels contained in the k-th superpixel region.

步骤(3.3)对

中每个超像素图重复步骤(3.1)和步骤(3.2)，依次生成编码结果b¹，b²，…b^S，将其通过矩阵拼接成

得到多尺度超像素

的多尺度超像素特征编码B。Step (3.3) to

Steps (3.1) and (3.2) are repeated for each superpixel image in , and the encoding results b ¹ , b ² , . . . b ^S are generated in turn, which are spliced into

get multi-scale superpixels

The multi-scale superpixel feature encoding B.

步骤(4)多尺度超像素特征融合分类模型：Step (4) Multi-scale superpixel feature fusion classification model:

步骤(4.1)构建一个残差神经网络(Residual Network，ResNet)作为主干网络，使用步骤(2)提取的脑CT融合图像I′输入，选择ResNet中四个Layer的最后一个残差结构(Basic Block)的特征激活输出l₁、l₂、l₃、l₄作为高层次特征。Step (4.1) Build a residual neural network (Residual Network, ResNet) as the backbone network, use the brain CT fusion image I' extracted in step (2) to input, and select the last residual structure (Basic Block) of the four Layers in ResNet. ) feature activation outputs l ₁ , l ₂ , l ₃ , l ₄ as high-level features.

步骤(4.2)对(3.3)提取的低层次特征B进行降维处理，通过256个3×3卷积核构成的卷积层生成特征

将f₀自下而上依次和l₁、l₂、l₃、l₄进行多层融合，生成融合特征f₁、f₂、f₃、f₄。其中f_i(i∈{1，2，3，4})的计算如下：先采用池化操作将f_(i-1)转化为和l_i相同尺寸的特征矩阵，之后通过256个1×1卷积核构成的卷积层将l_i的通道数转换为256，之后通过矩阵相加进行特征融合，得到第i层融合特征f_i。Step (4.2) performs dimensionality reduction processing on the low-level feature B extracted in (3.3), and generates features through a convolutional layer composed of 256 3×3 convolution kernels

Perform multi-layer fusion of f ₀ with l ₁ , l ₂ , l ₃ , and l ₄ from bottom to top to generate fusion features f ₁ , f ₂ , f ₃ , and f ₄ . The calculation of f _i (i∈{1, 2, 3, 4}) is as follows: first, the pooling operation is used to convert f _(i-1) into a feature matrix of the same size as l _i , and then 256 1×1 The convolution layer formed by the convolution kernel converts the number of channels of li to 256, and then performs feature fusion through matrix addition to obtain the _i -th layer fusion feature f _i .

步骤(4.3)将得到的融合特征f₄输入512个3×3卷积核构成的卷积层、池化层和全连接层得到和标签数量T等长的分类向量x，将其通过Sigmoid线性回归生成预测值向量y＝[y₁，y₂，…y_T]，其中y_i∈[0，1]，x中第i个元素生成对应标签为正例的概率y_i表示为：y_i＝Sigmoid(x_i)，根据设定的分类阈值t确定分类结果，当y_i大于设定阈值t时，模型判定该脑CT存在对应标签疾病，若y_i小于设定阈值t时则为正常。t＝0.5。Step (4.3) Input the obtained fusion feature f ₄ into the convolution layer, pooling layer and fully connected layer composed of 512 3×3 convolution kernels to obtain a classification vector x of the same length as the number of labels T, and pass it through the Sigmoid linear Regression generates a predicted value vector y=[y ₁ , y ₂ ,...y _T ], where y _i ∈ [0, 1], the probability that the i-th element in x generates the corresponding label as a positive example y _i is expressed as: y _i =Sigmoid(x _i ), the classification result is determined according to the set classification threshold t, when y _i is greater than the set threshold t, the model determines that the brain CT has a corresponding label disease, and if y _i is less than the set threshold t, it is normal . t=0.5.

步骤(4.4)本发明所述的一种融合多尺度超像素的脑CT图像分类方法的输入为患者脑CT图像I与脑疾病分类标签Y，然后得到该被试属于各个类别的概率y。若给出M个患者的数据集D＝{(I¹，Y¹)，(I²，Y²)，…，(I^M，Y^M)}，对于给定的脑CT图像Iⁱ对应标签Yⁱ以及通过模型生成的标签预测yⁱ，我们通过二分类交叉熵损失(Binary cross entropy loss，BCEloss)来计算样本中每个标签的分类误差，之后通过对所有标签分类误差求均值得到样本误差，损失函数计算如下：Step (4.4) The input of the brain CT image classification method fused with multi-scale superpixels of the present invention is the patient's brain CT image I and the brain disease classification label Y, and then the probability y of the subject belonging to each category is obtained. If a dataset of M patients is given D={(I ¹ , Y ¹ ), (I ² , Y ² ), ..., (I ^M , Y ^M )}, for a given brain CT image I ⁱ corresponds to the label Yi and the label prediction ^yi generated by the model ^, we calculate the classification error of each label in the sample through binary cross entropy loss (BCEloss), and then obtain the sample error by averaging all label classification errors , the loss function is calculated as follows:

其中

表示样本的第j个标签的值，

表示模型预测的第j个标签的值，T表示样本中标签的个数。in

represents the value of the jth label of the sample,

Represents the value of the jth label predicted by the model, and T represents the number of labels in the sample.

步骤(4.5)针对步骤(1.2)中的训练集，利用Adam自适应优化算法最小化步骤(4.4)中所述损失函数，通过比较模型在不同学习率λ设置下，观察模型经过训练集训练后在验证集上的分类准确率，一般λ初始值设置为10^-3至10^-6，下一次学习率设置为上次的3倍，λ最大值设置为0.1至0.5，然后来选择其中准确率最高的学习率来训练模型。Step (4.5) For the training set in step (1.2), use the Adam adaptive optimization algorithm to minimize the loss function described in step (4.4), by comparing the models under different learning rate λ settings, observe the model after training on the training set. For the classification accuracy on the validation set, the initial value of λ is generally set to 10 ^-3 to 10 ^-6 , the next learning rate is set to 3 times the previous one, and the maximum value of λ is set to 0.1 to 0.5, and then choose the accuracy rate. The highest learning rate to train the model.

步骤(5)在完成上述所有步骤后，可以将新的脑CT数据集输入至模型中，根据模型输出的预测结果对这些脑CT图像进行分类。Step (5) After completing all the above steps, a new brain CT data set can be input into the model, and these brain CT images can be classified according to the prediction results output by the model.

与现有方法相比，本发明具有以下明显的优势和有益效果：Compared with the existing method, the present invention has the following obvious advantages and beneficial effects:

本发明提出一种融合多尺度超像素的脑CT图像分类方法，相较于传统图像分类网络，所述方法具有以下特点：1)利用多尺度超像素与脑CT图像融合，去除了图像冗余信息，降低了病灶和周围脑组织像素的灰度相似性。2)设计了一种基于区域和边界的多尺度超像素编码器，有效的提取多尺度超像素中包含的病灶低层次信息。3)设计了一种融合多尺度超像素特征融合模型，综合利用了残差神经网络提取的高层次特征和多尺度超像素的低层次特征，实现对脑CT的分类。4)相比传统深度学习算法，本发明所述方法可以有效利用多尺度超像素中包含的病灶信息，从而更准确地对脑CT图像中存在的疾病进行分类。The present invention proposes a brain CT image classification method fused with multi-scale superpixels. Compared with the traditional image classification network, the method has the following characteristics: 1) The fusion of multi-scale superpixels and brain CT images is used to remove image redundancy. information, reducing the grayscale similarity of pixels in the lesion and surrounding brain tissue. 2) A region- and boundary-based multi-scale superpixel encoder is designed to effectively extract the low-level information of lesions contained in multi-scale superpixels. 3) A multi-scale superpixel feature fusion model is designed, which comprehensively utilizes the high-level features extracted by the residual neural network and the low-level features of multi-scale superpixels to realize the classification of brain CT. 4) Compared with the traditional deep learning algorithm, the method of the present invention can effectively utilize the lesion information contained in the multi-scale superpixels, thereby more accurately classifying the diseases existing in the brain CT images.

附图说明Description of drawings

图1：一种融合多尺度超像素的脑CT图像分类方法流程图。Figure 1: Flow chart of a brain CT image classification method fused with multi-scale superpixels.

图2：多尺度超像素特征融合分类模型。Figure 2: Multi-scale superpixel feature fusion classification model.

图3：不同尺寸特征融合模型。Figure 3: Feature fusion models of different sizes.

图4：多尺度超像素脑CT图像融合可视化。Figure 4: Multiscale superpixel brain CT image fusion visualization.

具体实施方式Detailed ways

本实施例中以脑出血患者为研究对象，但本方法不限于此，还可以以他脑疾病患者脑CT图像为研究对象。下面以真实脑出血CT数据集为例，具体说明本方法的实施步骤：In this example, patients with cerebral hemorrhage are taken as the research object, but the method is not limited to this, and brain CT images of patients with other brain diseases can also be taken as the research object. The following takes the real intracerebral hemorrhage CT data set as an example to specifically describe the implementation steps of the method:

步骤(1)获取数据并预处理：Step (1) Acquire data and preprocess:

步骤(1.1)数据：本发明使用CQ500数据集(http://headctstudy.qure.ai/dataset)采集脑CT图像构建数据集，实际获得451例扫描数据，总共22773张脑CT图像，每个患者标签信息包含脑部疾病的14种诊断类别：颅内出血、脑实质出血、脑室出血、硬膜下出血、硬膜外出血、蛛网膜下出血、左侧脑出血、右侧脑出血、慢性出血、骨折、颅骨骨折、其他骨折、中线偏移、质量效应。首先，针对数据集标签，由于三名放射学专家对同一标签可能有不同标注，因此针对标注不统一的情况，我们选择多数的专家的选择作为真实的标签。然后，根据确定的颅内出血标签信息将数据集分为确诊脑出血患者病例204例，未确诊的有247例，从确诊脑出血患者病例中根据诊断类别标签选取对应病灶位置的图像742张，从未确诊病例中选取和脑出血患者病例中病灶位置相同位置的图像1045张，总共M＝1787张脑CT图像作为数据集D＝{(I¹，Y¹)，(I²，Y²)，…，(I¹⁷⁸⁷，Y¹⁷⁸⁷)}。每一个数据包含其脑CT图像生成的RGB矩阵

与T＝14种脑疾病诊断标签向量Y＝[Y₁，Y₂，…Y₁₄]，Y_i∈{0，1}，其中标签的元素Y_i＝1表示经该脑CT图像存在第i个标签相应的脑疾病，Y_i＝0则是正常。Step (1.1) Data: The present invention uses the CQ500 data set (http://headctstudy.qure.ai/dataset) to collect brain CT images to construct a data set, and actually obtains 451 cases of scan data, a total of 22773 brain CT images, each patient The label information contains 14 diagnostic categories of brain diseases: intracranial hemorrhage, parenchymal hemorrhage, intraventricular hemorrhage, subdural hemorrhage, epidural hemorrhage, subarachnoid hemorrhage, left intracerebral hemorrhage, right intracerebral hemorrhage, chronic hemorrhage, Fractures, skull fractures, other fractures, midline offset, mass effect. First, for dataset labels, since three radiology experts may have different labels for the same label, we choose the choice of the majority of experts as the true label for the inconsistent labeling. Then, according to the determined label information of intracranial hemorrhage, the dataset was divided into 204 cases of patients with confirmed intracerebral hemorrhage and 247 cases of undiagnosed cases. From the cases of patients with confirmed intracerebral hemorrhage, 742 images corresponding to the location of the lesions were selected according to the label of the diagnosis category, and 742 images of the corresponding lesions were selected from the cases of patients with confirmed intracerebral hemorrhage. In the undiagnosed cases, 1045 images with the same location as the lesions in the patients with cerebral hemorrhage were selected, and a total of M=1787 brain CT images were used as the data set D={(I ¹ , Y ¹ ), (I ² , Y ² ), ..., (I ¹⁷⁸⁷ , Y ¹⁷⁸⁷ )}. Each data contains an RGB matrix generated from its brain CT image

and T= ₁₄ kinds of brain disease diagnosis label vector _Y ₌ [Y ₁ , Y ₂ , . Each label corresponds to a brain disease, and Y _i = 0 is normal.

步骤(1.2)将所有患者数据按8∶1∶1划分为训练集、验证集和测试集。其中，训练集用于学习神经网络的参数；验证集用于确定网络结构和超参数；测试集用于验证神经网络分类效果。Step (1.2) divides all patient data into training set, validation set and test set according to 8:1:1. Among them, the training set is used to learn the parameters of the neural network; the validation set is used to determine the network structure and hyperparameters; the test set is used to verify the classification effect of the neural network.

步骤(1.3)数据预处理：基于超层次分割算法(Super Hierarchy，SH)，对给定的脑CT图像I和设定的S＝3个超像素的分割尺度{5，10，15}，生成不同分割尺度下超像素图

的计算过程如下：Step (1.3) Data preprocessing: Based on the super-hierarchy segmentation algorithm (Super Hierarchy, SH), for the given brain CT image I and the set S=3 superpixel segmentation scale {5, 10, 15}, generate Superpixel maps at different segmentation scales

The calculation process is as follows:

P¹＝SH(I，5)P ¹ =SH(I,5)

P²＝SH(I，10)P ² =SH(I, 10)

P³＝SH(I，15)P ³ =SH(I, 15)

其中，

表示包含3个不同尺度超像素图的多尺度超像素。in,

Represents a multi-scale superpixel that contains 3 superpixel maps of different scales.

融合图像

计算过程如下：Step (2) Multi-scale superpixel image fusion model: for a given brain CT image I and its multi-scale superpixels

fused images

The calculation process is as follows:

其中，⊙表示点积，f(·)为SoftMax函数，

表示训练的权重，P^s表不

中第s个元素，W实现了各个尺度比重的自适应分配。。Among them, ⊙ represents the dot product, f( ) is the SoftMax function,

Indicates the weight of training, P ^s represents

The s-th element in W realizes the adaptive allocation of the proportions of each scale. .

步骤(3.1)对脑CT图像I的多尺度超像素

中分割尺度为5的超像素图

其中scale¹＝5，生成其像素值集合

对

中每个像素生成映射矩阵集合M¹＝{M^1，1，M^1，2，…，M^1，5}，其中，第k个映射矩阵

中第i，j个元素

的计算方式如下：Step (3.1) Multi-scale superpixels for brain CT image I

A superpixel map with a median segmentation scale of 5

Where scale ¹ = 5, generate its pixel value set

right

Each pixel in generates a set of mapping matrices M ¹ ={M ^1,1 , M ^1,2 ,...,M ^1,5 }, where the kth mapping matrix

The i,jth element in

is calculated as follows:

其中，k∈{1，2，…5}，其中

表示P¹中第k个超像素的像素值，M^1，k表示像素值为

的超像素区域映射where, k∈{1,2,…5}, where

represents the pixel value of the kth ^superpixel in P1, M1 ^,k represents the pixel value

The superpixel region map of

步骤(3.2)基于面积和边界信息对集合M¹中每个映射矩阵进行编码，得到的超像素图P¹编码结果

计算过程如下：Step (3.2) encodes each mapping matrix in the set M ¹ based on the area and boundary information, and obtains the encoding result of the superpixel map P ¹

The calculation process is as follows:

其中s_k表示第k个超像素区域包含的的像素点个数。where _sk represents the number of pixels contained in the k-th superpixel region.

步骤(3.3)对

中每个超像素图重复步骤(3.1)和步骤(3.2)，依次生成编码结果b¹，b²，b³，将其通过矩阵拼接成

得到多尺度超像素

的多尺度超像素特征编码B。Step (3.3) to

Steps (3.1) and (3.2) are repeated for each superpixel image, and the encoding results b ¹ , b ² , and b ³ are generated in turn, and they are spliced into

get multi-scale superpixels

The multi-scale superpixel feature encoding B.

步骤(4.1)构建一个34层的残差神经网络ResNet-34作为主干网络，使用步骤(2)提取的脑CT融合图像I′输入，选择ResNet-34中四个Layer的最后一个残差结构(BasicBlock)的特征激活输出

作为高层次特征。Step (4.1) Build a 34-layer residual neural network ResNet-34 as the backbone network, use the brain CT fusion image I' extracted in step (2) to input, and select the last residual structure of the four layers in ResNet-34 ( BasicBlock) feature activation output

as a high-level feature.

将f₀自下而上依次和l₁、l₂、l₃、l₄进行多层融合，生成融合特征f₁、f₂、f₃、f₄。其中f_i(i∈{1，2，3，4})的计算如下：先采用池化操作将f_(i-1)转化为和l_i相同尺寸的特征矩阵，之后通过256个1×1卷积核构成的卷积层将l_i的通道数转换为256，之后通过矩阵相加进行特征融合，得到第f层融合特征f_i。Step (4.2) performs dimensionality reduction processing on the low-level feature B extracted in (3.3), and generates features through a convolutional layer composed of 256 3×3 convolution kernels

Perform multi-layer fusion of f ₀ with l ₁ , l ₂ , l ₃ , and l ₄ from bottom to top to generate fusion features f ₁ , f ₂ , f ₃ , and f ₄ . The calculation of f _i (i∈{1, 2, 3, 4}) is as follows: first, the pooling operation is used to convert f _(i-1) into a feature matrix of the same size as l _i , and then 256 1×1 The convolution layer formed by the convolution kernel converts the number of channels of li to 256, and then performs feature fusion through matrix addition to obtain the f- _{th layer fusion feature f i} _.

步骤(4.3)将得到的融合特征f₄输入512个3×3卷积核构成的卷积层、池化层和全连接层得到和标签数量T等长的分类向量x，将其通过Sigmoid线性回归生成预测值向量y＝[y₁，y₂，…y₁₄]，其中y_i∈[0，1]，x中第i个元素生成对应标签为正例的概率y_i表示为：y_i＝Sigmoid(x_i)，此时分类阈值设定为t＝0.5，当y_i大于设定阈值0.5时，模型判定该脑CT存在对应标签疾病，若y_i小于设定阈值0.5则为正常。Step (4.3) Input the obtained fusion feature f ₄ into the convolution layer, pooling layer and fully connected layer composed of 512 3×3 convolution kernels to obtain a classification vector x of the same length as the number of labels T, and pass it through the Sigmoid linear Regression generates a predicted value vector y=[y ₁ , y ₂ ,...y ₁₄ ], where y _i ∈ [0, 1], the probability that the i-th element in x generates the corresponding label as a positive example y _i is expressed as: y _i =Sigmoid(x _i ), at this time, the classification threshold is set to t=0.5. When y _i is greater than the set threshold of 0.5, the model determines that the brain CT has a corresponding label disease. If y _i is less than the set threshold of 0.5, it is normal.

步骤(4.4)本发明所述的一种融合多尺度超像素的脑CT图像分类方法的输入为患者脑CT图像I与脑疾病分类标签Y，然后得到该被试属于各个类别的概率y。对于输入的的脑CT图像I对应标签Y以及通过模型生成的标签预测y，我们通过二分类交叉熵损失(Binarycross entropy loss，BCE loss)来计算样本中每个标签的分类误差，之后通过对所有标签分类误差求均值得到样本误差，损失函数计算如下：Step (4.4) The input of the brain CT image classification method fused with multi-scale superpixels of the present invention is the patient's brain CT image I and the brain disease classification label Y, and then the probability y of the subject belonging to each category is obtained. For the input brain CT image I corresponding to the label Y and the label prediction y generated by the model, we use the binary cross entropy loss (Binarycross entropy loss, BCE loss) to calculate the classification error of each label in the sample, and then pass all The label classification error is averaged to obtain the sample error, and the loss function is calculated as follows:

步骤(4.5)针对步骤(1.2)中的训练集，利用Adam自适应优化算法最小化步骤(4.4)中所述损失函数，通过比较模型在不同学习率λ设置下，观察模型经过训练集训练后在验证集上的分类准确率，λ初始值设置为10^-5，下一次学习率设置为上次的3倍，λ最大值设置为0.1，然后来选择其中准确率最高的学习率来训练模型。Step (4.5) For the training set in step (1.2), use the Adam adaptive optimization algorithm to minimize the loss function described in step (4.4), by comparing the models under different learning rate λ settings, observe the model after training on the training set. For the classification accuracy rate on the validation set, the initial value of λ is set to 10 ^-5 , the next learning rate is set to 3 times the last time, and the maximum value of λ is set to 0.1, and then select the learning rate with the highest accuracy to train the model .

为了说明本发明所述方法的有益效果，在具体实施过程中，我们与本文作为主干网络的传统分类模型ResNet-34进行对比，并且对融合多尺度超像素的脑CT分类模型中三个部分：超像素图像融合、超像素特征编码算法和不同层次特征融合模型，进行了消融实验来验证每一部分有效性，实验采用目前广泛使用的准确率(Accuracy，ACC)，灵敏度(Sensitivity，SEN)，和F1评估值(F-score，F)作为评价指标，实验结果如表1所示。In order to illustrate the beneficial effects of the method of the present invention, in the specific implementation process, we compared with the traditional classification model ResNet-34 as the backbone network in this paper, and compared the three parts of the brain CT classification model fused with multi-scale superpixels: Superpixel image fusion, superpixel feature encoding algorithm and different-level feature fusion models, ablation experiments are carried out to verify the effectiveness of each part. The F1 evaluation value (F-score, F) is used as the evaluation index, and the experimental results are shown in Table 1.

多尺度超像素选择使用5、10、15和10、15、20两种尺度的超像素图的组合。其中实验的对照组Baseline为使用脑CT图像进行分类，MSF表示本发明所述方法，MSF₀₀₁表示使用脑CT原图像作为输入，仅使用特征融合模型进行分类的结果；MSF₀₁₁方法表示去除MSF方法中多尺度超像素图像融合部分；MSF₁₀₁表示去除MSF方法中超像素特征编码部分；MSF₁₀₀表示去除MSF方法中多尺度超像素图像融合和超像素特征编码部分。Multi-scale superpixel selection uses a combination of superpixel maps at scales 5, 10, 15 and 10, 15, and 20. The baseline of the control group in the experiment is to use the brain CT image for classification, MSF represents the method of the present invention, MSF ₀₀₁ represents the result of using the original brain CT image as input, and only the feature fusion model is used for classification; MSF ₀₁₁ Method represents the method of removing MSF Medium multi-scale superpixel image fusion part; MSF ₁₀₁ represents the removal of the superpixel feature encoding part in the MSF method; MSF ₁₀₀ represents the removal of the multi-scale superpixel image fusion and superpixel feature encoding part in the MSF method.

表1融合多尺度超像素的脑CT分类模型对比实验Table 1 Comparative experiments of brain CT classification models fused with multi-scale superpixels

通过对比MSF₀₁₁和MSF的分类效果可以看出：不使用多尺度超像素脑CT图像融合的情况下分类效果对比Baseline有了少许提升，但仍不及本文的MSF方法。通过对比MSF₁₀₁和MSF的分类结果可以看出：不使用超像素特征编码算法的情况下分类效果虽然对比Baseline有了显著提升，但由于直接使用多尺度超像素图没有参考区域面积和边界信息，表达不佳，不能直接作为低层次特征，导致和高层特征的融合特征效果不好，和MSF分类效果有较大的差距。通过MSF₀₀₁和Baseline，MSF₁₀₀和MSF₁₀₁两组对比实验可以看出，不论输入是原图像还是经过多尺度超像素融合后的图像，与仅使用ResNet提取特征相比特征融合模型的使用都显著的提升了分类的效果。By comparing the classification effects of MSF ₀₁₁ and MSF, it can be seen that the classification effect is slightly improved compared to Baseline without using multi-scale superpixel brain CT image fusion, but it is still not as good as the MSF method in this paper. By comparing the classification results of MSF ₁₀₁ and MSF, it can be seen that the classification effect is significantly improved compared to Baseline without the use of the superpixel feature encoding algorithm, but because the multi-scale superpixel map is directly used, there is no reference area and boundary information. The expression is not good and cannot be directly used as a low-level feature, resulting in a poor fusion feature with high-level features, and a large gap with the MSF classification effect. Through the comparison experiments of MSF ₀₀₁ and Baseline, MSF ₁₀₀ and MSF ₁₀₁ , it can be seen that whether the input is the original image or the image after multi-scale superpixel fusion, compared with only using ResNet to extract features, the use of the feature fusion model is significant. to improve the classification effect.

此外，我们还进行了多尺度超像素融合过程中多尺度超像素与权重的加权图和多尺度超像素融合图像I′进行了可视化展示，如图4所示。可以清晰的看出，加权图很好的去除了颅脑外的冗余信息，准确的划分出病灶区域且没有过度分割；融合图像I′可以明显区分病灶区域，降低了病灶和周围脑组织像素的灰度相似性，更好的表达了病灶区域。In addition, we also carried out a weighted map of multi-scale superpixels and weights in the process of multi-scale superpixel fusion and visualized the multi-scale superpixel fusion image I′, as shown in Figure 4. It can be clearly seen that the weighted image removes the redundant information outside the brain very well, and accurately divides the lesion area without over-segmentation; the fusion image I′ can clearly distinguish the lesion area and reduce the pixels of the lesion and surrounding brain tissue. The grayscale similarity of , better represents the lesion area.

综上所述，通过对比Baseline方法和消融实验，验证了本文提出的MSF方法在脑CT图像分类任务中的有效性。这是由于基于多尺度超像素脑CT图像模型融合生成的融合图像有效的降低了图像的噪声，并且准确地划分了病灶区域；多尺度超像素编码器提取准确的低层次特征，其中包含了区域面积和边界信息，能更好的关注小面积病灶；特征融合模型通过融合两种不同层次的特征生成了更具判别性的融合特征，对面积不固定的病灶区域有更有效的表达。因此，本发明所述方法合理可靠，可为脑CT图像的分类提供有力的帮助。In summary, by comparing the Baseline method and ablation experiments, the effectiveness of the MSF method proposed in this paper is verified in the brain CT image classification task. This is because the fusion image based on the fusion of the multi-scale superpixel brain CT image model effectively reduces the noise of the image and accurately divides the lesion area; the multi-scale superpixel encoder extracts accurate low-level features, which contain regions The area and boundary information can better pay attention to small-area lesions; the feature fusion model generates more discriminative fusion features by fusing two different levels of features, and has a more effective expression for the lesion area with variable area. Therefore, the method of the present invention is reasonable and reliable, and can provide powerful help for the classification of brain CT images.

Claims

1. A brain CT image classification method fused with multi-scale superpixels, characterized in that: first, a data set is constructed and preprocessed to obtain multi-scale superpixels; secondly, data enhancement is performed through multi-scale superpixel image fusion to obtain The optimized fusion image; then the feature coding algorithm based on region and boundary information is used to process multi-scale superpixels to obtain multi-scale superpixel low-level features; finally, the multi-scale superpixel feature fusion classification model is used to classify brain CT images;

Step (1) Acquire data and preprocess:

Step (1.1) Data: collect brain CT images to construct a dataset, each patient data contains its RGB matrix generated by brain CT images

with brain CT classification label vector Y = [Y ₁ , Y ₂ , ... Y _T ], Y _i ∈ {0, 1}, where

represents the set of real numbers, N represents the pixel size of the image, and T represents the number of disease categories collected;

Step (1.2) divides all patient data into training set, verification set and test set; wherein, the training set is used to learn the parameters of the neural network; the verification set is used to determine hyperparameters; the test set is used to verify the neural network classification effect;

Step (1.3) Data preprocessing: Based on the super-hierarchy segmentation algorithm (Super Hierarchy, SH), for the given brain CT image I and the set superpixel segmentation scale {scale ¹ , scale ² , ... scale ^S }, where S represents the number of set superpixels, and generates a superpixel map at the sth segmentation scale

The calculation process is as follows:

P ^s =SH(I, scale ^s )

Among them, s∈{1, 2…S}, scale ^s is the s-th segmentation scale, and each segmentation scale is calculated to get

represents a multi-scale superpixel containing S superpixel maps of different scales;

Step (2) Multi-scale superpixel image fusion model: for a given brain CT image I and its multi-scale superpixels

The calculation process of the fusion image I' is as follows:

Among them, ⊙ represents the dot product, f( ) is the SoftMax function,

Represents the weight of training, P ^s represents

In the sth element, W realizes the adaptive distribution of the proportions of each scale;

Step (3) Multi-scale superpixel feature encoding:

Step (3.1) Multi-scale superpixels for brain CT image I

A superpixel map with a median segmentation scale of scale ^s

generate a collection of its pixel values

right

Generates a set of mapping matrices for each pixel in

Among them, the kth mapping matrix

The i,jth element in

is calculated as follows:

where k∈{1, 2,…scale ^s }, where

The superpixel region map of ;

Step (3.2) encodes each mapping matrix in the set M ^s based on the area and boundary information, and obtains the encoding result of the superpixel map P ^s

The calculation process is as follows:

where N ² represents the number of pixels contained in the superpixel map, _sk represents the number of pixels contained in the k-th superpixel region, and ⊙ represents the dot product;

Step (3.3) to

get multi-scale superpixels

The multi-scale superpixel feature encoding B;

Step (4) Multi-scale superpixel feature fusion classification model:

Step (4.1) Build a residual neural network ResNet as the backbone network, use the brain CT fusion image I' extracted in step (2) as input, and select the feature activation output of the last residual structure (Basic Block) of the four Layers in ResNet l ₁ , l ₂ , l ₃ , l ₄ are used as high-level features;

Step (4.2) performs dimensionality reduction processing on the low-level feature B extracted in (3.3), and generates features through a convolutional layer composed of 256 3×3 convolution kernels

Perform multi-layer fusion of f ₀ with l ₁ , l ₂ , l ₃ , and l ₄ from bottom to top to generate fusion features f ₁ , f ₂ , f ₃ , f ₄ ; where f _i (i∈{1, 2 , 3, 4}) are calculated as follows: first, the pooling operation is used to convert f _(i-1) into a feature matrix of the same size as l _i , and then l is passed through the convolution layer composed of 256 1×1 convolution kernels. The number of channels of _i is converted to 256, and then feature fusion is performed by matrix addition to obtain the i-th layer fusion feature f _i ;

Step (4.3) Input the obtained fusion feature f ₄ into the convolution layer, pooling layer and fully connected layer composed of 512 3×3 convolution kernels to obtain a classification vector x of the same length as the number of labels T, and pass it through the Sigmoid linear Regression generates a predicted value vector y=[y ₁ , y ₂ ,...y _T ], where y _i ∈ [0, 1], the probability that the i-th element in x generates the corresponding label as a positive example y _i is expressed as: y _i =Sigmoid(x _i ), the classification result is determined according to the set classification threshold t, when y _i is greater than the set threshold t, the model determines that the brain CT has a corresponding label disease, and if y _i is less than the set threshold t, it is normal ; t is 0.5;

In step (4.4), the input is the patient's brain CT image I and the brain disease classification label Y, and then the probability y of the subject belonging to each category is obtained; if the data set D={(I ¹ , Y ¹ ) of M patients is given, ⁽ ^I ² , ^Y ² ⁾ ^, . The classification error of each label in the sample, and then the sample error is obtained by averaging the classification errors of all labels. The loss function is calculated as follows:

in

represents the value of the jth label of the sample,

Represents the value of the jth label predicted by the model, and T represents the number of labels in the sample;

Step (4.5) For the training set in step (1.2), use the Adam adaptive optimization algorithm to minimize the loss function described in step (4.4), by comparing the models under different learning rate λ settings, observe the model after training on the training set. For the classification accuracy rate on the validation set, the initial value of λ is set to 10 ^-5 , the next learning rate is set to 3 times the last time, and the maximum value of λ is set to 0.1, and then select the learning rate with the highest accuracy to train the model ;

Step (5) After completing all the above steps, input a new brain CT data set into the model, and classify these brain CT images according to the prediction results output by the model.