CN118537853A - Improved YOLOv7-based leukocyte classification method - Google Patents
- Publication number
- CN118537853A CN118537853A CN202410452341.6A CN202410452341A CN118537853A CN 118537853 A CN118537853 A CN 118537853A CN 202410452341 A CN202410452341 A CN 202410452341A CN 118537853 A CN118537853 A CN 118537853A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V20/698 — Microscopic objects, e.g. biological cells or cellular parts: Matching; Classification
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/084 — Backpropagation, e.g. using gradient descent
- G06V10/40 — Extraction of image or video features
- G06V10/82 — Image or video recognition or understanding using pattern recognition or machine learning, using neural networks
Abstract
The present invention belongs to the technical field of medical microscopic image classification and, in particular, proposes a five-class white blood cell classification method based on an improved YOLOv7 that recognizes blood cell images through deep learning. The invention uses the YOLOv7 network as the main structure: the four parallel convolutional layers of the first and fourth ELAN-1 modules in the Backbone are replaced with GhostV2 modules, reducing the parameter count while generating more feature maps, and the SimAM attention mechanism is combined with MHSA to improve the ELAN-2 module in the Head, generating more feature maps without adding parameters and making the algorithm attend more closely to the granularity features of the white blood cell surface. Applying different improvements to the ELAN modules at different positions in the network optimizes the feature extraction capability of YOLOv7. The invention further replaces the classification loss of the original network structure with an improved Focal Loss, which increases the weight of hard-to-distinguish samples in the loss function, biasing the loss toward hard samples, alleviating the class imbalance of white blood cell images and improving classification accuracy.
Description
Technical Field
The present invention relates to the technical field of medical microscopic image classification, and in particular to a white blood cell classification method based on an improved YOLOv7.
Background Art
White blood cells are an important component of blood and the main immune cells of the human body, playing a pivotal role in maintaining immunity and phagocytizing pathogens and bacteria. Accurate identification of the different types of white blood cells is therefore of great significance for the auxiliary diagnosis of the corresponding diseases, and correct white blood cell classification is a key step in the automatic analysis of cell images.
However, current white blood cell classification and recognition techniques still have shortcomings. Traditional image processing methods usually require hand-crafted features and rules and struggle to classify white blood cells of different scales and shapes accurately, while conventional deep learning models often suffer from slow detection, low accuracy and poor generalization on this task. A new white blood cell classification technique is therefore needed to improve the accuracy and efficiency of classification.
YOLOv7 is a deep learning model for object detection with several advantages over other networks. First, it offers faster detection and higher accuracy: its detection approach greatly increases speed without sacrificing accuracy, which matters for large-scale white blood cell classification. Second, it generalizes better, adapting to white blood cells of different scales and shapes and thereby improving classification accuracy and robustness. In addition, YOLOv7 is scalable and flexible and can easily be applied to different white blood cell classification tasks.
Based on the above background, we propose a patented white blood cell classification technique based on YOLOv7. It exploits YOLOv7's fast detection and high accuracy, combined with the strengths of deep learning, to classify white blood cells of different scales and shapes quickly and accurately, while retaining good generalization and flexibility across classification tasks. The technique can effectively improve the accuracy and efficiency of white blood cell classification and better support clinical diagnosis and treatment.
Summary of the Invention
(I) Technical problem to be solved
In view of the shortcomings of the prior art, the present invention provides a white blood cell classification method based on an improved YOLOv7. It addresses the complex backgrounds, unbalanced data and high inter-class similarity of white blood cell images, reduces the parameter count of the YOLOv7 network model while improving classification accuracy, and can provide objective and effective reference opinions in medical pathology diagnosis.
(II) Technical solution
To achieve the above purpose, the present invention adopts the following technical solution:
A white blood cell classification method based on an improved YOLOv7, comprising the following steps:
S1. Collect multiple public white blood cell datasets and apply data augmentation and denoising. Augmentation uses 90°, 180° and 270° rotations plus flipping, increasing the number of white blood cell images in the dataset to five times the original. Denoising combines two nonlinear filters, median filtering and bilateral filtering, which suppress image noise while preserving edge information well.
S2. Annotate the white blood cell images obtained in step S1 in VOC format using the Labelimg tool.
S3. Randomly divide the annotated dataset into training, test and validation sets according to a fixed ratio, and create three folders, train, test and val, to hold them in preparation for subsequent network training and prediction.
S4. Build the baseline YOLOv7 network structure and implement the improvements to it.
S5. Set the specific parameters of the improved YOLOv7 network: input image size img_size = 640×640, batch size batch_size = 4, number of training epochs Epoch = 100, learning rate lr = 0.0001, weight decay decay = 0.00001, label smoothing label-smoothing = 0.1, and the Adam optimizer. Deploy the configured network on a machine with the prepared environment and train it on the white blood cell images in the train folder produced in step S3.
S6. During training, as the number of iterations increases, the training loss and test loss gradually converge to stable values, their difference no longer changes, and the accuracy stabilizes, indicating that the expected training effect has been achieved. Classification results on the test set are saved as images under the run folder, which contains train and test subfolders. After training completes, the training results and weights are written to the train subfolder; the optimal weights from training are then used to evaluate the test set, and the best test results are saved to the test subfolder under run.
S7. Analyze the prediction results, tally the misclassified samples, and generate the corresponding confusion-matrix heat map to facilitate the subsequent calculation of evaluation metrics.
S8. To verify the effectiveness of the white blood cell classification model fairly and objectively, four metrics are introduced to evaluate it: Accuracy, Precision, Recall and F1-score.
(III) Beneficial effects
Compared with the prior art, the present invention provides a white blood cell classification method based on an improved YOLOv7 with the following beneficial effects:
1. By replacing the four 3×3 convolution kernels in the ELAN-1 structure of the Backbone with lightweight GhostV2 modules, the invention reduces computational complexity while generating more feature maps, cutting computation time while improving classification accuracy.
2. By integrating SimAM and MHSA attention into the ELAN-2 structure of the Head, the network attends more closely to the granularity features of the cell surface during classification, which significantly helps it distinguish granular from agranular cells.
3. By replacing the cls Loss of the original YOLOv7 with an improved Focal Loss, the invention dynamically lowers the weight of easily distinguished samples during training and quickly focuses on hard samples, effectively increasing the weight of hard samples in the loss function and biasing the loss toward them, while also alleviating the class imbalance of white blood cell images.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is a diagram of the improved YOLOv7 network structure;
Fig. 3 shows the five classes of white blood cell images in the dataset;
Fig. 4 shows part of the classification results.
DETAILED DESCRIPTION
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative work fall within the scope of protection of the invention.
Embodiment
As shown in Figs. 1-4, a white blood cell classification method based on an improved YOLOv7 network according to an embodiment of the present invention comprises the following steps:
S1. Collect multiple public white blood cell datasets and apply data augmentation and denoising. Augmentation uses 90°, 180° and 270° rotations plus flipping, increasing the number of white blood cell images in the dataset to five times the original. Denoising combines two nonlinear filters, median filtering and bilateral filtering, which suppress image noise while preserving edge information well.
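The augmentation step above can be sketched as follows. This is a minimal NumPy illustration under assumptions: `augment` and `median3x3` are hypothetical helper names, and the tiny 3×3 median filter merely stands in for the median/bilateral filtering that an image library such as OpenCV (`cv2.medianBlur`, `cv2.bilateralFilter`) would normally provide:

```python
import numpy as np

def augment(img):
    """Return the original image plus its 90/180/270 rotations and a flip (5x the data)."""
    return [img,
            np.rot90(img, 1), np.rot90(img, 2), np.rot90(img, 3),
            np.fliplr(img)]

def median3x3(img):
    """A minimal 3x3 median filter (stand-in for a library call); borders are left unfiltered."""
    out = img.copy()
    h, w = img.shape[:2]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y, x] = np.median(img[y-1:y+2, x-1:x+2], axis=(0, 1))
    return out

cell = np.arange(16, dtype=np.uint8).reshape(4, 4)  # toy 4x4 "image"
views = augment(cell)
print(len(views))  # 5 images per original
```

In practice the bilateral filter would follow the median filter on each image before training, since it preserves edges while smoothing noise.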
S2. Use the Labelimg tool to annotate the white blood cell images obtained in step S1 as neutrophils, eosinophils, basophils, monocytes and lymphocytes. Annotations are saved in VOC format, and a new folder, Annotations, is created under the image folder to hold the generated xml files.
S3. Randomly divide the annotated dataset into training, test and validation sets in a 7:1:2 ratio, and create three folders, train, test and val, to hold them in preparation for subsequent network training and prediction.
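A minimal sketch of the 7:1:2 split in step S3 (the `split_dataset` helper is illustrative; creating the train/test/val folders and copying files is omitted):

```python
import random

def split_dataset(paths, ratios=(0.7, 0.1, 0.2), seed=0):
    """Shuffle and split a list of image paths into train/test/val in the given ratios."""
    paths = list(paths)
    random.Random(seed).shuffle(paths)  # fixed seed so the split is reproducible
    n = len(paths)
    n_train = int(n * ratios[0])
    n_test = int(n * ratios[1])
    return {"train": paths[:n_train],
            "test": paths[n_train:n_train + n_test],
            "val": paths[n_train + n_test:]}

splits = split_dataset([f"img_{i}.jpg" for i in range(100)])
print({k: len(v) for k, v in splits.items()})  # {'train': 70, 'test': 10, 'val': 20}
```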
S4. Build the baseline YOLOv7 network structure and implement the improvements to it, as follows:
S4-1. First improve the ELAN-1 module by replacing its four 3×3 convolutions with GhostV2 (cheap operations enhanced by a long-range attention mechanism). The original GhostV1 block is built from two stacked Ghost modules in an inverted-bottleneck arrangement: the first Ghost module acts as an expansion layer that increases the number of output channels, and the second reduces the channel count to match the shortcut path; this structure helps improve the abstraction ability and representation quality of the features. GhostNet itself is a lightweight convolutional neural network designed to generate more feature maps with fewer parameters. It splits the output channels into two equal parts: one part is produced by a regular convolution, implemented in the normal way, while the other is produced by a cheap operation, a simple linear transformation instead of a regular convolution; the two parts are then concatenated to form the final output. GhostV2 improves on GhostV1 by running a DFC attention branch in parallel with the first Ghost module to enhance the expanded features, which are then fed into the second Ghost module to generate the output features. This captures long-range dependencies between pixels at different spatial positions and strengthens the model's expressive power and handling of detail. The improved ELAN-1 module replaces the first and fourth ELAN-1 modules in the Backbone of the original YOLOv7, strengthening the feature extraction capability of the network.
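The Ghost idea described above, half of the output channels from a regular convolution and the other half from a cheap linear transform of those primary features, can be illustrated with a NumPy toy (the `ghost_module` name is hypothetical; a 1×1 convolution and per-channel scaling stand in for the real primary convolution and cheap operation, and the DFC attention branch of GhostV2 is not modeled):

```python
import numpy as np

def ghost_module(x, w_primary, cheap_scale):
    """x: (C_in, H, W). Half the outputs come from a regular (here 1x1) convolution,
    the other half from a cheap per-channel linear transform of the primary maps."""
    primary = np.tensordot(w_primary, x, axes=([1], [0]))  # (C_out/2, H, W) regular conv
    cheap = primary * cheap_scale[:, None, None]           # cheap op: simple linear transform
    return np.concatenate([primary, cheap], axis=0)        # (C_out, H, W)

x = np.random.rand(8, 4, 4)     # 8 input channels
w = np.random.rand(16, 8)       # 1x1 conv weights producing 16 primary channels
scale = np.random.rand(16)      # per-channel cheap-op coefficients
out = ghost_module(x, w, scale)
print(out.shape)  # (32, 4, 4): 32 output channels from only a 16-channel convolution
```

The point of the design is visible in the shapes: doubling the channel count costs only a per-channel multiply, not a second convolution.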
S4-2. The SimAM attention mechanism is a new attention mechanism that goes beyond existing channel or spatial attention. It derives 3D attention weights for feature maps without adding extra parameters; most of its operations are based on a defined energy function, avoiding excessive structural adjustment. The energy function is:
e_t(w_t, b_t, y, x_i) = (1/(M−1)) · Σ_{i=1}^{M−1} [−1 − (w_t·x_i + b_t)]² + [1 − (w_t·t + b_t)]² + λ·w_t²
where t and x_i are the target neuron and the other neurons of the input feature X, i indexes the spatial dimension, M = H×W is the number of neurons in a channel, and w_t and b_t are the "weight" and "bias" of the neuron's transform. Binary labels (1 and −1) replace y_t and y_0, and minimizing e_t amounts to finding the linear separability between the target neuron and the other neurons. The specific steps are as follows:
S4-2-1. Input feature map: SimAM receives the input feature map. Suppose its size is H×W×C, where H and W are the height and width of the feature map and C is the number of channels.
S4-2-2. Reshaping the feature map: SimAM first reshapes the input feature map into a two-dimensional feature matrix of size N×C, where N = H×W is the total number of pixels in the feature map. This converts the spatial information into the channel dimension for the subsequent attention calculation.
S4-2-3. Computing channel attention weights: next, SimAM computes an attention weight for each channel. A simple scheme is used: the feature vector of each channel is norm-normalized and taken as that channel's attention weight. Important channels thus receive higher weights without any extra learned parameters.
S4-2-4. Applying the attention weights: after the channel attention weights are computed, SimAM applies them to each channel of the input feature map. Specifically, each channel's feature values are multiplied by its attention weight, enhancing the representation of important channels.
S4-2-5. Reconstructing the feature map: finally, the attention-adjusted feature map is reassembled into its original three-dimensional form and passed to the subsequent network layers for feature extraction and learning.
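Because the energy function of S4-2 has a closed-form minimum, the whole SimAM weighting reduces to a few tensor operations. A NumPy sketch of the commonly published parameter-free form (the `simam` name is illustrative; λ, here `lam`, is the regularization coefficient from the energy function and 1e-4 is an assumed value, not taken from the patent):

```python
import numpy as np

def simam(x, lam=1e-4):
    """x: (C, H, W). Returns x re-weighted by the sigmoid of the inverse per-neuron energy."""
    c, h, w = x.shape
    n = h * w - 1                                        # M - 1 neighbours per channel
    d = (x - x.mean(axis=(1, 2), keepdims=True)) ** 2    # squared deviation of each neuron
    v = d.sum(axis=(1, 2), keepdims=True) / n            # per-channel variance estimate
    e_inv = d / (4 * (v + lam)) + 0.5                    # inverse of the minimal energy
    return x / (1 + np.exp(-e_inv))                      # sigmoid gate, no learned parameters

x = np.random.rand(3, 8, 8)
y = simam(x)
print(y.shape)  # (3, 8, 8): same shape, attention applied elementwise
```

Neurons that stand out from their channel's mean get low energy and hence a gate close to 1, which is exactly the "3D attention weight without extra parameters" described above.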
S4-3. MHSA is a further refinement of the attention and self-attention mechanisms. It can attend to different positions and features simultaneously, improving the representational power of the features, and splitting the attention computation across multiple heads computed in parallel also improves computational efficiency. The specific steps are as follows:
S4-3-1. Input sequence: MHSA receives an input sequence, usually written X = {x_1, x_2, x_3, ..., x_n}, where n is the sequence length and x_i is the feature representation of the i-th element.
S4-3-2. Generating queries, keys and values: for each element x_i of the input sequence, MHSA generates three vectors by linear transformation: a query vector Q_i, a key vector K_i and a value vector V_i. These are used to compute the attention weights and generate the output.
S4-3-3. Computing attention weights: for each query vector Q_i, MHSA computes its similarity to all key vectors K_j, usually by dot product or another similarity measure. The similarities are then turned into attention weights by the Softmax function, expressing how strongly the query is associated with each position.
S4-3-4. Weighted summation: using the computed attention weights, MHSA forms a weighted sum of the value vectors V_j to obtain the output representation of query Q_i. Important information thus receives more attention while unimportant information is suppressed.
S4-3-5. Multi-head mechanism: MHSA usually computes attention with several heads in parallel, each independently learning its own query, key and value mappings. The outputs of the heads are finally concatenated or weighted and summed to obtain a richer, more expressive representation.
S4-3-6. Residual connection and layer normalization: each MHSA module usually uses a residual connection and layer normalization to stabilize training and make effective representations easier to learn, avoiding problems such as vanishing or exploding gradients.
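Steps S4-3-1 through S4-3-5 can be condensed into a minimal NumPy sketch of single-layer multi-head self-attention (the `mhsa` helper and random projection matrices are illustrative stand-ins for learned weights; the residual connection and layer normalization of S4-3-6 are omitted for brevity):

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mhsa(x, wq, wk, wv, num_heads):
    """x: (n, d). Projects to Q/K/V, splits into heads, attends, and concatenates heads."""
    n, d = x.shape
    dh = d // num_heads
    q, k, v = x @ wq, x @ wk, x @ wv                 # linear projections, (n, d) each
    outs = []
    for h in range(num_heads):
        qh, kh, vh = (m[:, h*dh:(h+1)*dh] for m in (q, k, v))
        attn = softmax(qh @ kh.T / np.sqrt(dh))      # (n, n) attention weights per head
        outs.append(attn @ vh)                       # weighted sum of values
    return np.concatenate(outs, axis=1)              # concatenate heads back to (n, d)

x = np.random.rand(5, 8)                             # 5 positions, 8-dim features
w = [np.random.rand(8, 8) for _ in range(3)]         # stand-ins for learned Wq, Wk, Wv
out = mhsa(x, *w, num_heads=2)
print(out.shape)  # (5, 8)
```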
In the present invention, SimAM and MHSA are inserted after the CAT operation of the ELAN-2 module. Together they enable the YOLOv7 network to capture and characterize the granular features of the image surface more accurately and efficiently, improving white blood cell classification accuracy.
S4-4. Further, we use Focal Loss in place of the cls Loss. It is a dynamically scaled cross-entropy loss based on the binary cross entropy CE: by introducing a dynamic scaling factor, it lowers the weight of easily distinguished samples during training and quickly shifts the focus to the hard-to-distinguish samples. Its specific form is as follows:
Let p_t = p when y = 1 and p_t = 1 − p otherwise, where p is the model's predicted probability for the positive class.
The Focal Loss can then be written as a single expression:
L_fl = −(1 − p_t)^γ · log(p_t)
The modulating factor (1 − p_t)^γ amplifies the loss when a sample is misclassified (p_t is small), so hard-to-classify samples carry greater weight in the loss function, guiding the model to focus on them.
To address the class imbalance problem, we add a weighting factor β_t ∈ [0, 1] to the formula:
The final balanced loss function is then:
L′_fl = −β_t · (1 − p_t)^γ · log(p_t)
S5.对改进后的YOLOv7网络结构的特定参数进行设置,输入图片大小imgsize=640×640,训练批大小batch_size=4,训练迭代次数Epoch=100,学习率lr=0.0001,衰减decay=0.00001,标签平滑label-smoothing=0.1,优化器采用adam优化器,将设置好参数的改进YOLOv7网络结构放入配置好环境的计算机中使用后步骤S3产生的train文件夹下的白细胞图像进行训练。S5. Set specific parameters of the improved YOLOv7 network structure, input image size imgsize = 640 × 640, training batch size batch_size = 4, training iteration number Epoch = 100, learning rate lr = 0.0001, decay decay = 0.00001, label smoothing label-smoothing = 0.1, and use adam optimizer as optimizer. Put the improved YOLOv7 network structure with set parameters into a computer with a configured environment and use it to train the white blood cell images in the train folder generated in step S3.
S6.训练时随着迭代次数的不断增加,训练损失和测试损失都慢慢收敛至一个稳定值,二者差值不在变化,准确率也趋于稳定,表明已经达到了预期的训练效果,同时对于测试集的分类结果也会以图片的形式保存在run文件夹下,该文件夹分为train和test两个文件夹,在网络训练完成以后会在train文件夹下面输出训练结果和权重,此时我们选取训练过程中的最优权重来对测试集进行测试从而得到最优测试结果,最后将结果保存至run文件夹下的test文件夹下。S6. During training, as the number of iterations increases, the training loss and test loss slowly converge to a stable value, the difference between the two does not change, and the accuracy tends to be stable, indicating that the expected training effect has been achieved. At the same time, the classification results of the test set will also be saved in the form of pictures in the run folder. The folder is divided into two folders, train and test. After the network training is completed, the training results and weights will be output under the train folder. At this time, we select the optimal weight in the training process to test the test set to obtain the optimal test result, and finally save the result to the test folder under the run folder.
S7. Analyze the prediction results, tally the misclassified samples, and generate the corresponding confusion-matrix heat map to support the subsequent calculation of evaluation metrics.
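An illustrative sketch of the tallying in step S7: count (true, predicted) label pairs into a 5 × 5 confusion matrix for the five leukocyte classes. Rendering it as a heat map (e.g. with a plotting library) is omitted here.

```python
from collections import Counter

# The five leukocyte classes from the patent.
CLASSES = ["neutrophil", "eosinophil", "basophil", "lymphocyte", "monocyte"]

def confusion_matrix(y_true, y_pred):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in CLASSES] for t in CLASSES]
```

Off-diagonal entries are exactly the misclassifications that step S7 statistics are built from.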
S8. To verify the effectiveness of the white blood cell classification and recognition model fairly and objectively, four metrics are introduced to evaluate it: accuracy, precision, recall, and F1-score.
Precision is the proportion of samples predicted as positive by the model that are truly positive. It is calculated as: Precision = TP / (TP + FP)
Recall is the proportion of truly positive samples that are correctly predicted. It is calculated as: Recall = TP / (TP + FN)
The F1-score combines precision and recall. It ranges from 0 to 1, where 1 indicates the best possible output and 0 the worst. It is calculated as: F1 = 2 × Precision × Recall / (Precision + Recall)
Accuracy is a metric for evaluating the performance of a classification model: the proportion of all predictions that are correct. It is usually expressed as a percentage, with higher values indicating better classification performance. It is calculated as: Accuracy = (TP + TN) / (TP + TN + FP + FN)
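The four indicators of step S8 reduce to the standard TP/FP/FN/TN counts. The sketch below states the textbook binary formulas; for the five-class leukocyte task they would be computed per class and averaged, a detail the patent leaves open:

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)                    # Precision = TP / (TP + FP)
    recall = tp / (tp + fn)                       # Recall = TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)    # correct / all predictions
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}
```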
Finally, it should be noted that the above is only a preferred embodiment of the present invention and is not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in those embodiments or replace some of their technical features with equivalents. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410452341.6A CN118537853A (en) | 2024-04-16 | 2024-04-16 | Improved YOLOv-based leukocyte classification method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118537853A true CN118537853A (en) | 2024-08-23 |
Family
ID=92380038
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410452341.6A Pending CN118537853A (en) | 2024-04-16 | 2024-04-16 | Improved YOLOv-based leukocyte classification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118537853A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118896874A (en) * | 2024-10-08 | 2024-11-05 | 国炬(天津)医疗科技有限公司 | Test method and equipment for saturated water absorption rate of Chinese herbal medicine slices based on water content identification |
CN118896874B (en) * | 2024-10-08 | 2025-05-30 | 国炬(天津)医疗科技有限公司 | Method and equipment for testing saturated water absorption of traditional Chinese medicine decoction pieces based on water content identification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||