CN114170599B - Abnormal object segmentation method based on distillation comparison - Google Patents

Info

Publication number
CN114170599B
Authority
CN
China
Prior art keywords
branch
teacher
abnormal
distillation
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111523499.0A
Other languages
Chinese (zh)
Other versions
CN114170599A (en
Inventor
周瑜
周欢
龚石
白翔
郑增强
刘荣华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN202111523499.0A
Publication of CN114170599A
Application granted
Publication of CN114170599B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G06F18/253: Fusion techniques of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses an abnormal object segmentation method based on distillation comparison. A semantic segmentation network is trained on an anomaly-free training set and, with its semantic classification head removed, serves as the teacher branch. With the teacher branch parameters frozen, a student branch of similar structure is obtained by semantic feature distribution distillation. The outputs of the two branches agree on normal classes and disagree on abnormal classes. Given a test image containing abnormal objects, each branch performs multi-scale feature extraction and aggregation on the image; the extracted semantic features are compared position by position to produce an anomaly score map, which is bilinearly interpolated and thresholded to classify every pixel in the image as normal or abnormal. The method introduces a new, simple, and flexible distillation comparison network for abnormal object segmentation. Because the output of the semantic classification head is not used at inference, far fewer normal-class pixels that semantic segmentation misclassifies are mistaken for anomalies, yielding more accurate abnormal object segmentation.

Description

A method for abnormal object segmentation based on distillation comparison

Technical Field

The present invention belongs to the technical field of computer vision and, more specifically, relates to an abnormal object segmentation method based on distillation comparison.

Background Art

In recent years, abnormal object segmentation has become a research focus in safety-critical fields such as autonomous driving and medical image analysis. Deep convolutional neural networks have achieved major breakthroughs in semantic segmentation, whose goal is to assign a predefined semantic class label to every pixel. In open real-world scenes, however, a pixel may belong to an unknown class, that is, an abnormal class, and assigning it to one of the predefined normal classes instead of flagging it as abnormal is dangerous. The abnormal object segmentation task is to segment the pixels of abnormal objects in a test image, meaning objects that do not belong to any of the normal classes predefined for the training images.

Existing abnormal object segmentation methods first use a semantic segmentation network (such as PSPNet) to obtain a semantic segmentation prediction and then apply different strategies to that prediction to obtain an anomaly score map. Methods based on uncertainty estimation post-process the semantic segmentation prediction in various ways, for example with the softmax function, the CRF (Conditional Random Field) algorithm, or an ensemble of semantic segmentation predictions. Methods based on image reconstruction use the semantic segmentation prediction to resynthesize the original image and compare the resynthesized image with the original to obtain the anomaly score map. Semantic segmentation prediction is therefore a core step in the inference stage of existing methods.

Most existing abnormal object segmentation methods rely heavily on the semantic segmentation prediction, yet pixels that semantic segmentation misclassifies are very easily mistaken for anomalies. This phenomenon seriously degrades the accuracy of abnormal object segmentation but is rarely discussed in existing work.

Summary of the Invention

In view of the above defects of, and needs for improvement in, the prior art, the present invention provides a more accurate abnormal object segmentation method based on distillation comparison. It prevents semantic segmentation errors from entering the pipeline and harming the result, greatly reduces the number of normal-class pixels that are misjudged because semantic segmentation misclassified them, and segments abnormal objects in images more accurately.

To achieve the above object, the present invention provides an abnormal object segmentation method based on distillation comparison, comprising the following steps:

Step S1: train a semantic segmentation network using training images without abnormal objects and their pixel-level semantic labels; remove the classifier of the trained network and retain only the feature extraction and aggregation parts as the teacher branch.

Step S2: fix the parameters of the teacher branch obtained in step S1 and, on the training images without abnormal objects, use semantic feature distribution distillation to train a student branch whose structure is similar to that of the teacher branch. Semantic feature distribution distillation ensures that the outputs of the two branches are consistent on normal classes and inconsistent on abnormal classes.

Step S3: input a test image containing abnormal objects, and use the teacher branch obtained in step S1 and the student branch obtained in step S2 to each perform multi-scale feature extraction and aggregation on the image, obtaining high-level semantic features of the image.

Step S4: compare the high-level semantic features of the teacher branch and the student branch obtained in step S3 position by position, using a scoring function to compute the difference between the two branches' semantic features as the anomaly score at each position, which yields an anomaly score map; bilinearly upsample the anomaly score map to the size of the original image; and finally set a suitable threshold to classify every pixel in the image as normal or abnormal.
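The final part of step S4 (upsampling the score map and thresholding it) can be sketched in plain Python as below. This is a minimal illustration, not the patent's implementation: the corner-aligned interpolation convention, the function names `bilinear_upsample` and `threshold_mask`, and the threshold 0.2 are all assumptions, since the source only says "bilinear upsampling" and "a suitable threshold".

```python
def bilinear_upsample(score, out_h, out_w):
    """Bilinearly resize a 2-D list of floats (corner-aligned; an assumed convention)."""
    in_h, in_w = len(score), len(score[0])
    out = []
    for y in range(out_h):
        # Map the output row to a fractional input row.
        fy = y * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        wy = fy - y0
        row = []
        for x in range(out_w):
            fx = x * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            wx = fx - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = score[y0][x0] * (1 - wx) + score[y0][x1] * wx
            bottom = score[y1][x0] * (1 - wx) + score[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def threshold_mask(score, tau):
    """Label each pixel 1 (abnormal) if its score exceeds tau, else 0 (normal)."""
    return [[1 if s > tau else 0 for s in row] for row in score]


# Toy 2x2 score map with one high-scoring corner, upsampled to 3x3.
low_res = [[0.0, 0.0], [0.0, 1.0]]
full_res = bilinear_upsample(low_res, 3, 3)
mask = threshold_mask(full_res, tau=0.2)  # 0.2 is a hypothetical threshold
```

In practice the threshold would be tuned on validation data; the source does not say how it is chosen.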

In one embodiment of the present invention, the semantic segmentation network in step S1 may be any existing semantic segmentation network.

In one embodiment of the present invention, the semantic segmentation network with its classifier removed serves as the teacher branch, which comprises an image feature extraction module and a feature aggregation module. When the semantic segmentation network is trained, the downsampling operations in the high layers of the feature extraction module are removed and the ordinary convolutions in those layers are replaced with dilated (atrous) convolutions, so that the feature map output by the teacher branch is 1/8 the size of the input image and has C channels, where C is a preset value. Finally, each channel is standardized separately to obtain the final output features of the teacher branch.
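The 1/8 output size follows from simple stride arithmetic. The sketch below assumes the standard ResNet downsampling schedule (stride 2 in the stem convolution, the max-pool, and the third, fourth, and fifth stages), which the source does not spell out, and shows that turning the last two strides into dilation gives a total stride of 8 with dilation rates 2 and 4. The helper name `plan_stages` and the 512-pixel input side are hypothetical.

```python
def plan_stages(stage_strides, target_stride):
    """Return (stride, dilation) per stage so the total stride stops at target_stride."""
    current, dilation, plan = 1, 1, []
    for s in stage_strides:
        if current * s > target_stride:
            # Drop the downsampling and widen the receptive field instead.
            dilation *= s
            plan.append((1, dilation))
        else:
            current *= s
            plan.append((s, dilation))
    return plan


# Assumed standard ResNet schedule: conv1, max-pool, conv2_x, conv3_x, conv4_x, conv5_x.
RESNET_STRIDES = [2, 2, 1, 2, 2, 2]

plan = plan_stages(RESNET_STRIDES, target_stride=8)

total_stride = 1
for stride, _ in plan:
    total_stride *= stride

feature_size = 512 // total_stride  # hypothetical 512-pixel input side
```

With this plan the last two stages come out as `(1, 2)` and `(1, 4)`, which matches the dilation rates the detailed description gives for the fourth and fifth convolution blocks.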

In one embodiment of the present invention, the student branch in step S2 adopts a structure similar to that of the teacher branch in step S1.

In one embodiment of the present invention, the student branch in step S2 is specifically as follows: the student branch comprises an image feature extraction module and a feature aggregation module. The downsampling operations in the high layers of the feature extraction module are removed and the ordinary convolutions in those layers are replaced with dilated (atrous) convolutions, so that the feature map output by the student branch is 1/8 the size of the input image, matching the teacher branch, and has C channels, where C is a preset value equal to the number of output channels of the teacher branch. Finally, each channel is standardized separately to obtain the final output features of the student branch.

In one embodiment of the present invention, the student branch is trained in step S2 by semantic feature distribution distillation on the training images without abnormal objects, and the optimized objective function is [formula not reproduced in the source text], where M is the batch size, C is the number of channels, i and j are position indices in the feature map, c is the channel index in the feature map, and the remaining term is the difference between the feature values of the teacher branch and the student branch at the same channel and position of their output feature maps.
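Because the formula itself is missing from this text, the following is only a guess at its shape, built from the symbols that are defined: squared teacher-student differences, summed over positions, averaged over the C channels and the M images in a batch. The exact normalization, the choice of squared error, and the name `distillation_loss` are assumptions, not the patent's stated objective.

```python
def distillation_loss(teacher, student):
    """Assumed form: (1/M) * sum over images and positions of (1/C) * sum_c diff^2.

    Both inputs are nested lists shaped [M][C][H][W].
    """
    m = len(teacher)
    total = 0.0
    for t_img, s_img in zip(teacher, student):
        c = len(t_img)
        for t_ch, s_ch in zip(t_img, s_img):
            for t_row, s_row in zip(t_ch, s_ch):
                for t, s in zip(t_row, s_row):
                    # Squared feature difference at one channel and position.
                    total += (t - s) ** 2 / c
    return total / m


# One image (M=1), two channels (C=2), a 1x2 feature map.
teacher_feats = [[[[1.0, 2.0]], [[0.0, 0.0]]]]
student_feats = [[[[0.0, 2.0]], [[0.0, 2.0]]]]

loss = distillation_loss(teacher_feats, student_feats)
zero_loss = distillation_loss(teacher_feats, teacher_feats)
```

Only the student's parameters would be updated against this loss, since the teacher branch is fixed during step S2.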

In one embodiment of the present invention, during the training of both the teacher branch and the student branch, the input training images contain no abnormal objects and include only the predefined normal classes.

In one embodiment of the present invention, the scoring function in step S4 is [formula not reproduced in the source text], where C is the number of channels, i and j are position indices in the feature map, c is the channel index in the feature map, and the remaining term is the difference between the feature values of the teacher branch and the student branch at the same channel and position of their output feature maps.
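The scoring formula is also missing from this text. A plausible reading of the defined symbols is a per-position average over channels of the squared teacher-student difference; the sketch below implements that reading as an assumption, with the hypothetical name `anomaly_score_map`.

```python
def anomaly_score_map(teacher, student):
    """Assumed form: score[i][j] = (1/C) * sum_c (teacher[c][i][j] - student[c][i][j])^2.

    Both inputs are nested lists shaped [C][H][W]; the result is shaped [H][W].
    """
    c = len(teacher)
    h = len(teacher[0])
    w = len(teacher[0][0])
    return [
        [
            sum((teacher[k][i][j] - student[k][i][j]) ** 2 for k in range(c)) / c
            for j in range(w)
        ]
        for i in range(h)
    ]


# Two channels, a 1x2 feature map: the branches agree at position (0, 0)
# and disagree in the second channel at position (0, 1).
teacher_feats = [[[1.0, 2.0]], [[0.0, 0.0]]]
student_feats = [[[1.0, 2.0]], [[0.0, 2.0]]]

scores = anomaly_score_map(teacher_feats, student_feats)
```

The position where the branches agree scores 0 and the position where they disagree scores higher, which is the behaviour the method relies on to separate normal from abnormal pixels.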

In one embodiment of the present invention, PSPNet with a ResNet-50 backbone is used as the semantic segmentation network.

In one embodiment of the present invention, M=4 and C=512.

In general, compared with the prior art, the above technical solution conceived by the present invention has the following beneficial effects:

(1) The present invention proposes a novel abnormal object segmentation method based on distillation comparison. Training requires only normal images without anomalies. At test time, anomalies are found more effectively by comparing, position by position, the high-level semantic features extracted by the two branches. The prediction of the semantic segmentation network's classification head is not used as an intermediate step at test time, which prevents semantic segmentation errors from harming the result, greatly reduces the number of normal-class pixels that are misjudged because semantic segmentation misclassified them, and segments abnormal objects in images more accurately.

(2) The present invention proposes a simple and flexible abnormal object segmentation method based on distillation comparison. The teacher branch can be any semantic segmentation network with its classifier removed, and the student branch only needs a structure similar to that of the teacher branch. The model size can be chosen according to practical requirements, which makes the method more general.

Brief Description of the Drawings

FIG. 1 is an overall flow chart of the abnormal object segmentation method based on distillation comparison provided by the present invention.

Detailed Description

To make the purpose, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawing and embodiments. It should be understood that the specific embodiments described here only explain the present invention and are not intended to limit it. In addition, the technical features involved in the embodiments described below can be combined with one another as long as they do not conflict.

The present invention provides an abnormal object segmentation method based on distillation comparison which, as shown in FIG. 1, comprises the following steps:

Step S1: train a semantic segmentation network using training images without abnormal objects (1) and their pixel-level semantic labels; remove the classifier of the trained network and retain only the feature extraction and aggregation parts as the teacher branch (2).

Step S2: fix the parameters of the teacher branch obtained in step S1 and, on the training images without abnormal objects (1), use semantic feature distribution distillation to train a student branch (3) whose structure is similar to that of the teacher branch. Semantic feature distribution distillation ensures that the outputs of the two branches are consistent on normal classes and inconsistent on abnormal classes.

Step S3: input a test image containing abnormal objects (5), and use the teacher branch obtained in step S1 and the student branch obtained in step S2 to each perform multi-scale feature extraction and aggregation on the image, obtaining high-level semantic features of the image.

Step S4: compare the high-level semantic features of the teacher branch and the student branch obtained in step S3 position by position, using the scoring function (6) to compute the difference between the two branches' semantic features as the anomaly score at each position, which yields an anomaly score map; bilinearly upsample the anomaly score map to the size of the original image; and finally set a suitable threshold to classify every pixel in the image as normal or abnormal.

The implementation has three main parts: 1) the teacher branch; 2) the student branch; 3) the objective function and the scoring function. Each is described in detail below.

1. Teacher branch

The embodiment of the present invention uses PSPNet with a ResNet-50 backbone as the semantic segmentation network. On the training set without abnormal objects, PSPNet is trained under the conventional semantic segmentation setting. The teacher branch (2) is the trained PSPNet with its classifier removed: ResNet-50 is retained as the feature extraction module, and the pyramid pooling module serves as the feature aggregation module for multi-scale aggregation of high-level semantic features. To avoid losing too much resolution, and with it information about small objects, the embodiment removes the downsampling operations of the last two convolution blocks of ResNet-50. To enlarge the receptive field and capture more context, the fourth convolution block of ResNet-50 is replaced with a dilated convolution block with a dilation rate of 2 and the fifth with a dilated convolution block with a dilation rate of 4, so the output feature map is 1/8 the size of the input image. The last-layer output features of ResNet-50 are pooled at four levels (1×1, 2×2, 3×3, and 6×6); the pooled features are bilinearly upsampled to 1/8 of the input image size and concatenated with the last-layer output features of ResNet-50. A 1×1 convolution then changes the number of channels of the concatenated features to C, and finally each channel is standardized separately to obtain the final output features of the teacher branch (2). In the embodiment, C is set to 512.
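The four pooling levels of the pyramid pooling module can be illustrated on a single-channel toy map. The sketch assumes adaptive average pooling, as in the original PSPNet design; the source says only "pooling", and the function name `adaptive_avg_pool` and the 6×6 input are illustrative choices.

```python
def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2-D list into a bins x bins grid (assumed pooling type)."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for by in range(bins):
        # Bin boundaries: floor for the start, ceiling for the end.
        y0 = (by * h) // bins
        y1 = -((-(by + 1) * h) // bins)
        row = []
        for bx in range(bins):
            x0 = (bx * w) // bins
            x1 = -((-(bx + 1) * w) // bins)
            cells = [fmap[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


# Toy 6x6 single-channel feature map holding the values 0..35 in row-major order.
fmap = [[r * 6 + c for c in range(6)] for r in range(6)]

# One pooled map per pyramid level named in the text: 1x1, 2x2, 3x3, 6x6.
pooled = {bins: adaptive_avg_pool(fmap, bins) for bins in (1, 2, 3, 6)}
```

In the described pipeline each pooled map would then be bilinearly upsampled back to the backbone's output resolution and concatenated with the backbone features before the 1×1 convolution; those steps are omitted here.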

2. Student branch

The student branch (3) used in the embodiment of the present invention is similar in structure to the teacher branch (2). ResNet-34 serves as the feature extraction module, and the pyramid pooling module serves as the feature aggregation module for multi-scale aggregation of high-level semantic features. To avoid losing too much resolution, and with it information about small objects, the embodiment removes the downsampling operations of the last two convolution blocks of ResNet-34. To enlarge the receptive field and capture more context, the fourth convolution block of ResNet-34 is replaced with a dilated convolution block with a dilation rate of 2 and the fifth with a dilated convolution block with a dilation rate of 4, so the output feature map is 1/8 the size of the input image. The last-layer output features of ResNet-34 are pooled at four levels (1×1, 2×2, 3×3, and 6×6); the pooled features are bilinearly upsampled to 1/8 of the input image size and concatenated with the last-layer output features of ResNet-34. A 1×1 convolution then changes the number of channels of the concatenated features to C, and finally each channel is standardized separately to obtain the final output features of the student branch. In the embodiment, C is set to 512.
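Both branches end with a per-channel standardization. The source does not say which statistics are used, so the sketch below assumes each channel is shifted to zero mean and scaled to unit variance over its spatial positions; `standardize_channels` and the `eps` stabilizer are hypothetical.

```python
def standardize_channels(features, eps=1e-5):
    """Standardize each channel of a [C][H][W] nested list over its spatial positions.

    Assumption: zero mean and unit variance per channel; eps avoids division by zero.
    """
    out = []
    for channel in features:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        std = (var + eps) ** 0.5
        out.append([[(v - mean) / std for v in row] for row in channel])
    return out


# Two channels with very different scales.
features = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[10.0, 10.0], [10.0, 14.0]],
]
normalized = standardize_channels(features)

# Per-channel mean and variance after standardization.
stats = []
for channel in normalized:
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    stats.append((mean, var))
```

One likely reason for this step is that it puts the teacher's and the student's features on a comparable scale, so that their element-wise difference reflects disagreement in content instead of a difference in magnitude.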

3. Objective function and scoring function

When training the student branch, the embodiment of the present invention fixes the parameters of the teacher branch and trains the student branch by semantic feature distribution distillation on the training set without abnormal objects. The optimized objective function (4) is [formula not reproduced in the source text], where M is the batch size, C is the number of channels, i and j are position indices in the feature map, c is the channel index in the feature map, and the remaining term is the difference between the feature values of the teacher branch and the student branch at the same channel and position of their output feature maps. The embodiment sets M=4 and C=512.

At test time, a test image containing abnormal objects (5) is input. The teacher branch (2) and the student branch (3) each extract and aggregate features to output high-level semantic features of the test image, and these features are compared position by position. The scoring function (6) computes the difference between the two branches' semantic features as the anomaly score at each position, which yields the anomaly score map. The scoring function (6) is [formula not reproduced in the source text], where C is the number of channels, i and j are position indices in the feature map, c is the channel index in the feature map, and the remaining term is the difference between the feature values of the teacher branch and the student branch at the same channel and position of their output feature maps. The embodiment sets C=512. Semantic feature distribution distillation ensures that the outputs of the two branches are consistent on normal classes, because training is performed only on images that contain the predefined normal classes and no abnormal objects. For pixels of abnormal classes, which were never seen in training, the semantic features extracted by the two branches are effectively arbitrary, so the outputs of the two branches are inconsistent on abnormal classes. The scoring function therefore yields a small anomaly score for normal-class pixels and a large anomaly score for abnormal-class pixels.
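The principle in the paragraph above (agreement where the student was trained, disagreement elsewhere) can be seen in a one-dimensional toy. This is an analogy only and has nothing to do with the patent's networks: the "teacher" is `tanh`, the "student" is a straight line fitted by least squares on inputs in [-0.5, 0.5], and the out-of-range input 3.0 plays the role of an abnormal pixel. All names and values are hypothetical.

```python
import math


def teacher(x):
    """Stand-in for the fixed teacher branch."""
    return math.tanh(x)


# "Normal" training inputs: the only region where the student sees the teacher.
normal_x = [k / 10 for k in range(-5, 6)]

# Fit the student a*x + b to the teacher by ordinary least squares.
n = len(normal_x)
mean_x = sum(normal_x) / n
mean_t = sum(teacher(x) for x in normal_x) / n
a = sum((x - mean_x) * (teacher(x) - mean_t) for x in normal_x) / sum(
    (x - mean_x) ** 2 for x in normal_x
)
b = mean_t - a * mean_x


def student(x):
    """Stand-in for the distilled student branch."""
    return a * x + b


def score(x):
    """Squared teacher-student disagreement, used as a toy anomaly score."""
    return (teacher(x) - student(x)) ** 2


max_normal_score = max(score(x) for x in normal_x)
abnormal_score = score(3.0)  # an input far outside the training range
```

The student tracks the teacher closely inside the training range and diverges outside it, so a threshold placed between the two scores separates the regimes. How well this separation carries over to real images is an empirical question the toy does not answer.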

The present invention proposes an abnormal object segmentation method based on distillation comparison. The distillation comparison network contains a teacher branch and a student branch. On a training set without abnormal objects, the teacher branch is trained under the conventional semantic segmentation setting, and the student branch is obtained by distilling the semantic feature distribution of the teacher branch. Semantic feature distribution distillation ensures that the outputs of the two branches are consistent on normal classes and inconsistent on abnormal classes, and at test time the difference between the two branches' semantic features is used to detect anomalies. The distillation comparison network is simple and flexible. The prediction of the semantic segmentation network's classification head is not used as an intermediate step at test time, which prevents semantic segmentation errors from harming the result, greatly reduces the number of normal-class pixels that are misjudged because semantic segmentation misclassified them, and segments abnormal objects in images more accurately.

Those skilled in the art will readily understand that the above is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A method for segmenting abnormal objects based on distillation comparison, characterized in that it comprises the following steps:
Step S1: train a semantic segmentation network on training images that contain no abnormal objects, together with their pixel-level semantic labels; remove the classifier of the trained semantic segmentation network and retain only the feature extraction and aggregation parts as the teacher branch;
Step S2: fix the parameters of the teacher branch obtained in step S1 and, on the training images without abnormal objects, use semantic feature distribution distillation to train a student branch whose structure is similar to that of the teacher branch; the semantic feature distribution distillation ensures that the outputs of the two branches are consistent on normal classes and inconsistent on abnormal classes;
Step S3: input a test image containing an abnormal object, and use the teacher branch obtained in step S1 and the student branch obtained in step S2 to each perform multi-scale feature extraction and aggregation on the image, obtaining the high-level semantic features of the image;
Step S4: compare the high-level semantic features of the teacher branch and the student branch obtained in step S3 position by position, using a scoring function to compute the difference between the semantic features of the two branches as the anomaly score of each position, which yields an anomaly score map; bilinearly upsample the anomaly score map to the original image size; and finally set a suitable threshold to divide all pixels in the image into two classes, normal and abnormal.
2. The abnormal object segmentation method based on distillation comparison according to claim 1, characterized in that the semantic segmentation network in step S1 may be any existing semantic segmentation network.
3. The abnormal object segmentation method based on distillation comparison according to claim 1 or 2, characterized in that the teacher branch is obtained by removing the classifier part from the semantic segmentation network; the teacher branch includes an image feature extraction module and a feature aggregation module; when training the semantic segmentation network, the downsampling operations in the high layers of the feature extraction module are removed and dilated (atrous) convolution is introduced there in place of ordinary convolution, so that the output feature map of the teacher branch is 1/8 the size of the input image and has C channels, C being a preset value; finally, each channel is standardized separately to obtain the final output features of the teacher branch.
4. The abnormal object segmentation method based on distillation comparison according to claim 1 or 2, characterized in that the student branch in step S2 adopts a structure similar to that of the teacher branch in step S1.
5. The abnormal object segmentation method based on distillation comparison according to claim 4, characterized in that the student branch in step S2 is specifically as follows: the student branch includes an image feature extraction module and a feature aggregation module; the downsampling operations in the high layers of the feature extraction module are removed and dilated (atrous) convolution is introduced there in place of ordinary convolution, so that the output feature map of the student branch is 1/8 the size of the input image, matching the size of the teacher branch's output feature map; the number of channels is C, a preset value, matching the number of output feature channels of the teacher branch; finally, each channel is standardized separately to obtain the final output features of the student branch.
6. The abnormal object segmentation method based on distillation comparison according to claim 1 or 2, characterized in that in step S2 the student branch is trained by semantic feature distribution distillation on the training images without abnormal objects, and the optimized objective function is:
[formula not reproduced in this text]
where M is the batch size, C is the number of channels, i and j are position indices in the feature map, c is the channel index in the feature map, and the difference term in the formula is the difference between the feature values at the corresponding position of the corresponding channel in the output feature maps of the teacher branch and the student branch.
7. The abnormal object segmentation method based on distillation comparison according to claim 1 or 2, characterized in that, in the training stage of both the teacher branch and the student branch, the input training images contain no abnormal objects and contain only the predefined normal categories.
8. The abnormal object segmentation method based on distillation comparison according to claim 1 or 2, characterized in that the scoring function in step S4 is:
[formula not reproduced in this text]
where C is the number of channels, i and j are position indices in the feature map, c is the channel index in the feature map, and the difference term in the formula is the difference between the feature values at the corresponding position of the corresponding channel in the output feature maps of the teacher branch and the student branch.
9. The abnormal object segmentation method based on distillation comparison according to claim 1 or 2, characterized in that PSPNet with ResNet-50 as the backbone network is used as the semantic segmentation network.
10. The abnormal object segmentation method based on distillation comparison according to claim 6, characterized in that M=4 and C=512.
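The comparison pipeline of step S4, together with the per-channel standardization mentioned in claims 3 and 5, can be illustrated with a small dependency-free Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the scoring formula of claim 8 is not reproduced in this text, so the sketch assumes the score is the mean over channels of the squared teacher-student feature difference. The function names, the corner-aligned interpolation convention, and the toy feature values are all hypothetical. Feature maps are plain nested lists indexed as feat[c][i][j].

```python
from statistics import fmean, pstdev


def standardize_channels(feat, eps=1e-6):
    """Standardize each channel (an H x W grid) to zero mean, unit variance."""
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        mu, sigma = fmean(values), pstdev(values)
        out.append([[(v - mu) / (sigma + eps) for v in row] for row in channel])
    return out


def anomaly_score_map(teacher, student):
    """Per-position score. ASSUMPTION: mean over channels of the squared
    difference; the patent's actual scoring formula is not shown here."""
    n_ch = len(teacher)
    height, width = len(teacher[0]), len(teacher[0][0])
    return [
        [
            sum((teacher[c][i][j] - student[c][i][j]) ** 2 for c in range(n_ch)) / n_ch
            for j in range(width)
        ]
        for i in range(height)
    ]


def bilinear_upsample(grid, out_h, out_w):
    """Bilinear interpolation with aligned corners (one common convention)."""
    in_h, in_w = len(grid), len(grid[0])

    def source_coord(k, n_out, n_in):
        # Map an output index to a fractional input coordinate.
        return 0.0 if n_out == 1 else k * (n_in - 1) / (n_out - 1)

    out = []
    for y in range(out_h):
        fy = source_coord(y, out_h, in_h)
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        wy = fy - y0
        row = []
        for x in range(out_w):
            fx = source_coord(x, out_w, in_w)
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            wx = fx - x0
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def segment(score_map, threshold):
    """Label each pixel 1 (abnormal) if its score exceeds the threshold, else 0."""
    return [[1 if s > threshold else 0 for s in row] for row in score_map]


if __name__ == "__main__":
    # Toy 2-channel, 2x2 features: the branches disagree only at position (0, 0).
    teacher = [[[1.0, 0.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, 0.0]]]
    student = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    scores = anomaly_score_map(teacher, student)
    mask = segment(bilinear_upsample(scores, 3, 3), threshold=0.4)
    print(mask)  # [[1, 1, 0], [1, 0, 0], [0, 0, 0]]
```

In a real system the two feature maps would come from the teacher and student networks at 1/8 of the input resolution, and the threshold would be chosen on validation data; the sketch only shows how disagreement between the branches becomes a pixel-level mask.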
CN202111523499.0A 2021-12-14 2021-12-14 Abnormal object segmentation method based on distillation comparison Active CN114170599B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111523499.0A CN114170599B (en) 2021-12-14 2021-12-14 Abnormal object segmentation method based on distillation comparison

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111523499.0A CN114170599B (en) 2021-12-14 2021-12-14 Abnormal object segmentation method based on distillation comparison

Publications (2)

Publication Number Publication Date
CN114170599A CN114170599A (en) 2022-03-11
CN114170599B true CN114170599B (en) 2024-08-23

Family

ID=80486195

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111523499.0A Active CN114170599B (en) 2021-12-14 2021-12-14 Abnormal object segmentation method based on distillation comparison

Country Status (1)

Country Link
CN (1) CN114170599B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114742799B (en) * 2022-04-18 2024-04-26 华中科技大学 Unknown type defect segmentation method for industrial scenes based on self-supervised heterogeneous network
CN117152459B (en) * 2023-10-30 2024-06-11 腾讯科技(深圳)有限公司 Image detection method, device, computer readable medium and electronic equipment

Citations (2)

Publication number Priority date Publication date Assignee Title
CN111062951A (en) * 2019-12-11 2020-04-24 华中科技大学 A Knowledge Distillation Method Based on Intra-Class Feature Difference for Semantic Segmentation
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US11347965B2 (en) * 2019-03-21 2022-05-31 Illumina, Inc. Training data generation for artificial intelligence-based sequencing
US12354008B2 (en) * 2020-02-20 2025-07-08 Illumina, Inc. Knowledge distillation and gradient pruning-based compression of artificial intelligence-based base caller

Also Published As

Publication number Publication date
CN114170599A (en) 2022-03-11

Similar Documents

Publication Publication Date Title
CN109509192B (en) Semantic segmentation network integrating multi-scale feature space and semantic space
CN112561910B (en) Industrial surface defect detection method based on multi-scale feature fusion
CN111489324B (en) Cervical image classification method fusing multi-mode prior pathological depth features
CN112381097A (en) Scene semantic segmentation method based on deep learning
CN111680702B (en) Method for realizing weak supervision image significance detection by using detection frame
CN110188654B (en) Video behavior identification method based on mobile uncut network
CN114170599B (en) Abnormal object segmentation method based on distillation comparison
CN107239801A (en) Video attribute represents that learning method and video text describe automatic generation method
CN110298843B (en) Two-dimensional image component segmentation method based on improved deep Lab and application thereof
CN114742799B (en) Unknown type defect segmentation method for industrial scenes based on self-supervised heterogeneous network
CN111652273B (en) Deep learning-based RGB-D image classification method
CN114758133B (en) Image defect segmentation method based on superpixel active learning and semi-supervised learning strategy
CN114821184B (en) Long-tail image classification method and system based on balanced complementary entropy
CN115797637B (en) Semi-supervised segmentation model based on inter-model and intra-model uncertainty
CN114417836B (en) A semantic segmentation method for Chinese electronic medical record text based on deep learning
CN113066025A (en) Image defogging method based on incremental learning and feature and attention transfer
CN112070040A (en) Text line detection method for video subtitles
CN113392840B (en) Real-time Semantic Segmentation Method Based on Multi-Scale Segmentation Fusion
CN111783688A (en) A classification method of remote sensing image scene based on convolutional neural network
CN115690574A (en) Remote sensing image ship detection method based on self-supervision learning
CN117392388A (en) Capsule endoscope polyp segmentation method based on attention mechanism and multi-scale fusion
CN115457329B (en) Training method of image classification model, image classification method and device
CN116030004A (en) A Surface Defect Detection Method Based on Improved Faster RCNN
CN110992309B (en) Fundus image segmentation method based on deep information transfer network
CN111723852A (en) Robust training method for target detection network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant