CN117409413A - A small-sample semantic segmentation method and system based on background information mining - Google Patents

Info

Publication number
CN117409413A
Authority
CN
China
Prior art keywords
pseudo
class
image
data set
background
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311720688.6A
Other languages
Chinese (zh)
Other versions
CN117409413B (en)
Inventor
刘建明
经卓勋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi Normal University
Original Assignee
Jiangxi Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi Normal University
Priority to CN202311720688.6A
Publication of CN117409413A
Application granted
Publication of CN117409413B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/70: Labelling scene content, e.g. deriving syntactic or semantic representations
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/26: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267: Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/7753: Incorporation of unlabelled data, e.g. multiple instance learning [MIL]
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Image Analysis (AREA)

Abstract

The present invention proposes a small-sample semantic segmentation method and system based on background information mining. The method mines the latent information in the background of a small-sample semantic segmentation task through an offline background labeling algorithm network to obtain a pseudo-class data set, and then applies joint training, which greatly improves the generalization ability and performance of the semantic segmentation model on new classes and greatly alleviates the model's base class bias problem. Specifically, the base class image data set is input into the offline background labeling algorithm network; an unsupervised image segmentation algorithm sub-network and a backbone network yield prototype features, which are clustered to obtain the pseudo-class data set; and the semantic segmentation model is jointly trained on the pseudo-class data set and the base class image data set so that it can perform the segmentation task on the new class image data set.

Description

A small-sample semantic segmentation method and system based on background information mining

Technical field

The invention relates to the field of image recognition, and in particular to a small-sample semantic segmentation method and system based on background information mining.

Background

With the rapid development of the artificial intelligence industry, deep learning is widely used across industries, and its application in computer vision is particularly prominent. Semantic segmentation algorithms play an important role in a series of image recognition tasks such as object detection, image classification, instance segmentation, and pose estimation, and are one of the key research directions in the field of image recognition.

Existing small-sample semantic segmentation methods usually face the "base class bias problem": because a large amount of base class data is used in the training stage, the model segments new class objects less well in the test stage, and when an image contains both new class and base class objects, mis-segmentation easily occurs. In the prior art, some methods address this by updating only part of the values in the model network: the weight matrix of the backbone network is decomposed by singular value decomposition to find the singular values that must be updated, the other weight parameters are frozen, only the singular values are updated, and the updated values are finally transformed back into the weight matrix of the model, which improves the generalization of the model architecture to new classes while updating only a small number of parameters. Other methods add an extra base class learner branch to segment base class objects accurately and correct the final prediction.

However, whether these methods update only some of the model's parameters or segment base class objects accurately through an extra base class learner branch, they still struggle to alleviate the model's base class bias problem.

Summary of the invention

On this basis, the purpose of the present invention is to provide a small-sample semantic segmentation method and system based on background information mining. Base class images are input into an offline background labeling algorithm network, which mines the latent information in the background of the small-sample semantic segmentation task to obtain a pseudo-class data set; joint training on the pseudo-class data set and the original data set then greatly improves the generalization ability and performance of the semantic segmentation model on new classes and greatly alleviates the model's base class bias problem.

The small-sample semantic segmentation method based on background information mining proposed by the present invention includes:

inputting a preset base class image data set into the offline background labeling algorithm network;

obtaining pre-segmented sub-region masks and high-level semantic features of the base class images through an unsupervised image segmentation algorithm sub-network and a backbone network, so as to extract prototype features of the background area within the sub-regions;

clustering the prototype features to divide them into multiple distinct pseudo-classes, and making the pseudo-classes into a pseudo-class data set;

jointly training a semantic segmentation model on the pseudo-class data set and the base class image data set, so that the trained semantic segmentation model performs the segmentation task on a new class image data set.

In summary, with the above method, base class images are input into the offline background labeling algorithm network, the latent information in the background of the small-sample semantic segmentation task is mined to obtain a pseudo-class data set, and joint training on the pseudo-class data set and the original data set greatly improves the generalization ability and performance of the semantic segmentation model on new classes and greatly alleviates the model's base class bias problem. Specifically, the preset base class image data set is input into the offline background labeling algorithm network, and the foreground area and background area of each base class image are set. The pre-segmented sub-region masks and high-level semantic features of the base class images are then obtained through the unsupervised image segmentation algorithm sub-network and the backbone network, so as to extract prototype features of the background area within the sub-regions. The prototype features are clustered into multiple distinct pseudo-classes, and the pseudo-classes are made into a pseudo-class data set. Because the offline background labeling algorithm network expands the original data, the training stage involves not only base class information but also the generated background pseudo-class information, so the model's generalization to new classes improves markedly. Finally, the semantic segmentation model is jointly trained on the pseudo-class data set and the base class image data set, and the trained model performs the segmentation task on the new class image data set.

Further, the step of inputting the preset base class image data set into the offline background labeling algorithm network includes:

selecting the current base class target in the preset base class image data set, setting the base class area in the base class image as the foreground area, and setting the non-base class area in the base class image as the background area.
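The foreground/background assignment described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: it assumes a per-pixel integer label map, and the function name and the toy class IDs are hypothetical. It also assumes that every pixel outside the current base class target, including pixels of any other annotated class, counts as background.

```python
import numpy as np

def split_foreground_background(label_map, base_class_id):
    """Return boolean (foreground, background) masks for one base class target."""
    label_map = np.asarray(label_map)
    foreground = label_map == base_class_id  # pixels of the current base class
    background = ~foreground                 # assumption: everything else is background
    return foreground, background

# Toy 3x3 label map: 0 = unlabeled, 7 = current base class, 12 = some other class.
toy_labels = np.array([[0, 7, 7],
                       [0, 7, 12],
                       [0, 0, 12]])
fg, bg = split_foreground_background(toy_labels, base_class_id=7)
```

The background mask produced this way is the region that the later steps split into sub-regions and mine for pseudo-classes.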

Further, the step of obtaining the pre-segmented sub-region masks and high-level semantic features of the base class images through the unsupervised image segmentation algorithm sub-network and the backbone network, so as to extract prototype features of the background area within the sub-regions, includes:

scaling the base class images in the preset base class image data set to a preset size threshold for images to be segmented;

pre-segmenting the base class image through the unsupervised image segmentation algorithm sub-network of the offline background labeling algorithm network to obtain multiple pre-segmented sub-region masks;

passing the unsegmented original image of the base class image through the backbone network of the offline background labeling algorithm network and, after an upsampling operation on the unsegmented original image, extracting its high-level semantic features.

Further, the step of obtaining the pre-segmented sub-region masks and high-level semantic features of the base class images through the unsupervised image segmentation algorithm sub-network and the backbone network, so as to extract prototype features of the background area within the sub-regions, also includes:

performing a mask inversion operation on the base class target mask to take out the background area of the current base class target and suppress the foreground area;

computing the Hadamard product from the pre-segmented sub-region masks and the high-level semantic features by the following formula:

In the formula, the symbols denote, in order: the high-level semantic feature, with its height, width, and channel dimension; the operation that broadcasts a mask along the channel dimension; the Hadamard product; the pseudo-masks covered by the mask, together with the number of pre-segmented sub-region masks obtained; the temporary pseudo-mask annotation of the background area of a single image, together with the foreground base class corresponding to the background mask and the number of background prototype features; and the real background mask;

obtaining the prototype features through masked average pooling.
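The Hadamard product with a channel-broadcast mask, followed by masked average pooling, can be illustrated as below. This is a sketch under stated assumptions: features are stored as a (C, H, W) array, the masks have already been resized to the feature map size (the source interpolates them bilinearly), and all function and variable names are hypothetical.

```python
import numpy as np

def masked_average_pooling(features, mask, eps=1e-8):
    """features: (C, H, W); mask: (H, W) in {0, 1}. Returns a (C,) prototype."""
    mask = mask.astype(features.dtype)
    masked = features * mask[None, :, :]  # broadcast along channels, element-wise product
    return masked.sum(axis=(1, 2)) / (mask.sum() + eps)

def background_prototypes(features, region_masks, background_mask):
    """One prototype per pre-segmented region, restricted to the real background."""
    prototypes = []
    for region in region_masks:
        region_bg = region * background_mask  # drop the foreground part of the region
        if region_bg.sum() > 0:
            prototypes.append(masked_average_pooling(features, region_bg))
    if not prototypes:
        return np.empty((0, features.shape[0]), dtype=features.dtype)
    return np.stack(prototypes)

# Toy example: 2 channels on a 2x2 grid, two regions, one foreground pixel at (0, 1).
features = np.array([[[1.0, 2.0], [3.0, 4.0]],
                     [[10.0, 20.0], [30.0, 40.0]]])
region_masks = np.array([[[1, 1], [0, 0]],
                         [[0, 0], [1, 1]]])
background_mask = np.array([[1, 0], [1, 1]])
prototypes = background_prototypes(features, region_masks, background_mask)
```

In the toy example the first region keeps only its single background pixel, so its prototype is that pixel's feature vector, while the second region's prototype is the mean of its two pixels.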

Further, the step of clustering the prototype features to divide them into multiple distinct pseudo-classes and making the pseudo-classes into a pseudo-class data set includes:

annotating the background areas of all base class images to obtain pseudo-class prototype features and pseudo-masks;

clustering all the pseudo-class prototype features with an unsupervised clustering algorithm to divide them into pseudo-classes;

assigning the pre-segmented sub-regions of the background area to their corresponding pseudo-classes and labeling the pseudo-masks with the corresponding pseudo-class labels, so as to produce the pseudo-class data set.
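Once pseudo-class centroids exist, labeling the pseudo-masks amounts to a nearest-centroid lookup. The sketch below assumes cosine similarity as the distance (the embodiment mentions cosine-similarity k-means) and uses a hypothetical ignore value of 255 for pixels that belong to no background sub-region; the names are illustrative only.

```python
import numpy as np

IGNORE = 255  # hypothetical label for pixels outside every background sub-region

def label_pseudo_masks(prototypes, region_masks, centroids, eps=1e-8):
    """Assign each region prototype to its most similar centroid and paint a label map.

    prototypes: (U, C), region_masks: (U, H, W) in {0, 1}, centroids: (K, C).
    """
    p = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + eps)
    c = centroids / (np.linalg.norm(centroids, axis=1, keepdims=True) + eps)
    pseudo_ids = np.argmax(p @ c.T, axis=1)  # highest cosine similarity wins
    label_map = np.full(region_masks.shape[1:], IGNORE, dtype=np.int64)
    for mask, pseudo_id in zip(region_masks, pseudo_ids):
        label_map[mask.astype(bool)] = pseudo_id
    return pseudo_ids, label_map

# Toy example: two centroids, two regions on a 2x2 image.
centroids = np.array([[1.0, 0.0], [0.0, 1.0]])
prototypes = np.array([[0.9, 0.1], [0.2, 0.8]])
region_masks = np.array([[[1, 0], [0, 0]],
                         [[0, 0], [1, 1]]])
pseudo_ids, label_map = label_pseudo_masks(prototypes, region_masks, centroids)
```

The resulting label map plays the role of a pseudo-class annotation for one image; collecting such maps over all base class images gives a pseudo-class data set in the sense used above.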

Further, the step of jointly training the semantic segmentation model on the pseudo-class data set and the base class image data set, so that the trained semantic segmentation model performs the segmentation task on the new class image data set, includes:

inputting the base class data set and the pseudo-class data set separately into the joint training backbone network;

extracting feature maps through the joint training backbone network to obtain a support feature map and a query feature map;

performing multi-scale feature extraction through the feature enrichment module, then comparing and integrating the support feature map and the query feature map;

convolving the integrated feature map and passing it through the classifier to obtain the final prediction result.

Further, after the step of convolving the integrated feature map and passing it through the classifier to obtain the final prediction result, the method also includes:

after the final prediction result is obtained through the classifier, calculating the loss function of the original data composed of the base classes according to the following formula:

then calculating the loss function of the pseudo-class data according to the following formula:

calculating the overall loss function L according to the following formula:

In the above formulas, the symbols denote, in order: the predicted segmentation result of the query image; the spatial position of the corresponding pixel; the ground-truth mask of the query image; the segmentation prediction mask and the pseudo-class mask obtained after the pseudo-class passes through the small-sample segmentation network; and a hyperparameter.
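The formulas themselves are not preserved in this text, so the following is only a plausible sketch of a two-term loss of this shape. It assumes, without confirmation from the source, that each term is a per-pixel binary cross-entropy averaged over spatial positions and that the hyperparameter (called `lam` here, a hypothetical name) weights the pseudo-class term in the overall loss.

```python
import numpy as np

def pixel_cross_entropy(pred_fg_prob, target_mask, eps=1e-8):
    """Mean binary cross-entropy over all pixel positions (assumed loss form)."""
    p = np.clip(pred_fg_prob, eps, 1.0 - eps)
    t = target_mask.astype(np.float64)
    return float(-np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)))

def joint_loss(base_pred, base_gt, pseudo_pred, pseudo_gt, lam=1.0):
    """Overall loss = base class loss + lam * pseudo-class loss (assumed combination)."""
    loss_base = pixel_cross_entropy(base_pred, base_gt)
    loss_pseudo = pixel_cross_entropy(pseudo_pred, pseudo_gt)
    return loss_base + lam * loss_pseudo

# Toy check: an uninformative prediction of 0.5 everywhere costs ln(2) per term.
uniform_pred = np.full((2, 2), 0.5)
toy_gt = np.array([[1, 0], [0, 1]])
total = joint_loss(uniform_pred, toy_gt, uniform_pred, toy_gt, lam=0.5)
```

With both terms equal to ln 2 and `lam=0.5`, the total is 1.5 ln 2, which makes the role of the weighting hyperparameter easy to see.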

The present invention also proposes a small-sample semantic segmentation system based on background information mining, including:

a background mining module, used to input the preset base class image data set into the offline background labeling algorithm network, obtain the pre-segmented sub-region masks and high-level semantic features of the base class images through the unsupervised image segmentation algorithm sub-network and the backbone network so as to extract prototype features of the background area within the sub-regions, cluster the prototype features to divide them into multiple distinct pseudo-classes, and make the pseudo-classes into a pseudo-class data set;

a joint training module, used to jointly train the semantic segmentation model on the pseudo-class data set and the base class image data set, so that the trained semantic segmentation model performs the segmentation task on the new class image data set.

In another aspect, the present invention provides a storage medium storing one or more programs which, when executed, implement the above small-sample semantic segmentation method based on background information mining.

In another aspect, the present invention provides a computer device including a memory and a processor, wherein:

the memory is used to store a computer program;

the processor is used to implement the above small-sample semantic segmentation method based on background information mining when executing the computer program stored in the memory.

Description of the drawings

Figure 1 is a flow chart of the small-sample semantic segmentation method based on background information mining proposed in the first embodiment of the present invention;

Figure 2 is a flow chart of the small-sample semantic segmentation method based on background information mining proposed in the second embodiment of the present invention;

Figure 3 is a schematic structural diagram of the small-sample semantic segmentation system based on background information mining proposed in the third embodiment of the present invention;

Figure 4 is a flow chart of the unsupervised image segmentation algorithm used by the small-sample semantic segmentation method based on background information mining proposed in the first embodiment of the present invention.

The following detailed description further illustrates the present invention with reference to the above drawings.

Detailed description

To facilitate understanding of the present invention, it is described more fully below with reference to the relevant drawings, which show several embodiments of the invention. The invention may, however, be embodied in many different forms and is not limited to the embodiments described herein. Rather, these embodiments are provided so that the disclosure of the invention will be thorough and complete.

It should be noted that when an element is referred to as being "fixed to" another element, it can be directly on the other element or intervening elements may be present. When an element is said to be "connected" to another element, it can be directly connected to the other element or intervening elements may be present. The terms "vertical", "horizontal", "left", "right", and similar expressions are used herein for illustrative purposes only.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field to which the invention belongs. The terminology used in the description of the invention is for the purpose of describing specific embodiments only and is not intended to limit the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.

Please refer to Figure 1, a flow chart of the small-sample semantic segmentation method based on background information mining proposed in the first embodiment of the present invention. The method includes steps S01 to S04, where:

Step S01: input the preset base class image data set into the offline background labeling algorithm network;

It should be noted that this embodiment uses the PASCAL-5i data set, which has 20 categories in total, grouped into sets of 5. During training, this embodiment selects three groups (15 categories) as base class data and uses the remaining group of 5 categories as new classes for testing.
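The base/new class split can be made concrete with a small helper. This is a sketch that assumes the common convention of numbering the 20 categories 1 to 20 and grouping them into four contiguous folds; the source only states that there are 20 categories in groups of 5, and the function name is hypothetical.

```python
def pascal5i_split(fold, num_classes=20, num_folds=4):
    """Return (base_classes, new_classes) for one fold, using contiguous class IDs."""
    if not 0 <= fold < num_folds:
        raise ValueError("fold must be in [0, num_folds)")
    per_fold = num_classes // num_folds
    all_classes = list(range(1, num_classes + 1))
    new_classes = all_classes[fold * per_fold:(fold + 1) * per_fold]
    base_classes = [c for c in all_classes if c not in new_classes]
    return base_classes, new_classes

# Fold 0: classes 1-5 are held out as new classes, the other 15 are base classes.
base_classes, new_classes = pascal5i_split(fold=0)
```

Rotating `fold` from 0 to 3 gives each group of 5 categories one turn as the held-out new classes.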

Step S02: obtain the pre-segmented sub-region masks and high-level semantic features of the base class images through the unsupervised image segmentation algorithm sub-network and the backbone network, so as to extract prototype features of the background area within the sub-regions;

It should be noted that in this embodiment the image is first scaled to the commonly used size of 473×473 and the original image is pre-segmented by an unsupervised segmentation algorithm, with the upper limit on the number of sub-regions set to 10, so that the image is divided into multiple sub-regions. At the same time, the original image is passed through the backbone network and upsampled to the same size to extract its high-level semantic features. Mask inversion takes out the background area of the current category and suppresses the foreground area of the feature map, after which the Hadamard product of the pre-segmented sub-region masks and the image feature map is computed.

The unsupervised pre-segmentation method in this embodiment uses a back-propagation-based unsupervised image segmentation algorithm. The scaled original image is obtained first, and the foreground mask is inverted to obtain the background area of the image. A graph-based segmentation algorithm then pre-segments the image, replacing the Mask-SLIC algorithm of the original back-propagation unsupervised segmentation method, to generate segmented regions and their labels. The specific steps of the algorithm are shown in Figure 4. The input is the set of pixels of the original image (N pixels in total). The unsupervised algorithm yields a number of prototypes and the pixel set corresponding to each superpixel. The image is then fed to a convolutional neural network for t iterations; the network takes the image as input and generates an image feature map, described by its height, width, and channel dimension. For each pixel, the channel in which the feature map attains its maximum value gives the pixel's label. For each superpixel's pixel set, the most frequent label is found and assigned to every pixel of the set. A Softmax loss is then used to compute the model loss so that the network output approaches the pre-segmentation result, and the segmentation prediction for each pixel is finally obtained. In other words, under the pre-classification algorithm the method assigns the same semantic label to small regions with the same semantic information, then uses the neural network to classify the input image so that the network's output is as close as possible to the pre-classification result of the image segmentation algorithm, and finally merges small blocks with the same semantic information, consistent with the pre-classification result, to obtain the final segmentation.

The temporary pseudo-mask annotation of the background area of a single image, obtained by the unsupervised segmentation algorithm, is indexed by the foreground base class corresponding to the background pseudo-mask and by the number of background feature prototypes. Inverting the real foreground mask of the corresponding base class in the original data gives the real background mask. The backbone network yields the high-level semantic features of the original image, described by the width, height, and channel dimension of the high-level semantic feature map. The masks are then bilinearly interpolated in turn to the feature map size and broadcast along the channel dimension, which changes their dimensionality, and the Hadamard product with the high-level semantic features is taken according to the following formula:

In the formula, the mask is broadcast along the channel dimension and the Hadamard product yields a set of pseudo-masks covered by the mask, whose count equals the number of pre-segmented pseudo-masks obtained.
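The label-refinement step of the unsupervised pre-segmentation (per-pixel argmax over channels, then a majority vote inside each pre-segmented region) can be sketched as follows. This covers only the refinement step, not the network training loop; the array layout (C, H, W) and all names are assumptions made for illustration.

```python
import numpy as np

def refine_labels_by_region(response, region_ids):
    """response: (C, H, W) network output; region_ids: (H, W) pre-segmentation IDs.

    Returns the raw per-pixel labels and the labels after the majority vote.
    """
    pixel_labels = np.argmax(response, axis=0)  # channel with the largest response
    refined = np.empty_like(pixel_labels)
    for region in np.unique(region_ids):
        in_region = region_ids == region
        counts = np.bincount(pixel_labels[in_region], minlength=response.shape[0])
        refined[in_region] = np.argmax(counts)  # most frequent label (ties: lowest ID)
    return pixel_labels, refined

# Toy example: 2 channels on a 2x3 image; the top row is region 0, the bottom row region 1.
channel0 = np.array([[0.9, 0.8, 0.2],
                     [0.1, 0.3, 0.4]])
response = np.stack([channel0, 1.0 - channel0])
region_ids = np.array([[0, 0, 0],
                       [1, 1, 1]])
pixel_labels, refined = refine_labels_by_region(response, region_ids)
```

In the toy example one pixel of the top region initially disagrees with its neighbors and is overwritten by the region's majority label; in the described algorithm the refined labels then serve as the targets for the Softmax loss.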

步骤S03:根据原型特征进行聚类,以划分多个不同的伪类,并将伪类制作成伪类数据集;Step S03: Cluster according to prototype features to divide multiple different pseudo-classes, and make the pseudo-classes into pseudo-class data sets;

It should be noted that, in this embodiment, the background area of each image is annotated in turn. Each image yields U pseudo-class prototypes and masks, where U is generally set to 5, and these are saved. Once annotation is complete, all the prototype vectors obtained are clustered again, so that every vector is assigned to one of a fixed number of pseudo-classes. The background sub-regions of each image are then classified into these pseudo-classes according to the earlier annotation, and each pseudo-mask is given the corresponding pseudo-class label, which yields the final pseudo-class data set. Specifically, after the mask-covered features are obtained, masked average pooling is applied to them to obtain the set of background prototypes of one image. The same process is then run over the entire data set to obtain the prototype set of all images. A k-means algorithm based on cosine similarity is run on this set until it converges to the chosen number of pseudo-classes, generally 100, and the background category of each original image is re-labeled accordingly.
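The re-clustering step can be sketched as a k-means that uses cosine similarity, sometimes called spherical k-means. This is a minimal sketch under assumptions: the source does not give the initialization, stopping rule or handling of empty clusters, so random initialization from the data, a fixed iteration cap and "leave an empty cluster unchanged" are illustrative choices, and `cosine_kmeans` is a hypothetical name.

```python
import numpy as np


def cosine_kmeans(prototypes, k, iters=50, seed=0):
    """Cluster prototype vectors by cosine similarity.

    prototypes: (N, C) array of background prototypes from all images.
    Returns (assign, centers): a pseudo-class index per prototype and the
    unit-norm cluster centers.
    """
    rng = np.random.default_rng(seed)
    x = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-12)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    assign = np.full(len(x), -1)
    for _ in range(iters):
        new_assign = (x @ centers.T).argmax(axis=1)  # most similar center
        if np.array_equal(new_assign, assign):
            break  # assignments stopped changing
        assign = new_assign
        for j in range(k):
            members = x[assign == j]
            if len(members):  # an empty cluster keeps its old center
                mean = members.mean(axis=0)
                centers[j] = mean / (np.linalg.norm(mean) + 1e-12)
    return assign, centers
```

With the values mentioned in the text, `prototypes` would hold about 5 vectors per image and `k` would be 100; the returned `assign` is what each pseudo-mask would be labeled with.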

Step S04: Jointly train the semantic segmentation model on the pseudo-class data set and the base class image data set, so that the trained semantic segmentation model can perform the segmentation task on the new class image data set;

It should be noted that, in this embodiment, the overall data is first divided into a base class data set and a new class data set, where the base class data is used for training and the new class data is used for testing. A support image set and a query image set are then extracted. This embodiment adopts the "1-Way-1-Shot" setting, in which the support image set contains a single image drawn from a single category; the query image set is used to test model performance in the testing phase. This embodiment uses the PFENet framework. Feature maps are extracted from the middle layers of the backbone network; the support feature map and the query feature map are compared, and the result is convolved and passed through the final classifier to obtain the final result. Alongside the mid-level features, the high-level semantic features are used to compute a prior mask that serves as a probability prediction map of the foreground. A feature enrichment module (FEM) links the aggregated features across scales and extracts information from each scale, and the classifier then produces the final prediction, which improves the overall segmentation capability of the framework.
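The prior mask mentioned above can be illustrated with a small NumPy sketch. It assumes the prior is formed as commonly described for PFENet: for each query pixel, the maximum cosine similarity to the masked support pixels is taken and the result is min-max normalized. The text here does not spell out this computation, so the details below, including the function name `prior_mask`, are assumptions rather than the embodiment's exact procedure.

```python
import numpy as np


def prior_mask(query_feat, support_feat, support_mask, eps=1e-7):
    """Foreground probability map for the query image.

    query_feat, support_feat: (H, W, C) high-level semantic features.
    support_mask:             (H, W) binary mask of the support target.
    """
    h, w, c = query_feat.shape
    q = query_feat.reshape(-1, c)
    s = (support_feat * support_mask[..., None]).reshape(-1, c)  # mask the support
    q = q / (np.linalg.norm(q, axis=1, keepdims=True) + eps)
    s = s / (np.linalg.norm(s, axis=1, keepdims=True) + eps)
    similarity = q @ s.T                # cosine similarity, (H*W, H*W)
    best = similarity.max(axis=1)       # best-matching support pixel
    prior = (best - best.min()) / (best.max() - best.min() + eps)  # min-max scaling
    return prior.reshape(h, w)
```

Query pixels whose features resemble some foreground support pixel receive values near 1, and the map can then be concatenated with the mid-level features before the feature enrichment stage.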

In summary, in the above small-sample semantic segmentation method based on background information mining, base class images are input into the offline background annotation algorithm network to mine the latent information in the background of the small-sample semantic segmentation task and obtain a pseudo-class data set. Specifically, the preset base class image data set is input into the offline background annotation algorithm network, and the foreground and background areas of each base class image are set. The unsupervised image segmentation algorithm sub-network and the backbone network then provide the pre-segmented sub-region masks and high-level semantic features of the base class image, from which the prototype features of the background area in each sub-region are extracted. The prototype features are clustered to divide them into multiple different pseudo-classes, and the pseudo-classes are made into a pseudo-class data set. Because the offline background annotation algorithm network augments the original data, the training phase involves the generated background pseudo-class information as well as the base class information. Jointly training the semantic segmentation model on the pseudo-class data set and the base class image data set, and then using the trained model for the segmentation task on the new class image data set, therefore greatly improves the generalization ability and performance of the model on new classes and greatly alleviates its base class bias.

Please refer to Figure 2, which is a flow chart of the small-sample semantic segmentation method based on background information mining proposed in the second embodiment of the present invention. The method includes steps S11 to S15, where:

Step S11: Select the current base class target in the preset base class image data set, and set the foreground area and the background area;

Step S12: Scale the base class image and pre-segment it through the unsupervised image segmentation algorithm sub-network to obtain multiple pre-segmented sub-region masks, then up-sample the unsegmented original image of the base class image and extract high-level semantic features through the backbone network in the offline background annotation algorithm network;

Step S13: Compute the Hadamard product of the pre-segmented sub-region masks and the high-level semantic features, then obtain the prototype features through masked average pooling;

Step S14: Annotate the background areas of all base class images to obtain pseudo-class prototype features and pseudo-masks, then cluster them with an unsupervised clustering algorithm to divide the pseudo-classes and make the pseudo-class data set;

Step S15: Perform joint training on the base class data set and the pseudo-class data set, so that the trained semantic segmentation model can perform the segmentation task on the new class image data set;
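The joint training in Step S15 combines a loss on the base class data with a loss on the pseudo-class data. The exact loss formulas are not recoverable from this text, so the sketch below makes two explicit assumptions: each term is a pixel-wise binary cross-entropy, and the overall loss is the base class term plus a hyperparameter-weighted pseudo-class term. The names `pixel_cross_entropy`, `joint_loss` and `lam` are hypothetical.

```python
import numpy as np


def pixel_cross_entropy(prob_fg, target, eps=1e-7):
    """Binary cross-entropy averaged over all pixel positions.

    prob_fg: (H, W) predicted foreground probability of the query image.
    target:  (H, W) binary mask (ground truth or pseudo-class mask).
    """
    p = np.clip(prob_fg, eps, 1.0 - eps)  # avoid log(0)
    return float(-(target * np.log(p) + (1 - target) * np.log(1 - p)).mean())


def joint_loss(pred_base, gt_base, pred_pseudo, pseudo_mask, lam=1.0):
    """Assumed form: L = L_base + lam * L_pseudo."""
    loss_base = pixel_cross_entropy(pred_base, gt_base)            # base class episode
    loss_pseudo = pixel_cross_entropy(pred_pseudo, pseudo_mask)    # pseudo-class episode
    return loss_base + lam * loss_pseudo
```

Setting `lam` to 0 recovers training on the base classes alone, which makes the weight a convenient knob for checking how much the pseudo-class episodes contribute.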

It should be noted that the Offline Background Annotation Algorithm (OBAA) proposed by the present invention is applied to three small-sample segmentation frameworks: PFENet, BAM, and MSANet, an improved model based on BAM. Under the same settings and with different base class splits, the performance with and without the method of the present invention compares as follows:

Table 1

Table 2

The results in the tables above show that, under both settings, every model that uses the background mining algorithm for data augmentation improves in mean intersection-over-union (mIoU). As Table 1 shows, in the "1-Way-1-Shot" setting, PFENet with VGG-16 as the backbone network improves its mIoU by 1.2% over the original model, and by 0.78% with ResNet as the backbone network. With the OBAA algorithm, BAM and MSANet also improve clearly in both mIoU and foreground-background IoU (FB-IoU) on the PASCAL-5i data set, with mIoU gains of 0.54% and 0.6% respectively. As Table 2 shows, in the "1-Way-5-Shot" setting, BAM with the OBAA algorithm also improves on the PASCAL-5i data set, by 0.48% in mIoU and by 1.06% in FB-IoU. Tests were also run on the COCO-20i data set in the "1-Way-1-Shot" setting, with the following performance comparison:

Table 3

The COCO-20i data set is harder to segment overall than the PASCAL-5i data set. As Table 3 shows, the proposed method improves on all four splits of the COCO-20i data set, and the overall mean intersection-over-union rises by 1.67%, an even more pronounced gain.
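For readers unfamiliar with the two metrics reported above, the following sketch shows one common way to compute them over a set of test episodes. It assumes the usual small-sample segmentation convention (intersections and unions accumulated per class for the mean IoU, and foreground and background IoU averaged regardless of class for the foreground-background IoU); the evaluation code actually used for the tables is not given in this text, and the function names are illustrative.

```python
import numpy as np


def mean_iou(episodes):
    """Average of per-class foreground IoU.

    episodes: list of (class_id, pred, gt) with boolean (H, W) masks.
    """
    inter, union = {}, {}
    for cls, pred, gt in episodes:
        inter[cls] = inter.get(cls, 0) + int(np.logical_and(pred, gt).sum())
        union[cls] = union.get(cls, 0) + int(np.logical_or(pred, gt).sum())
    per_class = [inter[c] / union[c] for c in inter if union[c] > 0]
    return sum(per_class) / len(per_class)


def fb_iou(episodes):
    """Mean of foreground IoU and background IoU, ignoring class ids."""
    totals = {"fg": [0, 0], "bg": [0, 0]}  # [intersection, union]
    for _, pred, gt in episodes:
        for name, p, g in (("fg", pred, gt), ("bg", ~pred, ~gt)):
            totals[name][0] += int(np.logical_and(p, g).sum())
            totals[name][1] += int(np.logical_or(p, g).sum())
    return 0.5 * sum(i / u for i, u in totals.values())
```

Both functions expect boolean masks, since the background is obtained with the `~` operator, and they do not guard against a data set with no foreground or no background pixels at all.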

In summary, as in the first embodiment, the small-sample semantic segmentation method based on background information mining inputs the base class images into the offline background annotation algorithm network, mines the latent information in the background to obtain a pseudo-class data set, and jointly trains the semantic segmentation model on the pseudo-class data set and the base class image data set. Because training involves the generated background pseudo-class information as well as the base class information, the generalization ability and performance of the model on new classes are greatly improved and its base class bias is greatly alleviated.

Please refer to Figure 3, which is a schematic structural diagram of the small-sample semantic segmentation system based on background information mining proposed in the third embodiment of the present invention. The system includes:

A background mining module 10, used to input the preset base class image data set into the offline background annotation algorithm network, obtain the pre-segmented sub-region masks and high-level semantic features of the base class image through the unsupervised image segmentation algorithm sub-network and the backbone network so as to extract the prototype features of the background area in the sub-regions, cluster the prototype features to divide them into multiple different pseudo-classes, and make the pseudo-classes into a pseudo-class data set;

A joint training module 20, used to jointly train the semantic segmentation model on the pseudo-class data set and the base class image data set, so that the trained semantic segmentation model can perform the segmentation task on the new class image data set.

Further, the background mining module 10 includes:

A feature extraction unit 101, used to input the preset base class image data set into the offline background annotation algorithm network, and obtain the pre-segmented sub-region masks and high-level semantic features of the base class image through the unsupervised image segmentation algorithm sub-network and the backbone network so as to extract the prototype features of the background area in the sub-regions;

A pseudo-class dividing unit 102, used to cluster the prototype features to divide them into multiple different pseudo-classes, and make the pseudo-classes into a pseudo-class data set.

Further, the joint training module 20 includes:

A joint training unit 201, used to jointly train the semantic segmentation model on the pseudo-class data set and the base class image data set, so that the trained semantic segmentation model can perform the segmentation task on the new class image data set.
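The module structure above can be pictured as two small components wired together. The sketch below is only a structural illustration: the class names mirror modules 10 and 20, while the callables standing in for units 101, 102 and 201, and the `run` methods, are hypothetical placeholders for the actual networks.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple


@dataclass
class BackgroundMiningModule:
    """Counterpart of module 10, built from units 101 and 102."""

    extract_prototypes: Callable[[Any], List[Any]]           # unit 101 (placeholder)
    divide_pseudo_classes: Callable[[List[Any]], List[int]]  # unit 102 (placeholder)

    def run(self, base_images: Sequence[Any]) -> List[Tuple[int, int]]:
        prototypes, owners = [], []
        for index, image in enumerate(base_images):
            for prototype in self.extract_prototypes(image):
                prototypes.append(prototype)
                owners.append(index)  # remember which image each prototype came from
        labels = self.divide_pseudo_classes(prototypes)
        return list(zip(owners, labels))  # (image index, pseudo-class label)


@dataclass
class JointTrainingModule:
    """Counterpart of module 20, built from unit 201."""

    train: Callable[[Sequence[Any], Sequence[Any]], Any]  # unit 201 (placeholder)

    def run(self, base_dataset: Sequence[Any], pseudo_dataset: Sequence[Any]) -> Any:
        return self.train(base_dataset, pseudo_dataset)
```

Keeping the mining step separate from training reflects the "offline" nature of the annotation: the pseudo-class data set is produced once and can then be reused by different segmentation frameworks.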

In another aspect, the present invention also provides a computer storage medium storing one or more programs which, when executed by a processor, implement the above small-sample semantic segmentation method based on background information mining.

In another aspect, the present invention also provides a computer device comprising a memory and a processor, wherein the memory stores a computer program and the processor executes the computer program stored in the memory to implement the above small-sample semantic segmentation method based on background information mining.

Those skilled in the art will understand that the logic and/or steps represented in the flow charts or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transport a program for use by, or in connection with, an instruction execution system, apparatus or device.

More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection with one or more wires (electronic device), a portable computer disk cartridge (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that the various parts of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following technologies known in the art: discrete logic circuits with logic gates for implementing logical functions on data signals, application-specific integrated circuits with suitable combinational logic gates, programmable gate arrays (PGA), field programmable gate arrays (FPGA), and so on.

In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "specific examples" or "some examples" means that the specific features, structures, materials or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present invention. In this specification, such references do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.

The above embodiments express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not be construed as limiting the patent scope of the present invention. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. The protection scope of this patent shall therefore be determined by the appended claims.

Claims (10)

1. A small-sample semantic segmentation method based on background information mining, characterized by comprising the following steps:
inputting a preset base class image data set into an offline background annotation algorithm network;
acquiring pre-segmented sub-region masks and high-level semantic features of the base class image through an unsupervised image segmentation algorithm sub-network and a backbone network, so as to extract prototype features of the background area in the sub-regions;
clustering according to the prototype features to divide a plurality of different pseudo-classes, and making the pseudo-classes into a pseudo-class data set;
and jointly training a semantic segmentation model according to the pseudo-class data set and the base class image data set, so as to perform the segmentation task on a new class image data set through the trained semantic segmentation model.
2. The small-sample semantic segmentation method based on background information mining according to claim 1, wherein the step of inputting a preset base class image data set into an offline background annotation algorithm network comprises:
selecting a current base class target in the preset base class image data set, setting the base class region in the base class image as the foreground area, and setting the non-base-class region in the base class image as the background area.
3. The small-sample semantic segmentation method based on background information mining according to claim 1, wherein the step of acquiring pre-segmented sub-region masks and high-level semantic features of the base class image through an unsupervised image segmentation algorithm sub-network and a backbone network, so as to extract prototype features of the background area in the sub-regions, comprises:
scaling the base class images in the preset base class image data set to a preset size threshold for images to be segmented;
pre-segmenting the base class image through the unsupervised image segmentation algorithm sub-network in the offline background annotation algorithm network to obtain a plurality of pre-segmented sub-region masks;
and performing an up-sampling operation on the unsegmented original image of the base class image, and extracting its high-level semantic features through the backbone network in the offline background annotation algorithm network.
4. The small-sample semantic segmentation method based on background information mining according to claim 1, wherein the step of acquiring pre-segmented sub-region masks and high-level semantic features of the base class image through an unsupervised image segmentation algorithm sub-network and a backbone network, so as to extract prototype features of the background area in the sub-regions, further comprises:
inverting the base class target mask, so as to take out the background area of the current base class target and suppress the foreground area;
computing the Hadamard product of the pre-segmented sub-region masks and the high-level semantic features according to the following formula:
wherein the terms of the formula are: the high-level semantic features, with their height, width and channel dimensions; the operation of broadcasting the mask along the channel; the Hadamard product; the mask-covered pseudo-masks, whose number equals the number of pre-segmented sub-region masks obtained; the temporary pseudo-mask annotation of the background area of a single image, indexed by the foreground base class and by the number of background prototype features corresponding to the background mask; and the real background mask;
and obtaining the prototype features through masked average pooling.
5. The small-sample semantic segmentation method based on background information mining according to claim 1, wherein the step of clustering according to the prototype features to divide a plurality of different pseudo-classes, and making the pseudo-classes into a pseudo-class data set, comprises:
annotating the background areas of all base class images to obtain pseudo-class prototype features and pseudo-masks;
clustering all the pseudo-class prototype features through an unsupervised clustering algorithm to divide the pseudo-classes;
and classifying the pre-segmented sub-regions of the background area into the corresponding pseudo-classes, and labeling the pseudo-masks with the corresponding pseudo-class labels, so as to make the pseudo-class data set.
6. The small-sample semantic segmentation method based on background information mining according to claim 1, wherein the step of jointly training a semantic segmentation model according to the pseudo-class data set and the base class image data set, so as to perform the segmentation task on a new class image data set through the trained semantic segmentation model, comprises:
inputting the base class data set and the pseudo-class data set respectively into a joint training backbone network;
extracting feature maps through the joint training backbone network to obtain a support feature map and a query feature map;
performing multi-scale feature extraction through a feature enrichment module, and then comparing and integrating the support feature map and the query feature map;
and convolving the integrated feature map and obtaining a final prediction result through a classifier.
7. The small-sample semantic segmentation method based on background information mining according to claim 6, wherein, after the step of convolving the integrated feature map and obtaining a final prediction result through a classifier, the method further comprises:
after the final prediction result is obtained through the classifier, calculating the loss function of the original data composed of the base classes according to the following formula;
then calculating the loss function of the pseudo-class data according to the following formula;
and calculating the overall loss function L according to the following formula;
in the above formulas, the terms are: the predicted segmentation result of the query image; the spatial position of the corresponding pixel; the ground-truth mask of the query image; the segmentation prediction mask and the pseudo-class mask obtained after the pseudo-class passes through the small-sample segmentation network; and a hyperparameter.
8. A small-sample semantic segmentation system based on background information mining, comprising:
a background mining module, used for inputting a preset base class image data set into an offline background annotation algorithm network, acquiring pre-segmented sub-region masks and high-level semantic features of the base class image through an unsupervised image segmentation algorithm sub-network and a backbone network so as to extract prototype features of the background area in the sub-regions, clustering according to the prototype features to divide a plurality of different pseudo-classes, and making the pseudo-classes into a pseudo-class data set;
and a joint training module, used for jointly training a semantic segmentation model according to the pseudo-class data set and the base class image data set, so as to perform the segmentation task on a new class image data set through the trained semantic segmentation model.
9. A storage medium, characterized in that the storage medium stores one or more programs which, when executed by a processor, implement the small-sample semantic segmentation method based on background information mining of any one of claims 1-7.
10. A computer device comprising a memory and a processor, wherein:
the memory is used for storing a computer program;
the processor is configured to implement the small sample semantic segmentation method based on background information mining of any one of claims 1-7 when executing a computer program stored on the memory.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311720688.6A CN117409413B (en) 2023-12-14 2023-12-14 A small sample semantic segmentation method and system based on background information mining

Publications (2)

Publication Number Publication Date
CN117409413A true CN117409413A (en) 2024-01-16
CN117409413B CN117409413B (en) 2024-04-05

Family

ID=89498355

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311720688.6A Active CN117409413B (en) 2023-12-14 2023-12-14 A small sample semantic segmentation method and system based on background information mining

Country Status (1)

Country Link
CN (1) CN117409413B (en)


* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569865A (en) * 2021-09-27 2021-10-29 南京码极客科技有限公司 Single sample image segmentation method based on class prototype learning
CN114821045A (en) * 2022-03-23 2022-07-29 腾讯科技(深圳)有限公司 Semantic segmentation method and device, electronic equipment and storage medium
US20220405933A1 (en) * 2021-06-18 2022-12-22 Arizona Board Of Regents On Behalf Of Arizona State University Systems, methods, and apparatuses for implementing annotation-efficient deep learning models utilizing sparsely-annotated or annotation-free training
CN115546474A (en) * 2022-06-25 2022-12-30 西北工业大学 Few-sample semantic segmentation method based on learner integration strategy
US20230222643A1 (en) * 2022-01-11 2023-07-13 Bentley Systems, Incorporated Semantic deep learning and rule optimization for surface corrosion detection and evaluation
CN116993978A (en) * 2023-07-20 2023-11-03 江西师范大学 Small sample segmentation method, system, readable storage medium and computer device
CN117095163A (en) * 2023-07-28 2023-11-21 中国科学院自动化研究所 Small sample image semantic segmentation method and device based on meta alignment and meta mask

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Yao Huang et al.: "Self-Reinforcing For Few-shot Medical Image Segmentation", 2023 IEEE International Conference on Image Processing, 11 September 2023 (2023-09-11), pages 655-659 *
Chen Qiong et al.: "A Survey of Few-Shot Image Semantic Segmentation", Frontiers of Data and Computing, vol. 3, no. 6, 31 December 2021 (2021-12-31), pages 17-34 *
Qing Chen et al.: "Research Progress on Image Semantic Segmentation with Deep Convolutional Neural Networks", Journal of Image and Graphics, vol. 25, no. 6, 16 June 2020 (2020-06-16), pages 22 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118015282A (en) * 2024-03-19 2024-05-10 Harbin Institute of Technology (Weihai) Weakly supervised semantic segmentation method based on background prior
CN119339084A (en) * 2024-12-13 2025-01-21 Huaqiao University Cable image segmentation method and device based on block category coding
CN119339084B (en) * 2024-12-13 2025-03-25 Huaqiao University Cable image segmentation method and device based on block category coding

Also Published As

Publication number Publication date
CN117409413B (en) 2024-04-05

Similar Documents

Publication Publication Date Title
WO2022000426A1 (en) Method and system for segmenting moving target on basis of twin deep neural network
CN112001385B (en) A target cross-domain detection and understanding method, system, equipment and storage medium
Kumar et al. A visual-numeric approach to clustering and anomaly detection for trajectory data
CN104008390B (en) Method and apparatus for detecting abnormal motion
CN117409413B (en) A small sample semantic segmentation method and system based on background information mining
CN107103326A (en) The collaboration conspicuousness detection method clustered based on super-pixel
CN112766170B (en) Self-adaptive segmentation detection method and device based on cluster unmanned aerial vehicle image
Arulananth et al. Semantic segmentation of urban environments: Leveraging U-Net deep learning model for cityscape image analysis
US20230095533A1 (en) Enriched and discriminative convolutional neural network features for pedestrian re-identification and trajectory modeling
CN113221770A (en) Cross-domain pedestrian re-identification method and system based on multi-feature hybrid learning
CN112053358A (en) Method, device and equipment for determining instance type of pixel in image and storage medium
CN114170570B (en) A pedestrian detection method and system suitable for crowded scenes
CN113223037A (en) Unsupervised semantic segmentation method and unsupervised semantic segmentation system for large-scale data
WO2021114688A1 (en) Video processing method and apparatus based on deep learning
CN108647703B (en) A Type Judgment Method of Saliency-Based Classified Image Library
CN110852327A (en) Image processing method, device, electronic device and storage medium
Hoang et al. Improving traffic signs recognition based region proposal and deep neural networks
CN113822134B (en) A video-based instance tracking method, device, equipment and storage medium
Das et al. Object detection on scene images: a novel approach
CN112785601B (en) Image segmentation method, system, medium and electronic terminal
Tabib et al. Categorization and selection of crowdsourced images towards 3d reconstruction of heritage sites
Nagaraja et al. Hierarchy of localized random forests for video annotation
CN115115691A (en) Monocular three-dimensional plane recovery method, equipment and storage medium
CN114882372A (en) Target detection method and device
Laptev et al. Integrating traditional machine learning and neural networks for image processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant