
CN112733756B - Remote sensing image semantic segmentation method based on W divergence countermeasure network - Google Patents

Info

Publication number: CN112733756B
Application number: CN202110053047.4A
Authority: CN (China)
Prior art keywords: remote sensing, network, generator, sensing image, divergence
Other versions: CN112733756A (application publication)
Other languages: Chinese (zh)
Inventors: 刘昶 (Liu Chang), 曹峡 (Cao Xia), 赵卫东 (Zhao Weidong), 鄢涛 (Yan Tao), 刘永红 (Liu Yonghong)
Current and original assignee: Chengdu University
Application filed by Chengdu University; priority to CN202110053047.4A
Legal status: Active (granted)

Classifications

    • G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING > G06V 20/00 Scenes; Scene-specific elements > G06V 20/10 Terrestrial scenes > G06V 20/13 Satellite images
    • G06F ELECTRIC DIGITAL DATA PROCESSING > G06F 18/00 Pattern recognition > G06F 18/20 Analysing > G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation > G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F ELECTRIC DIGITAL DATA PROCESSING > G06F 18/00 Pattern recognition > G06F 18/20 Analysing > G06F 18/22 Matching criteria, e.g. proximity measures
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N 3/00 Computing arrangements based on biological models > G06N 3/02 Neural networks > G06N 3/08 Learning methods > G06N 3/084 Backpropagation, e.g. using gradient descent
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING > G06V 10/00 Arrangements for image or video recognition or understanding > G06V 10/20 Image preprocessing > G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion > G06V 10/267 Segmentation by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING > G06V 10/00 Arrangements for image or video recognition or understanding > G06V 10/40 Extraction of image or video features


Abstract

The invention relates to a remote sensing image semantic segmentation method based on a W-divergence adversarial network, which introduces an adversarial training mechanism on top of the traditional U-net architecture to address the problem that conventional convolutional neural networks rely excessively on single-pixel accuracy losses and ignore contextual relationships. By improving the network structure of the model and adding a deconvolution layer, the invention restores the spatial resolution of the predicted target distribution; by introducing the Wasserstein divergence, it resolves the problem of the spatially discontinuous joint distribution of data from different sources and guarantees the continuity of the training gradient, thereby improving both the segmentation accuracy of the model and the stability of model training. All training and debugging work of the proposed semantic segmentation method is completed during the training stage, and after training the model can be used for fast prediction on new online samples.

Description

A remote sensing image semantic segmentation method based on a W-divergence adversarial network

Technical Field

The present invention relates to the field of image processing, and in particular to a remote sensing image semantic segmentation method based on a W-divergence (Wasserstein divergence) adversarial network.

Background Art

The generative adversarial network (GAN) is a recent class of deep learning model that performs well in applications such as natural image synthesis, image-to-image translation and style transfer. The GAN framework consists of a generator network and a discriminator network: the generator tries to fool the discriminator with forged data, while the discriminator must improve its own discrimination ability in order to separate the generated fake data from the real data. Through this adversarial training, the generator and the discriminator learn the true distribution of the sample data. Because GANs fit data distributions well, many studies have tried to introduce the GAN approach into traditional CNN semantic segmentation networks. These methods take the original CNN segmentation network as the generator of the adversarial network and set up a second, convolutional discriminator that compares the predicted segmentation map produced by the generator with the ground-truth label and returns the probability that the predicted segmentation map is real; this probability is used as the loss for adversarial training of the generator and the discriminator. Since the labels in a semantic segmentation task are discrete while the prediction maps produced by the generator are usually continuous, this structural difference makes the original GAN model perform poorly when fitting the mapping between continuous segmentation results and discrete labels: the JS divergence it uses suffers from vanishing gradients when the two distributions are orthogonal or overlap little, so that training cannot proceed normally.

Recently, Pix2pix combined the CGAN (Conditional GAN) framework with a U-net network to translate urban remote sensing images into maps. Unfortunately, on semantic segmentation evaluation metrics, the scores obtained by Pix2pix with the GAN framework are inferior to those of the basic U-net network.

The existing remote sensing image segmentation technology has the following deficiencies:

1. The original GAN model, which uses the JS divergence as its loss function, requires the learning abilities of the generator and the discriminator to be carefully balanced during training, otherwise one side suppresses the other and the model collapses. Current GAN segmentation methods such as Pix2pix still use the original GAN framework; when the generator performs a complex semantic segmentation task, the balance between generator and discriminator becomes even more delicate, which makes the model difficult to train stably. In addition, the JS divergence often suffers from vanishing gradients when dealing with the discrete labels of semantic segmentation tasks, so that training cannot proceed normally.

2. The discriminator of Pix2pix lacks the necessary up-sampling layer, which lowers the spatial resolution of the predicted distribution map and thereby loses location information of the segmentation targets. In addition, the discriminator loss uses a probability score to measure the realism of the predicted distribution map, which distorts the data to some extent and reduces segmentation accuracy.

Therefore, how to improve the segmentation accuracy of remote sensing images is an urgent problem in the field of remote sensing image processing.

Summary of the Invention

In view of the deficiencies of the prior art, the present invention proposes a remote sensing image semantic segmentation method based on a Wasserstein-divergence adversarial network, the method comprising:

Step 1: Establish a remote sensing image data set;

Step 2: Establish a remote sensing image semantic segmentation model, the semantic segmentation model comprising a generator network and a discriminator network;

Step 3: Train the remote sensing image semantic segmentation model; Fig. 4 is the training flowchart of the remote sensing image segmentation method of the present invention, and this step specifically comprises:

Step 31: Pair the RGB remote sensing images to be segmented in the training set with the corresponding binary-map labels, input one group of RGB remote sensing images and corresponding binary-map labels at a time into the input layer of the generator network, randomly crop them to the rated input size of the generator convolutional network, and keep the crops position-aligned;

Specifically, the RGB remote sensing image is randomly cropped to the rated input size of the generator network (256×256 by default), and the same operation is applied to the corresponding position of the binary-map label.

Step 32: The generator convolution stack performs feature extraction on the input RGB remote sensing image to be segmented and, once feature extraction is complete, outputs a first predicted segmentation map;

Step 33: Compare the first predicted segmentation map with the binary-map label and compute the L1 loss of the predicted distribution, expressed mathematically as:

$$\mathcal{L}_{L1}(G)=\mathbb{E}_{x,y}\big[\lVert y-G(x)\rVert_{1}\big]$$

where $\mathcal{L}_{L1}(G)$ denotes the L1 distance loss, y denotes the binary-map label, G(x) denotes the predicted segmentation map, and $\mathbb{E}$ denotes the average loss over one iteration.

The input unit of an iteration is a batch, and one batch may contain several images, so the range over which $\mathbb{E}$ is computed may differ between models.

Step 34: Stack the first predicted segmentation map generated by the generator with the corresponding RGB remote sensing image to be segmented, and likewise stack the binary-map label with the RGB remote sensing image to be segmented; input the two groups of stacked data in turn into the discriminator network, complete the discrimination, and output the Wasserstein divergence between the first predicted segmentation map and the binary-map label. The Wasserstein divergence is expressed mathematically as:

$$\mathcal{L}_{W}^{div}=\mathbb{E}_{x,y}\big[D(x,y)\big]-\mathbb{E}_{x}\big[D(x,G(x))\big]+k\,\mathbb{E}_{\hat{u}}\big[\lVert\nabla_{\hat{u}}D(\hat{u})\rVert^{p}\big]$$

where $\mathcal{L}_{W}^{div}$ denotes the Wasserstein divergence loss, x denotes the RGB remote sensing image to be segmented, y denotes the binary-map label, G(·) denotes the generator network output, D(·) denotes the discriminator network output, $\hat{u}$ denotes the samples on which the gradient-norm regularization term is evaluated, and k and p are the coefficient and the exponent of the regularization term, respectively. They are hyper-parameters; this method uses k = 0.001 and p = 3 by default.

Step 35: Take the Wasserstein divergence as the discriminator loss function and, through gradient-descent back-propagation, update the discriminator weight vector once;

Step 36: Negate the generator loss term of the Wasserstein divergence polynomial and form its weighted sum with the L1 loss (default weighting 1:100) as the generator loss function; through gradient-descent back-propagation, update the generator weight vector once. The optimal generator G* is described mathematically as:

$$G^{*}=\arg\min_{G}\max_{D}\ \mathcal{L}_{W}^{div}(G,D)+\lambda\,\mathcal{L}_{L1}(G)$$

where λ denotes the weight parameter.

Step 37: Repeat step 31 to step 36 until every sample of the training set has participated in one round of training;

Step 38: Repeat step 31 to step 37 until the number of iterations reaches the upper limit;

Step 39: Model pre-training is complete; save the pre-trained model.

Step 4: Test the remote sensing image semantic segmentation model.

According to a preferred embodiment, step 1 comprises:

Step 11: Evenly cut the high-resolution remote sensing images and the corresponding binary-map labels into sub-images whose size fits the available computing resources, and filter out samples in which the numbers of target and background pixels are unbalanced;

Step 12: Divide the cut sub-images into a training set and a validation set according to a set ratio, ensuring that the training set and the validation set do not overlap and share the same distribution.

According to a preferred embodiment, step 2 specifically comprises:

Step 21: Establish a generator network based on the U-net architecture, comprising an input layer and a convolution stack; the generator network is used to generate the predicted segmentation map.

Step 22: Establish a discriminator network based on the FCN architecture, comprising a convolution stack; the discriminator network is used for discrimination;

Step 23: Initialize the weight vectors of the generator network and the discriminator network.

According to a preferred embodiment, the method of testing the remote sensing image semantic segmentation model comprises:

Step 41: Input the RGB remote sensing images to be segmented in the validation set into the generator network of the pre-trained model to obtain second pre-segmented images;

Step 42: Input the binary-map labels corresponding to the RGB remote sensing images to be segmented into the generator network, and calculate the segmentation accuracy on the validation set by comparing the differences between the second pre-segmented images and the binary-map labels; the differences can be quantitatively estimated with parameters such as the DICE similarity coefficient and the IoU (intersection over union);

Step 43: If the segmentation accuracy on the validation set does not reach the set value, perform step 2 to step 4 to adjust parameters and improve the segmentation accuracy.

The beneficial effects of the present invention are:

1. The present invention uses the Wasserstein distance in divergence form in place of the JS divergence of traditional GAN methods. On the one hand, this solves the vanishing-gradient problem caused by the JS divergence; on the other hand, it avoids the Lipschitz constraint that the optimization of the traditional Wasserstein distance has to satisfy. It thereby solves the problem of the spatially discontinuous joint distribution of different-source data (the remote sensing images and the binary-map labels) and guarantees the stability of the training gradient.

2. The present invention adds a deconvolution layer after the down-sampling convolution stack of the discriminator. Unlike the JS divergence, which reflects prediction accuracy by returning a probability score or probability distribution map obtained by pooling over many pixels, the Wasserstein distance is computed with real values, so our method does not need to compress the spatial resolution of the discriminator output. The deconvolution layer is therefore added to restore the spatial resolution of the segmented target distribution and reduce the loss of location information.

Brief Description of the Drawings

Fig. 1 is the flowchart of the remote sensing image segmentation method of the present invention;

Fig. 2 is the network structure diagram of the generator of the present invention;

Fig. 3 is the network structure diagram of the discriminator of the present invention;

Fig. 4 is the training flowchart of the remote sensing image segmentation method of the present invention;

Fig. 5 is an effect comparison diagram of one example of the present invention; and

Fig. 6 is an effect comparison diagram of another example of the present invention.

Detailed Description of the Embodiments

To make the purpose, technical solution and advantages of the present invention clearer, the present invention is described in further detail below in combination with specific embodiments and with reference to the accompanying drawings. It should be understood that these descriptions are exemplary only and are not intended to limit the scope of the present invention. Furthermore, in the following description, descriptions of well-known structures and techniques are omitted to avoid unnecessarily obscuring the concepts of the present invention.

A detailed description is given below in conjunction with the accompanying drawings.

The invention mainly addresses the lack of contextual relationships in traditional convolutional neural network (CNN) methods for remote sensing image semantic segmentation. High-resolution remote sensing images contain rich ground information; compared with the semantic segmentation of natural images, the segmentation targets of remote sensing images usually show larger intra-class differences and more severe inter-class interference, which places higher demands on the semantic understanding ability of the neural network. Traditional CNN methods rely too heavily on the classification of individual pixels and ignore the internal integrity of target instances and the relationships between classes, which leads to mis-segmentation and mis-recognition. The traditional PatchGAN obtains contextual consistency inside a pixel block by compressing the spatial resolution, summarizing the pixel values of a block whose diameter is smaller than the segmentation target into one probability score; in practice this is found to blur the segmentation result. After replacing the JS divergence with the Wasserstein divergence, the present invention obtains better results in experiments by using a deconvolution layer to restore the spatial resolution.

In order to introduce higher-order contextual relationships into the CNN network and improve its segmentation performance, the present invention proposes a segmentation method based on a generative adversarial network (GAN) with Wasserstein divergence.

Fig. 1 is the flowchart of the segmentation method of the present invention. The method of the present invention is now described in detail with reference to Fig. 1. The segmentation method of the present invention comprises:

Step 1: Establish a remote sensing image data set, which specifically comprises:

Step 11: Evenly cut the high-resolution remote sensing images and the corresponding binary-map labels into sub-images whose size fits the available computing resources (600×600 by default), and filter out samples in which the numbers of target and background pixels are unbalanced. The purpose of the filtering is to prevent the pixel classification result of the model from being biased toward the class that occupies the larger share of pixels.

Step 12: Divide the cut sub-images into a training set and a validation set according to a set ratio, ensuring that the training set and the validation set do not overlap and share the same distribution. The ratio of training set to validation set is generally 3:1.
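
A minimal data-preparation sketch for Steps 11 and 12 is given below (the helper and parameter names are illustrative assumptions, not taken from the patent): it tiles a large remote sensing image and its binary label into 600×600 sub-images, drops tiles whose target/background pixel ratio is too unbalanced, and splits the remainder roughly 3:1.

import random
import numpy as np

def build_dataset(image, label, tile=600, min_fg=0.05, max_fg=0.95, train_ratio=0.75):
    """image: HxWx3 uint8 array; label: HxW array with 0 = background, 1 = target."""
    h, w = label.shape
    samples = []
    for top in range(0, h - tile + 1, tile):         # even, non-overlapping tiling
        for left in range(0, w - tile + 1, tile):
            img_t = image[top:top + tile, left:left + tile]
            lab_t = label[top:top + tile, left:left + tile]
            fg = float(lab_t.mean())                  # fraction of target pixels
            if min_fg <= fg <= max_fg:                # filter unbalanced samples
                samples.append((img_t, lab_t))
    random.shuffle(samples)
    n_train = int(len(samples) * train_ratio)         # train : val roughly 3 : 1
    return samples[:n_train], samples[n_train:]       # non-overlapping split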

Step 2: Establish a remote sensing image semantic segmentation model, the semantic segmentation model comprising a generator network and a discriminator network, which specifically comprises:

Step 21: Establish a generator network based on the U-net architecture, comprising an input layer and a convolution stack; the generator network is used to generate the predicted segmentation map. Fig. 2 is the network structure diagram of the generator of the present invention, and the specific structure is shown in Fig. 2. The convolution stack consists of convolution layers and the corresponding deconvolution layers executed in sequence; the dotted lines between the convolution layers and the deconvolution layers denote skip-connection layers, whose role is to fuse the features extracted by different convolution layers. Skip connections are currently used in two ways, addition and stacking. The present invention connects the mirrored layers by stacking (channel concatenation), mainly to strengthen spatial alignment and to prevent vanishing gradients.
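
A minimal PyTorch sketch of a U-net-style generator with stacked (concatenated) skip connections follows; the depth, channel widths and activations are illustrative assumptions rather than the exact configuration of the patent.

import torch
import torch.nn as nn

class UNetGenerator(nn.Module):
    def __init__(self, in_ch=3, out_ch=1, base=64):
        super().__init__()
        self.down1 = nn.Sequential(nn.Conv2d(in_ch, base, 4, 2, 1), nn.LeakyReLU(0.2))
        self.down2 = nn.Sequential(nn.Conv2d(base, base * 2, 4, 2, 1),
                                   nn.BatchNorm2d(base * 2), nn.LeakyReLU(0.2))
        self.down3 = nn.Sequential(nn.Conv2d(base * 2, base * 4, 4, 2, 1),
                                   nn.BatchNorm2d(base * 4), nn.LeakyReLU(0.2))
        self.up3 = nn.Sequential(nn.ConvTranspose2d(base * 4, base * 2, 4, 2, 1),
                                 nn.BatchNorm2d(base * 2), nn.ReLU())
        self.up2 = nn.Sequential(nn.ConvTranspose2d(base * 4, base, 4, 2, 1),
                                 nn.BatchNorm2d(base), nn.ReLU())
        self.up1 = nn.Sequential(nn.ConvTranspose2d(base * 2, out_ch, 4, 2, 1), nn.Sigmoid())

    def forward(self, x):
        d1 = self.down1(x)                     # 256 -> 128
        d2 = self.down2(d1)                    # 128 -> 64
        d3 = self.down3(d2)                    # 64  -> 32
        u3 = self.up3(d3)                      # 32  -> 64
        u2 = self.up2(torch.cat([u3, d2], 1))  # skip connection by channel stacking
        return self.up1(torch.cat([u2, d1], 1))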

Step 22: Establish a discriminator network based on the FCN architecture, comprising a convolution stack; the discriminator network is used to compute the Wasserstein divergence loss of the predicted segmentation map generated by the generator. Fig. 3 is the network structure diagram of the discriminator of the present invention; as shown in Fig. 3, the discriminator consists of several convolution layers executed in sequence followed by one deconvolution layer. The present invention adds a deconvolution layer after the down-sampling convolution stack. Unlike the JS divergence, which reflects prediction accuracy through the probability score or probability distribution map of a pooled pixel block, the Wasserstein distance is computed with the real value of each individual pixel, so our method does not need to compress the spatial resolution of the discriminator output; the deconvolution layer is therefore added to restore the spatial resolution of the segmented target distribution and reduce the loss of location information.
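
A rough PyTorch sketch of such an FCN-style discriminator follows: a down-sampling convolution stack followed by a single deconvolution layer, with a real-valued output map and no sigmoid, as the Wasserstein-divergence loss requires. The input has four channels because the RGB image is stacked with the label or the predicted map; depths and widths are assumptions.

import torch
import torch.nn as nn

class WDivDiscriminator(nn.Module):
    def __init__(self, in_ch=4, base=64):     # 3 RGB channels + 1 label/prediction channel
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, base, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(base, base * 2, 4, 2, 1), nn.InstanceNorm2d(base * 2), nn.LeakyReLU(0.2),
            nn.Conv2d(base * 2, base * 4, 4, 2, 1), nn.InstanceNorm2d(base * 4), nn.LeakyReLU(0.2),
        )
        # single deconvolution layer that restores part of the spatial resolution
        self.deconv = nn.ConvTranspose2d(base * 4, 1, 4, 2, 1)

    def forward(self, image, mask):
        x = torch.cat([image, mask], dim=1)    # stack RGB image with label or prediction
        return self.deconv(self.features(x))  # real-valued score map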

Step 23: Initialize the weight vectors of the generator network and the discriminator network;

Step 3: Train the remote sensing image semantic segmentation model; Fig. 4 is the training flowchart of the remote sensing image segmentation method of the present invention, and this step specifically comprises:

Step 31: Pair the RGB remote sensing images to be segmented in the training set with the corresponding binary-map labels, input one group of RGB remote sensing images and corresponding binary-map labels at a time into the input layer of the generator network, randomly crop them to the rated input size of the generator convolutional network, and keep the crops position-aligned;

Specifically, the RGB remote sensing image is randomly cropped to the rated input size of the generator network (256×256 by default), and the same operation is applied to the corresponding position of the binary-map label.
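
A short sketch of this paired random crop follows (the function name is illustrative): the identical crop window is applied to both the image and the label so the pair stays position-aligned.

import random

def paired_random_crop(image, label, size=256):
    """image: HxWx3 array, label: HxW array with the same H and W."""
    h, w = label.shape[:2]
    top = random.randint(0, h - size)
    left = random.randint(0, w - size)
    return (image[top:top + size, left:left + size],
            label[top:top + size, left:left + size])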

Step 32: The generator convolution stack performs feature extraction on the input RGB remote sensing image to be segmented and, once feature extraction is complete, outputs a first predicted segmentation map;

Step 33: Compare the first predicted segmentation map with the binary-map label and compute the L1 loss of the predicted distribution, expressed mathematically as:

$$\mathcal{L}_{L1}(G)=\mathbb{E}_{x,y}\big[\lVert y-G(x)\rVert_{1}\big]$$

where $\mathcal{L}_{L1}(G)$ denotes the L1 distance loss, y denotes the binary-map label, G(x) denotes the predicted segmentation map, and $\mathbb{E}$ denotes the average loss over one iteration.

The input unit of an iteration is a batch, and one batch may contain several images, so the range over which $\mathbb{E}$ is computed may differ between models.
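
A one-line PyTorch sketch of this L1 term for one batch follows; the mean is taken over all pixels of the batch, matching the "average loss per iteration" above.

import torch

def l1_loss(pred, target):
    """pred: generator output G(x), target: binary-map label y, both N x 1 x H x W."""
    return torch.mean(torch.abs(target - pred))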

Step 34: Stack the first predicted segmentation map generated by the generator with the corresponding RGB remote sensing image to be segmented, and likewise stack the binary-map label with the RGB remote sensing image to be segmented; input the two groups of stacked data in turn into the discriminator network, complete the discrimination, and output the Wasserstein divergence between the first predicted segmentation map and the binary-map label. The Wasserstein divergence is expressed mathematically as:

$$\mathcal{L}_{W}^{div}=\mathbb{E}_{x,y}\big[D(x,y)\big]-\mathbb{E}_{x}\big[D(x,G(x))\big]+k\,\mathbb{E}_{\hat{u}}\big[\lVert\nabla_{\hat{u}}D(\hat{u})\rVert^{p}\big]$$

where $\mathcal{L}_{W}^{div}$ denotes the Wasserstein divergence loss, x denotes the RGB remote sensing image to be segmented, y denotes the binary-map label, G(·) denotes the generator network output, D(·) denotes the discriminator network output, $\hat{u}$ denotes the samples on which the gradient-norm regularization term is evaluated, and k and p are the coefficient and the exponent of the regularization term, respectively. They are hyper-parameters; this method uses k = 0.001 and p = 3 by default.
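
A PyTorch sketch of the corresponding discriminator loss follows. The two adversarial terms follow the formula above; for the gradient-norm regularizer, the sketch assumes the common WGAN-div choice of evaluating the gradient of D on the real and generated stacked pairs, which the patent text does not pin down, and it uses the stated defaults k = 0.001 and p = 3.

import torch

def grad_norm(D, samples):
    samples = samples.detach().requires_grad_(True)
    grads = torch.autograd.grad(D(samples).sum(), samples, create_graph=True)[0]
    return grads.flatten(1).norm(2, dim=1)            # per-sample gradient L2 norm

def wdiv_d_loss(D, x, y, g_out, k=0.001, p=3):
    real = torch.cat([x, y], dim=1)                   # stack image with label
    fake = torch.cat([x, g_out.detach()], dim=1)      # stack image with prediction
    adversarial = D(fake).mean() - D(real).mean()     # push real pairs up, fake pairs down
    penalty = k * (grad_norm(D, real).pow(p).mean()
                   + grad_norm(D, fake).pow(p).mean()) / 2
    return adversarial + penalty                      # minimized by the discriminator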

Step 35: Take the Wasserstein divergence as the discriminator loss function and, through gradient-descent back-propagation, update the discriminator weight vector once;

Step 36: Negate the generator loss term of the Wasserstein divergence polynomial and form its weighted sum with the L1 loss (default weighting 1:100) as the generator loss function; through gradient-descent back-propagation, update the generator weight vector once. The optimal generator G* is described mathematically as:

$$G^{*}=\arg\min_{G}\max_{D}\ \mathcal{L}_{W}^{div}(G,D)+\lambda\,\mathcal{L}_{L1}(G)$$

where λ denotes the weight parameter.
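
A sketch of the generator loss follows: the generator term of the Wasserstein divergence is negated and combined with the L1 loss at the stated default weighting of 1:100 (i.e. lambda = 100). It reuses the l1_loss helper sketched above; the function name is an assumption.

import torch

def wdiv_g_loss(D, x, y, g_out, lam=100.0):
    fake = torch.cat([x, g_out], dim=1)      # no detach here: gradients must reach G
    adversarial = -D(fake).mean()            # negated generator term of the divergence
    return adversarial + lam * l1_loss(g_out, y)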

Step 37: Repeat step 31 to step 36 until every sample of the training set has participated in one round of training;

Step 38: Repeat step 31 to step 37 until the number of iterations reaches the upper limit;

Step 39: Model pre-training is complete; save the pre-trained model.
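
A skeleton of the alternating updates of Steps 35 to 39 follows, using the loss sketches above. The optimizer type and its hyper-parameters are assumptions and are not specified by the patent.

import torch

def train(G, D, loader, epochs, device="cuda"):
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
    for _ in range(epochs):                           # Step 38: up to the iteration limit
        for x, y in loader:                           # Step 37: every training sample once
            x, y = x.to(device), y.to(device)
            g_out = G(x)
            opt_d.zero_grad()                         # Step 35: one discriminator update
            wdiv_d_loss(D, x, y, g_out).backward()
            opt_d.step()
            opt_g.zero_grad()                         # Step 36: one generator update
            wdiv_g_loss(D, x, y, g_out).backward()
            opt_g.step()
    torch.save(G.state_dict(), "pretrained_generator.pth")   # Step 39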

Step 4: Test the remote sensing image semantic segmentation model, which specifically comprises:

Step 41: Input the RGB remote sensing images to be segmented in the validation set into the generator network of the pre-trained model to obtain second pre-segmented images;

Step 42: Input the binary-map labels corresponding to the RGB remote sensing images to be segmented into the generator network, and calculate the segmentation accuracy on the validation set by comparing the differences between the second pre-segmented images and the binary-map labels; the differences can be quantitatively estimated with parameters such as the DICE similarity coefficient and the IoU (intersection over union);
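
A short sketch of this quantitative evaluation follows: DICE similarity coefficient and IoU between the binarized prediction and the label (the 0.5 threshold is an assumption).

import numpy as np

def dice_and_iou(pred, label, thr=0.5):
    """pred: predicted map in [0, 1], label: binary ground truth, same shape."""
    p = (pred >= thr).astype(np.float64)
    t = (label > 0).astype(np.float64)
    inter = (p * t).sum()
    dice = 2 * inter / (p.sum() + t.sum() + 1e-8)
    iou = inter / (p.sum() + t.sum() - inter + 1e-8)
    return dice, iou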

Step 43: If the segmentation accuracy on the validation set does not reach the set value, perform step 2 to step 4 to adjust parameters and improve the segmentation accuracy.

The method of the present invention further comprises: when the segmentation accuracy on the validation set reaches the set value, the method is applied online, that is, new samples to be segmented are input into the pre-trained generator network and the predicted segmentation results are output.

The parameter-tuning work of the present invention falls into two aspects: 1. improving accuracy by adjusting the network architecture, replacing the loss function and adjusting the network structure; 2. observing the convergence of the loss curve, together with the visual quality of the pre-segmented images and the validation accuracy, in order to adjust hyper-parameters, normalization, standardization, activation-function settings and other operations and to eliminate anomalies.

The purpose of the present invention is to improve the performance of the segmentation model by introducing an adversarial network architecture into a traditional convolutional network. To verify the feasibility of the proposed method, the following comparative experiments were carried out with U-net as an example.

Two groups of experiments were set up: the basic U-net segmentation network and U-net + WGAN-div (the method of the present invention). The parameters, the training and validation sets and the number of training iterations were identical in both experiments. Compared with the basic U-net, the proposed method not only achieves a considerable improvement in the accuracy metrics but also shows better visual results. The objective evaluation metrics used in the present invention are the DICE similarity coefficient and the IoU. The experimental results are as follows: the DICE value of the proposed method is 0.844, the IoU value is 0.729 and the per-pixel accuracy is 0.951, while the DICE value of the basic U-net segmentation result is 0.796, the IoU value is 0.663 and the per-pixel accuracy is 0.933.

In Fig. 5 and Fig. 6, from left to right, are the original RGB remote sensing image, the basic U-net segmentation result, the segmentation result of the method of the present invention and the ground-truth binary-map label. Fig. 5 clearly shows that the GAN method has better intra-class consistency in the segmentation of individual instances, whereas the basic U-net misses regions when building surfaces are not smooth. Fig. 6 shows that the basic U-net is easily disturbed by open areas such as parking lots when segmenting buildings, which leads to mis-recognition, while the method of the present invention is comparatively more robust.

The main purpose of the present invention is to remedy the lack of contextual relations in semantic segmentation by the convolutional neural networks represented by the traditional U-net. A traditional convolutional network turns the semantic segmentation task into a pixel-by-pixel classification task, so that the classification of a single pixel is detached from its semantic environment. In fact, segmentation targets are continuous: if a pixel is judged to be a target, the probability that its neighborhood also belongs to the target should be higher than for other regions. Most current improvements to semantic segmentation revolve around this problem of contextual relations, and adversarial networks are among the less-used approaches. Traditional adversarial networks based on the JS divergence mainly improve contextual consistency by compressing the spatial resolution of the image; for example, with three down-sampling convolution layers, one probability score can represent the pixel values inside an 8×8 region, but a side effect is that the high-frequency parts of the segmentation result (such as boundary contours) may become blurred. In our experiments the traditional adversarial network method handled the mapping from continuous RGB images to discrete binarized labels poorly and suffered from vanishing gradients, so its segmentation results were not good; this is one reason why adversarial networks are used less often. The innovation of this work is to replace the JS divergence with the Wasserstein divergence to avoid the vanishing-gradient problem of traditional adversarial network methods, and to introduce a deconvolution layer that retains more spatial resolution and improves segmentation accuracy.

It should be noted that the above specific embodiments are exemplary, and that those skilled in the art can devise various solutions inspired by the disclosure of the present invention; such solutions also belong to the scope of the disclosure and fall within the scope of protection of the present invention. Those skilled in the art should understand that the description and the drawings of the present invention are illustrative and do not limit the claims. The scope of protection of the present invention is defined by the claims and their equivalents.

Claims (3)

1. A remote sensing image semantic segmentation method based on a W-divergence adversarial network, characterized in that the method comprises:

Step 1: establishing a remote sensing image data set;

Step 11: evenly cutting high-resolution remote sensing images and the corresponding binary-map labels into sub-images whose size fits the available computing resources, and filtering out samples in which the numbers of target and background pixels are unbalanced;

Step 12: dividing the cut sub-images into a training set and a validation set according to a set ratio, ensuring that the training set and the validation set do not overlap and share the same distribution;

Step 2: establishing a remote sensing image semantic segmentation model, the semantic segmentation model comprising a generator network and a discriminator network;

Step 3: training the remote sensing image semantic segmentation model, specifically comprising:

Step 31: pairing the RGB remote sensing images to be segmented in the training set with the corresponding binary-map labels, inputting one group of RGB remote sensing images and corresponding binary-map labels at a time into the input layer of the generator network, randomly cropping them to the rated input size of the generator convolutional network, and keeping the crops position-aligned;

Step 32: performing, by the generator convolution stack, feature extraction on the input RGB remote sensing image to be segmented, and outputting a first predicted segmentation map once feature extraction is complete;

Step 33: comparing the first predicted segmentation map with the binary-map label and computing the L1 loss of the predicted distribution, expressed mathematically as:

$$\mathcal{L}_{L1}(G)=\mathbb{E}_{x,y}\big[\lVert y-G(x)\rVert_{1}\big]$$

wherein $\mathcal{L}_{L1}(G)$ denotes the L1 distance loss, y denotes the binary-map label, G(x) denotes the predicted segmentation map, and $\mathbb{E}$ denotes the average loss over one iteration;

Step 34: stacking the first predicted segmentation map generated by the generator with the corresponding RGB remote sensing image to be segmented, likewise stacking the binary-map label with the RGB remote sensing image to be segmented, inputting the two groups of stacked data in turn into the discriminator network, completing the discrimination and outputting the Wasserstein divergence of the first predicted segmentation map and the binary-map label, the Wasserstein divergence being expressed as:

$$\mathcal{L}_{W}^{div}=\mathbb{E}_{x,y}\big[D(x,y)\big]-\mathbb{E}_{x}\big[D(x,G(x))\big]+k\,\mathbb{E}_{\hat{u}}\big[\lVert\nabla_{\hat{u}}D(\hat{u})\rVert^{p}\big]$$

wherein $\mathcal{L}_{W}^{div}$ denotes the Wasserstein divergence loss, x denotes the RGB remote sensing image to be segmented, y denotes the binary-map label, G(·) denotes the generator network output, D(·) denotes the discriminator network output, $\hat{u}$ denotes the samples on which the gradient-norm regularization term is evaluated, and k and p are the coefficient and the exponent of the regularization term, respectively;

Step 35: taking the Wasserstein divergence as the discriminator loss function and updating the discriminator weight vector once through gradient-descent back-propagation;

Step 36: negating the generator loss term of the Wasserstein divergence polynomial and forming its weighted sum with the L1 loss as the generator loss function, and updating the generator weight vector once through gradient-descent back-propagation, the optimal generator G* being described as:

$$G^{*}=\arg\min_{G}\max_{D}\ \mathcal{L}_{W}^{div}(G,D)+\lambda\,\mathcal{L}_{L1}(G)$$

wherein λ denotes the weight parameter;

Step 37: repeating step 31 to step 36 until every sample of the training set has participated in one round of training;

Step 38: repeating step 31 to step 37 until the number of iterations reaches the upper limit;

Step 39: completing model pre-training and saving the pre-trained model;

Step 4: testing the remote sensing image semantic segmentation model.

2. The remote sensing image semantic segmentation method according to claim 1, characterized in that step 2 specifically comprises:

Step 21: establishing a generator network based on the U-net architecture, comprising an input layer and a convolution stack, the generator network being used to generate the first predicted segmentation map;

Step 22: establishing a discriminator network based on the FCN architecture, comprising a convolution stack, the discriminator network being used to compute the Wasserstein divergence loss of the predicted segmentation map generated by the generator;

Step 23: initializing the weight vectors of the generator network and the discriminator network.

3. The remote sensing image semantic segmentation method according to claim 1, characterized in that the method of testing the remote sensing image semantic segmentation model comprises:

Step 41: inputting the RGB remote sensing images to be segmented in the validation set into the generator network of the pre-trained model to obtain second pre-segmented images;

Step 42: inputting the binary-map labels corresponding to the RGB remote sensing images to be segmented into the generator network, and calculating the segmentation accuracy on the validation set by comparing the differences between the second pre-segmented images and the binary-map labels, the differences being quantitatively estimated with the DICE similarity coefficient and the IoU (intersection over union) parameters;

Step 43: if the segmentation accuracy on the validation set does not reach the set value, performing step 2 to step 4 to adjust parameters and improve the segmentation accuracy.
CN202110053047.4A 2021-01-15 2021-01-15 Remote sensing image semantic segmentation method based on W divergence countermeasure network Active CN112733756B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110053047.4A CN112733756B (en) 2021-01-15 2021-01-15 Remote sensing image semantic segmentation method based on W divergence countermeasure network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110053047.4A CN112733756B (en) 2021-01-15 2021-01-15 Remote sensing image semantic segmentation method based on W divergence countermeasure network

Publications (2)

Publication Number Publication Date
CN112733756A CN112733756A (en) 2021-04-30
CN112733756B true CN112733756B (en) 2023-01-20

Family

ID=75593294

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110053047.4A Active CN112733756B (en) 2021-01-15 2021-01-15 Remote sensing image semantic segmentation method based on W divergence countermeasure network

Country Status (1)

Country Link
CN (1) CN112733756B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114331859B (en) * 2021-07-15 2024-12-20 西安科技大学 Image deblurring method based on improved generative adversarial network
CN113657538B (en) * 2021-08-24 2024-09-10 北京百度网讯科技有限公司 Model training and data classification method, device, equipment, storage medium and product
CN114359120B (en) * 2022-03-21 2022-06-21 深圳市华付信息技术有限公司 Remote sensing image processing method, device, equipment and storage medium


Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10825219B2 (en) * 2018-03-22 2020-11-03 Northeastern University Segmentation guided image generation with adversarial networks
CN108830209B (en) * 2018-06-08 2021-12-17 西安电子科技大学 Remote sensing image road extraction method based on generation countermeasure network
CN109120652A (en) * 2018-11-09 2019-01-01 重庆邮电大学 It is predicted based on difference WGAN network safety situation
US20200342361A1 (en) * 2019-04-29 2020-10-29 International Business Machines Corporation Wasserstein barycenter model ensembling
US11048974B2 (en) * 2019-05-06 2021-06-29 Agora Lab, Inc. Effective structure keeping for generative adversarial networks for single image super resolution
US11636332B2 (en) * 2019-07-09 2023-04-25 Baidu Usa Llc Systems and methods for defense against adversarial attacks using feature scattering-based adversarial training
CN110827297A (en) * 2019-11-04 2020-02-21 中国科学院自动化研究所 Insulator segmentation method for generating countermeasure network based on improved conditions
CN111985532B (en) * 2020-07-10 2021-11-09 西安理工大学 Scene-level context-aware emotion recognition deep network method
CN112070209B (en) * 2020-08-13 2022-07-22 河北大学 Stable controllable image generation model training method based on W distance
CN112163401B (en) * 2020-10-22 2023-05-30 大连民族大学 Chinese character font generation method based on compression and excitation GAN network

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635748A (en) * 2018-12-14 2019-04-16 中国公路工程咨询集团有限公司 The extracting method of roadway characteristic in high resolution image
CN111080645A (en) * 2019-11-12 2020-04-28 中国矿业大学 Semi-supervised semantic segmentation of remote sensing images based on generative adversarial networks
CN111080723A (en) * 2019-12-17 2020-04-28 易诚高科(大连)科技有限公司 Image element segmentation method based on Unet network
CN111783782A (en) * 2020-05-29 2020-10-16 河海大学 Fusion and improvement of UNet and SegNet for semantic segmentation of remote sensing images
CN111898507A (en) * 2020-07-22 2020-11-06 武汉大学 A deep learning method for predicting land cover categories in unlabeled remote sensing images

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Remote Sensing Image Segmentation based on Generative Adversarial Network with Wasserstein divergence; Xia Cao et al.; 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence (ACAI 2020); 2020-12-24; pages 2-5, sections 2-3 and figures 1-4 *
Research on SAR Image Generation Based on GAN; Wang Leilei (王雷雷); China Master's Theses Full-text Database, Information Science and Technology Series; 2019-11-15 (No. 11); page A006-724 *

Also Published As

Publication number Publication date
CN112733756A (en) 2021-04-30


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant