
CN104392456B - SAR (synthetic aperture radar) image segmentation method based on depth autoencoders and area charts


Info

Publication number
CN104392456B
CN104392456B (application CN201410751944.2A; publication CN104392456A)
Authority
CN
China
Prior art keywords
layer
area
homogeneous
region
sketch
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410751944.2A
Other languages
Chinese (zh)
Other versions
CN104392456A (en)
Inventor
刘芳
石西建
李玲玲
焦李成
郝红侠
石俊飞
杨淑媛
段平
段一平
张向荣
尚荣华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN201410751944.2A priority Critical patent/CN104392456B/en
Publication of CN104392456A publication Critical patent/CN104392456A/en
Application granted granted Critical
Publication of CN104392456B publication Critical patent/CN104392456B/en


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/10: Segmentation; Edge detection
    • G06T7/11: Region-based segmentation
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20081: Training; Learning
    • G06T2207/20112: Image segmentation details
    • G06T2207/20152: Watershed segmentation

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Radar Systems Or Details Thereof (AREA)

Abstract

The invention discloses a SAR image segmentation method based on a deep autoencoder and a region map, which mainly addresses the insufficiently accurate and insufficiently detailed segmentation of the prior art. The implementation steps are: 1. obtain the sketch map of the SAR image from the initial sketch model, complete the sketch line segments to obtain the region map, and map the region map onto the original image to obtain the aggregated, homogeneous, and structural regions; 2. train a different deep autoencoder on the aggregated region and on the homogeneous region to obtain a representation for every point, concatenating the last two encoding layers as that point's feature; 3. build a dictionary for each of the aggregated and homogeneous regions, project each point's feature onto the corresponding dictionary, and pool the projections into a region feature for each subregion; 4. cluster the subregion features of the two region types separately; 5. segment the structural region by superpixel merging under the guidance of the sketch line segments; 6. merge the segmentation results of all regions to complete the SAR image segmentation. The invention has the advantages of accurate and detailed segmentation and can be used for target recognition.

Description

SAR Image Segmentation Method Based on a Deep Autoencoder and a Region Map

Technical Field

The invention belongs to the technical field of image processing, and in particular relates to a method that segments SAR images using a deep autoencoder from deep learning together with the region map of the SAR image; the method can be used for further recognition and classification of SAR images.

Background Art

Synthetic aperture radar (SAR) is a radar that exploits the relative motion between the radar and the target to synthesize, by data processing, a small real antenna aperture into a larger equivalent antenna aperture, and it offers all-weather, day-and-night, stable, high-resolution imaging. A SAR image is the image formed by this synthesized, effectively larger radar antenna; these characteristics make it a highly valuable image type widely used in military, agricultural, navigation, geographic, and many other fields. Image segmentation is the technique and process of dividing an image, according to its grayscale, texture, structure, aggregation, and other characteristics, into several internally connected, mutually disjoint regions with distinctive properties, and it is a key step toward further image recognition and classification. SAR image segmentation is therefore particularly important as an application area of image segmentation.

Traditional SAR image segmentation methods include clustering algorithms such as k-means, FCM, and spectral clustering. Owing to the unique imaging mechanism of SAR, its images contain a large amount of coherent speckle noise, so their separability is extremely poor and the segmentation results of traditional clustering algorithms fall short. There are also segmentation methods based on semi-supervised feature extraction, whose results are better than those of the clustering algorithms above, but they require human interaction to supply labeled data that guides the partitioning; SAR image sources are relatively scarce yet cover wide areas, so labeled data is rare and labeling new data is costly. Existing SAR image segmentation methods can hardly suppress the influence of speckle noise or obtain ideal results, and cannot meet the requirements of the widespread application of SAR images. To reduce the degree of manual involvement, unsupervised feature learning has become the mainstream approach to feature extraction, and its use for feature extraction in SAR image segmentation has become an inevitable development. Feature selection is critical in SAR image segmentation, and it is difficult for a fully unsupervised extraction method to obtain good features.

As an emerging technique among unsupervised feature learning methods, deep learning has attracted intense attention through its successful application in many fields. Deep learning can learn the inherent characteristics of the data itself: not only can it learn features in an unsupervised manner, but the learned features are superior to most features extracted under an assumed distribution. Introducing deep learning into SAR image segmentation can therefore, in theory, yield better segmentation results. In practice, however, one first runs into the difficulty of obtaining suitable training data, and it is then hard to apply the learned point-wise features to region-level segmentation and representation. How to combine deep learning methods effectively with SAR image segmentation and obtain good results is thus a difficult problem, and very little research has been based on this technique.

Summary of the Invention

The purpose of the present invention is to overcome the deficiencies of the prior art and methods described above, and to propose a SAR image segmentation method based on the deep autoencoder of deep learning and the region map, so as to improve the quality of SAR image segmentation.

The technical scheme for realizing the present invention is as follows: obtain the sketch map of the SAR image from the initial sketch model; complete the sketch line segments in the sketch map to obtain the region map; map the region map onto the original image to obtain the aggregated, homogeneous, and structural regions; train a different deep autoencoder on the aggregated region and on the homogeneous region; use the trained deep autoencoders to obtain a representation for every point, concatenating the representations of the last two encoding layers as that point's feature; following the bag-of-words model, build a dictionary for each of the aggregated and homogeneous regions and project each point's feature onto the corresponding dictionary to obtain the subregion features; cluster the subregion features of the two region types separately; segment the structural region into superpixels, merge them under the guidance of the sketch line segments, and merge the remainder with the homogeneous region; finally merge the segmentation results of all regions to complete the SAR image segmentation. The specific steps include the following:

(1) Obtain the sketch map of the SAR image from the initial sketch model, complete the sketch line segments in the sketch map to obtain the region map, and map the region map onto the original image to obtain the aggregated region a, the homogeneous region b, and the structural region c;

(2) According to the respective characteristics of the aggregated region a and the homogeneous region b, construct two different deep autoencoders Sa and Sb for them;

(3) According to the positions of the aggregated region a and the homogeneous region b in the region map, sample on each region and train the corresponding deep autoencoder Sa or Sb;

(4) Use the two trained deep autoencoders Sa and Sb to obtain the multi-layer encoding representations of every point in the regions of the corresponding type, and concatenate each point's last two encoding-layer representations as that point's feature;

(5) Following the bag-of-words model, build dictionaries for the aggregated region a and the homogeneous region b from the features of all their points, project each point's feature onto the corresponding dictionary, and pool the projections into a region feature for each subregion;

(6) Cluster all subregion features of the aggregated region a and of the homogeneous region b separately, obtaining the segmentation results of regions a and b;

(7) Segment the structural region c into many superpixels with the watershed algorithm; merge the superpixels a first time under the guidance of the sketch lines in the sketch map to obtain line targets and boundaries; merge the remaining superpixels a second time; merge the twice-merged superpixels with the subregions of the homogeneous region b a third time; the superpixels left after the third merge are independent targets, completing the segmentation of the structural region c;

(8) Merge the segmentation results of the aggregated region a, the homogeneous region b, and the structural region c to obtain the final SAR image segmentation result.

Compared with the prior art, the present invention has the following advantages:

1. The present invention makes full use of the region map, which allows the aggregated region to be extracted effectively, and it learns features with deep autoencoders in both the aggregated and homogeneous regions, which extracts more essential characteristics of the data and thus yields more accurate segmentation results.

2. The present invention projects the features learned by the deep autoencoders through the bag-of-words model to obtain region features, and clusters them with a hierarchical clustering method, obtaining good region consistency.

3. For the structural region, the present invention merges superpixels under the guidance of the sketch line segments, which not only extracts line-target regions but also achieves good edge consistency.

Experiments show that by applying the deep autoencoder to SAR image segmentation, combined with the region map, the bag-of-words model, and superpixel merging, the present invention obtains segmentation results that are accurate and have good region consistency and edge consistency.

Brief Description of the Drawings

Fig. 1 is the overall flowchart of the present invention;

Fig. 2 is the sub-flowchart for segmenting the aggregated and homogeneous regions of the SAR image;

Fig. 3 is the original SAR image used in the simulation experiment;

Fig. 4 is the sketch map extracted with the sketch model;

Fig. 5 is the region map obtained from the sketch map;

Fig. 6 is the structure diagram of the two deep autoencoders used in the present invention;

Fig. 7 shows the segmentation result of the aggregated region of the SAR image;

Fig. 8 shows the segmentation result of the homogeneous region of the SAR image;

Fig. 9 shows the result of superpixel merging in the structural region of the SAR image;

Fig. 10 shows the final result of segmenting the SAR image with the present invention.

Detailed Description

The present invention is described further below in conjunction with the drawings of the embodiments.

Referring to Figs. 1 and 2, the detailed implementation steps of the present invention are as follows:

Step 1. Obtain the region map of the SAR image.

(1.1) Input a SAR image as shown in Fig. 3 and obtain its sketch map from the initial sketch model, as shown in Fig. 4. The initial sketch model originates in computer vision theory; it is a model that represents an image abstractly with visual primitives and can be used to extract the sketch map of an image.

(1.2) Complete the sketch line segments in the sketch map with the ray method to obtain the region map, as shown in Fig. 5. The ray method casts rays from every sketch line segment in the sketch map to find region boundaries, and can be used to extract the region map.

(1.3) Map the region map onto the original image to obtain the aggregated region a, the homogeneous region b, and the structural region c.

Step 2. According to the respective characteristics of the aggregated region a and the homogeneous region b, construct two different deep autoencoders Sa and Sb, as shown in Fig. 6.

(2.1) Set the input window for training the deep autoencoder Sa of the aggregated region a to 21*21, corresponding to 441 input-layer nodes;

(2.2) Set the input window for training the deep autoencoder Sb of the homogeneous region b to 15*15, corresponding to 225 input-layer nodes;

(2.3) Set the number of hidden layers of the encoder to 6 in both Sa and Sb; the corresponding decoders likewise have 6 hidden layers each;

(2.4) Set the abstraction window sizes of the encoder hidden layers of Sa to 25*25, 19*19, 15*15, 13*13, 9*9, and 5*5, corresponding to 625, 361, 225, 169, 81, and 25 nodes per layer;

(2.5) Set the abstraction window sizes of the encoder hidden layers of Sb to 19*19, 13*13, 11*11, 9*9, 7*7, and 5*5, corresponding to 361, 169, 121, 81, 49, and 25 nodes per layer;

(2.6) For the decoder of Sa, take the last encoder layer as the first decoder layer (the layer shared by the encoder and the decoder), take the node count of the second-to-last encoder layer for the second decoder layer, that of the third-to-last encoder layer for the third decoder layer, and so on, giving decoder layers of 25, 81, 169, 225, 361, and 625 nodes;

(2.7) For the decoder of Sb, likewise take the last encoder layer as the shared first decoder layer and mirror the encoder layer sizes, giving decoder layers of 25, 49, 81, 121, 169, and 361 nodes. These two layer layouts are sketched in code below.
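
To make the layouts concrete, here is a minimal Python sketch. Only the layer sizes come from steps (2.1)-(2.7); the `DeepAutoencoder` class and the mirrored construction of the decoder size list are illustrative assumptions, with the weights left to be produced by the pretraining of step 3.

```python
# A minimal sketch of the Sa/Sb layer layouts; the sizes are from the text,
# everything else is an illustrative assumption.
class DeepAutoencoder:
    def __init__(self, encoder_sizes):
        # encoder_sizes includes the input layer; the decoder mirrors the
        # encoder, sharing the 25-node bottleneck (steps 2.6/2.7) and ending
        # in a reconstruction layer the size of the input.
        self.encoder_sizes = encoder_sizes
        self.decoder_sizes = encoder_sizes[::-1]
        self.weights = []  # filled by RBM pretraining, then BP fine-tuning

# Sa: 21*21 input windows over the aggregated region a.
sa = DeepAutoencoder([441, 625, 361, 225, 169, 81, 25])
# Sb: 15*15 input windows over the homogeneous region b.
sb = DeepAutoencoder([225, 361, 169, 121, 81, 49, 25])

# Decoder hidden sizes match (2.6)/(2.7), followed by the reconstruction layer.
assert sa.decoder_sizes == [25, 81, 169, 225, 361, 625, 441]
assert sb.decoder_sizes == [25, 49, 81, 121, 169, 361, 225]
```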

Step 3. According to the positions of the aggregated region a and the homogeneous region b in the region map, sample on each region and train the corresponding deep autoencoder Sa or Sb.

(3.1) In the aggregated region a, sample with a sliding 21*21 window at a stride of 4 points;

(3.2) In the homogeneous region b, sample with a sliding 15*15 window at a stride of 6 points;

(3.3) Feed the samples taken from the aggregated region a to the input layer of Sa and the samples taken from the homogeneous region b to the input layer of Sb, then let both deep autoencoders generate the initial weights between their layers by the following procedure:

(3.3a) Form a restricted Boltzmann machine (RBM) from the input layer and the first encoder layer, and obtain its weights with the contrastive-divergence-based fast RBM learning algorithm; these serve as the initial weights between the input layer and the first encoder layer;

(3.3b) Form a new RBM from the first and second encoder layers and obtain its weights with the same contrastive-divergence-based algorithm as the initial weights between these two layers; continue upward, forming a new RBM for each successive pair of layers, until the initial weights between every two adjacent encoder layers are obtained;

(3.3c) Take the transpose of the initial weights between the last and second-to-last encoder layers as the initial weights between the first and second decoder layers, the transpose of those between the second-to-last and third-to-last encoder layers as the initial weights between the second and third decoder layers, and so on, obtaining the initial weights between every two adjacent decoder layers;

(3.4) After the initial weights are generated, fine-tune both deep autoencoders by iteratively adjusting their network weights with the BP algorithm. A code sketch of this layer-wise pretraining follows.
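
A minimal sketch of the greedy layer-wise pretraining of step (3.3), assuming sigmoid units and CD-1 updates; the learning rate and epoch count are illustrative (the patent does not specify them), and biases are omitted for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm_cd1(data, n_hidden, lr=0.01, epochs=10, rng=None):
    """Train one RBM with one step of contrastive divergence (CD-1).

    data: (n_samples, n_visible) array scaled to [0, 1]."""
    rng = rng or np.random.default_rng(0)
    W = 0.01 * rng.standard_normal((data.shape[1], n_hidden))
    for _ in range(epochs):
        h_prob = sigmoid(data @ W)                    # positive phase
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        v_recon = sigmoid(h_sample @ W.T)             # one Gibbs step down
        h_recon = sigmoid(v_recon @ W)                # and back up
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / len(data)
    return W

def pretrain(samples, hidden_sizes):
    """Stack RBMs upward through the encoder (steps 3.3a-3.3b); decoder
    initial weights are the tied transposes of step (3.3c)."""
    encoder, x = [], samples
    for n_hidden in hidden_sizes:
        W = train_rbm_cd1(x, n_hidden)
        encoder.append(W)
        x = sigmoid(x @ W)            # propagate to train the next RBM
    decoder = [W.T for W in reversed(encoder)]
    return encoder, decoder           # BP fine-tuning (3.4) would follow
```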

Step 4. Use the two trained deep autoencoders Sa and Sb to obtain the multi-layer encoding representations of every point in the regions of the corresponding type, and concatenate each point's last two encoding-layer representations as that point's feature.

(4.1) For every point in the aggregated region a, combine the point with the 10 points on each of its four sides (up, down, left, and right) to form the 21*21 window sample corresponding to that point;

(4.2) For every point in the homogeneous region b, combine the point with the 7 points on each of its four sides to form the 15*15 window sample corresponding to that point;

(4.3) Feed each point's sample from the aggregated region a to the input layer of Sa; using the trained inter-layer weights of Sa, compute the sample's multi-layer encoding representations and concatenate the last two of them into a 106-dimensional vector as the point's feature;

(4.4) Feed each point's sample from the homogeneous region b to the input layer of Sb; using the trained inter-layer weights of Sb, compute the sample's multi-layer encoding representations and concatenate the last two of them into a 74-dimensional vector as the point's feature. This forward pass is sketched in code below.
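
A minimal sketch of this feature construction, assuming sigmoid encoder activations (the patent does not name the nonlinearity). With the layer sizes of step 2, the concatenated feature is 81+25 = 106 dimensions for Sa and 49+25 = 74 dimensions for Sb, matching (4.3) and (4.4).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def point_feature(window, encoder_weights):
    """window: flattened patch centered on the point (441 or 225 values);
    encoder_weights: the trained encoder weight matrices of Sa or Sb."""
    h, layers = window, []
    for W in encoder_weights:
        h = sigmoid(h @ W)
        layers.append(h)
    # Concatenate the last two encoding-layer representations (4.3/4.4).
    return np.concatenate(layers[-2:])
```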

Step 5. Following the bag-of-words model, build dictionaries for the aggregated region a and the homogeneous region b from the features of all their points, and derive the subregion features of regions a and b from the corresponding dictionaries.

The bag-of-words model comprises extracting features, constructing a visual dictionary, projecting image features onto the visual dictionary for quantization, and representing the image by word frequencies.

Following this bag-of-words model, the specific procedure of this step is as follows:

(5.1) For each subregion of the aggregated region a, select the 20 features closest to the subregion's feature mean as dictionary atoms; the atoms selected from all subregions together form the dictionary of region a;

(5.2) For each subregion of the homogeneous region b, select the 10 features closest to the subregion's feature mean as dictionary atoms; the atoms selected from all subregions together form the dictionary of region b;

(5.3) Project the features of all points in the aggregated region a onto the dictionary of region a using locality-constrained linear coding, obtaining the sparse codes of all points in region a;

(5.4) Project the features of all points in the homogeneous region b onto the dictionary of region b using locality-constrained linear coding, obtaining the sparse codes of all points in region b;

(5.5) For each subregion of the aggregated region a and of the homogeneous region b, collect the sparse codes of all points in the subregion into a coding matrix, and take the vector formed by the maximum component of each dimension of that matrix as the subregion's region feature. A code sketch of this step follows.
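
Here is a minimal sketch of steps (5.1), (5.2), and (5.5). The `project` function is a deliberately simplified nearest-atom soft assignment standing in for locality-constrained linear coding (LLC); the full LLC objective with its locality adaptor and analytic solution is not reproduced here, and the `knn` parameter is an assumption.

```python
import numpy as np

def build_dictionary(subregion_features, k):
    """subregion_features: list of (n_i, d) arrays, one per subregion;
    k = 20 for the aggregated region a, 10 for the homogeneous region b."""
    atoms = []
    for feats in subregion_features:
        dist = np.linalg.norm(feats - feats.mean(axis=0), axis=1)
        atoms.append(feats[np.argsort(dist)[:k]])  # k features nearest the mean
    return np.vstack(atoms)

def project(features, dictionary, knn=5):
    """Simplified locality-constrained coding: weight the knn nearest atoms."""
    codes = np.zeros((len(features), len(dictionary)))
    for i, f in enumerate(features):
        d = np.linalg.norm(dictionary - f, axis=1)
        nn = np.argsort(d)[:knn]
        w = np.exp(-d[nn])
        codes[i, nn] = w / w.sum()
    return codes

def region_feature(codes):
    """Step (5.5): per-dimension max pooling over a subregion's codes."""
    return codes.max(axis=0)
```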

Step 6. Cluster all subregion features of the aggregated region a and of the homogeneous region b separately.

(6.1) Cluster all subregion features of the aggregated region a with a hierarchical clustering method, obtaining the segmentation result of region a, as shown in Fig. 7;

(6.2) Cluster all subregion features of the homogeneous region b with a hierarchical clustering method, obtaining the segmentation result of region b, as shown in Fig. 8. A code sketch of this clustering follows.
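
A minimal sketch of the clustering step with SciPy; the patent names only hierarchical clustering, so the Ward linkage and the cluster count are assumptions.

```python
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_subregions(region_features, n_clusters):
    """region_features: (n_subregions, dictionary_size) matrix of pooled
    codes from step 5; returns one cluster label per subregion."""
    Z = linkage(region_features, method="ward")
    return fcluster(Z, t=n_clusters, criterion="maxclust")
```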

Step 7. Segment the structural region c.

(7.1) Over-segment the SAR image with the watershed algorithm to obtain many superpixels (sketched in code below):

(7.1a) Compute the gradient of the image to obtain its gradient map;

(7.1b) Apply the watershed transform to the gradient map to obtain the image's labels;

(7.1c) According to the labels, divide the structural region c of the image into many superpixels;
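
A minimal sketch of (7.1a)-(7.1c) with scikit-image; the Sobel gradient operator and the default local-minima markers are assumptions, as the patent specifies neither the gradient operator nor the marker selection.

```python
import numpy as np
from skimage.filters import sobel
from skimage.segmentation import watershed

def oversegment_structural(image, mask_c):
    """image: 2-D gray array; mask_c: boolean mask of the structural region c."""
    gradient = sobel(image)              # (7.1a) gradient map
    labels = watershed(gradient)         # (7.1b) watershed transform
    return np.where(mask_c, labels, 0)   # (7.1c) keep superpixels inside c
```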

(7.2) Merge the superpixels a first time under the guidance of the sketch lines in the sketch map:

(7.2a) Among the sketch lines of the sketch map, designate two parallel sketch lines less than 7 pixels apart as first-type line-target sketch lines, and merge the superpixels between them into a first-type line target;

(7.2b) Among the remaining sketch lines, designate those whose two sides belong to the same subregion class as second-type line-target sketch lines, and widen each such line by one pixel on both sides to form a second-type line target; the other sketch lines serve as boundary-delineating sketch lines;

(7.3) Merge the remaining superpixels a second time: among the superpixels other than the line targets and boundaries, merge adjacent superpixels whose gray-level means differ by less than 25, iterating until no two adjacent superpixels with a gray-mean difference of less than 25 remain; the result is shown in Fig. 9;

(7.4) Merge the twice-merged superpixels a third time: merge each superpixel into the subregion of the homogeneous region b whose gray-level mean is closest to that of the superpixel and differs from it by less than 25; the remaining superpixels are independent targets. The two merging rules of (7.3) and (7.4) are sketched in code below.
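
A minimal sketch of the merging rules, operating on precomputed superpixel gray means and an adjacency list; the data structures and the running-average mean update after a merge are illustrative simplifications (the exact update would re-pool the merged pixels' values).

```python
def merge_superpixels(means, edges, threshold=25):
    """Step (7.3). means: {label: gray mean}; edges: list of adjacent
    (i, j) label pairs. Merge until no adjacent pair differs by < 25."""
    changed = True
    while changed:
        changed = False
        for i, j in list(edges):
            if abs(means[i] - means[j]) < threshold:
                means[i] = (means[i] + means[j]) / 2.0  # simplified update
                del means[j]
                edges = [(a if a != j else i, b if b != j else i)
                         for a, b in edges]
                edges = [(a, b) for a, b in edges if a != b]
                changed = True
                break
    return means, edges

def absorb_into_homogeneous(sp_means, subregion_means, threshold=25):
    """Step (7.4): merge each superpixel into the nearest-mean subregion of
    region b; superpixels with no subregion within the threshold remain
    independent targets."""
    assignment = {}
    for sp, m in sp_means.items():
        best = min(subregion_means, key=lambda s: abs(subregion_means[s] - m))
        if abs(subregion_means[best] - m) < threshold:
            assignment[sp] = best
    return assignment
```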

Step 8. Merge the segmentation results of the aggregated region a, the homogeneous region b, and the structural region c to obtain the final SAR image segmentation result, as shown in Fig. 10.

As can be seen from Fig. 10, the method of the present invention segments the original SAR image into 10 classes, and the resulting segmentation is accurate and detailed, with good region consistency and edge consistency.

The above description is only one specific example of the present invention and does not constitute any limitation on it. For those skilled in the art, after understanding the content and principles of the present invention, various modifications and changes in form and detail may clearly be made without departing from the principles and structure of the invention; such modifications and changes based on the inventive idea remain within the protection scope of the claims of the present invention.

Claims (9)

1. A SAR image segmentation method based on a deep autoencoder and a region map, comprising the following steps:

(1) obtaining the sketch map of the SAR image from the initial sketch model, completing the sketch line segments in the sketch map to obtain the region map, and mapping the region map onto the original image to obtain the aggregated region a, the homogeneous region b, and the structural region c;

(2) according to the respective characteristics of the aggregated region a and the homogeneous region b, constructing two different deep autoencoders Sa and Sb for them;

(3) according to the positions of the aggregated region a and the homogeneous region b in the region map, sampling on each region and training the corresponding deep autoencoder Sa or Sb;

(4) using the two trained deep autoencoders Sa and Sb to obtain the multi-layer encoding representations of every point in the regions of the corresponding type, and concatenating each point's last two encoding-layer representations as that point's feature;

(5) following the bag-of-words model, building dictionaries for the aggregated region a and the homogeneous region b from the features of all their points, projecting each point's feature onto the corresponding dictionary, and pooling the projections into a region feature for each subregion;

(6) clustering all subregion features of the aggregated region a and of the homogeneous region b separately, obtaining the segmentation results of regions a and b;

(7) segmenting the structural region c into several superpixels with the watershed algorithm; merging the segmented superpixels a first time under the guidance of the sketch lines in the sketch map to obtain line targets and boundaries; merging the remaining superpixels a second time; merging the twice-merged superpixels with the subregions of the homogeneous region b a third time, the superpixels left after the third merge being independent targets, thereby completing the segmentation of the structural region c;

(8) merging the segmentation results of the aggregated region a, the homogeneous region b, and the structural region c to obtain the final SAR image segmentation result.

2. The SAR image segmentation method according to claim 1, wherein constructing two different deep autoencoders Sa and Sb for the aggregated region a and the homogeneous region b according to their respective characteristics in step (2) is carried out as follows:

(2.1) setting the input window for training the deep autoencoder Sa of the aggregated region a to 21*21, corresponding to 441 input-layer nodes;

(2.2) setting the input window for training the deep autoencoder Sb of the homogeneous region b to 15*15, corresponding to 225 input-layer nodes;

(2.3) setting the number of hidden layers of the encoder to 6 in both Sa and Sb, the corresponding decoders likewise having 6 hidden layers each;

(2.4) setting the abstraction window sizes of the encoder hidden layers of Sa to 25*25, 19*19, 15*15, 13*13, 9*9, and 5*5, corresponding to 625, 361, 225, 169, 81, and 25 nodes per layer;

(2.5) setting the abstraction window sizes of the encoder hidden layers of Sb to 19*19, 13*13, 11*11, 9*9, 7*7, and 5*5, corresponding to 361, 169, 121, 81, 49, and 25 nodes per layer;

(2.6) for the decoder of Sa, taking the last encoder layer as the first decoder layer shared by the encoder and the decoder, taking the node count of the second-to-last encoder layer for the second decoder layer, that of the third-to-last encoder layer for the third decoder layer, and so on, giving decoder layers of 25, 81, 169, 225, 361, and 625 nodes;

(2.7) for the decoder of Sb, likewise taking the last encoder layer as the shared first decoder layer and mirroring the encoder layer sizes, giving decoder layers of 25, 49, 81, 121, 169, and 361 nodes.

3. The SAR image segmentation method according to claim 1, wherein sampling on the aggregated region a and the homogeneous region b and training the corresponding deep autoencoders Sa and Sb according to their positions in the region map in step (3) is carried out as follows:

(3.1) in the aggregated region a, sampling with a sliding 21*21 window at a stride of 4 points;

(3.2) in the homogeneous region b, sampling with a sliding 15*15 window at a stride of 6 points;

(3.3) feeding the samples taken from the aggregated region a to the input layer of Sa and the samples taken from the homogeneous region b to the input layer of Sb, both deep autoencoders then generating the initial weights between their layers by the following procedure:

(3.3a) forming a restricted Boltzmann machine (RBM) from the input layer and the first encoder layer, and obtaining its weights with the contrastive-divergence-based fast RBM learning algorithm as the initial weights between the input layer and the first encoder layer;

(3.3b) forming a new RBM from the first and second encoder layers and obtaining its weights with the same contrastive-divergence-based algorithm as the initial weights between these two layers, then continuing upward, forming a new RBM for each successive pair of layers, until the initial weights between every two adjacent encoder layers are obtained;

(3.3c) taking the transpose of the initial weights between the last and second-to-last encoder layers as the initial weights between the first and second decoder layers, the transpose of those between the second-to-last and third-to-last encoder layers as the initial weights between the second and third decoder layers, and so on, obtaining the initial weights between every two adjacent decoder layers;

(3.4) after the initial weights are generated, fine-tuning both deep autoencoders by iteratively adjusting their network weights with the BP algorithm.

4. The SAR image segmentation method according to claim 1, wherein obtaining the multi-layer encoding representations of every point with the two trained deep autoencoders Sa and Sb and concatenating each point's last two encoding-layer representations as its feature in step (4) is carried out as follows:

(4.1) sampling every point in the aggregated region a with a sliding 21*21 window, the window serving as the point's sample;

(4.2) sampling every point in the homogeneous region b with a sliding 15*15 window, the window serving as the point's sample;

(4.3) feeding each point's sample from the aggregated region a to the input layer of Sa, computing the sample's multi-layer encoding representations from the trained inter-layer weights of Sa, and concatenating the last two of them into a 106-dimensional vector as the point's feature;

(4.4) feeding each point's sample from the homogeneous region b to the input layer of Sb, computing the sample's multi-layer encoding representations from the trained inter-layer weights of Sb, and concatenating the last two of them into a 74-dimensional vector as the point's feature.

5. The SAR image segmentation method according to claim 1, wherein building the dictionaries of the aggregated region a and the homogeneous region b from the features of all their points, projecting each point's feature onto the corresponding dictionary, and pooling region features for the subregions following the bag-of-words model in step (5) is carried out as follows:

(5.1) for each subregion of the aggregated region a, selecting the 20 features closest to the subregion's feature mean as dictionary atoms, the atoms selected from all subregions together forming the dictionary of region a;

(5.2) for each subregion of the homogeneous region b, selecting the 10 features closest to the subregion's feature mean as dictionary atoms, the atoms selected from all subregions together forming the dictionary of region b;

(5.3) projecting the features of all points in the aggregated region a onto the dictionary of region a using locality-constrained linear coding, obtaining the sparse codes of all points in region a;

(5.4) projecting the features of all points in the homogeneous region b onto the dictionary of region b using locality-constrained linear coding, obtaining the sparse codes of all points in region b;

(5.5) for each subregion of the aggregated region a and of the homogeneous region b, collecting the sparse codes of all points in the subregion into a coding matrix, and taking the vector formed by the maximum component of each dimension of that matrix as the subregion's region feature.

6. The SAR image segmentation method according to claim 1, wherein segmenting the structural region c into several superpixels with the watershed algorithm in step (7) is carried out as follows:

(6.1) computing the gradient of the image to obtain its gradient map;

(6.2) applying the watershed transform to the gradient map to obtain the image's labels;

(6.3) dividing the structural region c of the image into several superpixels according to the labels.

7. The SAR image segmentation method according to claim 1, wherein merging the segmented superpixels a first time under the guidance of the sketch lines in the sketch map to obtain line targets and boundaries in step (7) is carried out as follows:

(7.1) among the sketch lines of the sketch map, designating two parallel sketch lines less than 7 pixels apart as first-type line-target sketch lines, and merging the superpixels between them into a first-type line target;

(7.2) among the remaining sketch lines, designating those whose two sides belong to the same subregion class as second-type line-target sketch lines, widening each such line by one pixel on both sides to form a second-type line target, the other sketch lines serving as boundary-delineating sketch lines.

8. The SAR image segmentation method according to claim 1, wherein merging the remaining superpixels a second time in step (7) comprises, among the superpixels other than the line targets and boundaries, merging adjacent superpixels whose gray-level means differ by less than 25, and iterating until no two adjacent superpixels with a gray-mean difference of less than 25 remain.

9. The SAR image segmentation method according to claim 1, wherein merging the twice-merged superpixels with the subregions of the homogeneous region b a third time in step (7) comprises merging each twice-merged superpixel into the subregion of the homogeneous region b whose gray-level mean is closest to that of the superpixel and differs from it by less than 25.
CN201410751944.2A 2014-12-09 2014-12-09 SAR (synthetic aperture radar) image segmentation method based on depth autoencoders and area charts Active CN104392456B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410751944.2A CN104392456B (en) 2014-12-09 2014-12-09 SAR (synthetic aperture radar) image segmentation method based on depth autoencoders and area charts

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410751944.2A CN104392456B (en) 2014-12-09 2014-12-09 SAR (synthetic aperture radar) image segmentation method based on depth autoencoders and area charts

Publications (2)

Publication Number Publication Date
CN104392456A CN104392456A (en) 2015-03-04
CN104392456B true CN104392456B (en) 2017-05-17

Family

ID=52610354

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410751944.2A Active CN104392456B (en) 2014-12-09 2014-12-09 SAR (synthetic aperture radar) image segmentation method based on depth autoencoders and area charts

Country Status (1)

Country Link
CN (1) CN104392456B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105550705B (en) * 2015-12-11 2018-11-23 杭州电子科技大学 A kind of EEG signal identification method based on improvement self-training study
CN105608692B (en) * 2015-12-17 2018-05-04 西安电子科技大学 Polarization SAR image segmentation method based on deconvolution network and sparse classification
CN105469123A (en) * 2015-12-30 2016-04-06 华东理工大学 Missing data completion method based on k plane regression
CN105701516B (en) * 2016-01-20 2019-01-22 福州大学 An automatic image annotation method based on attribute discrimination
CN109214235A (en) * 2017-06-29 2019-01-15 沈阳新松机器人自动化股份有限公司 outdoor scene classification method and system
CN107341511A (en) * 2017-07-05 2017-11-10 西安电子科技大学 Classification of Polarimetric SAR Image method based on super-pixel Yu sparse self-encoding encoder
CN108681999B (en) * 2018-05-22 2022-05-31 浙江理工大学 SAR image target shape generation method based on deep convolutional neural network model
CN109635303B (en) * 2018-12-19 2020-08-25 中国科学技术大学 Recognition method of meaning-changing words in specific domains
CN110517759B (en) * 2019-08-29 2022-03-25 腾讯医疗健康(深圳)有限公司 Method for determining image to be marked, method and device for model training
CN110807777B (en) * 2019-10-08 2023-02-14 哈尔滨工程大学 Marine mammal image segmentation method based on convolutional neural network
CN111178196B (en) * 2019-12-19 2024-01-23 东软集团股份有限公司 Cell classification method, device and equipment
CN111967526B (en) * 2020-08-20 2023-09-22 东北大学秦皇岛分校 Remote sensing image change detection method and system based on edge mapping and deep learning
CN115984555B (en) * 2022-12-13 2025-04-08 哈尔滨医科大学附属第一医院 Coronary artery stenosis identification method based on depth self-encoder composition

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4507679B2 (en) * 2004-04-21 2010-07-21 富士ゼロックス株式会社 Image recognition apparatus, image extraction apparatus, image extraction method, and program
US8873813B2 (en) * 2012-09-17 2014-10-28 Z Advanced Computing, Inc. Application of Z-webs and Z-factors to analytics, search engine, learning, recognition, natural language, and other utilities
CN103530689B (en) * 2013-10-31 2016-01-20 中国科学院自动化研究所 A kind of clustering method based on degree of depth study
CN103955702B (en) * 2014-04-18 2017-02-15 西安电子科技大学 SAR image terrain classification method based on depth RBF network
CN103955936B (en) * 2014-05-13 2017-01-25 西北工业大学 Significant object detection method based on stack-typed denoising self-coding machine
CN104077599B (en) * 2014-07-04 2017-04-19 西安电子科技大学 Polarization SAR image classification method based on deep neural network

Also Published As

Publication number Publication date
CN104392456A (en) 2015-03-04

Similar Documents

Publication Publication Date Title
CN104392456B (en) SAR (synthetic aperture radar) image segmentation method based on depth autoencoders and area charts
Yin et al. Semi-supervised 3D object detection with proficient teachers
CN110443842B (en) Depth map prediction method based on visual angle fusion
WO2018023734A1 (en) Significance testing method for 3d image
CN110781775A (en) Remote sensing image water body information accurate segmentation method supported by multi-scale features
Wu et al. Improved image segmentation method based on morphological reconstruction
Wei et al. A-ESRGAN: Training real-world blind super-resolution with attention U-Net Discriminators
US20230162409A1 (en) System and method for generating images of the same style based on layout
CN104123417B (en) A Method of Image Segmentation Based on Cluster Fusion
Song et al. Deep novel view synthesis from colored 3d point clouds
Veeravasarapu et al. Adversarially tuned scene generation
Tung et al. Megascenes: Scene-level view synthesis at scale
CN112686830B (en) Super-resolution method for a single depth map based on image decomposition
CN109086777A (en) A kind of notable figure fining method based on global pixel characteristic
CN110992379B (en) Rapid image segmentation method based on directional superpixels
CN116797787A (en) Semantic segmentation method of remote sensing images based on cross-modal fusion and graph neural network
Lan et al. Smooseg: smoothness prior for unsupervised semantic segmentation
Li et al. Is synthetic data from diffusion models ready for knowledge distillation?
CN114332473A (en) Object detection method, object detection device, computer equipment, storage medium and program product
Liu et al. ASFlow: Unsupervised optical flow learning with adaptive pyramid sampling
CN117593275A (en) A medical image segmentation system
CN110930413A (en) Image segmentation method based on weak supervision multi-core classification optimization merging
Zavrtanik et al. Keep dræming: discriminative 3d anomaly detection through anomaly simulation
Li et al. A review of advances in image inpainting research
JP2016099980A (en) Image segmentation method, apparatus, and program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant