CN115937704A - Remote sensing image road segmentation method based on topology perception neural network - Google Patents

Info

Publication number
CN115937704A
Authority
CN
China
Prior art keywords
remote sensing
module
sensing image
semantic segmentation
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211575990.2A
Other languages
Chinese (zh)
Other versions
CN115937704B (en)
Inventor
郭学俊
周瑞森
刘伟琳
李龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taiyuan University of Technology
Original Assignee
Taiyuan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taiyuan University of Technology
Priority to CN202211575990.2A
Publication of CN115937704A
Application granted
Publication of CN115937704B
Current legal status: Active
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of remote sensing image semantic segmentation, and specifically relates to a remote sensing image road segmentation method based on a topology-aware neural network. The method comprises the following steps: 1) dividing a high-resolution remote sensing dataset into a training set and a test set, and preprocessing the remote sensing images in each; 2) building an image semantic segmentation network model; 3) inputting the training set into the semantic segmentation network, first initializing the network, then updating the model parameters and optimizing the loss function until convergence; 4) inputting the test set data into the trained generator module to obtain high-precision semantic segmentation results. The invention addresses the following problems of existing automatic road extraction techniques for remote sensing images: insufficient feature representation ability, inability to accurately perceive road topology, context information limited to a single training sample, large loss of edge and detail information during semantic segmentation, and the large number of samples required for training.

Description

Road segmentation method for remote sensing images based on topology-aware neural network

Technical Field

The invention belongs to the technical field of remote sensing image semantic segmentation methods, and specifically relates to a remote sensing image road segmentation method based on a topology-aware neural network.

Background Art

With the continuous launch of high-resolution remote sensing satellites, the increasingly wide use of unmanned aerial vehicles, and the steady improvement in the spatial resolution of remote sensing images, earth observation has entered the era of high-resolution remote sensing big data. Within this data, road network information not only directly reflects the economic and social development level of a country or region, but also has important application value in traffic management, urban planning, automatic navigation, and other fields.

Traditional road segmentation of high-resolution remote sensing images relies mainly on manual annotation, which is time-consuming, laborious, and highly subjective. To improve detection efficiency, various object-based semantic segmentation methods have been proposed and have achieved some success in application. However, these techniques rely heavily on hand-crafted low-level features based on texture, geometric shape, and edges, and do not transfer well to scenes as complex as high-resolution remote sensing images; they therefore generalize poorly and are easily affected by noise.

Fully convolutional neural networks based on deep learning can automatically obtain high-level semantic features from remote sensing images through a multi-layer network structure and nonlinear transformations. In addition, their end-to-end, pixel-to-pixel design provides pixel-level road recognition and localization, making them currently the most promising approach to road extraction from remote sensing images. However, the fully convolutional networks currently used for this task focus mainly on pixel classification accuracy. They cannot accurately perceive the topological structure of roads, they lose much edge and detail information during semantic segmentation, and they are easily affected by occlusion or shadows, all of which seriously degrade the accuracy and completeness of the results. Fully convolutional models also tend to depend either on a large number of high-precision training samples or on models pre-trained on large numbers of everyday-scene images. Pixel-level annotation of remote sensing images, however, is time-consuming and laborious. Moreover, because everyday-scene images differ considerably from remote sensing images, the segmentation accuracy of models pre-trained on everyday scenes is often unsatisfactory. Finally, these complex models have a huge number of parameters, which places high demands on storage and computing hardware and makes both training and inference very time-consuming. These defects greatly limit the practical application of these methods.

Summary of the Invention

The present invention aims to comprehensively solve the problems of existing road segmentation methods for high-resolution remote sensing images: they cannot accurately perceive topological structure, they lose much edge and detail information during semantic segmentation, they are easily disturbed by occlusion and shadow, and their models are inefficient and difficult to train. To this end, a remote sensing image road segmentation method based on a topology-aware neural network is provided.

The present invention adopts the following technical solution: a high-resolution remote sensing image semantic segmentation network comprising a downsampling path, a spatial context module, and an upsampling path. The downsampling path comprises a deformable convolution layer and 5 consecutive downsampling units: the input data first passes through the deformable convolution layer to obtain a semantic feature map, then passes through the 5 consecutive downsampling units for feature extraction and downsampling, and a feature map is finally output. The upsampling path consists, in order, of 5 consecutive upsampling units, an aggregation operation A, a convolution layer, and a Softmax layer, and finally generates the segmentation prediction map. The 5 downsampling units and the 5 upsampling units correspond one to one and are linked by skip connections adjusted by a channel attention module. The spatial context module splits and fuses the feature map output by the downsampling path and outputs the result to the upsampling path.

In some embodiments, the deformable convolution layer is a deformable convolution with a 3×3 kernel and a stride of 1, which adds a learnable offset and a weight coefficient to the position of each sampling point in the 3×3 kernel.

In some embodiments, the downsampling unit includes a primary aggregation module, an aggregation operation, and a down-conversion module. The primary aggregation module extracts features; the aggregation operation then aggregates the input and output of the primary aggregation module, and the resulting feature map is passed to both the down-conversion module and the channel attention module. The output of the channel attention module is multiplied element-wise with its input to obtain the skip connection features adjusted by channel attention, which are fed to the corresponding part of the upsampling path.

In some embodiments, the upsampling unit consists of an up-conversion module, an aggregation operation B, and a primary aggregation module. The up-conversion module restores the spatial resolution of the feature map through upsampling; aggregation operation B aggregates the skip connection feature map adjusted by the channel attention module with the feature map produced by the up-conversion module; and the primary aggregation module extracts features from the result of aggregation operation B.

In some embodiments, the primary aggregation module is structured as follows. A feature map input to the primary aggregation module first passes through n convolution modules, yielding n new feature maps, where the first two convolution modules are deformable convolution modules. The n feature maps are then stacked along the channel dimension to obtain the channel-stacked feature map. Each convolution module consists, in order, of a batch normalization layer, a ReLU activation layer, a 3×3 convolutional layer, and a dropout layer; each deformable convolution module consists, in order, of a batch normalization layer, a ReLU activation layer, a 3×3 deformable convolutional layer, and a dropout layer.

In some embodiments, the channel attention module is structured as follows. An input feature of size $C \times H \times W$ undergoes global max pooling and global average pooling, giving two feature maps of size $C \times 1 \times 1$. Each is then fed into a shared multilayer perceptron whose first layer has $C/r$ neurons (r is the reduction rate, r=16) with a ReLU activation, and whose second layer has $C$ neurons. The two outputs are added and passed through a Sigmoid activation to generate the final channel attention, which is multiplied element-wise with the input feature.

In some embodiments, the spatial context module includes two paths: a "height direction" path and a "width direction" path.

In the "height direction" path, the feature map from the downsampling path, of size $C \times H \times W$ (where C is the number of channels, H the feature map height, and W the feature map width), is divided into H slices along the vertical direction. Each slice is linearly projected by a convolution operation, and the projected slice is input into the first linear layer to obtain the attention. Softmax and L1 normalization are then applied to the attention along the feature-map-size dimension and the channel dimension, respectively, and the normalized attention is input into the second linear layer to obtain a new slice. Finally, all new slices are aggregated along the "height direction" into a feature map.

In the "width direction" path, the feature map from the downsampling path is divided into W slices along the "width" direction, and the slices are updated and aggregated into a feature map in the same way.

The feature maps from the "height direction" and "width direction" paths, which have the same size, are fused by addition.
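The slice-wise update in the spatial context module resembles external attention with two learnable memory matrices. The following is a minimal NumPy sketch under stated assumptions: the names `slice_external_attention` and `height_path`, the shapes of `m_k` and `m_v`, the omission of the initial convolutional projection, and the choice of normalization axes are all illustrative guesses rather than the patented implementation.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def slice_external_attention(slice_cw, m_k, m_v, eps=1e-9):
    """Update one slice of shape (C, W) using two memory matrices.

    m_k: (S, C) weights of the first linear layer (assumed shape).
    m_v: (C, S) weights of the second linear layer (assumed shape).
    """
    feats = slice_cw.T                  # (W, C): one row per position in the slice
    attn = feats @ m_k.T                # (W, S): similarity to S memory units
    attn = softmax(attn, axis=0)        # Softmax over positions (assumed axis)
    attn = attn / (attn.sum(axis=1, keepdims=True) + eps)  # L1 normalization (assumed axis)
    out = attn @ m_v.T                  # (W, C)
    return out.T                        # back to (C, W)

def height_path(feature_map, m_k, m_v):
    # Split a (C, H, W) map into H row slices, update each, and restack them.
    slices = [slice_external_attention(feature_map[:, i, :], m_k, m_v)
              for i in range(feature_map.shape[1])]
    return np.stack(slices, axis=1)

rng = np.random.default_rng(0)
C, H, W, S = 8, 4, 6, 3                 # small hypothetical sizes for illustration
x = rng.standard_normal((C, H, W))
m_k = rng.standard_normal((S, C))
m_v = rng.standard_normal((C, S))
y = height_path(x, m_k, m_v)
```

Because `m_k` and `m_v` are shared across all slices and all training samples, they can accumulate context beyond a single image, and the cost grows linearly with the number of positions in a slice.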

A remote sensing image road segmentation method based on a topology-aware neural network includes the following steps:

S100: Divide the high-resolution remote sensing dataset into a training set and a test set, and preprocess the images in each.

S200: Build the high-resolution remote sensing image semantic segmentation network.

S300: Input the preprocessed training set images from S100 into the network built in S200 for training. The network is first initialized with the He Uniform method; the model parameters are then updated to optimize the loss function until convergence.

S400: Input the preprocessed test set remote sensing images into the trained segmentation network and output the semantic segmentation results for the high-resolution remote sensing images.
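The He Uniform initialization named in S300 can be sketched as follows. This is a minimal illustration assuming the usual definition (weights drawn uniformly from [-limit, limit] with limit = sqrt(6 / fan_in)); the helper name `he_uniform` and the example layer shape are hypothetical.

```python
import numpy as np

def he_uniform(shape, fan_in, rng):
    # He (Kaiming) uniform: sample from U(-limit, limit), limit = sqrt(6 / fan_in).
    limit = np.sqrt(6.0 / fan_in)
    return rng.uniform(-limit, limit, size=shape)

rng = np.random.default_rng(0)
# Hypothetical 3x3 convolution with 48 input channels and 16 output channels.
out_ch, in_ch, k = 16, 48, 3
weights = he_uniform((out_ch, in_ch, k, k), fan_in=in_ch * k * k, rng=rng)
limit = np.sqrt(6.0 / (in_ch * k * k))
```

Scaling the range by the fan-in keeps activation variance roughly stable across ReLU layers, which matters for a network trained from scratch without a pre-trained model.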

In step S100, the preprocessing includes manual image annotation, image cropping, and data augmentation.

Manual image annotation: roads in the high-resolution images are manually annotated at the pixel level in ArcGIS software to obtain labeled road images.

Image cropping: the labeled high-resolution remote sensing images are randomly cropped into sub-images of 512 × 512 pixels.

Data augmentation: the sub-images undergo random scale transformation, rotation by random angles, and vertical and horizontal flipping to produce additional high-resolution remote sensing images.
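The cropping and flipping steps can be sketched as below. This is a minimal NumPy illustration on synthetic data; the function names, the 0.5 flip probability, and the 1024 × 1024 stand-in tile are assumptions, and random scaling and rotation are left out for brevity.

```python
import numpy as np

def random_crop_pair(image, label, size, rng):
    # Crop the same random window from an image (H, W, C) and its label mask (H, W).
    h, w = label.shape
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return (image[top:top + size, left:left + size],
            label[top:top + size, left:left + size])

def random_flip_pair(image, label, rng):
    # Apply vertical and horizontal flips independently, each with probability 0.5.
    if rng.random() < 0.5:
        image, label = image[::-1], label[::-1]
    if rng.random() < 0.5:
        image, label = image[:, ::-1], label[:, ::-1]
    return image, label

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(1024, 1024, 3), dtype=np.uint8)  # synthetic stand-in tile
label = (rng.random((1024, 1024)) < 0.1).astype(np.uint8)           # synthetic road mask
patch, mask = random_crop_pair(image, label, 512, rng)
patch, mask = random_flip_pair(patch, mask, rng)
```

The image and its label mask must always receive the same geometric transform, otherwise the pixel-level annotation no longer lines up with the road pixels.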

In step S300, the loss function is the Dice Loss:

$$L_{Dice} = 1 - \frac{2\,|X \cap Y|}{|X| + |Y|}$$

where X is the set of predicted values for all pixels in the prediction map, Y is the set of values for all pixels in the label image, $|X \cap Y|$ is the intersection of X and Y, and $|X|$ and $|Y|$ are the numbers of elements in X and Y.
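A Dice loss of this kind can be sketched in a few lines. The version below assumes the standard form 1 - 2|X ∩ Y| / (|X| + |Y|), uses the product of prediction and label as a soft intersection, and adds a small `eps` for numerical stability; the function name and `eps` value are illustrative choices.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-7):
    # pred: predicted road probabilities in [0, 1]; target: binary label mask.
    pred = pred.astype(np.float64).ravel()
    target = target.astype(np.float64).ravel()
    intersection = np.sum(pred * target)     # soft version of |X ∩ Y|
    total = np.sum(pred) + np.sum(target)    # |X| + |Y|
    return 1.0 - (2.0 * intersection + eps) / (total + eps)

target = np.array([[0, 1], [1, 1]])
perfect = dice_loss(target, target)          # prediction identical to the label
disjoint = dice_loss(1 - target, target)     # prediction with no overlap
```

A perfect prediction drives the loss toward 0 and a prediction with no overlap drives it toward 1. Because the loss depends on overlap rather than per-pixel accuracy, it is less dominated by the large background class typical of road imagery.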

Compared with the prior art, the present invention has the following beneficial effects:

1) The invention adopts a spatial context module, which converts the traditional layer-by-layer convolution connection into a slice-by-slice form within the feature map, so that information can be passed between pixel rows and columns. Therefore, even when a road is occluded or other appearance cues are poor, the spatial context module, thanks to its strong perception of topological shape, remains suitable for detecting road targets with long, continuous shapes.

2) The spatial context module uses two linear layers with a memory function in place of the Key and Value vectors of the traditional self-attention mechanism, which both captures global context information from all samples in the training set and reduces computational complexity.

3) The down-conversion and up-conversion modules of the network use the wavelet transform and inverse wavelet transform for downsampling and upsampling, which further avoids the loss of detail and edge information in the model and thereby improves model performance.

4) In the downsampling and upsampling paths, primary (one-shot) aggregation modules extract road features. This design increases feature reuse and reduces redundant computation, so the model is easy to train, needs no pre-trained model, and requires fewer training samples. The primary aggregation module also incorporates deformable convolution, which helps to capture accurate road geometry and strengthens topology perception.

5) The skip connections use a channel attention module, which helps to select appropriate channel information for fusion and to avoid interference, improving the semantic segmentation accuracy of the model.

Through these beneficial effects, the invention comprehensively addresses the problems of existing road extraction methods for high-resolution remote sensing images: inability to accurately perceive road topology, global context information limited to a single sample, large loss of edge and detail information during semantic segmentation, and the large number of samples required for training.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of the overall structure of the high-resolution remote sensing image semantic segmentation network constructed in the method of the present invention;

FIG. 2 is a schematic diagram of the structure of the primary aggregation module combined with deformable convolution in the high-resolution remote sensing image semantic segmentation network constructed in the method of the present invention;

FIG. 3 is a schematic diagram of the structure of the channel attention module in the high-resolution remote sensing image semantic segmentation network constructed in the method of the present invention;

FIG. 4 is a schematic diagram of the structure of the spatial context module in the high-resolution remote sensing image semantic segmentation network constructed in the method of the present invention.

DETAILED DESCRIPTION

Specific embodiments of the present invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention.

The present invention relates to a remote sensing image road segmentation method based on a topology-aware neural network, comprising the following steps:

Step 1: Divide the high-resolution remote sensing dataset into a training set and a test set, and preprocess the images in each.

The preprocessing includes manual image annotation, image cropping, and data augmentation. Manual image annotation: roads in the high-resolution images are manually annotated at the pixel level in ArcGIS software to obtain labeled road images.

Image cropping: the labeled high-resolution remote sensing images are randomly cropped into sub-images of 512 × 512 pixels.

Data augmentation: the sub-images undergo random scale transformation, rotation by random angles, and vertical and horizontal flipping to produce additional high-resolution remote sensing images.

Step 2: Build the high-resolution remote sensing image semantic segmentation network. As shown in FIG. 1, the network consists, in order, of a downsampling path, a spatial context module, and an upsampling path. The downsampling path and the upsampling path are linked by skip connections adjusted by channel attention.

Downsampling path: the input data first passes through a deformable convolution layer with a 3×3 kernel and a stride of 1, yielding a semantic feature map with 48 channels; it then passes through 5 consecutive downsampling units for feature extraction and downsampling.

Deformable convolution adds a learnable offset and a weight coefficient to each sampling point of a conventional convolution kernel. With these variables, the kernel can sample freely around the current position rather than being confined to the regular grid of a conventional convolution, and can exclude interference from irrelevant context. Deformable convolution therefore helps to extract the complex geometric features of roads accurately.

Each downsampling unit consists, in order, of a primary aggregation module, an aggregation operation, and a down-conversion module. The primary aggregation module extracts features; the aggregation operation then aggregates the input and output of the primary aggregation module, and the resulting feature map is passed to both the down-conversion module and the channel attention module. The output of the channel attention module (the channel attention) is multiplied element-wise with its input to obtain the skip connection features adjusted by channel attention, which are fed to the corresponding part of the upsampling path.

As shown in FIG. 2, the primary aggregation module is structured as follows. A feature map input to the module first passes through n convolution modules, yielding n new feature maps (the first two convolution modules are deformable convolution modules). The n feature maps are then stacked along the channel dimension to obtain the channel-stacked feature map. Because the primary aggregation module aggregates the feature maps in a single step, it reuses the features extracted by the network while avoiding redundant computation, which effectively reduces the difficulty of model training and the number of samples required.

In the downsampling path, the total number n of conventional and deformable convolution modules in the successive primary aggregation modules is 4, 5, 7, 10, and 12, and the corresponding number of output feature map channels m is 112, 192, 304, 464, and 656. In the upsampling units, the total number n of conventional and deformable convolution modules in the successive primary aggregation modules is 12, 10, 7, 5, and 4, and the number of feature map channels output to the next up-conversion module is 192, 160, 112, 80, and 64.
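The listed channel counts are consistent with a simple bookkeeping rule. The sketch below assumes that each convolution module contributes 16 channels (the m=16 given for the convolution layers), that the first deformable convolution layer outputs 48 channels, that each downsampling unit concatenates its input with its n new feature maps, and that each upsampling unit passes on only its n new feature maps. This reading is inferred from the numbers, not stated explicitly in the text, and the function names are hypothetical.

```python
GROWTH = 16          # channels produced by each convolution module (m=16 in the text)
STEM_CHANNELS = 48   # channels after the first deformable convolution layer

def down_path_channels(modules_per_unit, in_channels=STEM_CHANNELS, growth=GROWTH):
    # Each unit concatenates its input with the n new 16-channel feature maps.
    channels = []
    c = in_channels
    for n in modules_per_unit:
        c = c + n * growth
        channels.append(c)
    return channels

def up_path_channels(modules_per_unit, growth=GROWTH):
    # In the upsampling path only the n new feature maps are passed on.
    return [n * growth for n in modules_per_unit]

down = down_path_channels([4, 5, 7, 10, 12])
up = up_path_channels([12, 10, 7, 5, 4])
print(down)  # [112, 192, 304, 464, 656]
print(up)    # [192, 160, 112, 80, 64]
```

For example, the first downsampling unit gives 48 + 4 × 16 = 112 channels and the last gives 464 + 12 × 16 = 656, matching the value C=656 used by the spatial context module.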

Each convolution module or deformable convolution module in the primary aggregation module consists, in order, of a batch normalization layer, a ReLU activation layer, a 3×3 convolutional layer or deformable convolutional layer (number of channels m=16), and a dropout layer (dropout probability p=0.2).

As shown in FIG. 3, the channel attention module is structured as follows. An input feature of size $C \times H \times W$ undergoes global max pooling and global average pooling, giving two feature maps of size $C \times 1 \times 1$. Each is then fed into a shared multilayer perceptron whose first layer has $C/r$ neurons (r is the reduction rate, r=16) with a ReLU activation, and whose second layer has $C$ neurons. The two outputs are added and passed through a Sigmoid activation to generate the final channel attention, which is multiplied element-wise with the input feature map.
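The channel attention computation can be sketched as below. This is a minimal NumPy illustration assuming a (C, H, W) layout, a hidden width of C // r in the shared perceptron, and no bias terms; the function name, weight shapes, and small test sizes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """x: (C, H, W); w1: (C // r, C); w2: (C, C // r). Biases omitted for brevity."""
    max_desc = x.max(axis=(1, 2))    # global max pooling -> (C,)
    avg_desc = x.mean(axis=(1, 2))   # global average pooling -> (C,)

    def shared_mlp(v):
        # The same two-layer perceptron is applied to both descriptors.
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU between the two layers

    attn = sigmoid(shared_mlp(max_desc) + shared_mlp(avg_desc))  # (C,)
    return x * attn[:, None, None], attn     # reweight every channel of the input

rng = np.random.default_rng(0)
C, H, W, r = 32, 5, 5, 16
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
y, attn = channel_attention(x, w1, w2)
```

Each channel of the skip connection is scaled by a weight between 0 and 1, so uninformative channels are suppressed before they are fused in the upsampling path.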

The down-conversion module uses the Daubechies wavelet as the wavelet basis. After one wavelet transform, each channel of the image is decomposed into four sub-images with half the original height and width. The two letters in each sub-image's subscript refer, in order, to the horizontal and vertical directions, with G denoting high-frequency information and D denoting low-frequency information; for example, GG denotes high-frequency information in both the horizontal and vertical directions. One sub-image is output, while the remaining sub-images are transmitted to the corresponding module of the upsampling path.
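A one-level 2D wavelet decomposition of this kind can be sketched as follows. The text names the Daubechies family without giving the order; this sketch assumes the Haar wavelet (the simplest member, db1), and the function names are hypothetical. It shows that one transform yields four half-size sub-images and that the inverse transform restores the input exactly.

```python
import numpy as np

def haar_dwt2(channel):
    """One-level 2D Haar transform of a 2D array with even height and width."""
    a = channel[0::2, 0::2]   # top-left pixel of each 2x2 block
    b = channel[0::2, 1::2]   # top-right
    c = channel[1::2, 0::2]   # bottom-left
    d = channel[1::2, 1::2]   # bottom-right
    low_low = (a + b + c + d) / 2.0     # low frequency in both directions
    high_horiz = (a - b + c - d) / 2.0  # high frequency horizontally, low vertically
    high_vert = (a + b - c - d) / 2.0   # low frequency horizontally, high vertically
    high_high = (a - b - c + d) / 2.0   # high frequency in both directions
    return low_low, high_horiz, high_vert, high_high

def haar_idwt2(low_low, high_horiz, high_vert, high_high):
    """Inverse of haar_dwt2: rebuild the full-resolution array."""
    h, w = low_low.shape
    out = np.empty((2 * h, 2 * w), dtype=np.float64)
    out[0::2, 0::2] = (low_low + high_horiz + high_vert + high_high) / 2.0
    out[0::2, 1::2] = (low_low - high_horiz + high_vert - high_high) / 2.0
    out[1::2, 0::2] = (low_low + high_horiz - high_vert - high_high) / 2.0
    out[1::2, 1::2] = (low_low - high_horiz - high_vert + high_high) / 2.0
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
subbands = haar_dwt2(x)
restored = haar_idwt2(*subbands)
```

Unlike pooling or strided convolution, the transform discards nothing: the sub-images together hold all the information of the input, which is why passing them to the upsampling path lets the inverse transform recover edges and fine detail.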

As shown in Figure 4, the spatial context module consists of two paths, a "height direction" path and a "width direction" path. In the "height direction" path, the feature map from the down-sampling path (assumed to be of size C × H × W, where C is the number of channels, H the feature map height and W the feature map width) is divided along the vertical direction into H slices. Each slice is first linearly projected by a convolution operation and then fed into the first linear layer (with S = 64 and C = 656) to obtain the attention. Softmax and L1 normalization are then applied to the attention in turn, over the feature-map-size dimension and the channel dimension respectively, and the normalized attention is fed into the second linear layer (with C = 656 and S = 64) to obtain a new slice. Finally, all new slices are aggregated along the "height direction" into a single feature map.

In the "width direction" path, the feature map from the down-sampling path is divided along the width direction into W slices, which are updated through the same two stages and then aggregated. The feature maps from the "height direction" and "width direction" paths, which have the same size, are fused by addition.

The spatial context module replaces the conventional layer-by-layer connection of convolutions with slice-by-slice self-attention of linear complexity within the feature map, so that information can pass between the pixel rows and columns of the map. In addition, linear layers with a memory function replace the Key and Value vectors of conventional self-attention. As a result, even when roads in the remote sensing image are occluded or covered by shadows, the spatial context module remains suitable for detecting long, continuous road targets, and it can extract global context information distributed across the entire training set while reducing computational complexity.
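The update of a single slice can be sketched as follows. This is a hedged illustration only: the names `slice_attention`, `m_k` and `m_v`, the bias-free linear layers and the exact axes of the two normalizations are assumptions, chosen to match the description of two linear layers with Softmax over the positions of the slice followed by L1 normalization over the S memory units. The initial convolutional projection is omitted, and the toy sizes (N = 3 positions, C = 2 channels, S = 2) stand in for the values S = 64 and C = 656 given in the text.

```python
import math

def slice_attention(slice_feats, m_k, m_v):
    """slice_feats: [N][C]; m_k: [S][C] (first linear layer); m_v: [S][C] (second)."""
    n, s = len(slice_feats), len(m_k)
    # First linear layer: attention logits of shape [N][S].
    logits = [[sum(f * k for f, k in zip(feat, key)) for key in m_k]
              for feat in slice_feats]
    # Softmax over the N positions, separately for each of the S memory units.
    attn = [[0.0] * s for _ in range(n)]
    for j in range(s):
        col_max = max(logits[i][j] for i in range(n))
        exps = [math.exp(logits[i][j] - col_max) for i in range(n)]
        total = sum(exps)
        for i in range(n):
            attn[i][j] = exps[i] / total
    # L1 normalization over the S memory units, separately for each position.
    attn = [[a / sum(row) for a in row] for row in attn]
    # Second linear layer: map the [N][S] attention back to [N][C] features.
    c = len(m_v[0])
    out = [[sum(row[j] * m_v[j][ch] for j in range(s)) for ch in range(c)]
           for row in attn]
    return attn, out

slice_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # N = 3 positions, C = 2
m_k = [[1.0, 0.0], [0.0, 1.0]]                        # hypothetical weights, S = 2
m_v = [[1.0, 2.0], [3.0, 4.0]]                        # hypothetical weights
attn, new_slice = slice_attention(slice_feats, m_k, m_v)
```

The cost grows linearly with the number of positions N because the attention has shape N × S with S fixed, rather than N × N as in conventional self-attention. The two weight matrices are learned over the whole training set and are shared by all slices, which is what the text means by a "memory function".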

The up-sampling path consists, in sequence, of 5 consecutive up-sampling units, a convolution layer with a 1×1 kernel and a stride of 1, and a Softmax layer, and finally generates the segmentation prediction map.

The up-sampling path consists, in sequence, of 5 consecutive up-sampling units, an aggregation operation (denoted aggregation operation A), a convolution layer with a 1×1 kernel and a stride of 1, and a Softmax layer, and finally generates the segmentation prediction map.

The up-sampling unit consists of a group of up-conversion modules, an aggregation operation (denoted aggregation operation B) and a one-shot aggregation module. The up-conversion modules restore the spatial resolution of the feature map through up-sampling, aggregation operation B aggregates the skip-connection feature map adjusted by the channel attention module with the feature map produced by the up-conversion modules, and the one-shot aggregation module extracts features from the result of aggregation operation B.

The up-conversion module takes the three high-frequency feature maps output by the corresponding down-conversion module in the down-sampling path, increases their number of channels with a 1×1 convolution, combines them with the new features output by the spatial context module or by the previous one-shot aggregation module, and applies the inverse wavelet transform with the Daubechies wavelet as the wavelet basis, thereby achieving up-sampling.
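The inverse transform at the heart of the up-conversion module can be sketched as below, again assuming the Haar wavelet as a stand-in for the unspecified Daubechies order. The function name `haar_idwt2` and its argument names are hypothetical; `dd` plays the role of the features arriving from the spatial context module or the previous one-shot aggregation module, and `gd`, `dg` and `gg` play the role of the three high-frequency maps from the skip path. The 1×1 convolution that matches the channel counts is omitted.

```python
def haar_idwt2(dd, gd, dg, gg):
    """Inverse one-level 2D Haar transform; each input is an [h][w] list of lists."""
    h, w = len(dd), len(dd[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for r in range(h):
        for s in range(w):
            p, q, u, v = dd[r][s], gd[r][s], dg[r][s], gg[r][s]
            # Rebuild the 2x2 block that produced these four coefficients.
            out[2 * r][2 * s] = (p + q + u + v) / 2
            out[2 * r][2 * s + 1] = (p - q + u - v) / 2
            out[2 * r + 1][2 * s] = (p + q - u - v) / 2
            out[2 * r + 1][2 * s + 1] = (p - q - u + v) / 2
    return out

# Sub-images of the 4x4 ramp image 1..16 under the orthonormal Haar transform.
dd = [[7.0, 11.0], [23.0, 27.0]]
gd = [[-1.0, -1.0], [-1.0, -1.0]]
dg = [[-4.0, -4.0], [-4.0, -4.0]]
gg = [[0.0, 0.0], [0.0, 0.0]]
restored = haar_idwt2(dd, gd, dg, gg)
```

The output has twice the height and width of each input sub-image, so the inverse transform doubles the spatial resolution while reinserting the edge and detail information that the high-frequency maps preserved.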

Aggregation operation A in the up-sampling path aggregates the result of aggregation operation B in the last up-sampling unit with the output of the one-shot aggregation module once more.

The convolution layer with a 1×1 kernel and a stride of 1 converts the number of channels to 2.
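A 1×1 convolution with stride 1 acts on each pixel independently, so the final two layers of the up-sampling path can be illustrated for a single pixel. In this sketch the function name `predict_pixel`, the weights, the biases and the class order (background first, road second) are assumptions for illustration.

```python
import math

def predict_pixel(features, weights, biases):
    """features: [C]; weights: [2][C]; biases: [2]. Returns two class probabilities."""
    # 1x1 convolution: a linear map from C channels to 2 channels at this pixel.
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    # Softmax over the 2 output channels (shifted by the maximum for stability).
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

features = [1.0, 2.0, 3.0]                      # C = 3 in this toy example
weights = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]    # hypothetical weights
biases = [0.0, 0.0]
probs = predict_pixel(features, weights, biases)
uniform = predict_pixel(features, [[0.0] * 3, [0.0] * 3], biases)
```

Applying the same function at every pixel of the aggregated feature map gives the two-channel segmentation prediction map.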

Step 3: Input the training set images preprocessed in Step 1 into the high-resolution remote sensing image semantic segmentation network built in Step 2 for training. The network is first initialized with the He Uniform method; the model parameters are then updated to optimize the loss function until convergence.

The loss function is the Dice Loss, Dice Loss = 1 - 2|X ∩ Y| / (|X| + |Y|), where X is the set of predicted values of all pixels in the prediction map, Y is the set of values of all pixels in the label image, |X ∩ Y| is the intersection of X and Y, and |X| and |Y| are the numbers of elements in X and Y.
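The two ingredients of Step 3 can be sketched as follows. The sketch assumes the usual definitions: He Uniform draws weights from a uniform distribution on [-limit, limit] with limit = sqrt(6 / fan_in), and the Dice Loss is 1 - 2|X ∩ Y| / (|X| + |Y|) in its common "soft" form, where the intersection is the sum of element-wise products. The function names, the flat-list layout and the small smoothing constant `eps` are illustrative assumptions.

```python
import math
import random

def he_uniform(fan_in, fan_out, rng=random):
    """Weights of shape [fan_out][fan_in] drawn from U(-limit, limit)."""
    limit = math.sqrt(6.0 / fan_in)
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]

def dice_loss(pred, label, eps=1e-7):
    """pred: predicted road probabilities per pixel; label: 0/1 ground truth."""
    # Soft intersection |X ∩ Y|: sum of element-wise products.
    intersection = sum(p * y for p, y in zip(pred, label))
    # eps avoids division by zero when both maps are empty (assumption).
    return 1.0 - (2.0 * intersection + eps) / (sum(pred) + sum(label) + eps)

weights = he_uniform(fan_in=9, fan_out=4)               # e.g. one 3x3 kernel per output
partial = dice_loss([1.0, 1.0, 0.0, 0.0], [1, 0, 0, 0])  # one of two predictions correct
perfect = dice_loss([1.0, 0.0], [1, 0])                  # prediction equals label
```

The Dice Loss depends on the overlap between prediction and label rather than on per-pixel accuracy, which makes it less sensitive to the strong class imbalance between thin road pixels and background.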

Step 4: Input the preprocessed test set remote sensing images into the segmentation network trained in Step 3, and output the semantic segmentation results of the high-resolution remote sensing images.

Regarding the specific structure of the present invention, it should be noted that the connection relationships between the component modules used in the present invention are definite and realizable. Except where specifically stated in the embodiments, these specific connection relationships bring about the corresponding technical effects and solve the technical problems addressed by the present invention without depending on the execution of corresponding software programs. Except where specifically stated, the models and connection methods of the components, modules and specific parts appearing in the present invention belong to the prior art, such as published patents, published journal papers or common general knowledge, available to those skilled in the art before the filing date, and need not be elaborated here. The technical solution provided in this application is therefore clear, complete and realizable, and the corresponding physical products can be reproduced or obtained according to these technical means.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or replace some or all of their technical features with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A high-resolution remote sensing image semantic segmentation network, characterized in that it comprises a down-sampling path, a spatial context module and an up-sampling path, wherein:
the down-sampling path comprises a deformable convolution layer and 5 consecutive down-sampling units; input data first passes through the deformable convolution layer to obtain a semantic feature map, feature extraction and down-sampling are then performed by the 5 consecutive down-sampling units, and a feature map is finally output;
the up-sampling path consists, in sequence, of 5 consecutive up-sampling units, an aggregation operation A, a convolution layer and a Softmax layer, and finally generates a segmentation prediction map;
the 5 down-sampling units and the 5 up-sampling units correspond one to one and are connected by skip connections whose features are adjusted by the channel attention module;
and the spatial context module divides the feature map output by the down-sampling path into slices, fuses them, and outputs the result to the up-sampling path.
2. The high-resolution remote sensing image semantic segmentation network according to claim 1, characterized in that the deformable convolution layer is a deformable convolution with a 3×3 kernel and a stride of 1, which adds a learnable offset and a learnable weight coefficient to the position of each sampling point in the 3×3 kernel.
3. The high-resolution remote sensing image semantic segmentation network according to claim 2, characterized in that the down-sampling unit comprises a one-shot aggregation module, an aggregation operation and a down-conversion module, wherein the one-shot aggregation module extracts features, the aggregation operation then aggregates the input and the output of the one-shot aggregation module, and the resulting feature map is passed to both the down-conversion module and the channel attention module; the output of the channel attention module is multiplied element-wise with its input to obtain skip-connection features adjusted by channel attention, which are input to the corresponding part of the up-sampling path.
4. The high-resolution remote sensing image semantic segmentation network according to claim 1, characterized in that the up-sampling unit comprises a group of up-conversion modules, an aggregation operation B and a one-shot aggregation module, wherein the up-conversion modules restore the spatial resolution of the feature map through up-sampling, aggregation operation B aggregates the skip-connection feature map adjusted by the channel attention module with the feature map produced by the up-conversion modules, and the one-shot aggregation module extracts features from the result of aggregation operation B.
5. The high-resolution remote sensing image semantic segmentation network according to claim 3 or 4, characterized in that the one-shot aggregation module is structured as follows: after a feature map is input into the one-shot aggregation module, it first passes through a series of convolution modules, each of which produces a new feature map, the first two convolution modules being deformable convolution modules; the new feature maps are then stacked along the channel dimension to obtain the channel-stacked feature map; each convolution module consists, in sequence, of a batch normalization layer, a ReLU activation function layer, a 3×3 convolution layer and a dropout layer; each deformable convolution module consists, in sequence, of a batch normalization layer, a ReLU activation function layer, a 3×3 deformable convolution layer and a dropout layer.
6. The high-resolution remote sensing image semantic segmentation network according to claim 5, characterized in that the channel attention module is structured as follows: the input features, of size C × H × W, undergo global max pooling and global average pooling respectively to obtain two feature maps of size C × 1 × 1;
these are then each sent to a shared multilayer perceptron, in which the first layer has C/r neurons, r being the reduction rate with r = 16, the activation function is ReLU, and the second layer has C neurons;
the two output features are added, and the final channel attention is generated by a Sigmoid activation; the channel attention is multiplied element-wise with the input features.
7. The high-resolution remote sensing image semantic segmentation network according to claim 6, characterized in that the spatial context module comprises two paths, a "height direction" path and a "width direction" path;
in the "height direction" path, the feature map from the down-sampling path, of size C × H × W, where C is the number of channels, H the feature map height and W the feature map width, is divided along the vertical direction into H slices; each slice is first linearly projected by a convolution operation and then input into the first linear layer to obtain the attention; Softmax and L1 normalization are then applied to the attention in turn, over the feature-map-size dimension and the channel dimension respectively, and the normalized attention is input into the second linear layer to obtain a new slice; finally, all new slices are aggregated along the "height direction" into a feature map;
in the "width direction" path, the feature map from the down-sampling path is divided along the width direction into W slices, and the slices are updated and aggregated into a feature map in the same way;
the feature maps from the "height direction" and "width direction" paths, which have the same size, are fused by addition.
8. A remote sensing image road segmentation method based on a topology-aware neural network, characterized by comprising the following steps:
S100: dividing the high-resolution remote sensing data set into a training set and a test set, and preprocessing the images in the training set and the test set respectively;
S200: building a high-resolution remote sensing image semantic segmentation network;
S300: inputting the training set images preprocessed in S100 into the high-resolution remote sensing image semantic segmentation network built in S200 for training, first initializing the network with the He Uniform method, then updating the model parameters and optimizing the loss function until convergence;
S400: inputting the preprocessed test set remote sensing images into the segmentation network trained in S300, and outputting the semantic segmentation results of the high-resolution remote sensing images.
9. The remote sensing image road segmentation method based on a topology-aware neural network according to claim 8, characterized in that in step S100 the preprocessing comprises manual image annotation, image cropping and data augmentation;
the manual image annotation is specifically: manually performing pixel-level semantic annotation of the roads in the high-resolution images in ArcGIS software to obtain labeled surface crack images;
the image cropping is specifically: randomly cropping the labeled high-resolution remote sensing images into sub-images of 512 × 512 pixels;
the data augmentation comprises: applying random scale transformation, rotation by random angles, and vertical and horizontal flipping to the sub-images to obtain the high-resolution remote sensing images.
10. The remote sensing image road segmentation method based on a topology-aware neural network according to claim 8, characterized in that in step S300 the loss function is the Dice Loss, specifically:
Dice Loss = 1 - 2|X ∩ Y| / (|X| + |Y|)
where X is the set of predicted values of all pixels in the prediction map, Y is the set of values of all pixels in the label image, |X ∩ Y| is the intersection of X and Y, and |X| and |Y| are the numbers of elements in X and Y.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211575990.2A CN115937704B (en) 2022-12-09 2022-12-09 Remote sensing image road segmentation method based on topology perception neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211575990.2A CN115937704B (en) 2022-12-09 2022-12-09 Remote sensing image road segmentation method based on topology perception neural network

Publications (2)

Publication Number Publication Date
CN115937704A true CN115937704A (en) 2023-04-07
CN115937704B CN115937704B (en) 2025-03-11

Family

ID=86653560

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211575990.2A Active CN115937704B (en) 2022-12-09 2022-12-09 Remote sensing image road segmentation method based on topology perception neural network

Country Status (1)

Country Link
CN (1) CN115937704B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118351446A (en) * 2024-04-26 2024-07-16 江西省国土空间调查规划研究院 A road network optimization method considering direction-connectivity
CN118536251A (en) * 2024-02-28 2024-08-23 浙江大学 A controllable generation method and system of topological road network for autonomous driving simulation

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145718A (en) * 2018-07-04 2019-01-04 国交空间信息技术(北京)有限公司 The road network extracting method and device of remote sensing image based on topology ambiguity
CN112016511A (en) * 2020-09-08 2020-12-01 重庆市地理信息和遥感应用中心 Remote sensing image blue top room detection method based on large-scale depth convolution neural network
CN113313180A (en) * 2021-06-04 2021-08-27 太原理工大学 Remote sensing image semantic segmentation method based on deep confrontation learning
US20210272266A1 (en) * 2020-02-27 2021-09-02 North China Institute of Aerospace Engineering Automatic Interpretation Method for Winter Wheat Based on Deformable Fully Convolutional Neural Network
WO2021175434A1 (en) * 2020-03-05 2021-09-10 Cambridge Enterprise Limited System and method for predicting a map from an image
CN113888550A (en) * 2021-09-27 2022-01-04 太原理工大学 Remote sensing image road segmentation method combining super-resolution and attention mechanism
WO2022199143A1 (en) * 2021-03-26 2022-09-29 南京邮电大学 Medical image segmentation method based on u-shaped network
US11521377B1 (en) * 2021-10-26 2022-12-06 Nanjing University Of Information Sci. & Tech. Landslide recognition method based on laplacian pyramid remote sensing image fusion


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
尚群锋;沈炜;帅世渊;: "基于深度学习高分辨率遥感影像语义分割", 计算机系统应用, no. 07, 15 July 2020 (2020-07-15) *
邓睿哲;陈启浩;陈奇;刘修国;: "遥感影像船舶检测的特征金字塔网络建模方法", 测绘学报, no. 06, 15 June 2020 (2020-06-15) *


Also Published As

Publication number Publication date
CN115937704B (en) 2025-03-11


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant