CN115631412A - Remote sensing image building extraction method based on coordinate attention and data correlation upsampling - Google Patents
Remote sensing image building extraction method based on coordinate attention and data correlation upsampling
- Publication number
- CN115631412A (application number CN202211270279.6A)
- Authority
- CN
- China
- Prior art keywords
- building
- image
- remote sensing
- data
- cad
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/176—Urban or other man-made structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
Description
Technical Field
The present invention relates to the technical field of image processing, and in particular to a method for extracting buildings from remote sensing images based on coordinate attention and data-dependent upsampling.
Background Art
Buildings are indispensable places of activity in people's daily lives and an important component of urban construction and development. The main task of building extraction is to identify and extract building regions from remote sensing images. Building extraction is of great significance for smart city construction, traffic management, population estimation, land-use monitoring, and other applications. With the rapid development of remote sensing technology, remote sensing images have begun to transition from low resolution to high resolution, forming a development trend characterized by high spatial, spectral, and temporal resolution. As the features and information contained in high-resolution remote sensing images keep increasing, noise and interference increase accordingly, which brings new challenges to building extraction; how to accurately extract buildings from high-resolution remote sensing imagery has become a research hotspot and difficulty.
Traditional building extraction methods are usually based on prior knowledge and handcrafted features, followed by clustering or similar algorithms, and mainly include methods based on building features and methods based on auxiliary information. Most of these methods rely on features such as building shape and texture, or on auxiliary information, to extract buildings. Although the underlying principles are relatively simple, these methods suffer from low recognition rates and frequent errors, and the process is time-consuming and labor-intensive, so they have major limitations in practical applications and their performance is severely constrained. Specifically:
First, little attention is paid to the location information of buildings. Location information is extremely important for building extraction: buildings are usually distributed regularly within a remote sensing image and are often partially occluded by shadows from trees and other objects. Focusing on location information makes it possible to locate a building accurately within the image and avoid misclassification. The prior art does not fully consider building location information, especially for buildings under shadow occlusion or with complex adhesion, so misclassification occurs easily.
Second, the extracted building boundaries are rough and blurred. Buildings are mostly rectangular and usually have regular boundaries, and boundary information is an important feature that cannot be ignored in building extraction. If edge information is neglected during extraction, problems such as rough and blurred boundaries, chaotic boundaries, and holes easily arise. The prior art performs only conventional feature extraction and fails to fully exploit the edge features of buildings, resulting in poor extraction results with rough and blurred boundaries.
Third, there is an imbalance between positive and negative samples. Building extraction is a binary classification task that mainly distinguishes buildings from background. In most cases, however, remote sensing images contain more background pixels than building pixels, which weakens the model's ability to extract buildings during training. The prior art fails to fully account for this imbalance, so the building extraction accuracy of the model is low and its generalization ability is weak.
Summary of the Invention
The purpose of the present invention is to provide a remote sensing image building extraction method based on coordinate attention and data-dependent upsampling that effectively solves the misclassification problem caused by insufficient attention to building location information, improves the boundary extraction of buildings, alleviates the imbalance between positive and negative samples, and improves the generalization ability of the network.
To achieve the above purpose, the present invention adopts the following technical solution: a remote sensing image building extraction method based on coordinate attention and data-dependent upsampling, comprising the following steps in order:
(1) Acquire remote sensing data: download the WHU building dataset and the Massachusetts building dataset;
(2) Data preprocessing and data augmentation: the preprocessing refers to cropping the large images in the datasets and applying data augmentation to the cropped remote sensing images and label images; the augmented remote sensing images and label images are divided into a training set, a validation set, and a test set at a ratio of 8:1:1;
(3) Build the CAD-UNet network model: improve on the UNet network to build a building extraction network model comprising an encoder, a coordinate attention (CA) module, and a data-dependent upsampling (DUp) module, namely the CAD-UNet network model;
(4) Model training and evaluation: train the CAD-UNet network model on the training set with a joint loss function combining the BCE (binary cross-entropy) loss and the Focal loss; after training, evaluate the building extraction accuracy and effect of the CAD-UNet network model on the test set;
(5) Automatic building extraction: after data preprocessing, feed a new remote sensing image to be processed into the trained CAD-UNet network model; the model outputs a predicted image, which is the building extraction result.
Step (2) specifically includes the following steps:
(2a) Crop the large remote sensing images and label images in the Massachusetts building dataset into 512×512 images by means of a sliding window, and label the building pixels as 1 and the background pixels as 0 in the label images of both the WHU and Massachusetts building datasets;
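A minimal sketch of the sliding-window cropping in step (2a) is given below; the non-overlapping stride and the NumPy-style array layout are assumptions, since the patent only specifies the 512×512 tile size.

```python
import numpy as np

def sliding_window_crop(image, tile=512, stride=512):
    """Crop a large remote sensing image (or its label mask) into tile x tile patches.
    A stride equal to the tile size (non-overlapping windows) is an assumption."""
    h, w = image.shape[:2]
    patches = []
    for y in range(0, h - tile + 1, stride):
        for x in range(0, w - tile + 1, stride):
            patches.append(image[y:y + tile, x:x + tile])
    return patches
```

If the tile size does not divide the image size evenly, padding or an overlapping stride would be needed to cover the whole image; the patent does not specify this choice.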
(2b) Apply data augmentation to the remote sensing images and label images of the WHU building dataset and of the cropped Massachusetts building dataset to enlarge the amount of data. The data augmentation includes:
Horizontal flipping: use the image processing library OpenCV to flip the remote sensing images and label images horizontally;
Vertical flipping: use OpenCV to flip the remote sensing images and label images vertically;
Horizontal-vertical flipping: use OpenCV to flip the remote sensing images and label images first horizontally and then vertically;
Shifting, scaling, random cropping, and noise addition: apply shifting, scaling, random cropping, and noise addition to the remote sensing images and label images respectively;
(2c) Divide the augmented remote sensing images and label images into a training set, a validation set, and a test set at a ratio of 8:1:1. The training set directly participates in the training of the CAD-UNet network model for feature extraction; the validation set is used to tune the hyperparameters of the CAD-UNet network model; the test set is used to test the accuracy and extraction effect of the CAD-UNet network model after training.
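The flips and geometric transforms of step (2b) can be sketched with OpenCV as below; the shift range and noise level are illustrative assumptions, and random cropping and scaling are omitted for brevity.

```python
import cv2
import numpy as np

def augment_pair(image, label):
    """Generate augmented copies of a remote sensing image and its label mask (step (2b)).
    Parameter values (shift range, noise sigma) are assumptions."""
    pairs = [(image, label)]
    pairs.append((cv2.flip(image, 1), cv2.flip(label, 1)))    # horizontal flip
    pairs.append((cv2.flip(image, 0), cv2.flip(label, 0)))    # vertical flip
    pairs.append((cv2.flip(image, -1), cv2.flip(label, -1)))  # horizontal + vertical flip

    # random shift applied identically to image and label (nearest interpolation keeps labels binary)
    h, w = image.shape[:2]
    m = np.float32([[1.0, 0, np.random.randint(-20, 21)],
                    [0, 1.0, np.random.randint(-20, 21)]])
    pairs.append((cv2.warpAffine(image, m, (w, h)),
                  cv2.warpAffine(label, m, (w, h), flags=cv2.INTER_NEAREST)))

    # additive Gaussian noise on the image only; the label mask is left unchanged
    noisy = np.clip(image.astype(np.float32) + np.random.normal(0, 10, image.shape), 0, 255)
    pairs.append((noisy.astype(image.dtype), label))
    return pairs
```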
Step (3) specifically includes the following steps:
(3a) Replace the UNet encoder: replace the UNet encoder with a VGG16 network module, which consists of the VGG16 network with its last pooling layer and fully connected layers removed. The VGG16 module performs downsampling through multiple convolutions and four max-pooling operations, extracts building features from the remote sensing image, and outputs feature maps at four different scales;
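A sketch of the replaced encoder in step (3a), using torchvision's VGG16 with the final max-pooling layer and the classifier removed; exactly which stage outputs serve as the four multi-scale feature maps is an assumption.

```python
import torch.nn as nn
from torchvision.models import vgg16

class VGG16Encoder(nn.Module):
    """VGG16 backbone without its last pooling layer and fully connected layers (step (3a)).
    Pass torchvision VGG16 weights to use ImageNet pre-training, as in step (4c)."""
    def __init__(self, weights=None):
        super().__init__()
        features = vgg16(weights=weights).features
        # Split the convolutional part into five stages; drop the final max-pool (features[30]).
        self.stage1 = features[:4]    # conv1_1..conv1_2         -> 64  channels, 512x512
        self.stage2 = features[4:9]   # pool1, conv2_1..conv2_2  -> 128 channels, 256x256
        self.stage3 = features[9:16]  # pool2, conv3_1..conv3_3  -> 256 channels, 128x128
        self.stage4 = features[16:23] # pool3, conv4_1..conv4_3  -> 512 channels, 64x64
        self.stage5 = features[23:30] # pool4, conv5_1..conv5_3  -> 512 channels, 32x32

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        f4 = self.stage4(f3)
        f5 = self.stage5(f4)
        # f1..f4 are taken here as the four multi-scale skip features; f5 is the deepest map.
        return f1, f2, f3, f4, f5
```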
(3b) Construct the coordinate attention (CA) module: embed the CA module at the skip connections of the UNet network obtained in step (3a);
The CA module captures long-range dependencies and preserves positional information along the two spatial directions separately, encoding the feature map into two feature maps that are respectively direction-aware and position-sensitive. It takes any intermediate tensor X = [x1, x2, x3, ..., xC] ∈ R^(C×H×W) as input and outputs a tensor Y = [y1, y2, y3, ..., yC] of the same size. Specifically, pooling kernels of size (H, 1) and (1, W) are applied to the input X to encode each channel along the horizontal and vertical directions. The output of the c-th channel at height h is expressed as:
z_c^h(h) = (1/W) Σ_{0≤i<W} x_c(h, i) (1)
where H is the height of the image, W is the width, C is the total number of channels, c denotes the c-th channel, x_c is the c-th channel of the image, i is the horizontal coordinate, and R denotes the space of intermediate tensors;
Similarly, the output of the c-th channel at width w is expressed as:
z_c^w(w) = (1/H) Σ_{0≤j<H} x_c(j, w) (2)
Formulas (1) and (2) are two feature-aggregation transformations that aggregate along the two spatial directions and return a pair of direction-aware attention maps. The CA module concatenates the two resulting feature maps and then transforms them with a shared 1×1 convolution, as shown in formula (3):
f = δ(F1([z^H, z^W])) (3)
where δ is a non-linear activation function and f is the intermediate feature map obtained by encoding the spatial information in the horizontal and vertical directions. f is then split along the spatial dimension into two separate tensors, f^H ∈ R^(C/r×H) and f^W ∈ R^(C/r×W), where r is the reduction ratio that controls the number of channels. Two further 1×1 convolution transformations F^H and F^W transform f^H and f^W into two tensors g^H and g^W with the same number of feature channels:
g^H = σ(F^H(f^H)) (4)
g^W = σ(F^W(f^W)) (5)
where F^H and F^W are 1×1 convolution transformations, f^H and f^W are the two tensors obtained by splitting f, g^H and g^W are the tensors obtained after the convolution transformations and the activation function, and σ is the sigmoid activation function. During the transformation, the reduction ratio r is used to reduce the number of channels of f, and the outputs g^H and g^W are then expanded and used as attention weights. The final output of the CA module is given by formula (6):
y_c(i, j) = x_c(i, j) × g_c^h(i) × g_c^w(j) (6)
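A PyTorch sketch of the coordinate attention block described by equations (1)-(6) follows; the reduction ratio, the minimum hidden width, and the choice of BatchNorm plus ReLU for the non-linearity δ are assumptions.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Coordinate attention (step (3b)): pool along H and W separately, share a 1x1 conv,
    then split into two attention maps that reweight the input. Reduction ratio r is an assumption."""
    def __init__(self, channels, r=16):
        super().__init__()
        mid = max(8, channels // r)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # (B, C, H, 1): average over width, eq. (1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # (B, C, 1, W): average over height, eq. (2)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)   # shared transform F1, eq. (3)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)                       # non-linear activation delta
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)  # F^H, eq. (4)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)  # F^W, eq. (5)

    def forward(self, x):
        b, c, h, w = x.size()
        zh = self.pool_h(x)                           # (B, C, H, 1)
        zw = self.pool_w(x).permute(0, 1, 3, 2)       # (B, C, W, 1), so the two maps can be concatenated
        f = self.act(self.bn(self.conv1(torch.cat([zh, zw], dim=2))))  # eq. (3)
        fh, fw = torch.split(f, [h, w], dim=2)
        gh = torch.sigmoid(self.conv_h(fh))                       # (B, C, H, 1), eq. (4)
        gw = torch.sigmoid(self.conv_w(fw.permute(0, 1, 3, 2)))   # (B, C, 1, W), eq. (5)
        return x * gh * gw                                        # eq. (6)
```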
(3c) Construct the data-dependent upsampling (DUp) module: combine convolutional layers with data-dependent upsampling to build the DUp module, which is used to extract the boundary information of buildings at high resolution. Each of the four input feature maps of different scales first passes through a 3×3 convolutional layer to reduce its number of channels; data-dependent upsampling then restores the feature map directly to a size of 512×512; the four upsampled feature maps are fused by point-wise addition and output from the DUp module;
The CAD-UNet network model is obtained from steps (3a)-(3c).
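A sketch of the DUp branch in step (3c); the channel widths, the scale factors, and the use of a 1×1 projection followed by a pixel rearrangement to realize data-dependent upsampling are assumptions based on the common DUpsampling formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DUpsampling(nn.Module):
    """Data-dependent upsampling: a learned 1x1 projection followed by pixel shuffle,
    restoring a feature map to full resolution in one step instead of bilinear interpolation."""
    def __init__(self, in_channels, out_channels, scale):
        super().__init__()
        self.scale = scale
        self.proj = nn.Conv2d(in_channels, out_channels * scale * scale, kernel_size=1)

    def forward(self, x):
        return F.pixel_shuffle(self.proj(x), self.scale)

class DUpModule(nn.Module):
    """DUp module (step (3c)): 3x3 conv to reduce channels, data-dependent upsampling of each of the
    four multi-scale maps to 512x512, then point-wise addition. Channel widths are assumptions."""
    def __init__(self, in_channels=(64, 128, 256, 512), mid_channels=64, out_channels=64,
                 scales=(1, 2, 4, 8)):
        super().__init__()
        self.reduce = nn.ModuleList([nn.Conv2d(c, mid_channels, kernel_size=3, padding=1)
                                     for c in in_channels])
        self.dup = nn.ModuleList([DUpsampling(mid_channels, out_channels, s) for s in scales])

    def forward(self, feats):
        # feats: four feature maps ordered shallow to deep, e.g. 512x512, 256x256, 128x128, 64x64
        outs = [dup(conv(f)) for f, conv, dup in zip(feats, self.reduce, self.dup)]
        return torch.stack(outs, dim=0).sum(dim=0)  # point-wise additive fusion
```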
Step (4) specifically includes the following steps:
(4a) Construct the joint loss function: construct a joint loss function combining the BCE binary cross-entropy loss and the Focal loss. The BCE loss and the Focal loss are given by:
BL(pt, target) = -ω·(target·ln(pt) + (1 - target)·ln(1 - pt)) (7)
where pt is the prediction of the CAD-UNet network model, target is the label value, and ω is a weight value;
FL(pt) = -α(1 - pt)^γ·log(pt) (8)
where pt is the prediction of the CAD-UNet network model, α is a balancing parameter used to balance the ratio of positive to negative samples, with value range (0, 1]; γ is a focusing parameter used to down-weight the loss of easily classified samples, with value range [0, +∞);
The joint loss function is given by formula (9):
Loss = BL + FL (9)
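A sketch of the joint loss of equations (7)-(9); the definition of pt for background pixels and the mean reduction follow the standard focal-loss convention and are assumptions beyond what the patent states.

```python
import torch
import torch.nn as nn

class BCEFocalLoss(nn.Module):
    """Joint loss of equations (7)-(9): binary cross-entropy plus focal loss.
    Default omega/alpha/gamma follow the values given in step (4b)."""
    def __init__(self, omega=1.0, alpha=0.5, gamma=2.0, eps=1e-7):
        super().__init__()
        self.omega, self.alpha, self.gamma, self.eps = omega, alpha, gamma, eps

    def forward(self, pred, target):
        # pred: sigmoid probabilities in (0, 1); target: 0/1 building mask of the same shape
        pred = pred.clamp(self.eps, 1.0 - self.eps)
        bce = -self.omega * (target * torch.log(pred) + (1 - target) * torch.log(1 - pred))  # eq. (7)
        pt = torch.where(target == 1, pred, 1 - pred)
        focal = -self.alpha * (1 - pt) ** self.gamma * torch.log(pt)                          # eq. (8)
        return (bce + focal).mean()                                                           # eq. (9)
```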
(4b) Parameter settings: set ω = 1, α = 0.5, γ = 2;
(4c) Training strategy: use the pre-trained weights of the VGG16 network and adopt a freeze-training scheme, in which the parameters of the backbone are frozen for the first 100 epochs and trained normally for the last 100 epochs, for a total of 200 epochs per experiment;
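The freeze-then-unfreeze schedule of step (4c) can be sketched as follows; the attribute name `model.encoder` and the training-loop helper are placeholders, not names taken from the patent.

```python
def set_backbone_frozen(model, frozen=True):
    """Freeze or release the VGG16 backbone parameters (step (4c)).
    `model.encoder` is an assumed attribute name for the backbone."""
    for p in model.encoder.parameters():
        p.requires_grad = not frozen

# usage sketch: 200 epochs in total, backbone frozen for the first 100
# for epoch in range(200):
#     set_backbone_frozen(model, frozen=(epoch < 100))
#     train_one_epoch(model, train_loader, optimizer, criterion)  # placeholder helper
```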
(4d) Model accuracy evaluation: use the evaluation metrics Precision and intersection-over-union (IoU) to evaluate accuracy. The metrics are computed as shown in formulas (10) and (11):
Precision = TP / (TP + FP) (10)
IoU = TP / (TP + FP + FN) (11)
where TP denotes pixels whose true value is positive and that the model judges as positive; FP denotes pixels whose true value is negative but that the model judges as positive; FN denotes pixels whose true value is positive but that the model judges as negative.
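A pixel-level implementation of equations (10) and (11); the small epsilon guarding against empty masks is an added assumption.

```python
import numpy as np

def precision_and_iou(pred_mask, gt_mask):
    """Pixel-level Precision and IoU of equations (10)-(11); inputs are 0/1 mask arrays."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    tp = np.logical_and(pred, gt).sum()    # predicted building, truly building
    fp = np.logical_and(pred, ~gt).sum()   # predicted building, actually background
    fn = np.logical_and(~pred, gt).sum()   # predicted background, actually building
    precision = tp / (tp + fp + 1e-7)      # eq. (10)
    iou = tp / (tp + fp + fn + 1e-7)       # eq. (11)
    return precision, iou
```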
Step (5) specifically includes the following steps:
(5a) After data preprocessing, resize the new remote sensing image to be processed to 512×512;
(5b) Feed the resized image into the trained CAD-UNet network model, which outputs a predicted image as the building extraction result. Pixels predicted as building are given the value 255 and pixels predicted as background are given the value 0, so in the predicted image the white regions are buildings and the black regions are background.
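A sketch of the prediction step (5); the normalization, the single-channel logit output, and the 0.5 threshold are assumptions about details the patent leaves open.

```python
import cv2
import numpy as np
import torch

def extract_buildings(model, image_path, device="cuda"):
    """Inference sketch for step (5): resize the image to 512x512, run the trained CAD-UNet,
    and write buildings as 255 and background as 0."""
    image = cv2.imread(image_path)
    image = cv2.resize(image, (512, 512))
    x = torch.from_numpy(image.astype(np.float32) / 255.0).permute(2, 0, 1).unsqueeze(0).to(device)
    model.eval()
    with torch.no_grad():
        prob = torch.sigmoid(model(x))[0, 0].cpu().numpy()  # assumes a single-channel logit map
    mask = (prob > 0.5).astype(np.uint8) * 255               # white = building, black = background
    return mask
```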
From the above technical solution, the beneficial effects of the present invention are as follows. First, the building extraction accuracy is high: compared with other methods, the network designed by the present invention progressively extracts the deep features of buildings and, after feature fusion, gradually upsamples back to the input resolution, which is better suited to the building extraction task and significantly improves extraction accuracy. Second, the building boundaries are extracted better: the coordinate attention (CA) module and the data-dependent upsampling (DUp) module added and constructed by the present invention effectively capture the location and boundary information of buildings, so the extracted buildings have smoother boundaries and more complete outlines. Third, the network has few parameters and is easy to train: the coordinate attention used in the present invention is a plug-and-play lightweight attention mechanism, and the CAD-UNet network model has fewer channels than the original UNet model, which reduces network complexity; because the method of the present invention has fewer parameters, it is easy to train.
Brief Description of the Drawings
Fig. 1 is a flowchart of the method of the present invention;
Fig. 2 is a structural diagram of the CAD-UNet network model of the present invention;
Fig. 3 is a structural diagram of the coordinate attention (CA) module of the present invention;
Fig. 4 is a structural diagram of the data-dependent upsampling (DUp) module of the present invention;
Fig. 5 shows examples of the training data of the present invention;
Fig. 6 shows prediction results of the present invention.
Detailed Description of the Embodiments
As shown in Fig. 1, a remote sensing image building extraction method based on coordinate attention and data-dependent upsampling comprises the following steps in order:
(1) Acquire remote sensing data: download the WHU building dataset and the Massachusetts building dataset; the WHU building dataset is the Wuhan University building dataset, and the Massachusetts building dataset is the Massachusetts buildings dataset;
(2) Data preprocessing and data augmentation: the preprocessing refers to cropping the large images in the datasets and applying data augmentation to the cropped remote sensing images and label images; the augmented remote sensing images and label images are divided into a training set, a validation set, and a test set at a ratio of 8:1:1;
(3) Build the CAD-UNet network model: improve on the UNet network to build a building extraction network model comprising an encoder, a coordinate attention (CA) module, and a data-dependent upsampling (DUp) module, namely the CAD-UNet network model;
(4) Model training and evaluation: train the CAD-UNet network model on the training set with a joint loss function combining the BCE binary cross-entropy loss and the Focal loss; after training, evaluate the building extraction accuracy and effect of the CAD-UNet network model on the test set;
(5) Automatic building extraction: after data preprocessing, feed a new remote sensing image to be processed into the trained CAD-UNet network model; the model outputs a predicted image, which is the building extraction result.
Step (2) specifically includes the following steps:
(2a) Crop the large remote sensing images and label images in the Massachusetts building dataset into 512×512 images by means of a sliding window, and label the building pixels as 1 and the background pixels as 0 in the label images of both the WHU and Massachusetts building datasets;
(2b) Apply data augmentation to the remote sensing images and label images of the WHU building dataset and of the cropped Massachusetts building dataset to enlarge the amount of data. The data augmentation includes:
Horizontal flipping: use the image processing library OpenCV to flip the remote sensing images and label images horizontally;
Vertical flipping: use OpenCV to flip the remote sensing images and label images vertically;
Horizontal-vertical flipping: use OpenCV to flip the remote sensing images and label images first horizontally and then vertically;
Shifting, scaling, random cropping, and noise addition: apply shifting, scaling, random cropping, and noise addition to the remote sensing images and label images respectively;
(2c) Divide the augmented remote sensing images and label images into a training set, a validation set, and a test set at a ratio of 8:1:1. The training set directly participates in the training of the CAD-UNet network model for feature extraction; the validation set is used to tune the hyperparameters of the CAD-UNet network model; the test set is used to test the accuracy and extraction effect of the CAD-UNet network model after training.
Step (3) specifically includes the following steps:
(3a) Replace the UNet encoder: replace the UNet encoder with a VGG16 network module, which consists of the VGG16 network with its last pooling layer and fully connected layers removed. The VGG16 module performs downsampling through multiple convolutions and four max-pooling operations, extracts building features from the remote sensing image, and outputs feature maps at four different scales;
(3b) Construct the coordinate attention (CA) module: embed the CA module at the skip connections of the UNet network obtained in step (3a);
The CA module captures long-range dependencies and preserves positional information along the two spatial directions separately, encoding the feature map into two feature maps that are respectively direction-aware and position-sensitive. It takes any intermediate tensor X = [x1, x2, x3, ..., xC] ∈ R^(C×H×W) as input and outputs a tensor Y = [y1, y2, y3, ..., yC] of the same size. Specifically, pooling kernels of size (H, 1) and (1, W) are applied to the input X to encode each channel along the horizontal and vertical directions. The output of the c-th channel at height h is expressed as:
z_c^h(h) = (1/W) Σ_{0≤i<W} x_c(h, i) (1)
where H is the height of the image, W is the width, C is the total number of channels, c denotes the c-th channel, x_c is the c-th channel of the image, i is the horizontal coordinate, and R denotes the space of intermediate tensors;
Similarly, the output of the c-th channel at width w is expressed as:
z_c^w(w) = (1/H) Σ_{0≤j<H} x_c(j, w) (2)
Formulas (1) and (2) are two feature-aggregation transformations that aggregate along the two spatial directions and return a pair of direction-aware attention maps. The CA module concatenates the two resulting feature maps and then transforms them with a shared 1×1 convolution, as shown in formula (3):
f = δ(F1([z^H, z^W])) (3)
where δ is a non-linear activation function and f is the intermediate feature map obtained by encoding the spatial information in the horizontal and vertical directions. f is then split along the spatial dimension into two separate tensors, f^H ∈ R^(C/r×H) and f^W ∈ R^(C/r×W), where r is the reduction ratio that controls the number of channels. Two further 1×1 convolution transformations F^H and F^W transform f^H and f^W into two tensors g^H and g^W with the same number of feature channels:
g^H = σ(F^H(f^H)) (4)
g^W = σ(F^W(f^W)) (5)
where F^H and F^W are 1×1 convolution transformations, f^H and f^W are the two tensors obtained by splitting f, g^H and g^W are the tensors obtained after the convolution transformations and the activation function, and σ is the sigmoid activation function. During the transformation, the reduction ratio r is used to reduce the number of channels of f, and the outputs g^H and g^W are then expanded and used as attention weights. The final output of the CA module is given by formula (6):
y_c(i, j) = x_c(i, j) × g_c^h(i) × g_c^w(j) (6)
(3c) Construct the data-dependent upsampling (DUp) module: combine convolutional layers with data-dependent upsampling to build the DUp module, which is used to extract the boundary information of buildings at high resolution. Each of the four input feature maps of different scales first passes through a 3×3 convolutional layer to reduce its number of channels; data-dependent upsampling then restores the feature map directly to a size of 512×512; the four upsampled feature maps are fused by point-wise addition and output from the DUp module;
The CAD-UNet network model is obtained from steps (3a)-(3c).
Step (4) specifically includes the following steps:
(4a) Construct the joint loss function: construct a joint loss function combining the BCE binary cross-entropy loss and the Focal loss. The BCE loss and the Focal loss are given by:
BL(pt, target) = -ω·(target·ln(pt) + (1 - target)·ln(1 - pt)) (7)
where pt is the prediction of the CAD-UNet network model, target is the label value, and ω is a weight value;
FL(pt) = -α(1 - pt)^γ·log(pt) (8)
where pt is the prediction of the CAD-UNet network model, α is a balancing parameter used to balance the ratio of positive to negative samples, with value range (0, 1]; γ is a focusing parameter used to down-weight the loss of easily classified samples, with value range [0, +∞);
The joint loss function is given by formula (9):
Loss = BL + FL (9)
(4b) Parameter settings: set ω = 1, α = 0.5, γ = 2;
(4c) Training strategy: use the pre-trained weights of the VGG16 network and adopt a freeze-training scheme, in which the parameters of the backbone are frozen for the first 100 epochs and trained normally for the last 100 epochs, for a total of 200 epochs per experiment;
(4d) Model accuracy evaluation: use the evaluation metrics Precision and intersection-over-union (IoU) to evaluate accuracy. The metrics are computed as shown in formulas (10) and (11):
Precision = TP / (TP + FP) (10)
IoU = TP / (TP + FP + FN) (11)
where TP denotes pixels whose true value is positive and that the model judges as positive; FP denotes pixels whose true value is negative but that the model judges as positive; FN denotes pixels whose true value is positive but that the model judges as negative.
Step (5) specifically includes the following steps:
(5a) After data preprocessing, resize the new remote sensing image to be processed to 512×512;
(5b) Feed the resized image into the trained CAD-UNet network model, which outputs a predicted image as the building extraction result. Pixels predicted as building are given the value 255 and pixels predicted as background are given the value 0, so in the predicted image the white regions are buildings and the black regions are background.
To verify the effectiveness of the present invention, UNet was selected as a comparative example, and the results were compared on the standard building datasets in terms of precision and intersection-over-union.
Table 1: Comparison of the results of the embodiment and the comparative example on the datasets
As shown in Fig. 2, the CAD-UNet network model of the present invention also adopts an encoder-decoder structure. On the left is the encoder, used for downsampling and feature extraction; the solid arrows in the middle represent the coordinate attention (CA) modules, which attend to the location information of buildings; on the right is the decoder, responsible for feature fusion and upsampling; the dashed arrows in the lower right corner represent the data-dependent upsampling (DUp) module, which extracts the boundary information of buildings; finally, after a 1×1 convolution adjusts the number of channels, the building extraction result is output.
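For orientation, the structure of Fig. 2 can be summarized as the following forward pass. It reuses the encoder, CA, and DUp sketches given earlier; the decoder channel widths, the transposed-convolution upsampling, and the way the DUp branch is fused with the decoder output are all assumptions rather than details stated in the patent.

```python
import torch
import torch.nn as nn

class CADUNet(nn.Module):
    """Assembly sketch of Fig. 2: VGG16 encoder, coordinate attention on the skip connections,
    a UNet-style decoder, the DUp branch, and a final 1x1 convolution. Channel widths are assumptions."""
    def __init__(self):
        super().__init__()
        self.encoder = VGG16Encoder()
        self.ca = nn.ModuleList([CoordinateAttention(c) for c in (64, 128, 256, 512)])
        self.up = nn.ModuleList([nn.ConvTranspose2d(ci, co, kernel_size=2, stride=2)
                                 for ci, co in ((512, 512), (512, 256), (256, 128), (128, 64))])
        self.fuse = nn.ModuleList([nn.Conv2d(co + cs, co, kernel_size=3, padding=1)
                                   for co, cs in ((512, 512), (256, 256), (128, 128), (64, 64))])
        self.dup = DUpModule(in_channels=(64, 128, 256, 512))
        self.head = nn.Conv2d(64, 1, kernel_size=1)  # final 1x1 conv adjusting the channel number

    def forward(self, x):
        f1, f2, f3, f4, f5 = self.encoder(x)
        skips = [ca(f) for ca, f in zip(self.ca, (f1, f2, f3, f4))]  # CA on the skip connections
        d = f5
        decoder_feats = []
        for up, fuse, skip in zip(self.up, self.fuse, reversed(skips)):
            d = fuse(torch.cat([up(d), skip], dim=1))
            decoder_feats.append(d)
        # DUp branch: the four decoder maps (shallow to deep) are upsampled to 512x512 and summed
        boundary = self.dup(list(reversed(decoder_feats)))
        return self.head(d + boundary)
```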
As shown in Fig. 3, for an input feature tensor, the coordinate attention (CA) module of the present invention first encodes the channels along the horizontal and vertical directions separately; the features from the two directions are then aggregated to obtain an intermediate tensor; finally, the intermediate tensor is split along the spatial dimension and passed through a convolutional layer and a sigmoid function to obtain the final output.
As shown in Fig. 4, the data-dependent upsampling (DUp) module of the present invention first passes each of the four input feature maps of different scales through a 3×3 convolutional layer to reduce the number of channels; data-dependent upsampling then restores the four feature maps directly to a size of 512×512; finally, the four feature maps are fused by point-wise addition and output from the DUp module.
Fig. 5 shows examples from the WHU building dataset and the Massachusetts building dataset; the left column shows the remote sensing images and the right column the corresponding ground-truth labels.
Fig. 6 shows the prediction results of the CAD-UNet network model of the present invention. The first column shows the remote sensing images, the second column the corresponding ground-truth labels, the third column the predictions of the CAD-UNet network model of the present invention, and the fourth column the predictions of UNet. As can be seen from the figure, the predictions of the method of the present invention have smoother and clearer boundaries and are superior to those of UNet.
In summary, the network designed by the present invention progressively extracts the deep features of buildings and, after feature fusion, gradually upsamples back to the input resolution, which is better suited to the building extraction task and significantly improves extraction accuracy. The coordinate attention (CA) module and the data-dependent upsampling (DUp) module added and constructed by the present invention effectively capture the location and boundary information of buildings, so the extracted buildings have smoother boundaries and more complete outlines. The coordinate attention used in the present invention is a plug-and-play lightweight attention mechanism, and the CAD-UNet network model has fewer channels than the original UNet model, which reduces network complexity; because the method of the present invention has fewer parameters, it is easy to train.
Claims (5)
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211270279.6A | 2022-10-18 | 2022-10-18 | Remote sensing image building extraction method based on coordinate attention and data correlation upsampling |

Applications Claiming Priority (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211270279.6A | 2022-10-18 | 2022-10-18 | Remote sensing image building extraction method based on coordinate attention and data correlation upsampling |

Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN115631412A | 2023-01-20 |
Family

ID=84906561

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202211270279.6A | Remote sensing image building extraction method based on coordinate attention and data correlation upsampling | 2022-10-18 | 2022-10-18 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN115631412A (en) |
2022-10-18: Application CN202211270279.6A filed in China; published as CN115631412A (status: Pending)
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116503464A (en) * | 2023-06-25 | 2023-07-28 | 武汉理工大学三亚科教创新园 | Farmland building height prediction method based on remote sensing image |
CN116503464B (en) * | 2023-06-25 | 2023-10-03 | 武汉理工大学三亚科教创新园 | Height prediction method of farmland buildings based on remote sensing images |
CN119067991A (en) * | 2024-08-17 | 2024-12-03 | 江西师范大学 | Retinal image segmentation method based on lightweight two-way cascade network |
CN119439152A (en) * | 2024-10-29 | 2025-02-14 | 广东省水利水电科学研究院 | A three-dimensional imaging method and system for termite nests on dams based on ground penetrating radar |
CN119441940A (en) * | 2024-10-30 | 2025-02-14 | 内蒙古医科大学 | A classification method for threatened abortion based on loss function optimization of blood routine tests |
CN119784678A (en) * | 2024-11-26 | 2025-04-08 | 云南大学附属医院 | Anti-vascular endothelial growth factor efficacy evaluation and research methods based on deep learning |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |