Artificial Intelligence Algorithm for Remote Sensing Imagery Processing

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "AI Remote Sensing".

Deadline for manuscript submissions: closed (30 November 2021) | Viewed by 99978

Special Issue Editors


Guest Editor
Department of Computer Technology and Communications, Polytechnic School of Cáceres, University of Extremadura, 10003 Cáceres, Spain
Interests: hyperspectral remote sensing; deep learning; Graphics Processing Units (GPUs); High Performance Computing (HPC) techniques

Guest Editor
Department of Computer Technology and Communications, Polytechnic School of Cáceres, University of Extremadura, 10003 Cáceres, Spain
Interests: hyperspectral image analysis; machine (deep) learning; neural networks; multisensor data fusion; high performance computing; cloud computing

Special Issue Information

Dear Colleagues,

During the last decades, significant efforts have been made in the remote sensing field to obtain rich and accurate information about the Earth's surface. The impressive advances in computer technology, in terms of hardware devices and software design, have enabled the launch of multiple Earth observation missions, which are currently collecting huge amounts of data daily. These raw data capture the matter-energy interactions at the Earth's surface and are characterized by their great variety in terms of typology (e.g., lidar and radar data or optical and thermal imaging), acquisition platforms (e.g., unmanned aerial vehicles or UAVs, traditional aerial platforms and satellites) and spatial, spectral and temporal resolutions (from high to low spatial resolution, from single-band panchromatic images to hyperspectral images with hundreds of spectral channels, and revisit times for the same observation area from hours to days). The opportunity to use remote sensing data to support economic and social activities is highly attractive, as these data collect rich information over large spatial areas, enabling the detailed characterization of natural features, different materials and physical objects on the ground. Indeed, the current literature on the use and exploitation of these data has proved that they are truly useful in different decision-making tasks, such as precision agriculture, natural resource management, urban planning, risk prevention, disaster management, national defense, and homeland security, among many other application areas.

However, the raw data obtained by remote sensors must be properly processed in order to exploit the information contained in them, refining the data to the end-user level. This processing has to deal with several challenges and limitations, such as high data complexity, noisy data due to sensor limitations or uncontrolled atmospheric changes, low spatial resolutions, spectral mixtures, redundancies and correlations between spectral bands, a lack of labelled samples, cloud occlusions, high data dimensionality, and high intra-class variability coupled with inter-class similarity, among others. To face these issues, the implementation of new, more powerful processing tools is mandatory; such tools must be able to extract the relevant information contained in remote sensing data in a reliable and efficient way.

In this regard, artificial intelligence (AI) techniques have achieved significant success in multiple fields related to data processing, such as speech recognition and computer vision. These methods provide interesting procedures to automatically process large amounts of data and make data-driven decisions in an accurate way. Moreover, the increasing capabilities of computer systems have promoted a great evolution of these algorithms, from traditional pattern recognition methods to complex machine learning and task-driven deep learning models, which are achieving unprecedented results. In particular, many AI-based algorithms are achieving dramatic improvements in many remote sensing analysis tasks, such as unmixing, data classification, object/target or anomaly/change detection, data super-resolution, data fusion, cloud removal, denoising, and spectral reduction, among others. However, the implementation of these algorithms for remote sensing data processing must address the characteristics, challenges and limitations imposed by this kind of data, and therefore remains a challenging task.

This Special Issue invites manuscripts that present new AI approaches or improved AI-based algorithms for processing the information contained in remote sensing data. As this is a broad area, there are no constraints regarding the field of application. In this sense, this Special Issue aims to present the current state of AI methods for the analysis of remote sensing data in several fields of application.

Dr. Mercedes E. Paoletti
Dr. Juan M. Haut
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • remote sensing
  • data analysis
  • machine learning
  • deep learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (17 papers)


Research


22 pages, 3919 KiB  
Article
RS-DARTS: A Convolutional Neural Architecture Search for Remote Sensing Image Scene Classification
by Zhen Zhang, Shanghao Liu, Yang Zhang and Wenbo Chen
Remote Sens. 2022, 14(1), 141; https://doi.org/10.3390/rs14010141 - 29 Dec 2021
Cited by 29 | Viewed by 4645
Abstract
Due to the superiority of convolutional neural networks, many deep learning methods have been used in image classification. The enormous difference between natural images and remote sensing images makes it difficult to directly utilize or modify existing CNN models for remote sensing scene classification tasks. In this article, a new paradigm is proposed that can automatically design a suitable CNN architecture for scene classification. A more efficient search framework, RS-DARTS, is adopted to find the optimal network architecture. This framework has two phases. In the search phase, some new strategies are presented, making the calculation process smoother, and better distinguishing the optimal and other operations. In addition, we added noise to suppress skip connections in order to close the gap between trained and validation processing and ensure classification accuracy. Moreover, a small part of the neural network is sampled to reduce the redundancy in exploring the network space and speed up the search processing. In the evaluation phase, the optimal cell architecture is stacked to construct the final network. Extensive experiments demonstrated the validity of the search strategy and the impressive classification performance of RS-DARTS on four public benchmark datasets. The proposed method showed more effectiveness than the manually designed CNN model and other methods of neural architecture search. Especially, in terms of search cost, RS-DARTS consumed less time than other NAS methods. Full article
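
As a rough illustration of the skip-connection suppression described in this abstract, the Python sketch below injects Gaussian noise into the identity branch of a DARTS-style mixed operation during the search phase; the operation set, the noise scale and the class name are illustrative assumptions rather than the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyMixedOp(nn.Module):
    """DARTS-style mixed operation; Gaussian noise is injected into the
    skip-connection branch during search (illustrative sketch, not RS-DARTS)."""
    def __init__(self, channels, noise_std=0.1):
        super().__init__()
        self.noise_std = noise_std
        self.ops = nn.ModuleDict({
            "skip": nn.Identity(),
            "conv3": nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            "conv5": nn.Conv2d(channels, channels, 5, padding=2, bias=False),
            "pool": nn.AvgPool2d(3, stride=1, padding=1),
        })
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # architecture weights

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        out = 0
        for w, (name, op) in zip(weights, self.ops.items()):
            y = op(x)
            if name == "skip" and self.training:
                y = y + self.noise_std * torch.randn_like(y)  # make skip a noisier shortcut
            out = out + w * y
        return out

print(NoisyMixedOp(16)(torch.randn(2, 16, 8, 8)).shape)  # -> torch.Size([2, 16, 8, 8])

Perturbing only the skip branch makes it a less attractive shortcut during the bi-level search, reducing the tendency of the architecture weights to collapse onto skip connections.
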
Figures: (1) illustration of the proposed RS-DARTS approach; (2) images randomly sampled from the four benchmark datasets; (3) time cost in the search phase; (4) cell architectures searched by DARTS, PC-DARTS and RS-DARTS on the Large-RS dataset; (5) cell architectures searched by GPAS; (6) cell architectures searched by Auto-RSISC.
22 pages, 7275 KiB  
Article
Semi-Autonomous Learning Algorithm for Remote Image Object Detection Based on Aggregation Area Instance Refinement
by Bei Cheng, Zhengzhou Li, Hui Li, Zhiquan Ding and Tianqi Qin
Remote Sens. 2021, 13(24), 5065; https://doi.org/10.3390/rs13245065 - 14 Dec 2021
Viewed by 2090
Abstract
Semi-autonomous learning for object detection has attracted more and more attention in recent years, which usually tends to find only one object instance with the highest score in each image. However, this strategy usually highlights the most representative part of the object instead of the whole object, which may lead to the loss of a lot of important information. To solve this problem, a novel end-to-end aggregate-guided semi-autonomous learning residual network is proposed to perform object detection. Firstly, a progressive modified residual network (MRN) is applied to the backbone network to make the detector more sensitive to the boundary features of the object. Then, an aggregate-based region-merging strategy (ARMS) is designed to select high-quality instances by selecting aggregation areas and merging these regions. The ARMS selects the aggregation areas that are highly related to the object through association coefficient, and then evaluates the aggregation areas through a similarity coefficient and fuses them to obtain high-quality object instance areas. Finally, a regression-locating branch is further developed to refine the location of the object, which can be optimized jointly with regional classification. Extensive experiments demonstrate that the proposed method is superior to state-of-the-art methods. Full article
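
The association and similarity coefficients used by ARMS are not spelled out in this abstract; as a loose, hypothetical stand-in, the sketch below greedily merges candidate boxes whenever a plain IoU similarity exceeds a threshold, which conveys only the general region-merging idea.

def iou(a, b):
    # IoU of two boxes (x1, y1, x2, y2); a stand-in for the paper's similarity coefficient
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def merge_regions(boxes, thr=0.3):
    # Greedily fuse any pair of regions whose similarity exceeds thr into their union box
    boxes = [list(b) for b in boxes]
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if iou(boxes[i], boxes[j]) > thr:
                    a, b = boxes[i], boxes.pop(j)
                    boxes[i] = [min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3])]
                    merged = True
                    break
            if merged:
                break
    return boxes

print(merge_regions([(10, 10, 50, 50), (20, 20, 55, 55), (200, 200, 230, 230)]))
# -> [[10, 10, 55, 55], [200, 200, 230, 230]]
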
Figures: (1) overall architecture of the proposed framework; (2) data flow for the plain net and residual net; (3) feature maps of the VGG backbone and the MRN; (4) aggregation region merging in ARMS; (5-6) precision-recall curves of several semi-autonomous learning methods on the augmented NWPU VHR-10 and LEVIR datasets; (7-8) detection results on the augmented NWPU VHR-10 and LEVIR; (9) comparison of different backbone networks; (10) ablation experiment for ARMS; (11) contribution of the regression branch.
21 pages, 24683 KiB  
Article
Application of Supervised Machine Learning Technique on LiDAR Data for Monitoring Coastal Land Evolution
by Maurizio Barbarella, Alessandro Di Benedetto and Margherita Fiani
Remote Sens. 2021, 13(23), 4782; https://doi.org/10.3390/rs13234782 - 25 Nov 2021
Cited by 8 | Viewed by 2958
Abstract
Machine Learning (ML) techniques are now being used very successfully in predicting and supporting decisions in multiple areas such as environmental issues and land management. These techniques have also provided promising results in the field of natural hazard assessment and risk mapping. The aim of this work is to apply the Supervised ML technique to train a model able to classify a particular gravity-driven coastal hillslope geomorphic model (slope-over-wall) involving most of the soft rocks of Cilento (southern Italy). To train the model, only geometric data have been used, namely morphometric feature maps computed on a Digital Terrain Model (DTM) derived from Light Detection and Ranging (LiDAR) data. Morphometric maps were computed using third-order polynomials, so as to obtain products that best describe landforms. Not all morphometric parameters from literature were used to train the model, the most significant ones were chosen by applying the Neighborhood Component Analysis (NCA) method. Different models were trained and the main indicators derived from the confusion matrices were compared. The best results were obtained using the Weighted k-NN model (accuracy score = 75%). Analysis of the Receiver Operating Characteristic (ROC) curves also shows that the discriminating capacity of the test reached percentages higher than 95%. The model, resulting more accurate in the training area, will be extended to similar areas along the Tyrrhenian coastal land. Full article
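
A minimal scikit-learn sketch of the classification pipeline outlined above, namely NCA followed by a distance-weighted k-NN classifier; the random feature matrix, the number of neighbours and the split are placeholders, and scikit-learn's NCA learns a linear transformation rather than the per-feature weights reported in the paper.

import numpy as np
from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

# X: morphometric features per DTM cell (placeholder values), y: classes I/II/III
rng = np.random.default_rng(0)
X, y = rng.random((500, 9)), rng.integers(0, 3, 500)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = Pipeline([
    ("nca", NeighborhoodComponentsAnalysis(random_state=0)),            # supervised feature transform
    ("knn", KNeighborsClassifier(n_neighbors=10, weights="distance")),  # weighted k-NN
])
model.fit(X_tr, y_tr)
y_pred = model.predict(X_te)
print(accuracy_score(y_te, y_pred))
print(confusion_matrix(y_te, y_pred))
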
Figures: (1) study area; (2) testing sites; (3) workflow; (4) DTM-derived morphometric feature maps of the training area; (5) feature selection using NCA; (6) accuracy scores with 9 and 8 features; (7-8) model accuracy indicators; (9) confusion matrix, TPR/FNR and PPV/FDR; (10) classification overlaid on the test-area DTM; (11) ROC curves per class; (12) "Ripe Rosse" test area; (13) Marina di Ascea test area.
14 pages, 5410 KiB  
Article
MSST-Net: A Multi-Scale Adaptive Network for Building Extraction from Remote Sensing Images Based on Swin Transformer
by Wei Yuan and Wenbo Xu
Remote Sens. 2021, 13(23), 4743; https://doi.org/10.3390/rs13234743 - 23 Nov 2021
Cited by 47 | Viewed by 4414
Abstract
The segmentation of remote sensing images by deep learning technology is the main method for remote sensing image interpretation. However, the segmentation model based on a convolutional neural network cannot capture the global features very well. A transformer, whose self-attention mechanism can supply each pixel with a global feature, makes up for the deficiency of the convolutional neural network. Therefore, a multi-scale adaptive segmentation network model (MSST-Net) based on a Swin Transformer is proposed in this paper. Firstly, a Swin Transformer is used as the backbone to encode the input image. Then, the feature maps of different levels are decoded separately. Thirdly, the convolution is used for fusion, so that the network can automatically learn the weight of the decoding results of each level. Finally, we adjust the channels to obtain the final prediction map by using the convolution with a kernel of 1 × 1. By comparing this with other segmentation network models on a WHU building data set, the evaluation metrics, mIoU, F1-score and accuracy are all improved. The network model proposed in this paper is a multi-scale adaptive network model that pays more attention to the global features for remote sensing segmentation. Full article
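
A compact sketch of the decode-separately-then-fuse idea described in this abstract: each encoder level gets its own small decoder, the upsampled results are fused by a convolution that learns how to weight the levels, and a 1 x 1 convolution adjusts the channels to the number of classes. The channel sizes and the dummy inputs below are assumptions standing in for Swin Transformer stage outputs.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFusionHead(nn.Module):
    # Decode each encoder level separately, then let a fusion conv learn the level weights
    def __init__(self, in_channels=(96, 192, 384, 768), mid=64, num_classes=2):
        super().__init__()
        self.decoders = nn.ModuleList(
            nn.Sequential(nn.Conv2d(c, mid, 3, padding=1), nn.ReLU(inplace=True))
            for c in in_channels
        )
        self.fuse = nn.Conv2d(mid * len(in_channels), mid, 3, padding=1)
        self.classifier = nn.Conv2d(mid, num_classes, kernel_size=1)  # 1x1 channel adjustment

    def forward(self, feats, out_size):
        decoded = [F.interpolate(dec(f), size=out_size, mode="bilinear", align_corners=False)
                   for dec, f in zip(self.decoders, feats)]
        return self.classifier(self.fuse(torch.cat(decoded, dim=1)))

# dummy Swin-like stage outputs for a 256 x 256 image (strides 4, 8, 16, 32)
feats = [torch.randn(1, c, 256 // s, 256 // s)
         for c, s in zip((96, 192, 384, 768), (4, 8, 16, 32))]
print(MultiScaleFusionHead()(feats, (256, 256)).shape)  # -> torch.Size([1, 2, 256, 256])
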
Figures: (1) WHU building dataset map; (2) differently sized houses in the remote sensing image; (3) MSST-Net architecture; (4) regular and shifted windows of the Swin Transformer block; (5) prediction results compared with CNN-based networks; (6) prediction results compared with a transformer-based network.
19 pages, 3017 KiB  
Article
Adaptable Convolutional Network for Hyperspectral Image Classification
by Mercedes E. Paoletti and Juan M. Haut
Remote Sens. 2021, 13(18), 3637; https://doi.org/10.3390/rs13183637 - 11 Sep 2021
Cited by 6 | Viewed by 2850
Abstract
Nowadays, a large number of remote sensing instruments are providing a massive amount of data within the frame of different Earth Observation missions. These instruments are characterized by the wide variety of data they can collect, as well as the impressive volume of data and the speed at which it is acquired. In this sense, hyperspectral imaging data has certain properties that make it difficult to process, such as its large spectral dimension coupled with problematic data variability. To overcome these challenges, convolutional neural networks have been proposed as classification models because of their ability to extract relevant spectral–spatial features and learn hidden patterns, along with their great architectural flexibility. Their high performance relies on the convolution kernels to exploit the spatial relationships. Thus, filter design is crucial for the correct performance of models. Nevertheless, hyperspectral data may contain objects with different shapes and orientations, preventing filters from “seeing everything possible” during the decision making. To overcome this limitation, this paper proposes a novel adaptable convolution model based on deforming kernels combined with deforming convolution layers to fit their effective receptive field to the input data. The proposed adaptable convolutional network (named DKDCNet) has been evaluated over two well-known hyperspectral scenes, demonstrating that it is able to achieve better results than traditional strategies with similar computational cost for HSI classification. Full article
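
The deforming-kernel layers are not fully specified in this abstract; as an approximate illustration, the sketch below uses torchvision's deformable convolution, in which a small convolution predicts per-position sampling offsets so that the effective receptive field adapts to the input. It is a stand-in for the idea, not the authors' DKDCNet.

import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class AdaptableConvBlock(nn.Module):
    # A conv predicts 2*k*k sampling offsets per position; the deformable conv samples there
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.offset_gen = nn.Conv2d(in_ch, 2 * k * k, kernel_size=k, padding=k // 2)
        self.deform = DeformConv2d(in_ch, out_ch, kernel_size=k, padding=k // 2)

    def forward(self, x):
        offsets = self.offset_gen(x)       # (B, 2*k*k, H, W) offsets, one pair per kernel tap
        return self.deform(x, offsets)

x = torch.randn(2, 30, 11, 11)             # e.g., a spectrally reduced HSI patch (assumed size)
print(AdaptableConvBlock(30, 64)(x).shape)  # -> torch.Size([2, 64, 11, 11])
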
Figures: (1) examples of irregular object shapes and orientations in hyperspectral scenes; (2) graphical representation of a convolution kernel; (3) graphical representation of the receptive field; (4) overview of the proposed adaptable network architecture; (5-6) ground truths of the University of Pavia and University of Houston scenes; (7) training and validation loss/accuracy curves; (8) OA, AA and Kappa coefficient per model; (9) trainable parameters on University of Pavia; (10) classification maps for the UH scene.
25 pages, 3461 KiB  
Article
Efficient Transformer for Remote Sensing Image Segmentation
by Zhiyong Xu, Weicun Zhang, Tianxiang Zhang, Zhifang Yang and Jiangyun Li
Remote Sens. 2021, 13(18), 3585; https://doi.org/10.3390/rs13183585 - 9 Sep 2021
Cited by 150 | Viewed by 13709
Abstract
Semantic segmentation for remote sensing images (RSIs) is widely applied in geological surveys, urban resources management, and disaster monitoring. Recent solutions on remote sensing segmentation tasks are generally addressed by CNN-based models and transformer-based models. In particular, transformer-based architecture generally struggles with two main problems: a high computation load and inaccurate edge classification. Therefore, to overcome these problems, we propose a novel transformer model to realize lightweight edge classification. First, based on a Swin transformer backbone, a pure Efficient transformer with mlphead is proposed to accelerate the inference speed. Moreover, explicit and implicit edge enhancement methods are proposed to cope with object edge problems. The experimental results evaluated on the Potsdam and Vaihingen datasets present that the proposed approach significantly improved the final accuracy, achieving a trade-off between computational complexity (Flops) and accuracy (Efficient-L obtaining 3.23% mIoU improvement on Vaihingen and 2.46% mIoU improvement on Potsdam compared with HRCNet_W48). As a result, it is believed that the proposed Efficient transformer will have an advantage in dealing with remote sensing image segmentation problems. Full article
Figures: graphical abstract; (1) Flops vs. mIoU on the Potsdam and Vaihingen datasets; (2) overall framework of the Swin transformer (Swin-T); (3) shifted window approach; (4) architecture of uperhead; (5) overall framework of the Efficient transformer (Efficient-T); (6) architecture of mlphead; (7) examples of uncertain edge definitions; (8) explicit edge enhancement method; (9) CNN-based edge extractor; (10) implicit edge enhancement method; (11) epoch vs. loss for AdamW and SGD; (12) visualization of C1-C4 features; (13) visualization of the edge enhancement methods; (14-15) prediction maps on the Vaihingen and Potsdam datasets; (16) improvement on blurry areas.
24 pages, 3630 KiB  
Article
Densely Connected Pyramidal Dilated Convolutional Network for Hyperspectral Image Classification
by Feng Zhao, Junjie Zhang, Zhe Meng and Hanqiang Liu
Remote Sens. 2021, 13(17), 3396; https://doi.org/10.3390/rs13173396 - 26 Aug 2021
Cited by 22 | Viewed by 3319
Abstract
Recently, with the extensive application of deep learning techniques in the hyperspectral image (HSI) field, particularly convolutional neural network (CNN), the research of HSI classification has stepped into a new stage. To avoid the problem that the receptive field of naive convolution is small, the dilated convolution is introduced into the field of HSI classification. However, the dilated convolution usually generates blind spots in the receptive field, resulting in discontinuous spatial information. In order to solve the above problem, a densely connected pyramidal dilated convolutional network (PDCNet) is proposed in this paper. Firstly, a pyramidal dilated convolutional (PDC) layer that integrates different numbers of sub-dilated convolutional layers is proposed, where the dilation factor of the sub-dilated convolution increases exponentially, achieving multi-scale receptive fields. Secondly, the number of sub-dilated convolutional layers increases in a pyramidal pattern with the depth of the network, thereby capturing more comprehensive hyperspectral information in the receptive field. Furthermore, a feature fusion mechanism combining pixel-by-pixel addition and channel stacking is adopted to extract more abstract spectral–spatial features. Finally, in order to reuse the features of the previous layers more effectively, dense connections are applied in densely pyramidal dilated convolutional (DPDC) blocks. Experiments on three well-known HSI datasets indicate that PDCNet proposed in this paper has good classification performance compared with other popular models. Full article
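
A minimal sketch of a PDC-style layer consistent with the description above: parallel 3 x 3 convolutions whose dilation factors grow exponentially, fused by pixel-by-pixel addition and channel stacking; the branch count and fusion details are assumptions.

import torch
import torch.nn as nn

class PDCLayer(nn.Module):
    # Parallel sub-dilated convolutions with dilation 1, 2, 4, ... fused by addition + stacking
    def __init__(self, channels, num_branches=3):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3,
                      dilation=2 ** i, padding=2 ** i, bias=False)
            for i in range(num_branches)
        )

    def forward(self, x):
        outs = [branch(x) for branch in self.branches]
        added = torch.stack(outs, dim=0).sum(dim=0)   # pixel-by-pixel addition
        return torch.cat([added, x], dim=1)           # channel stacking for dense feature reuse

x = torch.randn(1, 32, 15, 15)
print(PDCLayer(32)(x).shape)  # -> torch.Size([1, 64, 15, 15])

Matching the padding to the dilation (padding = 2 ** i for a 3 x 3 kernel) keeps all branches at the same spatial size, so enlarging the receptive field does not introduce misalignment between branches.
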
Figures: graphical abstract; (1) naive vs. dilated convolution; (2) densely connected convolutional block of DenseNet; (3) structures of three different blocks; (4) receptive fields of the third layer in the different blocks; (5) framework of PDCNet; (6-8) Indian Pines, Pavia University and Salinas Valley datasets; (9) OA for different numbers of training samples; (10) performance of BMNet, DCNet and PDCNet; (11-13) classification maps of different models on the IP, UP and SV datasets.
24 pages, 26957 KiB  
Article
CscGAN: Conditional Scale-Consistent Generation Network for Multi-Level Remote Sensing Image to Map Translation
by Yuanyuan Liu, Wenbin Wang, Fang Fang, Lin Zhou, Chenxing Sun, Ying Zheng and Zhanlong Chen
Remote Sens. 2021, 13(10), 1936; https://doi.org/10.3390/rs13101936 - 15 May 2021
Cited by 5 | Viewed by 3440
Abstract
Automatic remote sensing (RS) image to map translation is a crucial technology for intelligent tile map generation. Although existing methods based on a generative network (GAN) generated unannotated maps at a single level, they have limited capacity in handling multi-resolution map generation at different levels. To address the problem, we proposed a novel conditional scale-consistent generation network (CscGAN) to simultaneously generate multi-level tile maps from multi-scale RS images, using only a single and unified model. Specifically, the CscGAN first uses the level labels and map annotations as prior conditions to guide hierarchical feature learning with different scales. Then, a multi-scale discriminator and two multi-scale generators are introduced to describe both high-resolution and low-resolution representations, aiming to improve the similarity of generated maps and thus produce high-quality multi-level tile maps. Meanwhile, a level classifier is designed for further exploring the characteristics of tile maps at different levels. Moreover, the CscGAN is optimized by jointly multi-scale adversarial loss, level classification loss, and scale-consistent loss in an end-to-end manner. Extensive experiments on multiple datasets and study areas demonstrate that the CscGAN outperforms the state-of-the-art methods in multi-level map translation, with great robustness and efficiency. Full article
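
One plausible way to wire the level-label conditioning mentioned above is sketched below: the level label is embedded, broadcast spatially and concatenated with the RS image and map annotation before entering the generator. The embedding size, channel layout and level indexing are assumptions, not the CscGAN specification.

import torch
import torch.nn as nn

class LevelConditionedInput(nn.Module):
    # Embed the tile-map level, tile it over the image plane, and stack it onto the inputs
    def __init__(self, num_levels=5, embed_dim=8):
        super().__init__()
        self.embed = nn.Embedding(num_levels, embed_dim)

    def forward(self, rs_image, annotation, level):
        b, _, h, w = rs_image.shape
        cond = self.embed(level).view(b, -1, 1, 1).expand(-1, -1, h, w)
        return torch.cat([rs_image, annotation, cond], dim=1)

rs = torch.randn(2, 3, 256, 256)     # RS image
ann = torch.randn(2, 1, 256, 256)    # map annotation
lvl = torch.tensor([0, 3])           # levels 14 and 17, re-indexed from 0 (assumed)
print(LevelConditionedInput()(rs, ann, lvl).shape)  # -> torch.Size([2, 12, 256, 256])
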
Figures: (1) maps generated at multiple levels by different models; (2) examples from the maps dataset; (3) examples from the self-annotated RS-image-to-map dataset (levels 14-18); (4) architecture of the proposed CscGAN; (5) multi-scale generator; (6) multi-scale discriminator; (7) training procedure of the map-level classifier; (8) generation results of different methods on the maps dataset; (9) effect of the multi-scale generator; (10) confusion matrices of the level classifier on real and fake data; (11) ROC curve of the level classifier; (12) effect of the map-level classifier; (13-17) generation results at levels 14-18 in Songjiang, Pudong, Minhang and Qingpu Districts of Shanghai; (A1-A2) additional comparisons, including the Wuhan area.
22 pages, 29182 KiB  
Article
Generative Adversarial Learning in YUV Color Space for Thin Cloud Removal on Satellite Imagery
by Xue Wen, Zongxu Pan, Yuxin Hu and Jiayin Liu
Remote Sens. 2021, 13(6), 1079; https://doi.org/10.3390/rs13061079 - 12 Mar 2021
Cited by 34 | Viewed by 4641
Abstract
Clouds are one of the most serious disturbances when using satellite imagery for ground observations. The semi-translucent nature of thin clouds provides the possibility of 2D ground scene reconstruction based on a single satellite image. In this paper, we propose an effective framework for thin cloud removal involving two aspects: a network architecture and a training strategy. For the network architecture, a Wasserstein generative adversarial network (WGAN) in YUV color space called YUV-GAN is proposed. Unlike most existing approaches in RGB color space, our method performs end-to-end thin cloud removal by learning luminance and chroma components independently, which is efficient at reducing the number of unrecoverable bright and dark pixels. To preserve more detailed features, the generator adopts a residual encoding–decoding network without down-sampling and up-sampling layers, which effectively competes with a residual discriminator, encouraging the accuracy of scene identification. For the training strategy, a transfer-learning-based method was applied. Instead of using either simulated or scarce real data to train the deep network, adequate simulated pairs were used to train the YUV-GAN at first. Then, pre-trained convolutional layers were optimized by real pairs to encourage the applicability of the model to real cloudy images. Qualitative and quantitative results on RICE1 and Sentinel-2A datasets confirmed that our YUV-GAN achieved state-of-the-art performance compared with other approaches. Additionally, our method combining the YUV-GAN with a transfer-learning-based training strategy led to better performance in the case of scarce training data. Full article
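
The luminance/chroma separation described above rests on a standard RGB-to-YUV transform; a minimal sketch with BT.601 coefficients follows (the exact constants and normalization used by the authors may differ).

import torch

def rgb_to_yuv(rgb):
    # rgb: (B, 3, H, W) in [0, 1]; BT.601 luminance and chroma projections
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return torch.stack([y, u, v], dim=1)

def yuv_to_rgb(yuv):
    y, u, v = yuv[:, 0], yuv[:, 1], yuv[:, 2]
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return torch.stack([r, g, b], dim=1)

x = torch.rand(1, 3, 64, 64)
print((yuv_to_rgb(rgb_to_yuv(x)) - x).abs().max())  # round-trip error on the order of 1e-7

Working in YUV lets the network handle the luminance component, where the bright and dark pixels mentioned above live, independently of the chroma components.
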
Figures: graphical abstract; (1) overall framework; (2) failure cases of RSC-Net on bright and dark pixels; (3) residual block; (4) generator architecture; (5) discriminator architecture; (6) synthesis of cloudy images with Perlin fractal noise; (7-8) results reconstructed in different color spaces on RICE1 and Sentinel-2A samples; (9-10) results with different fidelity losses; (11) influence of adversarial training; (12) PSNR vs. training-set size; (13-14) comparison with DCP, McGAN and RSC-Net on the RICE1 and Sentinel-2A test sets; (15) heavy clouds and cloud shadows; (16) failure case with overly heavy clouds.
21 pages, 8976 KiB  
Article
Integrating Weighted Feature Fusion and the Spatial Attention Module with Convolutional Neural Networks for Automatic Aircraft Detection from SAR Images
by Jielan Wang, Hongguang Xiao, Lifu Chen, Jin Xing, Zhouhao Pan, Ru Luo and Xingmin Cai
Remote Sens. 2021, 13(5), 910; https://doi.org/10.3390/rs13050910 - 28 Feb 2021
Cited by 37 | Viewed by 3857
Abstract
The automatic detection of aircraft from SAR images is widely applied in both military and civil fields, but there are still considerable challenges. To address the high variety of aircraft sizes and complex background information in SAR images, a new fast detection framework based on convolutional neural networks is proposed, which achieves automatic and rapid detection of aircraft with high accuracy. First, the airport runway areas are detected, and the airport runway mask and the rectangular contour of the whole airport are generated. Then, a new deep neural network proposed in this paper, named Efficient Weighted Feature Fusion and Attention Network (EWFAN), is used to detect aircraft. EWFAN integrates the weighted feature fusion module, the spatial attention mechanism, and the CIF loss function. EWFAN can effectively reduce the interference of negative samples and enhance feature extraction, thereby significantly improving the detection accuracy. Finally, the airport runway mask is applied to the detected results to reduce false alarms and produce the final aircraft detection results. To evaluate the performance of the proposed framework, large-scale Gaofen-3 SAR images with 1 m resolution are utilized in the experiment. The detection rate and false alarm rate of our EWFAN algorithm are 95.4% and 3.3%, respectively, which outperforms EfficientDet and YOLOv4. In addition, the average test time with the proposed framework is only 15.40 s, indicating satisfying efficiency of automatic aircraft detection. Full article
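
The abstract does not spell out the weighted feature fusion module; a common formulation in EfficientDet-style detectors, fast normalized fusion with learnable non-negative scalars, is sketched below as an assumed stand-in.

import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFusion(nn.Module):
    # out = sum_i(w_i * x_i) / (sum_i w_i + eps), with w_i = ReLU(learnable scalar)
    def __init__(self, num_inputs, eps=1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, feats):
        w = F.relu(self.weights)
        w = w / (w.sum() + self.eps)
        return sum(wi * f for wi, f in zip(w, feats))

p_td, p_bu = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)  # same-shape feature maps
print(WeightedFusion(2)([p_td, p_bu]).shape)  # -> torch.Size([1, 64, 32, 32])
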
Figures: graphical abstract; (1) the efficient framework for aircraft detection; (2) framework of the EWFAN algorithm; (3) weighted bi-directional feature pyramid network; (4) residual spatial attention module; (5) adaptively spatial feature fusion; (6) classification regression network; (7) aircraft aspect-ratio distribution; (8) problems of the IoU loss; (9) non-maximum suppression (NMS); (10) effectiveness of the airport detection algorithm; (11-13) detection results on Gaofen-3 airport scenes for EfficientDet, YOLOv4 and EWFAN.
32 pages, 10384 KiB  
Article
AVILNet: A New Pliable Network with a Novel Metric for Small-Object Segmentation and Detection in Infrared Images
by Ikhwan Song and Sungho Kim
Remote Sens. 2021, 13(4), 555; https://doi.org/10.3390/rs13040555 - 4 Feb 2021
Cited by 11 | Viewed by 3810
Abstract
Infrared small-object segmentation (ISOS) has a persistent trade-off problem: which comes first, recall or precision? Striking a fine balance between them is fundamentally important for obtaining the best performance in real applications such as surveillance, tracking, and many other fields related to infrared search and track. The F1-score may be a good evaluation metric for this problem. However, since the F1-score depends on a specific threshold value, it cannot reflect the user's requirements across various application environments, so several metrics are commonly used together. We therefore introduce F-area, a novel metric for a panoptic evaluation of average precision and F1-score, which simultaneously considers performance in terms of real applications and the potential capability of a model. Furthermore, we propose a new network, called the Amorphous Variable Inter-located Network (AVILNet), which has a pliable structure based on GridNet and is an ensemble network consisting of a main network and its sub-network. Compared with the state-of-the-art ISOS methods, our model achieved an AP of 51.69%, an F1-score of 63.03%, and an F-area of 32.58% on the International Conference on Computer Vision 2019 ISOS Single dataset using one generator. With dual generators, it achieved an AP of 53.6%, an F1-score of 60.99%, and an F-area of 32.69%, beating the existing best record (AP, 51.42%; F1-score, 57.04%; F-area, 29.33%).
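The abstract does not spell out how F-area combines the two metrics, but the reported numbers are consistent with it being the product of AP and F1 (for example, 0.5169 × 0.6303 ≈ 0.3258 and 0.5360 × 0.6099 ≈ 0.3269). The helper below is a small sketch under that assumption, not the authors' definition.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall at a fixed threshold."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def f_area(average_precision: float, f1: float) -> float:
    """Illustrative F-area: product of AP (threshold-free potential) and F1
    (performance at the chosen operating point). Assumed form, consistent
    with the paper's reported values."""
    return average_precision * f1

print(round(f_area(0.5169, 0.6303), 4))  # 0.3258, single-generator result
print(round(f_area(0.5360, 0.6099), 4))  # 0.3269, dual-generator result
```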
Figures:
Figure 1. Infrared small-object samples. Background clutter and sensor noise distort the objects of interest.
Figure 2. Model overview of DataLossGAN (ICCV 2019) [22]: on the left are two generators (G1 and G2), and on the right is one discriminator. In the two generators, the blue number within each layer is the dilation factor. For the discriminator, the height, width, and channel number of the output feature maps are marked beside each layer. The two generators compose a dual-learning system that concentrates on opposing objectives while sharing information (e.g., the L1-distance between G1 and G2 feature maps) with the discriminator to alleviate radically biased training.
Figure 3. AVILNet is inspired by GridNet [24]. The 'grid' structure is extremely pliable. In terms of network perception, the horizontal stream (black arrow pointing right) enforces local properties and the vertical stream (black arrow pointing downward) enforces global properties. To transform the grid structure, only the two parameters ('width' and 'height') need to be set.
Figure 4. The whole architectural overview of the proposed AVILNet.
Figure 5. Detailed overviews of CDB-L and CDDB-L. Unlike the study in [25], we take the last-fusion strategy, where in the final processing addition is modified to concatenation. The dilation layer in CDDB-L improves the quality of the segmentation task through alternative sampling [48]. The ablation study on the effects of diverse strategies is shown in Table 2. The blue number within the layer is the dilation factor.
Figure 6. Information flow of GridNet [24] and our method: (a) GridNet; (b) the feature-highway connections (blue arrows), which allow information to flow through the down-sampling block (DSB) and the up-sampling block (USB) in a grid pattern without entering the transition layer.
Figure 7. Detailed overviews of the down-sampling block (DSB) and the up-sampling block (USB). Instead of simple binary up-sampling, two convolutional processes operate. This strategy makes the feature-highway connections trainable.
Figure 8. (a) ResNext [41] and (b) our assistant network, which has a multi-scale attention-based ensemble decision system with feature-highway connections.
Figure 9. Understanding the meaning of the width and height of our generator. In terms of feature dimension, the generator can be divided into 6 floors (denoted as H); the number of floors equals the height of the generator (for instance, H1 is the 1st floor). In terms of the number of information up- and down-streams, the generator can be divided into 4 streams (denoted as W); each stream W combines a number of either USBs or DSBs.
Figure 10. Each phase can choose whether to resize the input image. Asymmetric Contextual Modulation (ACM) [31] takes the route of case 1, since it exploits the backbone and is constructed for large-scale object images. AVILNet takes the route of case 3, because it is constructed for small-object images from beginning to end.
Figure 11. Shuffle strategies for last-fusion: (a) direct through, (b) shuffle, and (c) grain-shuffle. Strategies (b) and (c) differentiate our ensemble assistant network and therefore lead to the poor performance shown in Table 1.
Figure 12. The performance of all state-of-the-art methods on the ICCV2019 ISOS Single dataset: (a) the proposed metric (F-area), (b) average precision, and (c) area under the ROC curve.
Figure 13. The results of CNN-based methods. Green, red, and yellow indicate true positive, false negative, and false positive, respectively. To make a binary map, a threshold value of 0.5 was applied to each method's confidence map. GT(D) and GT(S) indicate ground truth for detection and segmentation, respectively.
Figure 14. The results of handcraft-based methods.
Figure 15. Training and test loss graphs for AVILNet.
Figure 16. The performance of ablation studies D and T.
Figure 17. The performance of ablation study Q and the sub-networks.
Figure 18. The performance of all ablation studies.
Figure 19. Feature addition with attention versus plain feature addition: (a) weighted sum of features; (b) simple addition.
13 pages, 2698 KiB  
Article
A Novel Deeplabv3+ Network for SAR Imagery Semantic Segmentation Based on the Potential Energy Loss Function of Gibbs Distribution
by Yingying Kong, Yanjuan Liu, Biyuan Yan, Henry Leung and Xiangyang Peng
Remote Sens. 2021, 13(3), 454; https://doi.org/10.3390/rs13030454 - 28 Jan 2021
Cited by 21 | Viewed by 4515
Abstract
Synthetic aperture radar (SAR) provides rich information about the Earth's surface under all-weather, day-and-night conditions and is applied in many relevant fields. SAR imagery semantic segmentation, which can serve both as a final product for end users and as a fundamental procedure supporting other applications, is one of the most difficult challenges. This paper proposes an encoder–decoder network based on Deeplabv3+ to semantically segment SAR imagery. A new potential energy loss function based on the Gibbs distribution is proposed to establish the semantic dependence among different categories through the relationships among cliques in the neighborhood system. The paper also introduces an improved channel and spatial attention module into the Mobilenetv2 backbone to improve the recognition accuracy of small-object categories in SAR imagery. The experimental results show that the proposed method achieves the highest mean intersection over union (mIoU) and global accuracy (GA) with the least running time, which verifies its effectiveness.
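As a rough illustration of a Gibbs-style potential energy term (not the authors' exact loss), the sketch below sums pairwise clique potentials over the 4-neighborhood of a softmax output, with the class-compatibility matrix V treated as a hypothetical input; the energy is added to the usual cross-entropy.

```python
import torch
import torch.nn.functional as F

def pairwise_potential_energy(probs: torch.Tensor, V: torch.Tensor) -> torch.Tensor:
    """Generic Gibbs-style pairwise energy over 4-neighborhood cliques.

    probs: (B, C, H, W) softmax class probabilities.
    V:     (C, C) potential matrix; V[a, b] is the penalty for adjacent pixels
           taking classes a and b (hypothetical, not the paper's exact form).
    """
    right = torch.einsum('bchw,cd,bdhw->bhw', probs[:, :, :, :-1], V, probs[:, :, :, 1:])
    down  = torch.einsum('bchw,cd,bdhw->bhw', probs[:, :, :-1, :], V, probs[:, :, 1:, :])
    return right.mean() + down.mean()

# Toy usage: 3 classes, Potts-like potential that penalizes all disagreements equally.
B, C, H, W = 2, 3, 8, 8
logits = torch.randn(B, C, H, W, requires_grad=True)
labels = torch.randint(0, C, (B, H, W))
V = 1.0 - torch.eye(C)                      # off-diagonal pairs cost 1, same class costs 0
loss = F.cross_entropy(logits, labels) + 0.1 * pairwise_potential_energy(F.softmax(logits, dim=1), V)
loss.backward()
```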
Figures:
Figure 1. The structure of the Deeplabv3+ network.
Figure 2. The channel and spatial attention module.
Figure 3. The SAR imagery and its corresponding ground truth: (a) SAR imagery; (b) ground truth.
Figure 4. (a) The convergence of the proposed loss function; (b) the change in mIoU_cls.
Figure 5. The results of different networks: (a) SAR images; (b) ground truth; (c) Deeplabv3+–drn output; (d) Deeplabv3+–ResNet output; (e) Deeplabv3+–Mobilenetv2 output; (f) PSPNet output; (g) FCN output.
Figure 6. The results of the three different networks: (a) SAR imagery; (b) ground truth; (c) Deeplabv3–drn output; (d) Deeplabv3–ResNet output; (e) Deeplabv3–Mobilenetv2 output.
26 pages, 28935 KiB  
Article
Structured Object-Level Relational Reasoning CNN-Based Target Detection Algorithm in a Remote Sensing Image
by Bei Cheng, Zhengzhou Li, Bitong Xu, Xu Yao, Zhiquan Ding and Tianqi Qin
Remote Sens. 2021, 13(2), 281; https://doi.org/10.3390/rs13020281 - 14 Jan 2021
Cited by 26 | Viewed by 3358
Abstract
Deep learning technology has been extensively explored by existing methods to improve the performance of target detection in remote sensing images, owing to its powerful feature extraction and representation abilities. However, these methods usually focus on the interior features of the target but ignore the exterior semantic information around it, especially object-level relationships. Consequently, they fail to detect and recognize targets in complex backgrounds where multiple objects crowd together. To handle this problem, a diversified context information fusion framework based on a convolutional neural network (DCIFF-CNN) is proposed in this paper, which employs structured object-level relationships to improve target detection and recognition in complex backgrounds. DCIFF-CNN is composed of two successive sub-networks: a multi-scale local context region proposal network (MLC-RPN) and an object-level relationship context target detection network (ORC-TDN). The MLC-RPN relies on the fine-grained details of objects to generate candidate regions in the remote sensing image. The ORC-TDN then utilizes the spatial context information of objects to detect and recognize targets by integrating an attentional message integrated module (AMIM) and an object relational structured graph (ORSG). The AMIM is integrated into the feed-forward CNN to highlight useful object-level context information, while the ORSG builds relations between a set of objects by processing their appearance and geometric features. Finally, the target detection method based on DCIFF-CNN effectively represents the interior and exterior information of the target by exploiting both multiscale local context information and object-level relationships. Extensive experiments demonstrate that the proposed DCIFF-CNN improves target detection and recognition accuracy in complex backgrounds, showing superiority to other state-of-the-art methods.
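To illustrate the general idea of object-level relational reasoning (a sketch, not the exact AMIM/ORSG design), the module below lets each candidate object aggregate attention-weighted messages from the other objects in the scene, based on appearance-feature similarity; feature dimensions are illustrative.

```python
import torch
import torch.nn as nn

class ObjectContextAttention(nn.Module):
    """Toy attention over object-level features: each candidate target
    aggregates messages from the other detected objects, weighted by
    appearance similarity (a simplified stand-in for object relational reasoning)."""
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, obj_feats: torch.Tensor) -> torch.Tensor:
        # obj_feats: (N, dim) appearance features of N objects in one image.
        q, k, v = self.q(obj_feats), self.k(obj_feats), self.v(obj_feats)
        attn = torch.softmax(q @ k.t() / k.shape[-1] ** 0.5, dim=-1)  # (N, N) relation weights
        context = attn @ v                                            # object-level messages
        return obj_feats + context                                    # fuse context with each object's own features

feats = torch.randn(6, 256)                  # e.g., 6 region proposals with 256-d features
print(ObjectContextAttention(256)(feats).shape)   # torch.Size([6, 256])
```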
Figures:
Figure 1. The contextual information in a remote sensing image.
Figure 2. The overall framework of the diversified context information fusion framework based on a convolutional neural network (DCIFF-CNN).
Figure 3. The general framework of the multi-scale local context region proposal network (MLC-RPN).
Figure 4. The graphical problem between target and objects.
Figure 5. Illustration of the gated recurrent unit (GRU).
Figure 6. Illustration of the attentional message integrated module (AMIM).
Figure 7. Object relational structured graph (ORSG).
Figure 8. Performance comparisons of twelve different methods in terms of average precision (AP) values: (a) airplane; (b) ship; (c) storage tank; (d) baseball diamond; (e) tennis court; (f) basketball court; (g) ground track field; (h) harbor; (i) bridge; (j) vehicle; (k) mAP over the ten target classes. All methods use the same dataset and data ratio (NWPU-VHR; train: 20%, val: 20%, test: 60%).
Figure 9. Some target detection results with the proposed approach.
Figure 10. Target detection results of different methods: (a) FRCNN-VGG; (b) YOLO3; (c) SSD; (d) DCIFF-CNN.
Figure 11. The precision-recall curves (PRCs) of the proposed method and the compared methods: (a) airplane; (b) ship; (c) storage tank; (d) baseball diamond; (e) tennis court; (f) basketball court; (g) ground track field; (h) harbor; (i) bridge; (j) vehicle.
Figure 12. The heat maps of the collected dataset: (a) airplane; (b) ship; (c) car.
Figure 13. The average precision of target detection on the collected dataset.
Figure 14. The detection results of ten categories of targets by MLC-RPN and SSD (evaluation metric: AP; dataset: NWPU-VHR; train: 20%, val: 20%, test: 60%).
Figure 15. The AP value for different targets under various context fusion networks (evaluation metric: AP; dataset: NWPU-VHR; train: 20%, val: 20%, test: 60%).
Figure 16. Feature maps of conv5 in three networks: (a) input images; (b) without ORSG and AMIM; (c) with ORSG and max-pooling; (d) with ORSG and average pooling; (e) with ORSG and AMIM.
25 pages, 6240 KiB  
Article
Adaptive Weighting Feature Fusion Approach Based on Generative Adversarial Network for Hyperspectral Image Classification
by Hongbo Liang, Wenxing Bao and Xiangfei Shen
Remote Sens. 2021, 13(2), 198; https://doi.org/10.3390/rs13020198 - 8 Jan 2021
Cited by 16 | Viewed by 3982
Abstract
Recently, generative adversarial network (GAN)-based methods for hyperspectral image (HSI) classification have attracted research attention due to their ability to alleviate the challenges brought by having limited labeled samples. However, several studies have demonstrated that existing GAN-based HSI classification methods are limited by redundant spectral knowledge and cannot extract discriminative characteristics, which affects classification performance. In addition, GAN-based methods always suffer from mode collapse, which seriously hinders their development. In this study, we propose a semi-supervised adaptive weighting feature fusion generative adversarial network (AWF2-GAN) to alleviate these problems, introducing unlabeled data to address the issue of having a small number of samples. First, to build valid spectral–spatial feature engineering, the discriminator learns both the dense global spectrum and the neighboring separable spatial context via well-designed extractors. Second, a lightweight adaptive feature weighting component is proposed for feature fusion; it considers four predictive fusion options, that is, adding or concatenating feature maps with similar or adaptive weights. Finally, to counter mode collapse, the proposed AWF2-GAN combines a supervised central loss and an unsupervised mean minimization loss for optimization. Quantitative results on two HSI datasets show that AWF2-GAN achieves superior performance over state-of-the-art GAN-based methods.
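As an illustration of the adaptive-weight fusion options mentioned above (a sketch under assumed layer sizes, not the paper's implementation), the module below predicts per-sample branch weights from global statistics of the spectral and spatial feature maps and then either adds or concatenates the rescaled maps.

```python
import torch
import torch.nn as nn

class AdaptiveWeightingFusion(nn.Module):
    """Illustrative adaptive-weight fusion: the two branch features are rescaled
    by weights predicted from their own global statistics, then added or
    concatenated depending on the flag."""
    def __init__(self, channels: int, concat: bool = False):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(2 * channels, 2),   # one weight per branch
            nn.Softmax(dim=-1),
        )
        self.concat = concat

    def forward(self, spectral_feat, spatial_feat):
        # Global average pooling of both branches drives the weighting.
        stats = torch.cat([spectral_feat.mean(dim=(2, 3)), spatial_feat.mean(dim=(2, 3))], dim=1)
        w = self.gate(stats)                               # (B, 2), adaptive per sample
        a = w[:, 0, None, None, None] * spectral_feat
        b = w[:, 1, None, None, None] * spatial_feat
        return torch.cat([a, b], dim=1) if self.concat else a + b

fuse = AdaptiveWeightingFusion(channels=128, concat=False)
spectral = torch.randn(4, 128, 9, 9)   # hypothetical dense-spectrum branch output
spatial = torch.randn(4, 128, 9, 9)    # hypothetical separable-spatial branch output
print(fuse(spectral, spatial).shape)   # torch.Size([4, 128, 9, 9])
```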
Figures:
Figure 1. Architecture of ACGAN for HSI classification [44].
Figure 2. Framework of the AWF2-GAN for HSI classification.
Figure 3. Basic fusion models with (a) element-wise addition and (b) feature concatenation.
Figure 4. Adaptive weighting fusion models with (a) element-wise addition and (b) feature concatenation.
Figure 5. Adaptive weighting feature fusion discriminator (upper), consisting of a dense spectral and a spatially separable feature extractor. Their resulting features are fed into an adaptive weighting fusion model, which outputs a vector that indicates whether the data is fake or real and contains categorical probabilities. The generator (lower) contains consecutive spatial and spectral feature generation blocks to generate the synthetic HSI cuboid Z.
Figure 6. Indian Pines data: (a) color composite with RGB bands (29, 19, 9); (b) ground truth; (c) category names with labeled samples.
Figure 7. Pavia University imagery: (a) color composite with RGB bands (61, 25, 13); (b) ground truth; (c) class names with available samples.
Figure 8. Classification maps for the IN dataset with 525 labeled training samples: (a) training samples; (b) SVM (EMAPs); (c) HS-GAN; (d) 3D-GAN; (e) SS-GAN; (f) AD-GAN; (g) F2-Concat.; (h) F2-Add.; (i) AWF2-Concat.; (j) AWF2-Add.
Figure 9. Classification maps for the UP dataset with 350 labeled training samples: (a) training samples; (b) SVM (EMAPs); (c) HS-GAN; (d) 3D-GAN; (e) SS-GAN; (f) AD-GAN; (g) F2-Concat.; (h) F2-Add.; (i) AWF2-Concat.; (j) AWF2-Add.
Figure 10. Overall accuracies of AWF2-Add.-GAN with various kernel settings and numbers of neurons in the spectral and spatial feature extractors, sampled on the two training datasets: (a) effect of the number of kernels; (b) kernel sizes; (c) number of neurons for spectral purity analysis.
Figure 11. Overall accuracies for different depths of the two feature extractors (3 & 3, 3 & 4, 4 & 4, 4 & 5, and 5 & 5, respectively): (a) on the Indian Pines dataset; (b) on the Pavia University dataset.
23 pages, 8782 KiB  
Article
HRCNet: High-Resolution Context Extraction Network for Semantic Segmentation of Remote Sensing Images
by Zhiyong Xu, Weicun Zhang, Tianxiang Zhang and Jiangyun Li
Remote Sens. 2021, 13(1), 71; https://doi.org/10.3390/rs13010071 - 27 Dec 2020
Cited by 119 | Viewed by 10369
Abstract
Semantic segmentation is a significant method in remote sensing image (RSI) processing and has been widely used in various applications. Conventional convolutional neural network (CNN)-based semantic segmentation methods are likely to lose spatial information in the feature extraction stage and usually pay little attention to global context information. Moreover, the imbalance of category scales and uncertain boundary information in RSIs make the semantic segmentation task even more challenging. To overcome these problems, a high-resolution context extraction network (HRCNet) based on a high-resolution network (HRNet) is proposed in this paper. In this approach, the HRNet structure is adopted to preserve spatial information. Moreover, a light-weight dual attention (LDA) module is designed to obtain global context information in the feature extraction stage, and a feature enhancement feature pyramid (FEFP) structure is introduced and employed to fuse contextual information at different scales. In addition, to exploit boundary information, we design a boundary aware (BA) module combined with a boundary aware loss (BAloss) function. The experimental results on the Potsdam and Vaihingen datasets show that the proposed approach significantly improves boundary and segmentation performance, reaching overall accuracy scores of 92.0% and 92.3%, respectively. It is therefore envisaged that the proposed HRCNet model will be advantageous for remote sensing image segmentation.
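A minimal sketch of a boundary-aware loss in the spirit of BAloss follows (the exact formulation is not given in the abstract, so this is only one plausible form): a boundary mask is extracted from the ground truth with a simple morphological gradient and used to up-weight the per-pixel cross-entropy near class boundaries.

```python
import torch
import torch.nn.functional as F

def boundary_weighted_ce(logits, labels, num_classes, boundary_weight=2.0, k=3):
    """Illustrative boundary-aware loss: pixels near class boundaries in the
    ground truth receive a larger cross-entropy weight. Boundary extraction
    uses dilation minus erosion (max-pooling morphological gradient)."""
    onehot = F.one_hot(labels, num_classes).permute(0, 3, 1, 2).float()
    dilated = F.max_pool2d(onehot, kernel_size=k, stride=1, padding=k // 2)
    eroded = -F.max_pool2d(-onehot, kernel_size=k, stride=1, padding=k // 2)
    boundary = ((dilated - eroded).sum(dim=1) > 0).float()     # (B, H, W) boundary mask
    weights = 1.0 + (boundary_weight - 1.0) * boundary
    ce = F.cross_entropy(logits, labels, reduction='none')     # per-pixel cross-entropy
    return (weights * ce).mean()

logits = torch.randn(2, 6, 64, 64, requires_grad=True)         # 6 classes, e.g., Potsdam labels
labels = torch.randint(0, 6, (2, 64, 64))
loss = boundary_weighted_ce(logits, labels, num_classes=6)
loss.backward()
```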
Figures:
Figure 1. The overall framework of our model.
Figure 2. The architecture of HRNet. The rectangular blocks represent feature maps and '⟶' represents the convolution operation; Stem is the downsampling process.
Figure 3. The overall architecture, divided into three parts: from left to right, the backbone, the segmentation head, and the loss functions.
Figure 4. The light-weight dual attention (LDA) module applied to the four stages (Stage1, Stage2, Stage3, and Stage4).
Figure 5. Detailed design of the LDA module.
Figure 6. Framework of the feature enhancement feature pyramid (FEFP) module.
Figure 7. Framework of the designed multiple loss functions.
Figure 8. Loss and accuracy during the training process.
Figure 9. Comparison of the three postprocessing methods on the Potsdam dataset.
Figure 10. Prediction maps of the compared methods on the Potsdam dataset. "†" denotes data augmentation (flip testing).
Figure 11. Prediction maps of the above methods on the Vaihingen dataset. "†" denotes data augmentation (flip and multi-scale testing).
Figure 12. The proposed modules significantly improve the segmentation of large objects; for small objects, the boundary segmentation is smoother.
19 pages, 2301 KiB  
Article
Air Pollution Prediction with Multi-Modal Data and Deep Neural Networks
by Jovan Kalajdjieski, Eftim Zdravevski, Roberto Corizzo, Petre Lameski, Slobodan Kalajdziski, Ivan Miguel Pires, Nuno M. Garcia and Vladimir Trajkovik
Remote Sens. 2020, 12(24), 4142; https://doi.org/10.3390/rs12244142 - 18 Dec 2020
Cited by 75 | Viewed by 8957
Abstract
Air pollution is becoming a serious environmental problem, especially in urban areas affected by increasing migration rates. The large availability of sensor data enables the adoption of analytical tools to provide decision support capabilities. Employing sensors facilitates air pollution monitoring, but the lack of predictive capability limits such systems' potential in practical scenarios. On the other hand, forecasting methods offer the opportunity to predict future pollution in specific areas, potentially suggesting useful preventive measures. To date, many works have tackled the problem of air pollution forecasting, most of which are based on sequence models trained with raw pollution data and subsequently used to make predictions. This paper proposes a novel approach that evaluates four different architectures using camera images to estimate the air pollution in those areas; the images are further enhanced with weather data to boost classification accuracy. The proposed approach exploits generative adversarial networks combined with data augmentation techniques to mitigate the class imbalance problem. The experiments show that the proposed method achieves a robust accuracy of up to 0.88, which is comparable to sequence models and conventional models that utilize air pollution data directly. This is a remarkable result considering that historic air pollution data is directly related to the output (future air pollution data), whereas the proposed architecture uses camera images to recognize air pollution, an inherently much more difficult problem.
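A minimal sketch of the multi-modal idea follows (illustrative architecture and sizes, not one of the four evaluated networks): a small CNN encodes the camera image, its embedding is concatenated with a weather feature vector, and a classifier head predicts the pollution class.

```python
import torch
import torch.nn as nn

class ImageWeatherClassifier(nn.Module):
    """Sketch of image + weather fusion for pollution-class prediction: the
    CNN embedding of the camera image is concatenated with tabular weather
    features (e.g., temperature, humidity, wind speed) before classification."""
    def __init__(self, num_weather_features: int = 4, num_classes: int = 6):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),      # -> (B, 32) image embedding
        )
        self.head = nn.Sequential(
            nn.Linear(32 + num_weather_features, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, image, weather):
        fused = torch.cat([self.cnn(image), weather], dim=1)
        return self.head(fused)

model = ImageWeatherClassifier()                       # 6 classes, as in the 6-class experiment
logits = model(torch.randn(8, 3, 128, 128), torch.randn(8, 4))
print(logits.shape)                                    # torch.Size([8, 6])
```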
Figures:
Figure 1. Location of the camera, sensor, and weather station shown on the map of Skopje.
Figure 2. Workflow of the proposed methodology for air pollution prediction.
Figure 3. Basic convolutional block.
Figure 4. Basic convolutional neural network model architecture.
Figure 5. Residual block.
Figure 6. ResNet convolutional block.
Figure 7. ResNet architecture [21].
Figure 8. Inception block [50].
Figure 9. Inception architecture [21].
Figure 10. Proposed custom pretrained Inception.
Figure 11. Exemplary images in the dataset across the different classes during the day.
Figure 12. Exemplary images in the dataset across the different classes during the night or in other weather conditions with limited visibility.
Figure 13. Accuracy of the different architectures on 6-class classification.
Figure 14. Accuracy of the different architectures on binary classification.

Review


22 pages, 5900 KiB  
Review
Effect of Attention Mechanism in Deep Learning-Based Remote Sensing Image Processing: A Systematic Literature Review
by Saman Ghaffarian, João Valente, Mariska van der Voort and Bedir Tekinerdogan
Remote Sens. 2021, 13(15), 2965; https://doi.org/10.3390/rs13152965 - 28 Jul 2021
Cited by 120 | Viewed by 14652
Abstract
Machine learning, particularly deep learning (DL), has become a central and state-of-the-art method for several computer vision applications and remote sensing (RS) image processing. Researchers continually try to improve the performance of DL methods by developing new architectural designs and/or new techniques, such as attention mechanisms. Since the attention mechanism was proposed, it has, regardless of its type, been increasingly used in diverse RS applications to improve the performance of existing DL methods. However, these methods are scattered over different studies, impeding the selection and application of feasible approaches. This study provides an overview of the developed attention mechanisms and how to integrate them with different deep learning neural network architectures. In addition, it investigates the effect of the attention mechanism on deep learning-based RS image processing. We identified and analyzed the advances in the corresponding attention mechanism-based deep learning (At-DL) methods. A systematic literature review was performed to identify the trends in publications, publishers, improved DL methods, data types used, attention types used, and overall accuracies achieved with At-DL methods, and to extract current research directions, weaknesses, and open problems, providing insights and recommendations for future studies. For this, five main research questions were formulated to extract the required data and information from the literature. Furthermore, we categorized the papers by the addressed RS image processing task (e.g., image classification, object detection, and change detection) and discussed the results within each group. In total, 270 papers were retrieved, of which 176 were selected according to the defined exclusion criteria for further analysis and detailed review. The results reveal that most of the papers reported an increase in overall accuracy when using the attention mechanism within DL methods for image classification, image segmentation, change detection, and object detection on remote sensing images.
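For readers new to the two attention families the review distinguishes (channel and spatial attention, cf. Figure 2), the compact CBAM-style sketch below shows how each can be attached to a CNN feature map. It is a generic example with illustrative sizes, not taken from any specific reviewed paper.

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """Compact example of the two attention types: a channel attention that
    rescales the feature map per channel, followed by a spatial attention
    that rescales it per pixel (CBAM-style sketch)."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_gate(x)[:, :, None, None]                 # channel attention
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.max(dim=1, keepdim=True).values], dim=1) # (B, 2, H, W)
        return x * self.spatial_gate(pooled)                           # spatial attention

features = torch.randn(2, 64, 32, 32)
print(ChannelSpatialAttention(64)(features).shape)   # torch.Size([2, 64, 32, 32])
```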
Figures:
Figure 1. An overview of typical attention mechanism approaches [21].
Figure 2. A simple illustration of the channel and spatial attention types/networks and their effects on the feature maps.
Figure 3. An example of adding an attention network (i.e., co-attention) to a CNN module (i.e., a Siamese network) for building-based change detection [51]. CoA: co-attention module; At: attention network; CR: change residual module.
Figure 4. An example of adding spatial and channel attention to a GAN module for building detection from aerial images [75]. A: max pooling layer; B: convolution + batch normalization + rectified linear unit (ReLU) layers; C: upsampling layer; D: concatenation operation; SA: spatial attention mechanism; CA: channel attention mechanism; RS: reshape operation.
Figure 5. An example of adding attention networks (i.e., spatial and channel attention) to an RNN + CNN module for hyperspectral image classification [79]. PCA: principal component analysis.
Figure 6. An example of adding an attention network to a GNN module for multi-label RS image classification [82].
Figure 7. Year-wise classification of the papers, categorized by the attention mechanism type used.
Figure 8. The number of publications for different study targets.
Figure 9. The DL algorithms improved with attention mechanisms in the papers.
Figure 10. The attention mechanism types used in the papers.
Figure 11. The datasets used in the papers.
Figure 12. The spatial resolution of the RS images used in the papers.
Figure 13. The accuracy of the developed At-DL methods for different tasks in the papers.
Figure 14. The effect of using attention mechanisms within DL algorithms in terms of accuracy for different tasks in the papers.