
 
 

State-of-the-Art Remote Sensing Image Scene Classification

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (31 July 2022) | Viewed by 58152

Special Issue Editors


Guest Editors

  • Key Laboratory of Spectral Imaging Technology CAS, Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an 710119, China
    Interests: remote sensing scene classification; cross-domain scene classification
  • School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
    Interests: remote sensing classification; feature extraction; deep learning; sparse representation; graph learning
  • School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, P.O. Box 64, Xi'an 710072, China
    Interests: remote sensing; image analysis; computer vision; pattern recognition; machine learning

Special Issue Information

Dear Colleagues,

As a necessary preliminary step, remote sensing scene classification assigns a specific semantic label to each image, which is very helpful for geological surveys, urban planning, and other fields. Many machine learning techniques have been developed to identify remote sensing scenes, such as logistic regression, neural networks, feature learning, and support vector machines. Although this research area has attracted much attention and achieved remarkable performance, most methods rest on the assumption that the training set and test set are drawn from the same distribution. In real-world applications, this assumption is frequently violated, since remote sensing scenes may be captured by different sensors and over diverse locations on the ground surface. Changes in sensor type, viewing angle, and illumination conditions can cause large distribution differences across remote sensing images. This Special Issue of Remote Sensing is therefore timely for promoting innovation and improvement in remote sensing scene classification using cross-domain and multi-source data.

This Special Issue focuses on advances in remote sensing scene classification using cross-domain data, multi-source data, and multi-modal data. Topics of interest include, but are not limited to:

  • Cross-domain remote sensing scene classification/cross-scene classification;
  • Multi-source remote sensing data classification;
  • Few-shot image classification;
  • Multiple-scene or multi-task classification;
  • Knowledge distillation and collaborative learning in remote sensing;
  • Generalizable/domain-invariant/transferable models for scene classification;
  • Feature learning for cross-domain/multi-modal/multi-temporal image analysis;
  • Applications/surveys/benchmarks in remote sensing scene classification.

Dr. Xiangtao Zheng
Dr. Fulin Luo
Prof. Dr. Qi Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Remote sensing scene classification
  • Multi-modal image analysis
  • Cross-modal image interpretation
  • Self-supervised/weakly supervised/unsupervised learning
  • Domain adaptation/transfer learning
  • Open set domain adaptation
  • Zero-shot/few-shot learning
  • Multi-task learning
  • Pattern recognition
  • Deep neural networks

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (16 papers)


Research


20 pages, 11529 KiB  
Article
Multi-Field Context Fusion Network for Semantic Segmentation of High-Spatial-Resolution Remote Sensing Images
by Xinran Du, Shumeng He, Houqun Yang and Chunxiao Wang
Remote Sens. 2022, 14(22), 5830; https://doi.org/10.3390/rs14225830 - 17 Nov 2022
Cited by 5 | Viewed by 2291
Abstract
High spatial resolution (HSR) remote sensing images have a wide range of application prospects in the fields of urban planning, agricultural planning and military training. Therefore, the research on the semantic segmentation of remote sensing images becomes extremely important. However, large data volume and the complex background of HSR remote sensing images put great pressure on the algorithm efficiency. Although the pressure on the GPU can be relieved by down-sampling the image or cropping it into small patches for separate processing, the loss of local details or global contextual information can lead to limited segmentation accuracy. In this study, we propose a multi-field context fusion network (MCFNet), which can preserve both global and local information efficiently. The method consists of three modules: a backbone network, a patch selection module (PSM), and a multi-field context fusion module (FM). Specifically, we propose a confidence-based local selection criterion in the PSM, which adaptively selects local locations in the image that are poorly segmented. Subsequently, the FM dynamically aggregates the semantic information of multiple visual fields centered on that local location to enhance the segmentation of these local locations. Since MCFNet only performs segmentation enhancement on local locations in an image, it can improve segmentation accuracy without consuming excessive GPU memory. We implement our method on two high spatial resolution remote sensing image datasets, DeepGlobe and Potsdam, and compare the proposed method with state-of-the-art methods. The results show that the MCFNet method achieves the best balance in terms of segmentation accuracy, memory efficiency, and inference speed. Full article
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
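The confidence-based selection idea behind the PSM can be sketched in a few lines. The snippet below is not the authors' code; it is a minimal NumPy illustration that flags patches whose mean top-class confidence in the global (down-sampled) prediction falls below a threshold, so that only those locations are refined locally. The patch size and threshold are hypothetical.

```python
import numpy as np

def select_low_confidence_patches(probs, patch=64, thresh=0.8):
    """Flag patches whose mean top-class confidence is below `thresh`.

    probs: (C, H, W) softmax output of the global (down-sampled) branch.
    Returns a list of (row, col) upper-left corners of flagged patches.
    """
    conf = probs.max(axis=0)                     # per-pixel top-class confidence
    flagged = []
    H, W = conf.shape
    for r in range(0, H - patch + 1, patch):
        for c in range(0, W - patch + 1, patch):
            if conf[r:r + patch, c:c + patch].mean() < thresh:
                flagged.append((r, c))           # poorly segmented -> refine locally
    return flagged

# Toy usage: random "probabilities" for a 4-class, 256x256 global prediction.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 256, 256))
probs = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
print(len(select_low_confidence_patches(probs)))
```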
Figures

  • Figure 1. Results of semantic segmentation using patch processing and down-sampling processing. (a) Input image; (b) labeled image, where the area circled in red contains two highly confusing categories, Agriculture and Rangeland; (c,d) the two processing methods, respectively. Down-sampling loses fine details, while patch processing misclassifies local patches due to the lack of global context.
  • Figure 2. Overview of the proposed MCFNet. The global and local branches take the down-sampled and cropped images, respectively. The PSM identifies poorly segmented local patches within the global segmentation, and the FM performs segmentation enhancement on those patches. The final segmentation is created by aggregating the global and local branch feature maps.
  • Figure 3. (a) Relationship between the score of a local patch and its segmentation accuracy, where accuracy is the percentage of correctly predicted samples; (b) relative score of a local patch and its accuracy, where the relative score is defined as μ_global − μ_local; (c) the PSM, which selects the local patches that require reinforcement based on the global image's soft classification output.
  • Figure 4. The multi-field context fusion module. The FM adaptively fuses multi-field semantics for the local patches selected by the PSM.
  • Figure 5. Interaction of the PSM and FM.
  • Figure 6. The DeepGlobe and Potsdam datasets. The areas framed in red are difficult to segment.
  • Figure 7. Visual comparison of semantic segmentation results on DeepGlobe.
  • Figure 8. Visualization of semantic segmentation results for the ambiguous categories Agriculture and Rangeland.
  • Figure 9. Segmentation output for the Potsdam dataset.
  • Figure 10. Local details of the Potsdam image produced by different methods.
  • Figure 11. Local details of the DeepGlobe image produced by different methods.
19 pages, 14157 KiB  
Article
An Efficient Feature Extraction Network for Unsupervised Hyperspectral Change Detection
by Hongyu Zhao, Kaiyuan Feng, Yue Wu and Maoguo Gong
Remote Sens. 2022, 14(18), 4646; https://doi.org/10.3390/rs14184646 - 17 Sep 2022
Cited by 11 | Viewed by 2367
Abstract
Change detection (CD) in hyperspectral images has become a research hotspot in the field of remote sensing due to the extremely wide spectral range of hyperspectral images compared to traditional remote sensing images. It is challenging to effectively extract features from redundant high-dimensional data for hyperspectral change detection tasks due to the fact that hyperspectral data contain abundant spectral information. In this paper, a novel feature extraction network is proposed, which uses a Recurrent Neural Network (RNN) to mine the spectral information of the input image and combines this with a Convolutional Neural Network (CNN) to fuse the spatial information of hyperspectral data. Finally, the feature extraction structure of hybrid RNN and CNN is used as a building block to complete the change detection task. In addition, we use an unsupervised sample generation strategy to produce high-quality samples for network training. The experimental results demonstrate that the proposed method yields reliable detection results. Moreover, the proposed method has fewer noise regions than the pixel-based method. Full article
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
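The hybrid spectral-RNN/spatial-CNN building block can be illustrated with a small PyTorch sketch. This is not the paper's network; it only shows the general idea of running a GRU over the spectrum of the centre pixel while a CNN summarizes the surrounding spatial patch, then classifying the fused feature as changed or unchanged. All layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class HybridRNNCNN(nn.Module):
    """Toy spectral-RNN + spatial-CNN feature extractor for change detection.

    Input: bi-temporal difference patches of shape (B, bands, 5, 5).
    """
    def __init__(self, bands=154, hidden=64):
        super().__init__()
        # RNN branch: treat the spectrum of the centre pixel as a sequence.
        self.gru = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        # CNN branch: fuse the spatial context around the pixel.
        self.cnn = nn.Sequential(
            nn.Conv2d(bands, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Sequential(nn.Linear(hidden + 32, 64), nn.ReLU(),
                                  nn.Linear(64, 2))       # changed / unchanged

    def forward(self, x):
        b, bands, h, w = x.shape
        centre = x[:, :, h // 2, w // 2].unsqueeze(-1)      # (B, bands, 1)
        _, h_n = self.gru(centre)                           # spectral feature
        spectral = h_n.squeeze(0)
        spatial = self.cnn(x).flatten(1)                    # spatial feature
        return self.head(torch.cat([spectral, spatial], dim=1))

model = HybridRNNCNN()
print(model(torch.randn(8, 154, 5, 5)).shape)  # torch.Size([8, 2])
```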
Figures

  • Graphical abstract.
  • Figure 1. Architecture of the proposed method, which contains three main parts: the convolutional neural network part, the recurrent neural network part, and the final fully connected layer.
  • Figure 2. The Hermiston City area dataset. (a) Image acquired in 2004; (b) image acquired in 2007; (c) reference map.
  • Figure 3. The Bay Area dataset. (a) Image acquired in 2013; (b) image acquired in 2015; (c) reference map.
  • Figure 4. The Santa Barbara dataset. (a) Image acquired in 2013; (b) image acquired in 2014; (c) reference map.
  • Figure 5. Comparison results of different algorithms on the Hermiston City Area dataset: (a) reference map; (b) CVA; (c) DNN; (d) CNN; (e) GETNET; (f) our method.
  • Figure 6. Comparison results of different algorithms on the Bay Area dataset, with the same layout as Figure 5.
  • Figure 7. Comparison results of different algorithms on the Santa Barbara dataset, with the same layout as Figure 5.
  • Figure 8. Visualization of detection results (a) without and (b) with the RNN.
  • Figure 9. Sample distribution on (a) the Bay Area dataset and (b) the Santa Barbara dataset.
  • Figure 10. Influence of the parameter λ on (a) the Bay Area dataset and (b) the Santa Barbara dataset.
  • Figure 11. Network training accuracy and loss curves for (a) the Bay Area and (b) the Santa Barbara datasets.
  • Figure 12. Metrics comparison of different algorithms on the three datasets: (a) Hermiston City Area; (b) Bay Area; (c) Santa Barbara.
17 pages, 4037 KiB  
Article
Power Line Extraction Framework Based on Edge Structure and Scene Constraints
by Kuansheng Zou and Zhenbang Jiang
Remote Sens. 2022, 14(18), 4575; https://doi.org/10.3390/rs14184575 - 13 Sep 2022
Cited by 6 | Viewed by 2016
Abstract
Power system maintenance is an important guarantee for the stable operation of the power system. Autonomous power line inspection based on Unmanned Aerial Vehicles (UAVs) provides convenience for maintaining power systems, and Power Line Extraction (PLE) is one of the key issues that must be solved first for autonomous inspection. However, most existing PLE methods extract spurious small edge lines from scene images that contain no power lines, which prevents them from being applied well in practice. To solve this problem, a PLE method based on edge structure and scene constraints is proposed in this paper. Power Line Scene Recognition (PLSR) is used as an auxiliary task for the PLE, and scene constraints are set first. Based on the characteristics of power line images, the shallow feature map of the fourth layer of the encoding stage is transmitted to the middle three layers of the decoding stage, providing structured, detailed edge features for upsampling and helping to restore power line edges more finely. Experimental results show that the proposed method has good performance, robustness, and generalization in multiple scenes with complex backgrounds. Full article
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
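One way to read the scene constraint is that a power-line scene-recognition (PLSR) score gates the extraction output, so non-power-line scenes return an empty mask and no spurious edge lines survive. The sketch below illustrates that gating with toy stand-in networks; it is not the paper's architecture, and the gating threshold is hypothetical.

```python
import torch
import torch.nn as nn

class SceneGatedPLE(nn.Module):
    """Toy wrapper: a scene classifier gates the power-line segmentation output."""
    def __init__(self, segmenter: nn.Module, scene_head: nn.Module, thresh=0.5):
        super().__init__()
        self.segmenter = segmenter        # any encoder-decoder returning a 1-channel mask logit
        self.scene_head = scene_head      # returns a single "contains power line" logit
        self.thresh = thresh

    def forward(self, img):
        mask_logit = self.segmenter(img)
        p_scene = torch.sigmoid(self.scene_head(img))             # (B, 1)
        gate = (p_scene > self.thresh).float().view(-1, 1, 1, 1)  # scene constraint
        return torch.sigmoid(mask_logit) * gate                   # empty mask if no power line

# Tiny stand-ins so the sketch runs end to end.
seg = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 1, 1))
cls = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(3, 1))
model = SceneGatedPLE(seg, cls)
print(model(torch.randn(2, 3, 128, 128)).shape)  # torch.Size([2, 1, 128, 128])
```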
Figures

  • Figure 1. Results of traditional deep-learning-based PLE in practice. (a) Aerial images; (b) ground truth; (c) extraction results.
  • Figure 2. Architecture of the proposed PLE model. Each colored box corresponds to a multi-channel feature map; the number of channels is denoted on top of the box and the x-y size at the bottom. Boxes without numbers have the same channels and size as the same-colored box. The colored arrows and lines denote the different operations.
  • Figure 3. PLE results on datasets containing power lines. The more similar the extracted images (c–j) are to the ground truth (b), the better the performance of the method.
  • Figure 4. PLE results on datasets without power lines. The other methods falsely detect edge lines to varying degrees, whereas the non-power-line scenarios are correctly recognized by the proposed method.
  • Figure 5. Comparison experiment results of power line extraction.
  • Figure 6. Test results in a foggy environment. The first four columns show power line scenes and the last shows a non-power-line scene. (a) Power line images in a foggy environment; (b) ground truth; (c) predicted results of the proposed method.
  • Figure 7. Test results in a strong-light environment, with the same layout as Figure 6.
  • Figure 8. Test results in a snowfall environment, with the same layout as Figure 6.
  • Figure 9. Test results in a motion-blur environment, with the same layout as Figure 6.
  • Figure 10. Generalization test results using the proposed PLE method. The first four columns contain power line scenes and the last two do not. (a) Images used for the generalization test; (b) ground truth; (c) results of the proposed PLE model.
21 pages, 2503 KiB  
Article
Semi-Supervised DEGAN for Optical High-Resolution Remote Sensing Image Scene Classification
by Jia Li, Yujia Liao, Junjie Zhang, Dan Zeng and Xiaoliang Qian
Remote Sens. 2022, 14(17), 4418; https://doi.org/10.3390/rs14174418 - 5 Sep 2022
Cited by 9 | Viewed by 2422
Abstract
Semi-supervised methods have made remarkable achievements via utilizing unlabeled samples for optical high-resolution remote sensing scene classification. However, the labeled data cannot be effectively combined with unlabeled data in the existing semi-supervised methods during model training. To address this issue, we present a semi-supervised optical high-resolution remote sensing scene classification method based on Diversity Enhanced Generative Adversarial Network (DEGAN), in which the supervised and unsupervised stages are deeply combined in the DEGAN training. Based on the unsupervised characteristic of the Generative Adversarial Network (GAN), a large number of unlabeled and labeled images are jointly employed to guide the generator to obtain a complete and accurate probability density space of fake images. The Diversity Enhanced Network (DEN) is designed to increase the diversity of generated images based on massive unlabeled data. Therefore, the discriminator is promoted to provide discriminative features by enhancing the generator given the game relationship between two models in DEGAN. Moreover, the conditional entropy is adopted to make full use of the information of unlabeled data during the discriminator training. Finally, the features extracted from the discriminator and VGGNet-16 are employed for scene classification. Experimental results on three large datasets demonstrate that the proposed scene classification method yields a superior classification performance compared with other semi-supervised methods. Full article
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
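The way labelled, unlabelled, and generated images can be combined in one discriminator objective, including a conditional-entropy term on unlabelled data, is sketched below. This follows the common K+1-class semi-supervised GAN formulation rather than DEGAN's exact loss, and the entropy weight is hypothetical.

```python
import torch
import torch.nn.functional as F

def discriminator_loss(logits_lab, labels, logits_unlab, logits_fake, ent_weight=0.1):
    """Toy semi-supervised GAN discriminator loss (K real classes + 1 "fake" class).

    logits_*: (B, K+1) class scores; the last index is the fake class.
    """
    fake_idx = logits_lab.shape[1] - 1
    # Supervised cross-entropy on labelled real images.
    loss_sup = F.cross_entropy(logits_lab, labels)
    # Unsupervised adversarial terms: unlabelled images should not be "fake",
    # generated images should be classified as "fake".
    p_fake_unlab = F.softmax(logits_unlab, dim=1)[:, fake_idx]
    p_fake_gen = F.softmax(logits_fake, dim=1)[:, fake_idx]
    loss_unsup = -(torch.log(1 - p_fake_unlab + 1e-8).mean()
                   + torch.log(p_fake_gen + 1e-8).mean())
    # Conditional entropy: encourage confident class predictions on unlabelled data.
    p_real_classes = F.softmax(logits_unlab[:, :fake_idx], dim=1)
    loss_ent = -(p_real_classes * torch.log(p_real_classes + 1e-8)).sum(dim=1).mean()
    return loss_sup + loss_unsup + ent_weight * loss_ent

# Toy usage with 10 scene classes.
B, K = 16, 10
loss = discriminator_loss(torch.randn(B, K + 1), torch.randint(0, K, (B,)),
                          torch.randn(B, K + 1), torch.randn(B, K + 1))
print(float(loss))
```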
Figures

  • Graphical abstract.
  • Figure 1. The framework of DEGAN.
  • Figure 2. Flowchart of the proposed scene classification based on DEGAN.
  • Figure 3. Architecture of the FGN.
  • Figure 4. Architecture of the DEN.
  • Figure 5. Architecture of the discriminator.
  • Figure 6. Example images from the three optical high-resolution remote sensing scene classification datasets (UC Merced, AID, and NWPU-RESISC45, displayed from top to bottom): (a) baseball court; (b) beach; (c) storage tank; (d) forest; (e) harbor; (f) river; (g) parking; (h) sparse residential; (i) medium residential; (j) dense residential.
  • Figure 7. Confusion matrix of the proposed method on the UC Merced dataset under a training ratio of 10%.
  • Figure 8. Confusion matrix of the proposed method on the AID dataset under a training ratio of 10%.
  • Figure 9. Confusion matrix of the proposed method on the NWPU-RESISC45 dataset under a training ratio of 10%.
23 pages, 6282 KiB  
Article
Hyperspectral Band Selection via Band Grouping and Adaptive Multi-Graph Constraint
by Mengbo You, Xiancheng Meng, Yishu Wang, Hongyuan Jin, Chunting Zhai and Aihong Yuan
Remote Sens. 2022, 14(17), 4379; https://doi.org/10.3390/rs14174379 - 3 Sep 2022
Cited by 4 | Viewed by 2554
Abstract
Unsupervised band selection has gained increasing attention recently since massive unlabeled high-dimensional data often need to be processed in the domains of machine learning and data mining. This paper presents a novel unsupervised HSI band selection method via band grouping and adaptive multi-graph constraint. A band grouping strategy that assigns each group different weights to construct a global similarity matrix is applied to address the problem of overlooking strong correlations among adjacent bands. Different from previous studies that are limited to fixed graph constraints, we adjust the weight of the local similarity matrix dynamically to construct a global similarity matrix. By partitioning the HSI cube into several groups, the model is built with a combination of significance ranking and band selection. After establishing the model, we addressed the optimization problem by an iterative algorithm, which updates the global similarity matrix, its corresponding reconstruction weights matrix, the projection, and the pseudo-label matrix to ameliorate each of them synergistically. Extensive experimental results indicate our method outperforms the other five state-of-the-art band selection methods in the publicly available datasets. Full article
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
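The band-grouping idea of building a global similarity matrix as a weighted sum of per-group local similarity matrices can be sketched as follows. This is a simplified NumPy illustration, not the BAMGC optimization: the group weights are fixed here, whereas the paper adjusts them adaptively during the iterative updates.

```python
import numpy as np

def grouped_global_similarity(X, n_groups=4, sigma=1.0, weights=None):
    """Toy construction of a global similarity matrix from band groups.

    X: (n_pixels, n_bands) HSI pixels. Bands are split into contiguous groups;
    an RBF similarity over pixels is computed per group, and the weighted sum
    of these local similarities forms the global similarity matrix.
    """
    groups = np.array_split(np.arange(X.shape[1]), n_groups)
    if weights is None:
        weights = np.full(n_groups, 1.0 / n_groups)   # BAMGC learns these adaptively
    S = np.zeros((X.shape[0], X.shape[0]))
    for w, idx in zip(weights, groups):
        Xg = X[:, idx]
        d2 = ((Xg[:, None, :] - Xg[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
        S += w * np.exp(-d2 / (2 * sigma ** 2))                 # local similarity of this group
    return S

rng = np.random.default_rng(0)
pixels = rng.normal(size=(50, 200))             # 50 pixels, 200 spectral bands
print(grouped_global_similarity(pixels).shape)  # (50, 50)
```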
Figures

  • Figure 1. Workflow of the band-grouping idea: the global similarity matrix is reconstructed from the local similarity matrices.
  • Figure 2. Comparison of OA and κ produced by SVM and KNN on the Pavia University dataset.
  • Figure 3. Visualization of classification on the Pavia University dataset: (a) ground truth; (b) TRC-OC-FDPC; (c) UBS; (d) PCAS; (e) ONR; (f) NC-OC-MVPCA; (g) NC-OC-IE; (h) BAMGC.
  • Figure 4. Comparison of OA and κ produced by SVM and KNN on the Indian Pines dataset.
  • Figure 5. Visualization of classification on the Indian Pines dataset: (a) ground truth; (b) PCAS; (c) PCA; (d) ONR; (e) NC-OC-MVPA; (f) NC-OC-IE; (g) LvaHAI; (h) BAMGC.
  • Figure 6. Comparison of OA and κ produced by SVM and KNN on the Salinas dataset.
  • Figure 7. Visualization of classification on the Salinas dataset: (a) ground truth; (b) PCA; (c) NC-OC-IE; (d) NC-OC-MVPCA; (e) PCAS; (f) ONR; (g) TRC-OC-FDPC; (h) BAMGC.
  • Figure 8. Comparison of OA and κ produced by SVM and KNN on the Botswana dataset.
  • Figure 9. Visualization of classification on the Botswana dataset: (a) ground truth; (b) UBS; (c) NC-OC-MVPCA; (d) SOR-SRL; (e) ONR; (f) LvaHAI; (g) SORSRL; (h) BAMGC.
  • Figure 10. Comparison of OA and κ produced by SVM and KNN on the University of Houston dataset.
  • Figure 11. Visualization of classification on the University of Houston dataset: (a) ground truth; (b) UBS; (c) ONR; (d) TRC-OC-FDPC; (e) PCA; (f) NC-OC-IE; (g) NC-OC-MVPCA; (h) TRC-OC-FDPC.
  • Figures 12–15. Sensitivity of the hyperparameters α, β, the number of groups, and σ on the five datasets: (a) Pavia University; (b) Indian Pines; (c) Salinas; (d) Botswana; (e) Houston University.
16 pages, 5951 KiB  
Article
A Tracking Imaging Control Method for Dual-FSM 3D GISC LiDAR
by Yu Cao, Xiuqin Su, Xueming Qian, Haitao Wang, Wei Hao, Meilin Xie, Xubin Feng, Junfeng Han, Mingliang Chen and Chenglong Wang
Remote Sens. 2022, 14(13), 3167; https://doi.org/10.3390/rs14133167 - 1 Jul 2022
Cited by 5 | Viewed by 2031
Abstract
In this paper, a tracking and pointing control system with dual-FSM (fast steering mirror) composite axis is proposed. It is applied to the target-tracking accuracy control in a 3D GISC LiDAR (three-dimensional ghost imaging LiDAR via sparsity constraint) system. The tracking and pointing imaging control system of the dual-FSM 3D GISC LiDAR proposed in this paper is a staring imaging method with multiple measurements, which mainly solves the problem of high-resolution remote-sensing imaging of high-speed moving targets when the technology is transformed into practical applications. In the research of this control system, firstly, we propose a method that combines motion decoupling and sensor decoupling to solve the mechanical coupling problem caused by the noncoaxial sensor installation of the FSM. Secondly, we suppress the inherent mechanical resonance of the FSM in the control system. Thirdly, we propose the optical path design of a dual-FSM 3D GISC LiDAR tracking imaging system to solve the problem of receiving aperture constraint. Finally, after sufficient experimental verification, our method is shown to successfully reduce the coupling from 7% to 0.6%, and the precision tracking bandwidth reaches 300 Hz. Moreover, when the distance between the GISC system and the target is 2.74 km and the target flight speed is 7 m/s, the tracking accuracy of the system is improved from 15.7 μrad (σ) to 2.2 μrad (σ), and at the same time, the system recognizes the target contour clearly. Our research is valuable to put the GISC technology into practical applications. Full article
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
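Suppressing a known mechanical resonance with a second-order digital filter, as described for the FSM loop, can be illustrated with SciPy's notch filter. The sample rate, resonance frequency, and quality factor below are made-up values for the sketch, not the parameters of the actual FSM controller.

```python
import numpy as np
from scipy.signal import iirnotch, lfilter

# Toy illustration: remove a mechanical resonance from the FSM drive signal with a
# second-order digital notch filter. All numeric values here are hypothetical.
fs = 10_000.0        # control-loop sample rate (Hz)
f_res = 1_200.0      # assumed FSM mechanical resonance frequency (Hz)
b, a = iirnotch(w0=f_res, Q=10.0, fs=fs)

t = np.arange(0, 0.1, 1 / fs)
command = np.sin(2 * np.pi * 200 * t)                 # useful 200 Hz tracking command
resonance = 0.3 * np.sin(2 * np.pi * f_res * t)       # unwanted resonance component
filtered = lfilter(b, a, command + resonance)

# The notch attenuates the 1.2 kHz component while passing the 200 Hz command.
print(np.round(np.std(command + resonance), 3), np.round(np.std(filtered), 3))
```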
Figures

  • Figure 1. Structure of a fast steering mirror.
  • Figure 2. Sensor coordinates.
  • Figure 3. FSM control principle diagram.
  • Figure 4. Spectrum characteristics of the second-order digital filter.
  • Figure 5. Comparison of position curves before and after resonance suppression: (a) before suppression; (b) after suppression, with the X-axis showing a 200 Hz sinusoidal position curve.
  • Figure 6. Schematic diagram of the dual-FSM 3D GISC LiDAR tracking control experimental platform (PMT: photomultiplier tube, BS: beam splitter, RGG: rotating ground glass). (a) Transmitting system's optical path structure; (b) receiving system's optical path structure.
  • Figure 7. Photos of the experimental platform: (a) the transmitting FSM; (b) the dual-FSM 3D GISC LiDAR tracking imaging system.
  • Figure 8. Comparison before and after decoupling: (a) before sensor decoupling; (b) after sensor decoupling.
  • Figure 9. Response bandwidth test after decoupling, given a 300 Hz sinusoid on the X-axis.
  • Figure 10. Phase-difference test of 180 degrees for the XY-axis double sinusoid.
  • Figure 11. X-axis anti-disturbance capability test.
  • Figure 12. Comparative test of the tracking accuracy of the azimuth and pitch axes for a UAV in flight: (a) FSM controlled by the existing algorithm; (b) FSM controlled with digital decoupling and resonance suppression.
  • Figure 13. Imaging results of the dual-FSM 3D GISC LiDAR system with the existing control algorithm at different UAV speeds: (a) hovering; (b) V = 2 m/s; (c) V = 7 m/s.
  • Figure 14. Imaging results of the dual-FSM 3D GISC LiDAR system under digital decoupling and resonance suppression at different UAV speeds: (a) hovering; (b) V = 2 m/s; (c) V = 7 m/s.
24 pages, 7222 KiB  
Article
PolSAR Scene Classification via Low-Rank Constrained Multimodal Tensor Representation
by Bo Ren, Mengqian Chen, Biao Hou, Danfeng Hong, Shibin Ma, Jocelyn Chanussot and Licheng Jiao
Remote Sens. 2022, 14(13), 3117; https://doi.org/10.3390/rs14133117 - 28 Jun 2022
Cited by 1 | Viewed by 2230
Abstract
Polarimetric synthetic aperture radar (PolSAR) data can be acquired at all times and are not impacted by weather conditions. They can efficiently capture geometrical and geographical structures on the ground. However, due to the complexity of the data and the difficulty of data availability, PolSAR image scene classification remains a challenging task. To this end, in this paper, a low-rank constrained multimodal tensor representation method (LR-MTR) is proposed to integrate PolSAR data in multimodal representations. To preserve the multimodal polarimetric information simultaneously, the target decompositions in a scene from multiple spaces (e.g., Freeman, H/A/α, Pauli, etc.) are exploited to provide multiple pseudo-color images. Furthermore, a representation tensor is constructed via the representation matrices and constrained by the low-rank norm to keep the cross-information from multiple spaces. A projection matrix is also calculated by minimizing the differences between the whole cascaded data set and the features in the corresponding space. It also reduces the redundancy of those multiple spaces and solves the out-of-sample problem in the large-scale data set. To support the experiments, two new PolSAR image data sets are built via ALOS-2 full polarization data, covering the areas of Shanghai, China, and Tokyo, Japan. Compared with state-of-the-art (SOTA) dimension reduction algorithms, the proposed method achieves the best quantitative performance and demonstrates superiority in fusing multimodal PolSAR features for image scene classification. Full article
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
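The low-rank constraint on the stacked representation tensor can be illustrated with a singular-value-thresholding step, the standard proximal operator for a nuclear-norm term. The sketch below applies it slice-by-slice to a toy tensor with one representation matrix per polarimetric decomposition; it is an illustration of the constraint only, not the LR-MTR solver, and the threshold value is arbitrary.

```python
import numpy as np

def svt(M, tau):
    """Singular-value thresholding: the proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def low_rank_tensor_step(R, tau=2.0):
    """Apply SVT to each slice of a representation tensor R of shape (n, n, M),
    i.e., one representation matrix per modality (Pauli, Freeman, H/A/alpha, ...)."""
    return np.stack([svt(R[:, :, m], tau) for m in range(R.shape[2])], axis=2)

rng = np.random.default_rng(0)
R = rng.normal(size=(40, 40, 3))          # 3 modalities, 40 samples
R_lr = low_rank_tensor_step(R)
# Thresholding can only shrink or zero singular values, so the rank never grows.
print(np.linalg.matrix_rank(R_lr[:, :, 0]), "<=", np.linalg.matrix_rank(R[:, :, 0]))
```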
Figures

  • Graphical abstract.
  • Figure 1. Overview of LR-MTR.
  • Figure 2. LR-MTR algorithm flow chart.
  • Figure 3. Geographical location of the Tokyo data set and its scene categories.
  • Figure 4. Visualization examples from the Tokyo data set: (a) building; (b) water; (c) woodland; (d) coast; (e) farmland.
  • Figure 5. Geographical location of the Shanghai data set and its scene categories.
  • Figure 6. Visualization examples from the Shanghai data set: (a) urban areas; (b) suburban areas; (c) farmland; (d) water; (e) coast.
  • Figure 7. Confusion matrices for (a) the Tokyo data set and (b) the Shanghai data set.
  • Figure 8. OA results of KNN as the dimension changes: (a) Tokyo data results; (b) Shanghai data results.
  • Figure 9. Parameters adjusted according to the accuracy on the Tokyo data set (left) and the Shanghai data set (right). As γ₂ = ⋯ = γ_M = γ in the experiments, γ is the only parameter that needs to be tuned.
  • Figure 10. Convergence of the proposed algorithm: (a) Tokyo data results; (b) Shanghai data results.
23 pages, 3842 KiB  
Article
Fine-Grained Ship Classification by Combining CNN and Swin Transformer
by Liang Huang, Fengxiang Wang, Yalun Zhang and Qingxia Xu
Remote Sens. 2022, 14(13), 3087; https://doi.org/10.3390/rs14133087 - 27 Jun 2022
Cited by 22 | Viewed by 5147
Abstract
The mainstream algorithms used for ship classification and detection can be improved based on convolutional neural networks (CNNs). By analyzing the characteristics of ship images, we found that the difficulty in ship image classification lies in distinguishing ships with similar hull structures but different equipment and superstructures. To extract features such as ship superstructures, this paper introduces transformer architecture with self-attention into ship classification and detection, and a CNN and Swin transformer model (CNN-Swin model) is proposed for ship image classification and detection. The main contributions of this study are as follows: (1) The proposed approach pays attention to different scale features in ship image classification and detection, introduces a transformer architecture with self-attention into ship classification and detection for the first time, and uses a parallel network of a CNN and a transformer to extract features of images. (2) To exploit the CNN’s performance and avoid overfitting as much as possible, a multi-branch CNN-Block is designed and used to construct a CNN backbone with simplicity and accessibility to extract features. (3) The performance of the CNN-Swin model is validated on the open FGSC-23 dataset and a dataset containing typical military ship categories based on open-source images. The results show that the model achieved accuracies of 90.9% and 91.9% for the FGSC-23 dataset and the military ship dataset, respectively, outperforming the existing nine state-of-the-art approaches. (4) The good extraction effect on the ship features of the CNN-Swin model is validated as the backbone of the three state-of-the-art detection methods on the open datasets HRSC2016 and FAIR1M. The results show the great potential of the CNN-Swin backbone with self-attention in ship detection. Full article
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
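A parallel CNN-plus-transformer classifier of the kind described here can be sketched compactly in PyTorch. The code below uses a plain convolutional stem and a generic nn.TransformerEncoder as stand-ins for the paper's multi-branch CNN-Block and Swin backbone, so the class count, patch size, and widths are assumptions, not the CNN-Swin configuration.

```python
import torch
import torch.nn as nn

class TwoBranchClassifier(nn.Module):
    """Toy parallel CNN + transformer classifier for ship images (stand-in for CNN-Swin)."""
    def __init__(self, n_classes=23, dim=128):
        super().__init__()
        self.cnn = nn.Sequential(                       # local texture / hull features
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim))
        self.patchify = nn.Conv2d(3, dim, kernel_size=16, stride=16)   # 16x16 patch embedding
        enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(enc_layer, num_layers=2)  # global context (superstructure layout)
        self.head = nn.Linear(2 * dim, n_classes)

    def forward(self, x):
        f_cnn = self.cnn(x)
        tokens = self.patchify(x).flatten(2).transpose(1, 2)   # (B, n_patches, dim)
        f_tr = self.transformer(tokens).mean(dim=1)            # pooled global feature
        return self.head(torch.cat([f_cnn, f_tr], dim=1))      # fuse the two branches

model = TwoBranchClassifier()
print(model(torch.randn(2, 3, 224, 224)).shape)   # torch.Size([2, 23])
```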
Figures

  • Graphical abstract.
  • Figure 1. Network architecture of the proposed model.
  • Figure 2. Network architecture of Layer1.
  • Figure 3. Samples from the military ship dataset: (a) Type-054 frigate; (b) Sovremenny-class destroyer; (c) Type-055 destroyer; (d) surveillance boat; (e) Arleigh Burke-class destroyer; (f) landing vessel; (g) missile boat.
  • Figure 4. Confusion matrix of the ship dataset.
  • Figure 5. t-SNE clustering analysis on the military ship dataset: (a) samples of the untrained dataset; (b) samples of the dataset trained by the CNN-Swin model.
  • Figure 6. Confusion matrices of state-of-the-art approaches on the military ship dataset: (a) Densenet-121; (b) Efficientnet; (c) Resnet-18; (d) Regnet; (e) ViT-base; (f) CaiT; (g) Swin-base; (h) CNN-Swin.
  • Figure 7. Detection results on the HRSC2016 dataset with our backbone in different state-of-the-art methods.
  • Figure 8. Detection results on part of the FAIR1M dataset with our backbone in different state-of-the-art methods.
  • Figure 9. Class activation maps for the Arleigh Burke-class destroyer: (a–c) CNN branch; (d–f) transformer branch.
16 pages, 2625 KiB  
Article
Character Segmentation and Recognition of Variable-Length License Plates Using ROI Detection and Broad Learning System
by Bingshu Wang, Hongli Xiao, Jiangbin Zheng, Dengxiu Yu and C. L. Philip Chen
Remote Sens. 2022, 14(7), 1560; https://doi.org/10.3390/rs14071560 - 24 Mar 2022
Cited by 5 | Viewed by 3300
Abstract
Variable-length license plate segmentation and recognition has always been a challenging barrier in the application of intelligent transportation systems. Previous approaches mainly concern fixed-length license plates, lacking adaptability for variable-length license plates. Although object detection methods can be used to address the issue, they face a series of difficulties: the cross-class problem, missing detections, and recognition errors between letters and digits. To solve these problems, we propose a machine learning method that regards each character as a region of interest. It covers three parts. Firstly, we explore a transfer learning algorithm based on Faster-RCNN with the InceptionV2 structure to generate candidate character regions. Secondly, a strategy of cross-class removal of characters is proposed to reject overlapped results, and a mechanism of template matching and position prediction is designed to eliminate missing detections. Moreover, a twofold broad learning system is designed to identify letters and digits separately. Experiments performed on Macau license plates demonstrate that our method achieves an average segmentation accuracy of 99.68% and an average recognition rate of 99.19%, outperforming some conventional and deep learning approaches. This adaptability is expected to allow the developed algorithm to be transferred to other countries or regions. Full article
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
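The cross-class removal strategy amounts to class-agnostic suppression: when character boxes of different classes overlap heavily, only the highest-scoring one is kept. A minimal sketch, with a hypothetical IoU threshold:

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def cross_class_removal(dets, iou_thresh=0.6):
    """Keep the highest-scoring box when boxes of *different* classes overlap.

    dets: list of (box, score, label), e.g. candidate character regions from the detector.
    """
    dets = sorted(dets, key=lambda d: d[1], reverse=True)
    kept = []
    for box, score, label in dets:
        if all(iou(box, k[0]) < iou_thresh for k in kept):
            kept.append((box, score, label))
    return kept

dets = [((10, 10, 40, 60), 0.95, "M"),
        ((12, 11, 41, 59), 0.40, "N"),   # same character detected as a different class
        ((50, 10, 80, 60), 0.90, "8")]
print([d[2] for d in cross_class_removal(dets)])   # ['M', '8']
```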
Figures

  • Figure 1. Examples of variable-length license plates covering different numbers of characters, colors, and fonts.
  • Figure 2. Problems of character segmentation and recognition produced by the ROI detection approach: (a) cross-class problem; (b) false positives or missing detections; (c) recognition confusion between letters and digits.
  • Figure 3. Flowchart of the proposed method. It includes three stages: stage A generates ROIs, stage B removes cross-class results and predicts the positions of missing characters, and stage C handles recognition confusions.
  • Figure 4. Flowchart of handling missing detections with the proposed TMPP: (a) detection result generated by the ROI-based method; (b) rectangles of the detected regions; (c) the template selected from the license plate template library that matches (b); (d) the result predicted by template matching; (e) the final detection result.
  • Figure 5. Examples showing the function of TMPP: (a) input images; (b) false positives or missing detections; (c) results processed by TMPP.
  • Figure 6. Structure of the BLS, which encompasses mapped feature nodes and enhanced feature nodes.
  • Figure 7. Segmentation results of single-row license plates: (a) input images; (b) projection-based [13]; (c) MSER-based [17]; (d) CCA-based [19]; (e) ROI-based [33]; (f) ours.
  • Figure 8. Segmentation results of double-row license plates, with the same comparison layout as Figure 7.
  • Figure 9. Comparison of recognition results between the ROI-based approach [33] and ours on three sets; the odd rows are generated by [33] and the even rows by the proposed method: (a) indoor; (b) outdoor; (c) complex.
  • Figure 10. Segmentation failures, including missing detections and incorrect segmentation.
  • Figure 11. Recognition of license plates using image sequences. For each example, (a) MN8850 and (b) ML5752, 9 of the 10 frames are successful and only one failure occurs (2nd row, 3rd column) because of a strong reflection of illumination.
23 pages, 30433 KiB  
Article
Two-Stream Swin Transformer with Differentiable Sobel Operator for Remote Sensing Image Classification
by Siyuan Hao, Bin Wu, Kun Zhao, Yuanxin Ye and Wei Wang
Remote Sens. 2022, 14(6), 1507; https://doi.org/10.3390/rs14061507 - 20 Mar 2022
Cited by 33 | Viewed by 6036
Abstract
Remote sensing (RS) image classification has attracted much attention recently and is widely used in various fields. Different to natural images, the RS image scenes consist of complex backgrounds and various stochastically arranged objects, thus making it difficult for networks to focus on the target objects in the scene. However, conventional classification methods do not have any special treatment for remote sensing images. In this paper, we propose a two-stream swin transformer network (TSTNet) to address these issues. TSTNet consists of two streams (i.e., original stream and edge stream) which use both the deep features of the original images and the ones from the edges to make predictions. The swin transformer is used as the backbone of each stream given its good performance. In addition, a differentiable edge Sobel operator module (DESOM) is included in the edge stream which can learn the parameters of Sobel operator adaptively and provide more robust edge information that can suppress background noise. Experimental results on three publicly available remote sensing datasets show that our TSTNet achieves superior performance over the state-of-the-art (SOTA) methods. Full article
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
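One plausible reading of a differentiable Sobel operator is to keep the fixed edge directions but make the kernel weights trainable so they adapt to the data. The sketch below learns only a single scale parameter and is therefore a simplification of the idea, not the paper's DESOM parameterization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableSobel(nn.Module):
    """Toy differentiable Sobel-style edge extractor with a learnable centre weight `a`."""
    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.tensor(2.0))   # the classic Sobel kernel uses 2 here

    def forward(self, x):                          # x: (B, 1, H, W) grayscale images
        z = torch.zeros_like(self.a)
        o = torch.ones_like(self.a)
        # Horizontal-gradient kernel [[-1, 0, 1], [-a, 0, a], [-1, 0, 1]]; the vertical
        # kernel is its transpose. Gradients flow back into `a` during training.
        kx = torch.stack([-o, z, o, -self.a, z, self.a, -o, z, o]).view(1, 1, 3, 3)
        ky = kx.transpose(2, 3)
        gx = F.conv2d(x, kx, padding=1)
        gy = F.conv2d(x, ky, padding=1)
        return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)  # edge-magnitude map

edge = LearnableSobel()
print(edge(torch.rand(2, 1, 64, 64)).shape)   # torch.Size([2, 1, 64, 64])
```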
Figures

  • Figure 1. Edge information of six scenes; the red curves represent the edges of the scene objects.
  • Figure 2. Framework of the two-stream swin transformer. It has two inputs, the original image I and the synthetic edge image T(I) generated by the differentiable edge Sobel operator module (DESOM). The fusion module fuses the original and edge features.
  • Figure 3. Two successive transformer blocks of the swin transformer; the regular and shifted windows correspond to W-MSA and SW-MSA, respectively.
  • Figure 4. Shifted-window mechanism of W-MSA and SW-MSA: (a) the standard window; (b) the repartitioned window; (c) the window after the shift transformation; (d) the window after the (e) reverse shift transformation.
  • Figure 5. Structure of the differentiable edge Sobel operator module (DESOM).
  • Figure 6. Design of the fusion module and loss function. An auxiliary cross-entropy loss on F₁ is added to the cross-entropy loss of the fused feature F′; the balance between the two losses is controlled by λ.
  • Figure 7. Example images of the AID dataset: (1) airport; (2) bare land; (3) baseball field; (4) beach; (5) bridge; (6) centre; (7) church; (8) commercial; (9) dense residential; (10) desert; (11) farmland; (12) forest; (13) industrial; (14) meadow; (15) medium residential; (16) mountain; (17) park; (18) parking; (19) playground; (20) pond; (21) port; (22) railway station; (23) resort; (24) river; (25) school; (26) sparse residential; (27) square; (28) stadium; (29) storage tanks; (30) viaduct.
  • Figure 8. Example images of the NWPU dataset: (1) airplane; (2) airport; (3) baseball diamond; (4) basketball court; (5) beach; (6) bridge; (7) chaparral; (8) church; (9) circular farmland; (10) cloud; (11) commercial area; (12) dense residential; (13) desert; (14) forest; (15) freeway; (16) golf course; (17) ground track field; (18) harbor; (19) industrial area; (20) intersection; (21) island; (22) lake; (23) meadow; (24) medium residential; (25) mobile home park; (26) mountain; (27) overpass; (28) palace; (29) parking lot; (30) railway; (31) railway station; (32) rectangular farmland; (33) river; (34) roundabout; (35) runway; (36) sea ice; (37) ship; (38) snow berg; (39) sparse residential; (40) stadium; (41) storage tank; (42) tennis court; (43) terrace; (44) thermal power station; (45) wetland.
  • Figure 9. Example images of the UCM dataset: (1) agriculture; (2) airplane; (3) baseball diamond; (4) beach; (5) buildings; (6) chaparral; (7) dense residential; (8) forest; (9) freeway; (10) golf course; (11) harbor; (12) intersection; (13) medium residential; (14) mobile home park; (15) overpass; (16) parking lot; (17) river; (18) runway; (19) sparse residential; (20) storage tanks; (21) tennis court.
  • Figure 10. Ablation study: accuracy comparison between different methods under training ratios of (a) 20% and 50% on the AID dataset and (b) 10% and 20% on the NWPU dataset. TSTNet consistently performs best across datasets and training ratios.
  • Figure 11. Optimal values learned by the differentiable Sobel operator under a 10% training ratio on the NWPU dataset and a 20% training ratio on the AID dataset.
  • Figure 12. (a) Original remote sensing images; (b) synthesized edge images; (c) attention map of the last layer of the original stream (Swin-B); (d) attention map of the last layer of TSTNet. "Original" receives only the original input images, while TST receives both the original and the synthetic edge inputs.
  • Figure 13. Confusion matrix for the AID dataset under a 20% training ratio, with true labels on the vertical axis and predicted labels on the horizontal axis.
  • Figure 14. Confusion matrix for the NWPU dataset under a 10% training ratio, with the same layout as Figure 13.
  • Figure 15. Confusion matrix for the UCM dataset under a 50% training ratio, with the same layout as Figure 13.
22 pages, 8223 KiB  
Article
Meta-Pixel-Driven Embeddable Discriminative Target and Background Dictionary Pair Learning for Hyperspectral Target Detection
by Tan Guo, Fulin Luo, Leyuan Fang and Bob Zhang
Remote Sens. 2022, 14(3), 481; https://doi.org/10.3390/rs14030481 - 20 Jan 2022
Cited by 12 | Viewed by 2615
Abstract
In hyperspectral target detection, the spectral high-dimensionality, variability, and heterogeneity will pose great challenges to the accurate characterizations of the target and background. To alleviate the problems, we propose a Meta-pixel-driven Embeddable Discriminative target and background Dictionary Pair (MEDDP) learning model by combining low-dimensional embeddable subspace projection and the discriminative target and background dictionary pair learning. In MEDDP, the meta-pixel set is built by taking the merits of homogeneous superpixel segmentation and the local manifold affinity structures, which can significantly reduce the influence of spectral variability and find the most typical and informative prototype spectral signature. Afterward, an embeddable discriminative dictionary pair learning model is established to learn a target and background dictionary pair based on the structural incoherent constraint with embeddable subspace projection. The proposed joint learning strategy can reduce the high-dimensional redundant information and simultaneously enhance the discrimination and compactness of the target and background dictionaries. The proposed MEDDP model is solved by an iterative and alternate optimization algorithm and applied with the meta-pixel-level target detection method. Experimental results on four benchmark HSI datasets indicate that the proposed method can consistently yield promising performance in comparison with some state-of-the-art target detectors. Full article
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
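The abstract above builds each meta-pixel as an affinity-weighted prototype of the spectra in a superpixel rather than a plain mean. A minimal NumPy sketch of that idea follows, assuming a Gaussian affinity on pairwise spectral distances and row-sum weights; the function `meta_pixel` and its parameter `sigma` are hypothetical, and the exact MEDDP construction may differ.

```python
import numpy as np

def meta_pixel(superpixel: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Compute one meta-pixel from the spectra of a superpixel.

    superpixel: (n_pixels, n_bands) spectra of all pixels in one superpixel.
    Instead of the plain mean (the "center pixel"), pixels are weighted by
    their local manifold affinity so that the most typical spectral
    signature dominates. The Gaussian-affinity weighting here is an
    illustrative assumption, not the exact construction used in MEDDP.
    """
    # pairwise squared spectral distances within the superpixel
    d2 = ((superpixel[:, None, :] - superpixel[None, :, :]) ** 2).sum(-1)
    affinity = np.exp(-d2 / (2.0 * sigma ** 2))   # local affinity structure
    weights = affinity.sum(axis=1)                # typical pixels get more weight
    weights /= weights.sum()
    return weights @ superpixel                   # weighted prototype spectrum

# usage: 50 pixels with 200 spectral bands in one superpixel
proto = meta_pixel(np.random.rand(50, 200))
```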
Figure 1. Overview of the proposed HSI target detection method. In the training stage, the observed HSI data are segmented by the entropy rate superpixel segmentation method, and the training meta-pixel set is constructed, which is further decomposed to obtain a discriminative target and background dictionary pair under the guidance of the target spectra and the structurally incoherent regularization in an adaptive lower-dimensional embeddable subspace. In the testing stage, the HSI data are segmented at a finer scale to construct the testing meta-pixel set as in the training stage. The discriminative target and background dictionary pair obtained in the training stage is then combined with representative representation-learning-based methods, such as SRD, SRBBH, and BCRD, for meta-pixel-level target detection.
Figure 2. Illustration of the center pixel and the meta-pixel. As shown in (a), the center pixel equally merges the pixels in a superpixel. In contrast, as in (b), the contributions of the pixels in a superpixel are weighted by considering the local manifold affinity structure between the pixels, yielding the key typical spectral signature of each superpixel, i.e., the meta-pixel.
Figure 3. The HSI datasets and the corresponding ground truth used in the experiments. (a) AVIRIS I dataset; (b) AVIRIS II dataset; (c) Indian Pines dataset; (d) HYDICE dataset.
Figure 4. Visual comparison between the detection maps of the proposed method and the comparative methods on the AVIRIS I dataset.
Figure 5. Visual comparison between the detection maps of the proposed method and the comparative methods on the AVIRIS II dataset.
Figure 6. Visual comparison between the detection maps of the proposed method and the comparative methods on the Indian Pines dataset.
Figure 7. Visual comparison between the detection maps of the proposed method and the comparative methods on the HYDICE dataset.
Figure 8. ROC performance of all comparative detectors on the different datasets. (a) AVIRIS I dataset; (b) AVIRIS II dataset; (c) Indian Pines dataset; (d) HYDICE dataset.
Figure 9. ROC performance variations of the proposed detectors with different reduced dimensionality d on the different datasets. (a) MEDDP + SRD; (b) MEDDP + SRBBH; (c) MEDDP + BCRD.
Figure 10. ROC performance variations of the proposed detectors with different numbers of training meta-pixels C on the different datasets. (a) MEDDP + SRD; (b) MEDDP + SRBBH; (c) MEDDP + BCRD.
Figure 11. ROC performance variations of the proposed detectors with different numbers of testing meta-pixels V on the different datasets. (a) MEDDP + SRD; (b) MEDDP + SRBBH; (c) MEDDP + BCRD.
Figure 12. ROC performance variations of the proposed detectors with different settings of the balance parameter α on the different datasets. (a) MEDDP + SRD; (b) MEDDP + SRBBH; (c) MEDDP + BCRD.
Figure 13. ROC performance variations of the proposed detectors with different settings of the balance parameters β and γ with α fixed on the different datasets. (a) MEDDP + SRD on the AVIRIS I dataset with α = 100; (b) MEDDP + BCRD on the AVIRIS II dataset with α = 10; (c) MEDDP + SRBBH on the Indian Pines dataset with α = 0.01; (d) MEDDP + BCRD on the HYDICE dataset with α = 0.1.
Figure 14. Convergence curves of Algorithm 1 for solving the proposed MEDDP model on the different datasets. (a) AVIRIS I; (b) AVIRIS II; (c) Indian Pines; (d) HYDICE.
20 pages, 5903 KiB  
Article
A Lightweight Convolutional Neural Network Based on Group-Wise Hybrid Attention for Remote Sensing Scene Classification
by Cuiping Shi, Xinlei Zhang, Jingwei Sun and Liguo Wang
Remote Sens. 2022, 14(1), 161; https://doi.org/10.3390/rs14010161 - 30 Dec 2021
Cited by 13 | Viewed by 3400
Abstract
With the development of computer vision, attention mechanisms have been widely studied. Although introducing an attention module into a network model can help to improve classification performance on remote sensing scene images, doing so directly increases the number of model parameters and the amount of calculation, resulting in slower model operation. To solve this problem, we carried out the following work. First, a channel attention module and a spatial attention module were constructed. The input features were enhanced through channel attention and spatial attention separately, and the features recalibrated by the attention modules were fused to obtain features with hybrid attention. Then, to reduce the increase in parameters caused by the attention module, a group-wise hybrid attention module was constructed. This module divides the input features into four groups along the channel dimension, uses the hybrid attention mechanism to enhance the features of each group in the channel and spatial dimensions, and finally fuses the features of the four groups along the channel dimension. Through the use of the group-wise hybrid attention module, the number of parameters and the computational burden of the network are greatly reduced, and the running time of the network is shortened. Finally, a lightweight convolutional neural network based on group-wise hybrid attention (LCNN-GWHA) was constructed for remote sensing scene image classification. Experiments on four open and challenging remote sensing scene datasets demonstrated that the proposed method has great advantages in terms of classification accuracy, even with a very low number of parameters.
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
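The group-wise hybrid attention described above splits the input into four channel groups, applies channel and spatial attention to each group, and re-fuses the groups along the channel dimension. The PyTorch sketch below illustrates that flow; the internal attention designs (SE-style squeeze-excitation and a 7 × 7 convolution over pooled maps) and the class name `GroupWiseHybridAttention` are assumptions, not the authors' exact LCNN-GWHA layers.

```python
import torch
import torch.nn as nn

class GroupWiseHybridAttention(nn.Module):
    """Illustrative group-wise hybrid attention block: split into channel
    groups, apply channel and spatial attention per group, fuse the two
    attention outputs, and concatenate the groups back together."""
    def __init__(self, channels: int, groups: int = 4, reduction: int = 8):
        super().__init__()
        assert channels % groups == 0
        self.groups = groups
        c = channels // groups
        self.channel_att = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(1),
                          nn.Conv2d(c, max(c // reduction, 1), 1), nn.ReLU(),
                          nn.Conv2d(max(c // reduction, 1), c, 1), nn.Sigmoid())
            for _ in range(groups)])
        self.spatial_att = nn.ModuleList([
            nn.Sequential(nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid())
            for _ in range(groups)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        outs = []
        for g, xg in enumerate(torch.chunk(x, self.groups, dim=1)):
            xc = xg * self.channel_att[g](xg)                 # channel branch
            pooled = torch.cat([xg.mean(1, keepdim=True),
                                xg.amax(1, keepdim=True)], dim=1)
            xs = xg * self.spatial_att[g](pooled)             # spatial branch
            outs.append(xc + xs)                              # hybrid fusion
        return torch.cat(outs, dim=1)                         # regroup channels

y = GroupWiseHybridAttention(64)(torch.rand(2, 64, 56, 56))
```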
Graphical abstract
Figure 1. Features obtained by traditional convolution.
Figure 2. Channel attention module.
Figure 3. Spatial attention module.
Figure 4. Group-wise hybrid attention module (GWHAM). W, H, and C are the width, height, and number of channels of the feature, respectively. X⌢ and X⌣ are the output features of the channel attention and spatial attention, respectively, and X̃ is the hybrid attention feature after fusion.
Figure 5. Overall flowchart of the proposed LCNN-GWHA method (GWHAM refers to the group-wise hybrid attention modules, and GAP denotes global average pooling).
Figure 6. Confusion matrix of the proposed LCNN-GWHA method on the RSSCN7 dataset (50/50).
Figure 7. Confusion matrix of the LCNN-GWHA method on the UCM21 dataset (80/20).
Figure 8. Confusion matrix of the LCNN-GWHA method on the AID (50/50) dataset.
Figure 9. Confusion matrix of the LCNN-GWHA method on the NWPU45 (20/80) dataset.
Figure 10. Attention visualization results.
Figure 11. Class activation map (CAM) visualization results of the LCNN-GWHA method and the VGG_VD16 with SAFF method on the UCM21 dataset.
Figure 12. Random classification prediction results.
20 pages, 10292 KiB  
Article
Building Plane Segmentation Based on Point Clouds
by Zhonghua Su, Zhenji Gao, Guiyun Zhou, Shihua Li, Lihui Song, Xukun Lu and Ning Kang
Remote Sens. 2022, 14(1), 95; https://doi.org/10.3390/rs14010095 - 25 Dec 2021
Cited by 14 | Viewed by 4773
Abstract
Planes are essential features to describe the shapes of buildings, and plane segmentation is significant when reconstructing a building in three dimensions. However, accurately segmenting planes from point cloud data remains a concern. The objective of this paper was to develop an effective segmentation algorithm for building planes that combines the region growing algorithm with a distance algorithm based on boundary points. The method was tested on point cloud data of a cottage and a pantry, scanned using a Faro Focus 3D laser scanner and a Matterport camera, respectively. A coarse extraction of the building plane was obtained from the region growing algorithm, and the coplanar points where two planes intersect were obtained from the distance algorithm. The optimal segmentation of the building plane was then obtained by combining the coarse extraction plane points with the corresponding coplanar points. The results show that the proposed method successfully segmented the plane points of the cottage and the pantry. The optimal distance thresholds from the plane points missed by the coarse extraction to each plane boundary point were 0.025 m for the cottage and 0.030 m for the pantry. Under the optimal distance threshold, the highest correct rate and the highest error rate of the cottage's (pantry's) plane segmentation were 99.93% and 2.30% (98.55% and 2.44%), respectively, and the F1 scores for the cottage and the pantry reached 97.56% and 95.75%, respectively. The method can segment different objects on the same plane, whereas the random sample consensus (RANSAC) algorithm over-segments the plane. The proposed method can also extract the coplanar points at the intersection of two planes, which cannot be separated using the region growing algorithm. Although the RANSAC-RG method, which combines the RANSAC and region growing algorithms, can optimize the segmentation results of either algorithm and differs little in segmentation effect from the proposed method (especially for the cottage data), it still loses coplanar points at some intersections of two planes.
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
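The abstract above assigns points missed by the coarse region-growing extraction to a plane when their distance to that plane's boundary points falls below a threshold (0.025 m for the cottage, 0.030 m for the pantry). A minimal NumPy sketch of that distance test follows; the function `recover_coplanar_points` is hypothetical, and boundary-point extraction and conflict handling between planes are omitted.

```python
import numpy as np

def recover_coplanar_points(unassigned: np.ndarray,
                            boundary: np.ndarray,
                            threshold: float = 0.025) -> np.ndarray:
    """Return indices of unassigned points lying within `threshold`
    of any boundary point of a coarsely extracted plane.

    unassigned: (M, 3) points the region growing step left out.
    boundary:   (K, 3) boundary points of one coarse plane.
    threshold:  distance threshold in meters.
    Illustrative sketch of the distance test described in the abstract.
    """
    # pairwise Euclidean distances (M, K), then the nearest boundary point
    d = np.linalg.norm(unassigned[:, None, :] - boundary[None, :, :], axis=2)
    return np.where(d.min(axis=1) < threshold)[0]

# usage with random stand-in coordinates
idx = recover_coplanar_points(np.random.rand(1000, 3), np.random.rand(200, 3))
```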
Figure 1. Workflow of building plane point segmentation.
Figure 2. Boundary point extraction algorithm, where the point projected onto the micro-tangent plane is (a) on the plane's boundary and (b) inside the plane.
Figure 3. Boundary point extraction: (a) raw point cloud; (b) boundary point extraction results.
Figure 4. The distance threshold: (a) 0.020 m; (b) 0.025 m.
Figure 5. Raw point cloud data of the cottage: (a) front view and (b) top view.
Figure 6. Raw point cloud data of the pantry: (a) front view and (b) side view.
Figure 7. The RANSAC algorithm fitting plane points: (a) original points; (b) the plane detected by the RANSAC algorithm.
Figure 8. (a) Raw data; (b) the region growing algorithm; (c) the proposed method.
Figure 9. Cottage's segmentation results using the RANSAC algorithm: (a) front view and (b) top view.
Figure 10. Pantry's segmentation results using the RANSAC algorithm: (a) front view and (b) side view.
Figure 11. Cottage's segmentation results using the region growing algorithm: (a) front view and (b) top view.
Figure 12. Pantry's segmentation results using the region growing algorithm: (a) front view and (b) side view.
Figure 13. Cottage's segmentation results using the RANSAC-RG method: (a) front view and (b) top view.
Figure 14. Pantry's segmentation results using the RANSAC-RG method: (a) front view and (b) side view.
Figure 15. Cottage's segmentation results using the proposed method under the optimal distance threshold: (a) front view and (b) top view.
Figure 16. Pantry's segmentation results using the proposed method under the optimal distance threshold: (a) front view and (b) side view.
22 pages, 2202 KiB  
Article
Accurate Instance Segmentation for Remote Sensing Images via Adaptive and Dynamic Feature Learning
by Feng Yang, Xiangyue Yuan, Jie Ran, Wenqiang Shu, Yue Zhao, Anyong Qin and Chenqiang Gao
Remote Sens. 2021, 13(23), 4774; https://doi.org/10.3390/rs13234774 - 25 Nov 2021
Cited by 6 | Viewed by 3664
Abstract
Instance segmentation for high-resolution remote sensing images (HRSIs) is a fundamental yet challenging task in earth observation, which aims at achieving instance-level location and pixel-level classification for instances of interest on the earth's surface. The main difficulties come from the huge scale variation, arbitrary instance shapes, and numerous densely packed small objects in HRSIs. In this paper, we design an end-to-end multi-category instance segmentation network for HRSIs, in which three new modules based on adaptive and dynamic feature learning are proposed to address these issues. The cross-scale adaptive fusion (CSAF) module introduces a novel multi-scale feature fusion mechanism to enhance the capability of the model to detect and segment objects with noticeable size variation. To predict precise masks for the complex boundaries of remote sensing instances, we embed a context attention upsampling (CAU) kernel instead of deconvolution in the segmentation branch to aggregate contextual information for refined upsampling. Furthermore, we extend the usual fixed positive and negative sample judgment threshold strategy into a dynamic sample selection (DSS) module to flexibly select more suitable positive and negative samples for densely packed instances. These three modules enable better feature learning in the instance segmentation network. Extensive experiments are conducted on the iSAID and NWPU VHR-10 instance segmentation datasets to validate the proposed method. Owing to the three proposed modules, we achieve 1.9% and 2.9% segmentation performance improvements on these two datasets compared with the baseline method, reaching state-of-the-art performance.
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
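The dynamic sample selection described above relies on an IoU variant that penalizes misplaced candidate boxes using their minimum enclosing rectangle with the ground truth (Figures 7 and 8 of the paper). The sketch below shows a GIoU-style reading of that penalty; the function name `iou_with_enclosing_penalty` is hypothetical, and the exact IoU_p definition in the paper may differ.

```python
def iou_with_enclosing_penalty(c, g):
    """Illustrative IoU_p: plain IoU minus a penalty based on the minimum
    enclosing rectangle R of the candidate box c and the ground-truth box g
    (boxes as (x1, y1, x2, y2)). GIoU-style sketch, not the paper's exact
    definition."""
    ix1, iy1 = max(c[0], g[0]), max(c[1], g[1])
    ix2, iy2 = min(c[2], g[2]), min(c[3], g[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_c = (c[2] - c[0]) * (c[3] - c[1])
    area_g = (g[2] - g[0]) * (g[3] - g[1])
    union = area_c + area_g - inter
    iou = inter / union

    # minimum enclosing rectangle R and the penalty for misplaced candidates
    rx1, ry1 = min(c[0], g[0]), min(c[1], g[1])
    rx2, ry2 = max(c[2], g[2]), max(c[3], g[3])
    area_r = (rx2 - rx1) * (ry2 - ry1)
    penalty = (area_r - union) / area_r
    return iou - penalty

# usage: a diagonally misplaced candidate receives a larger enclosing-rectangle penalty
print(iou_with_enclosing_penalty((0, 0, 10, 10), (2, 0, 12, 10)))
print(iou_with_enclosing_penalty((0, 0, 10, 10), (2, 2, 12, 12)))
```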
Figure 1. Characteristics of objects in HRSIs. (a) There are huge scale variations among different planes. (b) Harbors present complex boundaries. (c) Densely packed ships appear in the marina. Note the size gap among objects in the three scenes and the shape differences among the harbors in (b,c).
Figure 2. The network structure of the proposed method, which is based on the PANet architecture and adds the cross-scale adaptive fusion (CSAF) module for multi-scale feature map fusion, the context attention upsampling (CAU) module to refine mask prediction, and the dynamic sample selection (DSS) module to select suitable positive/negative samples.
Figure 3. The structure of the proposed cross-scale adaptive fusion module. For each pyramidal feature map, the others are rescaled to the same shape and then spatially fused according to the learned fusion weights.
Figure 4. Illustration of the cross-scale adaptive fusion mechanism, taking the fusion to the target layer P̂4 as an example.
Figure 5. Illustration of the context attention upsampling. A feature map Ψ of size W × H × C is upsampled by a factor of η to the output feature map Ψ′; here η = 2 is taken as an example.
Figure 6. The mutual interference among densely packed instances. A candidate bounding box can contain multiple objects (red box in the figure). Neighboring objects with similar appearances and structures act as interference noise during localization and classification, which affects the prediction results of the network.
Figure 7. The calculation process of the penalty item. The yellow box and the light blue box denote the candidate positive sample and its corresponding ground-truth box; the blue dashed box R represents their minimum enclosing rectangle.
Figure 8. The differences between IoU and IoU_p. The candidate box c in (a) is placed parallel to the ground truth g, while that in (b) is misplaced. Although the IoU of (a) and (b) is the same, their difficulty of coordinate regression differs.
Figure 9. Class-wise instance segmentation of the proposed approach on the iSAID validation set.
Figure 10. Class-wise instance segmentation of the proposed approach on the NWPU VHR-10 test set.
Figure 11. Visual instance segmentation results of the proposed method on the iSAID validation set. (a) Input images; (b) ground-truth masks; (c,d) predicted results of PANet and our method. The red rectangles indicate the missing predictions and under-segmentation problems of PANet.
Figure 12. Visual instance segmentation results of the proposed method on the NWPU VHR-10 test set. (a) Input images; (b) ground-truth masks; (c,d) predicted results of PANet and our method.
25 pages, 4607 KiB  
Article
Remote Sensing Scene Image Classification Based on Dense Fusion of Multi-level Features
by Cuiping Shi, Xinlei Zhang, Jingwei Sun and Liguo Wang
Remote Sens. 2021, 13(21), 4379; https://doi.org/10.3390/rs13214379 - 30 Oct 2021
Cited by 14 | Viewed by 2483
Abstract
For remote sensing scene image classification, many convolutional neural networks improve classification accuracy at the cost of the time and space complexity of the model. This leads to a slow running speed and fails to realize a trade-off between model accuracy and running speed. As the network deepens, it is also difficult to extract the key features with a simple double-branched structure, and shallow features are lost, which is unfavorable for the classification of remote sensing scene images. To solve this problem, we propose a dual-branch multi-level feature dense fusion-based lightweight convolutional neural network (BMDF-LCNN). The network fully extracts the information of the current layer through 3 × 3 depthwise separable convolution, 1 × 1 standard convolution, and identity branches, and fuses it with the features extracted from the previous layer by 1 × 1 standard convolution, thus avoiding the loss of shallow information as the network deepens. In addition, we propose a downsampling structure better suited to extracting the shallow features of the network, in which a pooling branch performs the downsampling and a convolution branch compensates for the pooled features. Experiments were carried out on four open and challenging remote sensing scene datasets. The results show that the proposed method has higher classification accuracy and lower model complexity than some state-of-the-art classification methods and realizes a trade-off between model accuracy and running speed.
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
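The abstract above combines a 3 × 3 depthwise separable convolution, a 1 × 1 standard convolution, and an identity branch, fused with a 1 × 1-convolved feature from the previous layer. The PyTorch sketch below illustrates that branch pattern under simplifying assumptions; the class name `DualBranchBlock` and the additive fusion are illustrative, not the exact BMDF-LCNN topology.

```python
import torch
import torch.nn as nn

class DualBranchBlock(nn.Module):
    """Illustrative dense-fusion block: depthwise separable 3x3, standard
    1x1, and identity branches on the current layer, fused with a 1x1-
    convolved feature from the previous layer."""
    def __init__(self, channels: int):
        super().__init__()
        self.dw = nn.Sequential(                     # 3x3 depthwise separable conv
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),
            nn.Conv2d(channels, channels, 1),
            nn.BatchNorm2d(channels), nn.ReLU())
        self.pw = nn.Sequential(                     # 1x1 standard conv branch
            nn.Conv2d(channels, channels, 1),
            nn.BatchNorm2d(channels), nn.ReLU())
        self.prev = nn.Sequential(                   # 1x1 conv on the previous layer
            nn.Conv2d(channels, channels, 1),
            nn.BatchNorm2d(channels), nn.ReLU())

    def forward(self, x: torch.Tensor, x_prev: torch.Tensor) -> torch.Tensor:
        # identity branch keeps the shallow features of the current layer
        return self.dw(x) + self.pw(x) + x + self.prev(x_prev)

x_prev = torch.rand(2, 64, 56, 56)
x = torch.rand(2, 64, 56, 56)
y = DualBranchBlock(64)(x, x_prev)
```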
Graphical abstract
Figure 1. The proposed BMDF-LCNN network model.
Figure 2. Three downsampling structure diagrams: (a) convolutional downsampling; (b) maximum pooled downsampling; (c) our proposed downsampling method (each convolution layer is followed by a BN layer and ReLU).
Figure 3. Optimizing time and space complexity structures. (a) Basic structure diagram for optimizing time and space complexity. (b) Structure diagram with the same number of input and output channels in the first layer of the branch. (c) Structure diagram with a different number of input and output channels in the first layer of the branch (each convolution layer is followed by BN and ReLU layers).
Figure 4. Performance comparison of BMDF-LCNN and LCNN-BFF. (a) Comparison of AA values between BMDF-LCNN and LCNN-BFF. (b) Comparison of F1 values between BMDF-LCNN and LCNN-BFF.
Figure 5. Confusion matrices of the BMDF-LCNN method on the UC and RSSCN datasets. (a) Confusion matrix obtained on the 80/20 UC dataset. (b) Confusion matrix on the 50/50 RSSCN dataset.
Figure 6. Confusion matrices of the proposed BMDF-LCNN method on the 20/80 AID and 10/90 NWPU datasets. (a) Confusion matrix on the 20/80 AID dataset. (b) Confusion matrix on the 10/90 NWPU dataset.
Figure 7. Thermal diagram on the RSSCN dataset.
Figure 8. t-SNE visualization results on the 80/20 UC and 50/50 RSSCN datasets. (a) t-SNE visualization results on the 80/20 UC dataset. (b) t-SNE visualization results on the 50/50 RSSCN dataset.
Figure 9. Random classification prediction results.

Review


37 pages, 3377 KiB  
Review
Self-Supervised Learning for Scene Classification in Remote Sensing: Current State of the Art and Perspectives
by Paul Berg, Minh-Tan Pham and Nicolas Courty
Remote Sens. 2022, 14(16), 3995; https://doi.org/10.3390/rs14163995 - 17 Aug 2022
Cited by 38 | Viewed by 6559
Abstract
Deep learning methods have become an integral part of computer vision and machine learning research by providing significant improvements in many tasks such as classification, regression, and detection. These gains have also been observed in the field of remote sensing for Earth observation, where most state-of-the-art results are now achieved by deep neural networks. However, one downside of these methods is the need for large amounts of annotated data, requiring labor-intensive and expensive human effort, particularly for specific domains that require expert knowledge such as medical imaging or remote sensing. In order to limit the requirement for data annotations, several self-supervised representation learning methods have been proposed to learn unsupervised image representations that can then serve downstream tasks such as image classification, object detection, or semantic segmentation. As a result, self-supervised learning approaches have been widely adopted in the remote sensing domain within the last few years. In this article, we review the underlying principles developed by various self-supervised methods, with a focus on the scene classification task. We highlight the main contributions, analyze the experiments, and summarize the key conclusions of each study. We then conduct extensive experiments on two public scene classification datasets to benchmark and evaluate different self-supervised models. Based on the comparative results, we investigate the impact of individual augmentations when applied to remote sensing data, as well as the use of self-supervised pre-training to boost classification performance with a limited number of labeled samples. We finally underline the current trends, challenges, and perspectives of self-supervised scene classification.
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
Figure 1. Illustration of the difference between object-centric natural images (from the ImageNet [1] dataset) and remote sensing scene images (from the Resisc-45 [3] dataset). (a) Object-centric image samples; (b) remote sensing image samples.
Figure 2. The GAN architecture [29] for use in generative self-supervised representation learning. The representation used is h, the last discriminator activation before a binary classification into fake/real labels.
Figure 3. In RotNet [33], a random rotation is applied to the input images and the model is tasked with classifying which rotation was applied. The model is composed of an encoder f whose output representations h are used by a predictor p to classify the random rotation.
Figure 4. The triplet loss [37] is used to learn discriminative representations by learning an encoder that is able to discriminate between negative and positive samples.
Figure 5. Illustration of the contrastive loss on the 2-dimensional unit sphere with two negative samples (z1−, z2−) and one positive sample (z+) from the EuroSAT [17] dataset.
Figure 6. In momentum contrast [40], a queue of Q samples is built using a momentum encoder (right) whose weights are updated as an exponential moving average (EMA) of the main encoder's weights (left). Therefore, at each step, only the main encoder's weights are updated by back-propagation. The similarity between the queue samples and the encoded batch samples is then used in the contrastive loss (cf. Equation (4)).
Figure 7. The non-contrastive BYOL [44] architecture, which uses a student pathway A and a teacher pathway B to encode the images. The teacher's weights are updated using an EMA of the student's weights. The online branch is also equipped with an additional network p^A called the predictor.
Figure 8. The split-brain autoencoder architecture [55] used in [54] to split the image x into different data channels, where each autoencoder learns to reconstruct the dedicated missing channels. f^A and f^B are two autoencoders, each reconstructing a different subset of input channels given by the channel masking functions Mask^A and Mask^B, respectively.
Figure 9. Sample images from the Resisc-45 [3] dataset with 45 scene classes.
Figure 10. Sample images from the EuroSAT [17] dataset.
Figure 11. Joint-embedding methods: pre-training and usage in a downstream scene classification task. In the pre-training phase, depending on the framework, the encoder and predictor can have similar or different architectures in the two branches. In the downstream phase, the encoder weights are frozen for linear evaluation or updated for fine-tuning evaluation.
Figure 12. The t-SNE [109] visualization of feature representations extracted from the EuroSAT validation set using the pre-trained backbones of four self-supervised models, the supervised model, and the random weight initialization strategy. (a) SimCLR [39]; (b) MoCo-v2 [67]; (c) Barlow Twins [48]; (d) BYOL [44]; (e) supervised; (f) random weights.
Figure 13. Comparison of fine-tuning performance using MoCo-v2 or BYOL under a limited number of samples with pre-training transfer on EuroSAT. (a) Validation on the EuroSAT dataset. (b) Validation on the Resisc-45 dataset.
Figure 14. Comparison of fine-tuning performance of supervised and self-supervised pre-trained models on another dataset. (a) Pre-training on Resisc-45 and fine-tuning/validation on EuroSAT. (b) Pre-training on EuroSAT and fine-tuning/validation on Resisc-45.
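Figures 5 and 6 above illustrate the contrastive objective that underlies several of the reviewed methods: normalized embeddings on the unit sphere, one positive sample, and a bank of negatives (e.g., the MoCo queue). The following is a standard InfoNCE-style sketch, not code from any of the reviewed papers; the temperature value 0.07 is a common default.

```python
import torch
import torch.nn.functional as F

def info_nce(z_query: torch.Tensor, z_pos: torch.Tensor,
             z_neg: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Generic InfoNCE-style contrastive loss: the query is pulled towards
    its positive and pushed away from a bank of negatives after L2
    normalization onto the unit sphere."""
    q = F.normalize(z_query, dim=1)          # (B, D)
    p = F.normalize(z_pos, dim=1)            # (B, D)
    n = F.normalize(z_neg, dim=1)            # (K, D) negative bank / queue
    l_pos = (q * p).sum(dim=1, keepdim=True)             # (B, 1) positive logits
    l_neg = q @ n.t()                                     # (B, K) negative logits
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)  # positive is index 0
    return F.cross_entropy(logits, labels)

loss = info_nce(torch.randn(8, 128), torch.randn(8, 128), torch.randn(4096, 128))
```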