Learning to Understand Remote Sensing Images

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (31 December 2018) | Viewed by 355368

Printed Edition Available!
A printed edition of this Special Issue is available here.

Special Issue Editor


Dr. Qi Wang
Collection Editor
School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, P.O. Box 64, Xi'an 710072, China
Interests: remote sensing; image analysis; computer vision; pattern recognition; machine learning

Special Issue Information

Dear Colleagues,

With the recent advances of remote sensing technologies for Earth observation, many different remote sensors are collecting data with distinctive properties. The obtained data are so large and complex that analyzing them manually becomes impractical or even impossible. For example, multi-source, multi-temporal and multi-scale data are frequently delivered by remote sensors; exploring them by hand to extract useful information would entail an overwhelming workload and yield unsatisfactory results. Therefore, understanding remote sensing images effectively, in connection with physics, has been a primary concern of the remote sensing research community in recent years. For this purpose, machine learning is considered a promising technique because it enables a system to learn and improve itself. With this distinctive characteristic, algorithms become more adaptive, automatic and intelligent.

In recent decades, this area has attracted considerable research interest, and significant progress has been made, particularly in the optical, hyperspectral and microwave remote sensing communities. For instance, several tutorials at various conferences have dealt directly or indirectly with machine learning topics, and numerous papers are published each year in the top journals of the remote sensing community. In particular, with the popularity of deep learning and big data concepts, research on data learning and mining paradigms has reached new heights. The success of machine learning techniques lies in their practical effectiveness, improving on current methods and achieving state-of-the-art performance.

Nevertheless, problems remain to be solved. For example, how can the limited training samples available to deep-learning-based methods be brought together and exploited? How can the original machine learning prototypes be adapted to remote sensing applications, and in particular to the underlying physics? How should learning speed be traded off against effectiveness? Many other challenges in the remote sensing field have fostered new efforts and developments to better understand remote sensing images via machine learning techniques.

This Collection on "Learning to Understand Remote Sensing Images" focuses on this topic. We invite original submissions reporting recent advances in the machine learning approaches towards analyzing and understanding remote sensing images, and aim foster an increased interest in this field.

This Collection will emphasize the use of state-of-the-art machine learning techniques and statistical computing methods, such as deep learning, graphical models, sparse coding and kernel machines. 

Potential topics include, but are not limited to:

  • Optical, hyperspectral, microwave and other types of remote sensing data;

  • Feature learning for remote sensing images; 

  • Learning strategies for multi-source/multi-temporal/multi-scale image fusion; 

  • Novel machine learning and statistical computing methods for remote sensing; 

  • Learning metrics on benchmark databases;

  • Applications of the learning approaches, such as classification, segmentation, unmixing, change detection and semantic labelling.

Authors are requested to check and follow the specific Instructions to Authors, https://www.mdpi.com/journal/remotesensing/instructions.

We look forward to receiving your submissions.

Dr. Qi Wang
Collection Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (40 papers)


Research


24 pages, 24260 KiB  
Article
Nonlocal Tensor Sparse Representation and Low-Rank Regularization for Hyperspectral Image Compressive Sensing Reconstruction
by Jize Xue, Yongqiang Zhao, Wenzhi Liao and Jonathan Cheung-Wai Chan
Remote Sens. 2019, 11(2), 193; https://doi.org/10.3390/rs11020193 - 19 Jan 2019
Cited by 56 | Viewed by 6649
Abstract
Hyperspectral image compressive sensing reconstruction (HSI-CSR) is an important issue in remote sensing, and has recently been investigated increasingly by the sparsity prior based approaches. However, most of the available HSI-CSR methods consider the sparsity prior in spatial and spectral vector domains via vectorizing hyperspectral cubes along a certain dimension. Besides, in most previous works, little attention has been paid to exploiting the underlying nonlocal structure in spatial domain of the HSI. In this paper, we propose a nonlocal tensor sparse and low-rank regularization (NTSRLR) approach, which can encode essential structured sparsity of an HSI and explore its advantages for HSI-CSR task. Specifically, we study how to utilize reasonably the l 1 -based sparsity of core tensor and tensor nuclear norm function as tensor sparse and low-rank regularization, respectively, to describe the nonlocal spatial-spectral correlation hidden in an HSI. To study the minimization problem of the proposed algorithm, we design a fast implementation strategy based on the alternative direction multiplier method (ADMM) technique. Experimental results on various HSI datasets verify that the proposed HSI-CSR algorithm can significantly outperform existing state-of-the-art CSR techniques for HSI recovery. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figures: (1) flowchart of the proposed HSI-CSR algorithm (sensing and reconstruction steps); (2) nonlocal tensor sparsity and low-rank property analysis; (3) test HSIs (Toy, PaviaU, Indian Pines); (4–6) reconstructed pseudocolor results on Toy, PaviaU and Indian Pines at sampling rates 0.20, 0.10 and 0.15; (7) per-band PSNR, SSIM and FSIM on Indian Pines; (8) spectra-difference comparison on Toy and PaviaU; (9) SVM classification of Indian Pines before and after CSR; (10–12) reconstructions and horizontal mean profiles (bands 1 and 186) on the real noisy Urban data; (13) MPSNR, MSSIM and SAM bars on PaviaU for sampling rates 0.05–0.20; (14) convergence (PSNR progression) on Toy and Indian Pines.
15 pages, 2195 KiB  
Article
Online Hashing for Scalable Remote Sensing Image Retrieval
by Peng Li, Xiaoyu Zhang, Xiaobin Zhu and Peng Ren
Remote Sens. 2018, 10(5), 709; https://doi.org/10.3390/rs10050709 - 4 May 2018
Cited by 26 | Viewed by 5320
Abstract
Recently, hashing-based large-scale remote sensing (RS) image retrieval has attracted much attention. Many new hashing algorithms have been developed and successfully applied to fast RS image retrieval tasks. However, there exists an important problem rarely addressed in the research literature of RS image hashing. The RS images are practically produced in a streaming manner in many real-world applications, which means the data distribution keeps changing over time. Most existing RS image hashing methods are batch-based models whose hash functions are learned once for all and kept fixed all the time. Therefore, the pre-trained hash functions might not fit the ever-growing new RS images. Moreover, the batch-based models have to load all the training images into memory for model learning, which consumes many computing and memory resources. To address the above deficiencies, we propose a new online hashing method, which learns and adapts its hashing functions with respect to the newly incoming RS images in terms of a novel online partial random learning scheme. Our hash model is updated in a sequential mode such that the representative power of the learned binary codes for RS images are improved accordingly. Moreover, benefiting from the online learning strategy, our proposed hashing approach is quite suitable for scalable real-world remote sensing image retrieval. Extensive experiments on two large-scale RS image databases under online setting demonstrated the efficacy and effectiveness of the proposed method. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figures: (1) illustration of the proposed online hashing approach for scalable RS image retrieval; (2) sample images from the SAT-4 and SAT-6 datasets; (3) average precision and precision-recall curves of the compared methods on both datasets; (4) average precision per round of the online hashing methods (64 bits); (5) visualized retrieval example after different rounds on SAT-6 (64 bits), with false positives marked.
18 pages, 87543 KiB  
Article
A Novel Affine and Contrast Invariant Descriptor for Infrared and Visible Image Registration
by Xiangzeng Liu, Yunfeng Ai, Juli Zhang and Zhuping Wang
Remote Sens. 2018, 10(4), 658; https://doi.org/10.3390/rs10040658 - 23 Apr 2018
Cited by 38 | Viewed by 6545
Abstract
Infrared and visible image registration is a very challenging task due to the large geometric changes and the significant contrast differences caused by the inconsistent capture conditions. To address this problem, this paper proposes a novel affine and contrast invariant descriptor called maximally stable phase congruency (MSPC), which integrates the affine invariant region extraction with the structural features of images organically. First, to achieve the contrast invariance and ensure the significance of features, we detect feature points using moment ranking analysis and extract structural features via merging phase congruency images in multiple orientations. Then, coarse neighborhoods centered on the feature points are obtained based on Log-Gabor filter responses over scales and orientations. Subsequently, the affine invariant regions of feature points are determined by using maximally stable extremal regions. Finally, structural descriptors are constructed from those regions and the registration can be implemented according to the correspondence of the descriptors. The proposed method has been tested on various infrared and visible pairs acquired by different platforms. Experimental results demonstrate that our method outperforms several state-of-the-art methods in terms of robustness and precision with different image data and also show its effectiveness in the application of trajectory tracking. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figures: graphical abstract; (1) contrast and viewpoint differences between infrared and visible inputs; (2) illustration of registration with the proposed method; (3) feature point detection by salient feature point extraction (MSFPE); (4) structural feature extraction via multi-orientation phase congruency; (5) construction of the MSPC descriptor; (6) flow chart of the proposed registration; (7–8) infrared/visible pairs from the CVC datasets and their matching results; (9–12) image pairs from the electro-optical pod (EOP) on a UAV, their matching results, matching performance comparison (precision and repeatability) and registration results; (13–15) reference image from Google, sub-images from the real-time images and their registration results; (16) UAV trajectory tracking results of the registration method.
24 pages, 3900 KiB  
Article
Deep Salient Feature Based Anti-Noise Transfer Network for Scene Classification of Remote Sensing Imagery
by Xi Gong, Zhong Xie, Yuanyuan Liu, Xuguo Shi and Zhuo Zheng
Remote Sens. 2018, 10(3), 410; https://doi.org/10.3390/rs10030410 - 6 Mar 2018
Cited by 39 | Viewed by 5971
Abstract
Remote sensing (RS) scene classification is important for RS imagery semantic interpretation. Although tremendous strides have been made in RS scene classification, one of the remaining open challenges is recognizing RS scenes in low quality variance (e.g., various scales and noises). This paper proposes a deep salient feature based anti-noise transfer network (DSFATN) method that effectively enhances and explores the high-level features for RS scene classification in different scales and noise conditions. In DSFATN, a novel discriminative deep salient feature (DSF) is introduced by saliency-guided DSF extraction, which conducts a patch-based visual saliency (PBVS) algorithm using “visual attention” mechanisms to guide pre-trained CNNs for producing the discriminative high-level features. Then, an anti-noise network is proposed to learn and enhance the robust and anti-noise structure information of RS scene by directly propagating the label information to fully-connected layers. A joint loss is used to minimize the anti-noise network by integrating anti-noise constraint and a softmax classification loss. The proposed network architecture can be easily trained with a limited amount of training data. The experiments conducted on three different scale RS scene datasets show that the DSFATN method has achieved excellent performance and great robustness in different scales and noise conditions. It obtains classification accuracy of 98.25%, 98.46%, and 98.80%, respectively, on the UC Merced Land Use Dataset (UCM), the Google image dataset of SIRI-WHU, and the SAT-6 dataset, advancing the state-of-the-art substantially. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figures: graphical abstract; (1) framework of DSFATN, with saliency-guided DSF extraction and anti-noise transfer network based classification; (2) flowchart of PBVS-based salient patch extraction; (3) architecture of VGG-19; (4) the anti-noise transfer network; (5) category sequences of the UCM, SIRI-WHU and SAT-6 datasets; (6–7) confusion matrices on UCM, SIRI-WHU and SAT-6; (8) per-class two-dimensional feature visualization comparing HOG, LBP, SIFT, CNN features and DSF; (9) per-class accuracy under salt-and-pepper noise, partial occlusion and mixed noise on UCM and SIRI-WHU; (10–11) influence of the number of salient patches and of the regularization coefficient; (12–13) an example tennis-court scene and classification results at five noise levels.
18 pages, 9907 KiB  
Article
Learning a Dilated Residual Network for SAR Image Despeckling
by Qiang Zhang, Qiangqiang Yuan, Jie Li, Zhen Yang and Xiaoshuang Ma
Remote Sens. 2018, 10(2), 196; https://doi.org/10.3390/rs10020196 - 29 Jan 2018
Cited by 191 | Viewed by 9494
Abstract
In this paper, to break the limit of the traditional linear models for synthetic aperture radar (SAR) image despeckling, we propose a novel deep learning approach by learning a non-linear end-to-end mapping between the noisy and clean SAR images with a dilated residual network (SAR-DRN). SAR-DRN is based on dilated convolutions, which can both enlarge the receptive field and maintain the filter size and layer depth with a lightweight structure. In addition, skip connections and a residual learning strategy are added to the despeckling model to maintain the image details and reduce the vanishing gradient problem. Compared with the traditional despeckling methods, the proposed method shows a superior performance over the state-of-the-art methods in both quantitative and visual assessments, especially for strong speckle noise. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figures: graphical abstract; (1) architecture of the proposed SAR-DRN; (2–3) receptive field sizes of dilated convolutions (d = 1, 2, 4) and their use in the model; (4) skip-connection structure; (5) framework of deep-learning-based SAR image despeckling; (6–8) despeckled Airplane (two-look), Building and Highway (four-look) images compared with PPB, SAR-BM3D, SAR-POTDF and SAR-CNN; (9–11) despeckled Flevoland, Deathvalley and San Francisco SAR images (four-look); (12–13) ablation results on dilated convolutions, skip connections and batch normalization, evaluated on the Set14 dataset.
18 pages, 4726 KiB  
Article
Comparative Analysis of Responses of Land Surface Temperature to Long-Term Land Use/Cover Changes between a Coastal and Inland City: A Case of Freetown and Bo Town in Sierra Leone
by Musa Tarawally, Wenbo Xu, Weiming Hou and Terence Darlington Mushore
Remote Sens. 2018, 10(1), 112; https://doi.org/10.3390/rs10010112 - 15 Jan 2018
Cited by 53 | Viewed by 9608
Abstract
Urban growth and its associated expansion of built-up areas are expected to continue through to the twenty second century and at a faster pace in developing countries. This has the potential to increase thermal discomfort and heat-related distress. There is thus a need to monitor growth patterns, especially in resource constrained countries such as Africa, where few studies have so far been conducted. In view of this, this study compares urban growth and temperature response patterns in Freetown and Bo town in Sierra Leone. Multispectral Landsat images obtained in 1998, 2000, 2007, and 2015 are used to quantify growth and land surface temperature responses. The contribution index (CI) is used to explain how changes per land use and land cover class (LULC) contributed to average city surface temperatures. The population size of Freetown was about eight times greater than in Bo town. Landsat data mapped urban growth patterns with a high accuracy (Overall Accuracy > 80%) for both cities. Significant changes in LULC were noted in Freetown, characterized by a 114 km2 decrease in agriculture area, 23 km2 increase in dense vegetation, and 77 km2 increase in built-up area. Between 1998 and 2015, built-up area increased by 16 km2, while dense vegetation area decreased by 14 km2 in Bo town. Average surface temperature increased from 23.7 to 25.5 °C in Freetown and from 24.9 to 28.2 °C in Bo town during the same period. Despite the larger population size and greater built-up extent, as well as expansion rate, Freetown was 2 °C cooler than Bo town in all periods. The low temperatures are attributed to proximity to sea and the very large proportion of vegetation surrounding the city. Even close to the sea and abundant vegetation, the built-up area had an elevated temperature compared to the surroundings. The findings are important for formulating heat mitigation strategies for both inland and coastal cities in developing countries. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figures: graphical abstract; (1) location of Freetown and Bo town in Sierra Leone, West Africa; (2–3) urban-growth-induced LULC changes in Freetown and Bo town between 1998 and 2015; (4) land surface temperature change in Freetown and Bo town between 1998 and 2015.
6695 KiB  
Article
Automatic Counting of Large Mammals from Very High Resolution Panchromatic Satellite Imagery
by Yifei Xue, Tiejun Wang and Andrew K. Skidmore
Remote Sens. 2017, 9(9), 878; https://doi.org/10.3390/rs9090878 - 23 Aug 2017
Cited by 53 | Viewed by 11028
Abstract
Estimating animal populations by direct counting is an essential component of wildlife conservation and management. However, conventional approaches (i.e., ground survey and aerial survey) have intrinsic constraints. Advances in image data capture and processing provide new opportunities for using applied remote sensing to count animals. Previous studies have demonstrated the feasibility of using very high resolution multispectral satellite images for animal detection, but to date, the practicality of detecting animals from space using panchromatic imagery has not been proven. This study demonstrates that it is possible to detect and count large mammals (e.g., wildebeests and zebras) from a single, very high resolution GeoEye-1 panchromatic image in open savanna. A novel semi-supervised object-based method that combines a wavelet algorithm and a fuzzy neural network was developed. To discern large mammals from their surroundings and discriminate between animals and non-targets, we used the wavelet technique to highlight potential objects. To make full use of geometric attributes, we carefully trained the classifier, using the adaptive-network-based fuzzy inference system. Our proposed method (with an accuracy index of 0.79) significantly outperformed the traditional threshold-based method (with an accuracy index of 0.58) detecting large mammals in open savanna. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figures: (1) location of the Maasai Mara National Reserve in Kenya and the three pilot study areas on a GeoEye-1 image acquired 11 August 2009; (2) panchromatic GeoEye-1 image showing large mammals in pilot areas of low, moderate and high complexity; (3) visual interpretation of target animals by comparing pan-sharpened images from 2013 (no animals) and 2009 (with animals); (4) workflow of the proposed counting method; (5) flow diagram of the ANFIS-based reclassification system; (6) detection results in the three pilot areas for the thresholding method and the proposed ANFIS-wavelet method; (7) identification of the optimum epoch number from training and checking errors.
9026 KiB  
Article
Topic Modelling for Object-Based Unsupervised Classification of VHR Panchromatic Satellite Images Based on Multiscale Image Segmentation
by Li Shen, Linmei Wu, Yanshuai Dai, Wenfan Qiao and Ying Wang
Remote Sens. 2017, 9(8), 840; https://doi.org/10.3390/rs9080840 - 14 Aug 2017
Cited by 9 | Viewed by 6759
Abstract
Image segmentation is a key prerequisite for object-based classification. However, it is often difficult, or even impossible, to determine a unique optimal segmentation scale due to the fact that various geo-objects, and even an identical geo-object, present at multiple scales in very high resolution (VHR) satellite images. To address this problem, this paper presents a novel unsupervised object-based classification for VHR panchromatic satellite images using multiple segmentations via the latent Dirichlet allocation (LDA) model. Firstly, multiple segmentation maps of the original satellite image are produced by means of a common multiscale segmentation technique. Then, the LDA model is utilized to learn the grayscale histogram distribution for each geo-object and the mixture distribution of geo-objects within each segment. Thirdly, the histogram distribution of each segment is compared with that of each geo-object using the Kullback-Leibler (KL) divergence measure, which is weighted with a constraint specified by the mixture distribution of geo-objects. Each segment is allocated a geo-object category label with the minimum KL divergence. Finally, the final classification map is achieved by integrating the multiple classification results at different scales. Extensive experimental evaluations are designed to compare the performance of our method with those of some state-of-the-art methods for three different types of images. The experimental results over three different types of VHR panchromatic satellite images demonstrate the proposed method is able to achieve scale-adaptive classification results, and improve the ability to differentiate the geo-objects with spectral overlap, such as water and grass, and water and shadow, in terms of both spatial consistency and semantic consistency. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figures: graphical abstract; (1) flowchart of the proposed method; (2) experimental datasets (Mapping Satellite-1, QuickBird and ZY-3 panchromatic images with ground truth maps); (3–5) classification results of O_ISODATA, O_LDA, msLDA, HDP_IBP and mSegLDA on the three images; (6–7) overall accuracy and error versus the number of scales for the Mapping Satellite-1 and QuickBird images; (8–9) classification results of special cases of mSegLDA on the Mapping Satellite-1 and QuickBird images.
2182 KiB  
Article
Learning-Based Sub-Pixel Change Detection Using Coarse Resolution Satellite Imagery
by Yong Xu, Lin Lin and Deyu Meng
Remote Sens. 2017, 9(7), 709; https://doi.org/10.3390/rs9070709 - 10 Jul 2017
Cited by 11 | Viewed by 7196
Abstract
Moderate Resolution Imaging Spectroradiometer (MODIS) data are effective and efficient for monitoring urban dynamics such as urban cover change and thermal anomalies, but the spatial resolution provided by MODIS data is 500 m (for most of its shorter spectral bands), which results in difficulty in detecting subtle spatial variations within a coarse pixel—especially for a fast-growing city. Given that the historical land use/cover products and satellite data at finer resolution are valuable to reflect the urban dynamics with more spatial details, finer spatial resolution images, as well as land cover products at previous times, are exploited in this study to improve the change detection capability of coarse resolution satellite data. The proposed approach involves two main steps. First, pairs of coarse and finer resolution satellite data at previous times are learned and then applied to generate synthetic satellite data with finer spatial resolution from coarse resolution satellite data. Second, a land cover map was produced at a finer spatial resolution and adjusted with the obtained synthetic satellite data and prior land cover maps. The approach was tested for generating finer resolution synthetic Landsat images using MODIS data from the Guangzhou study area. The finer resolution Landsat-like data were then applied to detect land cover changes with more spatial details. Test results show that the change detection accuracy using the proposed approach with the synthetic Landsat data is much better than the results using the original MODIS data or conventional spatial and temporal fusion-based approaches. The proposed approach is beneficial for detecting subtle urban land cover changes with more spatial details when multitemporal coarse satellite data are available. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figures: graphical abstract; (1) illustration of precise land cover mapping using synthetic Landsat data; (2) sub-pixel change detection test with actual MODIS data for 2002–2004, comparing the proposed approach against mapping with MODIS data alone; (3) sub-pixel change detection with simulated MODIS data at a scaling factor of 16, comparing the proposed approach with the conventional STARFM fusion-based method.
68400 KiB  
Article
Road Segmentation of Remotely-Sensed Images Using Deep Convolutional Neural Networks with Landscape Metrics and Conditional Random Fields
by Teerapong Panboonyuen, Kulsawasd Jitkajornwanich, Siam Lawawirojwong, Panu Srestasathiern and Peerapon Vateekul
Remote Sens. 2017, 9(7), 680; https://doi.org/10.3390/rs9070680 - 1 Jul 2017
Cited by 112 | Viewed by 14238
Abstract
Object segmentation of remotely-sensed aerial (or very-high resolution, VHS) images and satellite (or high-resolution, HR) images, has been applied to many application domains, especially in road extraction in which the segmented objects are served as a mandatory layer in geospatial databases. Several attempts at applying the deep convolutional neural network (DCNN) to extract roads from remote sensing images have been made; however, the accuracy is still limited. In this paper, we present an enhanced DCNN framework specifically tailored for road extraction of remote sensing images by applying landscape metrics (LMs) and conditional random fields (CRFs). To improve the DCNN, a modern activation function called the exponential linear unit (ELU), is employed in our network, resulting in a higher number of, and yet more accurate, extracted roads. To further reduce falsely classified road objects, a solution based on an adoption of LMs is proposed. Finally, to sharpen the extracted roads, a CRF method is added to our framework. The experiments were conducted on Massachusetts road aerial imagery as well as the Thailand Earth Observation System (THEOS) satellite imagery data sets. The results showed that our proposed framework outperformed Segnet, a state-of-the-art object segmentation technique, on any kinds of remote sensing imagery, in most of the cases in terms of precision, recall, and F 1 . Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figures: graphical abstract; (1) process of the proposed framework; (2) proposed ELU-SegNet network architecture; (3) shape index scores on extracted road objects, with objects scoring below 1.25 removed as noise; (4–5) sample aerial images from the Massachusetts road corpus and THEOS satellite images from five Thai provinces (Nakhonpathom, Chonburi, Songkhla, Surin, Ubonratchathani) with ground-truth road maps; (6) training/validation loss and performance on the Massachusetts corpus; (7) sample outputs of ELU-SegNet, ELU-SegNet-LMs and ELU-SegNet-LMs-CRFs on Massachusetts; (8) loss and performance plots on the THEOS data sets; (9–13) sample outputs on the Nakhonpathom, Chonburi, Songkhla, Surin and Ubonratchathani data sets.
2179 KiB  
Article
Nonlinear Classification of Multispectral Imagery Using Representation-Based Classifiers
by Yan Xu, Qian Du, Wei Li, Chen Chen and Nicolas H. Younan
Remote Sens. 2017, 9(7), 662; https://doi.org/10.3390/rs9070662 - 28 Jun 2017
Cited by 8 | Viewed by 5296
Abstract
This paper investigates representation-based classification for multispectral imagery. Due to the small spectral dimension, classification performance may be limited, and, in general, it is difficult to discriminate different classes with multispectral imagery. A nonlinear band generation method with explicit functions is proposed to provide additional spectral information for multispectral image classification. Specifically, we propose a simple band ratio function, which can yield better performance than the nonlinear kernel method with an implicit mapping function. Two representation-based classifiers, the sparse representation classifier (SRC) and the nearest regularized subspace (NRS) method, are evaluated on the nonlinearly generated datasets. Experimental results demonstrate that this dimensionality-expansion approach can outperform the traditional kernel method, offering higher classification accuracy at lower computational cost when classifying multispectral imagery. Full article
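The division-based band generation idea can be illustrated by appending pairwise band ratios to the original bands, as in the NumPy sketch below; the epsilon guard and the restriction to i < j pairs are assumptions, and the paper's adjustment parameter K is not reproduced here.

```python
import numpy as np
from itertools import combinations

def generate_ratio_bands(X, eps=1e-6):
    # X: (n_pixels, n_bands). Append every band ratio B_i / B_j (i < j) as a new feature.
    ratios = [X[:, i] / (X[:, j] + eps) for i, j in combinations(range(X.shape[1]), 2)]
    return np.hstack([X] + [r[:, None] for r in ratios])

X = np.random.rand(100, 6)              # 100 pixels, 6 multispectral bands
print(generate_ratio_bands(X).shape)    # (100, 21): 6 original + 15 ratio bands
```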
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: Framework of the band generation method.
Figure 2: Color-infrared composites for (a) Indian Pines dataset; (b) University of Pavia dataset.
Figure 3: Thematic maps using 110 samples per class for the multispectral Indian Pines dataset with eight classes (and OA values). (a) Ground truth; (b) Training; (c) Original + NRS (0.7492); (d) Original + Multiplication + NRS (0.7781); (e) Original + Division + NRS (0.8159); (f) Original + Multiplication + Division + NRS (0.8124); (g) Original + KNRS (0.7852); (h) Original + KSVM (0.8193).
Figure 4: Thematic maps using 110 samples per class for the multispectral University of Pavia dataset with nine classes (and OA values). (a) Ground truth; (b) Training; (c) Original + NRS (0.7698); (d) Original + Multiplication + NRS (0.7820); (e) Original + Division + NRS (0.7896); (f) Original + Multiplication + Division + NRS (0.7880); (g) Original + KNRS (0.7736); (h) Original + KSVM (0.7981).
Figure 5: Classification on the multispectral dataset generated from the hyperspectral Indian Pines dataset.
Figure 6: Classification on the multispectral dataset generated from the hyperspectral University of Pavia dataset.
Figure 7: Classification accuracy with different λ using NRS and SRC for (a) the multispectral Indian Pines and (b) the multispectral University of Pavia datasets.
Figure 8: Classification on the multispectral Indian Pines dataset using the original plus division-generated bands (original + division) with different adjustment parameter K.
Figure 9: Classification on the multispectral University of Pavia dataset using the original plus division-generated bands (original + division) with different adjustment parameter K.
1515 KiB  
Article
Saliency Analysis via Hyperparameter Sparse Representation and Energy Distribution Optimization for Remote Sensing Images
by Libao Zhang, Xinran Lv and Xu Liang
Remote Sens. 2017, 9(6), 636; https://doi.org/10.3390/rs9060636 - 21 Jun 2017
Cited by 6 | Viewed by 5481
Abstract
In an effort to detect the region-of-interest (ROI) of remote sensing images with complex data distributions, sparse representation based on dictionary learning has been utilized, and has proved able to process high-dimensional data adaptively and efficiently. In this paper, a visual attention model uniting hyperparameter sparse representation with energy distribution optimization is proposed for analyzing saliency and detecting ROIs in remote sensing images. A dictionary learning algorithm based on biological plausibility is adopted to generate the sparse feature space. This method focuses only on a finite set of features, avoiding the considerations of feature complexity and the massive parameter tuning required by other dictionary learning algorithms. In the other part of the model, aimed at obtaining the saliency map, the contribution of each feature is evaluated in the sparse feature space and the coding length of each feature is accumulated. Finally, we calculate the segmentation threshold using the saliency map and obtain the binary mask to separate the ROI from the original images. Experimental results show that the proposed model achieves better performance in saliency analysis and ROI detection for remote sensing images. Full article
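As a rough illustration of the final step, the sketch below binarises a saliency map with a simple mean-based rule; the threshold actually derived from the saliency map in the paper may differ, and the factor k here is purely illustrative.

```python
import numpy as np

def roi_mask(saliency, k=1.5):
    # Normalise the saliency map to [0, 1], then flag pixels above k times the mean.
    s = (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-12)
    return s >= k * s.mean()

sal = np.random.rand(128, 128)
print(roi_mask(sal).mean())   # fraction of pixels kept as ROI
```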
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: The framework of the proposed model.
Figure 2: Region-of-interest (ROI) detection results produced by our model and the other three models: (a) original images; (b) ITTI; (c) FT; (d) frequency domain analysis (FDA); and (e) our model.
Figure 3: The structure of the hyperparameter sparse representation algorithm.
Figure 4: The learned dictionary.
Figure 5: Saliency maps by our proposed model and nine competing models on SPOT 5 images: (a) original images; (b) ground truth; (c) CA; (d) FT; (e) GBVS; (f) ITTI; (g) WT; (h) SR; (i) MFF; (j) SACH; (k) FDA; and (l) ours.
Figure 6: Saliency maps by our proposed model and nine competing models on Google Earth images: (a) original images; (b) ground truth; (c) CA; (d) FT; (e) GBVS; (f) ITTI; (g) WT; (h) SR; (i) MFF; (j) SACH; (k) FDA; and (l) ours.
Figure 7: ROIs extracted by our proposed model and nine competing models on SPOT 5 images: (a) original images; (b) ground truth; (c) CA; (d) FT; (e) GBVS; (f) ITTI; (g) WT; (h) SR; (i) MFF; (j) SACH; (k) FDA; and (l) ours.
Figure 8: ROIs extracted by our proposed model and nine competing models on Google Earth images: (a) original images; (b) ground truth; (c) CA; (d) FT; (e) GBVS; (f) ITTI; (g) WT; (h) SR; (i) MFF; (j) SACH; (k) FDA; and (l) ours.
Figure 9: ROC curves of our proposed model and nine competing models on (a) SPOT 5 and (b) Google Earth images.
Figure 10: AUC of the ROC curves of our proposed model and nine competing models on (a) SPOT 5 and (b) Google Earth images.
Figure 11: Precision, recall and F-measure of ROIs by our proposed model and nine competing models on (a) SPOT 5 and (b) Google Earth images.
Figure 12: ROI compression example for a remote sensing image: (a) reconstructed image; (b) part of the ROI; and (c) part of the background region. From top to bottom, the reconstructed images are at 0.5 bpp and 2.0 bpp, respectively.
20385 KiB  
Article
One-Dimensional Convolutional Neural Network Land-Cover Classification of Multi-Seasonal Hyperspectral Imagery in the San Francisco Bay Area, California
by Daniel Guidici and Matthew L. Clark
Remote Sens. 2017, 9(6), 629; https://doi.org/10.3390/rs9060629 - 20 Jun 2017
Cited by 96 | Viewed by 12272
Abstract
In this study, a 1-D Convolutional Neural Network (CNN) architecture was developed, trained and utilized to classify single-season (summer) and three-season (spring, summer, fall) hyperspectral imagery over the San Francisco Bay Area, California, for the year 2015. For comparison, Random Forests (RF) and Support Vector Machine (SVM) classifiers were trained and tested with the same data. In order to support space-based hyperspectral applications, all analyses were performed with simulated Hyperspectral Infrared Imager (HyspIRI) imagery. Three-season data improved classifier overall accuracy by 2.0% (SVM) and 1.9% (CNN) to 3.5% (RF) over single-season data. The three-season CNN provided an overall classification accuracy of 89.9%, which was comparable to the overall accuracy of 89.5% for SVM. Both the three-season CNN and SVM outperformed RF by over 7% overall accuracy. Analysis and visualization of the inner products of the CNN provided insight into distinctive features within the spectral-temporal domain. A method for CNN kernel tuning was presented to assess the importance of learned features. We conclude that the CNN is a promising candidate for hyperspectral remote sensing applications because of its high classification accuracy and the interpretability of its inner products. Full article
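A minimal PyTorch sketch of a 1-D CNN operating on per-pixel spectra is shown below; the layer sizes and kernel widths are illustrative and do not reproduce the authors' architecture, while the 558 bands and 12 classes follow the three-season setup described in the figures.

```python
import torch
import torch.nn as nn

class Spectral1DCNN(nn.Module):
    # Toy 1-D CNN over a per-pixel spectrum (three seasons x 186 bands = 558 inputs).
    def __init__(self, n_bands=558, n_classes=12):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=11, padding=5), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=11, padding=5), nn.ReLU(), nn.MaxPool1d(2),
        )
        self.classifier = nn.Linear(32 * (n_bands // 4), n_classes)

    def forward(self, x):                 # x: (batch, 1, n_bands)
        return self.classifier(self.features(x).flatten(1))

model = Spectral1DCNN()
print(model(torch.randn(4, 1, 558)).shape)   # torch.Size([4, 12])
```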
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: Study area overview with an AVIRIS-C, 11 June 2015 RGB mosaic of 12 individual flight runs. Reference data are red and cyan points. The inset shows the multi-seasonal image extent with water and cloud mask applied.
Figure 2: Twenty-five randomly selected three-season reflectance spectra from each of the twelve LCCS classes. Note that the x-axis is the band number (1–558) with seasons in spring-summer-fall sequence. Bad bands have been removed within each season (186 bands per season).
Figure 3: Training and testing reference data distributions.
Figure 4: Overview of machine learning classification and accuracy assessment methodology.
Figure 5: Convolutional Neural Network (CNN) flow diagram.
Figure 6: Detailed Convolutional Neural Network (CNN) architecture.
Figure 7: An example convolutional feature map for an Annual Crop three-season spectrum. The example class spectrum is shown in the background. Green, yellow and gray background areas represent bands in the spring, summer and fall, respectively. Each season has 186 bands, ordered 370 to 2500 nm; however, in this figure the original 224 bands per season are shown in order to display the bad data gaps (e.g., atmospheric absorption windows).
Figure 8: Classified land-cover maps for (A) Support Vector Machine; (B) Random Forests; and (C) Convolutional Neural Networks. White areas indicate pixels that were not classified (e.g., water, clouds, no data). (D) Natural color mosaic of imagery from June 2015.
Figure 9: Kernel Importance Matrix, showing the percent change in producer accuracy when zeroing a kernel from the CNN. Class index definitions are found in Table 3.
Figure 10: Convolutional feature maps for each class for the three-season CNN. These feature maps are the result of averaging the convolution of the kernels with 75 spectra per class (Section 2.4.3). An example class spectrum is shown in the background. Green, yellow and gray background areas represent bands in the spring, summer and fall, respectively. Each season has 186 bands, ordered 370 to 2500 nm; however, in this figure the original 224 bands per season are shown in order to display the bad data gaps (e.g., atmospheric absorption windows).
Figure 11: Feature importance for the three-season Random Forests classifier. The nominal spectral profile for ENT is shown for context.
Figure 12: Mini-batch training accuracy curve.
Figure 13: Testing accuracy curve.
1980 KiB  
Article
Convolutional Neural Networks Based Hyperspectral Image Classification Method with Adaptive Kernels
by Chen Ding, Ying Li, Yong Xia, Wei Wei, Lei Zhang and Yanning Zhang
Remote Sens. 2017, 9(6), 618; https://doi.org/10.3390/rs9060618 - 16 Jun 2017
Cited by 51 | Viewed by 8244
Abstract
Hyperspectral image (HSI) classification aims at assigning each pixel a pre-defined class label, which underpins many vision-related applications, such as remote sensing, mineral exploration and ground object identification. Many classification methods have thus been proposed for better hyperspectral imagery interpretation. Witnessing the success of convolutional neural networks (CNNs) in traditional image classification tasks, plenty of efforts have been made to leverage CNNs to improve HSI classification. Some advanced CNN architectures use kernels generated by a clustering method; for example, a K-means network uses K-means to generate its kernels. However, such kernels are often obtained heuristically (e.g., the number of kernels has to be assigned manually), and how to data-adaptively determine the number of convolutional kernels (i.e., filters), and thus generate kernels that better represent the data, is seldom studied in existing CNN-based HSI classification methods. In this study, we propose a new CNN-based HSI classification method where the convolutional kernels are automatically learned from the data through clustering without knowing the cluster number. With those data-adaptive kernels, the proposed CNN method achieves better classification results. Experimental results on the datasets demonstrate the effectiveness of the proposed method. Full article
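The idea of deriving convolutional kernels from clustering without a preset cluster number can be sketched with a plain density-peak (CFSFDP-style) selection of patch centers; the fixed cutoff distance and the rho times delta ranking below are simplifications of the modified CFSFDP used in the paper.

```python
import numpy as np

def density_peak_centers(patches, top_ratio=0.02):
    # patches: (n, d) flattened image patches; returns candidate kernel patches.
    dist = np.linalg.norm(patches[:, None, :] - patches[None, :, :], axis=-1)
    d_c = np.percentile(dist, 2)                        # cutoff distance (assumed)
    rho = (dist < d_c).sum(axis=1) - 1                  # local density
    delta = np.empty(len(patches))                      # distance to nearest denser point
    for i in range(len(patches)):
        higher = np.where(rho > rho[i])[0]
        delta[i] = dist[i].max() if higher.size == 0 else dist[i, higher].min()
    k = max(1, int(top_ratio * len(patches)))
    return patches[np.argsort(rho * delta)[::-1][:k]]   # decision-graph peaks

kernels = density_peak_centers(np.random.rand(200, 100))   # 200 patches of 10 x 10
print(kernels.shape)
```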
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: The flowchart of the MCFSFDP-based CNNs method.
Figure 2: The block (sample) is extracted from image R and the patch is extracted from the block, respectively.
Figure 3: Decision graph of 10,000 patches with a size of 10 × 10 on the Indian Pines dataset.
Figure 4: The curve for determining the adaptive distance with patches of size 10 × 10 on the Indian Pines dataset: (a) the curve of point number over distance δ_v; (b) the curve of quotients of differentials over distance δ_v.
Figure 5: The structure of MCFSFDP-based CNNs.
Figure 6: The Indian Pines scene in Dataset 2: (a) the composite image; (b) the ground truth of the Indian Pines dataset, where the white area denotes unlabeled pixels.
Figure 7: The Pavia University scene in Dataset 3: (a) the composite image; (b) the ground truth of the Pavia University dataset, where the white area denotes unlabeled pixels.
Figure 8: The influence of the number of kernels on classification accuracy: (a) classification accuracy with an increasing number of kernels and different kernel sizes on Dataset 1; (b) classification accuracy with an increasing number of kernels and different kernel sizes on Dataset 2; (c) classification accuracy with an increasing number of kernels on Dataset 3.
6024 KiB  
Article
Road Detection by Using a Generalized Hough Transform
by Weifeng Liu, Zhenqing Zhang, Shuying Li and Dapeng Tao
Remote Sens. 2017, 9(6), 590; https://doi.org/10.3390/rs9060590 - 10 Jun 2017
Cited by 42 | Viewed by 7815
Abstract
Road detection plays a key role in remote sensing image analytics. The Hough transform (HT) is a typical method for road detection, especially for straight-line roads. Although many variants of the Hough transform have been reported, developing a Hough transform algorithm with low computational complexity and short running time remains a great challenge. In this paper, we propose a generalized Hough transform (i.e., Radon transform) implementation for road detection in remote sensing images. Specifically, we present a dictionary learning method to approximate the Radon transform. The proposed approximation treats the Radon transform as a linear transform, which then facilitates parallel implementation of the Radon transform for multiple images. To evaluate the proposed algorithm, we conduct extensive experiments on the popular RSSCN7 database for straight road detection. The experimental results demonstrate that our method is superior to traditional algorithms in terms of accuracy and computational complexity. Full article
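Because the Radon transform is linear, it can be precomputed once as an explicit matrix and then applied to many images by a single matrix product, which is the property the dictionary-learning approximation exploits. A small-scale scikit-image sketch (the image size and angle set are arbitrary):

```python
import numpy as np
from skimage.transform import radon

def radon_matrix(shape, theta):
    # Assemble A column-by-column from impulse images so that A @ image.ravel()
    # equals the Radon transform of the image (linearity of the transform).
    impulse, cols = np.zeros(shape), []
    for k in range(shape[0] * shape[1]):
        impulse.flat[k] = 1.0
        cols.append(radon(impulse, theta=theta, circle=False).ravel())
        impulse.flat[k] = 0.0
    return np.stack(cols, axis=1)                      # (n_bins, n_pixels)

theta = np.linspace(0.0, 180.0, 45, endpoint=False)
A = radon_matrix((32, 32), theta)
images = np.random.rand(10, 32 * 32)                   # ten flattened images
sinograms = images @ A.T                               # all transforms at once
print(A.shape, sinograms.shape)
```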
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: Mapping of P1 and P2 from Cartesian space to the slope-intercept parameter space.
Figure 2: Mapping of P1 and P2 from Cartesian space to the (ρ, θ) parameter space.
Figure 3: Some remote sensing images with straight-road examples from the RSSCN7 dataset.
Figures 4–9: Detection results for the test images a215, a266, b088, g146, a038 and b230, respectively. For each test image: (a) Radon transform of the test image in the two-dimensional parameter space; (b) three-dimensional form of (a); (c) detected line from (b) overlaid on the test image; (d) binary image of the test image; (e) Hough transform of (d); (f) three-dimensional form of (e); (g) detected line from (f); (h) receiver operator curves of the evaluated detection methods; (i) transform image obtained by our method; (j) three-dimensional form of (i); (k) detected line from (j).
1069 KiB  
Article
Geometry-Based Global Alignment for GSMS Remote Sensing Images
by Dan Zeng, Rui Fang, Shiming Ge, Shuying Li and Zhijiang Zhang
Remote Sens. 2017, 9(6), 587; https://doi.org/10.3390/rs9060587 - 10 Jun 2017
Cited by 4 | Viewed by 5293
Abstract
Alignment of latitude and longitude for all pixels is important for geo-stationary meteorological satellite (GSMS) images. To align landmarks and non-landmarks in GSMS images, we propose a geometry-based global alignment method. Firstly, the Global Self-consistent, Hierarchical, High-resolution Geography (GSHHG) database and the GSMS images are expressed as feature maps by geometric coding. According to the geometric and gradient similarity of the feature maps, an initial feature matching is obtained. Then, a local geometric refinement algorithm based on neighborhood spatial consistency is utilized to remove outliers. Since the Earth is not a standard sphere, polynomial fitting models are used to describe the global relationship between latitude, longitude and the coordinates of all pixels in the GSMS images. Finally, with the registered landmarks and polynomial fitting models, the latitude and longitude of each pixel in the GSMS images can be calculated. Experimental results show that the proposed method globally aligns the GSMS images with high accuracy and recall and significantly lower computational complexity. Full article
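The global polynomial fitting step can be sketched as an ordinary least-squares fit of a 2-D polynomial from pixel coordinates to latitude (and analogously longitude); the polynomial order and the synthetic landmarks below are assumptions.

```python
import numpy as np

def fit_poly_model(cols, rows, values, order=2):
    # Fit values ~ sum_{i+j<=order} a_ij * cols^i * rows^j by least squares.
    powers = [(i, j) for i in range(order + 1) for j in range(order + 1 - i)]
    A = np.stack([cols**i * rows**j for i, j in powers], axis=1)
    coeff, *_ = np.linalg.lstsq(A, values, rcond=None)
    predict = lambda c, r: np.stack([c**i * r**j for i, j in powers], axis=1) @ coeff
    return coeff, predict

cols, rows = np.random.rand(2, 50) * 1000              # registered landmark pixels
lat = 10 + 0.01 * cols - 0.02 * rows + 1e-6 * cols * rows
_, predict_lat = fit_poly_model(cols, rows, lat)
print(np.abs(predict_lat(cols, rows) - lat).max())     # fit residual on the landmarks
```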
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: The landmarks and GSMS images in the southern coastal area of Thailand and their initial matching results; the points in circles are outliers. (a) Landmarks; (b) edge probability image; (c) edge binary image; (d) initial matching.
Figure 2: Illustration of mismatched features and many-to-one matched features: (a) mismatched features; (b) many-to-one matched features.
Figure 3: Details of the initial matching and feature refinement in the southern coastal area of Thailand: (a) initial matching; (b) feature refinement; (c) initial matching; and (d) feature refinement.
Figure 4: Performance of local feature matching with different values of K.
Figure 5: Mean precision and recall values of feature refinement with different values of n (the number of candidate matched pairs nearest the seed matched pair): (a) mean precision; (b) mean recall.
Figure 6: Performance of eight algorithms on 25 images. NSCM is competitive with RANSAC, GTM, WGTM, RSOC, KNN-TAR, ISSC and RFVTM in precision, recall and RMSE. (a) Precision; (b) recall; (c) RMSE.
Figure 7: Mean precision, recall and RMSE values of polynomial fitting with different values of m (the order of the polynomial function): (a) mean precision; (b) mean recall; (c) mean RMSE.
Figure 8: Pixel alignment results.
Figure 9: 3D earth.
6795 KiB  
Article
Exploiting Deep Matching and SAR Data for the Geo-Localization Accuracy Improvement of Optical Satellite Images
by Nina Merkle, Wenjie Luo, Stefan Auer, Rupert Müller and Raquel Urtasun
Remote Sens. 2017, 9(6), 586; https://doi.org/10.3390/rs9060586 - 10 Jun 2017
Cited by 117 | Viewed by 12060
Abstract
Improving the geo-localization of optical satellite images is an important pre-processing step for many remote sensing tasks such as monitoring by image time series or scene analysis after sudden events. These tasks require geo-referenced and precisely co-registered multi-sensor data. Images captured by the high-resolution synthetic aperture radar (SAR) satellite TerraSAR-X exhibit an absolute geo-location accuracy within a few decimeters. These images therefore represent a reliable source for improving the geo-location accuracy of optical images, which is on the order of tens of meters. In this paper, a deep learning-based approach for improving the geo-localization accuracy of optical satellite images through SAR reference data is investigated. Image registration between SAR and optical images requires few, but accurate and reliable, matching points. These are derived from a Siamese neural network. The network is trained using TerraSAR-X and PRISM image pairs covering greater urban areas spread over Europe, in order to learn the two-dimensional spatial shifts between optical and SAR image patches. Results confirm that accurate and reliable matching points can be generated with higher matching accuracy and precision than state-of-the-art approaches. Full article
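A toy sketch of Siamese-style matching is given below: two shared-weight convolutional branches produce feature maps, and a dot-product correlation map between them indicates the most likely 2-D shift. The branch depth, channel counts and patch sizes are illustrative and do not reproduce the network trained on TerraSAR-X/PRISM pairs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseMatcher(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared feature extractor applied to both the optical and the SAR patch.
        self.branch = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )

    def forward(self, optical, sar):
        f_opt = self.branch(optical)       # (1, C, h, w), small template
        f_sar = self.branch(sar)           # (1, C, H, W), larger search window
        return F.conv2d(f_sar, f_opt)      # (1, 1, H-h+1, W-w+1) correlation map

matcher = SiameseMatcher()
score = matcher(torch.randn(1, 1, 64, 64), torch.randn(1, 1, 96, 96))
best = torch.nonzero(score[0, 0] == score.max())[0]    # offset of the best match
print(score.shape, best)
```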
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: Visual comparison of an optical (top) and SAR image (bottom) acquired over the same area. Both images have a ground sampling distance of 1.25 m.
Figure 2: Network architecture (left) and a detailed overview of the convolutional layers (right). Abbreviations: convolutional neural network (CNN), convolution (Conv), batch normalization (BN) and rectified linear unit (ReLU).
Figure 3: Visual comparison between optical (a), SAR (b) and despeckled SAR patches (c).
Figure 4: Influence of the speckle filter and comparison of different network architectures during training (all results are generated from the validation set): (a) matching accuracy during training, measured as the percentage of matching points whose L2 distance to the ground truth location is less than or equal to three pixels; (b) average L2 distance between the matching points and the ground truth location during training.
Figure 5: Illustration of the influence of the raw score as a threshold: (a) relation between the predicted score and the number of patches; (b) relation between the number of patches and the matching accuracy; (c) relation between the predicted score and the matching accuracy; and (d) relation between the predicted score and the average L2 distance between the predicted matching points and the ground truth location. The matching accuracy in (b) is measured as the percentage of matching points whose L2 distance to the ground truth location is less than three pixels, and in (c) less than 2, 3 and 4 pixels.
Figure 6: Side-by-side comparison between (a) optical patches (201 × 201 pixels), (b) the score maps of NCC, (c) MI, and (d) our method (51 × 51 pixels), and (e) the reference despeckled SAR patches (251 × 251 pixels).
Figure 7: Checkerboard overlays of two optical and one SAR image with a pixel spacing of 2.5 m and image tile size of 100 × 100 m: (a) the optical image before and (b) after the sensor model adjustment (geo-localization enhancement) through the generated matching points.
5189 KiB  
Article
Multiobjective Optimized Endmember Extraction for Hyperspectral Image
by Rong Liu, Bo Du and Liangpei Zhang
Remote Sens. 2017, 9(6), 558; https://doi.org/10.3390/rs9060558 - 3 Jun 2017
Cited by 18 | Viewed by 5625
Abstract
Endmember extraction (EE) is one of the most important issues in hyperspectral mixture analysis. It is also a challenging task due to the intrinsic complexity of remote sensing images and the lack of prior knowledge. In recent years, a number of EE methods have been developed, where several different optimization objectives have been proposed from different perspectives. In all of these methods, only one objective function is optimized, representing a single specific characteristic of endmembers. However, a single objective function may not be able to express all the characteristics of endmembers from various aspects, and may not be powerful enough to provide satisfactory unmixing results because of the complexity of remote sensing images. In this paper, a multiobjective discrete particle swarm optimization algorithm (MODPSO) is utilized to tackle the problem of EE, where two objective functions, namely, volume maximization (VM) and root-mean-square error (RMSE) minimization, are simultaneously optimized. Experimental results on two real hyperspectral images show the superiority of the proposed MODPSO with respect to the single-objective D-PSO method, although MODPSO still needs further improvement on the optimization of the VM objective with respect to other approaches. Full article
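The multiobjective machinery rests on Pareto dominance; the sketch below shows a dominance test and a non-dominated archive update for two objectives to be minimised (volume inverse and reconstruction RMSE), using made-up objective values.

```python
import numpy as np

def dominates(f_a, f_b):
    # True if f_a is no worse in every objective and strictly better in at least one.
    f_a, f_b = np.asarray(f_a), np.asarray(f_b)
    return bool(np.all(f_a <= f_b) and np.any(f_a < f_b))

def update_archive(archive, candidate):
    # Keep only non-dominated solutions (a sketch of a global best archive update).
    if any(dominates(a, candidate) for a in archive):
        return archive
    return [a for a in archive if not dominates(candidate, a)] + [candidate]

archive = []
for f in [(0.8, 0.10), (0.5, 0.20), (0.6, 0.08), (0.9, 0.30)]:
    archive = update_archive(archive, f)
print(archive)   # approximated Pareto front: [(0.5, 0.2), (0.6, 0.08)]
```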
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: Feasible solutions for minimization optimization. Blue points stand for common solutions, and red points stand for Pareto-optimal solutions.
Figure 2: The update of a particle's personal best position. The blue point is the current pbest of one particle, and the other points are possible locations of the particle at the next time step. The plane can be divided into four parts centered on the pbest. If the particle appears in the area where the purple point is located, the pbest remains unchanged; if it appears in the area where the red point is located, the pbest is updated by the red point; and if it appears in the areas where the cyan points are located, one of them is randomly selected as the pbest.
Figure 3: Selection of the best local guide from the global best archive (GBA) for each particle. The squares stand for GBA members and the circles stand for all particles. The sigma values of all GBA members and particles are calculated and compared; for each particle, the GBA member with the closest sigma value is chosen as its best local guide.
Figure 4: The flowchart of the multiobjective discrete particle swarm optimization (MODPSO) method.
Figure 5: Sub-scene extracted from the Washington DC dataset.
Figure 6: The objective function value as a function of the number of iterations for the Washington DC dataset: (a) the volume inverse; (b) root-mean-square error (RMSE).
Figure 7: The results of the Washington DC image: (a) the Pareto front obtained by MODPSO; (b) comparison of the results of the four methods.
Figure 8: Endmember spectra manually selected from the image and automatically extracted by the four methods for the Washington DC dataset: (a) grass; (b) path; (c) roof; (d) street; (e) tree; (f) water.
Figure 9: The Urban hyperspectral dataset.
Figure 10: The objective function value as a function of the number of iterations for the Urban dataset: (a) the volume inverse; (b) RMSE.
Figure 11: The results of the Urban image: (a) the Pareto front obtained by MODPSO; (b) comparison of the results of the four methods.
Figure 12: Endmember spectra manually selected from the image and automatically extracted by the four methods for the Urban dataset: (a) Road#1; (b) Roof#1; (c) Grass; (d) Tree; (e) Road#2; (f) Roof#2; (g) spectra unmatched with the reference endmembers.
8345 KiB  
Article
Optimized Kernel Minimum Noise Fraction Transformation for Hyperspectral Image Classification
by Lianru Gao, Bin Zhao, Xiuping Jia, Wenzhi Liao and Bing Zhang
Remote Sens. 2017, 9(6), 548; https://doi.org/10.3390/rs9060548 - 1 Jun 2017
Cited by 57 | Viewed by 7728
Abstract
This paper presents an optimized kernel minimum noise fraction transformation (OKMNF) for feature extraction from hyperspectral imagery. The proposed approach is based on the kernel minimum noise fraction (KMNF) transformation, a nonlinear dimensionality reduction method. KMNF can map the original data into a higher-dimensional feature space and provide a small number of quality features for classification and other post-processing. Noise estimation is an important component of KMNF. It is often estimated based on a strong relationship between adjacent pixels. However, hyperspectral images have limited spatial resolution and usually contain a large number of mixed pixels, which makes the spatial information less reliable for noise estimation. This is the main reason that KMNF generally shows unstable performance in feature extraction for classification. To overcome this problem, this paper exploits more accurate noise estimation to improve KMNF. We propose two new noise estimation methods, as well as a framework for improving noise estimation in which both spectral and spatial de-correlation are exploited. Experimental results, conducted using a variety of hyperspectral images, indicate that the proposed OKMNF is superior to some other related dimensionality reduction methods in most cases. Compared to the conventional KMNF, the proposed OKMNF achieves significant improvements in overall classification accuracy. Full article
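Regression-based noise estimation in the spectral-spatial de-correlation spirit can be sketched as follows: within small blocks, each band is regressed on a spatial neighbour and on the two adjacent bands, and the residual spread is taken as the noise level. This is a simplified stand-in for the estimators proposed in the paper.

```python
import numpy as np

def ssdc_noise(cube, block=6):
    # cube: (rows, cols, bands). Returns one noise standard deviation per band.
    rows, cols, bands = cube.shape
    noise = np.zeros(bands)
    for b in range(1, bands - 1):
        residuals = []
        for r0 in range(0, rows - block + 1, block):
            for c0 in range(0, cols - block + 1, block):
                tile = cube[r0:r0 + block, c0:c0 + block, :]
                y = tile[:, 1:, b].ravel()
                X = np.column_stack([np.ones_like(y),
                                     tile[:, :-1, b].ravel(),     # spatial neighbour
                                     tile[:, 1:, b - 1].ravel(),  # previous band
                                     tile[:, 1:, b + 1].ravel()]) # next band
                coef, *_ = np.linalg.lstsq(X, y, rcond=None)
                residuals.append(y - X @ coef)
        noise[b] = np.std(np.concatenate(residuals))
    noise[0], noise[-1] = noise[1], noise[-2]           # copy estimates to edge bands
    return noise

print(ssdc_noise(np.random.rand(30, 30, 10)).round(3))
```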
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: Airborne Visible/Infrared Imaging Spectrometer radiance images used for noise estimation: (a) the first subimage of Jasper Ridge; (b) the second subimage of Jasper Ridge; (c) the first subimage of Low Altitude; (d) the second subimage of Low Altitude; (e) the first subimage of Moffett Field; (f) the second subimage of Moffett Field.
Figure 2: Noise estimation results of spectral and spatial de-correlation (SSDC) for Figure 1a with different sub-block sizes.
Figure 3: Noise estimation results of SSDC, SSDC1 and SSDC2 for Figure 1a with a 6 × 6 sub-block.
Figure 4: Noise estimation results for (a) Figure 1a,b; (b) Figure 1c,d; (c) Figure 1e,f, using the difference of spatial neighborhood (DSN) employed in kernel minimum noise fraction (KMNF), and the SSDC, SSDC1 and SSDC2 employed in the optimized KMNF (OKMNF).
Figure 5: (a) Original Indian Pines image; (b) ground reference map containing nine land-cover classes.
Figure 6: Comparison of accuracies of maximum likelihood (ML) classification after different dimensionality reduction methods.
Figure 7: Parameter tuning in the experiments using the Indian Pines dataset for ML classification after different feature extraction methods (number of features = 8): (a) r versus accuracies; (b) m versus accuracies; (c) s versus accuracies.
Figure 8: The first three features (from top to bottom) of kernel PCA (KPCA), KMNF, OKMNF-SSDC, OKMNF-SSDC1 and OKMNF-SSDC2.
Figure 9: The results of ML classification after different dimensionality reduction methods (number of features = 5).
Figure 10: (a) True color image of the Minamimaki scene; (b) ground reference map with six classes.
Figure 11: Comparison of accuracies of ML classification after different dimensionality reduction methods.
Figure 12: Parameter tuning in the experiments using the Minamimaki dataset for ML classification after different dimensionality reduction methods (number of features = 8): (a) r versus accuracies; (b) m versus accuracies; (c) s versus accuracies.
Figure 13: The results of ML classification after different dimensionality reduction methods (number of features = 3).
10872 KiB  
Article
Hourglass-ShapeNetwork Based Semantic Segmentation for High Resolution Aerial Imagery
by Yu Liu, Duc Minh Nguyen, Nikos Deligiannis, Wenrui Ding and Adrian Munteanu
Remote Sens. 2017, 9(6), 522; https://doi.org/10.3390/rs9060522 - 25 May 2017
Cited by 135 | Viewed by 15003
Abstract
A new convolutional neural network (CNN) architecture for semantic segmentation of high-resolution aerial imagery is proposed in this paper. The proposed architecture follows an hourglass-shaped network (HSN) design, being structured into encoding and decoding stages. By taking advantage of recent advances in CNN designs, we use the composed inception module to replace common convolutional layers, providing the network with multi-scale receptive areas with rich context. Additionally, in order to reduce spatial ambiguities in the up-sampling stage, skip connections with residual units are also employed to feed forward encoding-stage information directly to the decoder. Moreover, overlap inference is employed to alleviate boundary effects occurring when high-resolution images are inferred from small-sized patches. Finally, we also propose a post-processing method based on weighted belief propagation to visually enhance the classification results. Extensive experiments on the Vaihingen and Potsdam datasets demonstrate that the proposed architecture outperforms three reference state-of-the-art network designs both numerically and visually. Full article
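Overlap inference, mentioned above as a remedy for patch-boundary effects, simply averages class scores over overlapping tiles before taking the argmax. A NumPy sketch with illustrative patch and stride sizes (the image dimensions are assumed to be multiples of the stride):

```python
import numpy as np

def overlap_inference(image, predict, patch=256, stride=128, n_classes=6):
    # predict: maps a (patch, patch, channels) tile to (patch, patch, n_classes) scores.
    h, w, _ = image.shape
    scores = np.zeros((h, w, n_classes))
    counts = np.zeros((h, w, 1))
    for r in range(0, h - patch + 1, stride):
        for c in range(0, w - patch + 1, stride):
            scores[r:r + patch, c:c + patch] += predict(image[r:r + patch, c:c + patch])
            counts[r:r + patch, c:c + patch] += 1
    return np.argmax(scores / np.maximum(counts, 1), axis=-1)

dummy = lambda tile: np.random.rand(tile.shape[0], tile.shape[1], 6)
print(overlap_inference(np.random.rand(512, 512, 3), dummy).shape)   # (512, 512) labels
```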
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: Illustration of elementary modules for the convolutional layer: (a) convolutional layer; (b) transposed convolutional layer.
Figure 2: The fully-convolutional network (FCN) [21], SegNet [19] and full patch labeling (FPL) [22] network designs. A, B, C and D are convolutional layers; E is a pooling layer; F is a transposed convolutional layer or unpooling layer (in SegNet); G is a loss layer.
Figure 3: The proposed hourglass-shaped network (HSN) architecture. A and B are convolutional layers; C and D are inception modules; E is the max-pooling layer; F is the transposed convolutional layer; G is the residual module; H is the loss layer.
Figure 4: Composition modules in the proposed HSN architecture: (a) inception module; (b) residual module.
Figure 5: Full tile prediction for tile No. 34. Legend for the Vaihingen dataset: white: impervious surface; blue: buildings; cyan: low vegetation; green: trees; yellow: cars; red: clutter (best viewed in color). (a) Ground truth; (b) HSN; (c) HSN-NS; (d) HSN-NI.
Figure 6: Full tile prediction for tile No. 30, with the Vaihingen legend as in Figure 5 (best viewed in color). (a) TOP, true orthophoto; (b) nDSM, normalized DSM; (c) GT, ground truth labeling; (d–g) the inference results from FCN, SegNet, FPL and HSN, respectively; (h) HSN + WBP, the HSN inference result after WBP post-processing.
Figure 7: Semantic segmentation results for some patches of the Vaihingen dataset, with the legend as in Figure 5 (best viewed in color). Four different tiles from Vaihingen are included: (a) a narrow passage; (b) areas shadowed by trees and buildings; (c) cars in the shadow; and (d) building roofs with depth discontinuities.
Figure 8: Full tile prediction for tile No. 04_12. Legend for the Potsdam dataset: white: impervious surface; blue: buildings; cyan: low vegetation; green: trees; yellow: cars; red: clutter (best viewed in color). (a) TOP, true orthophoto; (b) nDSM, normalized DSM; (c) GT, ground truth labeling; (d–g) the inference results from FCN, SegNet, FPL and HSN, respectively; (h) HSN + WBP, the HSN inference result after WBP post-processing.
Figure 9: Semantic segmentation results for some patches of the Potsdam dataset, with the legend as in Figure 8 (best viewed in color). Four tiles from Potsdam are included: (a) buildings with backyards; (b) parking lot; (c) rooftops; and (d) low vegetation areas.
2678 KiB  
Article
Hypergraph Embedding for Spatial-Spectral Joint Feature Extraction in Hyperspectral Images
by Yubao Sun, Sujuan Wang, Qingshan Liu, Renlong Hang and Guangcan Liu
Remote Sens. 2017, 9(5), 506; https://doi.org/10.3390/rs9050506 - 22 May 2017
Cited by 30 | Viewed by 8492
Abstract
The fusion of spatial and spectral information in hyperspectral images (HSIs) is useful for improving classification accuracy. However, this approach usually results in features of higher dimension, and the curse of dimensionality may arise from the small ratio between the number of training samples and the dimensionality of the features. To ease this problem, we propose a novel algorithm for spatial-spectral feature extraction based on hypergraph embedding. Firstly, each HSI pixel is regarded as a vertex and the joint of extended morphological profiles (EMP) and spectral features is adopted as the feature associated with the vertex. A hypergraph is then constructed by the K-nearest-neighbor method, in which each pixel and its K most relevant pixels are linked as one hyperedge to represent the complex relationships between HSI pixels. Secondly, a hypergraph embedding model is designed to learn low-dimensional features while preserving the geometric structure of the HSI. An adaptive hyperedge weight estimation scheme is also introduced to preserve the prominent hyperedges through a regularization constraint on the weights. Finally, the learned low-dimensional features are fed to a support vector machine (SVM) for classification. Experimental results on three benchmark hyperspectral databases are presented. They highlight the importance of joint spatial-spectral feature embedding for accurate classification of HSI data, and show that the adaptive weight estimation further improves classification accuracy, verifying the proposed method. Full article
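The KNN hypergraph construction can be sketched by forming one hyperedge per pixel (the pixel plus its K nearest neighbours) and building the standard normalised hypergraph Laplacian; unit hyperedge weights are assumed below, whereas the paper estimates the weights adaptively during embedding.

```python
import numpy as np

def knn_hypergraph_laplacian(X, k=5):
    # X: (n, d) feature vectors (e.g., stacked EMP + spectral features per pixel).
    n = X.shape[0]
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    H = np.zeros((n, n))                              # incidence matrix: vertices x hyperedges
    for e in range(n):
        H[np.argsort(d[e])[:k + 1], e] = 1.0          # hyperedge e = vertex e and its k NN
    w = np.ones(n)                                    # hyperedge weights (unit here)
    Dv, De = H @ w, H.sum(axis=0)                     # vertex and hyperedge degrees
    Dv_is = np.diag(1.0 / np.sqrt(Dv))
    Theta = Dv_is @ H @ np.diag(w / De) @ H.T @ Dv_is
    return np.eye(n) - Theta                          # normalised hypergraph Laplacian

L = knn_hypergraph_laplacian(np.random.rand(40, 8))
print(L.shape, np.allclose(L, L.T))
```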
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: The flowchart of the proposed method.
Figure 2: Example of a graph and a hypergraph: (a) simple graph, where each edge consists of only two data points; (b) hypergraph G, where each hyperedge is marked by an ellipse and consists of at least two data points; (c) taking the seven vertices as an example, H is the incidence matrix of G, whose values are usually binary.
Figure 3: Indian Pines: (a) three-channel color composite image with bands 65, 52, 36; (b,c) ground-truth map and class labels; (d–i) classification maps of PCA, EMP, EMPSpe, SH, SSHG and SSHG*, respectively.
Figure 4: Pavia University: (a) three-channel color composite image with bands 102, 56, 31; (b,c) ground-truth map and class labels; (d–i) classification maps of PCA, EMP, EMPSpe, SH, SSHG and SSHG*, respectively.
Figure 5: Botswana: (a) three-channel color composite image with bands 65, 52, 36; (b,c) ground-truth map and class labels; (d–i) classification maps of PCA, EMP, EMPSpe, SH, SSHG and SSHG*, respectively.
Figure 6: Effects of the number K of nearest neighbors on OA: (a) Indian Pines; (b) Pavia University; (c) Botswana.
Figure 7: Effects of the reduced dimensions: (a) Indian Pines; (b) Pavia University; (c) Botswana.
42450 KiB  
Article
Learning Dual Multi-Scale Manifold Ranking for Semantic Segmentation of High-Resolution Images
by Mi Zhang, Xiangyun Hu, Like Zhao, Ye Lv, Min Luo and Shiyan Pang
Remote Sens. 2017, 9(5), 500; https://doi.org/10.3390/rs9050500 - 19 May 2017
Cited by 50 | Viewed by 11102
Abstract
Semantic image segmentation has recently witnessed considerable progress through the training of deep convolutional neural networks (CNNs). The core issue of this technique is the limited capacity of CNNs to depict visual objects. Existing approaches tend to utilize approximate inference in a discrete domain or additional aids, and do not have a global optimum guarantee. We propose the use of the multi-label manifold ranking (MR) method to solve the linear objective energy function in a continuous domain to delineate visual objects and address these problems. We present a novel embedded single-stream optimization method based on the MR model that avoids approximations without sacrificing expressive power. In addition, we propose a novel network, which we refer to as the dual multi-scale manifold ranking (DMSMR) network, that combines dilated, multi-scale strategies with the single-stream MR optimization method in a deep learning architecture to further improve performance. Experiments on high-resolution images, including close-range and remote sensing datasets, demonstrate that the proposed approach can achieve competitive accuracy without additional aids in an end-to-end manner. Full article
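Manifold ranking with a linear objective admits a closed-form solution, which is the property the embedded single-stream optimization builds on. A generic NumPy sketch is given below; the Gaussian affinity and the alpha and sigma values are illustrative choices, not the graph construction used in the DMSMR network.

```python
import numpy as np

def manifold_ranking(features, Y, alpha=0.99, sigma=1.0):
    # features: (n, d) descriptors; Y: (n, c) initial per-class scores (e.g., CNN outputs).
    d2 = np.sum((features[:, None] - features[None, :]) ** 2, axis=-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    D_is = np.diag(1.0 / np.sqrt(W.sum(axis=1)))
    S = D_is @ W @ D_is                                    # normalised affinity
    return np.linalg.solve(np.eye(len(W)) - alpha * S, Y)  # F = (I - alpha * S)^(-1) Y

feats, Y = np.random.rand(50, 16), np.random.rand(50, 3)
print(manifold_ranking(feats, Y).shape)                    # (50, 3) refined scores
```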
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: Dual multi-scale manifold ranking (DMSMR) network overview. For each dilated convolutional layer, a non-dilated convolution layer is applied following the pooling layer in each scale. The dilated and non-dilated convolution layers form a dual layer, in which the corresponding layers are optimized with the embedded feedforward single-stream manifold ranking network. The scale factor is implicitly represented by the pooling layer in each block. Figure 2 illustrates how to embed the manifold ranking optimization method into the single-stream network (marked with orange color in this figure). The optimized outputs of each scale, that is, the $\hat{\mathbf{F}}_l$ generated at each scale, are combined by Equation (17).
Figure 2: The embedded feedforward single-stream manifold ranking optimization network. The output of the convolutional features, upsampled to full image resolution for each class (such as road, sky and building within the CamVid dataset [68,69] depicted in the figure), serves as the initial manifold ranking score $\tilde{\mathbf{F}}^*$ to be optimized. By applying the feedforward MR inference with the contextual information extracted from the input image, the optimal MR score $\hat{\mathbf{F}}$ of each class can be obtained by Equation (10). The only requirement for the proposed network is the multi-label neighborhood relationship, which is designed for constructing the Laplacian matrix $\tilde{\mathbf{L}}$ in a single stream rather than the unary and pairwise streams presented in [26,29].
Figure 3: Several semantic segmentation results on PASCAL VOC 2012 validation images. DMSMR: semantic segmentation result predicted by the dual multi-scale manifold ranking network. GT: ground truth.
Figure 4: Semantic segmentation results on CamVid images. DMSMR: semantic segmentation result predicted by the dual multi-scale manifold ranking network. GT: ground truth.
Figure 5: Accuracy analysis with respect to boundaries on the CamVid dataset. (a) Trimap visualization on the CamVid dataset. Top-left: source image. Top-right: ground truth. Bottom-left: trimap with a one-pixel band width. Bottom-right: trimap with a three-pixel band width. (b) Pixel mIoU with respect to band width around object boundaries. We measure the relationship of our model before and after employing the multi-scale (MS), dilated convolution (Dilated), single-stream manifold ranking (MR-Opti) and joint (DMSMR) strategies.
Figure 6: Visualization of the comparative results on a few Vaihingen testing images (tile numbers 2, 4, 6 and 8). For each image, we generate the dense prediction results and corresponding error maps (red/green image) with different approaches.
Figure 7: Semantic segmentation results with different strategies on the EvLab-SS validation patches. Four kinds of image patches with different spatial resolutions and illuminations are depicted in the figure. The first and second rows are GeoEye and WorldView-2 satellite images with resampled GSDs of 0.5 m and 0.2 m. The third and last rows are aerial images with resampled GSDs of 0.25 m and 0.1 m, respectively. MS: predictions with the multi-scale approach. MR-Opti: semantic segmentation results using the manifold ranking optimization method. DMSMR: segmentation result predicted by the dual multi-scale manifold ranking network. GT: ground truth.
Figure 8: Accuracy analysis with respect to boundaries on the EvLab-SS dataset. (a) Visualization of the trimap for the EvLab-SS dataset. Top-left: source patch. Top-right: ground truth. Bottom-left: trimap with a one-pixel band width. Bottom-right: trimap with a three-pixel band width. (b) Pixel mIoU with respect to band width around object boundaries. We measure the relationship for our model before and after employing the multi-scale (MS), dilated convolution (Dilated), single-stream manifold ranking (MR-Opti) and joint (DMSMR) strategies on the EvLab-SS dataset.
Figure 9: The architectures of the networks with different strategies: (a) convolutional networks before employing the strategies (Before); (b) networks using the multi-scale strategy (MS); (c) networks using the dilated method (Dilated); (d) networks using manifold ranking optimization (MR-Opti).
Article
Classification for High Resolution Remote Sensing Imagery Using a Fully Convolutional Network
by Gang Fu, Changjun Liu, Rong Zhou, Tao Sun and Qijian Zhang
Remote Sens. 2017, 9(5), 498; https://doi.org/10.3390/rs9050498 - 18 May 2017
Cited by 320 | Viewed by 21238
Abstract
As a variant of Convolutional Neural Networks (CNNs) in deep learning, the Fully Convolutional Network (FCN) model has achieved state-of-the-art performance for natural image semantic segmentation. In this paper, an accurate classification approach for high-resolution remote sensing imagery based on an improved FCN model is proposed. Firstly, we improve the density of the output class maps by introducing atrous convolution; secondly, we design a multi-scale network architecture by adding a skip-layer structure to make it capable of multi-resolution image classification. Finally, we further refine the output class map using Conditional Random Fields (CRFs) post-processing. Our classification model is trained on 70 GF-2 true-color images and tested on four other GF-2 images and three IKONOS true-color images. We also apply object-oriented classification, patch-based CNN classification, and the FCN-8s approach to the same images for comparison. The experiments show that, compared with the existing approaches, our approach achieves a clear improvement in accuracy; its average precision, recall, and Kappa coefficient are 0.81, 0.78, and 0.83, respectively. The experiments also show that our approach has strong applicability for multi-resolution image classification. Full article
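The key densification step mentioned in the abstract is atrous (dilated) convolution, which enlarges the receptive field without downsampling. The small PyTorch sketch below illustrates the idea only; the layer sizes are illustrative, not the paper's configuration.

```python
# Sketch: an ordinary 3x3 convolution vs. an atrous convolution with rate r = 2.
import torch
import torch.nn as nn

x = torch.randn(1, 3, 256, 256)                                     # dummy RGB tile
ordinary = nn.Conv2d(3, 64, kernel_size=3, padding=1)               # rate r = 1
atrous   = nn.Conv2d(3, 64, kernel_size=3, padding=2, dilation=2)   # rate r = 2
# Both preserve the 256 x 256 spatial size, but the dilated kernel covers a 5 x 5 window,
# which is how denser class maps are obtained without extra pooling.
print(ordinary(x).shape, atrous(x).shape)   # torch.Size([1, 64, 256, 256]) twice
```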
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: The general pipeline of our approach: the training stage and the classification stage are illustrated in the upper and lower parts, respectively.
Figure 2: Network architectures for standard Convolutional Neural Network (CNN) and Fully Convolutional Network (FCN). (a) Architecture of standard CNN: stacks of convolutional-pooling layers and fully connected (FC) layers. Given an image, the distribution over classes is predicted; the class with the largest distribution value is considered as the class of the given image. (b) Architecture of FCN: FC layers are replaced by convolutional layers. FCN maintains the 2-D structure of the image.
Figure 3: "Atrous" convolutions with r = 1, 2, and 3. The first convolution (r = 1) is actually the ordinary convolution.
Figure 4: Illustration of atrous convolution for dense feature map generation. Red route: standard convolution performed on a low resolution feature map. Blue route: dense feature map generated using atrous convolution with rate r = 2 on a high resolution input feature map.
Figure 5: Multi-scale network architecture.
Figure 6: Three sample examples for our classification training. (a) Original images; (b) ground truth (GT) labels corresponding to the images in (a).
Figure 7: General procedure of network training.
Figure 8: Softmax function performed on the output feature map.
Figure 9: General procedure of image classification using the trained network.
Figure 10: General procedure of our patch-based CNN classification experiment.
Figure 11: Classification results on GF-2 images (Experiment A). (a) Original images; (b) GT labels corresponding to the images in (a); (c-e) results of the MR-SVM object-oriented classification, patch-based CNN classification, and FCN-8s classification corresponding to the images in (a), respectively; (f) our classification results corresponding to the images in (a).
Figure 12: Classification results on IKONOS images (Experiment B). (a) Original images; (b) GT labels corresponding to the images in (a); (c-e) results of the MR-SVM object-oriented classification, patch-based CNN classification, and FCN-8s classification corresponding to the images in (a), respectively; (f) our classification results corresponding to the images in (a).
Figure 13: Incorrect image object generated by MR segmentation. (a) Original images; (b) GT labels corresponding to the images in (a); (c) incorrect image object covers both the building and cement ground (with yellow boundary).
Figure 14: Heat map for the building generated by patch-based CNN and our approach. (a) Original images; (b) heat map generated by patch-based CNN classification using 128 × 128 patches; (c) heat map generated by the FCN model.
Figure 15: Detail comparison between FCN-8s and our approach. (a) Original images; (b) classification result from FCN-8s; (c) classification result from our approach.
Article
Multi-Scale Analysis of Very High Resolution Satellite Images Using Unsupervised Techniques
by Jérémie Sublime, Andrés Troya-Galvis and Anne Puissant
Remote Sens. 2017, 9(5), 495; https://doi.org/10.3390/rs9050495 - 18 May 2017
Cited by 7 | Viewed by 7671
Abstract
This article is concerned with the use of unsupervised methods to process very high resolution satellite images with minimal human intervention. In a context where increasingly complex and very high resolution satellite images are available, it has become difficult to propose learning sets for supervised algorithms to process such data, and even more complicated to process them manually. Within this context, we propose a fully unsupervised step-by-step method to process very high resolution images, making it possible to link clusters to the land cover classes of interest. For each step, we discuss the various challenges and state-of-the-art algorithms needed to make the full process as efficient as possible. In particular, one of the main contributions of this article is a multi-scale analysis clustering algorithm that we use during the processing of the image segments. The proposed methods are tested on a very high resolution image (Pléiades) of the urban area around the French city of Strasbourg and show relevant results at each step of the process. Full article
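The overall segment-then-cluster idea can be illustrated with a generic sketch: over-segment the image, describe each segment with simple statistics, and cluster the segments without supervision. The paper's own multi-scale SR-ICM clustering is more elaborate; SLIC and KMeans below are stand-ins, and the parameter values are assumptions.

```python
# Generic sketch of unsupervised segment clustering (not the paper's SR-ICM algorithm).
import numpy as np
from skimage.segmentation import slic
from sklearn.cluster import KMeans

def unsupervised_landcover(image, n_segments=5000, n_clusters=6):
    """image: (H, W, 3) float array in [0, 1]."""
    segments = slic(image, n_segments=n_segments, compactness=10)   # over-segmentation
    labels = np.unique(segments)
    # Mean spectral value per segment as a (very) simple segment descriptor.
    feats = np.stack([image[segments == s].mean(axis=0) for s in labels])
    cluster_of_segment = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats)
    # Map each pixel back to the cluster of its segment.
    lut = dict(zip(labels, cluster_of_segment))
    return np.vectorize(lut.get)(segments)
```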
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: Step-by-step approach to image processing.
Figure 2: Examples of over-segmentation and under-segmentation. (a) Example of an over-segmentation of two houses that could be fixed during the clustering step: the algorithm may still detect that these two segments are part of the same cluster; (b) example of an under-segmentation where the white object in the middle of the lake was not detected during the segmentation step and never will be, since it is now merged with a lake segment.
Figure 3: Illustration of the MRF clustering problem with very few features: in this example, we try to guess the cluster of the central segment based on five features and the clusters of its neighbor segments (identified by their colors).
Figure 4: Example of an affinity matrix: diagonal values indicate whether the clusters form compact areas (high value) or are scattered elements in the image (low value). Non-diagonal elements indicate which clusters are often neighbors in the image (high value) or incompatible neighbors (low value).
Figure 5: (Left) the metropolitan area of Strasbourg (Spotimage ©CNES, 2012); (right) extract of the pan-sharpened Pléiades image (Airbus ©CNES, 2012).
Figure 6: Expert classes (a) and hierarchical classes retained for the experiments (b).
Figure 7: Example of reference data from geographic information systems (GIS). (a) GIS labeled data; (b) contours of the GIS polygons.
Figure 8: Expert classes in grey (right) and hierarchical clusters extracted from the confusion matrices Ω found by our proposed method (left): plain arrows highlight strong links, dashed arrows mild links and dotted arrows weak links. The arrows and characters in red highlight potentially harmful errors in the clusters or their hierarchy when compared with the expected classes.
Figure 9: Original image (extract), reference data images and results using different algorithms looking for six clusters. (a) Original image, Pléiades ©Airbus, CNES 2012; (b) reference data ©EMS 2012: raw polygons; (c) hybrid reference data; (d) multi-scale SR-ICM at the six-cluster scale; (e) SOM algorithm [4] with six clusters; (f) EM algorithm with six clusters.
Figure 10: Original image (extract), reference data and our algorithm at scales of six and 10 clusters. (a) Original image, Pléiades ©Airbus, CNES 2012; (b) hybrid reference data; (c) multi-scale SR-ICM at the six-cluster scale; (d) multi-scale SR-ICM at the 10-cluster scale.
Article
Cost-Effective Class-Imbalance Aware CNN for Vehicle Localization and Categorization in High Resolution Aerial Images
by Feimo Li, Shuxiao Li, Chengfei Zhu, Xiaosong Lan and Hongxing Chang
Remote Sens. 2017, 9(5), 494; https://doi.org/10.3390/rs9050494 - 18 May 2017
Cited by 21 | Viewed by 7609
Abstract
Joint vehicle localization and categorization in high-resolution aerial images can provide useful information for applications such as traffic flow structure analysis. To retain sufficient features for recognizing small-scale vehicles, a regions with convolutional neural network features (R-CNN)-like detection structure is employed. In this setting, cascaded localization error can be averted by treating the negatives and the differently typed positives equally as a multi-class classification task, but the problem of class imbalance remains. To address this issue, a cost-effective network extension scheme is proposed, in which the correlated convolution and connection costs incurred during extension are reduced by feature map selection and bipartite main-side network construction, realized with the assistance of a novel feature map class-importance measurement and a new class-imbalance-sensitive main-side loss function. Using an image classification dataset built from traditional real-colored aerial images with a 0.13 m ground sampling distance, taken from a height of 1000 m by an imaging system composed of non-metric cameras, the effectiveness of the proposed network extension is verified by comparison with similarly shaped strong counterparts. Experiments show equivalent or better performance while requiring the smallest parameter and memory overheads. Full article
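The paper handles class imbalance with a dedicated main-side loss; as a much simpler illustration of imbalance-aware training, the sketch below weights a standard cross-entropy loss by inverse class frequency. The weighting scheme and the class counts are assumptions, not the authors' formulation.

```python
# Simple stand-in for imbalance-aware training: frequency-weighted cross-entropy.
import torch
import torch.nn as nn

def imbalance_aware_loss(class_counts):
    """class_counts: number of training samples per class (e.g., negatives plus vehicle types)."""
    counts = torch.tensor(class_counts, dtype=torch.float32)
    weights = counts.sum() / (len(counts) * counts)   # rare classes receive larger weights
    return nn.CrossEntropyLoss(weight=weights)

# Usage with hypothetical counts for 5 classes (negative, sedan, station wagon, working truck, bus):
# criterion = imbalance_aware_loss([50000, 4000, 800, 300, 150])
# loss = criterion(logits, targets)
```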
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: A typical convolutional neural network (CNN) structure, with feature and difference maps produced by the forward and backward propagations. SW: station wagon; WT: working truck.
Figure 2: Illustration of the semantic meaning of the convolutional kernels. The raw input image is displayed in the Raw Image column; the six feature maps produced by six different kernels at the CONV5 layer are shown in the Feature Map column; and six arrays of local image crops on which the top six feature map activations are produced are shown in the Top Activation Image Crops column.
Figure 3: The general structure of the proposed network enhancement method.
Figure 4: The first-order term of the Taylor expansion in Equation (8). $\partial P(y=i\,|\,\mathbf{Z}^{(k-1)})/\partial Z_q^{(k-1)}$ denotes the feature map difference, with positive, negative, and zero values marked in green, red, and black.
Figure 5: Correlations of the max-activations and class-importance with the class probability of the negative class. (a) Max-activation vs. class probability; (b) max class-importance vs. class probability.
Figure 6: Scatter plots showing the distribution of the feature maps $\{Z_q\}$ from CONV3 and CONV4 in the class-importance vs. max-activation space. (a) The distributions of the CONV3 and CONV4 feature maps; (b) feature maps correlated to the five classes by the class-importance measurement.
Figure 7: (a) The 50 selected maps for $N_{sel}=64$; (b) the 109 selected maps for $N_{sel}=160$.
Figure 8: Principal structure of the class-imbalance aware Main-Side Network.
Figure 9: The t-distributed stochastic neighbor embedding (t-SNE) based visualization [73] of the negatives and vehicle types in the FC8 output space, and the three penalization modes used for B: (a) global, (b) local, and (c) batch-wise.
Figure 10: (a) A typical frame from the training sample; (b1-b4) typical difficult detection cases; (c) the close-to-vehicle region (shaded blue) and categorical sampling positions.
Figure 11: The sample categories used in the three regions: Centered, Close Range, and Far Range.
Figure 12: Three typical extension schemes. (a) Plain extension with blank-kernel generated feature maps; (b) plain extension with selected feature maps; (c) Main-Side bi-parted extension with selected feature maps.
Figure 13: The five network structures studied in the experimental section. (a) The baseline miniature visual geometry group network (VGG-M) (Orig.M) and (b) the 16-layer VGG (Orig.16); the comparative extensions with either (c,d) the softmax loss (New Ext., Select Ext.) or (e,f) the proposed Main-Side loss (New S-Ext., Select S-Ext.).
Figure 14: Network classification performance improvement illustrated on the established classification dataset. (a) Newly recognized positives after extension; (b) prediction accuracies and the increments on the sample categories: Centered (Cent.), Close Range (Close), and Far Range (Far).
Figure 15: Overall performance comparisons between Orig.M, New Ext. and Select Ext. under different extension sizes. (a) The averaged F1 scores; (b) the averaged accuracies. Instances where Select Ext. is comparable to New Ext. are marked by arrows.
Figure 16: Efficiency comparison of extended feature maps (kernels). $N_{pos}$ is the quantity of all vehicles. Selected feature maps (kernels) are more effective for small extensions and minority classes.
Figure 17: Influence of the coefficient λ and the ReLU constraint on the overall accuracy and F1 score in the three modes. (a) The averaged accuracies; (b) the averaged F1 scores.
Figure 18: Influence of the penalization mode and the coefficient λ on the accuracy and F1 score for different vehicle types. $N_{pos}$ is the quantity of positives, i.e., all the vehicles.
Article
Automatic Color Correction for Multisource Remote Sensing Images with Wasserstein CNN
by Jiayi Guo, Zongxu Pan, Bin Lei and Chibiao Ding
Remote Sens. 2017, 9(5), 483; https://doi.org/10.3390/rs9050483 - 15 May 2017
Cited by 21 | Viewed by 8299
Abstract
In this paper, a non-parametric model based on a Wasserstein CNN is proposed for color correction. It is suitable for large-scale remote sensing image preprocessing from multiple sources under various viewing conditions, including illumination variations, atmospheric disturbances, and different sensor and aspect angles. Color correction aims to alter the color palette of an input image to match a standard reference that does not suffer from the mentioned disturbances. Most current methods depend heavily on the similarity between the inputs and the references with respect to both content and imaging conditions, such as illumination and atmospheric conditions, and segmentation is usually necessary to alleviate color leakage at the edges. Different from previous studies, the proposed method matches the color distribution of the input dataset with the references in a probabilistic optimal transportation framework. Multi-scale features are extracted from the intermediate layers of a lightweight CNN model and are utilized to infer the undisturbed distribution. The Wasserstein distance is used in the cost function to measure the discrepancy between two color distributions. The advantage of the method is that no registration or segmentation is needed, benefiting from the local texture processing capability of CNN models. Experimental results demonstrate that the proposed method is effective when the input and reference images are from different sources, at different resolutions, and under different illumination and atmospheric conditions. Full article
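For one-dimensional color distributions, the Wasserstein (earth mover's) distance reduces to the integral of the absolute difference between the two cumulative distributions, which is the kind of discrepancy the cost function above measures. A minimal sketch follows; the bin layout is an assumption.

```python
# Minimal sketch: 1-D Wasserstein distance between two grayscale histograms.
import numpy as np

def wasserstein_1d(hist_a, hist_b, bin_width=1.0):
    """hist_a, hist_b: histograms over the same gray-level bins (need not be normalised)."""
    pa = hist_a / hist_a.sum()
    pb = hist_b / hist_b.sum()
    cdf_a, cdf_b = np.cumsum(pa), np.cumsum(pb)
    return np.abs(cdf_a - cdf_b).sum() * bin_width

# Usage (assumed 8-bit images of the same scene under different illumination):
# h_in, _  = np.histogram(img_in,  bins=256, range=(0, 255))
# h_ref, _ = np.histogram(img_ref, bins=256, range=(0, 255))
# print(wasserstein_1d(h_in, h_ref))
```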
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: Color discrepancy in remote sensing images. (a,b) Digital Globe images on different dates from Google Earth; (c,d) Digital Globe (bottom, right) and NASA (National Aeronautics and Space Administration) Copernicus (top, left) images on the same date from Google Earth; (e) GF1 (Gaofen-1) images from different sensors, same area and date.
Figure 2: Matching algorithms of "scheme A" take both the input and the reference in the form of histograms. As this scheme is not content related, two similar distributions with different contexts could not be mapped to their corresponding references with one unified mapping.
Figure 3: Matching algorithms of "scheme B" take both the input and the reference in the form of images. Similar distributions could be mapped to different corresponding references, as the scheme is content based. However, the same grayscales could be mapped to different grayscales when they are in different contexts, violating Property 1.
Figure 4: Matching algorithms of "scheme C" take images as inputs and histograms as references. Similar distributions could be mapped to different corresponding references, as the scheme is content related.
Figure 5: Calculation method of the Wasserstein distance between the inferred histograms and the ground-truth reference. Step 1: stack the histograms on the frequency axis; Step 2: subtract the stacked histograms and integrate with respect to the cumulative frequency.
Figure 6: Structure of the "fire module" in the Squeeze-net.
Figure 7: Structure of the proposed model.
Figure 8: Color transforming curves in the random augmentation process.
Figure 9: Results of matching the color palette of GF1 to GF2. Bars: histograms of input patches; solid colored lines: predicted histograms of our model; dashed black lines: histograms of reference images; from top to bottom: histograms of images of the same area, but under different illumination and atmospheric conditions.
Figure 10: Color matching results of GF1 and GF2. From top to bottom: satellite images of the same area, but under different illumination and atmospheric conditions; left: input images; middle: output images with the predicted color palette; right: reference images, needed only in the training process to calculate the loss function. When fully trained, the model is able to infer the corrected color palette based on the content of the input images in the absence of a reference.
Figure 11: Two one-dimensional uniform distributions.
Figure 12: Comparisons between color matching methods.
Figure 13: Boxplots of L1-norm distances between the processed images and the ground truth with respect to (left) ORB, (middle) SIFT, and (right) BRISK feature descriptors. The distances represent the dissimilarity between the processed results and the ground truth (the smaller the better). There are five horizontal line segments in each patch, indicating five percentiles of the distances within the images processed by the corresponding method; from top to bottom: the maximum (worst) distance, the worst-25% distance, the median distance, the best-25% distance, and the minimum (best) distance.
Article
Hyperspectral Target Detection via Adaptive Joint Sparse Representation and Multi-Task Learning with Locality Information
by Yuxiang Zhang, Ke Wu, Bo Du, Liangpei Zhang and Xiangyun Hu
Remote Sens. 2017, 9(5), 482; https://doi.org/10.3390/rs9050482 - 14 May 2017
Cited by 22 | Viewed by 7333
Abstract
Target detection from hyperspectral images is an important problem that faces a critical challenge: simultaneously reducing spectral redundancy and preserving discriminative information. Recently, the joint sparse representation and multi-task learning (JSR-MTL) approach was proposed to address this challenge. However, it does not fully explore the prior class label information of the training samples or the difference between the target dictionary and the background dictionary when constructing the model. Besides, estimation bias may exist in the unknown coefficient matrix because the minimization employed is usually inconsistent in variable selection. To address these problems, this paper proposes an adaptive joint sparse representation and multi-task learning detector with locality information (JSRMTL-ALI). The proposed method has the following capabilities: (1) it takes full advantage of the prior class label information to construct an adaptive joint sparse representation and multi-task learning model; (2) it exploits the great difference between the target dictionary and the background dictionary with different regularization strategies in order to better encode the task relatedness; (3) it applies locality information by imposing an iterative weight on the coefficient matrix in order to reduce the estimation bias. Extensive experiments were carried out on three hyperspectral images, and JSRMTL-ALI was found to generally show better detection performance than the other target detection methods. Full article
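The multiple detection tasks come from splitting the spectral bands into groups (the band cross-grouping of Figure 1 below). A tiny sketch of one plausible grouping, a plain interleave so that every task spans the whole spectrum, is given here; the interleaving and the value of K are assumptions, not necessarily the authors' exact scheme.

```python
# Sketch: interleaved band grouping to form K detection tasks from one HSI cube.
import numpy as np

def cross_group_bands(hsi_cube, K=4):
    """hsi_cube: (rows, cols, bands). Returns a list of K band-subset cubes, one per task."""
    return [hsi_cube[:, :, k::K] for k in range(K)]

# Usage: tasks = cross_group_bands(np.random.rand(100, 100, 200), K=4)
# -> four cubes of 50 bands each, each feeding one joint-sparse-representation task.
```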
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: Illustration of the band cross-grouping strategy for the multiple detection tasks. HSI = hyperspectral image.
Figure 2: Schematic illustration of the adaptive joint sparse representation and multi-task learning detector with locality information (JSRMTL-ALI) algorithm.
Figure 3: The AVIRIS dataset.
Figure 4: The Indian dataset.
Figure 5: The Cri dataset.
Figure 6: Receiver operating characteristic (ROC) curves for the effectiveness investigation of the JSRMTL-ALI model.
Figure 7: Detection performance of JSRMTL-ALI versus the detection task number K.
Figure 8: Detection performance of JSRMTL-ALI versus the parameter ρ.
Figure 9: Detection performance of JSRMTL-ALI versus the size of the outer window region (OWR).
Figure 10: Detection performance of the eight detectors for the three datasets.
Figure 11: The separability maps of the eight detectors for the three datasets.
Figure 12: Two-dimensional plots of the detection map for the AVIRIS dataset.
Figure 13: Two-dimensional plots of the detection map for the Indian dataset.
Figure 14: Two-dimensional plots of the detection map for the Cri dataset.
Article
Maritime Semantic Labeling of Optical Remote Sensing Images with Multi-Scale Fully Convolutional Network
by Haoning Lin, Zhenwei Shi and Zhengxia Zou
Remote Sens. 2017, 9(5), 480; https://doi.org/10.3390/rs9050480 - 14 May 2017
Cited by 74 | Viewed by 9015
Abstract
In the current remote sensing literature, the problems of sea-land segmentation and ship detection (including in-dock ships) are investigated separately despite the high correlation between them. This inhibits joint optimization and makes the implementation of the methods highly complicated. In this paper, we propose a novel fully convolutional network to accomplish the two tasks simultaneously in a semantic labeling fashion, i.e., labeling every pixel of the image as one of three classes: sea, land and ship. A multi-scale structure for the network is proposed to address the huge scale gap between the different classes of targets, i.e., sea/land and ships. Conventional multi-scale structures utilize shortcuts to connect low-level, fine-scale feature maps to high-level ones to increase the network's ability to produce finer results. In contrast, our proposed multi-scale structure focuses on increasing the receptive field of the network while maintaining the ability to recover fine-scale details. The multi-scale convolutional network accommodates the huge scale difference between sea/land and ships, provides comprehensive features, and accomplishes the tasks in an end-to-end manner that is easy to implement and feasible for joint optimization. In the network, the input forks into fine-scale and coarse-scale paths, which share the same convolution layers to minimize the increase in network parameters, and are then joined together to produce the final result. The experiments show that the network tackles the semantic labeling problem with improved performance. Full article
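The fork-and-join idea with shared convolution layers can be sketched in a few lines of PyTorch, as below. This toy module is far smaller than the paper's network; the layer widths, the single shared block, and the 4x downsampling factor are assumptions used only to show the structure.

```python
# Toy sketch: fine and coarse paths share the same convolution layers, then are joined.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualScaleLabeler(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        self.shared = nn.Sequential(                 # shared by both scales
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.classify = nn.Conv2d(64, n_classes, 1)  # joined features -> sea/land/ship scores

    def forward(self, x):
        fine = self.shared(x)
        coarse = self.shared(F.avg_pool2d(x, 4))     # coarse path sees a 4x downsampled view
        coarse = F.interpolate(coarse, size=fine.shape[-2:], mode="bilinear",
                               align_corners=False)
        return self.classify(torch.cat([fine, coarse], dim=1))

# scores = DualScaleLabeler()(torch.randn(1, 3, 256, 256))   # -> (1, 3, 256, 256)
```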
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: A typical FCN, with each cuboid indicating an output matrix of a convolution layer. The numbers indicate the size of the third dimension of each cuboid, or equally, the number of kernels of the corresponding layer.
Figure 2: An illustration of the input/output of the crop layer and the resize layer.
Figure 3: An illustration of the proposed network. The text on the arrowed lines specifies the layers the data go through to produce the displayed results. For clarity, the color of the outline (blue or yellow) of each result marks the corresponding input area it represents. The results that are directly connected to loss layers are underlined with dashed lines.
Figure 4: The receptive field of each convolution layer with a 3 × 3 kernel. The green area marks the receptive field of one pixel in Layer 2, and the yellow area marks the receptive field of one pixel in Layer 3.
Figure 5: Different multi-scale structures in our experiment: (a) the features are summed up element-wise at the very end; (b) features are concatenated before the fc_** layers; (c) features are concatenated between the fc_** layers. Green blocks indicate inputs of different scales, white indicates convolution layers, blue indicates fc_** layers, a circle with a plus indicates element-wise addition of feature maps, and a circle with a C indicates concatenation. The loss layers are placed at the very top of each network.
Figure 6: Training average accuracy/epoch, recall/epoch and IoU/epoch curves of the networks with the different multi-scale structures in Figure 5, with (a-c) labeled accordingly.
Figure 7: First 20 weights of Layer fc_a2 plotted as lines. Each line represents the weights corresponding to a specific output category (sea, land and ship) as listed in the legend. Each dot on a line represents a weight corresponding to an input dimension. The first three input dimensions correspond to the coarse-scale score slices of sea, land and ship, respectively, and the other dimensions correspond to feature maps from the fine-scale Layer fc_a1.
Figure 8: Semantic labeling results on Google Map images (a,b) and GaoFen-1 images (c,d). The images are arranged as original (top), proposed method without multi-scale (a,c) and proposed method with multi-scale (b,d). Here sea, land and ship are labeled as blue, green and white, respectively.
Figure 9: Zoomed-in semantic labeling results (bottom) on GaoFen-1 images (a-c), presented with the original images (top). Here sea, land and ship are labeled as blue, green and white, respectively.
Figure 10: Zoomed-in semantic labeling results of DenseCRF (a-c), SLIC (d-f) and the proposed method (g-i), with the original images (top). Here sea, land and ship are labeled as blue, green and white, respectively.
Article
Hyperspectral Dimensionality Reduction by Tensor Sparse and Low-Rank Graph-Based Discriminant Analysis
by Lei Pan, Heng-Chao Li, Yang-Jun Deng, Fan Zhang, Xiang-Dong Chen and Qian Du
Remote Sens. 2017, 9(5), 452; https://doi.org/10.3390/rs9050452 - 6 May 2017
Cited by 50 | Viewed by 8289
Abstract
Recently, sparse and low-rank graph-based discriminant analysis (SLGDA) has yielded satisfactory results in hyperspectral image (HSI) dimensionality reduction (DR), for which sparsity and low-rankness are simultaneously imposed to capture both the local and global structure of hyperspectral data. However, SLGDA fails to exploit spatial information. To address this problem, a tensor sparse and low-rank graph-based discriminant analysis (TSLGDA) is proposed in this paper. By regarding the hyperspectral data cube as a third-order tensor, small local patches centered at the training samples are extracted for the TSLGDA framework to maintain the structural information, resulting in a more discriminative graph. Subsequently, dimensionality reduction is performed on the tensorial training and testing samples to reduce data redundancy. Experimental results on three real-world hyperspectral datasets demonstrate that the proposed TSLGDA algorithm greatly improves the classification performance in the low-dimensional space when compared to state-of-the-art DR methods. Full article
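The patch-as-tensor step described above, cutting a small spatial window with all bands around every training pixel, can be sketched as follows. The window size and the reflection padding at image borders are assumptions for illustration.

```python
# Sketch: extract w x w x bands cubes (third-order tensors) around training pixels.
import numpy as np

def extract_tensor_patches(hsi_cube, pixel_coords, w=7):
    """hsi_cube: (rows, cols, bands); pixel_coords: list of (row, col) training pixels."""
    r = w // 2
    padded = np.pad(hsi_cube, ((r, r), (r, r), (0, 0)), mode="reflect")
    patches = [padded[i:i + w, j:j + w, :] for i, j in pixel_coords]
    return np.stack(patches)          # (n_samples, w, w, bands)

# patches = extract_tensor_patches(np.random.rand(145, 145, 200), [(10, 20), (50, 60)])
```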
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: Visual illustration of n-mode vectors, n-mode unfolding, and n-mode product of a third-order tensor from a hyperspectral image.
Figure 2: Parameter tuning of β and λ for the proposed TSLGDA algorithm using three datasets: (a) Indian Pines; (b) University of Pavia; (c) Salinas.
Figure 3: Parameter tuning of window size for MPCA and TSLGDA using three datasets: (a) Indian Pines; (b) University of Pavia; (c) Salinas.
Figure 4: Overall accuracy versus the reduced spectral dimension for different methods using three datasets: (a) Indian Pines; (b) University of Pavia; (c) Salinas.
Figure 5: Classification maps of different methods for the Indian Pines dataset: (a) ground truth; (b) training set; (c) origin; (d) PCA; (e) LDA; (f) LFDA; (g) SGDA; (h) GDA-SS; (i) SLGDA; (j) MPCA; (k) G-LTDA; and (l) TSLGDA.
Figure 6: Classification maps of different methods for the University of Pavia dataset: (a) ground truth; (b) training set; (c) origin; (d) PCA; (e) LDA; (f) LFDA; (g) SGDA; (h) GDA-SS; (i) SLGDA; (j) MPCA; (k) G-LTDA; and (l) TSLGDA.
Figure 7: Classification maps of different methods for the Salinas dataset: (a) ground truth; (b) training set; (c) origin; (d) PCA; (e) LDA; (f) LFDA; (g) SGDA; (h) GDA-SS; (i) SLGDA; (j) MPCA; (k) G-LTDA; and (l) TSLGDA.
Figure 8: Overall classification accuracy and standard deviation versus different numbers of training samples per class for all methods using three datasets: (a) Indian Pines; (b) University of Pavia; (c) Salinas.
Article
Gated Convolutional Neural Network for Semantic Segmentation in High-Resolution Images
by Hongzhen Wang, Ying Wang, Qian Zhang, Shiming Xiang and Chunhong Pan
Remote Sens. 2017, 9(5), 446; https://doi.org/10.3390/rs9050446 - 5 May 2017
Cited by 172 | Viewed by 16171
Abstract
Semantic segmentation is a fundamental task in remote sensing image processing, and the large appearance variations of ground objects make it quite challenging. Recently, deep convolutional neural networks (DCNNs) have shown outstanding performance in this task. A common strategy of these methods (e.g., SegNet) for performance improvement is to combine the feature maps learned at different DCNN layers. However, such a combination is usually implemented via feature map summation or concatenation, meaning that the features are treated indiscriminately. In fact, features at different positions contribute differently to the final performance, so it is advantageous to select adaptive features automatically when merging different-layer feature maps. To achieve this goal, we propose a gated convolutional neural network. Specifically, we explore the relationship between the information entropy of the feature maps and the label-error map, and a gate mechanism is then embedded to integrate the feature maps more effectively. The gate is implemented by entropy maps, which assign adaptive weights to different feature maps according to their relative importance. Generally, the entropy maps, i.e., the gates, guide the network to focus on highly uncertain pixels, where detailed information from lower layers is required to improve their separability. The selected features are finally combined and fed into the classifier layer, which predicts the semantic label of each pixel. The proposed method achieves competitive segmentation accuracy on the public ISPRS 2D Semantic Labeling benchmark, which is challenging when only the RGB images are used. Full article
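The gate itself is just the per-pixel Shannon entropy of the softmax class probabilities, used as a weight map when merging feature maps; a short sketch is given below. The tensor shapes, the normalization to [0, 1], and the merging line in the comment are illustrative assumptions rather than the paper's exact formulation.

```python
# Sketch: per-pixel entropy of softmax probabilities as a gating weight map.
import torch
import torch.nn.functional as F

def entropy_gate(logits, eps=1e-8):
    """logits: (batch, n_classes, H, W) class scores from an upper layer."""
    p = F.softmax(logits, dim=1)
    entropy = -(p * (p + eps).log()).sum(dim=1, keepdim=True)          # (batch, 1, H, W)
    return entropy / (entropy.amax(dim=(2, 3), keepdim=True) + eps)    # normalise to [0, 1]

# Uncertain pixels get weights near 1, so lower-layer detail is emphasised there, e.g.:
# merged = upper_features + entropy_gate(logits) * lower_features
```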
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: The strong relationship between the segmentation error map and the entropy heat map. (a) Input image; (b) segmentation reference map; (c) predicted label map; (d) error map, with white pixels indicating wrongly classified pixels; (e) corresponding entropy heat map.
Figure 2: Overview of our gated segmentation network. In the encoder part, we use ResNet-101 as the feature extractor. The Entropy Control Module (ECM) is proposed for feature fusion in the decoder. In addition, we design the Residual Convolution Module (RCM) as a basic processing unit. The details of the RCM and ECM are shown in the dashed boxes.
Figure 3: Overview of the ISPRS 2D Vaihingen labeling dataset. There are 33 tiles; the numbers in the figure refer to the individual tile flags.
Figure 4: Model visualization. We show the error maps, entropy heat maps, and predictions at different iterations in the training procedure. The four rows at each iteration block correspond to the four ECMs, which are used to merge five kinds of feature maps with different resolutions.
Figure 5: Visual comparisons between GSN and other related methods on the ISPRS test set. Images come from the website of the ISPRS 2D Semantic Labeling Contest.
Figure 6: Three failure modules. (a) Placing the gate on the output of the lower layer; (b) placing gates on the outputs of both the lower and upper layers; (c) the gate on the output of the lower layer is created by the combination of the lower- and upper-layer outputs.
Article
Image Registration and Fusion of Visible and Infrared Integrated Camera for Medium-Altitude Unmanned Aerial Vehicle Remote Sensing
by Hongguang Li, Wenrui Ding, Xianbin Cao and Chunlei Liu
Remote Sens. 2017, 9(5), 441; https://doi.org/10.3390/rs9050441 - 5 May 2017
Cited by 64 | Viewed by 11087
Abstract
This study proposes a novel method for image registration and fusion via the commonly used visible light and infrared integrated cameras mounted on medium-altitude unmanned aerial vehicles (UAVs). The innovation of the image registration lies in three aspects. First, it reveals how a complex perspective transformation can be converted to a simple scale transformation and translation transformation between two sensor images under long-distance and parallel imaging conditions. Second, with the introduction of metadata, a scale calculation algorithm is designed according to spatial geometry, and a coarse translation estimation algorithm is presented based on coordinate transformation. Third, the problem of non-strictly aligned edges in precise translation estimation is solved via edge distance field transformation, and a searching algorithm based on particle swarm optimization is introduced to improve efficiency. Additionally, a new image fusion algorithm is designed based on a pulse coupled neural network (PCNN) and the nonsubsampled contourlet transform (NSCT) to meet the special requirements of preserving color information, adding infrared brightness information, improving spatial resolution, and highlighting target areas for UAV applications. A medium-altitude UAV is employed to collect datasets. The results are promising, especially for applications that involve other medium-altitude or high-altitude UAVs with similar system structures. Full article
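The precise translation step can be pictured with a simplified sketch: turn the visible edge map into a distance field, then score candidate shifts of the infrared edge template by the field values under its edge pixels (lower is better). The paper additionally smooths the field with a Gaussian and searches with particle swarm optimization; the brute-force search and the search radius below are assumptions.

```python
# Simplified sketch of translation estimation via an edge distance field.
import numpy as np
from scipy.ndimage import distance_transform_edt

def match_translation(vis_edges, ir_edges, search=20):
    """vis_edges, ir_edges: binary edge maps of the same size; returns the best (dy, dx)."""
    field = distance_transform_edt(~vis_edges.astype(bool))   # distance to the nearest visible edge
    ys, xs = np.nonzero(ir_edges)
    best, best_shift = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = ys + dy, xs + dx
            ok = (y >= 0) & (y < field.shape[0]) & (x >= 0) & (x < field.shape[1])
            cost = field[y[ok], x[ok]].mean()                  # mean distance under shifted IR edges
            if cost < best:
                best, best_shift = cost, (dy, dx)
    return best_shift
```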
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
Figure 1: UAV system for earthquake emergency and rescue, including: (1) unmanned aerial vehicle (UAV); (2) ground control system; (3) information processing center; and (4) launcher.
Figure 2: UAV airborne visible light and infrared integrated camera platform with two degrees of freedom.
Figure 3: Visible light and infrared integrated camera, in which the two imaging axes are parallel to each other.
Figure 4: Process of visible and infrared image registration, including scale calculation, coarse translation estimation, and precise translation estimation.
Figure 5: Process of visible and infrared image fusion based on PCNN and NSCT.
Figure 6: Five coordinate systems of coarse translation estimation.
Figure 7: Non-strictly aligned characteristics of edges: (a) original visible image; (b) original infrared image; (c) visible edge image; and (d) infrared edge image.
Figure 8: Gaussian-based edge distance field transformation: (a) visible edge image; and (b) distance field map of the visible edge.
Figure 9: Infrared template image extraction and template image searching in the distance field map of the visible edge: (a) infrared edge image; and (b) distance field map of the visible edge.
Figure 10: NSPFB and NSDFB of the NSCT transform. The left-hand portion is the image decomposition based on NSPFB. The right-hand portion shows the decomposition of each subband in different directions based on NSDFB.
Figure 11: Study area and flight course covering about 300 km² in Eastern China.
Figure 12: First experiment of scale calculation: (a) original image; (b) original infrared image; (c) scale-transformed result of image (b); and (d) fusion image of images (a) and (c).
Figure 13: Second experiment of scale calculation: (a) original image; (b) original infrared image; (c) scale-transformed result of image (b); and (d) fusion image of images (a) and (c).
Figure 14: Third experiment of scale calculation: (a) original image; (b) original infrared image; (c) scale-transformed result of image (b); and (d) fusion image of images (a) and (c).
Figure 15: Fusion image of the coarse translation-transformed infrared image and the original visible image: (a) first experiment image; (b) second experiment image; and (c) third experiment image.
Figure 16: Fusion image of the precise translation-transformed infrared image and the original visible image: (a) first experiment image; (b) second experiment image; and (c) third experiment image.
Figure 17: Fusion of a visible image and a low spatial resolution infrared image: (a) visible image; (b) infrared image; (c) fusion image based on IHS; and (d) fusion image based on the proposed method.
Figure 18: Fusion of interesting areas in two scenes: (a,b,d,e) original images; and (c,f) fusion images based on the proposed method.
Figure 19: Performance analysis of the first experiment under translation conditions.
Figure 20: Performance analysis of the second experiment under rotation conditions.
Figure 21: Performance analysis of the third experiment under scale conditions.
Figure 22: Average gradient and Shannon values of the four image fusion methods: (a) average gradient; and (b) Shannon value.
2188 KiB  
Article
Quantifying Sub-Pixel Surface Water Coverage in Urban Environments Using Low-Albedo Fraction from Landsat Imagery
by Weiwei Sun, Bo Du and Shaolong Xiong
Remote Sens. 2017, 9(5), 428; https://doi.org/10.3390/rs9050428 - 1 May 2017
Cited by 32 | Viewed by 6519
Abstract
The problem of mixed pixels negatively affects the delineation of accurate surface water in Landsat imagery. Linear spectral unmixing has been demonstrated to be a powerful technique for extracting surface materials at a sub-pixel scale. Therefore, in this paper, we propose an innovative low albedo fraction (LAF) method based on the idea of unconstrained linear spectral unmixing. The LAF builds on the "High Albedo-Low Albedo-Vegetation" model of spectral unmixing analysis in urban environments, and investigates the urban surface water extraction problem with the low albedo fraction map. Three experiments are carefully designed using Landsat TM/ETM+ images of the three metropolises of Wuhan, Shanghai, and Guangzhou in China, and per-pixel and sub-pixel accuracies are estimated. The results are compared against extraction accuracies from three popular water extraction methods: the normalized difference water index (NDWI), the modified normalized difference water index (MNDWI), and the automated water extraction index (AWEI). Experimental results show that LAF achieves better accuracy when extracting urban surface water than both MNDWI and AWEI do, especially in boundary mixed pixels. Moreover, the LAF has the smallest threshold variations among the three methods, and a fraction threshold of 1 is a proper choice for LAF to obtain good extraction results. Therefore, the LAF is a promising approach for extracting urban surface water coverage. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
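To make the unmixing idea concrete, the sketch below fits three endmembers (high albedo, low albedo, vegetation) to each pixel by unconstrained least squares and thresholds the resulting low-albedo fraction. The endmember spectra, toy pixels, and the 0.5 threshold are illustrative assumptions rather than values from the paper (the abstract reports that a fraction threshold of 1 works well on real Landsat data).

```python
import numpy as np

def unmix_low_albedo(pixels, endmembers):
    """Unconstrained linear unmixing: solve pixels ~ fractions @ endmembers by
    least squares and return the fraction of the low-albedo endmember."""
    # endmembers: (3, n_bands) with rows = high albedo, low albedo, vegetation
    fractions, *_ = np.linalg.lstsq(endmembers.T, pixels.T, rcond=None)
    return fractions[1]  # row 1 holds the low-albedo (water-related) fraction

# Toy six-band endmember spectra; reflectance values are illustrative only.
endmembers = np.array([
    [0.30, 0.35, 0.40, 0.45, 0.50, 0.55],   # high albedo (e.g., concrete)
    [0.08, 0.06, 0.05, 0.03, 0.02, 0.01],   # low albedo (e.g., water)
    [0.05, 0.08, 0.06, 0.40, 0.30, 0.15],   # vegetation (e.g., grass)
])
pixels = np.array([
    0.6 * endmembers[1] + 0.4 * endmembers[2],   # mixed water/vegetation pixel
    0.9 * endmembers[0] + 0.1 * endmembers[2],   # mostly impervious pixel
])

laf = unmix_low_albedo(pixels, endmembers)
# 0.5 is a toy threshold; the paper reports that a threshold of 1 suits real imagery.
water_mask = laf >= 0.5
print(laf.round(3), water_mask)
```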
Figure 1. The images of Landsat data on three metropolises: (a) Wuhan; (b) Shanghai; and (c) Guangzhou.
Figure 2. The overall procedure of the LAF method.
Figure 3. The triangular topology from scatter plots of the first two PCs. The vertexes correspond to three endmembers: high albedo (e.g., concrete), low albedo (e.g., water), and vegetation (e.g., grass).
Figure 4. The category of true water proportions in testing boundary pixels in Shahu lake, Wuhan.
Figure 5. Comparison of water extraction results from all three methods on three test sites.
25281 KiB  
Article
Sea Ice Concentration Estimation during Freeze-Up from SAR Imagery Using a Convolutional Neural Network
by Lei Wang, K. Andrea Scott and David A. Clausi
Remote Sens. 2017, 9(5), 408; https://doi.org/10.3390/rs9050408 - 26 Apr 2017
Cited by 78 | Viewed by 9333
Abstract
In this study, a convolutional neural network (CNN) is used to estimate sea ice concentration using synthetic aperture radar (SAR) scenes acquired during freeze-up in the Gulf of St. Lawrence on the east coast of Canada. The ice concentration estimates from the CNN are compared to those from a neural network (multi-layer perceptron or MLP) that uses hand-crafted features as input and a single layer of hidden nodes. The CNN is found to be less sensitive to pixel-level details than the MLP and produces ice concentration that is less noisy and in closer agreement with that from image analysis charts. This is due to the multi-layer (deep) structure of the CNN, which enables abstract image features to be learned. The CNN ice concentration is also compared with ice concentration estimated from passive microwave brightness temperature data using the ARTIST sea ice (ASI) algorithm. The bias and RMS of the difference between the ice concentration from the CNN and that from image analysis charts are reduced compared to those from either the MLP or the ASI algorithm. Additional results demonstrate the impact of varying the input patch size, varying the number of CNN layers, and including the incidence angle as an additional input. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
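A generic patch-based regression CNN of the kind described here can be sketched in a few lines. The layer sizes, the patch size of 45, and the training snippet on random tensors below are placeholders, not the architecture or data pipeline of the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IceConcentrationCNN(nn.Module):
    """Minimal patch-to-concentration regressor: two SAR channels (HH, HV) in,
    one sigmoid output in [0, 1] interpreted as ice concentration."""
    def __init__(self, patch_size=45):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3), nn.ReLU(),
        )
        with torch.no_grad():  # infer the flattened feature size for this patch size
            n_flat = self.features(torch.zeros(1, 2, patch_size, patch_size)).numel()
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(n_flat, 64), nn.ReLU(),
                                  nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, x):
        return self.head(self.features(x))

# One illustrative training step on random tensors standing in for HH/HV patches
# labelled with ice concentration taken from image analysis charts.
model = IceConcentrationCNN(patch_size=45)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
patches = torch.rand(8, 2, 45, 45)   # batch of dual-pol patches
targets = torch.rand(8, 1)           # ice concentration labels in [0, 1]
loss = F.mse_loss(model(patches), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(float(loss))
```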
Figure 1. Study area and the dataset for the Gulf of Saint Lawrence. There are 25 scenes of dual-pol SAR images acquired between 16 January 2014 and 10 February 2014 in this area. The coverage of each scene is marked as a translucent polygon in a different color. Yellow scenes are used for training, red for validation and blue for testing.
Figure 2. Errors at different ice concentration levels for ASI (1st row), MLP40 (2nd row), and CNN (3rd row) for the training (1st column), validation (2nd column) and testing (3rd column) sets. The red lines represent the mean ice concentration, and the half length of a bar represents the error standard deviation.
Figure 3. Histogram of the percentage of samples from each 10% interval of the image analyses for the training, validation and testing datasets of the Gulf of Saint Lawrence. The training samples are strongly biased since the majority of the training samples are either water or ice. (a) Training; (b) Validation; (c) Testing.
Figure 4. Ice concentration estimated by the CNN compared to that from other methods. The HH and HV images are shown in panels (a) and (b), respectively. Panel (c) is the image analysis; (d–f) are the ice concentrations from ASI, MLP40 and CNN, respectively. The scene shown is 20140117_103914, which is used for testing. The scene is centered at 47.99°N, 66.85°W with an extent of 500 km by 500 km.
Figure 5. Ice concentration estimated by the CNN compared to that from other methods. The HH and HV images are shown in panels (a) and (b), respectively. Panel (c) is the image analysis. Panels (d–f) are the ice concentrations from ASI, MLP40 and CNN, respectively. The scene shown is 20140210_103911, which is used for testing. The scene is centered at 49.90°N, 66.42°W with an extent of 500 km by 230 km.
Figure 6. An example showing the details for a region with new ice and water. The ASI result is mainly water for this region. It can be seen that MLP40 (d) produces noisy ice concentration estimates, with new ice in the bottom left identified as water or as ice of low concentration. The CNN (e) is able to correctly identify new ice and water with higher accuracy. Subscene of dimension 60 km × 60 km from 20140117_103914, centered at 47.60°N, 64.13°W. The HH image, HV image and image analysis are shown in panels (a–c), respectively.
Figure 7. Example of water misidentified as ice by both MLP40 and the CNN due to the banding effect in the HV pol. Water in the right part of the HV pol (a) is obviously brighter than water on the left. Water regions are estimated incorrectly by MLP40 (c) and by the CNN with patch size 45 (d). Results from the CNN are improved when a patch size of 55 is used, as shown in panel (e), although the features are also less sharp. Subscene centered at 49.72°N, 59.11°W of dimension 200 km × 200 km from 20140121_214420. The image analysis is shown in panel (b).
Figure 8. Visual comparison of different patch sizes: (e) 25 × 25 pixels; (f) 35 × 35 pixels; (g) 45 × 45 pixels; (h) 55 × 55 pixels. The estimate of ice concentration improves as the patch size increases. Patch size 45, corresponding to a ground distance of 18 km, has cleaner water estimates than the others. Subscene of dimension 270 km × 270 km from 20140124_215646, centered at 47.86°N, 60.94°W. Panels (a–d) are the HH image, HV image, image analysis chart and ASI ice concentration, respectively.
Figure 9. New ice can be seen in the HH image as the dark regions along the coast (a). This ice is correctly identified when incidence angle data are used (c), as compared to when they are not (d). Subscene of dimension 120 km × 52 km from 20140206_221744, centered at 47.12°N, 64.72°W. The image analysis is shown in panel (b).
Figure 10. The network with three convolutional layers (d) improves the estimation of new ice compared to the network with two convolutional layers (c). Panel (a) is the HH image, and (b) is the image analysis chart. Subscene of dimension 8 km × 8 km centered at 47.06°N and 64.46°W from 20140117_103914.
Figure 11. Comparison of results produced by networks with two-convolutional-layer (c) and three-convolutional-layer structures (d) for a sample location centered at 49.57°N, 66.59°W with size 200 km × 200 km in scene 20140127_104734 in the Gulf of Saint Lawrence. The estimate from the two-convolutional-layer network is noisier; the three-convolutional-layer network produces smoother and more reasonable results. Panel (a) is the HH image and (b) is the image analysis chart for the subregion.
10570 KiB  
Article
Unassisted Quantitative Evaluation of Despeckling Filters
by Luis Gomez, Raydonal Ospina and Alejandro C. Frery
Remote Sens. 2017, 9(4), 389; https://doi.org/10.3390/rs9040389 - 20 Apr 2017
Cited by 79 | Viewed by 7585
Abstract
SAR (Synthetic Aperture Radar) imaging plays a central role in Remote Sensing due to, among other important features, its ability to provide high-resolution, day-and-night and almost weather-independent images. SAR images are affected by a granular contamination, speckle, that can be described by a multiplicative model. Many despeckling techniques have been proposed in the literature, as well as measures of the quality of the results they provide. Assuming the multiplicative model, the observed image Z is the product of two independent fields: the backscatter X and the speckle Y. The result of any speckle filter is X̂, an estimator of the backscatter X, based solely on the observed data Z. An ideal estimator would be one for which the ratio of the observed image to the filtered one, I = Z/X̂, is only speckle: a collection of independent identically distributed samples from Gamma variates. We then assess the quality of a filter by the closeness of I to the hypothesis that it adheres to the statistical properties of pure speckle. We analyze filters through the ratio image they produce with regard to first- and second-order statistics: the former checks marginal properties, while the latter verifies lack of structure. A new quantitative image-quality index is then defined and applied to state-of-the-art despeckling filters. This new measure provides results consistent with commonly used quality measures (equivalent number of looks, PSNR, MSSIM, edge correlation, and preservation of the mean), and ranks the filter results in agreement with their visual analysis. We conclude our study by showing that the proposed measure can be successfully used to optimize the (often many) parameters that define a speckle filter. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
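The first-order part of the proposed assessment is easy to illustrate: compute the ratio image I = Z/X̂ and check that its mean is close to 1 and its equivalent number of looks (ENL) is close to the nominal number of looks. The simulation below, with a step-shaped backscatter and an "oversmoothing" filter that returns the global mean, is a minimal sketch of that check; the second-order (structure) tests of the paper are not shown.

```python
import numpy as np

def ratio_image_stats(observed, filtered, eps=1e-12):
    """First-order check of a despeckled result: for an ideal filter the ratio
    I = Z / X_hat is pure speckle, i.e., Gamma-distributed with unit mean and an
    equivalent number of looks (ENL) close to the nominal number of looks."""
    ratio = observed / np.maximum(filtered, eps)
    mean = ratio.mean()
    enl = mean ** 2 / ratio.var()
    return mean, enl

# Simulated single-look intensity image: a step in the backscatter times Gamma speckle.
rng = np.random.default_rng(0)
backscatter = np.ones((256, 256))
backscatter[:, 128:] = 10.0                                # step edge in the true scene
observed = backscatter * rng.gamma(shape=1.0, scale=1.0, size=backscatter.shape)

ideal = backscatter                                        # perfect backscatter estimate
oversmoothed = np.full_like(backscatter, observed.mean())  # crude global-mean "filter"

for name, estimate in [("ideal", ideal), ("oversmoothed", oversmoothed)]:
    mean, enl = ratio_image_stats(observed, estimate)
    print(f"{name}: ratio mean = {mean:.3f}, ENL = {enl:.3f} (expect ~1 for one look)")
```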
Figure 1. (Top) Original SAR image; (Middle) SRAD (T = 50) filtered image and ratio image; (Bottom) zoom of a selected area within the ratio image and the edges extracted by Canny's edge detector.
Figure 2. A step: constant and textured versions, and their return. (a) Constant step and speckled return; (b) Textured step and speckled return.
Figure 3. Speckle estimated by the ideal filter and by oversmoothing.
Figure 4. Slowly varying backscatter, fully developed speckle, and estimated speckle. (a) Slowly varying mean value and its return; (b) Estimated speckle.
Figure 5. Ratio image resulting from neglecting a slowly varying structure under fully developed speckle.
Figure 6. The effect of oversmoothing on an image of strips of varying width. (a) Strips and speckle; (b) Filtered strips with oversmoothing.
Figure 7. Estimated speckle: ideal and oversmoothing filters.
Figure 8. Speckled strips, the result of applying a 5 × 5 BoxCar filter, and the ratio image. (a) Speckled strips; (b) Filtered strips; (c) Ratio image.
Figure 9. Selection of mean and ENL values for the first-order measure.
Figure 10. Blocks-and-points phantom, and the 500 × 500 pixel simulated single-look intensity image. (a) Blocks-and-points phantom; (b) Speckled version, single look.
Figure 11. Results for the simulated single-look intensity data. Top to bottom (left): results of applying the SRAD, E-Lee, PPB and FANS filters. Top to bottom (right): their ratio images.
Figure 12. Zoom of the results for the synthetic data: (top) noisy image; (first row, left) SRAD filter; (first row, right) E-Lee filter; (second row, left) PPB filter; and (second row, right) FANS filter.
Figure 13. Intensity AIRSAR images, HH polarization, three looks. (a) Flevoland; (b) San Francisco bay.
Figure 14. Results for the Flevoland image. Top to bottom (left): results of applying the SRAD, E-Lee, PPB and FANS filters. Top to bottom (right): their ratio images.
Figure 15. Zoom of the results for the Flevoland image: (top) noisy image; (first row, left) SRAD filter; (first row, right) E-Lee filter; (second row, left) PPB filter; and (second row, right) FANS filter.
Figure 16. Results for the San Francisco bay image. Top to bottom (left): results of applying the SRAD, E-Lee, PPB and FANS filters. Top to bottom (right): their ratio images.
Figure 17. Zoom of the results for the San Francisco image: (top) noisy image; (first row, left) SRAD filter; (first row, right) E-Lee filter; (second row, left) PPB filter; and (second row, right) FANS filter.
Figure 18. Intensity Pi-SAR, HH, one-look Niigata image (left); results of applying the FANS filter with default parameters (middle) and with optimized parameters (right).
Figure 19. Ratio images for the Niigata data: FANS with default parameters (left) and with optimized parameters (right).
8617 KiB  
Article
A Fuzzy-GA Based Decision Making System for Detecting Damaged Buildings from High-Spatial Resolution Optical Images
by Milad Janalipour and Ali Mohammadzadeh
Remote Sens. 2017, 9(4), 349; https://doi.org/10.3390/rs9040349 - 20 Apr 2017
Cited by 29 | Viewed by 6303
Abstract
In this research, a semi-automated building damage detection system is developed for high-spatial-resolution remotely sensed images. The aim of this study was to develop a semi-automated fuzzy decision making system using a Genetic Algorithm (GA). Our proposed system contains four main stages. In the first stage, post-event optical images were pre-processed. In the second stage, textural features were extracted from the pre-processed post-event optical images using the Haralick texture extraction method. In the third stage, a semi-automated Fuzzy-GA (Fuzzy Genetic Algorithm) decision making system was used to identify damaged buildings from the extracted texture features. In the fourth stage, a comprehensive sensitivity analysis was performed to find the GA parameters leading to more accurate results. Finally, the accuracy of the results was assessed using check and test samples. The proposed system was tested over the 2010 Haiti earthquake (Area 1 and Area 2) and the 2003 Bam earthquake (Area 3). The proposed system resulted in overall accuracies of 76.88 ± 1.22%, 65.43 ± 0.29%, and 90.96 ± 0.15% over Area 1, Area 2, and Area 3, respectively. On the one hand, based on the concept of the proposed Fuzzy-GA decision making system, its automation level is higher than that of other existing systems. On the other hand, based on the accuracy of our proposed system compared with four advanced machine learning techniques, i.e., bagging, boosting, random forests, and support vector machines, in the detection of damaged buildings, our proposed system appears robust and efficient. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
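The texture-extraction stage can be sketched with scikit-image's gray-level co-occurrence matrix utilities (assuming scikit-image ≥ 0.19, where the functions are named graycomatrix/graycoprops). The quantization to 32 levels, the offsets, and the selected statistics are illustrative choices, and the GA-tuned fuzzy inference stage is not reproduced here.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def haralick_features(patch, distances=(1,), angles=(0.0, np.pi / 2), levels=32):
    """Compute a few Haralick-style GLCM statistics for one image patch.
    A stand-in for the texture-extraction stage; the paper's exact feature set
    and parameters may differ."""
    # Quantize the patch to a small number of gray levels so the GLCM stays compact.
    bins = np.linspace(patch.min(), patch.max() + 1e-9, levels)
    quantized = (np.digitize(patch, bins) - 1).astype(np.uint8)
    glcm = graycomatrix(quantized, distances, angles,
                        levels=levels, symmetric=True, normed=True)
    return {prop: float(graycoprops(glcm, prop).mean())
            for prop in ("contrast", "homogeneity", "energy", "correlation")}

# Toy comparison: a smooth gradient patch versus a "rubble-like" noisy patch.
rng = np.random.default_rng(1)
smooth = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
rubble = rng.random((64, 64))
print("smooth:", haralick_features(smooth))
print("rubble:", haralick_features(rubble))
```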
Figure 1. The workflow of our semi-automated damage detection system in this study.
Figure 2. A schematic presentation of a MFIS with three inputs and one output, and its MFs.
Figure 3. The diagram of convergence of GA over (a) Area 1, (b) Area 3.
Figure 4. The presentation of preliminary MFs and optimized MFs in an experiment for input 2 and input 3: (a) preliminary MFs for input 2; (b) optimized MFs for input 2; (c) preliminary MFs for input 3; (d) optimized MFs for input 3.
Figure 5. Building damage maps resulting from the proposed method on (a) Area 1, (b) Area 2, (c) Area 3.
3985 KiB  
Article
A New Spatial Attraction Model for Improving Subpixel Land Cover Classification
by Lizhen Lu, Yanlin Huang, Liping Di and Danwei Hang
Remote Sens. 2017, 9(4), 360; https://doi.org/10.3390/rs9040360 - 11 Apr 2017
Cited by 25 | Viewed by 5341
Abstract
Subpixel mapping (SPM) is a technique that handles mixed pixels by producing hard classification maps at a spatial resolution finer than that of the input images. Existing spatial attraction model (SAM) techniques have proven to be effective SPM methods. The techniques mostly differ in the way in which they compute the spatial attraction, for example, from the surrounding pixels in the subpixel/pixel spatial attraction model (SPSAM), from the subpixels within the surrounding pixels in the modified SPSAM (MSPSAM), or from the subpixels within the surrounding pixels and the touching subpixels within the central pixel in the mixed spatial attraction model (MSAM). However, they share a number of defects, such as the lack of consideration of the attraction from subpixels within the central pixel and the unequal treatment of attraction from surrounding subpixels at the same distance. In order to overcome these defects, this study proposes an improved SAM (ISAM) for SPM. ISAM estimates the attraction value of the current subpixel at the center of a moving window from all subpixels within the window, and moves the window one subpixel per step. Experimental results from both Landsat and MODIS imagery have shown that ISAM, when compared with other SAMs, can improve SPM accuracies and is a more efficient SPM technique than MSPSAM and MSAM. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
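One reading of the ISAM attraction computation, inverse-distance-weighted votes from all subpixels inside a moving window centered on the current subpixel, is sketched below. The window radius, the toy label map, and the use of current subpixel labels (rather than the paper's exact fraction-based formulation) are simplifying assumptions.

```python
import numpy as np

def subpixel_attraction(labels, row, col, n_classes, window_radius):
    """Inverse-distance-weighted attraction of the subpixel at (row, col) towards
    each class, accumulated from all other subpixels inside the moving window.
    A simplified reading of the ISAM idea, not the paper's exact formulation."""
    attraction = np.zeros(n_classes)
    r0, r1 = max(row - window_radius, 0), min(row + window_radius + 1, labels.shape[0])
    c0, c1 = max(col - window_radius, 0), min(col + window_radius + 1, labels.shape[1])
    for r in range(r0, r1):
        for c in range(c0, c1):
            if (r, c) == (row, col):
                continue
            attraction[labels[r, c]] += 1.0 / np.hypot(r - row, c - col)
    return attraction

# Toy 8 x 8 subpixel label map; with scale factor S = 4 the window radius is S.
labels = np.zeros((8, 8), dtype=int)
labels[:, 4:] = 1                      # class 1 occupies the right half of the scene
print(subpixel_attraction(labels, row=3, col=3, n_classes=2, window_radius=4))
```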
Figure 1. Illustration of spatial dependence theory for subpixel mapping in an 8 × 8 subpixel scene. (a) 3 × 3 coarse resolution pixels with the indicated proportion of a specific (gray) class; (b,c) possible results of the subpixel allocation of the specific gray class.
Figure 2. Illustration of the subpixel/pixel spatial attraction model (SPSAM) (adapted from [30]).
Figure 3. Illustration of the modified subpixel/pixel spatial attraction model (MSPSAM) and the mixed spatial attraction model (MSAM) (adapted from [37]). (a) MSPSAM; (b) MSAM.
Figure 4. Illustration of the improved spatial attraction model (ISAM) (the size of the moving window is equal to 2S + 1).
Figure 5. The algorithm flowchart of the improved spatial attraction model (ISAM): Hmax, maximum number of iteration steps; H, current iteration step; Acc, SPM accuracy of the previous iteration step; Acc_c, SPM accuracy of the current iteration step; Pcen, the central pixel; pij, the subpixel whose spatial attractions are currently calculated; C, the number of classes in the study case; Jc,ij, the spatial attraction of subpixel pij when it is assigned to class c; SLSc, the spatial location sequence of class c of pixel Pcen.
Figure 6. Comparison of SPM results among ISAM, SPSAM, MSPSAM, and MSAM (scale factor S = 2): (a) the classification result from Landsat data; (b) the hard classification result at scale factor S; (c–f) the results from ISAM, SPSAM, MSPSAM, and MSAM, respectively.
Figure 7. Comparison of SPM results among ISAM, SPSAM, MSPSAM, and MSAM (scale factor S = 4): (a) the classification result from Landsat data; (b) the hard classification result at scale factor S; (c–f) the results from ISAM, SPSAM, MSPSAM, and MSAM, respectively.
Figure 8. Comparison of SPM results among ISAM, SPSAM, MSPSAM, and MSAM (scale factor S = 8): (a) the classification result from Landsat data; (b) the hard classification result at scale factor S; (c–f) the results from ISAM, SPSAM, MSPSAM, and MSAM, respectively.
Figure 9. Comparison of subpixel mapping results among ISAM, SPSAM, MSPSAM, and MSAM (scale factor S = 8; Mask represents the class of non-agricultural land; PML represents the class of plastic-mulched landcover, which is a type of farmland covered by plastic mulch film): (a) the classification results from Landsat data; (b) the hard classification results at scale factor S; (c–f) the results from ISAM, SPSAM, MSPSAM, and MSAM, respectively.
8703 KiB  
Article
Supervised and Semi-Supervised Multi-View Canonical Correlation Analysis Ensemble for Heterogeneous Domain Adaptation in Remote Sensing Image Classification
by Alim Samat, Claudio Persello, Paolo Gamba, Sicong Liu, Jilili Abuduwaili and Erzhu Li
Remote Sens. 2017, 9(4), 337; https://doi.org/10.3390/rs9040337 - 1 Apr 2017
Cited by 24 | Viewed by 10432
Abstract
In this paper, we present the supervised multi-view canonical correlation analysis ensemble (SMVCCAE) and its semi-supervised version (SSMVCCAE), which are novel techniques designed to address heterogeneous domain adaptation problems, i.e., situations in which the data to be processed and recognized are collected from different heterogeneous domains. Specifically, the multi-view canonical correlation analysis scheme is utilized to extract multiple correlation subspaces that are useful for joint representations for data association across domains. This scheme makes homogeneous domain adaption algorithms suitable for heterogeneous domain adaptation problems. Additionally, inspired by fusion methods such as Ensemble Learning (EL), this work proposes a weighted voting scheme based on canonical correlation coefficients to combine classification results in multiple correlation subspaces. Finally, the semi-supervised MVCCAE extends the original procedure by incorporating multiple speed-up spectral regression kernel discriminant analysis (SRKDA). To validate the performances of the proposed supervised procedure, a single-view canonical analysis (SVCCA) with the same base classifier (Random Forests) is used. Similarly, to evaluate the performance of the semi-supervised approach, a comparison is made with other techniques such as Logistic label propagation (LLP) and the Laplacian support vector machine (LapSVM). All of the approaches are tested on two real hyperspectral images, which are considered the target domain, with a classifier trained from synthetic low-dimensional multispectral images, which are considered the original source domain. The experimental results confirm that multi-view canonical correlation can overcome the limitations of SVCCA. Both of the proposed procedures outperform the ones used in the comparison with respect to not only the classification accuracy but also the computational efficiency. Moreover, this research shows that canonical correlation weighted voting (CCWV) is a valid option with respect to other ensemble schemes and that because of their ability to balance diversity and accuracy, canonical views extracted using partially joint random view generation are more effective than those obtained by exploiting disjoint random view generation. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
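The canonical correlation weighted voting (CCWV) fusion step, as described in the abstract, amounts to weighting each view's class-probability output by its canonical correlation coefficient and summing. The sketch below uses random probabilities and hypothetical correlation values, and omits the CCA subspace extraction and the base classifiers.

```python
import numpy as np

def ccwv_fuse(view_probabilities, canonical_correlations):
    """Canonical correlation weighted voting (CCWV) as read from the abstract:
    each view's class-probability output is weighted by its (average) canonical
    correlation coefficient and the weighted votes are summed per class."""
    weights = np.asarray(canonical_correlations, dtype=float)
    weights = weights / weights.sum()                # normalize the view weights
    stacked = np.stack(view_probabilities)           # (n_views, n_samples, n_classes)
    fused = np.tensordot(weights, stacked, axes=1)   # weighted sum over the views
    return fused.argmax(axis=1), fused

# Toy example: three correlation subspaces (views), four samples, three classes.
rng = np.random.default_rng(42)
views = [rng.dirichlet(np.ones(3), size=4) for _ in range(3)]
rho = [0.9, 0.6, 0.3]                                # hypothetical canonical correlations
labels, fused = ccwv_fuse(views, rho)
print(labels)
```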
Figure 1. General flowchart of the proposed heterogeneous DA algorithms SMVCCAE and SSMVCCAE for RS image classification.
Figure 2. (a–d) False color composites of the synthetic low spectral resolution image (a) and the original hyperspectral image (c) of the University campus in Pavia, together with the training (b) and validation (d) data sets (legend and sample details are reported in Table 1). The false color composites are obtained by displaying as R, G, and B bands 7, 5, and 4 for the synthetic image, and bands 60, 30, and 2 for the original image, respectively.
Figure 3. (a–d) False color composites of the simulated low spectral resolution image (a) and the original hyperspectral image (c) of the Indian Pines data, together with the training (b) and validation (d) data sets (color legend and sample details are reported in Table 2). The false color composites are obtained by displaying as R, G, and B bands 6, 4, and 5 for the synthetic image, and bands 99, 51, and 21 for the original image, respectively.
Figure 4. Average canonical correlation coefficient versus embedded features for the ROSIS (a–d) and Indian Pines (e–h) data sets using different view generation techniques: disjoint random sampling (a,e); uniform slice (b,f); clustering (c,g); and partially joint random generation (d,h).
Figure 5. (a–h) Average OA values versus target view dimensionality for SMVCCAE with different fusion strategies using spectral (a,e), spectral-OO (b,f), spectral-MPs (c,g), and spectral-OO-MPs (d,h) features on the ROSIS University (a–d) and Indian Pines (e–h) datasets.
Figure 6. Average OA, Kappa (κ) and CPU time in seconds versus the number of views for SMVCCA with PJR view generation and various fusion strategies applied to spectral features of the ROSIS University (a–c) and Indian Pines (d–f) datasets.
Figure 7. (a–t) Summary of the best classification maps with OA values for SMVCCAE with different fusion strategies using spectral, OO and MPs features of ROSIS University.
Figure 8. (a–t) Summary of the best classification maps with OA values for SMVCCAE with different fusion strategies using spectral, OO and MPs features of Indian Pines.
Figure 9. (a–f) OA values and CPU time (in seconds) versus the regularization parameter (δ) and nearest neighborhood size (NN) of SSMVCCAE with the DJR view generation strategy for ROSIS University using different sizes of labeled samples: 10 pixels/class (a,d); 50 pixels/class (b,e); and 100 pixels/class (c,f).
Figure 10. (a–f) OA values and CPU time (in seconds) versus the regularization parameter (δ) and nearest neighborhood size (NN) of SSMVCCAE with the DJR view generation strategy for Indian Pines using different sizes of labeled samples: 10 pixels/class (a,d); 30 pixels/class (b,e); and 55 pixels/class (c,f).
Figure 11. (a–p) Average OA values versus labeled pixels for SSMVCCAE with different view generation and fusion strategies on the ROSIS University dataset.
Figure 12. (a–p) Average OA values versus labeled pixels for SSMVCCAE with different view generation and fusion strategies on the Indian Pines data.
Figure 13. (a–h) CPU time consumption in seconds versus the size of the labeled samples for SSMVCCAE-SRKDA/-LLP/-LapSVM on the ROSIS University data.
Figure 14. (a–h) CPU time consumption in seconds versus the size of the labeled samples for SSMVCCAE-SRKDA/-LLP/-LapSVM on the Indian Pines data.
3742 KiB  
Article
Urban Change Analysis with Multi-Sensor Multispectral Imagery
by Yuqi Tang and Liangpei Zhang
Remote Sens. 2017, 9(3), 252; https://doi.org/10.3390/rs9030252 - 9 Mar 2017
Cited by 36 | Viewed by 7202
Abstract
An object-based method is proposed in this paper for change detection in urban areas with multi-sensor multispectral (MS) images. The co-registered bi-temporal images are resampled to match each other. By mapping the segmentation of one image to the other, a change map is generated by characterizing the change probability of image objects based on the proposed change feature analysis. The map is then used to separate the changes from unchanged areas by two threshold selection methods and k-means clustering (k = 2). In order to consider the multi-scale characteristics of ground objects, multi-scale fusion is implemented. The experimental results obtained with QuickBird and IKONOS images show the superiority of the proposed method in detecting urban changes in multi-sensor MS images. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
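Of the three separation options mentioned (two threshold selection methods and k-means with k = 2), the k-means route is the simplest to sketch: cluster the per-object change feature into two groups and call the cluster with the larger center "changed". The toy change probabilities below are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def split_changed_objects(change_probability):
    """Separate image objects into changed / unchanged with k-means (k = 2) on a
    per-object change-probability feature; the cluster with the larger center is
    taken as the 'changed' class."""
    features = np.asarray(change_probability, dtype=float).reshape(-1, 1)
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
    changed_cluster = km.cluster_centers_.ravel().argmax()
    return km.labels_ == changed_cluster

# Toy per-object change probabilities from the change feature analysis.
probabilities = [0.05, 0.12, 0.08, 0.81, 0.76, 0.10, 0.92]
print(split_changed_objects(probabilities))
```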
Figure 1. Processing flow of the proposed method.
Figure 2. Possible distributions of a changed object and its relevant changed area, whose statistical feature variation is described as above (a–f).
Figure 3. Interpolated bi-temporal images of the first study area: (a) acquired by QuickBird in April 2005 (L1); and (b) acquired by IKONOS in July 2009 (L2).
Figure 4. The change detection maps resulting from (a) Otsu's thresholding method, (b) threshold selection by clustering gray levels of boundaries, and (c) k-means clustering (k = 2), compared with (d) the reference image, with L1 as the basis image in the first study area (scale = 100).
Figure 5. Overall errors of change detection with different segmentation scales, with L1 as the basis image in the first study area.
Figure 6. Overall errors of change detection using different multi-scale fusion thresholds, with L1 as the basis image in the first study area.
Figure 7. Change detection maps resulting from (a) the proposed multi-scale k-means method and (b) the method using varying geometric and radiometric properties [35], with L2 as the basis image in the first study area (scale = 100).
Figure 8. Degraded bi-temporal images of the first study area: (a) acquired by QuickBird in April 2005 (L1); and (b) acquired by IKONOS in July 2009 (L2).
Figure 9. The change detection maps resulting from (a) Otsu's thresholding method, (b) threshold selection by clustering gray levels of boundaries, and (c) k-means clustering (k = 2), compared with (d) the reference image, with L2 as the basis image in the first study area (scale = 50).
Figure 10. Overall errors of change detection with different segmentation scales, with L2 as the basis image in the first study area.
Figure 11. Change detection maps resulting from (a) the proposed multi-scale k-means method and (b) the method using varying geometric and radiometric properties [35], with L2 as the basis image in the first study area (scale = 50).
Figure 12. Preprocessed bi-temporal images of the second study area: (a) acquired by QuickBird in May 2002 (L1); and (b) acquired by IKONOS in July 2009 (L2).
Figure 13. Change detection maps resulting from (a) Otsu's thresholding method, (b) threshold selection by clustering gray levels of boundaries, and (c) k-means clustering (k = 2), compared with (d) the reference image, with L2 as the basis image in the second study area (scale = 50).
Figure 14. Change detection maps resulting from (a) the proposed multi-scale k-means method and (b) the method using varying geometric and radiometric properties [35], with L2 as the basis image in the second study area (scale = 50).
5089 KiB  
Article
Refinement of Hyperspectral Image Classification with Segment-Tree Filtering
by Lu Li, Chengyi Wang, Jingbo Chen and Jianglin Ma
Remote Sens. 2017, 9(1), 69; https://doi.org/10.3390/rs9010069 - 16 Jan 2017
Cited by 8 | Viewed by 6806
Abstract
This paper proposes a novel method of segment-tree filtering to improve the classification accuracy of hyperspectral images (HSIs). Segment-tree filtering is a versatile method that incorporates spatial information and has been widely applied in image preprocessing. However, to use this powerful framework in hyperspectral image classification, we must reduce the original feature dimensionality to avoid the Hughes problem; otherwise, the computational costs are high and the classification accuracy obtained with the original HSI bands is unsatisfactory. Therefore, feature extraction is adopted to produce new salient features. In this paper, the Semi-supervised Local Fisher (SELF) method of discriminant analysis is used to reduce the HSI dimensionality. Then, a tree-structured filter that adaptively incorporates contextual information is constructed. Additionally, an initial classification map is generated using multi-class support vector machines (SVMs), and segment-tree filtering is conducted using this map. Finally, a simple Winner-Take-All (WTA) rule is applied to determine the class of each pixel in the HSI based on the maximum probability. The experimental results demonstrate that the proposed method can improve HSI classification accuracy significantly. Furthermore, a comparison between the proposed method and current state-of-the-art methods, such as Extended Morphological Profiles (EMPs), Guided Filtering (GF), and Markov Random Fields (MRFs), suggests that our method is both competitive and robust. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
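The two-pass (leaf-to-root, then root-to-leaf) aggregation that segment-tree filtering performs can be sketched on a toy tree. The update rules below follow the standard non-local cost-aggregation formulation with similarity weights exp(-w/σ) and are an assumption about the paper's exact scheme, as are the chain-shaped tree, the edge weights, and σ.

```python
import numpy as np

def tree_filter_probabilities(probs, parent, edge_weight, sigma=0.1):
    """Two-pass aggregation of per-pixel class probabilities over a tree: a forward
    (leaf-to-root) pass followed by a backward (root-to-leaf) pass with similarity
    weights exp(-edge_weight / sigma). Nodes are assumed ordered so that every
    parent index is smaller than its child's (root = node 0, parent[0] = -1)."""
    n = len(parent)
    sim = np.exp(-np.asarray(edge_weight, dtype=float) / sigma)  # node-to-parent similarity
    agg = np.array(probs, dtype=float)

    # Forward pass: children (visited from the last index down) push support to parents.
    for node in range(n - 1, 0, -1):
        agg[parent[node]] += sim[node] * agg[node]

    # Backward pass: parents push the remaining support back down to their children.
    filtered = agg.copy()
    for node in range(1, n):
        p = parent[node]
        filtered[node] = sim[node] * filtered[p] + (1.0 - sim[node] ** 2) * agg[node]

    # Winner-Take-All: each pixel takes the class with the maximum filtered support.
    return filtered.argmax(axis=1), filtered

# Toy chain-shaped tree of 4 pixels and 3 classes; node 2 starts as a noisy outlier.
probs = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.6, 0.3, 0.1]]
parent = [-1, 0, 1, 2]
edge_weight = [0.0, 0.01, 0.02, 0.01]       # small weights mean similar pixels
labels, _ = tree_filter_probabilities(probs, parent, edge_weight)
print(labels)                               # the outlier at node 2 is smoothed away
```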
Figure 1. Workflow of segment-tree filtering for HSI classification.
Figure 2. Indian Pines reconstructed using different methods of dimensional reduction: (a) LDA; (b) LFDA; (c) PCA; (d) SELF. The number of samples in the training set accounts for only 1% of all samples.
Figure 3. Indian Pines reconstructed using different methods of dimensional reduction: (a) LDA; (b) LFDA; (c) PCA; (d) SELF. The number of samples in the training set accounts for 20% of all samples.
Figure 4. Subtrees constructed for different standard benchmarks: (a) Indian Pines; (b) University of Pavia; (c) Salinas. Each color segment represents a subtree.
Figure 5. Tree structure of the segment-tree filter for the Indian Pines dataset: (a) image of the segment tree; (b) close-up of the red rectangular region in (a).
Figure 6. Segment-tree filtering in two sequential passes: (a) forward filtering from the leaves to the root; (b) backward filtering from the root to the leaves.
Figure 7. Classification results for Indian Pines: (a) actual values; (b) multi-class SVM; (c) ST; (d) PCA + ST; (e) LDA + ST; (f) LFDA + ST; (g) EMPs; (h) SVM + MRF; (i) PCA + GF [18]; (j) the proposed method. Panels (d–f) combine different dimensionality reduction methods (before "+") with segment-tree filtering (after "+"); (g–i) are other methods of spatial-spectral classification for HSIs.
Figure 8. Classification results for Pavia University: (a) actual values; (b) multi-class SVM; (c) ST; (d) PCA + ST; (e) LDA + ST; (f) LFDA + ST; (g) EMPs; (h) SVM + MRF; (i) PCA + GF [18]; (j) the proposed method. Panels (d–f) combine different dimensionality reduction methods (before "+") with segment-tree filtering (after "+"); (g–i) are other methods of spatial-spectral classification for HSIs.
Figure 9. Classification results for Salinas: (a) actual values; (b) multi-class SVM; (c) ST; (d) PCA + ST; (e) LDA + ST; (f) LFDA + ST; (g) EMPs; (h) SVM + MRF; (i) PCA + GF [18]; (j) the proposed method. Panels (d–f) combine different dimensionality reduction methods (before "+") with segment-tree filtering (after "+"); (g–i) are other methods of spatial-spectral classification for HSIs.
Figure 10. Influence of parameter β on the classification accuracy.
Figure 11. Influence of parameter K on the classification accuracy.
Figure 12. Influence of parameter r on the classification accuracy.
Figure 13. Influences of different dissimilarity measures on the classification accuracy.
Figure 14. Classification accuracy based on the number of training samples for the Indian Pines dataset.

Other


22900 KiB  
Technical Note
Flood Inundation Mapping from Optical Satellite Images Using Spatiotemporal Context Learning and Modest AdaBoost
by Xiaoyi Liu, Hichem Sahli, Yu Meng, Qingqing Huang and Lei Lin
Remote Sens. 2017, 9(6), 617; https://doi.org/10.3390/rs9060617 - 16 Jun 2017
Cited by 21 | Viewed by 9488
Abstract
Due to its capacity for temporal and spatial coverage, remote sensing has emerged as a powerful tool for mapping inundation, and many methods have been applied effectively in remote sensing flood analysis. Generally, supervised methods can achieve better precision than unsupervised ones. However, the human intervention they require makes their results subjective and difficult to obtain automatically, and automation is important for disaster response. In this work, we propose a novel procedure combining a spatiotemporal context learning method and a Modest AdaBoost classifier, which aims to extract inundation in an automatic and accurate way. First, the context model was built with the images to calculate a confidence value for each pixel, which represents the probability of the pixel remaining unchanged. Then, the pixels with the highest probabilities, which we define as 'permanent pixels', were used as samples to train the Modest AdaBoost classifier. By applying the strong classifier to the target scene, an inundation map can be obtained. The proposed procedure is validated using two flood cases observed by different sensors, HJ-1A CCD and GF-4 PMS. Qualitative and quantitative evaluation results show that the proposed procedure can achieve accurate and robust mapping results. Full article
(This article belongs to the Special Issue Learning to Understand Remote Sensing Images)
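The classification stage can be sketched with scikit-learn, using AdaBoostClassifier as a stand-in for the Modest AdaBoost variant the paper uses (Modest AdaBoost is not available in scikit-learn). The two-feature toy data and class means below are illustrative only.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def map_inundation(scene_features, permanent_features, permanent_labels):
    """Train a boosted classifier on 'permanent pixels' (high-confidence pixels
    labelled water / non-water) and apply it to every pixel of the target scene.
    AdaBoostClassifier stands in for the Modest AdaBoost variant used in the paper."""
    clf = AdaBoostClassifier(n_estimators=100, random_state=0)
    clf.fit(permanent_features, permanent_labels)
    return clf.predict(scene_features)

# Toy example with two spectral features per pixel; 1 = water, 0 = land.
rng = np.random.default_rng(3)
water = rng.normal([0.05, 0.02], 0.01, size=(200, 2))
land = rng.normal([0.25, 0.30], 0.05, size=(200, 2))
X_permanent = np.vstack([water, land])
y_permanent = np.array([1] * 200 + [0] * 200)
scene = rng.normal([0.15, 0.16], 0.10, size=(10, 2))   # pixels of the flooded scene
print(map_inundation(scene, X_permanent, y_permanent))
```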
Figure 1. (a) Location of the first study site near the border of Russia and China; (b) extent of the HJ-1A CCD data used in this study.
Figure 2. (a) Location of the second study site in the north of Hunan Province, China; (b) extent of the GF-4 PMS data used in this study.
Figure 3. Flowchart of the permanent pixel extraction procedure.
Figure 4. Flowchart of the inundation mapping procedure.
Figure 5. False colour composites (R4G3B2) of HJ-1A CCD images acquired on (a) 12 July 2013 (before the flood) and (b) 27 August 2013 (after the flood) for the first case study. (c) Corresponding MOD44W water mask product with water in blue and land in black.
Figure 6. (a–d) Flood inundation mapping results for the first case study using the K-MEANs, SP-MADB, STP-MADB and STCLP-MADB methods (gray: flood pixels in both the detection and reference maps; blue: flood pixels only in the detection map; yellow: flood pixels only in the reference map; black: background). (e) The locations of the four sub-regions (red rectangles), shown on the reference inundation map derived from the GF-1 WFV data, with water in yellow and background in black.
Figure 7. From the left to the right column: the regions of interest A, B, C and D for the first case study. From the second to the fifth row: the corresponding detection and reference maps, with the flood in blue and the background in black.
Figure 8. First case study: commission, omission and overall accuracy as a function of n, the percentage of permanent pixels.
Figure 9. False colour composites (R: 5, G: 4, B: 3) of GF-4 PMS images acquired on (a) 17 June 2016 (before the flood) and (b) 23 July 2016 (after the flood) for the second test case. (c) Corresponding MOD44W water mask product with water in blue and land in black.
Figure 10. (a–d) Flood inundation mapping results for the second test area using the K-MEANs, SP-MADB, STP-MADB and STCLP-MADB methods (gray: flood pixels in both the detection and reference maps; blue: flood pixels only in the detection map; yellow: flood pixels only in the reference map; black: background). (e) The locations of the four sub-regions (red rectangles), shown on the reference inundation map derived from the HJ-1B CCD data, with water in yellow and background in black.
Figure 11. From the left to the right column: the regions of interest A, B, C and D for the second case study. From the second to the fifth row: the corresponding detection and reference maps, with the flood in blue and the background in black.
Figure 12. Second test case: commission, omission and overall accuracy as a function of n, the percentage of permanent pixels.