Image Processing from Aerial and Satellite Imagery

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: 15 March 2025 | Viewed by 16786

Special Issue Editors


Guest Editor
School of Applied Computational Sciences, Meharry Medical College, Nashville, TN 37208, USA
Interests: geospatial big data to support health-care-related application scenarios; unmanned aerial systems for environmental monitoring and emergency situation response; close-range photogrammetry, computer vision, and 3D printing for health care and epidemiology; human–computer/human–robot symbiosis for decision support systems

Guest Editor
Interdisciplinary Research Center for Aviation and Space Exploration, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
Interests: machine learning and artificial intelligence models; approaches to remote sensing applications and geospatial data processing; innovative remote sensing and photogrammetry technologies for the assessment of the environmental impact of construction; solving problems of town planning and spatial territorial management; research into and application of remote sensing, UAS, close-range photogrammetry, and terrestrial laser scanning

Guest Editor
School of Computer Science and Engineering, Vellore Institute of Technology, Chennai 60012, Tamil Nadu, India
Interests: remote sensing for aquatic and land applications; earth observations; image processing; ocean optics; radiative transfer theory; modelling water quality features using machine learning; object detection using remote sensing and AI-based approaches; land capability, coastal vulnerability, and environmental sensitivity mapping for decision support

Special Issue Information

Dear Colleagues,

Aerial and satellite imagery are invaluable resources in various fields, including environmental monitoring, urban planning, agriculture, disaster management, and more. This Special Issue of Remote Sensing, entitled “Image Processing from Aerial and Satellite Imagery”, aims to bring together cutting-edge interdisciplinary research in image processing techniques, geospatial science, and technology tailored to these data sources. With the proliferation of remote sensing platforms, there is a growing need for the use of advanced image analysis methods to extract meaningful information from the vast volumes of aerial and satellite imagery available today.

The primary objective of this Special Issue is to provide a platform for researchers, scientists, and experts to share their latest findings and innovations in the field of image processing for aerial and satellite imagery. This research aligns seamlessly with the journal's scope, which focuses on remote sensing technologies and their applications. By fostering collaboration and knowledge exchange, this Special Issue seeks to advance state-of-the-art image processing techniques, geospatial information science, and technologies for real-world applications, including the challenges associated with deploying the geospatial big data obtained from satellite-, aerial/UAV-, and terrestrial-based Earth observation techniques.

We invite submissions of original research articles, reviews, and innovative methodologies addressing, but not limited to, the following themes:

  • Image enhancement and restoration: Novel approaches for improving the quality of aerial and satellite images, including noise reduction, deblurring, correction, and super-resolution.
  • Feature extraction and classification: Algorithms and methods for automated detection and classification of objects and phenomena in imagery, such as land use/land cover classification, object recognition, and change detection.
  • Machine learning and deep learning: Applications of machine learning and deep learning techniques for image analysis in remote sensing, including convolutional neural networks, recurrent neural networks, and generative adversarial networks.
  • Data fusion: Integration of multiple data sources, such as multispectral, hyperspectral, and LiDAR data, to enhance the information extracted from imagery.
  • Time series analysis: Temporal analysis of aerial and satellite imagery to monitor dynamic processes and long-term trends.
  • Applications: Real-world applications of aerial and satellite imagery processing in fields like agriculture, forestry, urban planning, disaster monitoring, and environmental conservation.

This Special Issue provides a unique opportunity for researchers to contribute to the advancement of image processing methods for aerial and satellite imagery, ultimately supporting informed decision making and sustainable development in a variety of domains. We encourage authors to submit their high-quality research in order to help shape the future of this critical research area.

Prof. Dr. Eugene Levin
Prof. Dr. Roman Shults
Dr. Surya Prakash Tiwari
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • remote sensing
  • aerial imagery
  • satellite imagery
  • image processing
  • data fusion
  • machine learning
  • feature extraction
  • change detection
  • environmental monitoring
  • photogrammetry/space photogrammetry
  • land use/land cover classification
  • geospatial analysis

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (12 papers)


Research


28 pages, 28459 KiB  
Article
Multi-Temporal Remote Sensing Satellite Data Analysis for the 2023 Devastating Flood in Derna, Northern Libya
by Roman Shults, Ashraf Farahat, Muhammad Usman and Md Masudur Rahman
Remote Sens. 2025, 17(4), 616; https://doi.org/10.3390/rs17040616 - 11 Feb 2025
Viewed by 503
Abstract
Floods are among the most dangerous and destructive geohazards, leading to human victims and severe economic outcomes. Every year, many regions around the world suffer from devastating floods, and the estimation of flood aftermaths is a high priority for the global community. One such flood took place in northern Libya in September 2023. The presented study evaluates the flood aftermath for the city of Derna, Libya, using high-resolution GEOEYE-1 and Sentinel-2 satellite imagery in the Google Earth Engine environment. The primary task is obtaining and analyzing data that provide high accuracy and detail for the study region. The main objective of the study is to explore the capabilities of different algorithms and remote sensing datasets for quantitative change estimation after the flood. Different supervised classification methods were examined, including random forest, support vector machine, naïve Bayes, and classification and regression tree (CART), and various sets of classification hyperparameters were considered. The high-resolution GEOEYE-1 images were used for precise change detection through image differencing (pixel-to-pixel comparison) and geographic object-based image analysis (GEOBIA) for building extraction, whereas Sentinel-2 data were employed for classification and subsequent change detection from the classified images. Object-based image analysis was also performed to extract building footprints from the very high-resolution GEOEYE images and to quantify the buildings that collapsed due to the flood. The first stage of the study was the development of a workflow for data analysis, which comprises three parallel analysis processes. High-resolution GEOEYE-1 images of Derna were investigated with change detection algorithms. In addition, different indices (the normalized difference vegetation index (NDVI), soil-adjusted vegetation index (SAVI), transformed NDVI (TNDVI), and normalized difference moisture index (NDMI)) were calculated to facilitate the recognition of damaged regions. In the final stage, the analysis results were fused to obtain a damage estimate for the studied region. As the main output, the area changes for the primary classes and maps portraying these changes were obtained. Recommendations for data usage and further processing in Google Earth Engine were also developed. Full article
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
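The abstract above outlines an index-differencing workflow in Google Earth Engine. The following minimal sketch, using the GEE Python API, illustrates that kind of step for a pre-/post-flood NDVI difference over Derna; the area of interest, date windows, and cloud threshold are assumptions for illustration, not the authors' exact configuration.

```python
# Minimal GEE (Python API) sketch of NDVI differencing before/after the flood.
# The geometry, dates, and cloud threshold are placeholder assumptions.
import ee

ee.Initialize()

derna = ee.Geometry.Rectangle([22.55, 32.72, 22.70, 32.80])  # approximate AOI (assumed)

def s2_composite(start, end):
    """Median Sentinel-2 surface-reflectance composite over the AOI."""
    return (ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')
            .filterBounds(derna)
            .filterDate(start, end)
            .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 20))
            .median())

pre = s2_composite('2023-08-10', '2023-08-25')    # pre-flood window (assumed)
post = s2_composite('2023-09-15', '2023-09-30')   # post-flood window (assumed)

ndvi_pre = pre.normalizedDifference(['B8', 'B4']).rename('NDVI_pre')
ndvi_post = post.normalizedDifference(['B8', 'B4']).rename('NDVI_post')

# Image differencing: strongly negative values indicate vegetation/surface loss.
ndvi_change = ndvi_post.subtract(ndvi_pre).rename('NDVI_change')

stats = ndvi_change.reduceRegion(
    reducer=ee.Reducer.mean(), geometry=derna, scale=10, maxPixels=1e9)
print(stats.getInfo())
```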
Figures:
Figure 1: Map of Libya with the location of Derna (P.C. BBC News after the flooding).
Figure 2: Pre-flood (1 July 2023) and post-flood (13 September 2023) images showing damage caused by the collapse of the Derna Dam.
Figure 3: Change detection analysis flowchart.
Figure 4: Image differencing for the Derna region using (a) band 1, (b) band 2, and (c) band 3.
Figure 5: Area changes: (a) band 1, (b) band 2, and (c) band 3.
Figure 6: Spectral indices and their changes for the Derna area using NDVI (a,b), SAVI (c,d), TNDVI (e,f), and NDMI (g,h).
Figure 7: Derna region random forest classification results for (a) 18 August 2023 and (b) 22 September 2023.
Figure 8: Derna region CART classification results for (a) 18 August 2023 and (b) 22 September 2023.
Figure 9: Derna region naïve Bayes classification for 18 August 2023 (a) and 22 September 2023 (b).
Figure 10: SVM hyperparameters and classification approaches.
Figure 11: Derna region SVM classification for 18 August 2023 (a) and 22 September 2023 (b).
Figure 12: Derna region SVM classification with the polynomial kernel for 18 August 2023 (a) and 22 September 2023 (b).
Figure 13: Building footprints extracted using GEOBIA and buildings damaged due to the flash flood.
22 pages, 24659 KiB  
Article
A Multi-Scale Fusion Deep Learning Approach for Wind Field Retrieval Based on Geostationary Satellite Imagery
by Wei Zhang, Yapeng Wu, Kunkun Fan, Xiaojiang Song, Renbo Pang and Boyu Guoan
Remote Sens. 2025, 17(4), 610; https://doi.org/10.3390/rs17040610 - 11 Feb 2025
Viewed by 416
Abstract
Wind field retrieval, a crucial component of weather forecasting, has been significantly enhanced by recent advances in deep learning. However, existing approaches that are primarily focused on wind speed retrieval are limited by their inability to achieve real-time, full-coverage retrievals at large scales. To address this problem, we propose a novel multi-scale fusion retrieval (MFR) method, leveraging geostationary observation satellites. At the mesoscale, MFR incorporates a cloud-to-wind transformer model, which employs local self-attention mechanisms to extract detailed wind field features. At large scales, MFR employs a multi-encoder coordinate U-net model, which combines multiple encoders and utilises coordinate information to fuse meso- to large-scale features, enabling accurate and regionally complete wind field retrievals while reducing the computational resources required. The MFR method was validated using Level 1 data from the Himawari-8 satellite, covering a geographic range of 0–60°N and 100–160°E, at a resolution of 0.25°. Wind field retrieval was accomplished within seconds using a single graphics processing unit. The mean absolute error of wind speed obtained by the MFR was 0.97 m/s, surpassing the accuracy of the CFOSAT and HY-2B Level 2B wind field products. The mean absolute error for wind direction achieved by the MFR was 23.31°, outperforming CFOSAT Level 2B products and aligning closely with HY-2B Level 2B products. The MFR represents a pioneering approach for generating initial fields for large-scale grid forecasting models. Full article
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
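For readers interested in how wind speed and direction errors of this kind are typically computed, the sketch below derives speed and (circular) meteorological direction from retrieved U/V components and evaluates mean absolute errors against a reference field such as ERA5. It is a generic illustration under stated assumptions, not the authors' evaluation code.

```python
# Illustrative evaluation sketch (not the authors' code): wind speed/direction
# from U/V components and MAE with a circular treatment of direction error.
import numpy as np

def speed_dir(u, v):
    """Wind speed (m/s) and meteorological direction (deg, direction the wind comes from)."""
    speed = np.hypot(u, v)
    direction = (180.0 + np.degrees(np.arctan2(u, v))) % 360.0
    return speed, direction

def mae_speed_dir(u_ret, v_ret, u_ref, v_ref):
    s_ret, d_ret = speed_dir(u_ret, v_ret)
    s_ref, d_ref = speed_dir(u_ref, v_ref)
    mae_speed = np.mean(np.abs(s_ret - s_ref))
    # Wrap direction differences to [-180, 180] before taking the absolute value.
    d_err = (d_ret - d_ref + 180.0) % 360.0 - 180.0
    mae_dir = np.mean(np.abs(d_err))
    return mae_speed, mae_dir

# Toy example on a small grid with synthetic reference and retrieved fields.
rng = np.random.default_rng(0)
u_ref, v_ref = rng.normal(0, 5, (2, 64, 64))
u_ret = u_ref + rng.normal(0, 1, (64, 64))
v_ret = v_ref + rng.normal(0, 1, (64, 64))
print(mae_speed_dir(u_ret, v_ret, u_ref, v_ref))
```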
Figures:
Figure 1: Vector diagrams for (a) normal conditions, (b) Typhoon Hinnamnor, and (c) Typhoon Nanmadol in the study area; arrow directions and lengths indicate wind direction and speed, respectively.
Figure 2: Multi-scale fusion retrieval architecture (Res, resolution).
Figure 3: Sliding window sampling method.
Figure 4: Structure of the C2W-Former model.
Figure 5: (a) The Swin Transformer block architecture; (b) W-MSA and SW-MSA, multi-head self-attention modules with regular and shifted windowing configurations, respectively.
Figure 6: Discontinuous and blurred boundaries in the preliminary UV.
Figure 7: Architecture of the multi-encoder coordinate U-net (M-CoordUnet) model: (a) overall architecture; (b–d) structural details of the encoder, centre, and decoder blocks.
Figure 8: Analysis of MFR and OSR in comparison to ERA5 for land and sea regions.
Figure 9: Scatter plots of each model versus ERA5 at 00:00 on 19 April 2021 (Super Typhoon Surigae).
Figure 10: Analysis of ERA5, IFS, and MFR results in comparison to weather station data.
Figure 11: UV MAE statistics of MFR wind fields for land, sea, and the total study area across different months and regions in the test data.
Figure 12: Comparison of wind field characteristics among different models and data products during Super Typhoon Surigae (19 April 2021, 00:00 UTC).
Figure 13: Comparison of wind field characteristics among different models and data products during Super Typhoon Mindulle (28 September 2021, 12:00 UTC).
20 pages, 8861 KiB  
Article
An Improved Registration Method for UAV-Based Linear Variable Filter Hyperspectral Data
by Xiao Wang, Chunyao Yu, Xiaohong Zhang, Xue Liu, Yinxing Zhang, Junyong Fang and Qing Xiao
Remote Sens. 2025, 17(1), 55; https://doi.org/10.3390/rs17010055 - 27 Dec 2024
Viewed by 433
Abstract
Linear Variable Filter (LVF) hyperspectral cameras possess the advantages of high spectral resolution, compact size, and light weight, making them highly suitable for unmanned aerial vehicle (UAV) platforms. However, challenges arise in data registration due to the imaging characteristics of LVF data and the instability of UAV platforms. These challenges stem from the diversity of LVF data bands and significant inter-band differences. Even after geometric processing, adjacent flight lines still exhibit varying degrees of geometric deformation. In this paper, a progressive grouping-based strategy for iterative band selection and registration is proposed. In addition, an improved Scale-Invariant Feature Transform (SIFT) algorithm, termed the Double Sufficiency–SIFT (DS-SIFT) algorithm, is introduced. This method first groups bands, selects the optimal reference band, and performs coarse registration based on the SIFT method. Subsequently, during the fine registration stage, it introduces an improved position/scale/orientation joint SIFT registration algorithm (IPSO-SIFT) that integrates partitioning and the principle of structural similarity. This algorithm iteratively refines registration based on the grouping results. Experimental data obtained from a self-developed and integrated LVF hyperspectral remote sensing system are utilized to verify the effectiveness of the proposed algorithm. A comparison with classical algorithms, such as SIFT and PSO-SIFT, demonstrates that the registration of LVF hyperspectral data using the proposed method achieves superior accuracy and efficiency. Full article
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
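The coarse registration stage described above builds on standard SIFT matching. The sketch below shows such a SIFT + RANSAC homography registration of one band to a reference band using OpenCV; it assumes OpenCV 4.4 or later (where SIFT is included), uses hypothetical file names, and does not reproduce the paper's DS-SIFT or IPSO-SIFT refinements.

```python
# Sketch of a SIFT-based coarse registration step (not the paper's DS-SIFT).
import cv2
import numpy as np

def coarse_register(band_img, ref_img):
    """Warp one band onto the reference band with SIFT matching + RANSAC homography."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(band_img, None)
    k2, d2 = sift.detectAndCompute(ref_img, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(d1, d2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # Lowe ratio test

    src = np.float32([k1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)

    h, w = ref_img.shape[:2]
    return cv2.warpPerspective(band_img, H, (w, h)), H

# Hypothetical usage with two single-band images:
# band = cv2.imread('band_045.tif', cv2.IMREAD_GRAYSCALE)
# ref = cv2.imread('band_090.tif', cv2.IMREAD_GRAYSCALE)
# registered, H = coarse_register(band, ref)
```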
Figures:
Figure 1: Comparison diagram of the optical splitting mode.
Figure 2: Initial grouping diagram.
Figure 3: Composition of the experimental system.
Figure 4: Overall appearance of the self-integrated UAV hyperspectral remote sensing system.
Figure 5: Appearance of the LVF hyperspectral camera.
Figure 6: Local plan view of the experimental area.
Figure 7: Unmanned aerial vehicle hyperspectral remote sensing system.
Figure 8: Data processing workflow.
Figure 9: Original data (left) and reconstructed data (right).
Figure 10: Image before (a) and after (b) geometric correction, taking the 90th band as an example.
Figure 11: Mutual information after initial group registration: (a) group-wise mutual information map after registration; (b) mutual information map after group-wise registration of the first-level base bands.
Figure 12: DS-SIFT algorithm registration process: (a) feature point detection (numbers denote feature point IDs); (b) feature point connectivity; (c) reference and registered images; (d) reference checkerboard, registered checkerboard, and fused mosaic images; (e) histograms before and after registration.
Figure 13: Root mean square error of the DS-SIFT algorithm.
Figure 14: Difference value of mutual information.
Figure 15: Running time comparison.
Figure 16: DS-SIFT registration effect: synthetic three-band imagery before (a) and after (b) registration.
Figure 17: DS-SIFT registration effect for another test district: synthetic three-band imagery before (a) and after (b) registration.
19 pages, 4696 KiB  
Article
The Analysis of Land Use and Climate Change Impacts on Lake Victoria Basin Using Multi-Source Remote Sensing Data and Google Earth Engine (GEE)
by Maram Ali, Tarig Ali, Rahul Gawai, Lara Dronjak and Ahmed Elaksher
Remote Sens. 2024, 16(24), 4810; https://doi.org/10.3390/rs16244810 - 23 Dec 2024
Viewed by 1174
Abstract
Over 30 million people rely on Lake Victoria for survival in Northeast African countries, including Ethiopia, Eritrea, Somalia, and Djibouti. The lake faces significant challenges due to changes in land use and climate. This study used multi-source remote sensing data in the Google Earth Engine (GEE) platform to create Land Use and Land Cover (LULC), land surface temperature (LST), and Normalized Difference Water Index (NDWI) layers for the period 2000–2023 to understand the impact of LULC and climate change on the Lake Victoria Basin. The land use/land cover trends before 2020 indicated an increase in urban areas from 0.13% in 2000 to 0.16% in 2020. Croplands increased from 6.51% in 2000 to 7.88% in 2020. The water surface area averaged 61,559 km² and has increased since 2000 at an average rate of 1.3%. The share of the “Permanent Wetland” class varied from 1.70% to 1.83% between 2000 and 2020. Cropland/Natural Vegetation Mosaics rose from 12.77% to 15.01% over 2000–2020. However, more than 29,000 residents were displaced in mid-2020 as the water level rose by 1.21 m from the fall of 2019 to the middle of 2020. Furthermore, land surface temperature averaged 23.98 °C in 2000 and 23.49 °C in 2024. Full article
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
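As an illustration of the NDWI/water-area step mentioned in the abstract, the sketch below uses the GEE Python API to compute the McFeeters NDWI from a Landsat 8 composite and estimate the water surface area within a bounding box around Lake Victoria. The collection, dates, threshold, and reduction scale are assumptions, not the authors' multi-source configuration.

```python
# Minimal GEE (Python API) sketch: NDWI-based water area estimate for Lake Victoria.
# Dates, collection, threshold, and scale are illustrative assumptions.
import ee

ee.Initialize()

lake_region = ee.Geometry.Rectangle([31.4, -3.1, 34.9, 0.6])  # approximate bounding box

composite = (ee.ImageCollection('LANDSAT/LC08/C02/T1_L2')
             .filterBounds(lake_region)
             .filterDate('2020-01-01', '2020-12-31')
             .median())

# McFeeters NDWI = (Green - NIR) / (Green + NIR); water pixels have NDWI > 0.
ndwi = composite.normalizedDifference(['SR_B3', 'SR_B5']).rename('NDWI')
water = ndwi.gt(0)

area_km2 = (water.multiply(ee.Image.pixelArea())
            .reduceRegion(reducer=ee.Reducer.sum(),
                          geometry=lake_region,
                          scale=90,          # coarse scale to keep the example fast
                          maxPixels=1e10)
            .getNumber('NDWI')
            .divide(1e6))
print('Estimated water area (km^2):', area_km2.getInfo())
```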
Figures:
Figure 1: The percentage of shoreline length by country.
Figure 2: Lake Victoria Basin countries' percentage shares of the basin area.
Figure 3: Lake Victoria's hydrological and geographical settings [9].
Figure 4: Land use and land cover change in the LVB, 2000–2020.
Figure 5: LVB land use and land cover pattern changes in percentage, 2000–2020.
Figure 6: The change in water level in Lake Victoria from 1992 to 2024 (https://hydroweb.theia-land.fr/hydroweb/view/L_victoria?lang=en, accessed on 18 March 2024).
Figure 7: The land surface temperature change in the LVB in 2000, 2005, 2010, 2015, 2020, and 2024.
Figure 8: The annual maximum, minimum, and average LST in January in 2000, 2005, 2010, 2015, 2020, and 2024 (Google Earth data source).
Figure 9: The polygon layers representing the lake's boundaries in 2000–2023.
Figure 10: Lake Victoria's water area variation in the period 2000–2023.
28 pages, 12630 KiB  
Article
Satellite Image Restoration via an Adaptive QWNNM Model
by Xudong Xu, Zhihua Zhang and M. James C. Crabbe
Remote Sens. 2024, 16(22), 4152; https://doi.org/10.3390/rs16224152 - 7 Nov 2024
Viewed by 847
Abstract
Due to channel noise and random atmospheric turbulence, retrieved satellite images are always distorted and degraded and so require further restoration before use in various applications. The latest quaternion-based weighted nuclear norm minimization (QWNNM) model, which utilizes the idea of low-rank matrix approximation and the quaternion representation of multi-channel satellite images, can achieve image restoration and enhancement. However, the QWNNM model ignores the impact of noise on similarity measurement, lacks the utilization of residual image information, and fixes the number of iterations. In order to address these drawbacks, we propose three adaptive strategies in a new adaptive QWNNM model: adaptive noise-resilient block matching, adaptive feedback of the residual image, and an adaptive iteration stopping criterion. Both simulation experiments with known noise/blurring and real-environment experiments with unknown noise/blurring demonstrated that the adaptive QWNNM model outperforms the original QWNNM model and other state-of-the-art satellite image restoration models based on very different technical approaches. Full article
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
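To make the low-rank idea behind QWNNM concrete, the sketch below implements a real-valued weighted nuclear norm minimization step on a matrix of stacked similar patches (weighted soft-thresholding of singular values). The quaternion representation and the paper's three adaptive strategies are not reproduced; this is a simplified illustration only, with the weighting constant taken from the common WNNM convention.

```python
# Simplified, real-valued WNNM proximal step (illustrative, not the QWNNM model).
import numpy as np

def wnnm_denoise_patch_group(Y, sigma, c=2.0 * np.sqrt(2.0), eps=1e-8):
    """One WNNM step on a patch-group matrix Y (patch_size^2 x n_patches)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    n = Y.shape[1]
    # Estimate the singular values of the underlying clean matrix.
    s_clean = np.sqrt(np.maximum(s**2 - n * sigma**2, 0.0))
    # Weights: large singular values (strong structure) are shrunk less.
    w = c * np.sqrt(n) * sigma**2 / (s_clean + eps)
    s_hat = np.maximum(s - w, 0.0)
    return (U * s_hat) @ Vt

# Toy usage: a rank-2 matrix corrupted by Gaussian noise.
rng = np.random.default_rng(1)
clean = rng.normal(size=(64, 2)) @ rng.normal(size=(2, 40))
noisy = clean + rng.normal(scale=0.5, size=clean.shape)
restored = wnnm_denoise_patch_group(noisy, sigma=0.5)
print('noisy error   :', np.linalg.norm(noisy - clean))
print('restored error:', np.linalg.norm(restored - clean))
```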
Figures:
Figure 1: Magnitude of the QDCT of clear and degraded images.
Figure 2: (a) Degraded image (σ = 25/MB(20, 60)), (b) restored image, (c) residual image.
Figure 3: The quality of image restoration measured by PSNR/SSIM as a function of the number of iterations.
Figure 4: Flowchart of the adaptive QWNNM model.
Figure 5: All satellite images used in the restoration experiments, enumerated from left to right and top to bottom.
Figure 6: Evolution of PSNR (left) and SSIM (right) for different parameters ε and c during the restoration of satellite image 'Img47'.
Figure 7: Four quantitative indicators for the average quality of satellite image restoration using different models.
Figures 8–12: Restoration performance on "Img5", "Img2", "Img10", "Img16", and "Img20" with visual quality and numerical results (PSNR/SSIM): ground truth, degraded image (motion kernel (20, 60), noise level σ = 25), and restorations by F-ABF [35], K-QSVD [10,11], DVTV [36], BM3D [37], QNLM [12,13], QWNNM [15], and the adaptive QWNNM (ours).
Figure 13: Satellite images degraded by haze, stripes, and blurring in a real environment, enumerated from left to right.
Figures 14–19: Visual quality comparison of original satellite images 1–6 and images restored by F-ABF [35], K-QSVD [10,11], BM3D [37], QNLM [12,13], QWNNM [15], and the adaptive QWNNM (ours).
Figure 20: Restoration performance with visual quality: ground truth, degraded image (motion kernel (20, 60), noise level 25), and restorations by QWNNM [15], S1, S1&S2, S1&S3, and the adaptive QWNNM (S1&S2&S3).
15 pages, 6018 KiB  
Article
Consistency Self-Training Semi-Supervised Method for Road Extraction from Remote Sensing Images
by Xingjian Gu, Supeng Yu, Fen Huang, Shougang Ren and Chengcheng Fan
Remote Sens. 2024, 16(21), 3945; https://doi.org/10.3390/rs16213945 - 23 Oct 2024
Viewed by 1151
Abstract
Road extraction techniques based on remote sensing images have advanced significantly. Currently, fully supervised road segmentation neural networks based on remote sensing images require a large number of densely labeled road samples, limiting their applicability in large-scale scenarios. Consequently, semi-supervised methods that utilize fewer labeled data have gained increasing attention. However, the imbalance between a small quantity of labeled data and a large volume of unlabeled data leads to local detail errors and overall cognitive mistakes in semi-supervised road extraction. To address this challenge, this paper proposes a novel consistency self-training semi-supervised method (CSSnet), which effectively learns from a limited number of labeled data samples and a large amount of unlabeled data. This method integrates self-training semi-supervised segmentation with semi-supervised classification. The semi-supervised segmentation component relies on an enhanced generative adversarial network for semantic segmentation, which significantly reduces local detail errors. The semi-supervised classification component relies on an upgraded mean-teacher network to handle overall cognitive errors. Our method exhibits excellent performance with a modest amount of labeled data. This study was validated on three separate road datasets comprising high-resolution remote sensing satellite images and UAV photographs. Experimental findings showed that our method consistently outperformed state-of-the-art semi-supervised methods and several classic fully supervised methods. Full article
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
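The abstract mentions an upgraded mean-teacher network for the semi-supervised classification component. The sketch below illustrates the generic mean-teacher mechanics in PyTorch: an exponential-moving-average (EMA) teacher and a consistency loss on unlabeled images. It is not the CSSnet implementation; the toy model and hyperparameters are assumptions.

```python
# Generic mean-teacher sketch: EMA teacher update + consistency loss on unlabeled data.
import copy
import torch
import torch.nn.functional as F

@torch.no_grad()
def update_teacher(student, teacher, alpha=0.99):
    """EMA update: teacher <- alpha * teacher + (1 - alpha) * student."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(alpha).add_(s_param, alpha=1.0 - alpha)

def consistency_loss(student, teacher, unlabeled_batch):
    """Mean-squared consistency between student and teacher softmax outputs."""
    with torch.no_grad():
        teacher_prob = torch.softmax(teacher(unlabeled_batch), dim=1)
    student_prob = torch.softmax(student(unlabeled_batch), dim=1)
    return F.mse_loss(student_prob, teacher_prob)

# Toy usage with a tiny segmentation head standing in for the road network.
student = torch.nn.Conv2d(3, 2, kernel_size=1)
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)

x_u = torch.randn(4, 3, 64, 64)          # unlabeled image batch
loss = consistency_loss(student, teacher, x_u)
loss.backward()                           # would be combined with the supervised loss
update_teacher(student, teacher)
```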
Figures:
Figure 1: Overview of the proposed semi-supervised road extraction method, CSSnet. The superscripts u and l for input X and output Y represent unlabeled and labeled data, respectively.
Figure 2: Overview of the main structure of the SegGan component.
Figure 3: Visual comparisons of different methods on the Massachusetts Roads dataset with a 20% labeled rate: (a) DeeplabV2, (b) Unet, (c) D-Linknet, (d) AdvSemiSeg, (e) SII-Net, (f) ST++, (g) s4GAN, (h) our method.
Figure 4: Visual comparisons of different methods on the CHN6-CUG Road dataset with a 20% labeled rate (same methods as Figure 3).
Figure 5: Visual comparison of different methods on the Berlin Road dataset with a 20% labeled rate (same methods as Figure 3).
18 pages, 3781 KiB  
Article
Self-Attention Multiresolution Analysis-Based Informal Settlement Identification Using Remote Sensing Data
by Rizwan Ahmed Ansari and Timothy J. Mulrooney
Remote Sens. 2024, 16(17), 3334; https://doi.org/10.3390/rs16173334 - 8 Sep 2024
Viewed by 1259
Abstract
The global dilemma of informal settlements persists alongside the rapid process of urbanization. Various methods for analyzing remotely sensed images to identify informal settlements using semantic segmentation have been extensively researched, resulting in the development of numerous supervised and unsupervised algorithms. Texture-based analysis is a topic extensively studied in the literature. However, approaches that do not utilize a multiresolution strategy are unable to take advantage of the fact that texture exists at different spatial scales. The capacity to perform online mapping and precise segmentation at a vast scale while accounting for the diverse characteristics of remotely sensed images has significant practical consequences. This research presents a novel approach for identifying informal settlements using multiresolution analysis and self-attention techniques. The technique shows potential for being resilient to the inherent variability in remotely sensed images due to its capacity to extract characteristics at many scales and prioritize areas that contain significant information. Segmented images underwent an accuracy assessment in which a comparative analysis was conducted based on metrics such as mean intersection over union, precision, recall, F-score, and overall accuracy. The proposed method's robustness is demonstrated by comparing it to various state-of-the-art techniques. This comparison is conducted using remotely sensed images that have different spatial resolutions and informal settlement characteristics. The proposed method achieves a higher accuracy of approximately 95%, even when dealing with significantly different image characteristics. Full article
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
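The method rests on multiresolution texture features combined with self-attention. The sketch below shows a simple wavelet-based multiresolution texture descriptor (per-sub-band energies) built with PyWavelets, as a stand-in for the curvelet MRA features used in the paper; the wavelet choice, patch size, and feature definition are illustrative assumptions.

```python
# Wavelet multiresolution texture features (a stand-in for curvelet MRA features).
import numpy as np
import pywt

def wavelet_texture_features(patch, wavelet='db2', levels=3):
    """Return per-sub-band energies: one value for the approximation and each detail band."""
    coeffs = pywt.wavedec2(patch, wavelet=wavelet, level=levels)
    feats = [np.mean(coeffs[0] ** 2)]                 # approximation energy
    for (cH, cV, cD) in coeffs[1:]:                   # detail bands, coarse to fine
        feats.extend([np.mean(cH ** 2), np.mean(cV ** 2), np.mean(cD ** 2)])
    return np.array(feats)

# Toy usage: compare a smooth patch with a high-frequency (texture-rich) patch,
# as a rough proxy for planned vs. informal settlement texture.
rng = np.random.default_rng(0)
smooth = np.ones((64, 64)) * 0.5
textured = rng.random((64, 64))
print('smooth  :', wavelet_texture_features(smooth).round(4))
print('textured:', wavelet_texture_features(textured).round(4))
```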
Figures:
Figure 1: Proposed methodology. Bx MRA represents different multiresolution analysis bands capturing different details; the DeConv box performs de-convolution; the multi-scale feature vector F_MS is fed to the dual-attention module.
Figure 2: Curvelet MRA construction (adapted from [35]).
Figure 3: Dual-attention mechanism (adapted from [36]).
Figure 4: Sample informal settlement areas depicting variability in structures: (a) around tall buildings near the BKC area; (b) along the railway line near the Mankhurd area; (c) inner pockets near Sion; (d) near the main road in the Shivaji Nagar area.
Figure 5: Slum area identification results (IRS-1C) for Dharavi and nearby areas; red circles mark misclassified areas, blue circles irregular boundaries, and green circles proper regular boundary detection. Results are shown for UNet [39], ResUNet [40], FCN-32 [41], ASFNet [44], ResiDualGAN [48], self-attention with wavelet-based MRA, and the proposed self-attention with curvelet-based MRA.
Figure 6: Slum area identification results using the WorldView-2 image covering Cheeta Camp and nearby areas, with the same annotation scheme and methods as Figure 5.
Figure 7: Ablation results: slums identified using (a) the baseline model, (b) the baseline with MRA features, (c) the baseline with self-attention, and (d) the proposed baseline with MRA features and the self-attention mechanism.
21 pages, 16543 KiB  
Article
Bidirectional Feature Fusion and Enhanced Alignment Based Multimodal Semantic Segmentation for Remote Sensing Images
by Qianqian Liu and Xili Wang
Remote Sens. 2024, 16(13), 2289; https://doi.org/10.3390/rs16132289 - 22 Jun 2024
Cited by 2 | Viewed by 2097
Abstract
Image–text multimodal deep semantic segmentation leverages the fusion and alignment of image and text information and provides more prior knowledge for segmentation tasks. It is worth exploring image–text multimodal semantic segmentation for remote sensing images. In this paper, we propose a bidirectional feature fusion and enhanced alignment-based multimodal semantic segmentation model (BEMSeg) for remote sensing images. Specifically, BEMSeg first extracts image and text features with image and text encoders, respectively; the features are then fused and aligned to obtain a complementary multimodal feature representation. Secondly, a bidirectional feature fusion module is proposed, which employs self-attention and cross-attention to adaptively fuse image and text features of different modalities, thus reducing the differences between multimodal features. For multimodal feature alignment, the similarity between the image pixel features and text features is computed to obtain a pixel–text score map. Thirdly, we propose category-based pixel-level contrastive learning on the score map to reduce the differences among pixels of the same category and increase the differences among pixels of different categories, thereby enhancing the alignment effect. Additionally, a positive and negative sample selection strategy based on different images is explored during contrastive learning. Averaging pixel features across the different training images of each category to form positive and negative samples incorporates global pixel information while limiting the number of samples and reducing the computational cost. Finally, the fused image features and aligned pixel–text score map are concatenated and fed into the decoder to predict the segmentation results. Experimental results on the ISPRS Potsdam, Vaihingen, and LoveDA datasets demonstrate that BEMSeg is superior to comparison methods on the Potsdam and Vaihingen datasets, with improvements in mIoU ranging from 0.57% to 5.59% and 0.48% to 6.15%, and compared with Transformer-based methods, BEMSeg also performs competitively on the LoveDA dataset with improvements in mIoU ranging from 0.37% to 7.14%. Full article
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
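The pixel–text score map described above can be thought of as a cosine-similarity map between per-pixel image features and per-category text embeddings. The PyTorch sketch below computes such a score map and concatenates it with the image features, as the abstract describes for the decoder input; the feature dimensions and temperature are assumptions, not BEMSeg's actual values.

```python
# Pixel-text score map as cosine similarity between pixel and category text features.
import torch
import torch.nn.functional as F

def pixel_text_score_map(pixel_feats, text_feats, temperature=0.07):
    """
    pixel_feats: (B, D, H, W) image features from the image encoder.
    text_feats:  (C, D) one embedding per category from the text encoder.
    returns:     (B, C, H, W) pixel-text similarity score map.
    """
    p = F.normalize(pixel_feats, dim=1)
    t = F.normalize(text_feats, dim=1)
    scores = torch.einsum('bdhw,cd->bchw', p, t) / temperature
    return scores

# Toy usage with random features (B=2 images, D=256 channels, C=6 categories; assumed sizes).
pixel_feats = torch.randn(2, 256, 32, 32)
text_feats = torch.randn(6, 256)
score_map = pixel_text_score_map(pixel_feats, text_feats)
fused = torch.cat([pixel_feats, score_map], dim=1)   # candidate input to the decoder
print(score_map.shape, fused.shape)
```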
Figures:
Figure 1: Framework of the proposed BEMSeg, consisting of an image encoder, a text encoder, a multimodal feature fusion and alignment module, and a decoder. BFF and CPC denote the bidirectional feature fusion module and category-based pixel-level contrastive learning, and C denotes the number of categories; the colored squares at the top denote the text features of different categories.
Figure 2: The proposed attention-based bidirectional feature fusion module, whose dual-branch structure and added self-attention mechanism allow image-attended text features to be incorporated into text representations (and vice versa).
Figure 3: Multi-scale image feature fusion network in the decoder of Semantic FPN.
Figure 4: Proportion of pixels per category in the Potsdam and Vaihingen remote sensing datasets.
Figure 5: Qualitative results of comparison methods on test images from the Potsdam dataset.
Figure 6: IoU of each class for comparison methods on the Vaihingen dataset.
Figure 7: mIoU of the BEMSeg model on the Potsdam dataset for different values of the parameters λ1 and λ2.
20 pages, 12264 KiB  
Article
Land Use Recognition by Applying Fuzzy Logic and Object-Based Classification to Very High Resolution Satellite Images
by Dario Perregrini and Vittorio Casella
Remote Sens. 2024, 16(13), 2273; https://doi.org/10.3390/rs16132273 - 21 Jun 2024
Viewed by 856
Abstract
The past decade has seen remarkable advancements in Earth observation satellite technologies, leading to an unprecedented level of detail in satellite imagery, with ground resolutions nearing 30 cm. This progress has significantly broadened the use of satellite imagery across domains that were traditionally reliant on aerial data. Our ultimate goal is to leverage this high-resolution satellite imagery to classify land use types and derive soil permeability maps by attributing permeability values to the different classified soil types. Specifically, we aim to develop an object-based classification algorithm using fuzzy logic techniques to describe the classes relevant to soil permeability by analyzing different test areas and, once a complete method has been developed, to apply it to the entire image of Pavia. In this study area, a logical scheme was developed to classify the field classes, cultivated and uncultivated, and to distinguish them from large industrial buildings, which, owing to their radiometric similarity, can be confused with fields, especially uncultivated ones. Validation of the classification results against ground truth data, produced by an operator manually classifying part of the image, yielded an overall accuracy of 95.32%.
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
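As a rough illustration of the fuzzy-logic, object-based approach described above, the sketch below defines increasing and decreasing membership functions and combines them with a fuzzy AND (minimum) into a per-object class score. The feature names (NDVI, mean brightness) and all thresholds are hypothetical placeholders, not the values or features used in the paper.

```python
# Fuzzy membership functions and a fuzzy AND combination for one object class.
import numpy as np

def mu_increasing(x, a, b):
    """0 below a, 1 above b, linear ramp in between."""
    return np.clip((x - a) / (b - a), 0.0, 1.0)

def mu_decreasing(x, a, b):
    """1 below a, 0 above b, linear ramp in between."""
    return 1.0 - mu_increasing(x, a, b)

def cultivated_field_score(ndvi, brightness):
    # Fuzzy AND (minimum): high NDVI AND moderate-to-low brightness.
    return np.minimum(mu_increasing(ndvi, 0.3, 0.6),
                      mu_decreasing(brightness, 120.0, 180.0))

# Example: a segmented object with mean NDVI 0.55 and mean brightness 110
print(cultivated_field_score(0.55, 110.0))   # ~0.83: likely a cultivated field
```

In an object-based workflow, such scores would be computed per segmented object and the object assigned to the class with the highest membership value.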
Figure 1. Bands captured by the multispectral sensor of the WorldView-3 satellite with their respective wavelength intervals.
Figure 2. Pavia is a city in northern Italy, located south of Milan. The area studied in this work, highlighted in red, lies just outside the city center towards the east.
Figure 3. The studied area is characterized by a modest industrial area surrounded mainly by land for agricultural use, except for a small urban area in the upper right corner of the image and another just below the industrial area.
Figure 4. Parameters used for the segmentation of bigger objects (field segmentation, scale factor = 700).
Figure 5. Comparison of the image before segmentation to detect large objects (a) and after segmentation (b).
Figure 6. Parameters used for the classification of smaller objects (urban segmentation, scale factor = 50).
Figure 7. Comparison between a portion of the image before (a) and after (b) segmentation for the identification of objects in the urban context.
Figure 8. Membership functions used to assign fuzzified values to the different features describing the classes; the increasing function is shown in red (a) and the decreasing function in blue (b).
Figure 9. Logical scheme combining the membership functions applied to the features involved in the identification of fields, with increasing and decreasing membership functions highlighted in red and blue, respectively.
Figure 10. The area before (a) and after (b) the classification of the fields and their further distinction into cultivated and uncultivated.
Figure 11. Logical scheme combining the membership functions applied to the features involved in the identification of water; functions A and B are highlighted in red and blue, respectively.
Figure 12. Comparison of the water classified in the scene before (a) and after (b) refinement.
Figure 13. Comparison of an example field before (a) and after (b) refinement.
Figure 14. Comparison of the classification result for the study area (a) and the ground truth created manually for validation in the same area (b).
Figure 15. Comparison of the area affected by the classification error as shown in the classified image (a) and in the ground truth (b).
Figure 16. Example of an area in which the adopted method gave excellent results: (a) portion of the raw image; (b) ground truth; (c) classification result.
26 pages, 10617 KiB  
Article
Lightweight Super-Resolution Generative Adversarial Network for SAR Images
by Nana Jiang, Wenbo Zhao, Hui Wang, Huiqi Luo, Zezhou Chen and Jubo Zhu
Remote Sens. 2024, 16(10), 1788; https://doi.org/10.3390/rs16101788 - 18 May 2024
Cited by 2 | Viewed by 1687
Abstract
Due to their unique imaging mechanism, Synthetic Aperture Radar (SAR) images typically exhibit degradation phenomena. To enhance image quality and support real-time on-board processing, we propose a lightweight deep generative network framework, the Lightweight Super-Resolution Generative Adversarial Network (LSRGAN). This method introduces Depthwise Separable Convolution (DSConv) in the residual blocks to compress the original Generative Adversarial Network (GAN) and uses the SELU activation function to construct a lightweight residual module (LRM) suited to SAR image characteristics. Furthermore, we combine the LRM with an optimized Coordinate Attention (CA) module, enhancing the lightweight network's capability to learn feature representations. Experimental results on spaceborne SAR images demonstrate that, compared with other deep generative networks for SAR image super-resolution reconstruction, LSRGAN achieves compression ratios of 74.68% in model storage requirements and 55.93% in computational resource demands. In this work, we significantly reduce model complexity, improve the quality of spaceborne SAR images, and validate the effectiveness of the SAR image super-resolution algorithm as well as the feasibility of real-time on-board processing.
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
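The following PyTorch sketch illustrates the core building block named in the abstract: a lightweight residual module built from depthwise separable convolutions with a SELU activation. Channel counts, kernel sizes, and the exact block layout are assumptions for illustration, and the Coordinate Attention module is omitted.

```python
# Sketch of a lightweight residual block using depthwise separable convolution.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        self.depthwise = nn.Conv2d(channels, channels, kernel_size,
                                   padding=pad, groups=channels)   # per-channel spatial conv
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1)  # 1x1 channel mixing

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class LightweightResidualBlock(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            DepthwiseSeparableConv(channels),
            nn.SELU(inplace=True),
            DepthwiseSeparableConv(channels),
        )

    def forward(self, x):
        return x + self.body(x)   # residual connection

x = torch.randn(1, 64, 48, 48)
print(LightweightResidualBlock()(x).shape)   # torch.Size([1, 64, 48, 48])
```

Replacing a standard 3x3 convolution with a depthwise plus pointwise pair is what yields the parameter and FLOP savings reported above, at the cost of slightly reduced channel mixing per layer.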
Graphical abstract
Figure 1. Schematic diagram of the Generator Network structure with the corresponding kernel sizes, numbers of feature maps, and strides.
Figure 2. Schematic diagram of the Discriminator Network structure with the corresponding kernel sizes, numbers of feature maps, and strides.
Figure 3. Convolution structures, including both standard convolution and depthwise separable convolution. The left and right sides of the illustration represent the input and output images, respectively; the input channels are shown in red, green, and blue, and the output image in yellow. The central part depicts the transition from standard convolution to depthwise separable convolution.
Figure 4. Lightweight Residual Module; DWConv and PWConv denote depthwise convolution and pointwise convolution, respectively.
Figure 5. Coordinate Attention module.
Figure 6. Examples from the MSAR-1.0 dataset.
Figure 7. Examples from the SSDD dataset.
Figure 8. Examples of the SAR image datasets used in this study.
Figure 9. Visual qualitative comparison of ships and bridges on the SAR image test set. Locally magnified areas for the different methods are marked with red boxes in the original image. Sea areas occupy more than half of both images, indicating simple scenes. The best results are shown in bold.
Figure 10. Visual qualitative comparison of airplanes and oil tanks on the SAR image test set. The background of the airplane images includes airport facilities with detailed textures, making it a complex scene; locally magnified areas for the different methods are marked with red boxes in the original image. In the oil tank images, sea areas and terrestrial features each occupy about half of the image, so these scenes fall between the simple and complex categories, as the objective evaluation metrics also indicate. The best results are shown in bold.
Figure 11. Visual qualitative comparison of islands and ports on the SAR image test set. Locally magnified areas for the different methods are marked with red boxes in the original image. Sea areas occupy less than half of both images, indicating simple scenes. The best results are shown in bold.
Figure 12. Loss curves for the training of the ablation experiment.
Figure 13. PSNR and SSIM curves for training in the ablation experiment.
28 pages, 11352 KiB  
Article
Pansharpening Low-Altitude Multispectral Images of Potato Plants Using a Generative Adversarial Network
by Sourav Modak, Jonathan Heil and Anthony Stein
Remote Sens. 2024, 16(5), 874; https://doi.org/10.3390/rs16050874 - 1 Mar 2024
Cited by 4 | Viewed by 3129
Abstract
Image preprocessing and fusion are commonly used to enhance remote-sensing images, but the resulting images often lack useful spatial features. Since most research on image fusion has concentrated on the satellite domain, image fusion for Unmanned Aerial Vehicle (UAV) images has received little attention. This study investigated an image-improvement strategy that integrates image preprocessing and fusion for UAV images, with the goal of improving spatial detail while avoiding color distortion in the fused images. Image denoising, sharpening, and Contrast Limited Adaptive Histogram Equalization (CLAHE) were used in the preprocessing step: the unsharp mask algorithm for sharpening, and Wiener and total-variation methods for denoising. The image-fusion process was conducted in two steps: (1) fusing the spectral bands into one multispectral image and (2) pansharpening the panchromatic and multispectral images using the PanColorGAN model. The effectiveness of the proposed approach was evaluated with quantitative and qualitative assessment techniques, including no-reference image quality assessment (NR-IQA) metrics. In this experiment, the unsharp mask algorithm noticeably improved the spatial details of the pansharpened images, whereas no preprocessing algorithm dramatically improved their color quality. Overall, the proposed fusion approach improved the images without introducing blurring or color distortion.
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
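For orientation, here is a rough scikit-image sketch of the preprocessing operations named in the abstract (total-variation denoising, unsharp-mask sharpening, CLAHE) applied to a single spectral band. The parameter values, ordering, and normalization are illustrative assumptions; the study's actual settings, and its Wiener-filter variant, may differ.

```python
# Sketch of a single-band preprocessing chain: denoise, sharpen, equalize.
import numpy as np
from skimage import exposure, filters, restoration

def preprocess_band(band: np.ndarray) -> np.ndarray:
    band = band.astype(np.float64)
    band = (band - band.min()) / (band.max() - band.min() + 1e-12)  # scale to [0, 1]
    band = restoration.denoise_tv_chambolle(band, weight=0.05)      # total-variation denoising
    band = filters.unsharp_mask(band, radius=2, amount=1.0)         # unsharp-mask sharpening
    band = np.clip(band, 0.0, 1.0)                                  # keep values valid for CLAHE
    band = exposure.equalize_adapthist(band, clip_limit=0.02)       # CLAHE
    return band

# Example with a random stand-in for one spectral band
print(preprocess_band(np.random.rand(416, 416)).shape)   # (416, 416)
```

In the workflow described above, each preprocessed band would then be stacked into the multispectral composite before pansharpening.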
Graphical abstract
Figure 1. Sample images from the dataset: (a) RGB; (b) NIR; (c) red; (d) red-edge; (e) green.
Figure 2. Integration of the green, Near-Infrared (NIR), red, and red-edge channels into a composite multispectral image, combining multiple spectral layers for comprehensive data insight using the ArcPy Python module.
Figure 3. PanColorGAN training: the pansharpened image Ŷ_G is generated from the inputs X_GMS and X_MS. The quality of the generation is measured by the reconstruction loss (L1) between the colorized output Ŷ_G and the multispectral input Y_MS; this loss serves as a crucial metric for training the PanColorGAN model. The methodology is adapted from the approach proposed in the PanColorGAN paper [24].
Figure 4. PanColorGAN training architecture: the grayscale X_PAN or X_GMS and the multispectral X_MS are input to the generator G, which produces the pansharpened Ŷ_G or Ŷ_P. In the discriminator network, X_GMS, X_MS, and Y_MS form the genuine batch, while Ŷ_P or Ŷ_G is placed in the fake batch. The architecture is adapted from Ozcelik et al. (2021) [24].
Figure 5. Pansharpening process: Y_PAN (750 × 750 pixels) and Y_MS (416 × 416 pixels) are resized to X_PAN^DOWN and X_MS^UP of 512 × 512 pixels and fed into the generator network. The final enhanced pansharpened image is Ŷ_ps (512 × 512 pixels).
Figure 6. Visual representation of various image types: RGB (1), PAN (2), and spectral bands (green, NIR, red, red-edge) (3–6); multispectral (MS) images (UN MS, MS WF, MS USM, MS TV, MS CL, MS CL WF, MS CL USM, MS CL TV) (7–14); and pansharpened (PS) images (UN PS, PS WF, PS USM, PS TV, PS CL, PS CL WF, PS CL USM, PS CL TV) (15–22).
Figure 7. Visual representation of various image types, with the same categories and numbering as Figure 6.
Figure 8. Visual comparison of (A) RGB, (B) PanColorGAN-pansharpened, and (C) Brovey-transform-based pansharpened images; the red circle in the RGB image highlights white flowers that disappear in both pansharpened images.

Other

15 pages, 2538 KiB  
Technical Note
Multi-Scale Image- and Feature-Level Alignment for Cross-Resolution Person Re-Identification
by Guoqing Zhang, Zhun Wang, Jiangmei Zhang, Zhiyuan Luo and Zhihao Zhao
Remote Sens. 2024, 16(2), 278; https://doi.org/10.3390/rs16020278 - 10 Jan 2024
Viewed by 1453
Abstract
Cross-Resolution Person Re-Identification (re-ID) aims to match images with disparate resolutions arising from variations in camera hardware and shooting distances. Most conventional works use Super-Resolution (SR) models to recover Low-Resolution (LR) images into High-Resolution (HR) images. However, because SR models cannot completely compensate for the information missing from LR images, a large gap remains between HR images recovered from LR images and real HR images. To tackle this challenge, we propose a novel Multi-Scale Image- and Feature-Level Alignment (MSIFLA) framework that aligns images on multiple resolution scales at both the image and the feature level. Specifically, (i) we design a Cascaded Multi-Scale Resolution Reconstruction (CMSR2) module composed of three cascaded Image Reconstruction (IR) networks, which continuously reconstructs multiple variables of different resolution scales from low to high for each image, regardless of its resolution. Because the reconstructed images at a given resolution scale follow a similar distribution, the images are aligned on multiple resolution scales at the image level. (ii) We propose a Multi-Resolution Representation Learning (MR2L) module consisting of three person re-ID networks that encourage the IR models to preserve ID-discriminative information during separate training. Each re-ID network focuses on mining discriminative information from a specific scale without disturbance from other resolutions. By matching the extracted features on three resolution scales, images with different resolutions are also aligned at the feature level. We conduct extensive experiments on multiple public cross-resolution person re-ID datasets to demonstrate the superiority of the proposed method. In addition, the generalization of MSIFLA to cross-resolution retrieval tasks is verified on a UAV vehicle dataset.
(This article belongs to the Special Issue Image Processing from Aerial and Satellite Imagery)
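To illustrate the idea of feature-level alignment across resolution scales, the sketch below resizes each person crop to three fixed scales, extracts an embedding per scale with a separate backbone, and averages the per-scale distances between query and gallery. The backbones, scales, and the plain averaging rule are illustrative assumptions, not the MSIFLA formulation.

```python
# Sketch: per-scale embeddings and an averaged cross-resolution distance.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

def make_backbone():
    net = models.resnet18(weights=None)
    net.fc = nn.Identity()          # use the 512-d pooled feature as the embedding
    return net

scales = [(64, 32), (128, 64), (256, 128)]        # low, mid, high resolution (H, W)
backbones = nn.ModuleList([make_backbone() for _ in scales])

def multi_scale_embeddings(images: torch.Tensor):
    # images: (B, 3, H, W) person crops of arbitrary resolution
    feats = []
    for (h, w), net in zip(scales, backbones):
        x = F.interpolate(images, size=(h, w), mode='bilinear', align_corners=False)
        feats.append(F.normalize(net(x), dim=1))  # (B, 512) embedding per scale
    return feats

def cross_resolution_distance(query, gallery):
    fq, fg = multi_scale_embeddings(query), multi_scale_embeddings(gallery)
    # Average cosine distance over the three resolution scales.
    return sum(1 - q @ g.t() for q, g in zip(fq, fg)) / len(scales)

d = cross_resolution_distance(torch.randn(2, 3, 96, 48), torch.randn(5, 3, 256, 128))
print(d.shape)   # torch.Size([2, 5])
```

Matching on several fixed scales, rather than a single one, is what lets queries and gallery images of very different native resolutions be compared on common ground.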
Figure 1. The challenge of cross-resolution person re-ID. (a) The resolution misalignment problem exists not only between LR query and HR gallery images, (b) but also between LR query images with different resolution scales.
Figure 2. Illustration of image- and feature-level alignment.
Figure 3. The overall architecture of the proposed method.
Figure 4. The architecture of the person re-ID network.
Figure 5. Comparison of vehicle images captured by surveillance cameras and UAVs.
Figure 6. Visualizations of partial retrieval results of the proposed method on the MLR-VRU dataset. The first column shows low-resolution query images; the following five columns display the top five retrieval results from the high-resolution gallery images. Bounding boxes indicate correct (green) or incorrect (red) retrieval results.