Remote Sensing Image Classification Based on Canny Operator Enhanced Edge Features
Figure 1. Example of images from different datasets: NWPU-RESISC45 (a–d); UCM (e–h); MSTAR (i–l).
Figure 2. Illustration of the distinction between (a) previous works and (b) our work.
Figure 3. The overarching framework of CAF. The original image and edge image (with the same depth) are fused using the CAF module, and the multi-layer features maintain their dimensions through upsampling. The resulting fusion features are then fed into the Swin-transformer. The details of CAF and the multi-scale channel attention module (MSCAM) are also presented.
Figure 4. The proposed CAF. Using an attention-based feature fusion approach, the weights $\lambda_X$ and $\lambda_Y$ are computed for integrating the original image and the edge image. Compared with addition and concatenation, this technique enables enhanced focus on crucial regions and features within the image, improving performance and robustness.
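For concreteness, here is a minimal sketch of this kind of attention-based fusion in the spirit of MSCAM/AFF [21]. The module name, reduction ratio, and layer layout are assumptions for illustration, not the paper's released code:

```python
import torch
import torch.nn as nn

class MSCAMFusion(nn.Module):
    """Sketch of attention-based fusion of two same-shape feature maps.

    A fused weight map lam in (0, 1) blends the image features X with the
    edge features Y, i.e., lambda_X = lam and lambda_Y = 1 - lam.
    """

    def __init__(self, channels: int, r: int = 4):
        super().__init__()
        mid = max(channels // r, 1)
        # Local channel context: pointwise convs preserve spatial resolution.
        self.local_att = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
        )
        # Global channel context: squeeze to 1x1 via global average pooling.
        self.global_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=1),
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        s = x + y  # initial integration of the two branches
        lam = self.sigmoid(self.local_att(s) + self.global_att(s))
        return lam * x + (1.0 - lam) * y
```

Usage would look like `fuse = MSCAMFusion(96); z = fuse(image_feats, edge_feats)` for two 96-channel feature maps of identical shape.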
Figure 5. The confusion matrices of UCM, computed using two different methods with a training ratio of 80%.
Figure 6. The confusion matrices of MSTAR, computed using two different methods with a training ratio of 80%.
Figure 7. The assessment metrics of NWPU-RESISC45 under diverse methodologies.
Figure 8. The assessment metrics of MSTAR under diverse methodologies.
Figure 9. The assessment metrics of UCM under diverse methodologies.
Figure 10. The sequence (a–f) shows, for the UCM dataset, the initial image, the heat map produced by the model without the CAF module (overlaid on the original image), and the heat map produced with the CAF module integrated into the model (also overlaid on the original image).
Figure 11. As in Figure 10, but for the NWPU-RESISC45 dataset.
Figure 12. As in Figure 10, but for the MSTAR dataset.
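Overlays like those in Figures 10–12 can be produced generically as follows. This is a sketch assuming a CAM-style activation map `cam` is already available; the paper's exact visualization pipeline is not specified here:

```python
import cv2
import numpy as np

def overlay_heatmap(image_bgr: np.ndarray, cam: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend an HxW activation map onto a uint8 BGR image.

    `cam` is assumed to come from a CAM-style method; `alpha` controls
    the heat map's opacity in the blend.
    """
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
    cam = cv2.resize(cam, (image_bgr.shape[1], image_bgr.shape[0]))
    heat = cv2.applyColorMap(np.uint8(255 * cam), cv2.COLORMAP_JET)
    return cv2.addWeighted(heat, alpha, image_bgr, 1 - alpha, 0)
```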
Abstract
1. Introduction
- (1) The proposed methodology uses a multi-level feature fusion approach to extract diverse features from remote sensing images. Integrating features at multiple levels captures both fine details and broad contextual information, while supporting the gradual extraction of abstract features from low-level to high-level representations. This yields richer, more semantically meaningful features for interpreting and understanding the image.
- (2) At each level, edge information is fused simultaneously to strengthen the representation of detailed features and improve classification results.
- (3) The CAF method, which fuses image features and edge features at each level, achieves better classification performance than the same model without it.
2. Related Work and Motivation
2.1. Related Work
2.2. Motivation
3. Method
3.1. Overall Framework
3.2. Multi-Level Feature Extraction
3.3. Edge Information Enhancement
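As a rough illustration of the Canny edge extraction this section builds on, here is a minimal OpenCV sketch; the file name and thresholds are illustrative assumptions, not the paper's settings:

```python
import cv2

# Hypothetical input image; replace with an actual remote sensing scene.
gray = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)
# Smooth first to suppress noise before gradient computation.
blurred = cv2.GaussianBlur(gray, (5, 5), 1.4)
# Canny with assumed low/high hysteresis thresholds of 100 and 200.
edges = cv2.Canny(blurred, 100, 200)
cv2.imwrite("scene_edges.png", edges)
```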
3.4. Feature Information Fusion Network
3.5. Loss Function
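The "cross-entropy" rows in the Section 4 tables refer to the standard classification baseline. A minimal PyTorch sketch of that baseline follows; COFE-Loss is the paper's own objective and is not sketched here:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
# Toy batch of 8 samples over 45 classes (e.g., NWPU-RESISC45).
logits = torch.randn(8, 45, requires_grad=True)
labels = torch.randint(0, 45, (8,))
loss = criterion(logits, labels)
loss.backward()  # gradients flow back to the classifier
```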
4. Experiments and Results
4.1. Experimental Details
4.2. Performance Evaluation Metrics
4.3. Results
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Cheng, G.; Xie, X.; Han, J.; Guo, L.; Xia, G.S. Remote sensing image scene classification meets deep learning: Challenges, methods, benchmarks, and opportunities. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 3735–3756.
2. Cheng, G.; Han, J.; Lu, X. Remote sensing image scene classification: Benchmark and state of the art. Proc. IEEE 2017, 105, 1865–1883.
3. Thapa, A.; Horanont, T.; Neupane, B.; Aryal, J. Deep learning for remote sensing image scene classification: A review and meta-analysis. Remote Sens. 2023, 15, 4804.
4. Adegun, A.A.; Viriri, S.; Tapamo, J.R. Review of deep learning methods for remote sensing satellite images classification: Experimental survey and comparative analysis. J. Big Data 2023, 10, 93.
5. Li, S.; Song, W.; Fang, L.; Chen, Y.; Ghamisi, P.; Benediktsson, J.A. Deep learning for hyperspectral image classification: An overview. IEEE Trans. Geosci. Remote Sens. 2019, 57, 6690–6709.
6. Ai, J.; Mao, Y.; Luo, Q.; Jia, L.; Xing, M. SAR target classification using the multikernel-size feature fusion-based convolutional neural network. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5214313.
7. Tang, X.; Li, M.; Ma, J.; Zhang, X.; Liu, F.; Jiao, L. EMTCAL: Efficient multiscale transformer and cross-level attention learning for remote sensing scene classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5626915.
8. Wang, Q.; Huang, W.; Xiong, Z.; Li, X. Looking closer at the scene: Multiscale representation learning for remote sensing image scene classification. IEEE Trans. Neural Netw. Learn. Syst. 2020, 33, 1414–1428.
9. Hong, D.; Hu, J.; Yao, J.; Chanussot, J.; Zhu, X.X. Multimodal remote sensing benchmark datasets for land cover classification with a shared and specific feature learning model. ISPRS J. Photogramm. Remote Sens. 2021, 178, 68–80.
10. Wu, X.; Hong, D.; Chanussot, J. Convolutional neural networks for multimodal remote sensing data classification. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5517010.
11. Bai, L.; Liu, Q.; Li, C.; Ye, Z.; Hui, M.; Jia, X. Remote sensing image scene classification using multiscale feature fusion covariance network with octave convolution. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5620214.
12. Yang, J.; Wu, C.; Du, B.; Zhang, L. Enhanced multiscale feature fusion network for HSI classification. IEEE Trans. Geosci. Remote Sens. 2021, 59, 10328–10347.
13. Li, J.; Lin, D.; Wang, Y.; Xu, G.; Zhang, Y.; Ding, C.; Zhou, Y. Deep discriminative representation learning with attention map for scene classification. Remote Sens. 2020, 12, 1366.
14. He, C.; He, B.; Yin, X.; Wang, W.; Liao, M. Relationship prior and adaptive knowledge mimic based compressed deep network for aerial scene classification. IEEE Access 2019, 7, 137080–137089.
15. He, C.; Li, S.; Xiong, D.; Fang, P.; Liao, M. Remote sensing image semantic segmentation based on edge information guidance. Remote Sens. 2020, 12, 1501.
16. Xu, Z.; Zhang, W.; Zhang, T.; Yang, Z.; Li, J. Efficient transformer for remote sensing image segmentation. Remote Sens. 2021, 13, 3585.
17. Wang, H.; Li, X.; Zhou, G.; Chen, W.; Wang, L. Edge enhanced channel attention-based graph convolution network for scene classification of complex landscapes. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 3831–3849.
18. Hao, S.; Wu, B.; Zhao, K.; Ye, Y.; Wang, W. Two-stream swin transformer with differentiable sobel operator for remote sensing image classification. Remote Sens. 2022, 14, 1507.
19. Zhang, T.; Zhang, X.; Ke, X.; Liu, C.; Xu, X.; Zhan, X.; Wang, C.; Ahmad, I.; Zhou, Y.; Pan, D.; et al. HOG-ShipCLSNet: A novel deep learning network with HOG feature fusion for SAR ship classification. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5210322.
20. Xu, C.; Zhu, G.; Shu, J. A combination of lie group machine learning and deep learning for remote sensing scene classification using multi-layer heterogeneous feature extraction and fusion. Remote Sens. 2022, 14, 1445.
21. Dai, Y.; Gieseke, F.; Oehmcke, S.; Wu, Y.; Barnard, K. Attentional feature fusion. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2021; pp. 3560–3569.
22. Oliva, A. Gist of the scene. In Neurobiology of Attention; Elsevier: Amsterdam, The Netherlands, 2005; pp. 251–256.
23. Zhang, X.; Cui, J.; Wang, W.; Lin, C. A study for texture feature extraction of high-resolution satellite images based on a direction measure and gray level co-occurrence matrix fusion algorithm. Sensors 2017, 17, 1474.
24. Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110.
25. Zhu, Q.; Zhong, Y.; Zhao, B.; Xia, G.S.; Zhang, L. Bag-of-visual-words scene classifier with local and global features for high spatial resolution remote sensing imagery. IEEE Geosci. Remote Sens. Lett. 2016, 13, 747–751.
26. Zhao, F.; Sun, H.; Liu, S.; Zhou, S. Combining low level features and visual attributes for VHR remote sensing image classification. In Proceedings of the MIPPR 2015: Remote Sensing Image Processing, Geographic Information Systems, and Other Applications, Enshi, China, 31 October–1 November 2015; SPIE: Bellingham, WA, USA, 2015; Volume 9815, pp. 74–81.
27. Khan, S.D.; Basalamah, S. Multi-branch deep learning framework for land scene classification in satellite imagery. Remote Sens. 2023, 15, 3408.
28. Wu, H.; Zhou, H.; Wang, A.; Iwahori, Y. Precise crop classification of hyperspectral images using multi-branch feature fusion and dilation-based MLP. Remote Sens. 2022, 14, 2713.
29. Shi, C.; Wang, T.; Wang, L. Branch feature fusion convolution network for remote sensing scene classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5194–5210.
30. Shi, C.; Zhang, X.; Sun, J.; Wang, L. Remote sensing scene image classification based on dense fusion of multi-level features. Remote Sens. 2021, 13, 4379.
31. Cheng, G.; Si, Y.; Hong, H.; Yao, X.; Guo, L. Cross-scale feature fusion for object detection in optical remote sensing images. IEEE Geosci. Remote Sens. Lett. 2020, 18, 431–435.
32. Jiang, N.; Shi, H.; Geng, J. Multi-scale graph-based feature fusion for few-shot remote sensing image scene classification. Remote Sens. 2022, 14, 5550.
33. Shi, C.; Zhao, X.; Wang, L. A multi-branch feature fusion strategy based on an attention mechanism for remote sensing image scene classification. Remote Sens. 2021, 13, 1950.
34. Sun, X.; Wang, B.; Wang, Z.; Li, H.; Li, H.; Fu, K. Research progress on few-shot learning for remote sensing image interpretation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 2387–2402.
35. Jiang, J.; Ma, J.; Wang, Z.; Chen, C.; Liu, X. Hyperspectral image classification in the presence of noisy labels. IEEE Trans. Geosci. Remote Sens. 2018, 57, 851–865.
36. Barz, B.; Denzler, J. Deep learning on small datasets without pre-training using cosine loss. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Snowmass, CO, USA, 1–5 March 2020; pp. 1371–1380.
37. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
38. Ranasinghe, K.; Naseer, M.; Hayat, M.; Khan, S.; Khan, F.S. Orthogonal projection loss. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 12333–12343.
39. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
40. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 10012–10022.
41. Wang, G.; Zhang, N.; Liu, W.; Chen, H.; Xie, Y. MFST: A multi-level fusion network for remote sensing scene classification. IEEE Geosci. Remote Sens. Lett. 2022, 19, 6516005.
42. Yang, Y.; Newsam, S. Bag-of-visual-words and spatial extensions for land-use classification. In Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA, 2–5 November 2010; pp. 270–279.
43. Ross, T.D.; Worrell, S.W.; Velten, V.J.; Mossing, J.C.; Bryant, M.L. Standard SAR ATR evaluation experiments using the MSTAR public release data set. In Proceedings of the Algorithms for Synthetic Aperture Radar Imagery V, SPIE, Orlando, FL, USA, 13–17 April 1998; Volume 3370, pp. 566–573.
44. Brigato, L.; Barz, B.; Iocchi, L.; Denzler, J. Image classification with small datasets: Overview and benchmark. IEEE Access 2022, 10, 49233–49250.
45. Huang, L.; Wang, F.; Zhang, Y.; Xu, Q. Fine-grained ship classification by combining CNN and swin transformer. Remote Sens. 2022, 14, 3087.
46. Zhang, J.; Zhao, H.; Li, J. TRS: Transformers for remote sensing scene classification. Remote Sens. 2021, 13, 4143.
47. Chen, P.; Zhou, H.; Li, Y.; Liu, B.; Liu, P. Shape similarity intersection-over-union loss hybrid model for detection of synthetic aperture radar small ship objects in complex scenes. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 9518–9529.
48. Li, B.; Guo, Y.; Yang, J.; Wang, L.; Wang, Y.; An, W. Gated recurrent multiattention network for VHR remote sensing image classification. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5606113.
49. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
50. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
51. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
| Dataset | Remote Sensing Imaging Type | Number of Classes | Images per Class | Number of Instances | Image Size | Pixel Resolution | Year |
|---|---|---|---|---|---|---|---|
| NWPU-RESISC45 | Very High-Resolution | 45 | 700 | 31,500 | 256 × 256 | 0.2–30 m | 2017 |
| UCM | Very High-Resolution | 21 | 100 | 2100 | 256 × 256 | 0.3 m | 2010 |
| MSTAR | Synthetic Aperture Radar | 8 | 428–573 | 5172 | 368 × 368 | 0.3 m | 1996 |
| Method | NWPU-RESISC45 | UCM | MSTAR |
|---|---|---|---|
| cross-entropy | 90.63% | 95.71% | 93.97% |
| COFE-Loss | 91.20% | 96.67% | 94.13% |
| Method | NWPU-RESISC45 | UCM | MSTAR |
|---|---|---|---|
| cross-entropy | 90.63% | 95.71% | 93.97% |
| cross-entropy + CAF | 92.39% | 96.24% | 99.26% |
| cross-entropy + 2 × CAF | 92.98% | 97.24% | 99.15% |
| Method | NWPU-RESISC45 | UCM | MSTAR |
|---|---|---|---|
| COFE-Loss | 91.20% | 96.67% | 94.13% |
| COFE-Loss + CAF | 92.65% | 96.24% | 99.63% |
| COFE-Loss + 2 × CAF | 94.12% | 97.49% | 97.28% |
| Dataset | Method | Accuracy (%) | Precision (%) | Recall (%) | F1 (%) | Kappa (%) |
|---|---|---|---|---|---|---|
| NWPU | cross-entropy | 90.63 | 90.96 | 90.63 | 90.67 | 90.42 |
| NWPU | COFE-Loss | 91.20 | 91.59 | 91.20 | 91.25 | 91.00 |
| NWPU | cross-entropy + CAF | 92.39 | 92.69 | 92.39 | 92.41 | 92.22 |
| NWPU | COFE-Loss + CAF | 92.65 | 92.80 | 92.65 | 92.63 | 92.48 |
| NWPU | cross-entropy + 2 × CAF | 92.98 | 93.23 | 92.98 | 92.96 | 92.82 |
| NWPU | COFE-Loss + 2 × CAF | 94.12 | 94.30 | 94.12 | 94.13 | 93.98 |
| MSTAR | cross-entropy | 93.97 | 94.09 | 94.07 | 94.07 | 93.06 |
| MSTAR | COFE-Loss | 94.13 | 94.67 | 93.50 | 93.98 | 93.24 |
| MSTAR | cross-entropy + CAF | 99.26 | 98.97 | 99.33 | 99.14 | 99.15 |
| MSTAR | COFE-Loss + CAF | 99.63 | 99.44 | 99.41 | 99.42 | 99.27 |
| MSTAR | cross-entropy + 2 × CAF | 99.15 | 99.22 | 99.26 | 99.24 | 99.03 |
| MSTAR | COFE-Loss + 2 × CAF | 97.28 | 97.56 | 97.71 | 97.57 | 97.44 |
| UCM | cross-entropy | 95.71 | 95.86 | 95.71 | 95.70 | 95.50 |
| UCM | COFE-Loss | 96.67 | 97.08 | 96.67 | 96.70 | 96.50 |
| UCM | cross-entropy + CAF | 96.24 | 96.37 | 96.24 | 96.17 | 96.05 |
| UCM | COFE-Loss + CAF | 96.24 | 96.52 | 96.24 | 96.21 | 96.05 |
| UCM | cross-entropy + 2 × CAF | 97.24 | 97.32 | 97.24 | 97.22 | 97.11 |
| UCM | COFE-Loss + 2 × CAF | 97.49 | 97.60 | 97.49 | 97.48 | 97.37 |
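The five metrics reported above can be computed from predicted and true labels as sketched below with scikit-learn; the 'weighted' averaging mode for the multi-class precision/recall/F1 is an assumed choice, not one stated by the paper:

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, cohen_kappa_score)

def evaluate(y_true, y_pred):
    """Return the Accuracy/Precision/Recall/F1/Kappa metrics as fractions."""
    return {
        "Accuracy": accuracy_score(y_true, y_pred),
        "Precision": precision_score(y_true, y_pred, average="weighted"),
        "Recall": recall_score(y_true, y_pred, average="weighted"),
        "F1": f1_score(y_true, y_pred, average="weighted"),
        "Kappa": cohen_kappa_score(y_true, y_pred),
    }

# Toy example with 3 classes; multiply by 100 for the table's percentages.
print(evaluate([0, 1, 2, 2, 1], [0, 1, 2, 1, 1]))
```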
| Method | NWPU-RESISC45 | UCM | MSTAR |
|---|---|---|---|
| ResNet-50 [49] | 94.96% | 97.35% | 53.80% |
| DenseNet-121 [50] | 94.90% | 96.82% | 64.56% |
| VGG-11 [51] | 93.56% | 79.36% | 58.92% |
| EMTCAL [7] | 92.31% | 96.25% | 99.41% |
| Ours | 95.04% | 97.49% | 99.63% |
| Method | NWPU-RESISC45 | UCM | MSTAR |
|---|---|---|---|
| Swin-transformer | 90.63% | 95.71% | 93.97% |
| add + Swin-transformer | 90.28% | 96.24% | 98.15% |
| CAF + Swin-transformer | 92.39% | 96.24% | 99.26% |
| 2 × CAF + Swin-transformer | 92.98% | 97.24% | 99.15% |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).