Frequency-Oriented Transformer for Remote Sensing Image Dehazing
Figure 1: Swapping the amplitude spectrum of a hazy image with that of its paired clean image makes the hazy image clear and the clean image blurred (a NumPy sketch of this experiment follows this figure list).
Figure 2: The architecture of the proposed frequency-oriented transformer for remote sensing image dehazing, which takes hazy remote sensing images as input and generates dehazed images as output. It mainly contains the frequency-prompt attention evaluator (FPAE), the content reconstruction feed-forward network (CRFN), and the spatial-frequency aggregation block (SFAB).
Figure 3: Visual comparison of results on the thin haze subset of SateHaze1k (best viewed zoomed in on a high-resolution display).
Figure 4: Visual comparison of results on the moderate haze subset of SateHaze1k (best viewed zoomed in on a high-resolution display).
Figure 5: Visual comparison of results on the thick haze subset of SateHaze1k (best viewed zoomed in on a high-resolution display).
Figure 6: Visual comparison of results on the RICE dataset (best viewed zoomed in on a high-resolution display).
Figure 7: Visual comparison of results on the RRSD300 dataset (best viewed zoomed in on a high-resolution display).
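The amplitude-swap observation in Figure 1 is easy to reproduce. The following NumPy sketch (our illustration, not the authors' code) exchanges the Fourier amplitude spectra of a paired hazy/clean image while keeping each image's own phase; it assumes single-channel float images in [0, 1] and should be applied per channel for RGB:

```python
import numpy as np

def swap_amplitude(hazy: np.ndarray, clean: np.ndarray):
    """Exchange the Fourier amplitude spectra of a hazy/clean pair while
    keeping each image's own phase. Inputs: 2D float arrays in [0, 1]."""
    f_hazy, f_clean = np.fft.fft2(hazy), np.fft.fft2(clean)
    amp_hazy, amp_clean = np.abs(f_hazy), np.abs(f_clean)
    pha_hazy, pha_clean = np.angle(f_hazy), np.angle(f_clean)
    # Clean amplitude + hazy phase: the hazy image becomes noticeably clearer.
    dehazed_like = np.fft.ifft2(amp_clean * np.exp(1j * pha_hazy)).real
    # Hazy amplitude + clean phase: the clean image becomes blurred/hazy.
    hazed_like = np.fft.ifft2(amp_hazy * np.exp(1j * pha_clean)).real
    return dehazed_like.clip(0, 1), hazed_like.clip(0, 1)
```

The observation that haze degradation concentrates largely in the amplitude spectrum motivates operating in the frequency domain throughout the network.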
Abstract
1. Introduction
- We developed a frequency-prompt attention evaluator (FPAE) that learns fine details and comprehensive features in the frequency domain. Rather than aggregating relevant features in the spatial domain, the evaluator does so in the frequency domain, thereby improving image restoration performance (a conceptual sketch follows this list).
- We propose a content reconstruction feed-forward network (CRFN) that captures information across different feature scales, integrating global frequency-domain information with local multi-scale spatial information in Fourier space to reconstruct global content under the guidance of the amplitude spectrum.
- We designed a spatial-frequency aggregation block (SFAB) to exchange and fuse frequency-domain and spatial-domain features between the encoder and decoder, promoting feature propagation from the encoder stream to the decoder and alleviating information loss in the network.
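Before the detailed descriptions in Section 3, the following sketch illustrates the general idea behind the first contribution: aggregating features in the Fourier domain, modulated by a learnable frequency prompt, instead of computing spatial attention. This is a simplified conceptual stand-in written under our own assumptions, not the actual FPAE:

```python
import torch
import torch.nn as nn

class FrequencyMixing(nn.Module):
    """Toy frequency-domain aggregation: project features, move them to the
    Fourier domain, modulate the spectrum with a learnable per-channel
    "prompt", and transform back. The prompt parameterization is an
    assumption for illustration, not the paper's design."""

    def __init__(self, channels: int):
        super().__init__()
        self.proj_in = nn.Conv2d(channels, channels, kernel_size=1)
        self.proj_out = nn.Conv2d(channels, channels, kernel_size=1)
        self.prompt_real = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.prompt_imag = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.proj_in(x)
        freq = torch.fft.rfft2(y, norm="ortho")   # (B, C, H, W//2+1), complex
        prompt = torch.complex(self.prompt_real, self.prompt_imag)
        freq = freq * prompt                      # aggregate in the frequency domain
        y = torch.fft.irfft2(freq, s=y.shape[-2:], norm="ortho")
        return x + self.proj_out(y)               # residual connection
```

Because an element-wise product in the Fourier domain corresponds to a global (circular) convolution in the spatial domain, such a block mixes information across the entire image at the cost of FFTs rather than quadratic spatial attention.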
2. Related Work
2.1. Image Dehazing
2.2. Visual Prompt Learning
3. Proposed Methods
3.1. Overview
3.2. Frequency-Prompt Attention Evaluator
3.3. Content Reconstruction Feed-Forward Network
3.4. Spatial-Frequency Aggregation Block
3.5. Loss Functions
4. Experimental Results
4.1. Datasets
- (1) SateHaze1k: SateHaze1k [1] is a commonly employed synthetic dataset consisting of three subsets: thin haze, moderate haze, and thick haze. Each subset contains 400 pairs of synthetic remote sensing RGB hazy images at a resolution of 512 × 512; 320 pairs were adopted for training and 80 pairs for testing.
- (2) RICE: RICE [34] was built from Google Earth imagery for remote sensing cloud removal tasks. The dataset consists of 500 paired remote sensing RGB hazy images, of which 425 pairs were used for training and 75 pairs for testing. As with SateHaze1k, the images in RICE have a resolution of 512 × 512.
- (3) RRSD300: To further demonstrate the universality of the proposed method, we conducted experiments on the real-world dataset RRSD300 [42]. RRSD300 contains 300 real remote sensing hazy images without paired clear images. These images were captured by real-world remote sensing platforms and include dense and non-uniform haze scenes.
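For reference, a minimal paired-data loader for these benchmarks might look as follows; the directory layout (`root/split/hazy`, `root/split/clean`) is our assumption and must be adapted to the actual downloads:

```python
import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class PairedHazeDataset(Dataset):
    """Paired hazy/clean remote sensing images, e.g., one SateHaze1k subset.
    The directory names here are hypothetical placeholders."""

    def __init__(self, root: str, split: str = "train"):
        self.hazy_dir = os.path.join(root, split, "hazy")
        self.clean_dir = os.path.join(root, split, "clean")
        self.names = sorted(os.listdir(self.hazy_dir))
        self.to_tensor = transforms.ToTensor()

    def __len__(self) -> int:
        return len(self.names)

    def __getitem__(self, idx: int):
        name = self.names[idx]
        hazy = self.to_tensor(Image.open(os.path.join(self.hazy_dir, name)).convert("RGB"))
        clean = self.to_tensor(Image.open(os.path.join(self.clean_dir, name)).convert("RGB"))
        return hazy, clean
```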
4.2. Compared Methods
4.3. Implementation Details
4.4. Evaluation Metrics
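We report PSNR [44], SSIM [45], and LPIPS [46]; higher PSNR/SSIM and lower LPIPS indicate better restoration, matching the arrows in the tables below. A minimal evaluation sketch using scikit-image and the `lpips` package (one possible implementation, not necessarily the one used in our experiments):

```python
import lpips  # pip install lpips
import torch
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

lpips_fn = lpips.LPIPS(net="alex")  # perceptual metric from [46]

def evaluate_pair(pred, gt):
    """pred/gt: float arrays in [0, 1], shape (H, W, 3)."""
    psnr = peak_signal_noise_ratio(gt, pred, data_range=1.0)
    ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=1.0)
    # LPIPS expects NCHW tensors scaled to [-1, 1].
    t = lambda a: torch.from_numpy(a).permute(2, 0, 1).unsqueeze(0).float() * 2 - 1
    lp = lpips_fn(t(pred), t(gt)).item()
    return psnr, ssim, lp
```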
4.5. Main Results
- Synthetic datasets: Table 1 and Table 2 provide a comprehensive comparison between our proposed method and 13 representative and competitive dehazing methods. Capturing the latent frequency information significantly improved performance in terms of PSNR and SSIM compared with all the other baselines. Notably, our approach achieved the most appealing results on the thin haze benchmark of SateHaze1k, surpassing the recent CNN-based method SCANet by 3.57 dB in PSNR; compared with the recent Transformer-based methods AIDNet and RSDformer, the proposed method attained gains of 1.53 dB and 1.11 dB in PSNR, respectively. This improvement over existing remote sensing dehazing methods shows that learning frequency information in the frequency domain facilitates high-quality image dehazing. In addition, Figure 3, Figure 4 and Figure 5 show qualitative comparisons with other dehazing methods on the SateHaze1k dataset. As expected, SCANet, UFormer, and AIDNet failed to fully remove dense haze, showing contrast decline and color distortion. Recent representative image restoration methods, such as RSDformer and FFA-Net, obtained higher-quality images but still produced residual artifacts after haze removal and could not fully restore color and edge details. Compared with these competitive methods, the proposed FOTformer preserved more details and achieved excellent perceptual quality.
Table 1. Quantitative comparison on the thin, moderate, and thick haze subsets of SateHaze1k [22]. ↑: higher is better; ↓: lower is better.

| Category | Method | Thin PSNR (dB)↑ | Thin SSIM↑ | Thin LPIPS↓ | Moderate PSNR (dB)↑ | Moderate SSIM↑ | Moderate LPIPS↓ | Thick PSNR (dB)↑ | Thick SSIM↑ | Thick LPIPS↓ |
|---|---|---|---|---|---|---|---|---|---|---|
| Prior-based | DCP [2] | 13.259 | 0.7445 | 0.2423 | 9.64 | 0.6141 | 0.3938 | 10.45 | 0.6110 | 0.3786 |
| CNN-based | DehazeNet [9] | 16.57 | 0.4887 | 0.7493 | 16.93 | 0.2992 | 0.8673 | 15.44 | 0.3689 | 0.8138 |
| CNN-based | AODNet [29] | 18.74 | 0.8584 | 0.0991 | 17.69 | 0.7969 | 0.3143 | 13.41 | 0.6523 | 0.4746 |
| CNN-based | PFFNet [30] | 18.02 | 0.6689 | 0.5881 | 18.06 | 0.5487 | 0.5265 | 15.06 | 0.3369 | 0.7877 |
| CNN-based | FFA-Net [31] | 24.26 | 0.9102 | 0.0681 | 25.39 | 0.9302 | 0.0852 | 21.83 | 0.8361 | 0.1616 |
| CNN-based | FCTF [34] | 19.54 | 0.8528 | 0.1348 | 18.41 | 0.7314 | 0.2875 | 17.11 | 0.7205 | 0.5835 |
| CNN-based | MSBDN [43] | 21.76 | 0.8812 | 0.0873 | 23.59 | 0.8877 | 0.1034 | 20.21 | 0.7959 | 0.2254 |
| CNN-based | LD-Net [32] | 20.24 | 0.8739 | 0.0844 | 19.40 | 0.7370 | 0.2616 | 18.62 | 0.7803 | 0.1862 |
| CNN-based | SCANet [7] | 21.75 | 0.8587 | 0.1210 | 21.39 | 0.7290 | 0.4166 | 19.32 | 0.8007 | 0.1914 |
| Transformer-based | DehazeFormer [13] | 23.25 | 0.8996 | 0.0654 | 25.38 | 0.9282 | 0.0738 | 22.60 | 0.8366 | 0.1579 |
| Transformer-based | UFormer [33] | 21.68 | 0.8885 | 0.0745 | 21.14 | 0.8321 | 0.1399 | 19.88 | 0.8062 | 0.1901 |
| Transformer-based | AIDNet [22] | 23.79 | 0.8942 | 0.0603 | 25.15 | 0.9032 | 0.0414 | 20.60 | 0.8149 | 0.1281 |
| Transformer-based | RSDformer [21] | 24.21 | 0.9118 | 0.0677 | 26.24 | 0.9341 | 0.0657 | 23.01 | 0.8528 | 0.1576 |
| Transformer-based | Ours | 25.34 | 0.9170 | 0.0517 | 26.32 | 0.9419 | 0.0608 | 23.24 | 0.8503 | 0.1157 |
Table 2. Quantitative comparison on the RICE dataset [34]. "-" indicates results are unavailable.

| Category | Method | PSNR (dB)↑ | SSIM↑ | LPIPS↓ |
|---|---|---|---|---|
| Prior-based | DCP [2] | 17.48 | 0.7841 | 0.1794 |
| CNN-based | DehazeNet [9] | - | - | - |
| CNN-based | AODNet [29] | 23.77 | 0.8731 | 0.1469 |
| CNN-based | PFFNet [30] | 25.64 | 0.8977 | 0.1975 |
| CNN-based | FFA-Net [31] | 28.54 | 0.9396 | 0.0755 |
| CNN-based | FCTF [34] | 16.57 | 0.8847 | 0.1567 |
| CNN-based | MSBDN [43] | 30.37 | 0.8584 | 0.0991 |
| CNN-based | LD-Net [32] | 28.88 | 0.9336 | 0.0897 |
| CNN-based | SCANet [7] | 30.84 | 0.9433 | 0.0689 |
| Transformer-based | DehazeFormer [13] | 30.91 | 0.9350 | 0.0721 |
| Transformer-based | UFormer [33] | 32.13 | 0.9413 | 0.0590 |
| Transformer-based | AIDNet [22] | - | - | - |
| Transformer-based | RSDformer [21] | 33.01 | 0.9525 | 0.0675 |
| Transformer-based | Ours | 33.39 | 0.9537 | 0.0606 |
- Real-world datasets: To further evaluate qualitative performance, we conducted additional experiments on the RRSD300 benchmark. The visual results are reported in Figure 7. Most models struggled with the wide-area, non-uniform distribution of real-world haze, leaving noticeable residual haze in their outputs. In contrast, our model achieved impressive remote sensing dehazing results compared with the other models, effectively eliminating a significant portion of the haze disturbance and producing visually pleasing restorations. This indicates that in real remote sensing dehazing scenarios, our network delivers clearer content and enhanced perceptual quality.
4.6. Model Efficiency
4.7. Ablation Studies
- Effectiveness of different components: In this section, we discuss the effectiveness of the components in FOTformer. Table 4 shows the quantitative results on the thin haze dataset. The different models are described as follows: (Baseline) the same framework and settings as FOTformer, with MDTA [47] and FFN [13] as the basic components; (a) MDTA was replaced with the proposed FPAE; (b) MDTA was kept and FFN was replaced with CRFN; (c) MDTA and FFN were replaced with FPAE and CRFN, respectively; (d) MDTA was replaced with FPAE and SFAB was added; (e) FFN was replaced with CRFN and SFAB was added; (f) the proposed FOTformer. Relative to the baseline, FPAE and CRFN provided PSNR gains of 1.62 dB and 1.14 dB, respectively. This indicates that FPAE and CRFN are indispensable components of our model, capable of effectively learning the haze-related degradation information in images, thereby eliminating haze disturbances and reconstructing clear images. Relative to model (b), model (e) achieved a performance improvement of 0.39 dB by adding SFAB, indicating that combining frequency and spatial information effectively enhances the feature modeling ability of the model. Model (f), i.e., the proposed FOTformer, performed best overall, obtaining the highest PSNR and SSIM and a low LPIPS value. This indicates that the collaboration of the three designs achieves the best remote sensing image dehazing effect.
- Effectiveness of different loss functions: To further reduce the difference between the dehazed images and the ground-truth images, we explored the impact of different loss functions on the proposed network. Table 5 reports the experimental results. Combining multiple loss functions clearly outperforms any single loss function (a sketch of such a weighted combination follows this list).
- Effectiveness of the designs of the FPAE and CRFN: To enrich the contextual features, we introduced learnable prompt blocks (LPBs). We conducted experiments on FOTformer with and without LPBs to verify the effectiveness of this design. The experimental results are shown in Table 6: FOTformer with LPBs achieved higher PSNR values, demonstrating that LPBs help the model remove haze disturbances from the image. In addition, to verify the motivation presented in this article, we removed the frequency-domain modeling part of CRFN (FCR in Table 6), leaving only the spatial multi-scale feature capture part. As shown in Table 6, this degraded performance by 0.55 dB, which means that decomposing the image into phase and amplitude in the frequency domain and separately learning the latent frequency representation facilitates clear image reconstruction.
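The individual loss terms are detailed in Section 3.5. As an illustration of how a weighted loss combination is typically implemented in frequency-oriented restoration, the sketch below combines a spatial L1 term with a Fourier-amplitude term; the specific terms and the weight are our assumptions, not necessarily the exact losses used here:

```python
import torch
import torch.nn.functional as F

def combined_loss(pred: torch.Tensor, gt: torch.Tensor, lam: float = 0.1) -> torch.Tensor:
    """Weighted sum of a spatial L1 term and a Fourier-amplitude L1 term.
    The choice of terms and the weight `lam` are illustrative assumptions."""
    spatial = F.l1_loss(pred, gt)
    freq = F.l1_loss(torch.abs(torch.fft.rfft2(pred)),
                     torch.abs(torch.fft.rfft2(gt)))
    return spatial + lam * freq
```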
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Huang, B.; Zhi, L.; Yang, C.; Sun, F.; Song, Y. Single satellite optical imagery dehazing using SAR image prior based on conditional generative adversarial networks. In Proceedings of the Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, CO, USA, 1–5 March 2020; pp. 1806–1813. [Google Scholar]
- He, K.; Sun, J.; Tang, X. Single Image Haze Removal Using Dark Channel Prior. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 2341–2353. [Google Scholar]
- Liu, C.; Jianbo, H.U.; Lin, Y.U.; Shihong, W.U.; Huang, W. Haze detection, perfection and removal for high spatial resolution satellite imagery. Int. J. Remote Sens. 2011, 32, 8685–8697. [Google Scholar] [CrossRef]
- Liu, X.; Ma, Y.; Shi, Z.; Chen, J. GridDehazeNet: Attention-Based Multi-Scale Network for Image Dehazing. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019. [Google Scholar]
- Berman, D.; Avidan, S. Non-local image dehazing. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1674–1682. [Google Scholar]
- Chen, X.; Li, Y.; Dai, L.; Kong, C. Hybrid high-resolution learning for single remote sensing satellite image Dehazing. IEEE Geosci. Remote Sens. Lett. 2021, 19, 6002805. [Google Scholar] [CrossRef]
- Guo, Y.; Gao, Y.; Liu, W.; Lu, Y.; Qu, J.; He, S.; Ren, W. SCANet: Self-Paced Semi-Curricular Attention Network for Non-Homogeneous Image Dehazing. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 1884–1893. [Google Scholar]
- Wu, H.; Qu, Y.; Lin, S.; Zhou, J.; Qiao, R.; Zhang, Z.; Xie, Y.; Ma, L. Contrastive learning for compact single image dehazing. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 19–25 June 2021; pp. 10551–10560. [Google Scholar]
- Cai, B.; Xu, X.; Jia, K.; Qing, C.; Tao, D. Dehazenet: An end-to-end system for single image haze removal. IEEE Trans. Image Process. 2016, 25, 5187–5198. [Google Scholar] [CrossRef]
- Chen, Z.; Li, Q.; Feng, H.; Xu, Z.; Chen, Y. Nonuniformly dehaze network for visible remote sensing images. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 447–456. [Google Scholar]
- Wang, G.; Yu, X. MSFFDN: Multi Scale Feature Fusion Dehazing Network with Dense Connection. In Proceedings of the Asian Conference on Computer Vision (ACCV), Macao, China, 4–8 December 2022; pp. 444–459. [Google Scholar]
- Chen, X.; Huang, Y. Memory-Oriented Unpaired Learning for Single Remote Sensing Image Dehazing. IEEE Geosci. Remote Sens. Lett. 2022, 19, 3511705. [Google Scholar] [CrossRef]
- Song, Y.; He, Z.; Qian, H. Vision Transformers for Single Image Dehazing. IEEE Trans. Image Process. 2023, 32, 1927–1941. [Google Scholar] [CrossRef]
- He, X.; Li, J.; Jia, T. Learning hybrid dynamic transformers for underwater image super-resolution. Front. Mar. Sci. 2024, 11, 1389553. [Google Scholar] [CrossRef]
- Song, T.; Li, P.; Fan, S.; Jin, J.; Jin, G.; Fan, L. Exploring a context-gated network for effective image deraining. J. Vis. Commun. Image Represent. 2024, 98, 104060. [Google Scholar] [CrossRef]
- Song, T.; Li, P.; Jin, G.; Jin, J.; Fan, S.; Chen, X. Image Deraining transformer with sparsity and frequency guidance. In Proceedings of the International Conference on Multimedia and Expo (ICME), Brisbane, Australia, 10–14 July 2023; pp. 1889–1894. [Google Scholar]
- Huang, J.; Liu, Y.; Zhao, F.; Yan, K.; Zhang, J.; Huang, Y.; Zhou, M.; Xiong, Z. Deep fourier-based exposure correction network with spatial-frequency interaction. In Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022; pp. 163–180. [Google Scholar]
- Zhao, C.; Cai, W.; Dong, C.; Hu, C. Wavelet-based Fourier Information Interaction with Frequency Diffusion Adjustment for Underwater Image Restoration. arXiv 2023, arXiv:2311.16845. [Google Scholar]
- Wang, X.; Fu, X.; Jiang, P.T.; Huang, J.; Zhou, M.; Li, B.; Zha, Z.J. Decoupling Degradation and Content Processing for Adverse Weather Image Restoration. arXiv 2023, arXiv:2312.05006. [Google Scholar]
- Song, T.; Fan, S.; Jin, J.; Jin, G.; Fan, L. Exploring an efficient frequency-guidance transformer for single image deraining. Signal Image Video Process. 2024, 18, 2429–2438. [Google Scholar] [CrossRef]
- Song, T.; Fan, S.; Li, P.; Jin, J.; Jin, G.; Fan, L. Learning an effective transformer for remote sensing satellite image dehazing. IEEE Geosci. Remote Sens. Lett. 2023, 20, 8002305. [Google Scholar] [CrossRef]
- Kulkarni, A.; Murala, S. Aerial Image Dehazing with Attentive Deformable Transformers. In Proceedings of the Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 2–7 January 2023; pp. 6305–6314. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems; The MIT Press: Cambridge, MA, USA, 2017; Volume 30. [Google Scholar]
- Chen, X.; Li, H.; Li, M.; Pan, J. Learning a sparse transformer network for effective image deraining. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 5896–5905. [Google Scholar]
- Kong, L.; Dong, J.; Ge, J.; Li, M.; Pan, J. Efficient frequency domain-based transformers for high-quality image deblurring. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 5886–5895. [Google Scholar]
- Pan, H.; Zhu, X.; Atici, S.F.; Cetin, A. A hybrid quantum-classical approach based on the hadamard transform for the convolutional layer. In Proceedings of the International Conference on Machine Learning (ICML), Honolulu, HI, USA, 23–29 July 2023; pp. 26891–26903. [Google Scholar]
- Potlapalli, V.; Zamir, S.W.; Khan, S.; Khan, F.S. Promptir: Prompting for all-in-one blind image restoration. arXiv 2023, arXiv:2306.13090. [Google Scholar]
- Berman, D.; Treibitz, T.; Avidan, S. Air-light estimation using haze-lines. In Proceedings of the International Conference on Computational Photography (ICCP), Stanford, CA, USA, 12–14 May 2017; pp. 1–9. [Google Scholar]
- Li, B.; Peng, X.; Wang, Z.; Xu, J.; Feng, D. Aod-net: All-in-one dehazing network. In Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 4770–4778. [Google Scholar]
- Mei, K.; Jiang, A.; Li, J.; Wang, M. Progressive feature fusion network for realistic image dehazing. In Proceedings of the Asian Conference on Computer Vision (ACCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 203–215. [Google Scholar]
- Qin, X.; Wang, Z.; Bai, Y.; Xie, X.; Jia, H. FFA-Net: Feature fusion attention network for single image dehazing. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), New York, NY, USA, 7–12 February 2020; Volume 34, pp. 11908–11915. [Google Scholar]
- Ullah, H.; Muhammad, K.; Irfan, M.; Anwar, S.; Sajjad, M.; Imran, A.S.; de Albuquerque, V.H.C. Light-DehazeNet: A novel lightweight CNN architecture for single image dehazing. IEEE Trans. Image Process. 2021, 30, 8968–8982. [Google Scholar] [CrossRef]
- Wang, Z.; Cun, X.; Bao, J.; Zhou, W.; Liu, J.; Li, H. Uformer: A general u-shaped transformer for image restoration. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 17683–17693. [Google Scholar]
- Li, Y.; Chen, X. A coarse-to-fine two-stage attentive network for haze removal of remote sensing images. IEEE Geosci. Remote Sens. Lett. 2020, 18, 1751–1755. [Google Scholar] [CrossRef]
- Liu, P.; Yuan, W.; Fu, J.; Jiang, Z.; Hayashi, H.; Neubig, G. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Comput. Surv. 2023, 55, 1–35. [Google Scholar] [CrossRef]
- Khattak, M.U.; Rasheed, H.; Maaz, M.; Khan, S.; Khan, F.S. Maple: Multi-modal prompt learning. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 19113–19122. [Google Scholar]
- Jin, F.; Lu, J.; Zhang, J.; Zong, C. Instance-aware prompt learning for language understanding and generation. ACM Trans. Asian Low-Resour. Lang. Inf. Process. 2023, 22, 1–18. [Google Scholar] [CrossRef]
- Yi, X.; Xu, H.; Zhang, H.; Tang, L.; Ma, J. Text-IF: Leveraging Semantic Text Guidance for Degradation-Aware and Interactive Image Fusion. arXiv 2024, arXiv:2403.16387. [Google Scholar]
- Kong, X.; Dong, C.; Zhang, L. Towards Effective Multiple-in-One Image Restoration: A Sequential and Prompt Learning Strategy. arXiv 2024, arXiv:2401.03379. [Google Scholar]
- Khan, R.; Mishra, P.; Mehta, N.; Phutke, S.S.; Vipparthi, S.K.; Nandi, S.; Murala, S. Spectroformer: Multi-Domain Query Cascaded Transformer Network for Underwater Image Enhancement. In Proceedings of the Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 4–8 January 2024; pp. 1454–1463. [Google Scholar]
- Agrawal, A.; Mittal, N. Using CNN for facial expression recognition: A study of the effects of kernel size and number of filters on accuracy. Vis. Comput. 2020, 36, 405–412. [Google Scholar] [CrossRef]
- Wen, Y.; Gao, T.; Zhang, J.; Li, Z.; Chen, T. Encoder-Free Multiaxis Physics-Aware Fusion Network for Remote Sensing Image Dehazing. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4705915. [Google Scholar] [CrossRef]
- Dong, H.; Pan, J.; Xiang, L.; Hu, Z.; Zhang, X.; Wang, F.; Yang, M.H. Multi-scale boosted dehazing network with dense feature fusion. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 2157–2167. [Google Scholar]
- Hore, A.; Ziou, D. Image quality metrics: PSNR vs. SSIM. In Proceedings of the International Conference on Pattern Recognition (ICPR), Istanbul, Turkey, 23–26 August 2010; pp. 2366–2369. [Google Scholar]
- Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef]
- Zhang, R.; Isola, P.; Efros, A.A.; Shechtman, E.; Wang, O. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 586–595. [Google Scholar]
- Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H. Restormer: Efficient transformer for high-resolution image restoration. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 5728–5739. [Google Scholar]
- Scribano, C.; Franchini, G.; Prato, M.; Bertogna, M. DCT-Former: Efficient Self-Attention with Discrete Cosine Transform. J. Sci. Comput. 2023, 94, 67. [Google Scholar] [CrossRef]
Table 3. Model efficiency comparison: parameter counts and FLOPs.

| Method | Parameters (M) | FLOPs (G) |
|---|---|---|
| PFFNet | 20.85 | 7.24 |
| FFA-Net | 4.45 | 287.53 |
| MSBDN | 31.35 | 41.52 |
| SCANet | 2.38 | 17.65 |
| DehazeFormer | 9.68 | 89.85 |
| UFormer | 20.60 | 41.09 |
| AIDNet | 27.12 | 439.92 |
| RSDformer | 4.27 | 50.12 |
| Ours | 8.56 | 83.19 |
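Figures like those in Table 3 can be measured with a counter such as `thop`; the sketch below shows one common way to do so (the 256 × 256 input size is our assumption, and `thop` technically reports multiply-accumulate operations, which are often quoted as FLOPs):

```python
import torch
from thop import profile  # pip install thop

def efficiency(model: torch.nn.Module, size: int = 256):
    """Return (parameters in M, MACs in G) for one forward pass on a
    hypothetical 1x3xHxW input; adapt `size` to the benchmark resolution."""
    params = sum(p.numel() for p in model.parameters()) / 1e6
    dummy = torch.randn(1, 3, size, size)
    macs, _ = profile(model, inputs=(dummy,), verbose=False)
    return params, macs / 1e9
```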
Table 4. Ablation study of different components on the thin haze dataset; ✓ indicates the component is included.

| Model | FPAE | CRFN | SFAB | PSNR | SSIM | LPIPS |
|---|---|---|---|---|---|---|
| Baseline | | | | 23.46 | 0.9044 | 0.0667 |
| (a) | ✓ | | | 25.08 | 0.9134 | 0.0648 |
| (b) | | ✓ | | 24.60 | 0.9080 | 0.0705 |
| (c) | ✓ | ✓ | | 24.44 | 0.9137 | 0.0617 |
| (d) | ✓ | | ✓ | 24.64 | 0.9142 | 0.0627 |
| (e) | | ✓ | ✓ | 24.99 | 0.9101 | 0.0563 |
| (f) | ✓ | ✓ | ✓ | 25.32 | 0.9153 | 0.0619 |
Table 5. Ablation study of loss function combinations; ✓ indicates the loss term (Section 3.5) is enabled.

| Model | Loss Term 1 | Loss Term 2 | Loss Term 3 | PSNR | SSIM | LPIPS |
|---|---|---|---|---|---|---|
| (g) | ✓ | | | 24.83 | 0.9123 | 0.0687 |
| (h) | ✓ | ✓ | | 25.16 | 0.9141 | 0.0705 |
| (i) | ✓ | | ✓ | 24.43 | 0.9141 | 0.0568 |
| (j) | ✓ | ✓ | ✓ | 25.32 | 0.9153 | 0.0619 |
Table 6. Ablation study of the LPB and FCR designs; ✓ indicates the component is retained.

| Model | LPB | FCR | PSNR | SSIM | LPIPS |
|---|---|---|---|---|---|
| (k) | | ✓ | 24.64 | 0.9127 | 0.0538 |
| (l) | ✓ | | 24.77 | 0.9145 | 0.0552 |
| (m) | ✓ | ✓ | 25.32 | 0.9153 | 0.0619 |