A Novel Transformer-Based Attention Network for Image Dehazing
Figure 1. Architecture of the proposed Transformer for image dehazing (TID).
Figure 2. Architecture of the proposed Transformer-based channel attention module (TCAM).
Figure 3. Architecture of the proposed attention module.
Figure 4. Dehazing results on the HSTS dataset. (a) Hazy image; (b) Fattal's [26]; (c) FVR [43]; (d) DehazeNet [32]; (e) AOD-Net [15]; (f) EPDN [35]; (g) AECR-Net [36]; (h) TID (ours); (i) ground truth.
Figure 5. Dehazing results on the HSTS dataset. (a) Hazy image; (b) Fattal's [26]; (c) FVR [43]; (d) DehazeNet [32]; (e) AOD-Net [15]; (f) EPDN [35]; (g) AECR-Net [36]; (h) TID (ours).
Figure 6. Effects of ablation study (1).
Figure 7. Effects of ablation study (2).
Figure 8. Effects of ablation study (3).
Abstract
1. Introduction
- We propose applying a Transformer as a channel attention module for the image dehazing task. We perform quantitative and qualitative comparisons with state-of-the-art methods on synthetic and real-world hazy image datasets, achieving better results on both.
- Our proposed Transformer-based channel attention module (TCAM) is a plug-and-play module that can be applied to other models or tasks, such as image classification, object detection, etc.
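The paper does not spell out TCAM's internals at this point, but the core idea of a Transformer-based channel attention module can be illustrated: flatten each feature channel into a token and let self-attention model inter-channel dependencies. The following is a minimal, hypothetical numpy sketch (function and weight names `channel_self_attention`, `w_q`, `w_k`, `w_v` are illustrative, not from the paper):

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_self_attention(feat, w_q, w_k, w_v):
    """Single-head self-attention over *channel* tokens.

    feat : (C, H, W) feature map. Each channel is flattened into one
           token of length H*W, so the resulting (C, C) attention map
           expresses inter-channel dependencies.
    w_q, w_k : (H*W, d) query/key projections.
    w_v : (H*W, H*W) value projection (keeps the token length).
    Returns the re-weighted feature map and the channel-affinity map.
    """
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w)                  # C tokens of dim H*W
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))   # (C, C) attention map
    out = (attn @ v).reshape(c, h, w)                # re-weighted channels
    return out, attn

# toy example: 4 channels of an 8x8 feature map
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 8, 8))
w_q = rng.standard_normal((64, 16))
w_k = rng.standard_normal((64, 16))
w_v = rng.standard_normal((64, 64))
out, attn = channel_self_attention(feat, w_q, w_k, w_v)
```

A full Transformer block would add multiple heads, a feed-forward sub-layer, and residual connections; this sketch only shows why the attention map operates channel-wise rather than spatially.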
2. Related Work
2.1. Prior-Based Method
2.2. Learning-Based Method
3. Proposed Network Framework
3.1. Multiscale Parallel Residual Module
3.2. Attention Module
3.2.1. Transformer-Based Channel Attention Module
3.2.2. Spatial Attention Module
4. Results
4.1. Comparison with State-of-the-Art Methods
4.1.1. Quantitative and Qualitative Results on the Synthetic Dataset
4.1.2. Qualitative Results in Real-World Hazy Images
4.2. Ablation Studies
4.2.1. Attention Module and Non-Attention Module
4.2.2. Comparison of Our Proposed Attention Module with SE-Net and CBAM
4.2.3. Average-Pooled Patch Token versus Class Token as the Output of TCAM
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Kumari, A.; Sahoo, S.K. Real time image and video deweathering: The future prospects and possibilities. Optik 2016, 127, 829–839.
- Narasimhan, S.G.; Nayar, S.K. Interactive (de)weathering of an image using physical models. IEEE Workshop Color Photom. Methods Comput. Vis. 2003, 6, 1.
- Zhu, Z.; Luo, Y.; Wei, H.; Li, Y.; Qi, G.; Mazur, N.; Li, Y.; Li, P. Atmospheric Light Estimation Based Remote Sensing Image Dehazing. Remote Sens. 2021, 13, 2432.
- Yin, S.; Wang, Y.; Yang, Y.-H. Attentive U-recurrent encoder-decoder network for image dehazing. Neurocomputing 2021, 437, 143–156.
- Wang, Y.; Liu, S.; Chen, C.; Zeng, B. A Hierarchical Approach for Rain or Snow Removing in a Single Color Image. IEEE Trans. Image Process. 2017, 26, 3936–3950.
- Li, Z.; Zhang, J.; Zhong, R.; Bhanu, B.; Chen, Y.; Zhang, Q.; Tang, H. Lightweight and Efficient Image Dehazing Network Guided by Transmission Estimation from Real-World Hazy Scenes. Sensors 2021, 21, 960.
- Shin, J.; Paik, J. Photo-Realistic Image Dehazing and Verifying Networks via Complementary Adversarial Learning. Sensors 2021, 21, 6182.
- Zhu, Z.; Luo, Y.; Qi, G.; Meng, J.; Li, Y.; Mazur, N. Remote Sensing Image Defogging Networks Based on Dual Self-Attention Boost Residual Octave Convolution. Remote Sens. 2021, 13, 3104.
- Kim, J.-H.; Sim, J.-Y.; Kim, C.-S. Video Deraining and Desnowing Using Temporal Correlation and Low-Rank Matrix Completion. IEEE Trans. Image Process. 2015, 24, 2658–2670.
- Chaitanya, B.; Mukherjee, S. Single image dehazing using improved CycleGAN. J. Vis. Commun. Image Represent. 2021, 74, 103014.
- El Mahdaoui, A.; Ouahabi, A.; Moulay, M.S. Image Denoising Using a Compressive Sensing Approach Based on Regularization Constraints. Sensors 2022, 22, 2199.
- Haneche, H.; Boudraa, B.; Ouahabi, A. A new way to enhance speech signal based on compressed sensing. Measurement 2020, 151, 107117.
- Ouahabi, A. A review of wavelet denoising in medical imaging. In Proceedings of the 2013 8th IEEE International Workshop on Systems, Signal Processing and Their Applications (WoSSPA), Algiers, Algeria, 12–15 May 2013; pp. 19–26.
- McCartney, E.J. Optics of the Atmosphere: Scattering by Molecules and Particles. Phys. Today 1977, 30, 76–77.
- Li, B.; Peng, X.; Wang, Z.; Xu, J.; Feng, D. AOD-Net: All-in-one dehazing network. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4770–4778.
- Zhao, H.; Jia, J.; Koltun, V. Exploring self-attention for image recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Online, 14–19 June 2020; pp. 10076–10085.
- Ramachandran, P.; Parmar, N.; Vaswani, A.; Bello, I.; Levskaya, A.; Shlens, J. Stand-alone self-attention in vision models. arXiv 2019, arXiv:1906.05909.
- Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional block attention module. arXiv 2018, arXiv:1807.06521.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
- Zhao, D.; Li, J.; Li, H.; Xu, L. Hybrid local-global transformer for image dehazing. arXiv 2021, arXiv:2109.07100.
- Han, K.; Wang, Y.; Chen, H.; Chen, X.; Guo, J.; Liu, Z.; Tang, Y.; Xiao, A.; Xu, C.; Xu, Y.; et al. A Survey on Vision Transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 16, 140.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Online, 19–25 June 2021; pp. 10012–10022.
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-end object detection with transformers. In Proceedings of the European Conference on Computer Vision, Online, 23–28 August 2020; pp. 213–229.
- Sharma, R.; Chopra, V. Applications: A review on different image dehazing methods. Int. J. Comput. Appl. 2014, 6, 11.
- Fattal, R. Single image dehazing. ACM Trans. Graph. 2008, 27, 1–9.
- Tan, R.T. Visibility in bad weather from a single image. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 23–28 June 2008; pp. 1–8.
- He, K.; Sun, J.; Tang, X. Single image haze removal using dark channel prior. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 2341–2353.
- Tang, K.; Yang, J.; Wang, J. Investigating haze-relevant features in a learning framework for image dehazing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 2995–3000.
- Berman, D.; Avidan, S. Non-local image dehazing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 27–30 June 2016; pp. 1674–1682.
- Zhu, Z.; Wei, H.; Hu, G.; Li, Y.; Qi, G.; Mazur, N. A Novel Fast Single Image Dehazing Algorithm Based on Artificial Multiexposure Image Fusion. IEEE Trans. Instrum. Meas. 2021, 70, 1–23.
- Cai, B.; Xu, X.; Jia, K.; Qing, C.; Tao, D. DehazeNet: An End-to-End System for Single Image Haze Removal. IEEE Trans. Image Process. 2016, 25, 5187–5198.
- Ren, W.; Liu, S.; Zhang, H.; Pan, J.; Cao, X.; Yang, M.-H. Single Image Dehazing via Multi-scale Convolutional Neural Networks. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 154–169.
- Zhang, H.; Patel, V.M. Densely connected pyramid dehazing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 3194–3203.
- Qu, Y.; Chen, Y.; Huang, J.; Xie, Y. Enhanced pix2pix dehazing network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 8160–8168.
- Wu, H.; Qu, Y.; Lin, S.; Zhou, J.; Qiao, R.; Zhang, Z.; Xie, Y.; Ma, L. Contrastive learning for compact single image dehazing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Online, 19–25 June 2021; pp. 10551–10560.
- Yeh, C.-H.; Huang, C.-H.; Kang, L.-W. Multi-Scale Deep Residual Learning-Based Single Image Haze Removal via Image Decomposition. IEEE Trans. Image Process. 2020, 29, 3153–3167.
- Glorot, X.; Bordes, A.; Bengio, Y. Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 11–13 April 2011; pp. 315–323.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 117.
- Wang, Q.; Li, B.; Xiao, T.; Zhu, J.; Li, C.; Wong, D.F.; Chao, L. Learning deep transformer models for machine translation. arXiv 2019, arXiv:1906.01787.
- Touvron, H.; Cord, M.; Douze, M.; Massa, F.; Sablayrolles, A.; Jégou, H. Training data-efficient image transformers & distillation through attention. In Proceedings of the International Conference on Machine Learning, Online, 18–24 July 2021; pp. 10347–10357.
- Li, B.; Ren, W.; Fu, D.; Tao, D.; Feng, D.; Zeng, W.; Wang, Z. Benchmarking Single-Image Dehazing and Beyond. IEEE Trans. Image Process. 2018, 28, 492–505.
- Tarel, J.-P.; Hautiere, N. Fast visibility restoration from a single color or gray level image. In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan, 29 September–2 October 2009; pp. 2201–2208.
- Wang, Z.; Bovik, A.; Sheikh, H.; Simoncelli, E. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
| Layer | Kernel Size/Padding | Output Channels |
|---|---|---|
| Conv1_1 | 1 × 1/0 | 3 |
| Conv1_2 | 3 × 3/1 | 3 |
| Conv1_3 | 5 × 5/2 | 9 |
| Conv2_1 | 3 × 3/1 | 3 |
| Conv2_2 | 5 × 5/2 | 3 |
| Conv2_3 | 7 × 7/3 | 9 |
| Conv3 | 3 × 3/1 | 3 |
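One property worth noting in the configuration above: every kernel-size/padding pair satisfies p = (k − 1)/2, so with stride 1 each convolution preserves spatial resolution, letting the parallel branches be fused without resampling. A small sketch verifying this with the standard convolution output-size formula (the helper name `conv_out_size` is ours):

```python
def conv_out_size(n, k, p, s=1):
    """Output length of a convolution along one axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

# (kernel, padding) pairs taken from the table above
layers = {
    "Conv1_1": (1, 0), "Conv1_2": (3, 1), "Conv1_3": (5, 2),
    "Conv2_1": (3, 1), "Conv2_2": (5, 2), "Conv2_3": (7, 3),
    "Conv3":   (3, 1),
}

# with stride 1, p = (k - 1) // 2 is "same" padding for odd kernels,
# so a 64x64 input stays 64x64 through every layer
same = all(conv_out_size(64, k, p) == 64 for k, p in layers.values())
```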
| Dataset | Metric | Fattal's | FVR | DehazeNet | AOD-Net | EPDN | AECR-Net | Ours |
|---|---|---|---|---|---|---|---|---|
| SOTS | PSNR | 16.1143 | 16.8931 | 18.7453 | 18.5211 | 20.1722 | 20.5466 | 21.4393 |
| SOTS | SSIM | 0.7261 | 0.7484 | 0.8314 | 0.8314 | 0.8576 | 0.8642 | 0.8851 |
| HSTS | PSNR | 17.7348 | 18.0142 | 21.2218 | 21.2218 | 22.3145 | 22.7693 | 23.8276 |
| HSTS | SSIM | 0.8123 | 0.8217 | 0.8687 | 0.8687 | 0.8809 | 0.8914 | 0.9022 |
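PSNR scores like those above are derived from the mean squared error between the dehazed output and the ground truth. A minimal numpy sketch, assuming images normalized to [0, 1]:

```python
import numpy as np

def psnr(ref, img, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((np.asarray(ref, dtype=float) - np.asarray(img, dtype=float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# a uniform error of 0.1 on a [0, 1] image gives MSE = 0.01 -> 20 dB
ref = np.zeros((4, 4))
img = np.full((4, 4), 0.1)
```

SSIM is considerably more involved (local means, variances, and covariances per Wang et al.); in practice a library implementation such as scikit-image's `structural_similarity` is typically used.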
| | Non-Attention Module | Attention Module |
|---|---|---|
| PSNR | 20.0566 | 21.4394 |
| SSIM | 0.8543 | 0.8851 |
| | SE-Net | CBAM | Ours |
|---|---|---|---|
| PSNR | 20.2163 | 20.7560 | 21.4394 |
| SSIM | 0.8610 | 0.8593 | 0.8851 |
| | Class Token | Average-Pooled Patch Token |
|---|---|---|
| PSNR | 21.0335 | 21.4394 |
| SSIM | 0.8688 | 0.8851 |
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Gao, G.; Cao, J.; Bao, C.; Hao, Q.; Ma, A.; Li, G. A Novel Transformer-Based Attention Network for Image Dehazing. Sensors 2022, 22, 3428. https://doi.org/10.3390/s22093428