LMDFS: A Lightweight Model for Detecting Forest Fire Smoke in UAV Images Based on YOLOv7
"> Figure 1
<p>The network architecture of YOLOv7.</p> "> Figure 2
<p>Structure of the model Ghost Shuffle Convolution.</p> "> Figure 3
<p>ELAN model before and after improvement: (<b>a</b>) Structure of W-ELAN; (<b>b</b>) Structure of GS-ELAN.</p> "> Figure 4
<p>Spatial pyramid pooling module before and after improvement: (<b>a</b>) Structure of SPPCSPC; (<b>b</b>) Structure of GSSPPFCSPC.</p> "> Figure 4 Cont.
<p>Spatial pyramid pooling module before and after improvement: (<b>a</b>) Structure of SPPCSPC; (<b>b</b>) Structure of GSSPPFCSPC.</p> "> Figure 5
<p>Flowchart of attention mechanism CA.</p> "> Figure 6
<p>Sampling on CARAFE.</p> "> Figure 7
<p>Schematic diagram of SIoU.</p> "> Figure 8
<p>The network architecture of modified YOLOv7.</p> "> Figure 9
<p>Typical representative images of a dataset of forest fire smoke. (<b>a</b>) Normal smoke; (<b>b</b>) Smoke with multiple scales; (<b>c</b>) Smoke with interference containing something similar to smoke in images; (<b>d</b>) Synthetic smoke.</p> "> Figure 10
<p>Comparison of different activation functions.</p> "> Figure 11
<p>Comparison of different loss functions.</p> "> Figure 12
<p>Test results of the original YOLOv7 model and the improved YOLOv7 model in different scenarios: (<b>a</b>) The baseline was unable to detect smoke, while the improved model was able to detect smoke; (<b>b</b>) The baseline was unable to detect the complete smoke, while the improved model was able to accurately identify the complete smoke.</p> "> Figure 12 Cont.
<p>Test results of the original YOLOv7 model and the improved YOLOv7 model in different scenarios: (<b>a</b>) The baseline was unable to detect smoke, while the improved model was able to detect smoke; (<b>b</b>) The baseline was unable to detect the complete smoke, while the improved model was able to accurately identify the complete smoke.</p> "> Figure 13
<p>Recognition results of small smoke and smoke with interference containing something similar to smoke in images: (<b>a</b>) Small smoke; (<b>b</b>) Smoke with interference containing something similar to smoke in images.</p> ">
Abstract
1. Introduction
2. Materials and Methods for Experiments
2.1. YOLOv7
2.2. Improvements to Lightweighting
2.2.1. Ghost Shuffle Convolution
2.2.2. Improvements in the Activation Function
2.3. Improvements in Accuracy
2.3.1. Coordinate Attention
- (1) First, adaptive average pooling is performed simultaneously along the horizontal and vertical directions; its mathematical expression is $z_c^h(h) = \frac{1}{W}\sum_{0 \le i < W} x_c(h, i)$ and $z_c^w(w) = \frac{1}{H}\sum_{0 \le j < H} x_c(j, w)$.
- (2) The two outputs obtained above are then concatenated and passed through a shared 1 × 1 convolution: $f = \delta(F_1([z^h, z^w]))$, where $\delta$ is a non-linear activation.
- (3) Next, $f$ is split along the spatial dimension into $f^h$ and $f^w$, and 1 × 1 convolutions followed by a sigmoid yield the attention maps in the two directions, $g^h = \sigma(F_h(f^h))$ and $g^w = \sigma(F_w(f^w))$.
- (4) Finally, the attention weights and the original feature map are multiplied element-wise to obtain the final output: $y_c(i, j) = x_c(i, j) \times g_c^h(i) \times g_c^w(j)$.
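The four steps above can be sketched in NumPy. This is a minimal illustration for a single image with random weights; the published CA module additionally uses batch normalization and a Hardswish-style non-linearity, which are reduced to a plain ReLU here:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention(x, w1, wh, ww):
    """Coordinate attention forward pass (sketch).

    x  : input feature map, shape (C, H, W)
    w1 : shared 1x1 conv weight, shape (Cr, C)
    wh : 1x1 conv weight for the height attention, shape (C, Cr)
    ww : 1x1 conv weight for the width attention,  shape (C, Cr)
    """
    C, H, W = x.shape
    # (1) average-pool along the width and along the height
    zh = x.mean(axis=2)                  # (C, H): one value per row
    zw = x.mean(axis=1)                  # (C, W): one value per column
    # (2) concatenate along the spatial axis, shared 1x1 conv + activation
    f = np.concatenate([zh, zw], axis=1)  # (C, H+W)
    f = np.maximum(w1 @ f, 0.0)           # (Cr, H+W)
    # (3) split back into the two directions, map to attention weights
    fh, fw = f[:, :H], f[:, H:]
    gh = sigmoid(wh @ fh)                 # (C, H)
    gw = sigmoid(ww @ fw)                 # (C, W)
    # (4) reweight the input with both directional attention maps
    return x * gh[:, :, None] * gw[:, None, :]

rng = np.random.default_rng(0)
C, Cr, H, W = 8, 4, 6, 5
x = rng.standard_normal((C, H, W))
y = coordinate_attention(x,
                         rng.standard_normal((Cr, C)),
                         rng.standard_normal((C, Cr)),
                         rng.standard_normal((C, Cr)))
print(y.shape)
```

Because the two attention maps are one-dimensional, the module captures position along each axis at a cost that is negligible next to the convolutional backbone.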
2.3.2. Content-Aware Reassembly of Features
2.4. SCYLLA-Intersection over Union
- (1) Angle cost: $\Lambda = 1 - 2\sin^2\left(\arcsin(\sin\alpha) - \frac{\pi}{4}\right)$, where $\alpha$ is the angle between the line joining the two box centres and the horizontal axis.
- (2) Distance cost: $\Delta = \sum_{t=x,y}\left(1 - e^{-\gamma\rho_t}\right)$, with $\gamma = 2 - \Lambda$, $\rho_x = \left(\frac{b_{c_x}^{gt} - b_{c_x}}{c_w}\right)^2$, and $\rho_y = \left(\frac{b_{c_y}^{gt} - b_{c_y}}{c_h}\right)^2$, where $c_w$ and $c_h$ are the width and height of the smallest enclosing box.
- (3) Shape cost: $\Omega = \sum_{t=w,h}\left(1 - e^{-\omega_t}\right)^\theta$, with $\omega_w = \frac{|w - w^{gt}|}{\max(w, w^{gt})}$ and $\omega_h = \frac{|h - h^{gt}|}{\max(h, h^{gt})}$.
- (4) IoU cost: the terms are combined into the final loss $L_{SIoU} = 1 - IoU + \frac{\Delta + \Omega}{2}$.
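The four cost terms can be combined in a short, self-contained sketch. Box coordinates, the shape-cost exponent `theta` (commonly in [2, 6]), and the example boxes are illustrative assumptions, not values from the paper:

```python
import math

def siou_loss(box, gt, theta=4):
    """SIoU loss for two axis-aligned boxes given as (cx, cy, w, h)."""
    cx, cy, w, h = box
    gx, gy, gw, gh = gt

    # IoU term
    x1, y1, x2, y2 = cx - w/2, cy - h/2, cx + w/2, cy + h/2
    X1, Y1, X2, Y2 = gx - gw/2, gy - gh/2, gx + gw/2, gy + gh/2
    iw = max(0.0, min(x2, X2) - max(x1, X1))
    ih = max(0.0, min(y2, Y2) - max(y1, Y1))
    inter = iw * ih
    iou = inter / (w * h + gw * gh - inter)

    # (1) angle cost: 1 when the centres lie at 45 degrees, 0 on an axis
    sigma = math.hypot(gx - cx, gy - cy) or 1e-9
    sin_alpha = min(abs(gy - cy), abs(gx - cx)) / sigma
    lam = 1 - 2 * math.sin(math.asin(sin_alpha) - math.pi / 4) ** 2

    # (2) distance cost, damped by the angle cost via gamma = 2 - lam
    cw = max(x2, X2) - min(x1, X1)   # enclosing-box width
    ch = max(y2, Y2) - min(y1, Y1)   # enclosing-box height
    gamma = 2 - lam
    delta = (1 - math.exp(-gamma * ((gx - cx) / cw) ** 2)) \
          + (1 - math.exp(-gamma * ((gy - cy) / ch) ** 2))

    # (3) shape cost: penalises width/height mismatch
    ww = abs(w - gw) / max(w, gw)
    wh = abs(h - gh) / max(h, gh)
    omega = (1 - math.exp(-ww)) ** theta + (1 - math.exp(-wh)) ** theta

    # (4) combine with the IoU term
    return 1 - iou + (delta + omega) / 2

same = siou_loss((5, 5, 4, 2), (5, 5, 4, 2))   # identical boxes
apart = siou_loss((0, 0, 2, 2), (3, 0, 2, 2))  # disjoint boxes
print(same, apart)
```

Identical boxes yield a loss of zero, while disjoint boxes are penalised through all three extra terms, which is what gives SIoU faster convergence than plain IoU when predicted boxes are far from the target.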
2.5. A Lightweight Model for Detecting Forest Fire Smoke Based on YOLOv7
3. Methods for Evaluation
3.1. The Dataset
3.2. Evaluation of the Model
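The tables below report precision (P), recall (R), and mAP. As a minimal single-class illustration (the detections and ground-truth count are hypothetical), AP is the area under the interpolated precision-recall curve built by sweeping the confidence threshold:

```python
def average_precision(scores, matches, num_gt):
    """AP for one class.

    scores  : detection confidences
    matches : True where the detection matched an unclaimed ground-truth box
    num_gt  : total number of ground-truth boxes
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    points = []
    for i in order:                       # sweep threshold from high to low
        if matches[i]:
            tp += 1
        else:
            fp += 1
        points.append((tp / num_gt, tp / (tp + fp)))  # (recall, precision)

    # precision envelope: best precision at or beyond each recall level
    rs = [r for r, _ in points]
    ps = [p for _, p in points]
    for i in range(len(ps) - 2, -1, -1):
        ps[i] = max(ps[i], ps[i + 1])

    # integrate the envelope over recall
    ap, prev = 0.0, 0.0
    for r, p in zip(rs, ps):
        ap += (r - prev) * p
        prev = r
    return ap

# three detections, two ground-truth boxes: the second detection is a false positive
ap = average_precision([0.9, 0.8, 0.7], [True, False, True], num_gt=2)
print(round(ap, 3))
```

mAP@0.5 averages this AP over classes with matches decided at an IoU threshold of 0.5, and mAP@0.5:0.95 further averages over IoU thresholds from 0.5 to 0.95.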
3.3. Comparison with Other Models
- (1) Faster R-CNN: a popular two-stage object detector that combines a region proposal network (RPN) with Fast R-CNN. It achieves high accuracy but has slower inference than the other models compared here.
- (2) EfficientDet: a state-of-the-art object detector that achieves high accuracy while remaining computationally efficient, using a compound scaling method to balance accuracy and efficiency.
- (3) SSD: the Single Shot MultiBox Detector is a fast, real-time object detector. It predicts bounding boxes and class probabilities from multiple feature layers, but its accuracy can be lower than that of the other models compared here.
- (4) RetinaNet: an object detector that addresses class imbalance during training by introducing the focal loss. It offers a good balance between accuracy and speed, although it does not achieve the highest accuracy among the models compared here.
- (5) YOLOv5: part of the You Only Look Once (YOLO) series, known for real-time object detection. YOLOv5 is lightweight, has a small model size, achieves a good balance between accuracy and speed, and is suitable for a wide range of applications.
4. Results
4.1. The Environment for Training and Hyper-Parameters
4.2. Analysis of Module Effectiveness
4.2.1. Effectiveness of Hardswish
4.2.2. Effectiveness of CA
4.2.3. Effectiveness of SIoU
4.3. Ablation Experiments
4.4. Comparison Experiments
4.5. Testing in Different Scenarios
5. Discussion and Conclusions
6. Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Scholten, R.C.; Jandt, R.; Miller, E.A.; Rogers, B.M.; Veraverbeke, S. Overwintering fires in boreal forests. Nature 2021, 593, 399–404.
- Hu, Y.; Zhan, J.; Zhou, G.; Chen, A.; Cai, W.; Guo, K.; Hu, Y.; Li, L. Fast forest fire smoke detection using MVMNet. Knowl.-Based Syst. 2022, 241, 108219.
- Zheng, R.; Zhang, D.; Lu, S.; Yang, S. Discrimination Between Fire Smokes and Nuisance Aerosols Using Asymmetry Ratio and Two Wavelengths. Fire Technol. 2019, 55, 1753–1770.
- Li, X.; Liu, J.; Huang, Y.; Wang, D.; Miao, Y. Human Motion Pattern Recognition and Feature Extraction: An Approach Using Multi-Information Fusion. Micromachines 2022, 13, 1205.
- Gubbi, J.; Marusic, S.; Palaniswami, M. Smoke detection in video using wavelets and support vector machines. Fire Saf. J. 2009, 44, 1110–1115.
- López-Naranjo, E.J.; Alzate-Gaviria, L.M.; Hernández-Zárate, G.; Reyes-Trujeque, J.; Cruz-Estrada, R.H. Effect of accelerated weathering and termite attack on the tensile properties and aesthetics of recycled HDPE-pinewood composites. J. Thermoplast. Compos. Mater. 2013, 27, 831–844.
- Emmy Prema, C.; Vinsley, S.S.; Suresh, S. Multi Feature Analysis of Smoke in YUV Color Space for Early Forest Fire Detection. Fire Technol. 2016, 52, 1319–1342.
- Rong, D.; Xie, L.; Ying, Y. Computer vision detection of foreign objects in walnuts using deep learning. Comput. Electron. Agric. 2019, 162, 1001–1010.
- Khan, S.; Muhammad, K.; Mumtaz, S.; Baik, S.W.; de Albuquerque, V.H.C. Energy-Efficient Deep CNN for Smoke Detection in Foggy IoT Environment. IEEE Internet Things J. 2019, 6, 9237–9245.
- Minghua, J.; Yaxin, Z.; Feng, Y.; Changlong, Z.; Tao, P. A self-attention network for smoke detection. Fire Saf. J. 2022, 129, 103547.
- Wu, X.; Lu, X.; Leung, H. A motion and lightness saliency approach for forest smoke segmentation and detection. Multimed. Tools Appl. 2019, 79, 69–88.
- Yin, H.; Wei, Y.; Liu, H.; Liu, S.; Liu, C.; Gao, Y. Deep Convolutional Generative Adversarial Network and Convolutional Neural Network for Smoke Detection. Complexity 2020, 2020, 6843869.
- Zhang, Q.; Lin, G.; Zhang, Y.; Xu, G.; Wang, J. Wildland forest fire smoke detection based on faster R-CNN using synthetic smoke images. Procedia Eng. 2018, 211, 441–446.
- Guo, Y.; Chen, S.; Zhan, R.; Wang, W.; Zhang, J. LMSD-YOLO: A Lightweight YOLO Algorithm for Multi-Scale SAR Ship Detection. Remote Sens. 2022, 14, 4801.
- Li, W.; Zhang, L.; Wu, C.; Cui, Z.; Niu, C. A new lightweight deep neural network for surface scratch detection. Int. J. Adv. Manuf. Technol. 2022, 123, 1999–2015.
- Sheng, D.; Deng, J.; Xiang, J. Automatic Smoke Detection Based on SLIC-DBSCAN Enhanced Convolutional Neural Network. IEEE Access 2021, 9, 63933–63942.
- Ilina, O.V.; Tereshonok, M.V. Robustness study of a deep convolutional neural network for vehicle detection in aerial imagery. J. Commun. Technol. Electron. 2022, 67, 164–170.
- Marciniak, T.; Chmielewska, A.; Weychan, R.; Parzych, M.; Dabrowski, A. Influence of low resolution of images on reliability of face detection and recognition. Multimed. Tools Appl. 2015, 74, 4329–4349.
- Wang, C.; Bochkovskiy, A.; Liao, H.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv 2022, arXiv:2207.02696.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 779–788.
- Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271.
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767.
- Bochkovskiy, A.; Wang, C.; Liao, H.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934.
- Zhang, X.; Zeng, H.; Guo, S.; Zhang, L. Efficient long-range attention network for image super-resolution. In Proceedings of the Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; Part XVII, pp. 649–667.
- Lin, T.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path aggregation network for instance segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8759–8768.
- Li, H.; Li, J.; Wei, H.; Liu, Z.; Zhan, Z.; Ren, Q. Slim-neck by GSConv: A better design paradigm of detector architectures for autonomous vehicles. arXiv 2022, arXiv:2206.02424.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916.
- Hou, Q.; Zhou, D.; Feng, J. Coordinate attention for efficient mobile network design. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 13713–13722.
- Wang, J.; Chen, K.; Xu, R.; Liu, Z.; Loy, C.C.; Lin, D. CARAFE: Content-aware reassembly of features. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3007–3016.
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 12993–13000.
- Xu, G.; Zhang, Y.; Zhang, Q.; Lin, G.; Wang, J. Deep domain adaptation based video smoke detection using synthetic smoke images. Fire Saf. J. 2017, 93, 53–59.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141.
- Woo, S.; Park, J.; Lee, J.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 658–666.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems 28, Montreal, QC, Canada, 7–12 December 2015.
- Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 10781–10790.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; Part I, pp. 21–37.
- Lin, T.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
- Zhang, L.; Wang, M.; Ding, Y.; Bu, X. MS-FRCNN: A Multi-Scale Faster RCNN Model for Small Target Forest Fire Detection. Forests 2023, 14, 616.
- Khan, S.; Khan, A. FFireNet: Deep learning based forest fire classification and detection in smart cities. Symmetry 2022, 14, 2155.
- Hendel, I.; Ross, G.M. Efficacy of remote sensing in early forest fire detection: A thermal sensor comparison. Can. J. Remote Sens. 2020, 46, 414–428.
- Enoh, M.A.; Okeke, U.C.; Narinua, N.Y. Identification and modelling of forest fire severity and risk zones in the Cross–Niger transition forest with remotely sensed satellite data. Egypt. J. Remote Sens. Space Sci. 2021, 24, 879–887.
- Wang, Y.; Xu, R.; Bai, D.; Lin, H. Integrated Learning-Based Pest and Disease Detection Method for Tea Leaves. Forests 2023, 14, 1012.
| Category | Overhead: normal smoke | Oblique: normal smoke | Oblique: small smoke | Oblique: smoke-like interference | Oblique: synthetic smoke |
|---|---|---|---|---|---|
| Number of images | 377 | 1017 | 968 | 1657 | 1292 |
| Dataset | Train | Validation | Test | Total |
|---|---|---|---|---|
| Number | 4249 | 531 | 531 | 5311 |
| Experimental Environment | Details |
|---|---|
| Programming language | Python 3.8 |
| Operating system | Windows 10 |
| Deep learning framework | PyTorch 1.10.0 |
| GPU | NVIDIA GeForce RTX 3080 |
| GPU acceleration tool | CUDA 11.0 |
| Training Parameter | Details |
|---|---|
| Epochs | 300 |
| Batch size | 16 |
| Image size (pixels) | 640 × 640 |
| Initial learning rate | 0.01 |
| Optimization algorithm | SGD |
| Model | P/% | R/% | mAP@0.5/% |
|---|---|---|---|
| YOLOv7 | 69.8 | 68.1 | 74.3 |
| YOLOv7-CBAM | 67.9 | 65.1 | 71.5 |
| YOLOv7-SE | 76.5 | 65.1 | 75.6 |
| YOLOv7-CA | 75.3 | 68.9 | 76.9 |
| Model | P/% | R/% | mAP@0.5/% | mAP@0.5:0.95/% | Parameters/M | GFLOPs | FPS |
|---|---|---|---|---|---|---|---|
| YOLOv7 | 69.8 | 65.0 | 74.3 | 47.4 | 9.32 | 26.7 | 62.89 |
| YOLOv7-CA | 70.3 | 71.9 | 76.9 | 49.8 | 9.38 | 26.8 | 64.1 |
| YOLOv7-SIoU | 72.3 | 73.1 | 77.7 | 51.1 | 9.32 | 26.7 | 65.3 |
| YOLOv7-GSELAN | 69.8 | 73.2 | 75.5 | 48.8 | 8.67 | 25.4 | 67.3 |
| YOLOv7-GSSPPFCSPC | 73.1 | 67.7 | 74.8 | 48.1 | 8.45 | 26.0 | 64.1 |
| YOLOv7-CARAFE | 71.7 | 71.6 | 76.8 | 49.8 | 9.36 | 26.8 | 60.25 |
| YOLOv7-CA-SIoU | 74.0 | 71.2 | 78.2 | 51.2 | 9.38 | 26.8 | 64.5 |
| YOLOv7-CA-SIoU-GSI | 73.8 | 71.5 | 79.2 | 51.0 | 7.93 | 25.0 | 67.34 |
| Ours | 77.1 | 71.8 | 80.2 | 52.8 | 7.96 | 25.1 | 63.39 |
| Indicator | mAP@0.5 | Parameters | GFLOPs | FPS |
|---|---|---|---|---|
| Null hypothesis | There is no significant difference between our model and the baseline. | | | |
| Statistical test method | Corrected paired Student's t-test | | | |
| p-value/% | 2.21 | 0.93 | 0.55 | 0.86 |
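The corrected paired Student's t-test is commonly the Nadeau–Bengio corrected resampled t-test, which inflates the variance to account for overlapping training sets across resampled splits. The sketch below shows the statistic only (the p-value then comes from a t distribution with n − 1 degrees of freedom); the per-run differences are hypothetical example values, and `test_frac` is the test/train size ratio, 531/4249 for the split reported above:

```python
import math

def corrected_paired_t(diffs, test_frac):
    """Corrected resampled t statistic (Nadeau & Bengio style).

    diffs     : per-run metric differences between two models on the
                same resampled splits (must not be all identical)
    test_frac : n_test / n_train, the correction for the optimistic
                variance estimate caused by overlapping training sets
    """
    n = len(diffs)
    d_bar = sum(diffs) / n
    var = sum((d - d_bar) ** 2 for d in diffs) / (n - 1)  # sample variance
    return d_bar / math.sqrt(var * (1 / n + test_frac))

# hypothetical mAP@0.5 differences over five resampled runs
t = corrected_paired_t([1.0, 1.2, 0.8, 1.1, 0.9], test_frac=531 / 4249)
print(round(t, 2))
```

Without the `test_frac` term this reduces to the ordinary paired t-test, which is known to be overconfident for resampled machine-learning comparisons.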
| Model | mAP@0.5/% | GFLOPs | Parameters/M | FPS |
|---|---|---|---|---|
| Faster R-CNN | 81.1 | 206.66 | 41.12 | 38.8 |
| EfficientDet | 71.9 | 116.73 | 18.34 | 27.8 |
| SSD | 68.2 | 342.75 | 23.75 | 94 |
| RetinaNet | 73.5 | 153.79 | 19.61 | 50.1 |
| YOLOv5m | 75.1 | 48.2 | 20.8 | 80.2 |
| Ours | 80.2 | 25.1 | 7.96 | 63.39 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Chen, G.; Cheng, R.; Lin, X.; Jiao, W.; Bai, D.; Lin, H. LMDFS: A Lightweight Model for Detecting Forest Fire Smoke in UAV Images Based on YOLOv7. Remote Sens. 2023, 15, 3790. https://doi.org/10.3390/rs15153790