Fittings Detection Method Based on Multi-Scale Geometric Transformation and Attention-Masking Mechanism
Figure captions:
- Transmission line images captured by the UAV.
- Transmission line images from different shooting angles.
- Scale distribution of fittings in different transmission line datasets.
- The basic architecture of MGA-DETR.
- The architecture of the MVGT module.
- The architectures of different FPNs.
- Images from different datasets.
Abstract
1. Introduction
- We design a multi-view geometric transformation (MVGT) enhancement strategy that models the geometric transformation as a combination of multiple homography-transformed images, so that image features can be obtained from multiple views. In addition, we introduce an efficient multi-scale feature fusion method to improve the detection of transmission line fittings across different perspectives and scales (illustrative sketches of both ideas follow this list).
- We introduce an attention-masking mechanism (AMM) that reduces the computational burden of learning multi-scale features, further improving the detection speed of the model without degrading its detection accuracy (see the third sketch after this list).
- We conduct experiments on three transmission line fittings detection datasets; the results show that the proposed method effectively improves the detection accuracy of fittings at different scales captured from different viewpoints.
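The MVGT strategy is not given in code here, so the following is only a minimal sketch of the general idea, assuming each extra view is produced by a small random homography warp of the input image; `random_homography`, `multi_view_augment`, and all parameter values are illustrative, not the authors' implementation.

```python
# Sketch: generate several geometrically transformed "views" of one image,
# simulating different UAV shooting angles via small random homographies.
import cv2
import numpy as np


def random_homography(h: int, w: int, max_shift: float = 0.15) -> np.ndarray:
    """Build a homography by jittering the four image corners."""
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    jitter = (np.random.uniform(-max_shift, max_shift, (4, 2))
              * np.float32([w, h])).astype(np.float32)
    dst = src + jitter
    return cv2.getPerspectiveTransform(src, dst)


def multi_view_augment(image: np.ndarray, num_views: int = 4) -> list:
    """Return `num_views` warped copies of `image`, one per simulated view."""
    h, w = image.shape[:2]
    views = []
    for _ in range(num_views):
        H = random_homography(h, w)
        views.append(cv2.warpPerspective(image, H, (w, h),
                                         borderMode=cv2.BORDER_REFLECT))
    return views
```

In a real detection pipeline the same homography would also be applied to the ground-truth boxes so that the labels stay aligned with each warped view.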
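For the multi-scale fusion step, the abbreviations list points to a BiFPN. The sketch below shows only the fast normalized (weighted) fusion that BiFPN-style pyramids use, with the surrounding top-down/bottom-up wiring omitted; `WeightedFusion` and its layer sizes are assumptions for illustration.

```python
# Sketch: fuse feature maps from different pyramid levels with learnable,
# non-negative, normalized weights (BiFPN-style fast normalized fusion).
import torch
import torch.nn as nn
import torch.nn.functional as F


class WeightedFusion(nn.Module):
    """Fuse N same-channel feature maps with learnable normalized weights."""

    def __init__(self, num_inputs: int, channels: int, eps: float = 1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, feats):  # feats: list of (B, C, H_i, W_i) tensors
        target = feats[0].shape[-2:]
        feats = [F.interpolate(f, size=target, mode="nearest") for f in feats]
        w = F.relu(self.weights)              # keep fusion weights >= 0
        w = w / (w.sum() + self.eps)          # fast normalized fusion
        fused = sum(wi * fi for wi, fi in zip(w, feats))
        return self.conv(fused)
```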
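The attention-masking mechanism is likewise only sketched: here low-scoring tokens are simply excluded from the attention softmax through an additive mask, with the feature norm standing in for whatever token-scoring the method actually learns. This is an illustrative approximation, not the authors' implementation.

```python
# Sketch: restrict scaled dot-product attention to the top-scoring key tokens,
# so the encoder spends attention only on salient multi-scale features.
import torch
import torch.nn.functional as F


def masked_attention(q, k, v, keep_ratio: float = 0.3):
    """Attention over (batch, tokens, dim) tensors, masking low-scoring keys."""
    b, n, d = k.shape
    scores = k.norm(dim=-1)                           # (b, n) saliency proxy
    keep = max(1, int(n * keep_ratio))
    topk = scores.topk(keep, dim=-1).indices          # (b, keep)

    # Additive mask: 0 for kept tokens, -inf for masked ones.
    mask = torch.full((b, n), float("-inf"), device=k.device)
    mask.scatter_(1, topk, 0.0)

    attn = q @ k.transpose(-2, -1) / d ** 0.5         # (b, n, n)
    attn = attn + mask.unsqueeze(1)                   # mask key positions
    attn = F.softmax(attn, dim=-1)
    return attn @ v
```

Masking alone does not reduce FLOPs; a real implementation would gather only the kept tokens before the attention product to obtain an actual speed-up.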
2. Methods
2.1. Multi-View Geometric Transformation Strategy
2.2. Bidirectional Feature Pyramid Network
2.3. Attention-Masking Mechanism
3. Experimental Results and Analysis
3.1. Introduction to the Datasets
3.2. Comparative Experiment
3.3. Ablation Experiment
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Full Form
---|---
UAV | Unmanned Aerial Vehicle
CNN | Convolutional Neural Network
MVGT | Multi-View Geometric Transformation strategy
BiFPN | Bidirectional Feature Pyramid Network
AMM | Attention-Masking Mechanism
References
- Dong, Z.; Zhao, H.; Wen, F.; Xue, Y. From Smart Grid to Energy Internet: Basic Concept and Research Framework. Autom. Electr. Power Syst. 2014, 15, 1–11.
- Nguyen, V.; Jenssen, R.; Roverso, D. Automatic autonomous vision-based power line inspection: A review of current status and the potential role of deep learning. Int. J. Electr. Power Energy Syst. 2018, 99, 107–120.
- Zhao, Z.; Zhang, W.; Zhai, Y.; Zhao, W.; Zhang, K. Concept, Research Status and Prospect of Electric Power Vision Technology. Electr. Power Sci. Eng. 2020, 57, 57–69.
- Cheng, Z.; Fan, M.; Li, Y.; Zhao, Y.; Li, C. Review on Semantic Segmentation of UAV Aerial Images. Comput. Eng. Appl. 2021, 57, 57–69.
- Deng, C.; Wang, S.; Huang, Z. Unmanned aerial vehicles for power line inspection: A cooperative way in platforms and communications. J. Commun. 2014, 9, 687–692.
- Hu, B.; Wang, J. Deep learning based on hand gesture recognition and UAV flight controls. Int. J. Autom. Comput. 2020, 17, 17–29.
- Zhao, Z.; Cui, Y. Research progress of visual detection methods for transmission line key components based on deep learning. Electr. Power Sci. Eng. 2018, 34, 1.
- Girshick, R.; Donahue, J.; Darrell, T. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
- Ren, S.; He, K.; Girshick, R. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 39, 1137–1149.
- Sun, P.; Zhang, R.; Jiang, Y. Sparse R-CNN: End-to-end object detection with learnable proposals. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Online, 19–25 June 2021; pp. 14454–14463.
- Liu, W.; Anguelov, D.; Erhan, D. SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 10–16 October 2016; pp. 21–37.
- Redmon, J.; Divvala, S.; Girshick, R. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 779–788.
- Ge, Z.; Liu, S.; Wang, F. YOLOX: Exceeding YOLO Series in 2021. arXiv 2021, arXiv:2107.08430.
- Salman, K.; Muzammal, N.; Munawar, H. Transformers in Vision: A Survey. ACM Comput. Surv. 2022, 54, 1–41.
- Carion, N.; Massa, F.; Synnaeve, G. End-to-end object detection with transformers. In Proceedings of the European Conference on Computer Vision, Online, 23–28 August 2020; pp. 213–229.
- Zhu, X.; Su, W.; Lu, L. Deformable DETR: Deformable Transformers for End-to-End Object Detection. arXiv 2021, arXiv:2010.04159.
- Roh, B.; Shin, J.; Shin, W. Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity. arXiv 2021, arXiv:2111.14330.
- Fang, Y.; Liao, B.; Wang, X. You only look at one sequence: Rethinking transformer in vision through object detection. Adv. Neural Inf. Process. Syst. 2021, 34, 26183–26197.
- Song, H.; Sun, D.; Chun, S. ViDT: An Efficient and Effective Fully Transformer-based Object Detector. arXiv 2021, arXiv:2110.03921.
- Wu, K.; Peng, H.; Chen, M. Rethinking and improving relative position encoding for vision transformer. In Proceedings of the IEEE International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 10033–10041.
- Qi, Y.; Wu, X.; Zhao, Z.; Shi, B.; Nie, L. Bolt defect detection for aerial transmission lines using Faster R-CNN with an embedded dual attention mechanism. J. Image Graph. 2021, 26, 2594–2604.
- Zhang, S.; Wang, H.; Dong, X. Bolt Detection Technology of Transmission Lines Based on Deep Learning. Power Syst. Technol. 2020, 45, 2821–2829.
- Zhong, J.; Liu, Z.; Han, Z. A CNN-based defect inspection method for catenary split pins in high-speed railway. IEEE Trans. Instrum. Meas. 2019, 68, 2849–2860.
- Zhao, Z.; Duan, J.; Kong, Y.; Zhang, D. Construction and Application of Bolt and Nut Pair Knowledge Graph Based on GGNN. Power Syst. Technol. 2021, 56, 98–106.
- Zhao, Z.; Xu, G.; Qi, Y. Multi-patch deep features for power line insulator status classification from aerial images. In Proceedings of the International Joint Conference on Neural Networks, Vancouver, BC, Canada, 24–29 July 2016; pp. 3187–3194.
- Zhao, Z.; Ma, D.; Ding, J. Weakly Supervised Detection Method for Pin-missing Bolt of Transmission Line Based on SAW-PCL. J. Beijing Univ. Aeronaut. Astronaut. 2023, 1–10.
- Zhang, K.; Zhao, K.; Guo, X. HRM-CenterNet: A High-Resolution Real-time Fittings Detection Method. In Proceedings of the International Conference on Systems, Man, and Cybernetics, Melbourne, Australia, 17–20 October 2021; pp. 564–569.
- Zhang, K.; He, Y.; Zhao, K. Multi-Label Classification of Bolt Attributes Based on Deformable NTS-Net Network. J. Image Graph. 2021, 26, 2582–2593.
- Lou, W.; Zhang, K.; Guo, X. PAformer: Visually Indistinguishable Bolt Defect Recognition Based on Bolt Position and Attributes. In Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, Chiang Mai, Thailand, 7–10 November 2022; pp. 884–889.
- Qi, Y.; Lang, Y.; Zhao, Z.; Jiang, A.; Nie, L. Relativistic GAN for bolts image generation with attention mechanism. Electr. Meas. Instrum. 2019, 56, 64–69.
- Yu, Y.; Gong, Z.; Zhong, P. Unsupervised representation learning with deep convolutional neural network for remote sensing images. In Proceedings of the Image and Graphics: 9th International Conference, Los Angeles, CA, USA, 28–30 July 2017; pp. 97–108.
- Ledig, C.; Theis, L.; Huszar, F. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4681–4690.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778.
- He, J.; Chen, J.; Liu, S. TransFG: A Transformer Architecture for Fine-Grained Recognition. arXiv 2021, arXiv:2103.07976.
- Chen, Z.; Wei, X.; Wang, P.; Guo, Y. Multi-Label Image Recognition with Graph Convolutional Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 5177–5186.
- Lin, T.; Dollar, P.; Girshick, R. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
- Liu, S.; Qi, L.; Qin, H. Path aggregation network for instance segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8759–8768.
- Tan, M.; Pang, R.; Le, Q. EfficientDet: Scalable and efficient object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 10781–10790.
- Rao, Y.; Zhao, W.; Liu, B. DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification. Adv. Neural Inf. Process. Syst. 2021, 34, 13937–13949.
- Loshchilov, I.; Hutter, F. Decoupled Weight Decay Regularization. arXiv 2018, arXiv:1711.05101.
Model | AP (FD-9) | AP (FD-12) | AP (FD-25) | GFLOPs/FPS | Params |
---|---|---|---|---|---|
Faster R-CNN | 80.2 | 75.1 | 59.4 | 246/20 | 60 M |
YOLOX | 83.4 | 78.3 | 61.3 | 73.8/81.3 | 25.3 M |
DETR | 85.6 | 78.6 | 61.7 | 86/28 | 41 M |
Deformable DETR | 85.9 | 81.2 | 62.5 | 173/19 | 40 M |
Sparse DETR | 86.2 | 81.5 | 63.2 | 113/21.2 | 41 M |
MGA-DETR | 88.7 | 83.4 | 66.8 | 101/25.7 | 38 M |
Fittings | AP (FD-9) | AP (FD-12) | AP (FD-25) |
---|---|---|---
glass insulator | × | × | × |
grading ring | × | 83.1/89.7 | 72.6/80.4 |
shielded ring | × | 83.2/90.2 | 69.8/79.5 |
adjusting board | 87.3/90.7 | 78.8/85.1 | 57.9/68.7 |
yoke plate | 87.9/91.2 | 79.3/84.4 | 58.3/69.1 |
weight | 88.2/91.3 | 78.2/85.2 | 57.5/68.2 |
hanging board | 79.7/86.9 | 75.9/80.4 | 53.4/63.1 |
bowl hanging board | 81.3/86.6 | 76.1/80.5 | 52.7/62.9 |
u-type hanging ring | 82.6/86.9 | 75.4/80.1 | 53.5/62.3 |
Model | MVGT | BiFPN | AMM | AP (FD-9) | AP (FD-12) | AP (FD-25) |
---|---|---|---|---|---|---|
MGA-DETR | × | × | × | 85.6 | 78.6 | 61.7 |
MGA-DETR | √ | × | × | 85.9 | 80.1 | 63.2
MGA-DETR | × | √ | × | 86.3 | 80.4 | 63.9
MGA-DETR | × | × | √ | 85.8 | 79.7 | 62.9
MGA-DETR | √ | √ | × | 87.6 | 82.9 | 65.4
MGA-DETR | √ | × | √ | 87.3 | 81.6 | 64.7
MGA-DETR | × | √ | √ | 87.4 | 81.7 | 64.9
MGA-DETR | √ | √ | √ | 88.7 | 83.4 | 66.8
Model | Number | AP (FD-9) | AP (FD-12) | AP (FD-25) |
---|---|---|---|---
MVGT | 0 | 87.4 | 81.7 | 64.9
MVGT | 1 | 87.8 | 82.5 | 65.3
MVGT | 2 | 88.0 | 82.9 | 65.9
MVGT | 3 | 88.3 | 83.1 | 66.5
MVGT | 4 | 88.7 | 83.4 | 66.8
MVGT | 5 | 88.6 | 83.3 | 66.6
MVGT | 6 | 88.1 | 82.7 | 66.1
Model | FPN | PAFPN | BiFPN | AP (FD-9) | AP (FD-12) | AP (FD-25) |
---|---|---|---|---|---|---|
MGA-DETR | × | × | × | 85.1 | 81.6 | 60.7 |
MGA-DETR | √ | × | × | 86.7 | 82.3 | 62.1
MGA-DETR | × | √ | × | 87.2 | 82.8 | 64.3
MGA-DETR | × | × | √ | 88.7 | 83.4 | 66.8
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, N.; Zhang, K.; Zhu, J.; Zhao, L.; Huang, Z.; Wen, X.; Zhang, Y.; Lou, W. Fittings Detection Method Based on Multi-Scale Geometric Transformation and Attention-Masking Mechanism. Sensors 2023, 23, 4923. https://doi.org/10.3390/s23104923