Precision Detection of Dense Litchi Fruit in UAV Images Based on Improved YOLOv5 Model
"> Figure 1
<p>Image collection location: (<b>A</b>) Litchi Expo, (<b>B</b>) Mache New Fruit Farm.</p> "> Figure 2
<p>CSP-Darknet53 architecture for YOLOv5 backbone network. Bottleneck*N indicates the number of times Bottleneck is repeated.</p> "> Figure 3
<p>Network architecture of YOLOv5-TinyLitchi. Backbone: a feed-forward CSP-Darknet53 architecture extracts the multi-scale feature maps. Neck: the P2 feature layer fused into BiFPN in order to fuse more localization information. Head: NWD was added to the regression loss function.</p> "> Figure 4
<p>Schematic diagram of PANet and BiFPN structures: (<b>a</b>) original YOLOv5 PANet, (<b>b</b>) proposed YOLOv5 BiFPN.</p> "> Figure 5
<p>Slicing Aided Hyper Inference schematic.</p> "> Figure 6
<p>The AP of ablation experiments.</p> "> Figure 7
<p>Comparison of visualization feature maps. (<b>A</b>) Original image. (<b>B</b>) Feature map of the YOLOv5 model. (<b>C</b>) Feature map of the proposed model.</p> "> Figure 8
<p>Classification loss of P2 feature layer fusion.</p> "> Figure 9
<p>The detection effect comparison before and after using SAHI. Orange region demonstrates the improved detection of SAHI for small targets, and purple region demonstrates the misdetection of SAHI for medium and large targets.</p> "> Figure 10
<p>Examples of various detections in the dataset. (<b>A</b>,<b>B</b>) Correct detection results. (<b>C</b>–<b>H</b>) Missed and misdirected detections affected by factors such as the environment and illumination. (<b>I</b>,<b>J</b>) Detection effect in occlusion and blurring situations. The red box denotes that the detection is for mature litchi and the green box denotes that the detection is for immature litchi.</p> "> Figure 11
<p>Comparison of the effects of the proposed model and the original model. Purple region shows the detection for the small target litchi and orange region shows the detection for the occluded litchi. (<b>A</b>) Original YOLOv5 model. (<b>B</b>) Proposal model.</p> "> Figure 12
<p>The detection effect on three different photographic perspectives. (<b>A1</b>,<b>A2</b>) Front view perspective of litchi tree. (<b>B1</b>,<b>B2</b>) Top view perspective of litchi tree. (<b>C1</b>,<b>C2</b>) Other perspective of litchi tree. Blue region demonstrates the excellent detection ability of the proposed model for small target litchi.</p> ">
Abstract
1. Introduction
2. Materials and Methods
2.1. Image Data Collection
2.2. Dataset Construction
2.3. Experimental Environment Setup
2.4. Evaluation Metrics
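A brief restatement of the standard detection metrics, assuming the usual COCO-style conventions used with YOLO models (an assumption; the paper's exact definitions are given in this section of the full text):

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
AP = \int_0^1 P(R)\,\mathrm{d}R
```

Here TP, FP, and FN count correct detections, false detections, and missed fruits, respectively; under the COCO convention, AP is additionally averaged over IoU thresholds from 0.50 to 0.95, and AP_S restricts evaluation to small objects (area below 32 × 32 pixels).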
2.5. Overview of YOLOv5
2.6. The Proposed Model
2.6.1. BiFPN with P2 Feature Layer Fusion
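Figure 4 contrasts the original PANet neck with the proposed BiFPN. As a reference for the weighted cross-scale fusion BiFPN performs, below is a minimal PyTorch sketch of the "fast normalized fusion" node from EfficientDet (Tan et al., cited in the references); the layer names, channel counts, and tensor shapes are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of BiFPN "fast normalized fusion" (Tan et al., EfficientDet),
# the weighted-fusion node used when the P2 layer is fused into the neck.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFusion(nn.Module):
    """Fuses N equal-shape feature maps with learnable non-negative weights."""
    def __init__(self, num_inputs: int, eps: float = 1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, inputs):
        # ReLU keeps the weights non-negative; normalization bounds their sum.
        w = F.relu(self.weights)
        w = w / (w.sum() + self.eps)
        return sum(w[i] * x for i, x in enumerate(inputs))

# Usage: fuse an upsampled P3 map with the high-resolution P2 map.
fuse = WeightedFusion(num_inputs=2)
p2 = torch.randn(1, 64, 160, 160)                              # fine-grained P2 features
p3_up = F.interpolate(torch.randn(1, 64, 80, 80), scale_factor=2)
out = fuse([p2, p3_up])                                        # shape (1, 64, 160, 160)
```

The learnable weights let the network decide, per fusion node, how much the fine-grained P2 localization cues should contribute relative to the coarser semantic maps.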
2.6.2. NWD
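For reference, the normalized Gaussian Wasserstein distance of Wang et al. (cited in the references) models a bounding box (cx, cy, w, h) as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4); the squared second-order Wasserstein distance between two such Gaussians reduces to a closed form, which is then normalized into a similarity in (0, 1]:

```latex
W_2^2(\mathcal{N}_a, \mathcal{N}_b)
  = \left\| \left[ cx_a,\, cy_a,\, \tfrac{w_a}{2},\, \tfrac{h_a}{2} \right]^{\mathrm{T}}
          - \left[ cx_b,\, cy_b,\, \tfrac{w_b}{2},\, \tfrac{h_b}{2} \right]^{\mathrm{T}} \right\|_2^2,
\qquad
\mathrm{NWD}(\mathcal{N}_a, \mathcal{N}_b)
  = \exp\!\left( -\frac{\sqrt{W_2^2(\mathcal{N}_a, \mathcal{N}_b)}}{C} \right)
```

where C is a dataset-dependent constant (the average absolute object size in the original paper). Unlike IoU, NWD varies smoothly even when two tiny boxes do not overlap at all, which is what makes it attractive for dense, small litchi targets.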
2.6.3. SAHI
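SAHI (Figure 5) slices the full-resolution UAV image into overlapping tiles, runs the detector on each tile, and merges the tile-level predictions back into full-image coordinates. Below is a minimal sketch using the obss/sahi package of Akyon et al. (cited in the references); the file paths, slice size, and thresholds are illustrative assumptions, not the authors' exact settings.

```python
# Slicing-aided inference with the sahi package (pip install sahi yolov5).
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Wrap a trained YOLOv5 checkpoint; the weights path here is hypothetical.
detection_model = AutoDetectionModel.from_pretrained(
    model_type="yolov5",
    model_path="weights/yolov5_tinylitchi.pt",
    confidence_threshold=0.25,
    device="cuda:0",
)

# Slice the UAV image into overlapping 640x640 tiles, detect per tile,
# then merge the tile detections back into full-image coordinates.
result = get_sliced_prediction(
    "uav_litchi.jpg",          # hypothetical input image
    detection_model,
    slice_height=640,
    slice_width=640,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)
result.export_visuals(export_dir="runs/sahi/")
```

Because each tile is upsampled to the network input size, tiny fruits occupy more pixels at inference time, at the cost of extra forward passes per image.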
3. Experimental Results and Comparative Analysis
3.1. Ablation Experiments
3.1.1. BiFPN
3.1.2. P2 Feature Layer Fusion
3.1.3. NWD
3.1.4. SAHI
4. Comparative Discussion
4.1. Comparison with Other Object Detection Algorithms
4.2. Analysis of Model Detection Effects
4.3. Test Results on Datasets
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Wen'E, Q.; Houbin, C.; Tao, L.; Fengxian, S. Development Status, Trend and Suggestion of Litchi Industry in Mainland China. Guangdong Agric. Sci. 2019, 46, 132–139.
- Qi, W.; Chen, H.; Li, J. Status, Trend and Countermeasures of Development of Litchi Industry in the Mainland of China in 2022. Guangdong Agric. Sci. 2023, 1–10.
- Lan, Y.; Huang, Z.; Deng, X.; Zhu, Z.; Huang, H.; Zheng, Z.; Lian, B.; Zeng, G.; Tong, Z. Comparison of machine learning methods for citrus greening detection on UAV multispectral images. Comput. Electron. Agric. 2020, 171, 105234.
- Chen, P.; Douzals, J.P.; Lan, Y.; Cotteux, E.; Delpuech, X.; Pouxviel, G.; Zhan, Y. Characteristics of unmanned aerial spraying systems and related spray drift: A review. Front. Plant Sci. 2022, 13, 870956.
- Junos, M.H.; Mohd Khairuddin, A.S.; Thannirmalai, S.; Dahari, M. Automatic detection of oil palm fruits from UAV images using an improved YOLO model. Vis. Comput. 2022, 38, 2341–2355.
- Maldonado, W., Jr.; Barbosa, J.E.C. Automatic green fruit counting in orange trees using digital images. Comput. Electron. Agric. 2016, 127, 572–581.
- Bhargava, A.; Bansal, A. Automatic Detection and Grading of Multiple Fruits by Machine Learning. Food Anal. Methods 2020, 13, 751–761.
- Xiong, J.; Lin, R.; Liu, Z.; He, Z.; Tang, L.; Yang, Z.; Zou, X. The recognition of litchi clusters and the calculation of picking point in a nocturnal natural environment. Biosyst. Eng. 2018, 166, 44–57.
- Wang, C.; Tang, Y.; Zou, X.; Luo, L.; Chen, X. Recognition and Matching of Clustered Mature Litchi Fruits Using Binocular Charge-Coupled Device (CCD) Color Cameras. Sensors 2017, 17, 2564.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Apolo-Apolo, O.E.; Martinez-Guanter, J.; Egea, G.; Raja, P.; Pérez-Ruiz, M. Deep learning techniques for estimation of the yield and size of citrus fruits using a UAV. Eur. J. Agron. 2020, 115, 126030.
- Gao, F.; Fu, L.; Zhang, X.; Majeed, Y.; Li, R.; Karkee, M.; Zhang, Q. Multi-class fruit-on-plant detection for apple in SNAP system using Faster R-CNN. Comput. Electron. Agric. 2020, 176, 105634.
- Zhang, J.; Karkee, M.; Zhang, Q.; Zhang, X.; Yaqoob, M.; Fu, L.; Wang, S. Multi-class object detection using faster R-CNN and estimation of shaking locations for automated shake-and-catch apple harvesting. Comput. Electron. Agric. 2020, 173, 105384.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Lin, P.; Li, D.; Jia, Y.; Chen, Y.; Huang, G.; Elkhouchlaa, H.; Yao, Z.; Zhou, Z.; Zhou, H.; Li, J.; et al. A novel approach for estimating the flowering rate of litchi based on deep learning and UAV images. Front. Plant Sci. 2022, 13, 966639.
- Wang, L.; Zhao, Y.; Xiong, Z.; Wang, S.; Li, Y.; Lan, Y. Fast and precise detection of litchi fruits for yield estimation based on the improved YOLOv5 model. Front. Plant Sci. 2022, 13, 965425.
- Liang, J.; Chen, X.; Liang, C.; Long, T.; Tang, X.; Shi, Z.; Zhou, M.; Zhao, J.; Lan, Y.; Long, Y. A detection approach for late-autumn shoots of litchi based on unmanned aerial vehicle (UAV) remote sensing. Comput. Electron. Agric. 2023, 204, 107535.
- Liu, G.; Han, J.; Rong, W. Feedback-driven loss function for small object detection. Image Vis. Comput. 2021, 111, 104197.
- Gong, Y.; Yu, X.; Ding, Y.; Peng, X.; Zhao, J.; Han, Z. Effective fusion factor in FPN for tiny object detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 5–9 January 2021; pp. 1160–1168.
- Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
- Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934.
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Computer Vision – ECCV 2016; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer: Cham, Switzerland, 2016; pp. 21–37.
- Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10781–10790.
- Chen, J.; Mai, H.; Luo, L.; Chen, X.; Wu, K. Effective Feature Fusion Network in BIFPN for Small Object Detection. In Proceedings of the 2021 IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 19–22 September 2021; pp. 699–703.
- Lv, J.; Xu, H.; Han, Y.; Lu, W.; Xu, L.; Rong, H.; Yang, B.; Zou, L.; Ma, Z. A visual identification method for the apple growth forms in the orchard. Comput. Electron. Agric. 2022, 197, 106954.
- Liu, X.; Li, G.; Chen, W.; Liu, B.; Chen, M.; Lu, S. Detection of dense Citrus fruits by combining coordinated attention and cross-scale connection with weighted feature fusion. Appl. Sci. 2022, 12, 6600.
- Wang, J.; Xu, C.; Yang, W.; Yu, L. A normalized Gaussian Wasserstein distance for tiny object detection. arXiv 2021, arXiv:2110.13389.
- Yang, J.; Yang, H.; Wang, F.; Chen, X. A modified YOLOv5 for object detection in UAV-captured scenarios. In Proceedings of the 2022 IEEE International Conference on Networking, Sensing and Control (ICNSC), Shanghai, China, 15–18 December 2022; pp. 1–6.
- Yu, Z.; Huang, H.; Chen, W.; Su, Y.; Liu, Y.; Wang, X. YOLO-FaceV2: A Scale and Occlusion Aware Face Detector. arXiv 2022, arXiv:2208.02019.
- Xu, C.; Wang, J.; Yang, W.; Yu, H.; Yu, L.; Xia, G.S. Detecting tiny objects in aerial images: A normalized Wasserstein distance and a new benchmark. ISPRS J. Photogramm. Remote Sens. 2022, 190, 79–93.
- Akyon, F.C.; Onur Altinuc, S.; Temizel, A. Slicing Aided Hyper Inference and Fine-Tuning for Small Object Detection. In Proceedings of the 2022 IEEE International Conference on Image Processing (ICIP), Bordeaux, France, 16–19 October 2022; pp. 966–970.
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-end object detection with transformers. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2020; pp. 213–229.
- Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO series in 2021. arXiv 2021, arXiv:2107.08430.
- Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: Fully Convolutional One-Stage Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019.
| Dataset | Number of Images | Number of Labels |
|---|---|---|
| Train | 280 | 29,922 |
| Validation | 40 | 4,480 |
| Test | 80 | 8,754 |
| Total | 400 | 43,156 |
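The counts are internally consistent (29,922 + 4,480 + 8,754 = 43,156), the split is 70/10/20 by image count, and each image carries roughly 108 labeled fruits on average (43,156 / 400), which quantifies how dense the detection task is.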
| Model | | | | | | | Weight (MB) | Params (M) | FPS |
|---|---|---|---|---|---|---|---|---|---|
| YOLOv5 | 50.6 | 27.8 | 53.8 | 81.3 | 77.3 | 24.0 | 13.7 | 7.01 | 68.8 |
| +NWD | 48.5 | 22.6 | 64.0 | 74.2 | 65.5 | 31.6 | 13.7 | 7.01 | 68.2 |
| +BiFPN | 55.2 | 31.6 | 67.9 | 82.3 | 78.6 | 31.8 | 13.8 | 7.08 | 70.9 |
| +NWD + BiFPN | 63.6 | 35.8 | 79.5 | 86.4 | 80.1 | 47.1 | 13.8 | 7.08 | 70.4 |
| +Fuse P2 + BiFPN | 62.9 | 35.9 | 78.2 | 85.7 | 80.2 | 45.6 | 14.2 | 7.24 | 68.2 |
| +Fuse P2 + NWD + BiFPN | 64.1 | 36.2 | 79.8 | 88.0 | 80.4 | 47.8 | 14.2 | 7.24 | 71.4 |
| +Fuse P2 + NWD + BiFPN + SAHI | 72.6 | 57.3 | 80.1 | 86.0 | 86.4 | 58.8 | 14.2 | 7.24 | n/a |
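Note that the P2 fusion and BiFPN modifications together add only 0.23 M parameters (7.01 to 7.24) and 0.5 MB of weights over the baseline, while SAHI adds no parameters at all: it is an inference-time slicing scheme, which is why no single FPS value is reported for it.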
| Repeated Blocks | | | | | | | Weight (MB) | Params (M) |
|---|---|---|---|---|---|---|---|---|
| 1× | 64.1 | 36.2 | 79.8 | 88.0 | 80.4 | 47.8 | 14.2 | 7.24 |
| 2× | 64.2 | 35.6 | 79.7 | 87.7 | 80.4 | 47.9 | 20.3 | 10.4 |
| 3× | 58.7 | 34.7 | 74.1 | 81.0 | 80.3 | 37.1 | 26.3 | 13.5 |
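Stacking more repeated BiFPN blocks nearly doubles the parameter count (7.24 M at 1× to 13.5 M at 3×) without improving accuracy, so a single block is retained in the final model.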
| NWD Weight | CIoU Weight | | | | | | |
|---|---|---|---|---|---|---|---|
| 0 | 1 | 62.9 | 35.9 | 78.2 | 85.7 | 80.2 | 45.6 |
| 1 | 0 | 63.8 | 35.8 | 78.7 | 87.4 | 80.4 | 47.1 |
| 0.5 | 0.5 | 63.9 | 37.1 | 77.6 | 87.1 | 80.8 | 46.9 |
| 0.8 | 0.2 | 62.8 | 36.7 | 78.5 | 85.3 | 81.3 | 44.3 |
| 0.2 | 0.8 | 64.1 | 36.2 | 79.8 | 88.0 | 80.4 | 47.8 |
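The best-performing row (NWD weight 0.2, CIoU weight 0.8) matches the final model in the ablation table. Assuming the two terms are blended linearly, as the weight columns suggest, the regression loss takes the form:

```latex
\mathcal{L}_{\mathrm{reg}}
  = \lambda_{\mathrm{NWD}} \left( 1 - \mathrm{NWD} \right)
  + \lambda_{\mathrm{CIoU}} \left( 1 - \mathrm{CIoU} \right),
\qquad
\lambda_{\mathrm{NWD}} = 0.2,\; \lambda_{\mathrm{CIoU}} = 0.8
```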
| Model | | | | | | | Weight (MB) | Params (M) | FPS |
|---|---|---|---|---|---|---|---|---|---|
| YOLOv5 | 50.6 | 27.8 | 53.8 | 81.3 | 77.3 | 24.0 | 13.7 | 7.01 | 68.8 |
| YOLOv5-TinyLitchi | 64.1 | 36.2 | 79.8 | 88.0 | 80.4 | 47.8 | 14.2 | 7.24 | 71.4 |
| YOLOv5-TinyLitchi with SAHI | 72.6 | 57.3 | 80.1 | 86.0 | 86.4 | 58.8 | 14.2 | 7.24 | n/a |
| DETR | 25.7 | 6.0 | 28.8 | 55.0 | 31.3 | 20.1 | 158 | 41.28 | 14.0 |
| Faster R-CNN | 53.5 | 18.7 | 69.1 | 83.6 | 64.9 | 42.0 | 159 | 41.13 | 10.5 |
| RetinaNet | 46.1 | 15.1 | 51.3 | 80.0 | 54.6 | 37.6 | 145 | 36.13 | 12.1 |
| SSD | 31.5 | 4.9 | 36.0 | 69.5 | 44.3 | 18.7 | 130 | 23.88 | 33.6 |
| YOLOX | 68.1 | 50.2 | 79.4 | 80.3 | 86.9 | 49.3 | 34.4 | 8.94 | 27.9 |
| FCOS | 36.2 | 10.3 | 43.4 | 63.6 | 55.7 | 16.6 | 123 | 31.84 | 12.3 |
| Figure | Detected | Real | False Detections | Omissions | False Detection Rate | Correct Detection Rate |
|---|---|---|---|---|---|---|
| A1 | 148 | 155 | 9 | 16 | 6.1% | 89.7% |
| A2 | 143 | 139 | 5 | 1 | 3.5% | 99.3% |
| B1 | 292 | 286 | 25 | 19 | 8.6% | 93.4% |
| B2 | 206 | 219 | 16 | 29 | 7.8% | 86.8% |
| C1 | 173 | 199 | 11 | 37 | 6.4% | 81.4% |
| C2 | 170 | 153 | 28 | 11 | 16.5% | 92.8% |
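The two rates follow directly from the counts: the false detection rate is False Detections / Detected, and the correct detection rate is (Real - Omissions) / Real. For example, for A1:

```latex
\frac{9}{148} \approx 6.1\%, \qquad \frac{155 - 16}{155} = \frac{139}{155} \approx 89.7\%
```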
Citation: Xiong, Z.; Wang, L.; Zhao, Y.; Lan, Y. Precision Detection of Dense Litchi Fruit in UAV Images Based on Improved YOLOv5 Model. Remote Sens. 2023, 15, 4017. https://doi.org/10.3390/rs15164017