Search Results (4)

Search Parameters:
Keywords = YOLOP

24 pages, 5578 KiB  
Article
Study on Nighttime Pedestrian Trajectory-Tracking from the Perspective of Driving Blind Spots
by Wei Zhao, Congcong Ren and Ao Tan
Electronics 2024, 13(17), 3460; https://doi.org/10.3390/electronics13173460 - 31 Aug 2024
Viewed by 382
Abstract
With the acceleration of urbanization and the growing demand for traffic safety, developing intelligent systems capable of accurately recognizing and tracking pedestrian trajectories at night or under low-light conditions has become a research focus in the field of transportation. This study aims to improve the accuracy and real-time performance of nighttime pedestrian detection and tracking. A method that integrates the multi-object detection algorithm YOLOP with the multi-object tracking algorithm DeepSORT is proposed. The improved YOLOP algorithm incorporates the C2f-faster structure in the Backbone and Neck sections, enhancing feature extraction capabilities. Additionally, a BiFormer attention mechanism is introduced to focus on the recognition of small-area features, the CARAFE module is added to improve shallow feature fusion, and the DyHead dynamic target-detection head is employed for comprehensive fusion. For tracking, the lightweight ShuffleNetV2 module is integrated to reduce model parameters and network complexity. Experimental results demonstrate that the proposed FBCD-YOLOP model improves lane detection accuracy by 5.1%, increases the IoU metric by 0.8%, and raises detection speed by 25 FPS compared to the baseline model. The accuracy of nighttime pedestrian detection reached 89.6%, representing improvements of 1.3%, 0.9%, and 3.8% over the single-task YOLO v5, multi-task TDL-YOLO, and original YOLOP models, respectively. These enhancements significantly improve the model's detection performance in complex nighttime environments. The enhanced DeepSORT algorithm achieved an MOTA of 86.3% and an MOTP of 84.9%, with ID switches reduced to 5. Compared to the ByteTrack and StrongSORT algorithms, MOTA improved by 2.9% and 0.4%, respectively. Additionally, network parameters were reduced by 63.6%, significantly enhancing the real-time performance of nighttime pedestrian detection and tracking and making the model highly suitable for deployment on intelligent edge-computing surveillance platforms.
(This article belongs to the Section Artificial Intelligence)
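The C2f-faster structure mentioned in the abstract builds on FasterNet-style partial convolution, in which only a slice of the channels is convolved while the rest pass through unchanged. The PyTorch sketch below illustrates that idea under simple assumptions (a 1/4 channel split, a two-layer pointwise MLP, and illustrative channel counts); it is not the paper's exact FBCD-YOLOP implementation.

```python
# Minimal sketch of a FasterNet-style partial-convolution block, the kind of
# building unit C2f-faster variants are assembled from. The 1/4 split ratio
# and channel sizes are illustrative assumptions, not the paper's settings.
import torch
import torch.nn as nn

class PartialConv(nn.Module):
    """Convolve only a fraction of the channels; pass the rest through untouched."""
    def __init__(self, channels: int, split_ratio: float = 0.25):
        super().__init__()
        self.conv_channels = max(1, int(channels * split_ratio))
        self.conv = nn.Conv2d(self.conv_channels, self.conv_channels,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_conv, x_pass = torch.split(
            x, [self.conv_channels, x.shape[1] - self.conv_channels], dim=1)
        return torch.cat([self.conv(x_conv), x_pass], dim=1)

class FasterNetBlock(nn.Module):
    """Partial convolution followed by a pointwise MLP with a residual connection."""
    def __init__(self, channels: int, expansion: int = 2):
        super().__init__()
        self.pconv = PartialConv(channels)
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels * expansion, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels * expansion),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels * expansion, channels, kernel_size=1, bias=False),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.mlp(self.pconv(x))

if __name__ == "__main__":
    block = FasterNetBlock(64)
    print(block(torch.randn(1, 64, 80, 80)).shape)  # torch.Size([1, 64, 80, 80])
```

Because only a quarter of the channels enter the 3x3 convolution, the block trades a small amount of representational mixing for a large reduction in FLOPs, which is the motivation for using it in a lightweight detector.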
Figures

Figure 1. Nighttime driver's blind-spot pedestrian-tracking technology route.
Figure 2. Algorithm implementation flowchart.
Figure 3. C2f-faster structural diagram: (a) FasterNet block; (b) C2f-faster.
Figure 4. BiFormer attention mechanism structure diagram.
Figure 5. CARAFE upsampling structure diagram.
Figure 6. Dynamic detection head DyHead structure diagram.
Figure 7. Improved YOLOP network structure diagram.
Figure 8. ShuffleNetV2 structure diagram.
Figure 9. DIoU schematic diagram.
Figure 10. Improved DeepSORT structure flowchart.
Figure 11. Nighttime lane line-detection results. In Scene 1, the nighttime road is unobstructed and the lane lines are clear; in Scene 2, the road has obstructions; in Scene 3, the lane lines are unclear.
Figure 12. FBCD-YOLOP training process results.
Figure 13. Training and validation loss of the tracking algorithm.
Figure 14. Nighttime pedestrian-tracking results. (a) A frame from the first video sequence showing the initial detection and tracking of pedestrians by the proposed algorithm. (b) The corresponding frame from the first video sequence showing the ID-tracking process: the proposed algorithm accurately tracks pedestrian ID3 through the crowd, while the YOLOP-DeepSort algorithm exhibits ID switches (orange circles). (c) A frame from the second video sequence showing the proposed algorithm's detections with no ID changes or false detections. (d) The corresponding frame from the second video sequence, in which the YOLOP-DeepSort algorithm mistakenly identifies a tree trunk and a wall crack as pedestrians (red circles), demonstrating the proposed algorithm's superiority in avoiding false detections.
21 pages, 20528 KiB  
Article
Multi-Task Visual Perception for Object Detection and Semantic Segmentation in Intelligent Driving
by Jiao Zhan, Jingnan Liu, Yejun Wu and Chi Guo
Remote Sens. 2024, 16(10), 1774; https://doi.org/10.3390/rs16101774 - 16 May 2024
Cited by 1 | Viewed by 810
Abstract
With the rapid development of intelligent driving vehicles, multi-task visual perception based on deep learning has emerged as a key technological pathway toward safe vehicle navigation in real traffic scenarios. However, due to the high-precision and high-efficiency requirements of intelligent driving vehicles in practical driving environments, multi-task visual perception remains a challenging task. Existing methods typically adopt effective multi-task learning networks to handle multiple tasks concurrently. Although they achieve remarkable results, better performance can be obtained by tackling remaining problems such as underutilized high-resolution features and underexploited non-local contextual dependencies. In this work, we propose YOLOPv3, an efficient anchor-based multi-task visual perception network capable of handling traffic object detection, drivable area segmentation, and lane detection simultaneously. Compared to prior works, we make essential improvements. On the one hand, we propose architectural enhancements that exploit multi-scale high-resolution features and non-local contextual dependencies to improve network performance. On the other hand, we propose optimization improvements aimed at enhancing network training, enabling YOLOPv3 to achieve optimal performance via straightforward end-to-end training. The experimental results on the BDD100K dataset demonstrate that YOLOPv3 sets a new state of the art (SOTA): 96.9% recall and 84.3% mAP50 in traffic object detection, 93.2% mIoU in drivable area segmentation, and 88.3% accuracy and 28.0% IoU in lane detection. In addition, YOLOPv3 maintains competitive inference speed against the lightweight YOLOP. Thus, YOLOPv3 stands as a robust solution for multi-task visual perception problems. The code and trained models have been released on GitHub.
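The abstract describes YOLOPv3 as one shared encoder feeding three task-specific decoders. The PyTorch sketch below shows a minimal version of that shared-encoder/multi-head layout; the tiny convolutional encoder, channel sizes, and head output shapes are illustrative assumptions, not the published architecture.

```python
# Minimal sketch of a shared encoder with three task heads (detection,
# drivable-area segmentation, lane detection). All layer sizes are assumptions.
import torch
import torch.nn as nn

class MultiTaskPerceptionNet(nn.Module):
    """One shared encoder, three task-specific decoders."""
    def __init__(self, num_det_outputs: int = 85):
        super().__init__()
        self.encoder = nn.Sequential(           # stand-in for backbone + neck
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.SiLU(),
        )
        self.det_head = nn.Conv2d(64, num_det_outputs, 1)  # traffic objects
        self.drivable_head = nn.Conv2d(64, 2, 1)           # drivable-area mask
        self.lane_head = nn.Conv2d(64, 2, 1)               # lane-line mask

    def forward(self, x: torch.Tensor):
        feats = self.encoder(x)                  # features shared by all tasks
        return self.det_head(feats), self.drivable_head(feats), self.lane_head(feats)

if __name__ == "__main__":
    det, area, lane = MultiTaskPerceptionNet()(torch.randn(1, 3, 384, 640))
    print(det.shape, area.shape, lane.shape)
```

Sharing the encoder is what makes end-to-end training of all three tasks practical: the expensive feature extraction is computed once per frame and only the lightweight heads differ per task.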
Figures

Graphical abstract.
Figure 1. The network architecture of YOLOPv3. YOLOPv3 is a unified encoder–decoder network, consisting of one shared encoder (i.e., a backbone network and a neck network) and three different decoders (i.e., object detection head, drivable area segmentation head, and lane detection head). We introduce the high-resolution features C2 generated by the backbone network into the neck network. 'CBS' and 'CBS-N' are the basic building units of YOLOPv3. 'CBS' comprises a convolutional layer, a BatchNorm layer, and a SiLU activation function. 'CBS-N' consists of a ghost convolutional layer, which contributes to decreasing the model parameters and computation. The detailed structures of other modules (e.g., 'ELAN', 'ELAN-H', 'SPPCSPC', and 'UP') are described in the figure below. 'C' denotes the concatenation operation. 'SA' denotes an SA-based refined module that captures non-local contextual dependencies and enhances lane detection with little computation cost.
Figure 2. The architecture of the SA-based refined module. ⊙ denotes the Hadamard product, ⊗ denotes the matrix product, and ⊕ denotes matrix addition.
Figure 3. The architecture of the 'RepConv' module. ⊕ denotes matrix addition.
Figure 4. Comparison of multi-task predictions. The rows from top to bottom are the input images, the ground truth, and the predictions of YOLOP, HybridNets, and ours, respectively. Yellow boxes indicate traffic objects, green areas are drivable areas, and red lines indicate lane lines.
Figure 5. Comparison of traffic object detection. The rows from top to bottom are the input images, the ground truth, and the predictions of YOLOP, HybridNets, and ours, respectively. The yellow boxes indicate the traffic objects.
Figure 6. Comparison of drivable area segmentation. The rows from top to bottom are the input images, the ground truth, and the predictions of YOLOP, HybridNets, and ours, respectively. The green areas are the drivable areas.
Figure 7. Comparison of lane detection. The rows from top to bottom are the input images, the ground truth, and the predictions of YOLOP, HybridNets, and ours, respectively. The red lines show the lane lines.
Figure 8. Comparison of lane detection before and after utilizing the SA-based refined module. The red lines show the lane lines. The results demonstrate better accuracy and continuity after the SA-based refined module is applied.
18 pages, 39416 KiB  
Article
Optimal Configuration of Multi-Task Learning for Autonomous Driving
by Woomin Jun, Minjun Son, Jisang Yoo and Sungjin Lee
Sensors 2023, 23(24), 9729; https://doi.org/10.3390/s23249729 - 9 Dec 2023
Cited by 1 | Viewed by 1703
Abstract
For autonomous driving, it is imperative to perform various high-computation image recognition tasks with high accuracy, utilizing diverse sensors to perceive the surrounding environment. Specifically, cameras are used to perform lane detection, object detection, and segmentation, and, in the absence of lidar, tasks extend to inferring 3D information through depth estimation, 3D object detection, 3D reconstruction, and SLAM. However, accurately processing all of these image recognition operations in real time for autonomous driving under constrained hardware conditions is practically unfeasible. In this study, considering the characteristics of the image recognition tasks performed by these sensors and the given hardware conditions, we investigated MTL (multi-task learning), which enables parallel execution of various image recognition tasks to maximize their processing speed, accuracy, and memory efficiency. In particular, this study analyzes the combinations of image recognition tasks for autonomous driving and proposes the MDO (multi-task decision and optimization) algorithm, consisting of three steps, as a means of optimization. In the initial step, an MTS (multi-task set) is selected to minimize overall latency while meeting minimum accuracy requirements. Subsequently, additional training of the shared backbone and individual subnets is conducted to enhance accuracy with the predefined MTS. Finally, both the shared backbone and each subnet undergo compression while maintaining the already secured accuracy and latency performance. The experimental results indicate that integrated accuracy performance is critically important in the configuration and optimization of MTL, and this integrated accuracy is determined by the ITC (inter-task correlation). The MDO algorithm was designed to consider these characteristics and construct multi-task sets from tasks that exhibit high ITC. Furthermore, the implementation of the proposed MDO algorithm, coupled with additional SSL (semi-supervised learning) based training, resulted in a significant performance enhancement: approximately a 12% increase in object detection mAP, a 15% improvement in lane detection accuracy, and a 27% reduction in latency, surpassing the results of previous three-task learning techniques such as YOLOP and HybridNet.
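The first MDO step selects a multi-task set (MTS) that minimizes overall latency while meeting minimum accuracy requirements. The toy Python sketch below illustrates that kind of constrained selection; the task names, latency and accuracy numbers, and the exhaustive subset search are illustrative assumptions, not the paper's actual procedure or measurements.

```python
# Toy sketch of an MTS selection step: pick the lowest-latency task set that
# covers the required tasks and meets a per-task accuracy floor.
# All profile numbers below are made-up placeholders, not measurements.
from itertools import combinations

TASKS = {
    "object_detection":  {"latency_ms": 18.0, "accuracy": 0.62},
    "lane_detection":    {"latency_ms": 9.0,  "accuracy": 0.71},
    "drivable_area_seg": {"latency_ms": 12.0, "accuracy": 0.84},
    "depth_estimation":  {"latency_ms": 25.0, "accuracy": 0.58},
}

def select_multi_task_set(required, min_accuracy, latency_budget_ms):
    """Return (task_set, latency) with minimum total latency subject to
    covering the required tasks, the accuracy floor, and the latency budget."""
    best = None
    names = list(TASKS)
    for r in range(len(required), len(names) + 1):
        for subset in combinations(names, r):
            if not set(required) <= set(subset):
                continue  # must contain every required task
            if any(TASKS[t]["accuracy"] < min_accuracy for t in subset):
                continue  # every selected task must meet the accuracy floor
            latency = sum(TASKS[t]["latency_ms"] for t in subset)
            if latency <= latency_budget_ms and (best is None or latency < best[1]):
                best = (subset, latency)
    return best

if __name__ == "__main__":
    print(select_multi_task_set(["object_detection", "lane_detection"], 0.6, 45.0))
```

In the paper's full pipeline this selection is followed by additional training of the shared backbone and subnets and then by compression, so the sketch only stands in for the first of the three steps.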
Figures

Figure 1. The architecture of multi-task learning for autonomous driving.
Figure 2. MDO algorithm.
Figure 3. The example results of two tasks.
Figure 4. The example results of three tasks.
Figure 5. The example results for depth estimation.
14 pages, 7651 KiB  
Article
Research on a Lightweight Panoramic Perception Algorithm for Electric Autonomous Mini-Buses
by Yulin Liu, Gang Li, Liguo Hao, Qiang Yang and Dong Zhang
World Electr. Veh. J. 2023, 14(7), 179; https://doi.org/10.3390/wevj14070179 - 8 Jul 2023
Cited by 2 | Viewed by 1370
Abstract
Autonomous mini-buses are low-cost passenger vehicles that travel along designated routes in industrial parks. To achieve this, it is necessary to implement functionalities such as lane-keeping and obstacle avoidance. To address the challenge of deploying deep learning algorithms that detect environmental information on low-performance computing units, which leads to difficulties in model deployment and the inability to meet real-time requirements, a lightweight algorithm called YOLOP-E based on the YOLOP algorithm is proposed. (The letter 'E' stands for EfficientNetV2; YOLOP-E denotes the optimization of the entire algorithm by replacing the backbone of the original model with EfficientNetV2.) The algorithm is optimized and improved in the following three respects: First, the YOLOP backbone network is reconstructed using the lightweight backbone network EfficientNetV2, and depth-wise separable convolutions are used instead of regular convolutions. Second, a hybrid attention mechanism called CBAM is employed to enhance the model's feature-representation capability. Finally, the Focal EIoU and smoothed cross-entropy loss functions are utilized to improve detection accuracy. YOLOP-E is the final result after the aforementioned optimizations. Experimental results demonstrate that, on the BDD100K dataset, the optimized algorithm achieves a 3.5% increase in mAP50 and a 4.1% increase in mIoU. During real-world vehicle testing, the detection rate reaches 41.6 FPS, meeting the visual perception requirements of the autonomous shuttle bus while maintaining a lightweight design and improving detection accuracy.
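One of YOLOP-E's lightening steps is replacing regular convolutions with depth-wise separable convolutions. The PyTorch sketch below shows that standard substitution (a 3x3 depth-wise convolution followed by a 1x1 point-wise convolution); the channel sizes and the BatchNorm/SiLU pairing are illustrative assumptions rather than the exact YOLOP-E configuration.

```python
# Minimal sketch of a depth-wise separable convolution: a per-channel 3x3
# convolution (groups = in_channels) followed by a 1x1 point-wise convolution.
# Channel counts and the BatchNorm/SiLU pairing are illustrative assumptions.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

if __name__ == "__main__":
    layer = DepthwiseSeparableConv(64, 128, stride=2)
    print(layer(torch.randn(1, 64, 80, 80)).shape)  # torch.Size([1, 128, 40, 40])
```

Splitting a regular 3x3 convolution into depth-wise and point-wise stages cuts parameters and multiply-accumulate operations roughly by the kernel area, which is why it is a common substitution when targeting low-performance compute units.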
Figures

Figure 1. YOLOP neural network architecture.
Figure 2. Backbone architecture: old (left); new (right).
Figure 3. Attention mechanism of CBAM.
Figure 4. MBConv module architecture (left) and FMBConv module architecture (right). (The arrow indicates the direction of convolutional operation propagation within a block, while the symbol on top represents repetition of the previous process.)
Figure 5. Areas with unclear lane lines.
Figure 6. Chessboard calibration board image.
Figure 7. Monocular camera calibration.
Figure 8. Electric autonomous mini-bus. (The red point marks the placement of the camera sensor.)
Figure 9. Model comparison after training for 100 epochs: YOLOP-E (red), YOLOP (blue). (a) mAP50 of object detection; (b) mIoU of lane line segmentation.
Figure 10. Comparison of the final detection results in areas with unclear lane lines: (a) YOLOP; (b) YOLOP-E.