Abstract
The rotation characteristics of point clouds are difficult to capture with current multimodal fusion methods for 3D object detection, and a single fusion strategy cannot balance detection accuracy against speed. Therefore, a multi-sensor segmental fusion method based on frustum association is proposed for 3D object detection in autonomous driving. A monocular camera, lidar, and radar are combined through piecewise, distributed feature-level fusion via frustum association. First, a fully convolutional network extracts a 2D detection box and the object's center point from the image, and a frustum is generated in 3D space according to the depth and scale information. Second, the regions of interest in the lidar and radar point clouds are determined with the frustum association method. Then, spherical voxelization and spherical voxel convolution are applied to the lidar point cloud to extract rotation-invariant features. Finally, feature-level fusion with the object attributes extracted from the image and the radar point cloud refines the detection results. In addition, a dynamic parameter-adaptive network for feature fusion is proposed, which computes fusion features quickly while preserving their accuracy. The proposed method is compared with other algorithms on the nuScenes dataset and is further tested on the severe-weather dataset RADIATE and in a real driving scenario. It achieves the highest NDS score and the highest average precision in severe weather among the compared state-of-the-art methods. The experimental results indicate that the proposed method offers higher accuracy and better adaptability across complex and severe driving environments.
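The first two steps of the pipeline, generating a frustum from a 2D detection with a depth prior and then associating lidar or radar points with it, can be illustrated with a short sketch. The following is a minimal NumPy illustration under a pinhole-camera assumption; it is not the authors' implementation, and all function names, intrinsics values, box coordinates, and depth bounds are hypothetical placeholders.

import numpy as np

def frustum_corners(box2d, z_near, z_far, K):
    """Back-project the 2D box corners at the near and far depths to obtain
    the eight corners of the 3D frustum in the camera frame (pinhole model)."""
    u0, v0, u1, v1 = box2d
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    corners = [((u - cx) * z / fx, (v - cy) * z / fy, z)
               for z in (z_near, z_far)
               for u, v in ((u0, v0), (u1, v0), (u1, v1), (u0, v1))]
    return np.asarray(corners)  # shape (8, 3)

def associate_points(points, box2d, z_near, z_far, K):
    """Frustum association: keep points (camera frame, N x 3) whose image
    projection falls inside the 2D box and whose depth lies in [z_near, z_far]."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    z_safe = np.maximum(z, 1e-3)  # avoid division by zero for points behind the camera
    u = K[0, 0] * x / z_safe + K[0, 2]
    v = K[1, 1] * y / z_safe + K[1, 2]
    u0, v0, u1, v1 = box2d
    inside = ((z >= z_near) & (z <= z_far) &
              (u >= u0) & (u <= u1) & (v >= v0) & (v <= v1))
    return points[inside]

# Toy usage with random points standing in for a lidar sweep (camera frame).
K = np.array([[1266.4, 0.0, 816.3],
              [0.0, 1266.4, 491.5],
              [0.0, 0.0, 1.0]])  # illustrative intrinsics only
points = np.random.uniform(-50.0, 50.0, size=(10000, 3))
roi = associate_points(points, box2d=(400, 300, 650, 520),
                       z_near=2.0, z_far=40.0, K=K)

The returned region of interest is what the later stages would consume, for example for spherical voxelization of the lidar subset; the depth bounds here stand in for the paper's depth and scale priors.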
Data Availability
The datasets used in this study are publicly available online.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 62201375 and 61972454), by the China Postdoctoral Science Foundation (2021M691848), by the Natural Science Foundation of Jiangsu Province (BK20220635, BK20201405), and by the Science and Technology Projects Fund of Suzhou (Grant No. SYG202142).
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Tao, C., Bian, W., Wang, C. et al. 3D object detection algorithm based on multi-sensor segmental fusion of frustum association for autonomous driving. Appl Intell 53, 22753–22774 (2023). https://doi.org/10.1007/s10489-023-04630-4