1. Introduction
The resolution and integrity of collected traffic data directly affect the quality of the insights derived from subsequent analysis and computation: the richer and finer-grained the traffic data, the more valuable the resulting knowledge. Therefore, the demand for high-resolution, highly precise, trajectory-level traffic data has grown significantly in intelligent transportation system (ITS) applications, such as traffic accident prediction and prevention, vehicle-to-infrastructure cooperation, and mobility planning and decision making.
However, conventional traffic detectors have inherent limitations in providing high-resolution, trajectory-level traffic data. Loop detectors are widely used in ITS, but they cannot capture an object’s trajectory and are prone to damage under heavy traffic. Video cameras can offer trajectory data, but their accuracy is impeded by varying illumination conditions and occlusion [
1]. Light detection and ranging (LiDAR) sensors have demonstrated the ability to detect and track road users at a high resolution and with lane-level accuracy through point cloud data processing [
2]. Nonetheless, they also encounter obstacles such as occlusion and the requirement for continuous tracking within confined detection ranges. Furthermore, the robustness of LiDAR and cameras in adverse weather conditions, such as heavy rainfall, fog, and dust storms, remains a concern.
A promising solution emerges in the form of the 4D millimeter-wave (MMW) radar sensor. It offers a wealth of high-resolution 4D point cloud data, encompassing distance, azimuth, elevation, and Doppler velocity. In addition, it provides a more extended detection range and exhibits remarkable resilience against variations in light and adverse weather conditions [
3]. The unique capabilities, coupled with its low cost, position the 4D MMW radar sensor as a promising candidate for widespread deployment in roadside units to provide high-resolution, trajectory-level traffic data for ITS applications.
Object detection is the fundamental task of vision-based sensors in environment perception and sensing. Existing research has focused on using deep learning frameworks to process onboard MMW radar data for autonomous vehicles, where the radar often works with LiDAR, camera, or GPS data to perceive the surrounding environment and identify potential hazards [
4,
5,
6]. However, the data processing procedure for roadside MMW radars differs from that for onboard MMW radars. A roadside MMW radar works independently to perceive all road users, such as vehicles, pedestrians, and bicyclists, within its detection range cost-effectively. Consequently, methods developed for onboard MMW radars cannot be applied directly to roadside radars and do not yield the desired results. In addition, to the best of our knowledge, existing studies on object detection and tracking for roadside 4D MMW radar are limited. Therefore, to harness the full potential of roadside 4D MMW radar data, algorithms tailored to its distinctive characteristics are necessary.
Motivated by the objective of leveraging 4D MMW radar point cloud data to extract high-resolution, lane-level traffic information, this paper proposes a Louvain-based traffic object detection method for roadside 4D MMW radar built on its data characteristics. First, velocity-based filtering and region of interest (ROI) extraction were employed to filter the point cloud data, and data frames were merged to strengthen the relationships between points. Second, the 4D MMW radar point cloud data were converted into a graph structure via a similarity matrix obtained by mapping the point distances with a Gaussian kernel function, with the points’ Doppler velocities used as the weights of the graph links. Then, the Louvain algorithm [
7] was introduced to partition the graph, serving as the point cloud clustering step. Finally, a detection augmentation method was proposed to address the over-clustering and under-clustering problems raised by the Louvain-based clustering algorithm. The associated object ID, a unique characteristic of 4D MMW radar point cloud data, was also used to verify the clustering results.
The major contributions of this article are as follows:
A Louvain-based traffic object detection method for roadside 4D MMW radar was proposed. To leverage the Louvain algorithm, the point cloud data were transformed into a graph structure; to the best of our knowledge, this is the first use of the Louvain algorithm to process point cloud data.
A detection augmentation method was proposed to address the over-clustering and under-clustering problem, leveraging the unique attribute of 4D MMW radar point cloud data.
The proposed Louvain-based traffic object detection method outperformed state-of-the-art methods in precision and F1 score on 4D MMW radar point cloud data, with the fewest over-clustering and under-clustering errors.
The remainder of this paper is organized as follows:
Section 2 reviews related object detection methods using different sensors.
Section 3 introduces the 4D MMW radar and its data.
Section 4 presents the proposed method for traffic object detection for the roadside 4D MMW radar. In
Section 5, experiments are conducted to illustrate the performance of the proposed method.
Section 6 concludes the paper with a summary and prospects for possible future works.
2. Related Work
In the realm of environmental sensing, modern visual sensors such as cameras, LiDAR, and MMW radar have gained prominence. These sensors are widely installed on vehicles and in infrastructure units to perceive the surrounding environment and extract traffic flow information. To draw a clear distinction between this study and previous research, the literature on camera-based, LiDAR-based, and MMW radar-based object detection is reviewed below. Multi-sensor fusion for object detection is also introduced in this section.
2.1. Camera-Based Object Detection
Object detection using camera-based sensors has been investigated extensively and boasts a wealth of mature technologies and research outcomes. Deep learning has become the mainstream camera-based object detection technology, exemplified by the R-CNN series [
8,
9] and YOLO [
10] series. These methods are recognized for attaining high accuracy and real-time performance in object detection tasks. However, adverse weather and illumination conditions affect the robustness and reliability of camera-based detection. In addition, camera-based detection using deep learning requires additional resources for data annotation and training [
11].
2.2. LiDAR-Based Object Detection
Compared to cameras, LiDAR takes advantage of resistance to lighting fluctuations, high resolution, and rich information [
12]. The utilization of LiDAR sensors spans diverse applications, encompassing object detection and classification [
13,
14], lane marking detection [
15,
16], and stained and worn pavement marking identification [
17,
18]. In the current study, deep learning frameworks are also widely used to process point cloud data directly. For instance, Qi [
19] designed a novel neural network architecture, PointNet, to process point cloud data directly. Zhou [
20] introduced VoxelNet, which eliminates the need for manual feature extraction by segmenting point clouds into evenly spaced 3D voxels. Machine learning methodologies also hold promise for LiDAR-based object detection. Lin [
21] employed the L-shape fitting method to derive more accurate bounding boxes, facilitating object feature extraction. Despite its resilience to varying lighting conditions, LiDAR’s performance can be degraded by extreme weather and by occlusion in dense traffic flow. Additionally, the expense associated with LiDAR systems remains a deterrent, although efforts have been made to extend the detection range of cost-effective, low-channel LiDAR [
22].
2.3. MMW Radar-Based Object Detection
The conventional MMW radar has been widely used for object detection in various applications, motivating extensive research on enhancing its detection accuracy. Clustering, often employed as an unsupervised learning technique, is a cornerstone of point cloud data processing for object detection. Researchers have focused on improving the performance of the density-based spatial clustering of applications with noise (DBSCAN) algorithm and on combining it with other data aggregation algorithms to enhance object detection accuracy and speed [
23,
24]. Typically, Wang [
25] incorporated DBSCAN into merged data and introduced frame order features to mitigate multipath noise and distinguish target points from noise. Xie [
26] utilized a multi-frame merging strategy to bolster single-frame clustering accuracy while leveraging frame sequence attributes to address multi-target noise. However, traditional MMW radar has several limitations, including its incapacity to capture height information [
24], inability to detect stationary objects, and limited resolution. These constraints render it ill-suited for autonomous driving and traffic perception.
The emerging 4D MMW radar not only preserves the benefits of conventional MMW radar, but also effectively addresses several of its limitations, showcasing the promising potential for robust and high-resolution object detection. In the existing research, investigations into object detection leveraging 4D MMW radar are predominantly focused on autonomous driving. For instance, Lutz [
27] employed supervised machine learning techniques to identify noisy points within 4D radar point cloud data. Jin [
28] utilized a Gaussian mixture model (GMM) to detect and classify pedestrians and vehicles using 4D MMW radar data. Tan [
29] proposed a deep learning framework for object detection that leverages 4D MMW radar multi-frame point cloud data and applies velocity compensation in point cloud data processing; it serves as a baseline for object detection with onboard 4D MMW radar. Liu [
30] introduced an object detection method named SMURF to address the problem of sparse and noisy radar point cloud data by utilizing various representations, such as pillarization and density features obtained through kernel density estimation.
It is imperative to acknowledge that the characteristics of onboard 4D MMW radar data diverge from those of roadside 4D MMW radar. A frame of onboard radar data often captures vehicle contours but contains only a few vehicles; in contrast, a frame of roadside radar data can contain a large number of vehicles but cannot resolve their contours. Consequently, the previously mentioned algorithms may not yield the anticipated outcomes when applied to roadside 4D MMW radar point cloud data, and developing dedicated object detection algorithms tailored for roadside 4D MMW radar is essential for future ITS applications.
2.4. Fusion-Based Object Detection
In light of the inherent limitations present in individual sensors, sensor fusion emerges as a pivotal avenue for exploration within the domain of traffic perception. Researchers have diligently worked to synergistically amalgamate diverse sensor types to amplify the overall performance of object detection.
Current research on fusing 4D MMW radar with other sensors for object detection focuses on cameras and LiDAR. For 4D MMW radar and camera fusion, Cui [
5] proposed a convolutional neural network (CNN) and cross-fusion strategy which leverages dual 4D MMW radars with a monovision camera. Zheng [
31] proposed an RCFusion model integrating camera and 4D radar data to enhance object detection accuracy and robustness for autonomous vehicles. Xiong [
32] proposed a “LiDAR Excluded Lean (LXL)” model that utilizes maps of predicted depth distribution in images and 3D occupancy grids based on radar data by leveraging the “sampling” view transformation strategy to enhance detection accuracy.
For 4D MMW radar and LiDAR fusion, Wang [
33] introduced an interaction-based fusion framework leveraging the self-attention mechanism to aggregate features from radar and LiDAR. However, the different characteristics and noise distributions of radar and LiDAR point cloud data affect the detection performance when directly integrating radar and LiDAR data. Therefore, Wang [
34] employed a Gaussian distribution model in the M2-Fusion framework to mitigate the variations in point cloud distribution. Chen [
35] proposed an end-to-end multi-sensor fusion framework called FUTR3D, which can adapt to most existing sensors in object tracking.
However, the large-scale deployment of roadside sensors for ITS and smart city applications remains challenging due to the significantly increased investment cost of installing multiple sensors at a single site. In addition, co-locating a 4D MMW radar with other sensors may inadvertently curtail the effective detection range, as the detection range of a 4D MMW radar far exceeds that of the other sensors. Therefore, this paper is dedicated to improving the detection accuracy of an individual roadside 4D MMW radar by utilizing its inherent point cloud data.
3. 4D MMW Radar Sensor and Its Data
The 4D MMW radar point cloud data were extracted from the Continental ARS 548 RDI long-range radar sensor. The sensor’s inherent parameters are shown in
Table 1.
The 4D MMW radar offers two distinct output modes: detection and object. In object mode, the sensor outputs point objects akin to those of a traditional MMW radar. In detection mode, it yields point cloud data similar to a LiDAR sensor’s. However, the reference objects it provides are limited and inaccurate, as shown in
Figure 1.
The point cloud data encompass five primary attributes: R (range), θ (azimuth), φ (elevation), V (Doppler velocity), and RCS (radar cross section).
To facilitate the algorithm’s application, the polar coordinates of the point cloud data were converted into Cartesian coordinates:
x = R · cos(φ) · sin(θ), y = R · cos(φ) · cos(θ), z = R · sin(φ)
The converted Cartesian coordinates of the radar are shown in
Figure 2.
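A minimal sketch of this conversion, assuming azimuth is measured from the y-axis (which is parallel to the road) and all angles are in radians:

```python
import numpy as np

def polar_to_cartesian(r, azimuth, elevation):
    """Convert radar polar coordinates (range, azimuth, elevation)
    to Cartesian (x, y, z). Azimuth is assumed to be measured from
    the y-axis (the road direction); angles are in radians."""
    x = r * np.cos(elevation) * np.sin(azimuth)
    y = r * np.cos(elevation) * np.cos(azimuth)
    z = r * np.sin(elevation)
    return x, y, z

# A point 100 m away, straight down the road at zero elevation:
x, y, z = polar_to_cartesian(100.0, 0.0, 0.0)
```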
The point cloud data were also associated with additional auxiliary attributes, as listed in
Table 2.
4. Method
The proposed method can be divided into two modules: point cloud data preprocessing and Louvain-based point cloud clustering, as shown in
Figure 3. The point cloud data preprocessing module includes multi-frame aggregation and background filtering. The Louvain-based point cloud clustering module first converted the point cloud into a graph structure. Then, the Louvain algorithm was employed to segment the graph into different modularities, thereby clustering the point cloud data. Finally, a detection augmentation method was proposed to address the over-clustering and under-clustering problems of the Louvain algorithm and to enhance clustering accuracy. The output clusters of traffic objects were enclosed in boxes that roughly outline the vehicles’ 2D contours.
4.1. Point Cloud Data Preprocessing
The raw single frame of point cloud data extracted from the 4D MMW radar includes a limited number of points. To analyze the distribution characteristics and extract more information from 4D MMW radar data, the multi-frame point cloud data merging approach has demonstrated its effectiveness [
22,
23]. Therefore, every five raw point cloud frames were amalgamated into a single aggregated frame for the subsequent data processing tasks. Experimental observation showed that multi-frame consolidation makes noisy points easier to distinguish and enhances traffic object features.
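As an illustrative sketch (the per-frame array representation is an assumption, not the sensor’s actual interface), merging every five frames might look like:

```python
import numpy as np

def aggregate_frames(frames, window=5):
    """Merge consecutive point cloud frames into aggregated frames.
    `frames` is a list of (N_i, k) arrays, one per raw radar frame;
    every `window` frames are stacked into one aggregated point cloud."""
    merged = []
    for i in range(0, len(frames) - window + 1, window):
        merged.append(np.vstack(frames[i:i + window]))
    return merged

# Ten tiny single-point frames -> two aggregated frames of five points each.
raw = [np.array([[float(i), 0.0, 0.0]]) for i in range(10)]
agg = aggregate_frames(raw)
```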
The 4D MMW radar point cloud data also encompass background points, such as returns from trees and buildings. To improve traffic object identification accuracy and reduce computational load, background point filtering is the primary task in 4D MMW radar point cloud preprocessing. Therefore, a combined region of interest (ROI) extraction and velocity-based filtering method was employed to exclude the background points as follows:
Step 1: A Cartesian coordinate system was established with the radar deployment location as the origin point.
Step 2: The ROI was defined by constraining the 3D coordinates. Any points falling outside the selected range were identified as background points and excluded.
Step 3: If the Doppler velocity of a radar point was below the set threshold, the point was identified as noise and filtered out.
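The three preprocessing steps can be sketched as follows; the ROI bounds and velocity threshold below are illustrative, scenario-dependent values:

```python
import numpy as np

def filter_background(points, roi, v_min):
    """Keep points inside the ROI whose |Doppler velocity| >= v_min.
    `points` is an (N, 4) array of (x, y, z, v) rows in the radar's
    Cartesian frame; `roi` is ((xmin, xmax), (ymin, ymax), (zmin, zmax)).
    The bounds and v_min are tuning parameters chosen per site."""
    (x0, x1), (y0, y1), (z0, z1) = roi
    x, y, z, v = points.T
    in_roi = (x0 <= x) & (x <= x1) & (y0 <= y) & (y <= y1) & (z0 <= z) & (z <= z1)
    moving = np.abs(v) >= v_min
    return points[in_roi & moving]

pts = np.array([[1.0, 50.0, 0.5, 10.0],     # moving vehicle point
                [1.0, 50.0, 0.5, 0.1],      # static background point
                [500.0, 50.0, 0.5, 10.0]])  # point outside the ROI
kept = filter_background(pts, ((-10, 10), (0, 300), (-1, 5)), v_min=0.5)
```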
After the above processing, the points attributed to actual objects evolve into concise trajectories, as shown in
Figure 4.
4.2. Traffic Object Detection
The point cloud data extracted from the 4D MMW radar are sparse. However, the data are associated with velocity and auxiliary information. A two-stage traffic object detection method, encompassing Louvain-based point cloud clustering and detection augmentation, was proposed for roadside 4D MMW radar to improve identification accuracy and robustness.
4.2.1. Louvain-Based Point Cloud Clustering
The Louvain algorithm is specifically designed to identify communities, or groups of nodes, within a network or graph by leveraging their connections and interactions. The graph’s modularity can be used to divide it into sections, which in turn reveal the spatial relationships within the point cloud data. Therefore, the Louvain algorithm was used to cluster the 4D MMW radar point cloud data as follows:
Step 1: graph link generation. Louvain is a graph-based algorithm, so the initial step is creating the graph’s nodes, which can be derived directly from the points. The primary task is then establishing the connections between points, which form the graph’s links (edges). The distance between two points is defined as:
d_ij = sqrt((x_i − x_j)² + (y_i − y_j)²)
where d_ij is the distance between points i and j in the XY plane.
The distance matrix is symmetric, i.e., d_ij = d_ji. The Z-coordinate was not used in the distance computation because millimeter waves may generate echoes at different vehicle locations, producing point clouds of varying heights for the same object.
To amplify the differences and showcase the connections between the points, the Gaussian kernel function was used to transform the distance matrix into a similarity matrix:
S_ij = exp(−d_ij² / (2σ²))
where d_ij is the distance between points i and j, and σ is a hyper-parameter that controls the influence range of the kernel function.
The values of the mapped similarity matrix lie between 0 and 1; a larger value means the two nodes are more similar. Nodes were linked if their similarity exceeded a set threshold, which was tuned between 0.4 and 0.6 according to experimental observations to obtain the best results.
Step 2: link weight formulation. Edge weights indicate the strength of the links between nodes. For the 4D MMW radar, points associated with the same object should share the same motion properties. Therefore, the Doppler velocity differences between points were calculated to form a weight matrix, which was also mapped through the Gaussian kernel function. The resulting graph structure thus combines the points’ positional and motion information for subsequent graph-based processing.
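Steps 1 and 2 can be sketched as follows; the kernel bandwidths and similarity threshold are illustrative tuning values, and the adjacency structure is a plain dictionary rather than any particular graph library:

```python
import numpy as np

def build_graph(xy, v, sigma_d=2.0, sigma_v=1.0, threshold=0.5):
    """Build a weighted adjacency structure from radar points.
    Nodes i, j are linked when exp(-d_ij^2 / (2*sigma_d^2)) > threshold,
    where d_ij is the XY distance; the edge weight is the Doppler
    velocity similarity exp(-(v_i - v_j)^2 / (2*sigma_v^2))."""
    diff = xy[:, None, :] - xy[None, :, :]
    d2 = (diff ** 2).sum(-1)                    # squared XY distances
    sim = np.exp(-d2 / (2 * sigma_d ** 2))      # position similarity
    w = np.exp(-(v[:, None] - v[None, :]) ** 2 / (2 * sigma_v ** 2))
    adj = {i: {} for i in range(len(xy))}
    for i in range(len(xy)):
        for j in range(i + 1, len(xy)):
            if sim[i, j] > threshold:
                adj[i][j] = adj[j][i] = float(w[i, j])
    return adj

# Two nearby points with similar velocities, plus one distant point:
xy = np.array([[0.0, 0.0], [0.5, 0.5], [50.0, 50.0]])
v = np.array([10.0, 10.2, -8.0])
adj = build_graph(xy, v)
```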
Step 3: Louvain-based point clustering. Modularity is the essential concept the Louvain algorithm uses to process a graph structure and explore the potential relationships among its components. The modularity is defined as:
Q = (1/2m) Σ_ij [ A_ij − (k_i · k_j)/(2m) ] · δ(c_i, c_j)
where A_ij is the weight of the edge connecting nodes i and j; m = (1/2) Σ_ij A_ij is the total weight of the edges in the graph; k_i = Σ_j A_ij is the sum of the weights of all links attached to node i; and δ(c_i, c_j) flags whether points i and j belong to the same community (1 if so, 0 otherwise).
Modularity is a measure that quantifies how well a network can be divided into distinct communities. The algorithm starts with each node belonging to its own community and then merges communities to maximize the modularity. This process continues iteratively until no further improvement in modularity can be achieved. In essence, the Louvain algorithm aims to find a partition of the network into communities, such that the connections within communities are dense and the connections between communities are sparse.
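A minimal sketch of the clustering step, assuming the networkx library’s `louvain_communities` (which implements this modularity-maximization procedure) is available:

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Two tight triangles of points joined by one weak inter-group edge;
# modularity maximization should separate them into two communities.
g = nx.Graph()
g.add_weighted_edges_from([(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
                           (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
                           (2, 3, 0.01)])
communities = louvain_communities(g, weight="weight", seed=0)
clusters = sorted(sorted(c) for c in communities)
```

In the proposed method, the nodes would be radar points and the edge weights the velocity-mapped similarities from step 2; each resulting community is one candidate traffic object.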
4.2.2. Object Detection Augmentation
As a clustering algorithm, Louvain-based point cloud clustering of roadside 4D MMW radar data is not free from two types of errors: segmenting points belonging to the same object into micro-groups (error 1: over-clustering) and merging points from distinct objects into a single group (error 2: under-clustering). To improve robustness and reliability, both problems should be addressed in the traffic object detection process. The object ID associated with each point is a unique characteristic of 4D MMW radar point cloud data and can be used to verify the clustering results. Therefore, an object detection augmentation method was proposed to solve errors 1 and 2, as shown in
Figure 5.
For error 1, the over-clustered micro-groups need to be merged into one cluster. For error 2, the under-clustered group needs to be divided into different clusters. To facilitate a detailed description of the algorithm, a point is defined as p = (x, y, ID), where x and y are the point coordinates and ID is the associated object. A cluster is defined as C = {p_1, p_2, …, p_n}.
The over-clustering problem was addressed as follows:
Step 1: The most frequent associated objects for each cluster were counted and stored in a list.
Step 2: The list was checked for non-zero duplicate elements, and their positions were identified.
Step 3: The difference in the average x-coordinates of the points belonging to the clusters that share the same associated object was calculated as:
Δx̄ = | x̄(C_i) − x̄(C_j) |
where C_i and C_j are the clusters with the same associated object, and x̄(·) denotes the average x-coordinate of a cluster’s points.
If Δx̄ is less than the threshold τ = 1 m, the two clusters are merged. The threshold was set to 1 m to avoid merging objects with the same ID across neighboring lanes.
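The three merging steps above can be sketched as follows (the point and cluster representations, and the helper names, are illustrative simplifications):

```python
from collections import Counter

def merge_overclustered(clusters, tau=1.0):
    """Merge clusters that share the same dominant associated object ID
    and whose mean x-coordinates differ by less than `tau` (1 m here,
    to avoid merging same-ID objects across neighboring lanes).
    Each cluster is a list of (x, y, obj_id) points; ID 0 means
    'no associated object' and never triggers a merge."""
    def dominant_id(c):
        return Counter(p[2] for p in c).most_common(1)[0][0]

    def mean_x(c):
        return sum(p[0] for p in c) / len(c)

    merged = []
    for c in clusters:
        for m in merged:
            if dominant_id(m) == dominant_id(c) != 0 and abs(mean_x(m) - mean_x(c)) < tau:
                m.extend(c)     # same object split into micro-groups
                break
        else:
            merged.append(list(c))
    return merged

a = [(3.1, 10.0, 7), (3.2, 11.0, 7)]   # one vehicle (ID 7) split...
b = [(3.3, 14.0, 7)]                   # ...into two micro-clusters
c = [(7.0, 12.0, 9)]                   # a different vehicle, other lane
out = merge_overclustered([a, b, c])
```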
The under-clustering problem was solved as follows:
An object with more than five points is defined as a valid object, because five frames of raw point cloud data were aggregated in the preprocessing module. If a cluster contains more than one valid object ID and Δx > 1 m or L > L_max, the cluster is defined as under-clustered and is re-clustered, where Δx is the difference between the clusters’ x-coordinates (the 1 m threshold again prevents incorrectly merging objects in adjacent lanes), L is the length of the cluster, and L_max is the maximum trajectory length.
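A simplified reading of this under-clustering test can be sketched as follows (`l_max`, `min_pts`, and the exact splitting criterion are assumptions for illustration):

```python
from collections import Counter

def is_underclustered(cluster, l_max=12.0, min_pts=5):
    """Flag a cluster that likely merges more than one object.
    A cluster is suspect when it holds more than one *valid* object ID
    (an ID backed by more than `min_pts` points, since five raw frames
    were aggregated) and either its x-extent exceeds the 1 m lane-safe
    threshold or its length exceeds the maximum trajectory length.
    Each point is (x, y, obj_id); ID 0 means no associated object."""
    counts = Counter(p[2] for p in cluster)
    valid_ids = [i for i, n in counts.items() if n > min_pts and i != 0]
    xs = [p[0] for p in cluster]
    ys = [p[1] for p in cluster]
    dx = max(xs) - min(xs)          # spread across lanes
    length = max(ys) - min(ys)      # spread along the road
    return len(valid_ids) > 1 and (dx > 1.0 or length > l_max)

# Two vehicles (IDs 4 and 8) wrongly merged into one cluster
# spanning two lanes:
merged = [(0.1 * i, 10.0 + i, 4) for i in range(6)] + \
         [(3.5 + 0.1 * i, 10.0 + i, 8) for i in range(6)]
flag = is_underclustered(merged)
```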
5. Experiments and Validation
5.1. Data Collection
In the experiments, the roadside 4D MMW radar was deployed on two pedestrian overpasses, on Ziyou Road (site I) and the Yatai Expressway (site II), at a height of 6.0 m. At site III, the radar was mounted on a 4.2 m tripod on the roadside of the Yatai Expressway for comparison across different traffic scenarios. In addition, 8K video (7680 × 4320 px resolution) recorded by a cellphone placed at the same position as the radar was used to obtain the ground truth of the traffic environment. The y-axis of the radar coordinate system was parallel to the road at all sites. The 4D MMW radar sensor locations and the geographic information of the sites are shown in
Figure 6.
The proposed method and comparison algorithms were tested in Python and executed on a laptop equipped with Intel(R) Core(TM) i7-8750H CPU @ 2.20 GHz, NVIDIA GeForce GTX 1050 Ti, and 32 GB of RAM.
5.2. Evaluation
5.2.1. Detection Evaluation
To compute and evaluate the detection results more easily, a rough 2D outline was generated for each cluster based on the distribution of its points, as shown in
Figure 7.
To verify the effectiveness of the proposed method, DBSCAN with Mahalanobis distance [
23], improved DBSCAN [
2], and the Leiden algorithm [
36] were used to compare performance under the same metrics: precision, recall, and F1 score. The performances of the different object clustering methods on the 4D MMW radar point cloud data are shown in
Table 3.
The results show that the proposed method exhibited the highest precision and F1 score with minimal errors, and its recall matched that of the best-performing method. In addition, like DBSCAN, the proposed method does not require the number of clusters to be predetermined.
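For reference, the precision, recall, and F1 metrics follow their standard definitions; a minimal computation from detection counts (the counts below are made up for illustration, not the paper’s data):

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive,
    and false-negative detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Illustrative counts only:
p, r, f1 = detection_metrics(tp=90, fp=10, fn=10)
```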
In the field, traffic signs are mainly made of metal, to which millimeter waves are sensitive and which is highly reflective. Waves can therefore reflect off the signs rather than returning directly from a traffic object, causing the object’s points to be lost or displaced; this is known as the multipath effect. At sites II and III, the recalls were lower than at site I because of the traffic signs present there. Thus, all of the clustering algorithms were affected by the multipath effect of the 4D MMW radar.
Leiden is also a Louvain-based clustering algorithm, but its performance appeared unsteady across scenarios: it outperformed DBSCAN at sites II and III but not at site I. In addition, error 1 occurred more frequently with the Leiden algorithm than with the proposed method.
5.2.2. Ablation Experiments
Ablation experiments were performed to evaluate the modules of the proposed method using the same point cloud data as in the previous section. The results are shown in
Table 4.
The effect of each individual module of the proposed method was unstable across sites. However, at all three sites, the three evaluation metrics obtained using only the Louvain algorithm were better than those obtained using only detection augmentation, although errors 1 and 2 occurred less frequently with detection augmentation than with Louvain alone at site I. When the distance between two objects is small or their speeds are close, their points are more easily assigned to the same object, which makes the detection augmentation method less effective in these situations. However, when it is combined with the Louvain algorithm, the results are far better than those of either module running individually.
5.2.3. Evaluation of the Performance in Distance
The point cloud density of 4D MMW radar and LiDAR decreases as the distance increases [
22]. The variation in the point cloud was measured as the average number of points within an object at different distances, as shown in
Figure 8.
To evaluate the performance of the proposed method at different distances, the detected distance was divided into five areas: 50 to 100 m, 100 to 150 m, 150 to 200 m, 200 to 250 m, and 250 to 300 m. The precision of the proposed method in different areas was calculated, as shown in
Table 5.
It can be seen that object detection accuracy was not correlated with the decay in the number of point cloud returns. At site I, the highest precision occurred within 50~100 m, whereas at site II it occurred within 250~300 m and at site III within 200~250 m. This result demonstrates that the attenuation of the 4D MMW radar point cloud did not affect object detection performance. The object detection range of the radar matched the sensor’s detection range, except for the 0~50 m area, which fell in a detection blind spot caused by the deployment height. This distance-irrelevant characteristic of the 4D MMW radar can be considered an advantage in traffic object detection.
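Binning detections by range, as in this evaluation, might be sketched as follows (the bin edges match the areas above; the detection records are illustrative):

```python
def precision_by_distance(detections, edges=(50, 100, 150, 200, 250, 300)):
    """Per-range-bin precision. `detections` is a list of
    (distance_m, is_true_positive) records; bins are [edges[k], edges[k+1])."""
    bins = {}
    for d, ok in detections:
        for lo, hi in zip(edges, edges[1:]):
            if lo <= d < hi:
                n_tp, n_all = bins.get((lo, hi), (0, 0))
                bins[(lo, hi)] = (n_tp + int(ok), n_all + 1)
                break
    return {b: tp / n for b, (tp, n) in bins.items()}

# Illustrative records: (distance in metres, correctly detected?)
dets = [(60, True), (70, True), (120, True), (130, False)]
prec = precision_by_distance(dets)
```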
6. Conclusions
This paper introduces a novel approach to traffic object detection that leverages the capabilities of advanced 4D MMW radar sensors. The method is designed as a multi-step process that enhances the accuracy and practicality of object detection. The initial focus is refining the raw point cloud data obtained from the radar sensor through a combination of techniques, including ROI extraction and velocity-based filtering. These steps are crucial to simplifying the dataset by isolating relevant information while filtering out extraneous noise.
The core goal of our method is to convert point cloud data into graph data. The similarity and velocity matrices are calculated and used as the conversion base in this phase. This conversion provides a new idea for processing low-density point cloud data and effectively integrates the multidimensional features of 4D MMW radar point clouds.
A two-stage method based on the Louvain algorithm was proposed to elevate the precision of traffic object detection, with a detection augmentation stage serving as a corrective mechanism that addresses irregularities arising during the grouping process. By refining the groups and addressing potential errors, this stage fine-tunes the detection results, ultimately contributing to the methodology’s overall reliability and effectiveness. Collectively, these steps constitute a comprehensive approach to enhancing object detection using 4D MMW radar sensors, with the potential for significant impact in various real-world applications.
The field experiments were conducted at three diverse sites, revealing impressive precisions of 97.73%, 99.77%, and 96.95%, respectively. However, the proposed method is still not free from limitations:
- (1)
Occlusions and larger vehicles: One of the challenges that still affects detection accuracy is occlusion, especially when larger vehicles obstruct the emitted millimeter waves. This challenge leads to an increased likelihood of missed detections. Future research should focus on developing strategies to mitigate the impact of occlusions on detection accuracy, thereby enhancing the robustness of the proposed method.
- (2)
Intersection detection: Our study’s scope excluded intersection sites due to the current limitations of 4D radar detection capabilities. However, ongoing advancements in radar technology could potentially enable the detection of traffic objects at intersections. Future research should investigate the potential of extending the proposed method to intersection scenarios, which could open up new and exciting possibilities for broader applications.
Future research directions encompass several pivotal areas:
- (1)
Sensor adaptability: To assess the adaptability and robustness of the proposed method, future work should involve testing it with a variety of 4D MMW radar sensors. This approach will help to determine the method’s effectiveness across different sensor configurations and specifications.
- (2)
Pedestrian detection: Exploring pedestrian detection using 4D MMW radar is a compelling avenue within traffic safety. Developing algorithms that accurately detect and track pedestrians using radar data can significantly enhance road safety.
- (3)
Object-tracking algorithms: Acknowledging that detection is the foundational step in traffic perception, future research should focus on developing dedicated object-tracking algorithms tailored explicitly for 4D MMW radar. Accurate tracking of detected objects over time can provide valuable insights for traffic management and control systems.
In conclusion, while the proposed method has demonstrated promising results, there remain challenges to address and exciting research directions to pursue. By overcoming limitations and delving into the suggested future avenues, the capabilities of 4D MMW radar in object detection and traffic perception applications can be further studied.