CN114359334A - Target tracking method, apparatus, computer equipment and storage medium
- Publication number: CN114359334A
- Application number: CN202011062425.7A
- Authority: CN (China)
- Legal status: Pending
Classifications
- Image Analysis (AREA)
- Optical Radar Systems And Details Thereof (AREA)
Description
Technical Field
The present application relates to the field of tracking technology, and in particular to a target tracking method, apparatus, computer device, and storage medium.
Background
With the development of sensor technology and computer technology, simultaneous localization and mapping solutions based on various sensors have been widely applied in fields such as autonomous robot navigation, driverless vehicles, mobile surveying, and battlefield environment construction.
For example, during target tracking, a sensor can collect data about the target, and the target can be tracked by analyzing the collected data. Data obtained by different sensors usually differ in dimensionality, structure, and feature emphasis (contour, size, trajectory, category, color, texture, etc.), so different features are extracted for the same target. In conventional approaches, however, a single sensor is often used to track the target, which tends to leave the feature dimensions insufficient for detection and leads to low-accuracy tracking results.
Summary of the Invention
In view of the above technical problems, it is necessary to provide a target tracking method, apparatus, computer device, and storage medium capable of improving the accuracy of target tracking results.
In a first aspect, an embodiment of the present application provides a target tracking method, the method including:
acquiring first current-frame data of a target scene collected by a first sensor and second current-frame data of the target scene collected by at least one second sensor, the first sensor and the second sensor being sensors of different types;
acquiring, according to the first current-frame data and the second current-frame data, measured feature information of each candidate detection object in the current frame, where the candidate detection objects are the detection objects for which the detection objects in the first current-frame data and the detection objects in the second current-frame data are successfully associated and matched, and the measured feature information represents the inherent characteristic information of each candidate detection object;
predicting, according to the historical detection box of each target to be tracked in the previous frame, the predicted detection box of each target to be tracked in the current frame; and
tracking each candidate detection object according to the measured feature information of each candidate detection object in the current frame and the predicted detection box of each target to be tracked in the current frame.
In one embodiment, the measured feature information includes a measured three-dimensional detection box and a measured trajectory feature, and tracking each candidate detection object according to the measured feature information of each candidate detection object in the current frame and the predicted detection box of each target to be tracked in the current frame includes:
acquiring the intersection-over-union (IoU) ratio between the measured three-dimensional detection box of each candidate detection object in the current frame and the three-dimensional detection box predicted for each target to be tracked in the current frame;
acquiring the similarity between the measured trajectory features of the remaining detection objects and the historical trajectory features of each target to be tracked in the previous frame, where the remaining detection objects are the candidate detection objects whose IoU is less than a preset IoU threshold; and
tracking each candidate detection object whose IoU is greater than the IoU threshold and each remaining detection object whose similarity is greater than a preset similarity threshold.
In one embodiment, acquiring the first current-frame data of the target scene collected by the first sensor and the second current-frame data of the target scene collected by the at least one second sensor includes:
for the data collected by the sensors in any frame, acquiring a timestamp T1 of the first current-frame data and a timestamp T2 of the second current-frame data;
if the interval between T1 and T2 is less than a preset interval threshold, determining that the first current-frame data and the second current-frame data are the data collected for the current frame; and
if the interval between T1 and T2 is greater than the preset threshold, discarding the first current-frame data and the second current-frame data, and re-acquiring the timestamp T1 of the first current-frame data and the timestamp T2 of the second current-frame data for the next frame.
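As a minimal illustration of this synchronization step (a sketch, not part of the patent text; the stream format and the 50 ms threshold are assumptions), two sensor streams can be paired by timestamp as follows:

```python
def pair_frames(frames1, frames2, max_gap=0.05):
    """Pair frames from two sensor streams whose timestamps differ by less
    than max_gap seconds; frames that cannot be paired are discarded.

    frames1, frames2: lists of (timestamp, data) tuples sorted by time.
    """
    pairs, i, j = [], 0, 0
    while i < len(frames1) and j < len(frames2):
        t1, d1 = frames1[i]
        t2, d2 = frames2[j]
        if abs(t1 - t2) < max_gap:   # synchronized: a valid frame pair
            pairs.append((d1, d2))
            i += 1
            j += 1
        elif t1 < t2:                # first stream lags behind, drop its frame
            i += 1
        else:                        # second stream lags behind, drop its frame
            j += 1
    return pairs
```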
In one embodiment, acquiring the timestamp T1 of the first current-frame data and the timestamp T2 of the second current-frame data includes:
if the first current-frame data and the second current-frame data do not carry timestamps, converting the acquisition time of the first current-frame data and the acquisition time of the second current-frame data onto the same time axis to obtain T1 and T2.
In one embodiment, before acquiring the first current-frame data of the target scene collected by the first sensor and the second current-frame data of the target scene collected by the at least one second sensor, the method further includes:
adjusting the sampling frequency of the first sensor to be the same as that of the second sensor, and calibrating the extrinsic parameter information between the first sensor and the second sensor.
In one embodiment, the first sensor is a camera device and the second sensor is a lidar;
calibrating the extrinsic parameter information between the first sensor and the second sensor then includes:
adjusting the relative pose information between the camera device and the lidar to target pose information, and obtaining calibrated extrinsic parameter information of the camera device according to a preset calibration algorithm; and
calibrating the extrinsic parameter information of the camera device according to the calibrated extrinsic parameter information.
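For context, a hedged sketch of how calibrated extrinsics are typically used downstream (the patent does not spell this step out; the matrix names R, t, and K are assumptions): once the camera intrinsics K and the lidar-to-camera extrinsics (rotation R, translation t) are known, lidar points can be projected into the image plane, which is what the 3D-to-2D mapping steps below rely on.

```python
import numpy as np

def project_lidar_to_image(points_xyz, R, t, K):
    """Project Nx3 lidar points into pixel coordinates.

    R (3x3), t (3,): calibrated lidar-to-camera extrinsic parameters.
    K (3x3): camera intrinsic matrix.
    Returns Mx2 pixel coordinates for the points in front of the camera.
    """
    cam = points_xyz @ R.T + t        # lidar frame -> camera frame
    cam = cam[cam[:, 2] > 0]          # keep only points in front of the camera
    uvw = cam @ K.T                   # camera frame -> homogeneous image plane
    return uvw[:, :2] / uvw[:, 2:3]   # perspective division -> (u, v) pixels
```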
In one embodiment, the first current-frame data is pixel data collected by the camera device, the second current-frame data is point cloud data collected by the lidar, and the measured feature information includes a measured three-dimensional detection box and a measured trajectory feature;
determining, according to the first current-frame data and the second current-frame data, the measured feature information of each candidate detection object in the current frame then includes:
acquiring the two-dimensional detection box of each detection object in the pixel data and the measured three-dimensional detection box of each detection object in the point cloud data;
matching the two-dimensional detection box of each detection object in the pixel data with the measured three-dimensional detection box of each detection object in the point cloud data to determine the candidate detection objects; and
determining the measured trajectory feature of each candidate detection object according to the two-dimensional detection box and feature information of each candidate detection object in the pixel data and the measured three-dimensional detection box and feature information of each detection object in the point cloud data.
In one embodiment, matching the two-dimensional detection box of each detection object in the pixel data with the measured three-dimensional detection box of each detection object in the point cloud data to determine the candidate detection objects includes:
mapping each measured three-dimensional detection box to a corresponding two-dimensional mapped detection box; and
acquiring the IoU between each two-dimensional detection box and each two-dimensional mapped detection box, and determining the detection objects whose IoU is greater than the IoU threshold as candidate detection objects.
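A minimal sketch of the 2D IoU used for this association (the corner-coordinate box format is an assumption):

```python
def iou_2d(box_a, box_b):
    """IoU of two axis-aligned 2D boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A camera detection box and a projected lidar box whose IoU exceeds the threshold are then treated as the same physical object.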
In one embodiment, determining the measured trajectory feature of each candidate detection object according to the two-dimensional detection box and feature information of each candidate detection object in the pixel data and the measured three-dimensional detection box and feature information of each detection object in the point cloud data includes:
comprehensively determining the two-dimensional detection box and feature information of each candidate detection object in the pixel data as the two-dimensional trajectory feature of that candidate detection object, and comprehensively determining the measured three-dimensional detection box and feature information of each candidate detection object in the point cloud data as the three-dimensional trajectory feature of that candidate detection object;
converting the three-dimensional trajectory feature of each candidate detection object into a corresponding two-dimensional mapped trajectory feature; and
extracting the measured trajectory feature of each candidate detection object from the fused data of each two-dimensional mapped trajectory feature and each two-dimensional trajectory feature.
In one embodiment, converting the three-dimensional trajectory feature of each candidate detection object into the corresponding two-dimensional mapped trajectory feature includes:
converting the three-dimensional coordinates of the point cloud points in the measured three-dimensional detection box corresponding to each three-dimensional trajectory feature into two-dimensional coordinates;
acquiring the bird's-eye view corresponding to each measured three-dimensional detection box according to the two-dimensional coordinates of each point cloud point and the z-axis coordinate of each point cloud point in the three-dimensional coordinates;
acquiring the intensity map corresponding to each measured three-dimensional detection box according to the two-dimensional coordinates of each point cloud point and the intensity of each point cloud point;
acquiring the density map corresponding to each measured three-dimensional detection box according to the density of the point cloud points in the z-axis direction in the bird's-eye view; and
merging the bird's-eye view, the intensity map, and the density map to obtain the two-dimensional mapped trajectory feature corresponding to each three-dimensional trajectory feature.
In one embodiment, acquiring the bird's-eye view corresponding to each measured three-dimensional detection box according to the two-dimensional coordinates of each point cloud point and the z-axis coordinate of each point cloud point in the three-dimensional coordinates includes:
normalizing the z-axis coordinate of each point cloud point in the three-dimensional coordinates, and determining the normalized z-axis coordinate as the pixel value of that point cloud point; and
taking the pixel value corresponding to each point cloud point as the pixel value at its two-dimensional coordinate position, to obtain the bird's-eye view corresponding to each measured three-dimensional detection box.
In one embodiment, acquiring the intensity map corresponding to each measured three-dimensional detection box according to the two-dimensional coordinates of each point cloud point and the intensity of each point cloud point includes:
normalizing the intensity of each point cloud point, and determining the normalized intensity as the pixel value of that point cloud point; and
taking the pixel value corresponding to each point cloud point as the pixel value at its two-dimensional coordinate position, to obtain the intensity map corresponding to each measured three-dimensional detection box.
In one embodiment, acquiring the density map corresponding to each measured three-dimensional detection box according to the density of the point cloud points in the z-axis direction in the bird's-eye view includes:
determining the density of the point cloud points at each position in the z-axis direction according to the number of point cloud points in the z-axis direction at the two-dimensional coordinate position of each point cloud point, the maximum number of point cloud points over all coordinate positions, and the minimum number of point cloud points over all coordinate positions; and
taking the density of the point cloud points in the z-axis direction at each coordinate position as the pixel value at that two-dimensional coordinate position, to obtain the density map corresponding to each measured three-dimensional detection box.
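Tying the last three embodiments together, the following sketch rasterizes the points inside one measured 3D detection box into the three single-channel maps described above; the grid resolution and the use of min-max normalization throughout are assumptions consistent with the text, not values fixed by the patent:

```python
import numpy as np

def build_bev_maps(points, grid=64):
    """points: Nx4 array of (x, y, z, intensity) inside one 3D detection box.
    Returns (bev_map, intensity_map, density_map), each of shape grid x grid."""
    def norm(v):
        rng = v.max() - v.min()
        return (v - v.min()) / rng if rng > 0 else np.zeros_like(v)

    # Discretize (x, y) into grid cells: the 2D coordinate of each point.
    col = np.clip((norm(points[:, 0]) * (grid - 1)).astype(int), 0, grid - 1)
    row = np.clip((norm(points[:, 1]) * (grid - 1)).astype(int), 0, grid - 1)

    bev = np.zeros((grid, grid))     # normalized z as the pixel value
    inten = np.zeros((grid, grid))   # normalized intensity as the pixel value
    count = np.zeros((grid, grid))   # number of points per cell along z
    for r, c, z, i in zip(row, col, norm(points[:, 2]), norm(points[:, 3])):
        bev[r, c] = max(bev[r, c], z)
        inten[r, c] = max(inten[r, c], i)
        count[r, c] += 1

    # Density: per-cell point count, min-max normalized over all positions.
    density = norm(count)
    return bev, inten, density
```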
In one embodiment, extracting the measured trajectory feature of each candidate detection object from the fused data of each two-dimensional mapped trajectory feature and each two-dimensional trajectory feature includes:
compressing each two-dimensional trajectory feature and each two-dimensional mapped trajectory feature to the same scale and splicing them to obtain a fused data matrix; and
determining the feature information extracted from the fused data matrix as the measured trajectory feature of each candidate detection object.
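An illustrative sketch of this fusion step (the common size, the nearest-neighbor resize, and stacking as the splicing operation are assumptions; the patent does not fix a particular feature extractor):

```python
import numpy as np

def fuse_features(feat_2d, feat_mapped, size=(64, 64)):
    """Compress both feature maps to the same scale and splice them into one
    fused data matrix, from which the measured trajectory feature is extracted."""
    def resize(a, shape):
        # Nearest-neighbor resize, kept dependency-free for illustration.
        rows = np.linspace(0, a.shape[0] - 1, shape[0]).astype(int)
        cols = np.linspace(0, a.shape[1] - 1, shape[1]).astype(int)
        return a[np.ix_(rows, cols)]

    fused = np.stack([resize(feat_2d, size), resize(feat_mapped, size)])
    return fused  # e.g., fed to a CNN or flattened to give the trajectory feature
```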
In one embodiment, predicting, according to the historical detection box of each target to be tracked in the previous frame, the predicted detection box of each target to be tracked in the current frame includes:
predicting, through a preset tracking algorithm model and according to the historical three-dimensional detection box of each target to be tracked in the previous frame, the predicted three-dimensional detection box of each target to be tracked in the current frame, where the tracking algorithm model is constructed based on the state space equation of a uniformly accelerated motion state.
In a second aspect, an embodiment of the present application provides a target tracking apparatus, the apparatus including:
an acquisition module, configured to acquire first current-frame data of a target scene collected by a first sensor and second current-frame data of the target scene collected by at least one second sensor, the first sensor and the second sensor being sensors of different types;
a feature acquisition module, configured to acquire, according to the first current-frame data and the second current-frame data, measured feature information of each candidate detection object in the current frame, where the candidate detection objects are the detection objects for which the detection objects in the first current-frame data and the detection objects in the second current-frame data are successfully associated and matched, and the measured feature information represents the inherent characteristic information of each candidate detection object;
a prediction module, configured to predict, according to the historical detection box of each target to be tracked in the previous frame, the predicted detection box of each target to be tracked in the current frame; and
a tracking module, configured to track each candidate detection object according to the measured feature information of each candidate detection object in the current frame and the predicted detection box of each target to be tracked in the current frame.
In a third aspect, an embodiment of the present application provides a computer device including a memory and a processor, where the memory stores a computer program, and the processor, when executing the computer program, implements the steps of any one of the methods provided in the embodiments of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of any one of the methods provided in the embodiments of the first aspect.
In the target tracking method, apparatus, computer device, and storage medium provided by the embodiments of the present application, first current-frame data of a target scene collected by a first sensor and second current-frame data of the target scene collected by at least one second sensor of a different type are acquired; the measured feature information, in the current frame, of each candidate detection object determined by successfully associating and matching the detection objects in the first current-frame data with those in the second current-frame data is acquired; and each candidate detection object is then tracked according to its measured feature information in the current frame and the predicted detection box of each target to be tracked in the current frame. In this method, when the detection objects in the current frame are tracked, the measured feature information of each detection object is determined from data collected by two or more different types of sensors; that is, data from multiple sensor types are combined to determine the measured feature information of the detection objects in each frame, which can accurately and completely reflect their features. Thus, when the measured feature information of each candidate detection object is matched against the accurate data of each target to be tracked, the target to which each detection object belongs can be determined accurately, and target tracking is effectively completed for each frame. In addition, the candidate detection objects are screened out according to the successful association and matching between the detection objects in the first current-frame data and those in the second current-frame data, which ensures the consistency of the detection objects across the data collected by the different types of sensors.
Brief Description of the Drawings
FIG. 1 is a diagram of an application environment for target tracking provided by an embodiment;
FIG. 1a shows the positional relationship between a lidar and a camera device in an embodiment;
FIG. 1b is a diagram of the internal structure of a computer device in an embodiment;
FIG. 2 is a schematic flowchart of a target tracking method provided by an embodiment;
FIG. 3 is a schematic flowchart of a target tracking method provided by another embodiment;
FIG. 4 is a schematic flowchart of a target tracking method provided by another embodiment;
FIG. 5 is a schematic flowchart of a target tracking method provided by another embodiment;
FIG. 6 is a schematic flowchart of a target tracking method provided by another embodiment;
FIG. 7 is a schematic flowchart of a target tracking method provided by another embodiment;
FIG. 8 is a schematic flowchart of a target tracking method provided by another embodiment;
FIG. 9 is a schematic flowchart of a target tracking method provided by another embodiment;
FIG. 10 is a schematic flowchart of a target tracking method provided by another embodiment;
FIG. 11 is a schematic flowchart of a target tracking method provided by another embodiment;
FIG. 12 is a flowchart of a target tracking method provided by another embodiment; and
FIG. 13 is a structural block diagram of a target tracking apparatus provided by an embodiment.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only intended to explain the present application and are not intended to limit it.
The target tracking method provided by the present application can be applied in the application environment shown in FIG. 1, which includes a lidar 01, a camera device 02, and a computer device 03, all three of which can communicate with one another. The lidar 01 includes, but is not limited to, pulse radar, continuous-wave radar, meter-wave radar, decimeter-wave radar, and centimeter-wave radar, including 8-line, 16-line, 24-line, 32-line, 64-line, and 128-line lidar. The camera device 02 includes, but is not limited to, professional video cameras, CCD cameras, network cameras, portable cameras, black-and-white cameras, color cameras, infrared cameras, X-ray cameras, and covert cameras. The computer device 03 includes, but is not limited to, servers and various terminals such as personal computers, laptops, smartphones, tablets, and portable wearable devices; cameras include bullet, dome, and ball cameras.
The positions of the lidar 01 and the camera device 02 relative to each other are fixed. For installation, the lidar and the camera may be mounted on a fixed roadside pole, with no restriction on position; either an upright pole or a crossbar will do, as in the installation of the lidar 01 and camera device 02 shown in FIG. 1a. FIG. 1b provides a diagram of the internal structure of a computer device 03, which includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database; the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device stores data related to target tracking. The network interface of the computer device communicates with external terminals over a network connection. When executed by the processor, the computer program implements a target tracking method.
The embodiments of the present application provide a target tracking method, apparatus, computer device, and storage medium that can improve the accuracy of target tracking results. The technical solution of the present application, and how it solves the above technical problems, are described in detail below through embodiments with reference to the accompanying drawings. The following specific embodiments may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments. It should be noted that, for the target tracking method provided by the present application, the execution subject of FIGS. 2 to 12 is a computer device; the execution subject may also be a target tracking apparatus, which may be implemented as part or all of a computer device by software, by hardware, or by a combination of the two.
To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application.
In one embodiment, FIG. 2 provides a target tracking method. This embodiment concerns the specific process in which a computer device, based on data of the same scene collected by two or more different types of sensors, performs association matching of the detection objects across the data collected by the different sensor types, performs matching analysis between the predicted data and the measured data of the successfully associated detection objects, assigns the successfully matched detection objects in the analysis results to the corresponding targets to be tracked, and then tracks those targets. As shown in FIG. 2, the method includes:
S101: acquiring first current-frame data of a target scene collected by a first sensor and second current-frame data of the target scene collected by at least one second sensor, the first sensor and the second sensor being sensors of different types.
The target scene refers to the scene in which the target to be tracked is located; for example, if the target is a vehicle traveling on a road, a certain stretch of that road constitutes the target scene.
The first sensor includes, but is not limited to, a camera device or a lidar; likewise, the second sensor includes, but is not limited to, a camera device or a lidar. The first sensor and the second sensor are, however, sensors of different types; for example, if the first sensor is a lidar, the second sensor is a camera device. See the installation of the lidar and camera device shown in FIG. 1a.
The current-frame data of the target scene collected by the first sensor is the first current-frame data. Taking the first sensor being a lidar as an example, the first current-frame data, i.e., the currently collected data of the target scene, is three-dimensional point cloud data; for example, if the target is a vehicle traveling on a road, a roadside lidar scans a certain range of the road to obtain the three-dimensional point cloud data of that spatial scene. Similarly, if the second sensor is a camera device, the second current-frame data is the two-dimensional pixel data of the current frame of the target scene collected by the camera device, such as video data of the target scene.
S102: acquiring, according to the first current-frame data and the second current-frame data, measured feature information of each candidate detection object in the current frame, where the candidate detection objects are the detection objects for which the detection objects in the first current-frame data and the detection objects in the second current-frame data are successfully associated and matched, and the measured feature information represents the inherent characteristic information of each candidate detection object.
A candidate detection object is a detection object successfully associated and matched between the first current-frame data and the second current-frame data. For example, if there are five detection objects in the first current-frame data and five in the second current-frame data, but only four of them can be associated and matched, those four detection objects are the candidate detection objects.
In practical applications, after the first current-frame data and the second current-frame data are acquired, the detection objects in the two sets of data are first associated and matched to obtain the candidate detection objects, and the measured feature information of each candidate detection object for the current frame is then acquired from the first current-frame data and the second current-frame data.
In addition, when the computer device determines the measured three-dimensional detection box and measured trajectory feature of a candidate detection object for the current frame, the first current-frame data and the second current-frame data must be valid frames. Optionally, a valid frame means that the acquisition times of the first current-frame data and the second current-frame data are synchronized, for example, that the interval between the two acquisition times is less than a preset threshold.
The measured feature information represents the inherent characteristic information of each candidate detection object. Optionally, the measured feature information includes a measured three-dimensional detection box and a measured trajectory feature, which refer to the three-dimensional detection box and trajectory feature of each target calculated from the actual information in the first current-frame data and the second current-frame data. The three-dimensional detection box is the detection box of each candidate detection object in the point cloud, and the trajectory feature may be a feature that reflects various kinds of information about the detection object, such as a histogram of oriented gradients feature. For example, the computer device comprehensively determines the trajectory feature of each candidate detection object from the position of its detection box in the first and second current-frame data and from the color and texture information within the box, and takes the result as the measured trajectory feature of that candidate detection object for the current frame. Of course, in practical applications the measured feature information may also include the two-dimensional detection box and the two-dimensional trajectory feature of the candidate detection object, which the embodiments of the present application do not limit.
S103: predicting, according to the historical detection box of each target to be tracked in the previous frame, the predicted detection box of each target to be tracked in the current frame.
For each target to be tracked, the positions of its motion trajectory in consecutive frames are correlated, so prediction methods can estimate the target's future trajectory from the trajectory it has already traced. What needs to be predicted now is the predicted detection box of each target to be tracked in the current frame, and since the feature information (including the detection boxes and trajectory features) of each target to be tracked in the previous frame is already known, the predicted detection box of each target in the current frame can be predicted from its historical detection box in the previous frame. In practical applications, the detection box may be a two-dimensional or three-dimensional detection box, which the embodiments of the present application do not limit; for example, the predicted three-dimensional detection box of each target to be tracked in the current frame is predicted from its historical three-dimensional detection box in the previous frame, where the historical three-dimensional detection box refers to the known three-dimensional detection box that the target to be tracked has already produced in the previous frame. For example, the computer device may use a preset Kalman filter to predict the three-dimensional detection box of each target to be tracked in the current frame from its historical three-dimensional detection box in the previous frame.
S104: tracking each candidate detection object according to the measured feature information of each candidate detection object in the current frame and the predicted detection box of each target to be tracked in the current frame.
Since the historical detection box of each target to be tracked in the previous frame is already known, the predicted detection box of each target in the current frame, predicted with the historical detection boxes of the previous frame as the reference, can serve as the basis for identifying each target to be tracked in the current frame. Taking the three-dimensional detection box as an example, by judging whether the measured three-dimensional detection box of each candidate detection object in the current frame matches the predicted three-dimensional detection box of each target to be tracked in the current frame, it can be determined which target to be tracked each candidate detection object belongs to, so that each candidate detection object can be tracked. It should be noted that, since there are multiple targets to be tracked in every frame of data, it is necessary to determine which target each detection object in the current frame belongs to; once the target to which each detection object in the current frame belongs has been determined, the trajectory of each target to be tracked in the current frame can in turn be determined. Therefore, in all embodiments of the present application, objects still in the tracking process (not yet successfully tracked) are called detection objects, and successfully tracked ones are called targets to be tracked (or targets); this is not repeated below.
In the target tracking method provided by this embodiment, first current-frame data of a target scene collected by a first sensor and second current-frame data of the target scene collected by at least one second sensor of a different type are acquired; the measured feature information, in the current frame, of each candidate detection object determined by successfully associating and matching the detection objects in the first current-frame data with those in the second current-frame data is acquired; and each candidate detection object is then tracked according to its measured feature information in the current frame and the predicted detection box of each target to be tracked in the current frame. In this method, when the detection objects in the current frame are tracked, the measured feature information of each detection object is determined from data collected by two or more different types of sensors; that is, data from multiple sensor types are combined to determine the measured feature information of the detection objects in each frame, which can accurately and completely reflect their features. Thus, when the measured feature information of each candidate detection object is matched against the accurate data of each target to be tracked, the target to which each detection object belongs can be determined accurately, and target tracking is effectively completed for each frame. In addition, the candidate detection objects are screened out according to the successful association and matching between the detection objects in the first current-frame data and those in the second current-frame data, which ensures the consistency of the detection objects across the data collected by the different types of sensors.
An embodiment is provided to describe the process, in step S103 above, of predicting the predicted detection box of each target to be tracked in the current frame according to its feature information in the previous frame. This embodiment is described taking the three-dimensional detection box as an example and includes: predicting, through a preset tracking algorithm model and according to the historical three-dimensional detection box of each target to be tracked in the previous frame, the predicted three-dimensional detection box of each target to be tracked in the current frame, where the tracking algorithm model is constructed based on the state space equation of a uniformly accelerated motion state.
The state space equation is established according to the different motion states of the target in space and can express how the target's trajectory changes at different moments, its motion information, and so on. A tracking algorithm model built on this state space equation, for example a Kalman filter, can more closely follow the true information of the target as it moves through space. After the tracking algorithm model has been constructed, it predicts the three-dimensional detection box of each target to be tracked in the current frame from that target's historical three-dimensional detection box, which makes the predicted three-dimensional detection box of each target more accurate.
Since the state space equation of the uniformly accelerated motion state couples the target's motion in space with time and takes acceleration into account, the trajectory prediction error can be made smaller, improving tracking when the speed changes significantly. Therefore, a tracking algorithm model built on the state space equation of uniformly accelerated motion predicts the three-dimensional detection box of each target to be tracked in the current frame from its historical three-dimensional detection box in the previous frame very accurately.
As an example, take each target detection box at each moment to be a rectangular box; the detection result of each frame of data is then a target box (for example, a three-dimensional target box for the point cloud and a two-dimensional target box for the image). Regarding the target's motion in the image as uniformly accelerated motion, the following state space equation (1) is constructed to reflect how the target's information changes as it moves with uniform acceleration through the image:

$$\begin{cases} x' = x + \dot{x}\,t + \tfrac{1}{2}\ddot{x}\,t^2 \\ y' = y + \dot{y}\,t + \tfrac{1}{2}\ddot{y}\,t^2 \\ \alpha' = \alpha + \dot{\alpha}\,t \\ h' = h + \dot{h}\,t \end{cases} \tag{1}$$

In the above equation, $x'$ and $y'$ denote the coordinates of the center point of the target's detection box in the current frame image on the image x-axis and y-axis; $x$ and $y$ denote the coordinates of the center point of the target's detection box on the image x-axis and y-axis before time $t$; $\dot{x}$ and $\dot{y}$ denote the velocities of the same target in the image x-axis and y-axis directions before time $t$; $\ddot{x}$ and $\ddot{y}$ denote the accelerations of the same target in the image x-axis and y-axis directions before time $t$; $\alpha'$ and $h'$ denote the aspect ratio and height of the target's trajectory in the current frame image; $\alpha$ and $h$ denote the aspect ratio and height of the target's trajectory before time $t$; and $\dot{\alpha}$ and $\dot{h}$ denote the rates of change of the same target's aspect ratio and height before time $t$.
In the above, $t$ denotes the moment at which tracking failed; the coordinates of the detection box center before $t$ are taken from the last image in the collected data in which tracking was completed, and the velocity and acceleration refer to the average velocity and average acceleration over a period before time $t$. Based on the above state space equation (1), the detection box of the target in the image at each moment (each frame of data collected by each sensor) can be predicted, enabling real-time position tracking of each target. Since the state space equation (1) is coupled with time and takes acceleration into account, the influence of acceleration is considered in the trajectory prediction stage, which makes the trajectory prediction error smaller and improves tracking when the speed changes significantly.
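For illustration only (the state layout and the pure prediction step are assumptions consistent with equation (1), not the patent's prescribed filter), a constant-acceleration Kalman-style prediction could look like this:

```python
import numpy as np

def predict_state(state, dt):
    """One prediction step of the constant-acceleration motion model.

    state: 10-vector (x, y, a, h, vx, vy, va, vh, ax, ay), where a is the
    aspect ratio and h the height; a and h follow a constant-velocity model.
    """
    F = np.eye(10)
    for i in range(4):
        F[i, i + 4] = dt                 # position terms += velocity * dt
    F[0, 8] = F[1, 9] = 0.5 * dt ** 2    # x, y += 0.5 * acceleration * dt^2
    F[4, 8] = F[5, 9] = dt               # vx, vy += acceleration * dt
    return F @ state
```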
On the basis of the above embodiments, an embodiment of the present application further provides a target tracking method, which concerns the specific process in which the computer device tracks each candidate detection object according to the measured feature information of each candidate detection object in the current frame and the predicted detection box of each target to be tracked in the current frame. This embodiment is described taking as an example the case where the measured feature information includes the measured three-dimensional detection box and the measured trajectory feature, and the predicted three-dimensional detection box of each target to be tracked in the current frame is predicted from its historical three-dimensional detection box in the previous frame. As shown in FIG. 3, step S104 above includes:
S201: acquiring the IoU between the measured three-dimensional detection box of each candidate detection object in the current frame and the three-dimensional detection box predicted for each target to be tracked in the current frame.
This embodiment describes the matching process between the measured three-dimensional detection boxes of the candidate detection objects in the current frame and the predicted three-dimensional detection boxes of the targets to be tracked in the current frame; computing the IoU between the two can essentially be regarded as using the intersection-over-union ratio to measure the similarity between the measured three-dimensional detection box of each candidate detection object in the current frame and the predicted three-dimensional detection box of each target to be tracked in the current frame.
For example, if the measured three-dimensional detection box of a candidate detection object in the current frame is A1 and the predicted three-dimensional detection box of a target to be tracked in the current frame is B1, the similarity between A1 and B1 is determined by computing their IoU, namely the ratio of the intersection region of A1 and B1 to their union region.
Once the IoU is determined, an IoU above the preset IoU threshold indicates that the two are very similar and can be deemed a successful match, whereas an IoU below the preset IoU threshold indicates a large difference between the two and a failed match.
S202: acquiring the similarity between the measured trajectory features of the remaining detection objects and the historical trajectory features of each target to be tracked in the previous frame, where the remaining detection objects are the candidate detection objects whose IoU is less than the preset IoU threshold.
After the IoU between the measured three-dimensional detection box of each candidate detection object and the predicted three-dimensional detection box of each target to be tracked has been computed as described above, the candidate detection objects whose IoU is less than the preset IoU threshold are called the remaining detection objects.
Each target to be tracked is tracked frame by frame through the consecutive frames of the video. When the targets in the current frame are being tracked, the trajectory features of the targets already tracked in the previous frame are known; the trajectory features of the targets in the previous frame are called historical trajectory features. For the remaining detection objects, whether a remaining detection object matches a target is determined by computing the similarity between its measured trajectory feature in the current frame and each target's historical trajectory feature in the previous frame, for example by the cosine similarity between the two, or by a distance metric such as the Euclidean distance, which this embodiment does not limit. The trajectory feature may, for example, be a histogram of oriented gradients (HOG) feature or another feature, which this embodiment likewise does not limit.
Once the similarity is determined, a similarity above the preset similarity threshold indicates that the two are very similar and can be deemed a successful match, whereas a similarity below the preset similarity threshold indicates a large difference between the two and a failed match.
S203: tracking each candidate detection object whose IoU is greater than the IoU threshold and each remaining detection object whose similarity is greater than the preset similarity threshold.
An IoU greater than the preset IoU threshold and a similarity greater than the preset similarity threshold both indicate a successful match. For every candidate detection object successfully matched by IoU, the identifier of the target to be tracked corresponding to that candidate detection object is the identifier of the known target associated with the successfully matched predicted three-dimensional detection box; for a similarity greater than the preset similarity threshold, the identifier of the target corresponding to that candidate detection object is the identifier of the known target associated with the successfully matched historical trajectory feature. After the identifiers of the successfully matched candidate detection objects have been determined, each candidate detection object is associated with its corresponding identifier; after the association, which target to be tracked each candidate detection object is can be determined from the associated identifier, thereby completing the labeling and tracking of the successfully matched candidate detection objects.
Of course, in practical applications, the successfully matched candidate detection objects may also be determined by the IoU method alone, or by the similarity method alone, which is not limited in the embodiments of the present application.

Optionally, after the successfully matched candidate detection objects have been labeled and tracked, the detection frame and trajectory information of the target to be tracked to which each candidate detection object belongs may be updated in the current frame data according to the measured three-dimensional detection frame and measured trajectory feature of that candidate detection object. It can be understood that the measured three-dimensional detection frame and measured trajectory feature of a target in the updated current frame data can then serve, when the next frame is processed, as the known historical three-dimensional detection frame and historical trajectory feature of the previous frame.

In the target tracking method provided by this embodiment, the measured three-dimensional detection frame of each candidate detection object in the current frame is first matched, based on IoU, against the predicted three-dimensional detection frame of each target to be tracked in the current frame; the measured trajectory features of the candidate detection objects that were not matched are then matched by similarity against the historical trajectory features of the targets in the previous frame. The whole process mixes different matching methods, that is, it matches from different dimensions, so each candidate detection object in the current frame can effectively be matched to its corresponding target to be tracked. This avoids losing track of a target that fails to match because it is occluded, so that every target can be tracked stably throughout the tracking process. Moreover, since a three-dimensional detection frame reflects the physical form of a target more closely and accurately, performing IoU matching on the three-dimensional detection frames of the detection objects in this embodiment improves the accuracy of target matching and further increases the stability of target tracking.
As mentioned above, the data of any frame (at any moment) acquired by the first sensor and the second sensor needs to be a valid frame, and a valid frame means that the first current frame data and the second current frame data are time-synchronized; it is therefore necessary to determine, for each frame, whether the first current frame data and the second current frame data are synchronized in time. On this basis, an embodiment is provided for description. As shown in FIG. 4, this embodiment includes:

S301: For the data collected by the sensors in any frame, obtain the timestamp T1 of the first current frame data and the timestamp T2 of the second current frame data.

Specifically, the determination can be made by obtaining the timestamp T1 of the first current frame data and the timestamp T2 of the second current frame data. In practical applications, when the first sensor and the second sensor collect data, the collected data may or may not carry a timestamp. For the case where the first current frame data and the second current frame data carry timestamps, for example, both carry the time at which the sensor's GPS module collected the data, the computer device can directly obtain the timestamp T1 of the first current frame data and the timestamp T2 of the second current frame data.

Optionally, for the case where the first current frame data and the second current frame data do not carry timestamps, the collection time of the first current frame data and the collection time of the second current frame data are converted onto the same time axis, after which the timestamp T1 of the first current frame data and the timestamp T2 of the second current frame data are obtained respectively.

Specifically, if a sensor has no GPS module and the precise time given by a GPS module cannot be obtained, the system time of each sensor is converted into the system time of the computer device (i.e., the processing platform). Once unified under the system time of the processing platform, the timestamps of the first current frame data and the second current frame data can be judged accurately. For example, at the same moment, the system time of the processing platform is 4:00, the system time of the first sensor is 4:03, and the system time of the second sensor is 4:02. After conversion, the time of the first current frame data collected by the first sensor at the current moment is converted to 4:00 (timestamp T1), and the time of the second current frame data collected by the second sensor is converted to 4:00 (timestamp T2), thereby obtaining the timestamp T1 of the first current frame data and the timestamp T2 of the second current frame data.
S302: If the interval between T1 and T2 is smaller than the preset interval threshold, determine that the first current frame data and the second current frame data are the data collected for the current frame; if the interval between T1 and T2 is greater than the preset threshold, discard the first current frame data and the second current frame data, and re-obtain the timestamp T1 of the first current frame data and the timestamp T2 of the second current frame data of the next frame.

After T1 and T2 are obtained, the interval between T1 and T2 (the time interval is the absolute value |T1 - T2|) is compared with the preset interval threshold (for example, 10 ms). If the interval between T1 and T2 is smaller than the preset interval threshold, it can be determined that the first current frame data and the second current frame data are time-synchronized and form a valid frame, so they can be directly taken as the data of the current moment (current frame). If the interval between T1 and T2 is greater than the preset threshold, the first current frame data and the second current frame data are not time-synchronized and form an invalid frame; in that case both are discarded, the timestamp T1 of the first current frame data and the timestamp T2 of the second current frame data of the next frame are re-obtained, and the judgment process is performed again. For example, the next frame of data is sought at a certain frame rate (such as 10 Hz).
Optionally, if the interval between T1 and T2 is greater than the preset interval threshold, the smaller of T1 and T2 is discarded and the larger value retained; the timestamp T of the next frame of data collected by the sensor with the smaller value is then obtained, and the new T is compared with the retained larger value to determine whether the interval between the two is smaller than the preset interval threshold. If so, the new T and the larger value are kept as the synchronized pair; otherwise, the comparison continues with the next new T against the retained larger value, until the interval between the latest T and the retained larger value is smaller than the preset interval threshold. For example, suppose the timestamp of the data collected by the lidar is T1 and the timestamp of the data collected by the camera device is T2, the sensors run at a frame rate of 10 Hz, and a preset interval threshold (for example, 10 ms) is used. If initially T1 > T2, T2 is discarded, and the new T2 collected by the camera device is compared with T1; if the interval between the new T2 and T1 is smaller than the threshold, the new T2 and T1 are kept; but if the interval between the new T2 and T1 is greater than the threshold, the comparison continues with the interval between T1 and the T2 most recently obtained by the camera device.
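A minimal sketch of this catch-up loop, assuming each sensor exposes a callable that returns the timestamp (in seconds) of its next frame (the callable interface is illustrative):

```python
def synchronize(get_next_t1, get_next_t2, threshold_s=0.01):
    """Advance the lagging sensor until a time-synchronized pair is found.

    get_next_t1 / get_next_t2: callables returning the next frame timestamp
    (in seconds) of each sensor; threshold_s is the preset interval threshold.
    Returns a synchronized (t1, t2) pair.
    """
    t1, t2 = get_next_t1(), get_next_t2()
    while abs(t1 - t2) >= threshold_s:
        # Discard the smaller (older) timestamp and fetch that sensor's next frame.
        if t1 < t2:
            t1 = get_next_t1()
        else:
            t2 = get_next_t2()
    return t1, t2
```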
In this embodiment, whether the data collected by the sensors is synchronized is judged from the timestamps of the collected data; unsynchronized data is discarded and only synchronized data is retained, so that the data used for target tracking is more accurate.

In addition, before the above acquisition of the first current frame data of the target scene collected by the first sensor and the second current frame data of the target scene collected by at least one second sensor, some preprocessing work may be performed to further ensure the synchronization of the data collected by the sensors. Optionally, in one embodiment, the method further includes: adjusting the sampling frequency of the first sensor to be the same as that of the second sensor, and calibrating the extrinsic parameter information between the first sensor and the second sensor.

The preprocessing preparation includes adjusting the sampling frequency of each sensor and calibrating the extrinsic parameter information of each sensor.

The sampling frequencies of the sensors may be adjusted to be consistent when the sensors are installed; alternatively, an adjustment program may be implanted in the computer device in advance, and the adjustment program (which may carry a specified sampling frequency) may be sent to each sensor periodically, instructing each sensor to adjust its own sampling frequency.
Optionally, taking the first sensor as a camera device and the second sensor as a lidar, calibrating the extrinsic parameter information between the first sensor and the second sensor includes: adjusting the relative pose information between the camera device and the lidar to target pose information, and obtaining the calibrated extrinsic parameter information of the camera device according to a preset calibration algorithm; then calibrating the extrinsic parameter information of the camera device according to the calibrated extrinsic parameter information.

The extrinsic parameter information includes pose information (relative position and relative angle) and the extrinsic parameter information of the camera device.

In practical applications, in order for the lidar and the camera device to obtain comprehensive and effective information about the surrounding environment, there must be a suitable relative position and relative angle between the two when they are installed. For example, see the installation angles of the lidar and the camera device shown in FIG. 1a; the relative position and relative angle between the lidar and the camera device must be ensured so that the surrounding environment information can be obtained comprehensively and effectively.

For example, target pose information is obtained, which includes a preset target relative position and relative angle; the computer device then instructs the camera device and the lidar to adjust their own poses according to the target relative position and target relative angle.
Based on the lidar and camera device whose pose information has been adjusted, the computer device calibrates the extrinsic parameter information of the camera device; for example, the extrinsic parameter information of the camera device can be calibrated according to the mapping relationship between the point cloud and the image. The mapping relationship between the point cloud and the image expresses the relationship between the world coordinate system (the coordinate system used by the lidar) and the pixel coordinate system (the coordinate system used by the pixels in the image of the camera device), which in its standard pinhole form can be written as

$$Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} R & t \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}$$

where, in this mapping relationship, $K$ is the intrinsic parameter matrix of the camera device, $[R\ \ t]$ is the extrinsic parameter matrix of the camera device, $[X_w, Y_w, Z_w, 1]^T$ is the coordinate matrix of a point in the world coordinate system, and $[u, v, 1]^T$ is the coordinate matrix of the corresponding point in the pixel coordinate system.
Then, from the point cloud data actually collected by the lidar and the pixel data collected by the camera device, the coordinate matrices of points in the world coordinate system and the coordinate matrices of the corresponding points in the pixel coordinate system can be determined. The intrinsic parameter information of the camera device can be obtained directly, so the extrinsic parameter matrix of the camera device can be solved from this mapping relationship; the solved extrinsic parameter matrix is then taken as the new extrinsic parameter information of the camera device, which completes the calibration of the camera device.
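One possible way to solve the extrinsic matrix from such 3D-2D correspondences is OpenCV's solvePnP; this is a sketch, not the calibration algorithm prescribed by this embodiment, and it assumes the pairing of lidar points with image pixels is given and lens distortion is negligible:

```python
import numpy as np
import cv2

def calibrate_extrinsics(world_pts, pixel_pts, K, dist_coeffs=None):
    """Solve the camera extrinsic matrix [R|t] from the mapping relationship.

    world_pts: (N, 3) lidar points in the world coordinate system
    pixel_pts: (N, 2) corresponding pixel coordinates in the image
    K:         (3, 3) camera intrinsic matrix (obtained directly)
    """
    if dist_coeffs is None:
        dist_coeffs = np.zeros(5)  # assumption: negligible lens distortion
    ok, rvec, tvec = cv2.solvePnP(
        world_pts.astype(np.float64), pixel_pts.astype(np.float64),
        K.astype(np.float64), dist_coeffs)
    assert ok, "PnP failed; check the 3D-2D correspondences"
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> rotation matrix
    return np.hstack([R, tvec])  # 3x4 extrinsic matrix [R|t]
```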
In this embodiment, the extrinsic parameter information of the camera device and the lidar is calibrated through the preprocessing preparation work, so that the camera device and the lidar can obtain comprehensive and effective information about the surrounding environment, thereby making the collected data of the target scene more accurate. In addition to calibrating the extrinsic parameter information of the camera device as described above, the extrinsic parameter information between the camera device and the lidar (relative position and relative angle) can also be calibrated; calibrating this inter-sensor extrinsic information ensures that, when the data collected by the camera device and the lidar undergo spatial conversion, spatial consistency is maintained and the accuracy of the data conversion is improved.

The process of "determining the measured feature information of each candidate detection object in the current frame according to the first current frame data and the second current frame data" in step S102 above is described through the following embodiment. This embodiment again takes the case where the measured feature information includes the measured three-dimensional detection frame and the measured trajectory feature as an example. As shown in FIG. 5, in one embodiment, step S102 includes:
S401: Obtain the two-dimensional detection frame of each detection object in the pixel data, and the measured three-dimensional detection frame of each detection object in the point cloud data.

This embodiment is described taking the first current frame data as the pixel data collected by the camera device, and the second current frame data as the point cloud data collected by the lidar, as an example.

The two-dimensional detection frame of each detection object is obtained from the pixel data collected by the camera device. For example, a preset deep-learning algorithm model, such as a YOLOv3 model, outputs for each detection object the pixel rectangular-frame location, the target confidence, the target classification and other information. The rectangular-frame location reflects the two-dimensional detection frame of each detection object; the target confidence reflects the accuracy of each detection object relative to the target to be tracked; and the target classification information reflects the category of each detection object, for example, whether the target is a vehicle, a person or an animal.

The three-dimensional detection frame (measured three-dimensional detection frame) of each detection object is obtained from the point cloud data. For example, a preset deep-learning algorithm model, such as a SECOND model, outputs for each detection object the location, size, heading angle and other information in the point cloud coordinate system; the location, size and heading angle of each detection object in the point cloud coordinate system reflect the measured three-dimensional detection frame of that detection object. It should be noted that the measured three-dimensional detection frames and the two-dimensional detection frames in all embodiments of this application refer to the actually measured three-dimensional or two-dimensional detection frames of the detection objects; the three-dimensional detection frame is called the measured three-dimensional detection frame to stay consistent with the foregoing description, where this name was used to distinguish it from the predicted three-dimensional detection frame.
S402: Match the two-dimensional detection frame of each detection object in the pixel data with the measured three-dimensional detection frame of each detection object in the point cloud data to determine the candidate detection objects.

After the two-dimensional detection frame of every detection object in the pixel data and the measured three-dimensional detection frame of every detection object in the point cloud data are obtained, the detection objects in the pixel data and the detection objects in the point cloud data are first associated and matched, and the detection objects for which the association matching succeeds are determined as candidate detection objects.

Optionally, as shown in FIG. 6, one embodiment of determining the candidate detection objects includes:

S501: Map each measured three-dimensional detection frame to a corresponding two-dimensional mapped detection frame.

The measured three-dimensional detection frame of each detection object is mapped to a corresponding two-dimensional mapped detection frame. For example, each measured three-dimensional detection frame in the point cloud data may be input, as input data, into a pre-trained conversion network model, whose output is the two-dimensional image corresponding to each measured three-dimensional detection frame. As another example, based on the preset mapping relationship between the three-dimensional coordinate system and the two-dimensional coordinate system, the three-dimensional coordinates of every point cloud point in each measured three-dimensional detection frame are first converted into two-dimensional coordinates, yielding the two-dimensional coordinates of each point cloud point in the measured three-dimensional detection frame; the pixel value corresponding to the point cloud point at each two-dimensional coordinate is then determined, and after the pixel value at each two-dimensional coordinate position is determined, the two-dimensional image corresponding to each measured three-dimensional detection frame is obtained. In this way, the pixel values at the two-dimensional coordinate positions of the points in the resulting two-dimensional detection frame differ from point to point, and the features of the three-dimensional point cloud points are preserved.
S502: Obtain the IoU between each two-dimensional detection frame and each two-dimensional mapped detection frame, and determine the detection objects whose IoU is greater than the IoU threshold as candidate detection objects.

After the two-dimensional mapped detection frame corresponding to the measured three-dimensional detection frame of each detection object is obtained, the IoU between each two-dimensional mapped detection frame and the two-dimensional detection frame of each detection object in the pixel data is obtained, and the detection objects whose IoU is greater than the preset IoU threshold are determined as candidate detection objects. It can be understood that a candidate detection object is essentially a pair of detection frames. It should be noted that the preset IoU threshold here may be the same as, or different from, the preset IoU threshold used for matching in the foregoing embodiment, which is not limited in this embodiment.
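A minimal sketch of this IoU-based association, assuming axis-aligned boxes in (x1, y1, x2, y2) form and an illustrative threshold of 0.5:

```python
def iou_2d(box_a, box_b):
    """IoU of two axis-aligned 2D boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def candidate_pairs(boxes_2d, mapped_boxes_2d, iou_threshold=0.5):
    """Pair image 2D boxes with projected lidar boxes whose IoU clears the threshold."""
    pairs = []
    for i, b2d in enumerate(boxes_2d):
        for j, m2d in enumerate(mapped_boxes_2d):
            if iou_2d(b2d, m2d) > iou_threshold:
                pairs.append((i, j))  # each pair is one candidate detection object
    return pairs
```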
By associating and matching the two-dimensional detection frame of each detection object in the pixel data with the measured three-dimensional detection frame of each detection object in the point cloud data, the candidate detection objects are screened out, which ensures the consistency of the detection objects across the data collected by the different types of sensors.

S403: Determine the measured trajectory feature of each candidate detection object according to the two-dimensional detection frame and feature information of each candidate detection object in the pixel data, and the measured three-dimensional detection frame and feature information of each detection object in the point cloud data.

After the candidate detection objects are determined, the measured trajectory feature of each candidate detection object is further determined. For example, the feature information in the two-dimensional detection frame is extracted from the two-dimensional detection frame of each candidate detection object; for instance, the Histogram of Oriented Gradients (HOG) feature of the two-dimensional detection frame may be extracted as the feature of the corresponding two-dimensional detection frame. The feature information in the measured three-dimensional detection frame is extracted from the measured three-dimensional detection frame of each candidate detection object. The measured trajectory feature of each candidate detection object is then determined comprehensively from the feature information of the two-dimensional detection frame and the feature information of the measured three-dimensional detection frame. The measured three-dimensional detection frame of each candidate detection object is simply the three-dimensional detection frame of that detection object in the point cloud data.

In this embodiment, the candidate detection objects are first determined based on the two-dimensional detection frame of each detection object in the pixel data and the measured three-dimensional detection frame of each detection object in the point cloud data; the measured trajectory feature of each candidate detection object is then determined from the feature information in its two-dimensional detection frame and the feature information in its measured three-dimensional detection frame. The measured trajectory features of the detection objects are thus determined from data collected by two or more different types of sensors; synthesizing the data collected by multiple types of sensors to determine the measured trajectory features of the detection objects in each frame can accurately and completely reflect the characteristics of the detection objects in each frame. In this way, when the measured trajectory features of the candidate detection objects are matched against the historical trajectory features of the targets to be tracked, the target to which each detection object belongs can be determined accurately, effectively completing the target tracking of each frame.
An implementable way of determining the measured trajectory feature of each candidate detection object in step S403 above is provided. As shown in FIG. 7, this embodiment includes:

S601: Comprehensively determine the two-dimensional detection frame and feature information of each candidate detection object in the pixel data as the two-dimensional trajectory feature of that candidate detection object; comprehensively determine the measured three-dimensional detection frame and feature information of each candidate detection object in the point cloud data as the three-dimensional trajectory feature of that candidate detection object.

The feature information extracted from the two-dimensional detection frame of each candidate detection object in the pixel data, together with the two-dimensional detection frame itself, is comprehensively determined as the two-dimensional trajectory feature of that candidate detection object; that is, the two-dimensional trajectory feature can be regarded as the two-dimensional detection frame of the candidate detection object carrying the feature information of the pixels within the frame.

The measured three-dimensional detection frame of each candidate detection object in the point cloud data, together with the feature information extracted from it, is comprehensively determined as the three-dimensional trajectory feature of that candidate detection object; likewise, the three-dimensional trajectory feature can be regarded as the measured three-dimensional detection frame of the candidate detection object carrying the feature information of the point cloud points within the frame.
S602: Convert the three-dimensional trajectory feature of each candidate detection object into a corresponding two-dimensional mapped trajectory feature.

Since, in terms of presentation, the three-dimensional trajectory feature is the measured three-dimensional detection frame, when the three-dimensional trajectory feature of each candidate detection object is converted into the corresponding two-dimensional mapped trajectory feature, the conversion can be performed according to the mapping relationship between the point cloud and the points of the image. Based on this mapping relationship, the three-dimensional coordinates of every point cloud point of the measured three-dimensional detection frame are converted into two-dimensional coordinates, yielding the two-dimensional coordinates of each point cloud point in the measured three-dimensional detection frame; the pixel value corresponding to the point cloud point at each two-dimensional coordinate is then determined, and after the pixel value at each two-dimensional coordinate position is determined, the two-dimensional detection frame corresponding to the measured three-dimensional detection frame is obtained. In this way, the pixel values at the two-dimensional coordinate positions of the points in the resulting two-dimensional detection frame differ from point to point, and the features of the point cloud points in the measured three-dimensional detection frame are preserved; the converted two-dimensional mapped trajectory feature thus includes the detection frame of the three-dimensional trajectory feature and the feature information of the points within it.
S603: Extract the measured trajectory feature of each candidate detection object according to the fused data of each two-dimensional mapped trajectory feature and each two-dimensional trajectory feature. Optionally, after each two-dimensional trajectory feature and each two-dimensional mapped trajectory feature are compressed to the same scale, they are concatenated to obtain a fused data matrix; the feature information extracted from the fused data matrix is determined as the measured trajectory feature of each candidate detection object.

Each two-dimensional mapped trajectory feature is the feature information of a candidate detection object determined from the point cloud data, and each two-dimensional trajectory feature is the feature information of a candidate detection object determined from the pixel data; the feature information distilled from the fusion of these two is the measured trajectory feature of the candidate detection object.

Continuing with the above example, the two-dimensional trajectory feature information of each candidate detection object and each two-dimensional mapped trajectory feature are adjusted to the same scale; for example, the two-dimensional trajectory feature information of each candidate detection object is compressed to a size of N*N*3, and the two-dimensional mapped trajectory feature is scaled to the same size. After the scale adjustment, the two are concatenated to obtain the fused data matrix. The fused data matrix is then input into a convolutional neural network for feature extraction, and the extracted features are the measured trajectory features of the candidate detection objects.
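A minimal sketch of this fusion, assuming both inputs are H*W*3 images, with N = 64 and a deliberately small network whose layer shapes and 128-dimensional output are illustrative rather than the trained model of this embodiment:

```python
import numpy as np
import cv2
import torch
import torch.nn as nn

def fuse_trajectory_features(img_crop, bev_crop, n=64):
    """Resize both 2D feature maps to N*N*3 and concatenate them channel-wise.

    img_crop: H*W*3 patch cut out by the 2D detection frame (pixel data)
    bev_crop: H*W*3 two-dimensional mapped trajectory feature (projected point cloud)
    """
    a = cv2.resize(img_crop, (n, n)).astype(np.float32) / 255.0
    b = cv2.resize(bev_crop, (n, n)).astype(np.float32) / 255.0
    fused = np.concatenate([a, b], axis=-1)          # N*N*6 fused data matrix
    return torch.from_numpy(fused).permute(2, 0, 1)  # CHW tensor for the CNN

# A small CNN that distills the fused matrix into a fixed-length trajectory feature.
feature_net = nn.Sequential(
    nn.Conv2d(6, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 128),  # 128-D measured trajectory feature
)

# usage: feat = feature_net(fuse_trajectory_features(img, bev).unsqueeze(0))
```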
In this embodiment, the three-dimensional trajectory feature of each candidate detection object in the point cloud data is converted into a two-dimensional mapped trajectory feature; the two-dimensional mapped trajectory feature and the two-dimensional trajectory feature of the candidate detection object in the pixel data are then concatenated to form a fused data matrix, and the feature information extracted from the fused data matrix is determined as the measured trajectory feature of the candidate detection object. Determining the measured trajectory features of the detection objects in each frame from multiple dimensions and types of data can accurately and completely reflect the characteristics of the detection objects in each frame; in this way, when the measured trajectory features of the candidate detection objects are matched against the accurate data of the targets to be tracked, the target to which each detection object belongs can be determined accurately, effectively completing the target tracking of each frame.

The process of converting the three-dimensional trajectory feature of each candidate detection object into the corresponding two-dimensional mapped trajectory feature in S602 above is described below with a specific embodiment. As shown in FIG. 8, S602 includes:
S701: Convert the three-dimensional coordinates of the point cloud points in the measured three-dimensional detection frame corresponding to each three-dimensional trajectory feature into two-dimensional coordinates.
The conversion from three-dimensional coordinates to two-dimensional coordinates can be performed based on the mapping relationship between the three-dimensional coordinate system and the two-dimensional coordinate system. For example, the mapping relationship can be the simple translation of formula (1), which shifts all points into a non-negative two-dimensional coordinate system:

$$a_t = a + h, \qquad b_t = b + w \qquad (1)$$

In formula (1), a and b denote the coordinates of a point cloud point in the three-dimensional coordinate system; a_t and b_t denote the coordinates of the point cloud point in the two-dimensional coordinate system after mapping; h denotes the distance from the point cloud boundary to the y-axis; and w denotes the distance from the point cloud boundary to the x-axis, where the x-axis and y-axis are the coordinate axes with the lidar as the origin.
Based on the mapping relationship of formula (1), the three-dimensional coordinates of each point cloud point in the three-dimensional detection frame are converted into two-dimensional coordinates.
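Under this translational reading of formula (1), the projection can be sketched as follows (the integer-pixel rounding is an added assumption for use in the later image-building steps):

```python
import numpy as np

def project_to_bev(points_xyz, h, w):
    """Shift lidar x/y coordinates into the non-negative 2D plane of formula (1).

    points_xyz: (N, 3) array of point cloud points (x, y, z)
    h, w:       distances from the point cloud boundary to the y-axis / x-axis
    Returns (N, 2) integer pixel coordinates (a_t, b_t) and the retained z values.
    """
    a_t = points_xyz[:, 0] + h  # formula (1): a_t = a + h
    b_t = points_xyz[:, 1] + w  # formula (1): b_t = b + w
    coords = np.stack([a_t, b_t], axis=1).astype(np.int32)
    return coords, points_xyz[:, 2]
```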
S702: Obtain the bird's-eye view corresponding to each measured three-dimensional detection frame according to the two-dimensional coordinates of each point cloud point and the z-axis coordinate of each point cloud point in the three-dimensional coordinates; obtain the intensity map corresponding to each measured three-dimensional detection frame according to the two-dimensional coordinates of each point cloud point and the intensity of each point cloud point; obtain the density map corresponding to each measured three-dimensional detection frame according to the density of the point cloud points in the z-axis direction in the bird's-eye view.

In this step, the bird's-eye view, the intensity map and the density map corresponding to each measured three-dimensional detection frame are obtained; all three maps are two-dimensional.

The bird's-eye view corresponding to a measured three-dimensional detection frame is determined according to the two-dimensional coordinates of each point cloud point and the z-axis coordinate of each point cloud point in the three-dimensional coordinates. The two-dimensional coordinates of the point cloud points are obtained by the conversion in the above step, and the two-dimensional coordinates of each point cloud point determine its position on the two-dimensional plane. The z-axis coordinate of a point cloud point is its Z coordinate in the three-dimensional coordinate system of the point cloud data, and this Z coordinate can also be regarded as the height of the point cloud point over the two-dimensional coordinates. Based on the position of each point cloud point of the measured three-dimensional detection frame on the two-dimensional plane and the height of that point cloud point, the bird's-eye view corresponding to each measured three-dimensional detection frame can be obtained.
Optionally, as shown in FIG. 9, the process of obtaining the bird's-eye view corresponding to each measured three-dimensional detection frame includes:

S801: Normalize the z-axis coordinate of each point cloud point in the three-dimensional coordinates, and determine the normalized z-axis coordinate as the pixel value of that point cloud point.

The pixel value of each point cloud point in the bird's-eye view is first determined from its z-axis coordinate in the three-dimensional coordinates. Because pixel values range over 0-255, the z-axis coordinates of the point cloud points are first normalized to between 0 and 255, and the normalized value is the pixel value of each point cloud point in the bird's-eye view.

S802: Taking the pixel value corresponding to each point cloud point as the pixel value of the corresponding two-dimensional coordinate position, obtain the bird's-eye view corresponding to each measured three-dimensional detection frame.

After the pixel value of each point cloud point in the bird's-eye view is obtained, the coordinate position of each point cloud point can be determined from its two-dimensional coordinates, and the pixel value of the corresponding point cloud point is then filled in at the two-dimensional coordinate position to obtain the final bird's-eye view.
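A minimal sketch of building the bird's-eye view, assuming the pixel coordinates have already been clipped to the image bounds and using min-max normalization of the heights (one common choice; the embodiment does not fix the normalization scheme):

```python
import numpy as np

def bev_height_map(coords, z_vals, shape):
    """Fill a bird's-eye-view image with min-max-normalized point heights.

    coords: (N, 2) integer pixel coordinates from project_to_bev()
    z_vals: (N,) z-axis coordinates of the point cloud points
    shape:  (rows, cols) of the output image
    """
    img = np.zeros(shape, dtype=np.uint8)
    z_min, z_max = z_vals.min(), z_vals.max()
    scale = 255.0 / (z_max - z_min) if z_max > z_min else 0.0
    pix = ((z_vals - z_min) * scale).astype(np.uint8)  # normalize z to 0-255
    img[coords[:, 1], coords[:, 0]] = pix              # fill pixel positions
    return img
```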
The intensity map is determined according to the two-dimensional coordinates of each point cloud point and the intensity of each point cloud point, where the intensity of a point cloud point comes from the intensity of each return when the lidar collects the point cloud data. Based on the position of each point cloud point of the measured three-dimensional detection frame on the two-dimensional plane and the intensity of that point cloud point, the intensity map corresponding to each measured three-dimensional detection frame can be obtained.

Optionally, as shown in FIG. 10, the process of obtaining the intensity map corresponding to each measured three-dimensional detection frame includes:

S901: Normalize the intensity of each point cloud point, and determine the normalized intensity as the pixel value of that point cloud point.

The pixel value of each point cloud point in the intensity map is first determined from the intensity of the point cloud point. Likewise, since pixel values range over 0-255, the intensities of the point cloud points are first normalized to between 0 and 255, and the normalized value is the pixel value of each point cloud point in the intensity map.

S902: Taking the pixel value corresponding to each point cloud point as the pixel value of the corresponding two-dimensional coordinate position, obtain the intensity map corresponding to each measured three-dimensional detection frame.

After the pixel value of each point cloud point in the intensity map is obtained, the coordinate position of each point cloud point can be determined from its two-dimensional coordinates, and the pixel value of the corresponding point cloud point is then filled in at the two-dimensional coordinate position to obtain the final intensity map.
The density map is determined according to the density of the point cloud points in the z-axis direction in the bird's-eye view, where the density of the point cloud points in the z-axis direction refers to the density of the number of point cloud points along the z-axis direction at each position determined by the x-axis and y-axis in the three-dimensional coordinates of the point cloud.

Optionally, as shown in FIG. 11, the process of obtaining the density map corresponding to each measured three-dimensional detection frame includes:

S1001: Determine the density of the point cloud points in the z-axis direction at each position according to the number of point cloud points in the z-axis direction at the two-dimensional coordinate position of each point cloud point, the maximum number of point cloud points over all coordinate positions, and the minimum number of point cloud points over all coordinate positions.

The two-dimensional coordinate position of a pixel is determined by its x coordinate and y coordinate, so every point cloud point in the three-dimensional detection frame corresponds to a two-dimensional coordinate position. The number of point cloud points in the z-axis direction at each two-dimensional coordinate position is obtained; for any given point cloud point, its density in the z-axis direction can then be determined from the number of point cloud points in the z-axis direction at its two-dimensional coordinate position, the maximum number of point cloud points over all coordinate positions, and the minimum number of point cloud points over all coordinate positions.
For example, the density of the point cloud points in the z-axis direction can be determined by the min-max normalization of formula (2):

$$\rho_i = \frac{c_i - c_{\min}}{c_{\max} - c_{\min}} \qquad (2)$$

where $\rho_i$ is the density of the i-th pixel (i.e., two-dimensional coordinate position), $c_i$ is the number of point cloud points at the i-th pixel, $c_{\min}$ is the minimum number of point cloud points over all pixels, and $c_{\max}$ is the maximum number of point cloud points over all pixels.
S1002: Taking the density of the point cloud points in the z-axis direction at each coordinate position as the pixel value of the corresponding two-dimensional coordinate position, obtain the density map corresponding to each measured three-dimensional detection frame.

After the density at each position is obtained, the density is used as the pixel value of the point cloud points there; the coordinate position of each point cloud point can be determined from its two-dimensional coordinates, and the pixel value of the corresponding point cloud point is then filled in at the two-dimensional coordinate position to obtain the final density map.
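A minimal sketch of the density map of formula (2); scaling the density ρ_i to the 0-255 pixel range is an added assumption so that the map can be used as an image channel:

```python
import numpy as np

def bev_density_map(coords, shape):
    """Density map of formula (2): min-max-normalized per-pixel point counts.

    coords: (N, 2) integer pixel coordinates from project_to_bev()
    shape:  (rows, cols) of the output image
    """
    counts = np.zeros(shape, dtype=np.int64)
    np.add.at(counts, (coords[:, 1], coords[:, 0]), 1)  # c_i: points per pixel
    rho = np.zeros(shape, dtype=np.float32)
    occupied = counts[counts > 0]
    if occupied.size:
        c_min, c_max = occupied.min(), occupied.max()
        if c_max > c_min:
            rho[counts > 0] = (counts[counts > 0] - c_min) / (c_max - c_min)
    return (rho * 255).astype(np.uint8)  # scale the density to a pixel value
```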
The processes of obtaining the bird's-eye view, the intensity map and the density map are as above; the following is the process of merging the bird's-eye view, the intensity map and the density map.

S705: Merge the bird's-eye view, the intensity map and the density map to obtain the two-dimensional mapped trajectory feature corresponding to each three-dimensional trajectory feature.

After the bird's-eye view, intensity map and density map corresponding to each measured three-dimensional detection frame in the point cloud data are obtained, the three two-dimensional images are merged. Optionally, the bird's-eye view, intensity map and density map are merged as the images of the three channels R, G and B respectively, which yields the two-dimensional mapped detection frame corresponding to each measured three-dimensional detection frame; the feature information of every point cloud point in the measured three-dimensional detection frame is reflected on the pixels of the two-dimensional mapped detection frame. Since each measured three-dimensional detection frame corresponds to a three-dimensional trajectory feature and each two-dimensional mapped detection frame corresponds to a two-dimensional mapped trajectory feature, the two-dimensional mapped trajectory feature corresponding to each three-dimensional trajectory feature is thereby obtained.
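Continuing the earlier sketches, the three maps can be stacked as the R, G and B channels; bev_intensity_map reuses the height-map helper, since both are min-max normalizations of a per-point value:

```python
import numpy as np

def bev_intensity_map(coords, intensities, shape):
    """Analogous to the height map: min-max-normalized point intensities."""
    return bev_height_map(coords, intensities.astype(np.float32), shape)

def merge_maps(height_map, intensity_map, density_map):
    """Stack the three 2D maps as the R, G and B channels of one H*W*3 image."""
    return np.stack([height_map, intensity_map, density_map], axis=-1)

# Illustrative end-to-end use on the points of one measured 3D detection frame:
# coords, z = project_to_bev(points_xyz, h, w)
# bev = merge_maps(bev_height_map(coords, z, shape),
#                  bev_intensity_map(coords, intensities, shape),
#                  bev_density_map(coords, shape))
```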
In this embodiment, since the bird's-eye view, intensity map and density map corresponding to each measured three-dimensional detection frame are obtained first, the pixel value of every point in these three maps is converted from the actual data of the point cloud; that is, the information the point cloud points carry in the point cloud is preserved. When the three maps are merged, the pixel values of the points in the resulting two-dimensional mapped trajectory feature can likewise accurately reflect the information of the point cloud points in the three-dimensional point cloud, so that the two-dimensional mapped trajectory feature reflects each three-dimensional trajectory feature more precisely.
In one embodiment, as shown in FIG. 12, an embodiment of a target tracking method is provided, which includes:

S1101: Adjust the sampling frequency of the first sensor to be the same as that of the second sensor, and calibrate the extrinsic parameter information between the first sensor and the second sensor;

S1102: For the data collected by the sensors in any frame, obtain the timestamp T1 of the first current frame data and the timestamp T2 of the second current frame data;

S1103: If the interval between T1 and T2 is smaller than the preset interval threshold, determine that the first current frame data and the second current frame data are the data collected for the current frame; if the interval between T1 and T2 is greater than the preset threshold, discard the first current frame data and the second current frame data, and re-obtain the timestamp T1 of the first current frame data and the timestamp T2 of the second current frame data of the next frame;

S1104: Obtain the two-dimensional detection frame of each detection object in the pixel data, and the measured three-dimensional detection frame of each detection object in the point cloud data;

S1105: Determine the candidate detection objects according to the two-dimensional detection frame of each detection object in the pixel data and the measured three-dimensional detection frame of each detection object in the point cloud data;

S1106: Comprehensively determine the two-dimensional detection frame and feature information of each candidate detection object in the pixel data as the two-dimensional trajectory feature of that candidate detection object; comprehensively determine the measured three-dimensional detection frame and feature information of each candidate detection object in the point cloud data as the three-dimensional trajectory feature of that candidate detection object;

S1107: Convert the three-dimensional coordinates of the point cloud points in the measured three-dimensional detection frame corresponding to each three-dimensional trajectory feature into two-dimensional coordinates;

S1108: Obtain the bird's-eye view corresponding to each measured three-dimensional detection frame; obtain the density map corresponding to each measured three-dimensional detection frame; obtain the intensity map corresponding to each measured three-dimensional detection frame; merge the bird's-eye view, intensity map and density map to obtain the two-dimensional mapped trajectory feature corresponding to each three-dimensional trajectory feature;

S1109: Extract the measured trajectory feature of each candidate detection object according to the fused data of each two-dimensional mapped trajectory feature and each two-dimensional trajectory feature;
S1110: Through a preset tracking algorithm model, predict the predicted three-dimensional detection frame of each target to be tracked in the current frame according to the historical three-dimensional detection frame of each target to be tracked in the previous frame;

S1111: Obtain the IoU between the measured three-dimensional detection frame of each candidate detection object in the current frame and the predicted three-dimensional detection frame of each target to be tracked in the current frame;

S1112: Obtain the similarity between the measured trajectory features of the remaining detection objects and the historical trajectory features of each target to be tracked in the previous frame, the remaining detection objects being the candidate detection objects whose IoU is smaller than the preset IoU threshold;

S1113: Track each candidate detection object whose IoU is greater than the IoU threshold, and each remaining detection object whose similarity is greater than the preset similarity threshold.

The implementation principles and technical effects of the steps in the target tracking method provided by this embodiment are similar to those in the foregoing target tracking method embodiments, and are not repeated here. The implementation of each step in the embodiment of FIG. 12 is only an example and is not limiting; the order of the steps may be adjusted in practical applications, as long as the purpose of each step can be achieved.
It should be understood that, although the steps in the flowcharts of FIGS. 2-12 are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be executed in other orders. Moreover, at least some of the steps in FIGS. 2-12 may include multiple sub-steps or stages, which are not necessarily executed and completed at the same moment but may be executed at different moments; the execution order of these sub-steps or stages is also not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 8, a target tracking apparatus is provided, including an acquisition module 10, a feature acquisition module 11, a prediction module 12 and a tracking module 13, wherein:

the acquisition module 10 is configured to acquire first current frame data of a target scene collected by a first sensor and second current frame data of the target scene collected by at least one second sensor, the first sensor and the second sensor being sensors of different types;

the feature acquisition module 11 is configured to acquire the measured feature information of each candidate detection object in the current frame according to the first current frame data and the second current frame data, the candidate detection objects being the detection objects for which the association matching between the detection objects in the first current frame data and the detection objects in the second current frame data succeeds, and the measured feature information representing the inherent characteristic information of each candidate detection object;

the prediction module 12 is configured to predict the predicted detection frame of each target to be tracked in the current frame according to the historical detection frame of each target to be tracked in the previous frame;

the tracking module 13 is configured to track each candidate detection object according to the measured feature information of each candidate detection object in the current frame and the predicted detection frame of each target to be tracked in the current frame.
In one embodiment, the tracking module 13 includes:
a first acquisition unit, configured to acquire the IoU between the measured three-dimensional detection frame of each candidate detection object in the current frame and the three-dimensional detection frame predicted for each target to be tracked in the current frame;
a second acquisition unit, configured to acquire the similarity between the measured trajectory features of each remaining detection object and the historical trajectory features of each target to be tracked in the previous frame, a remaining detection object being a candidate detection object whose IoU is less than the preset IoU threshold; and
a first determination unit, configured to track each candidate detection object whose IoU is greater than the IoU threshold and each remaining detection object whose similarity is greater than the preset similarity threshold.
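As a rough illustration of this two-stage matching, the following Python sketch associates candidates with tracked targets first by 3D IoU and then, for the leftovers, by trajectory-feature similarity. It assumes the IoU and similarity values have already been computed into matrices; the threshold values and all names are illustrative placeholders, not values taken from this application.

```python
import numpy as np

def associate(iou, sim, iou_thresh=0.5, sim_thresh=0.7):
    """Two-stage association of candidate detection objects with tracked targets.

    iou: (num_candidates, num_tracks) matrix of 3D IoU between measured and
    predicted detection frames; sim: same-shaped matrix of trajectory-feature
    similarities. Threshold values are illustrative, not from the patent.
    Returns a list of (candidate_index, track_index) pairs.
    """
    matches, remaining = [], []
    for i in range(iou.shape[0]):
        j = int(np.argmax(iou[i]))
        if iou[i, j] > iou_thresh:
            matches.append((i, j))   # stage 1: kept by IoU
        else:
            remaining.append(i)      # deferred to the similarity stage
    for i in remaining:
        j = int(np.argmax(sim[i]))
        if sim[i, j] > sim_thresh:
            matches.append((i, j))   # stage 2: kept by trajectory similarity
    return matches
```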
In one embodiment, the acquisition module 10 includes:
a third acquisition unit, configured to acquire, for the data collected in any frame, the timestamp T1 of the first current-frame data and the timestamp T2 of the second current-frame data; and
a second determination unit, configured to determine the first current-frame data and the second current-frame data as the data collected for the current frame if the interval between T1 and T2 is less than a preset interval threshold, and, if the interval between T1 and T2 is greater than the preset threshold, to discard the first current-frame data and the second current-frame data and re-acquire the timestamp T1 of the first current-frame data and the timestamp T2 of the second current-frame data of the next frame.
In one embodiment, the third acquisition unit is specifically configured to, if the first current-frame data and the second current-frame data do not carry timestamps, convert the acquisition time of the first current-frame data and the acquisition time of the second current-frame data onto the same time axis to obtain T1 and T2.
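A minimal sketch of this frame-pairing logic, assuming both sensor streams have already been converted onto a common time axis as described above; the 50 ms gap threshold is an illustrative placeholder:

```python
def synchronize(frames1, frames2, max_gap=0.05):
    """Pair first-sensor and second-sensor frames by timestamp.

    frames1 / frames2: iterables of (timestamp_seconds, data) tuples on a
    common time axis. When the gap exceeds max_gap, both frames are discarded
    and the next pair is fetched, as described above.
    Yields (data1, data2) pairs accepted as belonging to the same frame.
    """
    it1, it2 = iter(frames1), iter(frames2)
    f1, f2 = next(it1, None), next(it2, None)
    while f1 is not None and f2 is not None:
        if abs(f1[0] - f2[0]) < max_gap:
            yield f1[1], f2[1]       # accepted as one current-frame pair
        # on a mismatch, fall through: both frames are dropped
        f1, f2 = next(it1, None), next(it2, None)
```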
In one embodiment, the apparatus further includes an adjustment module configured to adjust the sampling frequency of the first sensor to be the same as that of the second sensor, and to calibrate the extrinsic parameter information between the first sensor and the second sensor.
In one embodiment, the first sensor is a camera device and the second sensor is a lidar.
In this case, the adjustment module is specifically configured to adjust the relative pose information between the camera device and the lidar to target pose information, obtain the calibrated extrinsic parameter information of the camera device according to a preset calibration algorithm, and calibrate the extrinsic parameter information of the camera device according to the calibrated extrinsic parameter information.
In one embodiment, the first current-frame data is pixel data collected by the camera device, and the second current-frame data is point cloud data collected by the lidar. The feature acquisition module 11 then includes:
a detection frame acquisition unit, configured to acquire the two-dimensional detection frame of each detection object in the pixel data and the measured three-dimensional detection frame of each detection object in the point cloud data;
a candidate detection object determination unit, configured to match the two-dimensional detection frame of each detection object in the pixel data with the measured three-dimensional detection frame of each detection object in the point cloud data to determine the candidate detection objects; and
a measured feature determination unit, configured to determine the measured trajectory features of each candidate detection object according to the two-dimensional detection frame and feature information of that object in the pixel data and its measured three-dimensional detection frame and feature information in the point cloud data.
In one embodiment, the candidate detection object determination unit is specifically configured to map each three-dimensional detection frame to a corresponding two-dimensional mapped detection frame, acquire the IoU between each two-dimensional detection frame and each two-dimensional mapped detection frame, and determine the detection objects whose IoU is greater than the IoU threshold as candidate detection objects.
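This application does not spell out the mapping itself, but a common way to obtain the two-dimensional mapped detection frame is to project the eight corners of the 3D box through the calibrated extrinsics and the camera intrinsics and take the axis-aligned bounding box of the projections. The sketch below assumes `K`, `R` and `t` come from the calibration step described earlier; all names are illustrative.

```python
import numpy as np

def project_box(corners3d, K, R, t):
    """Map a measured 3D detection frame to its 2D mapped detection frame by
    projecting the 8 box corners into the image and taking their axis-aligned
    bounding box. K: 3x3 intrinsics; R, t: lidar-to-camera extrinsics.
    """
    cam = R @ corners3d.T + t.reshape(3, 1)   # lidar frame -> camera frame, (3, 8)
    uv = (K @ cam)[:2] / cam[2]               # perspective division, (2, 8)
    return uv[0].min(), uv[1].min(), uv[0].max(), uv[1].max()

def iou_2d(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + 1e-9)
```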
In one embodiment, the measured feature determination unit includes:
a trajectory feature determination subunit, configured to jointly determine the two-dimensional detection frame and feature information of each candidate detection object in the pixel data as that object's two-dimensional trajectory features, and to jointly determine the measured three-dimensional detection frame and feature information of each candidate detection object in the point cloud data as that object's three-dimensional trajectory features;
a conversion subunit, configured to convert the three-dimensional trajectory features of each candidate detection object into corresponding two-dimensional mapped trajectory features; and
a feature extraction subunit, configured to extract the measured trajectory features of each candidate detection object from the fused data of the two-dimensional mapped trajectory features and the two-dimensional trajectory features.
In one embodiment, the conversion subunit includes:
a coordinate conversion subunit, configured to convert the three-dimensional coordinates of the point cloud points in the measured three-dimensional detection frame corresponding to each set of three-dimensional trajectory features into two-dimensional coordinates;
a bird's-eye view subunit, configured to obtain the bird's-eye view corresponding to each measured three-dimensional detection frame according to the two-dimensional coordinates of each point cloud point and its z-axis coordinate in the three-dimensional coordinates;
an intensity map subunit, configured to obtain the intensity map corresponding to each measured three-dimensional detection frame according to the two-dimensional coordinates and the intensity of each point cloud point;
a density map subunit, configured to obtain the density map corresponding to each measured three-dimensional detection frame according to the density of the point cloud points in the z-axis direction in the bird's-eye view; and
a mapped trajectory feature determination subunit, configured to merge the bird's-eye view, the intensity map and the density map to obtain the two-dimensional mapped trajectory features corresponding to each set of three-dimensional trajectory features.
In one embodiment, the bird's-eye view subunit is specifically configured to normalize the z-axis coordinate of each point cloud point in the three-dimensional coordinates, determine the normalized z-axis coordinate as the pixel value of that point cloud point, and use the pixel value corresponding to each point cloud point as the pixel value at the corresponding two-dimensional coordinate position, thereby obtaining the bird's-eye view corresponding to each measured three-dimensional detection frame.
In one embodiment, the intensity map subunit is specifically configured to normalize the intensity of each point cloud point, determine the normalized intensity as the pixel value of that point cloud point, and use the pixel value corresponding to each point cloud point as the pixel value at the corresponding two-dimensional coordinate position, thereby obtaining the intensity map corresponding to each measured three-dimensional detection frame.
In one embodiment, the density map subunit is specifically configured to determine the density of the point cloud points at each position in the z-axis direction according to the number of point cloud points in the z-axis direction at the two-dimensional coordinate position of each point cloud point, the maximum number of point cloud points over all coordinate positions and the minimum number of point cloud points over all coordinate positions, and to use the density of the point cloud points at each coordinate position in the z-axis direction as the pixel value at the corresponding two-dimensional coordinate position, thereby obtaining the density map corresponding to each measured three-dimensional detection frame.
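Putting the three subunits together, a hedged sketch of the rasterization might look as follows. The grid size, coordinate ranges and the choice of keeping the highest point per cell are assumptions, since the text above only specifies how the pixel values are normalized.

```python
import numpy as np

def box_to_bev_maps(points, grid=(128, 128), x_range=(0.0, 40.0), y_range=(-20.0, 20.0)):
    """Rasterize the point cloud inside one measured 3D detection frame into
    three single-channel maps: bird's-eye view (height), intensity and density.

    points: (N, 4) array of (x, y, z, intensity). Grid size, coordinate
    ranges and the per-cell reduction are illustrative assumptions.
    """
    h, w = grid
    rows = ((points[:, 1] - y_range[0]) / (y_range[1] - y_range[0]) * (h - 1)).astype(int)
    cols = ((points[:, 0] - x_range[0]) / (x_range[1] - x_range[0]) * (w - 1)).astype(int)
    keep = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)
    rows, cols, pts = rows[keep], cols[keep], points[keep]

    z = (pts[:, 2] - pts[:, 2].min()) / (np.ptp(pts[:, 2]) + 1e-6)   # normalized height
    it = (pts[:, 3] - pts[:, 3].min()) / (np.ptp(pts[:, 3]) + 1e-6)  # normalized intensity

    bev, inten, count = np.zeros(grid), np.zeros(grid), np.zeros(grid)
    np.maximum.at(bev, (rows, cols), z)      # pixel value: normalized z (highest point kept)
    np.maximum.at(inten, (rows, cols), it)   # pixel value: normalized intensity
    np.add.at(count, (rows, cols), 1)        # points stacked above each cell
    dens = (count - count.min()) / (np.ptp(count) + 1e-6)  # normalized point density
    return np.stack([bev, inten, dens])      # merged 3-channel mapped feature
```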
In one embodiment, the feature extraction subunit is specifically configured to compress the two-dimensional trajectory features and the two-dimensional mapped trajectory features to the same scale, splice them to obtain a fusion data matrix, and determine the feature information extracted from the fusion data matrix as the measured trajectory features of each candidate detection object.
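A minimal sketch of the compression and splicing step, assuming both inputs are channel-first feature maps and using nearest-neighbour resizing purely for illustration (the text does not specify the compression method or the downstream feature extractor):

```python
import numpy as np

def fuse(feat_2d, feat_mapped, size=(64, 64)):
    """Compress two channel-first feature maps to a common scale and splice
    them along the channel axis into one fusion data matrix.

    feat_2d: (C1, H1, W1) image-domain trajectory features; feat_mapped:
    (C2, H2, W2) 2D mapped trajectory features. The target size is an
    illustrative choice.
    """
    def resize(f):
        c, h, w = f.shape
        ri = np.arange(size[0]) * h // size[0]   # nearest source row per target row
        ci = np.arange(size[1]) * w // size[1]   # nearest source column per target column
        return f[:, ri][:, :, ci]
    return np.concatenate([resize(feat_2d), resize(feat_mapped)], axis=0)
```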
In one embodiment, the prediction module 12 is specifically configured to predict, through a preset tracking algorithm model and according to the historical three-dimensional detection frame of each target to be tracked in the previous frame, the predicted three-dimensional detection frame of that target in the current frame, where the tracking algorithm model is constructed based on the state-space equation of uniformly variable (constant-acceleration) motion.
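Uniformly variable motion is motion with constant acceleration, so one natural reading of this model is a Kalman filter whose state transition follows the constant-acceleration equations. The sketch below shows a single prediction step for one axis; the Kalman formulation and the process-noise model are simplifying assumptions, not details taken from this application.

```python
import numpy as np

def predict_state(x, P, dt, q=0.1):
    """One Kalman prediction step under a constant-acceleration motion model.

    x: state vector [position, velocity, acceleration] for one axis;
    P: 3x3 state covariance; q: illustrative process-noise scale.
    """
    F = np.array([[1.0, dt, 0.5 * dt**2],
                  [0.0, 1.0, dt],
                  [0.0, 0.0, 1.0]])     # p' = p + v*dt + a*dt^2/2; v' = v + a*dt
    Q = q * np.eye(3)                   # simplified process-noise covariance
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred
```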
For the specific limitations of the target tracking apparatus, reference may be made to the limitations of the target tracking method above, which are not repeated here. Each module in the target tracking apparatus may be implemented in whole or in part by software, hardware or a combination thereof. The modules may be embedded in or independent of a processor in a computer device in hardware form, or stored in a memory in the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in FIG. 1b. The computer device includes a processor, a memory, a network interface, a display screen and an input apparatus connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device is used to communicate with external terminals through a network connection. The computer program, when executed by the processor, implements a target tracking method. The display screen of the computer device may be a liquid crystal display or an electronic ink display; the input apparatus may be a touch layer covering the display screen, a button, trackball or touchpad provided on the housing of the computer device, or an external keyboard, touchpad or mouse.
Those skilled in the art will understand that the structure shown in FIG. 1b is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program. When executing the computer program, the processor implements the following steps:
acquiring first current-frame data of a target scene collected by a first sensor and second current-frame data of the target scene collected by at least one second sensor, the first sensor and the second sensor being sensors of different types;
acquiring, from the first current-frame data and the second current-frame data, measured feature information of each candidate detection object in the current frame, a candidate detection object being a detection object in the first current-frame data that has been successfully associated and matched with a detection object in the second current-frame data, and the measured feature information representing the inherent characteristics of each candidate detection object;
predicting, from the historical detection frame of each target to be tracked in the previous frame, the predicted detection frame of that target in the current frame; and
tracking each candidate detection object according to the measured feature information of each candidate detection object in the current frame and the predicted detection frame of each target to be tracked in the current frame.
The implementation principles and technical effects of the computer device provided in the above embodiment are similar to those of the above method embodiments and are not repeated here.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the following steps:
acquiring first current-frame data of a target scene collected by a first sensor and second current-frame data of the target scene collected by at least one second sensor, the first sensor and the second sensor being sensors of different types;
acquiring, from the first current-frame data and the second current-frame data, measured feature information of each candidate detection object in the current frame, a candidate detection object being a detection object in the first current-frame data that has been successfully associated and matched with a detection object in the second current-frame data, and the measured feature information representing the inherent characteristics of each candidate detection object;
predicting, from the historical detection frame of each target to be tracked in the previous frame, the predicted detection frame of that target in the current frame; and
tracking each candidate detection object according to the measured feature information of each candidate detection object in the current frame and the predicted detection frame of each target to be tracked in the current frame.
The implementation principles and technical effects of the computer-readable storage medium provided in the above embodiment are similar to those of the above method embodiments and are not repeated here.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, the combination should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their descriptions are relatively specific and detailed, but they should not therefore be construed as limiting the scope of the invention patent. It should be noted that, for those of ordinary skill in the art, several variations and improvements can be made without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.