CN110717445B - Front vehicle distance tracking system and method for automatic driving - Google Patents
- Publication number
- CN110717445B (application CN201910953010.XA)
- Authority
- CN
- China
- Prior art keywords
- vehicle
- camera
- image
- outline
- distance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
- G06V20/584—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S11/00—Systems for determining distance or velocity not using reflection or reradiation
- G01S11/12—Systems for determining distance or velocity not using reflection or reradiation using electromagnetic waves other than radio waves
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
Abstract
The invention discloses a preceding-vehicle distance tracking system and method for automatic driving, in which: a data acquisition unit acquires an image sequence at equal time intervals from a camera; a vehicle detection unit extracts the bounding boxes (outline frame lines) of the preceding vehicles from each image; a coordinate positioning unit computes the real-world position of each preceding vehicle from the pixel coordinates of its bounding box in the image and the camera parameters; and a vehicle tracking unit links the bounding boxes across the images of the sequence, identifies the same vehicle in multiple images, and numbers the vehicles. From the vehicle positions computed by the coordinate positioning unit and the vehicle numbers assigned by the vehicle tracking unit, the system converts the input image sequence into an XML file of preceding-vehicle distance sequences. With the technical solution of the present invention, a single vehicle-mounted camera suffices to detect and track the positions of the vehicles ahead on the road, which benefits road perception, automatic obstacle avoidance, and decision support in an automatic driving system.
Description
Technical Field
The present invention relates to the technical field of vision-assisted automatic driving, and in particular to a preceding-vehicle distance tracking system for automatic driving and a vehicle ranging method based on two-dimensional images.
Background Art
Vision-assisted autonomous driving is the mainstream solution in current autonomous driving technology. Cameras and lidars are the two most commonly used on-board vision sensors: cameras collect two-dimensional image data, while lidars collect three-dimensional point-cloud data. Because lidar is currently considerably more expensive than a camera, the camera is the lowest-cost way to realize vision-assisted autonomous driving.
On an autonomous driving platform built around vehicle-mounted cameras, image-oriented road-condition recognition and understanding is the core problem of autonomous driving. The automatic decision-making technologies involved in autonomous driving, such as vehicle avoidance and path planning, all require the position of the vehicles ahead. Moreover, tracking the vehicles on the road ahead further serves scene-understanding and driving-decision tasks such as predicting the driving behavior of a preceding vehicle and braking in advance. Image-based preceding-vehicle distance tracking methods and systems are therefore key technologies for autonomous driving.
In the prior art, traditional graphics operators are usually used to extract information about the vehicles ahead from an image. This approach does not adapt well to complex road scenes and readily misjudges at night and in the presence of mottled tree shadows, textured road surfaces, or mutually occluding vehicles. In addition, most prior-art methods do not consider the correlation between successive frames: they do not match different images of the same vehicle across consecutive frames and neglect continuous tracking of the position of the vehicles ahead, which limits the applicability of the ranging results. Specifically, using traditional visual ranging of the preceding vehicle to assist automatic driving decisions faces the following difficulties:
1) Complexity: the application scenarios of autonomous driving are very complex; images taken under different road conditions differ in illumination, road-surface texture, and occlusion, and traditional methods lack robustness in complex scenes with strong interference;
2) Representation ability: existing preceding-vehicle ranging methods do not integrate vehicle tracking, so they cannot link the instantaneous coordinates of multiple preceding vehicles into trajectories, nor infer the driving behavior of the vehicles ahead from them.
Summary of the Invention
The purpose of the present invention is to track the trajectory of the preceding vehicle and to calculate its position based on the image sequence captured by a vehicle-mounted camera, thereby assisting automatic driving decisions.
The technical solution of the present invention provides a preceding-vehicle distance tracking system for automatic driving, characterized in that the system includes: a data acquisition unit, a vehicle detection unit, a coordinate positioning unit, and a vehicle tracking unit;
the data acquisition unit extracts an image sequence at equal time intervals from the camera and records the camera parameters, including intrinsic parameters characterizing the camera's focal length, projection center, skew, and distortion, and extrinsic parameters characterizing the camera's translation and rotation;
the vehicle detection unit includes several vehicle detection modules, each responsible for processing a single image; a vehicle detection module uses a classifier built on a deep neural network model to recognize the vehicles in the image and a regressor to fit each vehicle's bounding box, thereby locating the preceding vehicles in a single image;
the coordinate positioning unit includes several coordinate positioning modules, each responsible for processing a single vehicle bounding box; for each bounding box, the coordinate positioning module uses the pixel coordinates of the box's four vertices and the camera parameters obtained by the data acquisition unit to transform, by geometric transformation, the pixel coordinates of the preceding vehicle's bounding box into the body coordinates of the vehicle carrying the camera, thereby determining the spatial position of the preceding vehicle relative to the ego vehicle;
the vehicle tracking unit recognizes the same vehicle across multiple frames to form each vehicle's trajectory: it identifies the same vehicle appearing in different pictures, assigns it a unique ID number, and links its positions into a trajectory, thereby achieving preceding-vehicle tracking.
Further, the vehicle detection unit includes a candidate region generation module, a vehicle discrimination module, and a bounding box regression module;
the candidate region generation module stores several anchors of different sizes, each anchor being a rectangular region composed of a number of pixels;
the vehicle discrimination module takes as input a candidate region, generated by the candidate region generation module, that may contain a vehicle, and outputs whether the candidate region contains a vehicle;
the bounding box regression module uses a mask region convolutional neural network (Mask R-CNN) regressor to fine-tune, starting from the candidate region coordinates, the bounding box of each vehicle in the picture.
Further, according to the vehicle detection unit, the mask region convolutional neural network is used to compute the bounding boxes of the vehicles in the image; the computation of the mask region convolutional neural network is divided into three steps:
Step 1: candidate region extraction. Several preset anchors of different sizes traverse the entire image from left to right and from top to bottom, computing the positions of rectangular blocks that may serve as vehicle bounding boxes;
Step 2: regional object classification. For each candidate region, a convolutional neural network extracts the region's visual features, and a multilayer perceptron judges the category of the object in the region from those features;
Step 3: bounding box completion. For each candidate region, a neural network regresses the offset of the candidate region's box relative to the bounding box of the detection target, so that the candidate region's box fits the target's outline more closely.
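The anchor traversal of step 1 and the offset regression of step 3 can be sketched as follows; the stride, the anchor sizes, and the standard R-CNN box-offset parameterization used here are illustrative assumptions, since the patent does not fix these details:

```python
import numpy as np

def generate_anchors(img_w, img_h, sizes, stride):
    """Step 1: slide preset anchor rectangles over the image from left to
    right and top to bottom, recording each candidate position (x, y, w, h).
    The stride is an assumed parameter; the patent moves anchors one pixel
    at a time."""
    anchors = []
    for w, h in sizes:
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                anchors.append((x, y, w, h))
    return np.array(anchors, dtype=float)

def refine_box(anchor, deltas):
    """Step 3: apply regressed offsets (dx, dy, dw, dh) to an anchor so its
    box fits the detected vehicle more closely (R-CNN-style parameterization,
    assumed here)."""
    x, y, w, h = anchor
    dx, dy, dw, dh = deltas
    cx = x + w / 2 + dx * w                   # shift the box center
    cy = y + h / 2 + dy * h
    nw, nh = w * np.exp(dw), h * np.exp(dh)   # rescale width and height
    return (cx - nw / 2, cy - nh / 2, nw, nh)
```

Under this parameterization, zero offsets leave the anchor unchanged, and the exponential form keeps the regressed width and height positive.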
Further, the coordinate positioning unit is configured to:
compute, from the image coordinates of the vehicle obtained by the vehicle detection unit, the vehicle's coordinates in the world coordinate system through the camera-to-world coordinate transformation formula:
sm = A[R|t]M
where M = [x_w, y_w, z_w]^T is the three-dimensional coordinate in the world coordinate system; m = [x_i, y_i]^T is the two-dimensional coordinate of the bottom-center point of the vehicle bounding box detected in the image; R and t are, respectively, the rotation matrix and translation vector of the camera's extrinsic parameters; A is the camera intrinsic matrix, with A[1,1] = f_x, A[1,3] = c_x, A[2,2] = f_y, A[2,3] = c_y, A[3,3] = 1, and all other entries of A equal to 0, where f_x and f_y are the camera's focal lengths in the x- and y-axis directions and c_x and c_y its optical-center coordinates in the x- and y-axis directions; and s is the depth. Here m, A, R, and t are known quantities obtainable from the data acquisition unit, while s and M are the unknowns to be solved.
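As a minimal numerical sketch of sm = A[R|t]M, the following assumes example intrinsic values and a trivial camera pose (R = I, t = 0); in the system these quantities come from the data acquisition unit:

```python
import numpy as np

# Assumed example intrinsics; the patent reads these from calibration.
fx, fy, cx, cy = 800.0, 800.0, 640.0, 360.0
A = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])   # intrinsic matrix, zeros elsewhere
R = np.eye(3)                     # extrinsic rotation (assumed identity)
t = np.zeros((3, 1))              # extrinsic translation (assumed zero)

def project(M_world):
    """Apply s*m = A[R|t]M: return the pixel coordinates m and depth s."""
    M_h = np.vstack([np.asarray(M_world, float).reshape(3, 1), [[1.0]]])
    sm = A @ np.hstack([R, t]) @ M_h   # s * [u, v, 1]^T
    s = sm[2, 0]
    return sm[:2, 0] / s, s

m, s = project([1.0, 0.5, 10.0])   # a point 10 units in front of the camera
```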
Further, the computation of the coordinate positioning unit is divided into two steps:
Step 1: depth estimation. Take a point on the ground along the horizontal line at the bottom of the vehicle box in the image; its two-dimensional image coordinate is m_g, and the z-direction component of its world coordinate, z_w = 0, is known. Because this ground point and the bottom-center point of the vehicle box lie on the same horizontal line of the image, the two pixels share the same depth s, which yields the first linear system (e31):
sm_g = A[R|t]M_0 (e31)
Step 2: world-coordinate solution. Applying the coordinate transformation formula to the bottom-center point m of the vehicle box yields the second linear system (e32):
sm = A[R|t]M (e32)
Combining the two systems, the known quantities in (e31) can be used to eliminate the unknown depth s in (e32), thereby obtaining the world coordinate M of the bottom-center point of the vehicle box.
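The two-step computation can be sketched numerically as follows. The intrinsics and the mounting pose (camera 1.5 m above the ground, world z pointing up, axes otherwise aligned) are assumed example values, not calibration data from the patent:

```python
import numpy as np

# Assumed example calibration (illustrative, not from the patent).
fx, fy, cx, cy = 800.0, 800.0, 640.0, 360.0
A = np.array([[fx, 0, cx], [0, fy, cy], [0, 0, 1.0]])
R = np.array([[1.0, 0.0, 0.0],    # world x -> camera x
              [0.0, 0.0, -1.0],   # world z (up) -> camera -y (down)
              [0.0, 1.0, 0.0]])   # world y (forward) -> camera z
t = np.array([0.0, 1.5, 0.0])     # t = -R @ C for camera height C = (0, 0, 1.5)

def depth_from_ground_point(m_g, zw=0.0):
    """Step 1 (e31): solve s*m_g = A(R*M0 + t) for the depth s, using the
    constraint that the ground point satisfies M0[2] = zw; the unknowns
    (xw, yw, s) give three equations in three unknowns."""
    P = A @ R
    mg_h = np.array([m_g[0], m_g[1], 1.0])
    lhs = np.column_stack([P[:, :2], -mg_h])   # columns for xw, yw, s
    rhs = -P[:, 2] * zw - A @ t
    xw, yw, s = np.linalg.solve(lhs, rhs)
    return s

def world_point(m, s):
    """Step 2 (e32): with the depth s known, back-project the box
    bottom-center m to world coordinates M = R^-1 (s * A^-1 * m_h - t)."""
    m_h = np.array([m[0], m[1], 1.0])
    return np.linalg.inv(R) @ (s * np.linalg.inv(A) @ m_h - t)

s = depth_from_ground_point((640.0, 480.0))
M = world_point((640.0, 480.0), s)
```

With these assumed parameters, a ground point 10 m ahead projects to pixel (640, 480), and the sketch recovers its depth s = 10 and world coordinate (0, 10, 0).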
Further, the vehicle tracking unit includes a distance calculation module and a distance matching module.
The distance calculation module, given the vehicle bounding boxes of two consecutive frames, computes the pixel distances from the centers of the boxes of the first frame to the boxes of the second frame; the closest pair of boxes across the two frames is regarded as the bounding boxes of the same vehicle in the two frames.
For each pair of adjacent pictures in the image sequence, the distance matching module applies the nearest-match principle: it first matches the closest pair of vehicle bounding boxes between the two pictures and assigns the matched pair the same vehicle ID, then removes the matched boxes and continues matching the remaining boxes by the closest-distance principle, until all the vehicle bounding boxes in one of the two adjacent pictures have been matched.
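The nearest-match procedure described above amounts to greedy nearest-center assignment between the box sets of two adjacent frames; a minimal sketch, assuming boxes in the (x, y, w, h) format used by the detection unit:

```python
import math

def match_boxes(prev_boxes, next_boxes):
    """Greedy nearest-center matching between the boxes of two adjacent
    frames: repeatedly pair the closest remaining boxes until one frame's
    boxes are exhausted. Returns (prev_index, next_index) pairs, which the
    tracking unit would label with the same vehicle ID."""
    def center(b):
        x, y, w, h = b
        return (x + w / 2, y + h / 2)

    remaining_prev = set(range(len(prev_boxes)))
    remaining_next = set(range(len(next_boxes)))
    matches = []
    while remaining_prev and remaining_next:
        best = min(
            ((i, j) for i in remaining_prev for j in remaining_next),
            key=lambda ij: math.dist(center(prev_boxes[ij[0]]),
                                     center(next_boxes[ij[1]])),
        )
        matches.append(best)
        remaining_prev.discard(best[0])
        remaining_next.discard(best[1])
    return matches
```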
The present invention also provides a method of tracking with the preceding-vehicle distance tracking system for automatic driving, specifically including the following steps:
Step 1: camera calibration. The position parameters and optical parameters of the vehicle-mounted camera are calibrated and recorded in the system's data acquisition software.
The camera position parameters include the distances from the camera's mounting position to the front of the vehicle, the chassis, and the two sides of the body, as well as the solid angle of the camera relative to the chassis.
Step 2: recognition and localization of the preceding vehicles, realized by the vehicle detection unit and the coordinate positioning unit.
Step 3: tracking the trajectories of the preceding vehicles. The vehicle tracking unit identifies vehicles that reappear in the image sequence, assigns every distinct vehicle appearing in the sequence a different unique ID to distinguish them, and outputs each vehicle's trajectory sequence to an XML file.
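The patent specifies an XML output file of per-vehicle trajectory sequences but not its schema, so the tag names in the following sketch (tracks, vehicle, position) are illustrative assumptions:

```python
import xml.etree.ElementTree as ET

def trajectories_to_xml(tracks):
    """Serialize {vehicle_id: [(frame, xw, yw, zw), ...]} to an XML string.
    The element and attribute names are assumed; the patent only states
    that trajectory sequences are written to an XML file."""
    root = ET.Element("tracks")
    for vid, points in tracks.items():
        veh = ET.SubElement(root, "vehicle", id=str(vid))
        for frame, xw, yw, zw in points:
            ET.SubElement(veh, "position", frame=str(frame),
                          x=str(xw), y=str(yw), z=str(zw))
    return ET.tostring(root, encoding="unicode")

xml_text = trajectories_to_xml({1: [(0, 0.0, 10.0, 0.0), (1, 0.1, 9.5, 0.0)]})
```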
The beneficial effects of the present invention are as follows. Using a mask region neural network to recognize vehicle outlines in images improves the robustness of vehicle detection in complex road scenes. The coordinate transformation based on horizon-line pixels restores the vehicle's three-dimensional coordinates from the image more accurately. The vehicle tracking unit gives the system not only preceding-vehicle ranging but also vehicle tracking, so that the motion trajectory of the preceding vehicle can be better grasped and its movements predicted, making the vehicle's driving decisions more anticipatory and safer.
Description of the Drawings
The advantages of the above and/or additional aspects of the present invention will become apparent and readily understood from the following description of embodiments in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic block diagram of a preceding-vehicle distance tracking method and system for automatic driving according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the coordinate transformation computation process according to an embodiment of the present invention.
Detailed Description
In order that the above objects, features, and advantages of the present invention may be understood more clearly, the present invention is further described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of the present invention and the features in the embodiments may be combined with one another.
In the following description, many specific details are set forth to facilitate a full understanding of the present invention; however, the present invention can also be implemented in ways other than those described herein, so the protection scope of the present invention is not limited by the specific embodiments disclosed below.
Embodiment:
An embodiment of the present invention is described below with reference to FIGS. 1 and 2.
As shown in FIG. 1, this embodiment provides a preceding-vehicle distance tracking system 100 for automatic driving, including: a data acquisition unit 10, a vehicle detection unit 20, a coordinate positioning unit 30, and a vehicle tracking unit 40.
The data acquisition unit extracts an image sequence at equal time intervals from the camera and records the camera parameters, including intrinsic parameters characterizing the camera's focal length, projection center, skew, and distortion, and extrinsic parameters characterizing the camera's translation and rotation. In this embodiment, the data acquisition unit 10 includes data acquisition hardware and data acquisition software. The data acquisition hardware is a camera fixed on top of the vehicle, with its lens oriented parallel to the vehicle chassis; the distances from the camera to the front of the vehicle, the chassis, and the two sides of the body are measured and recorded in the data acquisition software. While driving, the camera is turned on to capture the road conditions, and the captured image sequence is transmitted through the data acquisition software to the subsequent units of the system.
The data acquisition software transmits the captured road-condition image sequence from the camera and records the camera's own parameters, providing data support for the processing performed by the subsequent units of the system. Specifically, the data acquisition software is divided into an image sequence acquisition module and a camera parameter acquisition module.
The image sequence acquisition module collects the road-condition image sequence from the camera and transmits it to the subsequent units. A vehicle-mounted front-facing camera records video of the road ahead, covering road, vehicle, and pedestrian conditions. The recorded video is cut at a fixed frame rate into a sequence of equally spaced images. The image sequence contains multiple frames, denoted in FIG. 1 by "picture 1", "picture 2", ..., "picture n", where n is the total number of pictures in the sequence. Within the sequence, the pictures follow chronological order, and the interval between the capture times of any two adjacent pictures is equal.
The camera parameter acquisition module records the camera's extrinsic and intrinsic parameters and transmits them to the subsequent units. Specifically, the camera's extrinsic parameters are the spatial position parameters of the camera on the vehicle body, stored as a rotation matrix and a translation vector; the camera's intrinsic parameters are the optical parameters of the camera itself, including the components of the camera's focal length and optical center along the x and y axes.
The vehicle detection unit 20 includes several vehicle detection modules, each responsible for processing a single image; a vehicle detection module uses a classifier built on a deep neural network model to recognize the vehicles in the image and a regressor to fit each vehicle's bounding box, thereby locating the preceding vehicles in a single image.
In this embodiment, the vehicle detection unit 20 includes n vehicle detection subunits, denoted in FIG. 1 by "vehicle detection unit 21", "vehicle detection unit 22", ..., "vehicle detection unit 2n", where n again is the total number of pictures in the image sequence to be processed. Specifically, each vehicle detection subunit processes one picture of the image sequence; all subunits have exactly the same configuration and differ only in the picture they process.
Specifically, a vehicle detection unit 20 is composed of the following functional modules: a candidate region generation module, a vehicle discrimination module, and a bounding box regression module.
The candidate region generation module is configured as follows: it stores several anchors of different sizes, each anchor being a rectangular region composed of a number of pixels. The module moves these anchors from the upper-left corner of the picture to the lower-right corner, in order from left to right and top to bottom, one pixel at a time. At each position of an anchor, the module judges from the pixel features whether the rectangular region covered by the anchor may contain a vehicle; if it may, the anchor's position in the picture is recorded as a candidate region.
The vehicle discrimination module is configured as follows: its input is a candidate region, generated by the candidate region generation module, that may contain a vehicle, and its output is whether the candidate region contains a vehicle. Further, the module extracts and classifies the pixel features of the candidate region with the mask region convolutional neural network classifier and decides from the class probabilities output by the classifier whether the region contains a vehicle. Specifically, if the class "vehicle" has the highest probability in the classifier's output, the candidate region is considered to contain a vehicle; otherwise it is considered not to contain a vehicle and requires no further processing.
The bounding box regression module is configured to fine-tune the vehicle bounding boxes in the picture, starting from the candidate region coordinates, with the mask region convolutional neural network regressor. Further, a bounding box is represented by the coordinates (x, y) of its upper-left corner and its width and height (w, h). Each picture fig is converted by the vehicle detection unit into a set of bounding box parameters, one per vehicle in the picture: ((x1, y1, w1, h1), (x2, y2, w2, h2), ..., (xk, yk, wk, hk)), where the variable k indicates that picture fig contains k vehicles; k may take different values for different pictures of the sequence.
根据所述车辆检测单元,使用掩膜区域卷积神经网络计算图像中车辆的轮廓框线;其中,掩膜区域卷积神经网络的计算分为三个步骤:According to the vehicle detection unit, use the mask area convolutional neural network to calculate the outline frame of the vehicle in the image; wherein, the calculation of the mask area convolutional neural network is divided into three steps:
Step 1: candidate-region extraction. Several preset anchors of different sizes are slid over the entire image, left to right and top to bottom, to compute the positions of the rectangular blocks that may serve as a vehicle's outer contour box.
Step 2: region classification. For each candidate region, a convolutional neural network extracts the region's visual features, and a multi-layer perceptron determines the category of the object inside the region from those features.
Step 3: contour-box refinement. For each candidate region, a neural network regresses the offset of the candidate region's contour box relative to the detection target's contour box, so that the candidate box fits the target's contour more tightly.
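The anchor sweep of Step 1 can be sketched as follows; the stride and anchor sizes are illustrative assumptions, not values from the patent:

```python
def generate_anchors(img_w, img_h, anchor_sizes, stride):
    """Slide preset anchors over the image left-to-right, top-to-bottom
    and return candidate boxes as (x, y, w, h) tuples that lie fully
    inside the image."""
    boxes = []
    for y in range(0, img_h, stride):          # top to bottom
        for x in range(0, img_w, stride):      # left to right
            for (w, h) in anchor_sizes:
                if x + w <= img_w and y + h <= img_h:  # clip to image bounds
                    boxes.append((x, y, w, h))
    return boxes

# illustrative 64x48 image with two anchor shapes
candidates = generate_anchors(64, 48, [(32, 16), (16, 16)], stride=16)  # 21 candidate boxes
```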
The coordinate positioning unit 30 comprises several coordinate positioning modules, each responsible for processing a single vehicle contour box. For each contour box, the module takes the pixels of the box's four vertices and the camera parameters obtained by the data acquisition unit 10, and applies a geometric transformation that converts the pixel coordinates of the preceding vehicle's contour box into the body coordinates of the vehicle carrying the on-board camera, thereby determining the spatial position of the preceding vehicle relative to the ego vehicle.
In this embodiment, the coordinate positioning unit 30 comprises n coordinate positioning subunits, denoted "coordinate positioning unit 31", "coordinate positioning unit 32", …, "coordinate positioning unit 3n" in FIG. 1, where n is again the total number of pictures in the image sequence to be processed. Specifically, each subunit processes the set of contour-box parameters obtained when one picture of the sequence passes through the vehicle detection unit. All subunits have exactly the same configuration; they differ only in the contour-box parameter set they process.
Specifically, the coordinate positioning unit is configured to take the image coordinates of a vehicle contour box computed by the vehicle detection unit and obtain the vehicle's coordinates in the world coordinate system through the camera-to-world coordinate transformation formula:
s·m = A[R|t]·M
where M = [x_w, y_w, z_w]^T is the point's three-dimensional coordinate in the world coordinate system, m = [x_i, y_i]^T is that point's two-dimensional coordinate in the picture taken by the camera (used in homogeneous form in the formula), and R and t are the rotation matrix and the translation matrix of the camera's extrinsic parameter matrix, respectively. A is the camera's intrinsic parameter matrix, with A[1,1] = f_x, A[1,3] = c_x, A[2,2] = f_y, A[2,3] = c_y, A[3,3] = 1 and zeros everywhere else, where f_x and f_y are the camera's focal lengths along the x and y axes and c_x and c_y are the coordinates of the optical center along the x and y axes. s is the depth of the point.
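The projection s·m = A[R|t]·M can be checked numerically. The sketch below uses plain Python lists, an identity rotation, and illustrative intrinsic values; all numbers are assumptions for demonstration, not parameters from the patent:

```python
def project(A, R, t, M):
    """Project world point M = [xw, yw, zw] to pixel (xi, yi) via s*m = A[R|t]M."""
    # camera coordinates: Xc = R*M + t
    Xc = [sum(R[i][j] * M[j] for j in range(3)) + t[i] for i in range(3)]
    # homogeneous pixel vector: [s*xi, s*yi, s] = A * Xc
    p = [sum(A[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    s = p[2]  # depth of the point along the optical axis
    return p[0] / s, p[1] / s, s

fx, fy, cx, cy = 800.0, 800.0, 320.0, 240.0           # illustrative intrinsics
A = [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]   # intrinsic matrix
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # identity rotation
t = [0.0, 0.0, 0.0]                                   # zero translation
xi, yi, s = project(A, R, t, [1.0, 0.5, 10.0])
# a point 10 units ahead and to the upper right of the axis lands at (400, 280), depth 10
```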
Following the formula above, the computation of the coordinate positioning unit proceeds in two steps: a depth estimation step and a world-coordinate solution step. In the depth estimation step, a reference point is taken on the ground along the horizontal line at the bottom of the vehicle's contour box in the image; its two-dimensional image coordinate is m_g, and the z-component of its world coordinate is known to be z_w = 0. Moreover, because this ground point and the center of the bottom edge of the vehicle's contour box lie on the same horizontal line of the image, the two pixels share the same depth s. This yields the first system of linear equations (e31):
s·m_g = A[R|t]·M_0
where M_0 = [x_0, y_0, 0]^T is the reference point's three-dimensional coordinate in the world coordinate system, m_g = [x_g, y_g]^T is the reference point's two-dimensional coordinate in the image, and R, t, the intrinsic matrix A, and the depth s are as defined above.
In the world-coordinate solution step, applying the coordinate transformation formula to the center point m of the bottom edge of the vehicle's contour box gives the second system of linear equations (e32):
s·m = A[R|t]·M
where M = [x_w, y_w, z_w]^T is the three-dimensional world coordinate of the bottom-edge center of the vehicle contour box detected in the image, m = [x_i, y_i]^T is that point's two-dimensional image coordinate, and R, t, the intrinsic matrix A, and the depth s are as defined above.
Combining the two systems, the known quantities of (e31) can be used to eliminate the unknown depth s from (e32), which yields the world coordinate M of the center point of the bottom edge of the vehicle's contour box.
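Because both the reference point and the vehicle's bottom point lie on the ground plane (z_w = 0), eliminating s amounts to solving a small linear system built from A[R|t]: writing s·[x_i, y_i, 1]^T = P·[X, Y, 0, 1]^T with P = A[R|t] gives three equations in the three unknowns X, Y, s. The sketch below solves it with Gaussian elimination, using an assumed camera pose (camera 1.5 units above the ground, looking along the world y axis); the pose and intrinsics are illustrative, not parameters from the patent:

```python
def solve3(Mmat, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    aug = [row[:] + [bi] for row, bi in zip(Mmat, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(3):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * c for a, c in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]

def pixel_to_ground(A, R, t, xi, yi):
    """Recover the world coordinate (X, Y) of a ground-plane point (z_w = 0)
    from its pixel (xi, yi), eliminating the unknown depth s."""
    # P = A[R|t]: the 3x4 projection matrix
    Rt = [[R[i][0], R[i][1], R[i][2], t[i]] for i in range(3)]
    P = [[sum(A[i][k] * Rt[k][j] for k in range(3)) for j in range(4)]
         for i in range(3)]
    # rearrange P[:,0]*X + P[:,1]*Y - s*[xi, yi, 1]^T = -P[:,3]
    m = [xi, yi, 1.0]
    Mmat = [[P[i][0], P[i][1], -m[i]] for i in range(3)]
    b = [-P[i][3] for i in range(3)]
    X, Y, s = solve3(Mmat, b)
    return X, Y, s

A = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]  # illustrative intrinsics
R = [[1.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]         # world z up -> camera forward
t = [0.0, 1.5, 0.0]                                              # camera 1.5 above ground
X, Y, s = pixel_to_ground(A, R, t, 400.0, 360.0)  # pixel of the ground point (1, 10, 0)
```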
Further, for each vehicle contour box (x, y, w, h) in each image, the center point of the bottom of the contour is computed as (x + w/2, y + h), i.e. m = (x + w/2, y + h). Since m_0 and m lie on the same horizontal line, correspondingly m_0 = (x′, y + h). The quantities m, m_0, A, R, and t are then all given, and solving the simultaneous systems (e31, e32) produces the world coordinate (X, Y, Z) of the bottom of the vehicle's rear edge. In this way, every contour-box coordinate (x, y, w, h) is converted into the world coordinate (X, Y, 0) of the center of the bottom of the vehicle's rear edge.
The vehicle tracking unit 40 is responsible for identifying the same vehicle across multiple frames and forming each vehicle's driving trajectory: it recognizes a recurring vehicle, assigns it a unique ID, and links its positions into a trajectory, thereby tracking the preceding vehicle.
From the vehicle contour boxes of two consecutive frames, the pixel distances from the centers of the contour boxes in the first frame to the contour boxes in the second frame are computed; the pair of boxes at the smallest distance is taken to be the same vehicle's contour box in the two frames. Performing this computation for every pair of adjacent images in the sequence yields all inter-frame box matches, corresponding to all the distinct vehicles in the image sequence; for each vehicle, concatenating its contour boxes across the frames gives that vehicle's driving trajectory.
In this embodiment, the vehicle tracking unit 40 comprises a distance calculation module and a distance matching module.
Specifically, the distance calculation module computes, from the vehicle contour boxes of two consecutive pictures, the pixel distances from the centers of the contour boxes in the first frame to the contour boxes in the second frame; the pair at the smallest distance is regarded as the same vehicle's contour box in the two pictures. This distance computation is performed for every pair of adjacent pictures in the image sequence, giving the distances between all corresponding vehicle contour boxes in each adjacent pair.
Specifically, the distance matching module is configured as follows: for two adjacent pictures in the image sequence, it first matches, under the nearest-match principle, the pair of vehicle contour boxes with the smallest distance, assigns the matched pair the same vehicle ID, and removes the matched boxes; it then continues matching among the remaining boxes of the two pictures by the same smallest-distance rule until every contour box in one of the two pictures has been matched, ignoring any boxes left over in the other picture. Applying this matching operation to all adjacent picture pairs in the image sequence yields all inter-frame box matches together with the unique ID of the vehicle each box belongs to. For each vehicle assigned a unique ID, concatenating its contour boxes across the frames gives that vehicle's driving trajectory.
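The greedy nearest-distance matching described above can be sketched as follows (pure Python; boxes use the (x, y, w, h) format of the detection unit, and the frame data are illustrative):

```python
import math

def center(box):
    """Pixel center of a contour box (x, y, w, h)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def greedy_match(boxes1, boxes2):
    """Greedily pair contour boxes of two adjacent frames by smallest
    center-to-center pixel distance; leftover boxes stay unmatched."""
    dist = {(i, j): math.dist(center(b1), center(b2))
            for i, b1 in enumerate(boxes1)
            for j, b2 in enumerate(boxes2)}
    pairs, used1, used2 = [], set(), set()
    for (i, j), _ in sorted(dist.items(), key=lambda kv: kv[1]):
        if i not in used1 and j not in used2:
            pairs.append((i, j))  # box i of frame 1 and box j of frame 2 get one vehicle ID
            used1.add(i)
            used2.add(j)
    return pairs

# two vehicles that swap list order between frames are still matched correctly
pairs = greedy_match([(0, 0, 10, 10), (100, 0, 10, 10)],
                     [(102, 1, 10, 10), (2, 1, 10, 10)])
```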
Further, the distance calculation module is configured as follows: for two adjacent images fig_1 and fig_2 of the same image sequence there are vehicle contour-box sequences (x_i, y_i, w_i, h_i), i = 1, …, k, for fig_1 and (x′_j, y′_j, w′_j, h′_j), j = 1, …, k′, for fig_2, where k and k′ are the numbers of vehicle contour boxes computed for fig_1 and fig_2 by the vehicle detection unit 20. Defining the distance between box i of fig_1 and box j of fig_2 as the pixel distance between their centers,

d(i, j) = sqrt((x_i + w_i/2 - x′_j - w′_j/2)^2 + (y_i + h_i/2 - y′_j - h′_j/2)^2),

the distances between all k×k′ pairs of vehicle contour boxes can be obtained.
Further, the distance matching module is configured as follows: each vehicle contour box in image fig_1 is matched, under the nearest-distance principle, to a contour box in image fig_2. Computing these matches in order for all pairs of adjacent images in a sequence yields a number of continuous match chains, where the length K of a chain is the number of frames in which the corresponding vehicle i appears. Combined with the results of the coordinate positioning unit 30, this gives the coordinate sequence of vehicle i; the coordinate sequences are output in XML format as the final output of the preceding-vehicle distance tracking system.
This embodiment also provides a preceding-vehicle distance tracking method for automatic driving, which comprises the following steps:
Step 1: camera calibration. The camera calibration step calibrates the position parameters and the optical parameters of the on-board camera and records them in the system's data acquisition software.
Specifically, the camera position parameters comprise the distances from the camera's mounting position to the front of the vehicle, the chassis, and the two sides of the body, as well as the camera's solid angle relative to the chassis. The distances relative to the front, chassis, and body sides are represented by the translation matrix t of the camera's extrinsic parameter matrix, and the solid angle relative to the chassis by the rotation matrix R of the extrinsic parameter matrix.
The camera's optical parameters are represented by the intrinsic parameter matrix A, where A[1,1] = f_x, A[1,3] = c_x, A[2,2] = f_y, A[2,3] = c_y, A[3,3] = 1 and zeros everywhere else; f_x and f_y are the camera's focal lengths along the x and y axes, and c_x and c_y the coordinates of the optical center along the x and y axes.
Step 2: identification and positioning of the preceding vehicle. The recognition and positioning of the vehicle ahead are realized by the vehicle detection unit 20 and the coordinate positioning unit 30.
Specifically, an image sequence comprises n pictures at equal time intervals, "picture 1", "picture 2", …, "picture n", and the n vehicle detection subunits correspond one-to-one to the frames of the sequence. Each vehicle detection subunit detects the contour boxes of all vehicles in one picture; each contour box is represented by the coordinates of its upper-left corner together with its width and height. For a picture containing k vehicles, the identification and positioning step yields the following set of k contour boxes:
((x1, y1, w1, h1), (x2, y2, w2, h2), …, (xk, yk, wk, hk)), where x, y, w, and h denote the pixel abscissa of the box's upper-left corner, the pixel ordinate of the upper-left corner, the width, and the height, respectively, and the subscripts 1, 2, …, k index the k vehicles in the picture.
For each contour box in each frame, the coordinate positioning unit first obtains, from the box parameters (x, y, w, h) of the vehicle contour box in the picture, the two-dimensional coordinate of the midpoint of the vehicle's bottom edge, m = (x + w/2, y + h). It then converts this two-dimensional coordinate into the three-dimensional coordinate M = (X, Y, 0) in the real-world coordinate system, where X and Y are the lateral and longitudinal distances of the preceding vehicle relative to the ego vehicle in the real-world coordinate system.
Step 3: tracking the trajectories of preceding vehicles. The vehicle tracking unit identifies vehicles that recur in the image sequence, assigns every distinct vehicle appearing in the sequence a different unique ID, and outputs each vehicle's trajectory sequence to an XML file.
Through Step 2, every frame of the image sequence is converted into a number of coordinate points, each representing the position of a preceding vehicle in that frame. The vehicle tracking unit identifies the same vehicle across adjacent frames and thereby organizes the vehicle coordinates of the whole image sequence into several continuous trajectories, each corresponding to a preceding vehicle with a unique ID. A trajectory is a coordinate sequence in which X and Y represent the lateral and longitudinal distances of the preceding vehicle relative to the ego vehicle, the subscripts t, t+1, …, T index the consecutive frames in which the vehicle appears in the camera, and the superscript i is the vehicle's unique ID. All vehicle trajectories are finally saved in an XML file; an example of the storage format is as follows:
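The patent does not reproduce the XML layout at this point, so the sketch below writes a hypothetical trajectory file with Python's standard xml.etree.ElementTree; the element and attribute names are illustrative assumptions, not the patent's actual format:

```python
import xml.etree.ElementTree as ET

def trajectories_to_xml(tracks):
    """Serialize {vehicle_id: [(t, X, Y), ...]} to an XML string.

    X and Y are the lateral and longitudinal distances of the preceding
    vehicle relative to the ego vehicle; the tag and attribute names are
    made up for illustration, not taken from the patent.
    """
    root = ET.Element("trajectories")
    for vid, points in tracks.items():
        veh = ET.SubElement(root, "vehicle", id=str(vid))
        for t, x, y in points:
            ET.SubElement(veh, "point", t=str(t), X=str(x), Y=str(y))
    return ET.tostring(root, encoding="unicode")

# one vehicle observed in two consecutive frames
xml_text = trajectories_to_xml({1: [(0, -0.3, 12.5), (1, -0.2, 12.1)]})
```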
The steps of the present application can be reordered, combined, and deleted according to actual needs.
The units of the device of the present application can be combined, divided, and deleted according to actual needs.
Although the present application has been disclosed in detail with reference to the accompanying drawings, it should be understood that these descriptions are merely exemplary and are not intended to limit the scope of the application. The protection scope of the present application is defined by the appended claims and may include various modifications, adaptations, and equivalent solutions made to the invention without departing from its scope and spirit.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910953010.XA CN110717445B (en) | 2019-10-09 | 2019-10-09 | Front vehicle distance tracking system and method for automatic driving |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110717445A CN110717445A (en) | 2020-01-21 |
CN110717445B true CN110717445B (en) | 2022-08-23 |