CN114162145A

CN114162145A - Automatic vehicle driving method and device and electronic equipment

Info

Publication number: CN114162145A
Application number: CN202210033315.0A
Authority: CN
Inventors: 张艺浩; 韩志华; 徐修信; 郭立群
Original assignee: Zhitu Shanghai Intelligent Technology Co ltd; Suzhou Zhitu Technology Co Ltd
Current assignee: Zhitu Shanghai Intelligent Technology Co ltd; Suzhou Zhitu Technology Co Ltd
Priority date: 2022-01-12
Filing date: 2022-01-12
Publication date: 2022-03-11

Abstract

The application provides a vehicle automatic driving method, a vehicle automatic driving device and electronic equipment. The method first obtains driving information of a target vehicle, driving information of related vehicles, and lane information; it then uses a first neural network model to predict driving scores for a plurality of different driving strategies; and finally it determines the target driving strategy according to the driving scores. In this technique, a plurality of driving strategies are first determined through driving decisions, and the neural network model then predicts a score for each of them, so that the finally determined target driving strategy combines a traditional decision algorithm with the neural network's prediction of the target vehicle's driving performance. A target driving strategy determined from the scores predicted by the first neural network model is therefore closer to actual complex traffic conditions, which effectively improves the safety of automatic driving.

Description

Vehicle automatic driving method, device and electronic device

Technical Field

The present application relates to the technical field of automatic driving, and in particular to a vehicle automatic driving method, device and electronic device.

Background

With the rapid development of artificial intelligence technology and urban intelligent rail transit, self-driving vehicles, that is, self-driving systems, have gradually entered the daily life of the public. A self-driving system is a comprehensive system that brings together many advanced technologies. Its key links, environmental information acquisition and intelligent decision control, rely on innovations and breakthroughs in a series of advanced technologies such as sensor technology, image recognition technology, electronic and computer technology, and control technology.

How an autonomous vehicle can make reasonable and forward-looking driving decisions in a dynamic, complex environment, so that it drives quickly and smoothly, is a very challenging research topic in the field of autonomous driving. In the prior art, the driving strategy is usually determined by a finite-state decision algorithm.

However, traditional decision algorithms based on rules and finite states formulate their decisions for specific driving situations, whereas the real traffic environment is a complex combination of many factors and does not fully match such simply defined driving situations. As a result, a driving strategy determined by a traditional decision algorithm may not suit actual complex road conditions, and dangerous situations such as collisions may occur.

Summary of the Invention

In view of this, the purpose of the present application is to provide a vehicle automatic driving method, device and electronic device, so as to improve the safety of vehicle automatic driving.

In a first aspect, an embodiment of the present application provides a vehicle automatic driving method. The method includes: acquiring first driving information of a target vehicle at the current moment, second driving information corresponding to related vehicles within a first specified range of the target vehicle, and lane information of related lanes within a second specified range of the target vehicle; predicting, by means of a first neural network model, the first driving information, the second driving information and the lane information, driving scores of the target vehicle traveling with a plurality of different predetermined driving strategies, wherein a driving score characterizes the driving condition and vehicle performance of the target vehicle when it travels with a predetermined driving strategy, and the first neural network model is trained with sample data corresponding to different traffic conditions; and determining a target driving strategy from the plurality of different driving strategies according to their respective driving scores, so that the target vehicle travels according to the target driving strategy at the moment following the current moment.

Further, the related vehicles within the specified range of the target vehicle include at least one of the following: a first related vehicle that is in the target vehicle's lane, less than a first distance threshold from the target vehicle, and ahead of it in its direction of travel; a second related vehicle that is in the lane to the right of the target vehicle's lane, less than a second distance threshold from the target vehicle, and ahead of it in its direction of travel; a third related vehicle that is in the lane to the right of the target vehicle's lane, less than a third distance threshold from the target vehicle, and behind it in its direction of travel; a fourth related vehicle that is in the lane to the left of the target vehicle's lane, less than a fourth distance threshold from the target vehicle, and ahead of it in its direction of travel; and a fifth related vehicle that is in the lane to the left of the target vehicle's lane, less than a fifth distance threshold from the target vehicle, and behind it in its direction of travel.

Further, the first driving information of the target vehicle at the current moment includes a first position and a first speed of the target vehicle. The step of acquiring the second driving information corresponding to the related vehicles within the specified range of the target vehicle includes: acquiring a second position and a second speed of a related vehicle at the current moment; calculating the relative position of the second position with respect to the first position, and the relative speed of the second speed with respect to the first speed; and determining the relative position and the relative speed as the second driving information corresponding to the related vehicle.

Further, the method also includes: if the target vehicle's lane is the rightmost lane of the road, setting the second driving information of the related vehicles in the lane to its right to 0; and if the target vehicle's lane is the leftmost lane of the road, setting the second driving information of the related vehicles in the lane to its left to 0.
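The relative-state computation and the zero fill for a missing adjacent lane can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `relative_state`, the use of 2-D tuples for position and velocity, and the use of `None` to mark an empty slot are all assumptions made for the example.

```python
from typing import Optional, Tuple

Vec2 = Tuple[float, float]  # (x, y); a 2-D representation is assumed here


def relative_state(
    ego_pos: Vec2,
    ego_vel: Vec2,
    other_pos: Optional[Vec2],
    other_vel: Optional[Vec2],
) -> Tuple[float, float, float, float]:
    """Second driving information for one related-vehicle slot.

    Returns (dx, dy, dvx, dvy) relative to the target vehicle. When the slot
    is empty (for example because the adjacent lane does not exist), every
    component is set to 0, as the text describes.
    """
    if other_pos is None or other_vel is None:
        return (0.0, 0.0, 0.0, 0.0)
    return (
        other_pos[0] - ego_pos[0],
        other_pos[1] - ego_pos[1],
        other_vel[0] - ego_vel[0],
        other_vel[1] - ego_vel[1],
    )
```

Note that an all-zero entry is also what a vehicle at the same position and speed as the target vehicle would produce; whether the model needs a separate presence flag is not addressed by the text.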

Further, the related lanes include the target vehicle's lane, the lane to its right, and the lane to its left. The step of acquiring the lane information of the related lanes within the second specified range of the target vehicle includes: taking the position of the vehicle at the current moment as the origin; calculating the lateral distance and the longitudinal distance between the origin and each of a first preset number of consecutive positions in a related lane, wherein every two adjacent positions among the first preset number of consecutive positions are the same distance apart; and determining the set of lateral and longitudinal distances corresponding to these positions as the lane information of the related lane within the second specified range of the target vehicle.
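A short sketch of this lane encoding follows. It assumes a straight lane whose sample points are given in world coordinates and a known vehicle heading, so that "lateral" and "longitudinal" can be measured in the vehicle's own frame; the names `equally_spaced_points` and `lane_info` and the rotation into the vehicle frame are assumptions of this example, not details stated in the text.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def equally_spaced_points(lane_y: float, spacing: float, count: int) -> List[Point]:
    """Sample `count` points on a straight lane at y = lane_y, `spacing` apart."""
    return [(spacing * (i + 1), lane_y) for i in range(count)]


def lane_info(ego_xy: Point, ego_heading: float, points: List[Point]) -> List[Point]:
    """Return (lateral, longitudinal) offsets of each point, with the vehicle as origin.

    ego_heading is the vehicle's yaw in radians, measured from the world x-axis.
    """
    cos_h, sin_h = math.cos(ego_heading), math.sin(ego_heading)
    offsets = []
    for x, y in points:
        dx, dy = x - ego_xy[0], y - ego_xy[1]
        longitudinal = dx * cos_h + dy * sin_h   # along the heading
        lateral = -dx * sin_h + dy * cos_h       # positive to the left of the heading
        offsets.append((lateral, longitudinal))
    return offsets
```

For a vehicle at the origin with heading 0, `lane_info((0.0, 0.0), 0.0, equally_spaced_points(3.5, 5.0, 2))` yields the two points at lateral offset 3.5 and longitudinal offsets 5.0 and 10.0.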

Further, the plurality of driving strategies include at least any two of the following: accelerating straight ahead, keeping speed straight ahead, decelerating straight ahead, emergency braking in a straight line, changing to the left lane at constant speed, and changing to the right lane at constant speed.

Further, the first neural network model is trained through the following steps: acquiring sample data, wherein the sample data includes speed information and position information of a target sample vehicle, speed information and position information of related sample vehicles associated with the target sample vehicle, and lane information of the lanes corresponding to the target sample vehicle; calculating, according to the sample data and a loss function, the loss value of an initial neural network model at each of a second preset number of consecutive moments, wherein the loss function includes one or more of a collision parameter, an energy consumption parameter, a lane change penalty parameter, and a driving efficiency parameter; and adjusting the parameters of the initial neural network model according to the loss values, and determining the initial neural network model at the time the training stop condition is met as the first neural network model.
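The text names the four terms of the loss function but not how they are combined. The sketch below shows one plausible per-moment combination as a weighted sum. Everything specific in it is an assumption made for illustration: the `StepOutcome` fields, the weights, and the form of each term are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """What happened during one time step of a sample drive (hypothetical fields)."""
    collided: bool
    energy_used: float    # energy or fuel consumed during the step, arbitrary units
    changed_lane: bool
    speed: float          # m/s
    target_speed: float   # desired cruising speed in m/s, must be > 0


def step_cost(
    outcome: StepOutcome,
    w_collision: float = 100.0,
    w_energy: float = 1.0,
    w_lane_change: float = 0.5,
    w_efficiency: float = 1.0,
) -> float:
    """Weighted sum of collision, energy, lane-change and efficiency terms."""
    collision = w_collision if outcome.collided else 0.0
    energy = w_energy * outcome.energy_used
    lane_change = w_lane_change if outcome.changed_lane else 0.0
    # Efficiency term: fractional shortfall below the target speed.
    shortfall = max(0.0, outcome.target_speed - outcome.speed) / outcome.target_speed
    return collision + energy + lane_change + w_efficiency * shortfall
```

With these assumed weights a collision dominates every other term, which reflects the safety emphasis of the text; the actual weighting would have to be tuned.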

Further, the method also includes: determining, by the initial neural network model, the predicted scores of the plurality of driving strategies for the moment following the current moment among the second preset number of consecutive moments; taking the driving strategy with the highest predicted score as the next-moment driving strategy, and obtaining the next-moment sample data generated by the target sample vehicle traveling with that strategy; and continuing to train the initial neural network model based on the next-moment sample data.

Further, the step of determining the target driving strategy from the plurality of different driving strategies according to their respective driving scores includes: determining the driving strategies whose driving scores are higher than a preset score threshold as a candidate driving strategy set; and determining the target driving strategy from the candidate driving strategy set.

Further, the step of determining the target driving strategy from the candidate driving strategy set includes: judging whether the driving strategy with the highest driving score in the candidate set satisfies a preset safe driving condition; if it does, determining that driving strategy as the target driving strategy; and if it does not, deleting that driving strategy from the candidate set.
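The threshold filter and safety check can be sketched as a short loop. The sketch assumes that, after an unsafe strategy is deleted, the check is repeated on the next-highest candidate, and that `None` is returned when no candidate survives; the text states neither point explicitly. The function name `choose_strategy` and the `is_safe` callback are hypothetical.

```python
from typing import Callable, Dict, Optional


def choose_strategy(
    scores: Dict[str, float],
    threshold: float,
    is_safe: Callable[[str], bool],
) -> Optional[str]:
    """Pick the highest-scoring strategy above `threshold` that passes `is_safe`."""
    # Candidate set: strategies whose score exceeds the preset threshold.
    candidates = {name: s for name, s in scores.items() if s > threshold}
    while candidates:
        best = max(candidates, key=candidates.get)
        if is_safe(best):
            return best
        # Unsafe: delete it and (assumed) re-check the remaining candidates.
        del candidates[best]
    return None
```

For example, with scores `{"accelerate": 96, "keep_speed": 80, "brake": 10}`, a threshold of 50, and a safety check that rejects `"accelerate"`, the function returns `"keep_speed"`.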

Further, the method also includes: if the target driving strategy is a constant-speed lane change to the left lane or to the right lane, controlling the vehicle to change lanes by the following formula:

[formula not reproduced in this text]

where the symbol on the left-hand side (also not reproduced) denotes the steering wheel angle; θ_n and θ_f denote the angles formed between the ego vehicle's position and, respectively, a first forward position and a second forward position on the target lane; Δt is the time step; k_f, k_n and k_I are constants characterizing driving behavior; and the target lane is the lane to which the target vehicle is preparing to switch from its current lane.
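The symbols defined here (a near angle θ_n, a far angle θ_f, the constants k_f, k_n and k_I, and a time step Δt) match those of the two-point visual steering model of Salvucci and Gray, whose discrete form is Δφ = k_f·Δθ_f + k_n·Δθ_n + k_I·θ_n·Δt. Whether the application uses exactly this form cannot be confirmed from the text, so the following is only a sketch under that assumption; the function name and the gain values used in the example are hypothetical.

```python
def steering_update(
    phi: float,
    theta_n: float,
    theta_f: float,
    prev_theta_n: float,
    prev_theta_f: float,
    dt: float,
    k_f: float,
    k_n: float,
    k_i: float,
) -> float:
    """One step of a two-point steering law (assumed form, see lead-in).

    phi is the current steering angle; theta_n / theta_f are the current near
    and far angles, prev_* their values one time step earlier. All angles are
    in radians.
    """
    d_phi = (
        k_f * (theta_f - prev_theta_f)    # keep the far point stable
        + k_n * (theta_n - prev_theta_n)  # keep the near point stable
        + k_i * theta_n * dt              # drive the near angle toward zero
    )
    return phi + d_phi
```

When both angles are unchanged between steps, only the integral-like term k_I·θ_n·Δt acts, steering the vehicle gradually toward the near point on the target lane.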

In a second aspect, an embodiment of the present application further provides a vehicle automatic driving device. The device includes: an information acquisition module, configured to acquire first driving information of a target vehicle at the current moment, second driving information corresponding to related vehicles within a first specified range of the target vehicle, and lane information of related lanes within a second specified range of the target vehicle; a prediction module, configured to predict, by means of a first neural network model, the first driving information, the second driving information and the lane information, driving scores of the target vehicle traveling with a plurality of different predetermined driving strategies, wherein a driving score characterizes the driving condition and vehicle performance of the target vehicle when it travels with the predetermined driving strategy, and the first neural network model is trained with sample data corresponding to different traffic conditions; and a target driving strategy determination module, configured to determine a target driving strategy from the plurality of different driving strategies according to their respective driving scores, so that the target vehicle travels according to the target driving strategy at the moment following the current moment.

In a third aspect, an embodiment of the present application further provides an electronic device, including a processor and a memory. The memory stores computer-executable instructions that can be executed by the processor, and the processor executes the computer-executable instructions to implement the vehicle automatic driving method of the first aspect.

In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions which, when invoked and executed by a processor, cause the processor to implement the vehicle automatic driving method of the first aspect.

Compared with the prior art, the present application has the following beneficial effects:

The vehicle automatic driving method, device and electronic device provided by the embodiments of the present application first obtain the driving information of the target vehicle, the driving information of the related vehicles, and the lane information; then predict driving scores with the first neural network model to obtain the driving scores of a plurality of different driving strategies; and finally determine the target driving strategy according to the driving scores. In this technique, a plurality of driving strategies are first determined through driving decisions, and the neural network model then predicts a score for each of them, so that the finally determined target driving strategy combines a traditional decision algorithm with the neural network's prediction of the target vehicle's driving performance. Because the first neural network model is trained with sample data from a variety of traffic conditions, the target driving strategy determined from its predicted scores is closer to actual complex traffic conditions, which effectively improves the safety of vehicle automatic driving.

Additional features and advantages of the present disclosure will be set forth in the description that follows; some of them can be inferred or unambiguously determined from the description, or learned by practicing the techniques of the present disclosure described above.

To make the above objects, features and advantages of the present disclosure more apparent and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.

Description of the Drawings

To illustrate the specific embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below show some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic structural diagram of an electronic system provided by an embodiment of the present application;

FIG. 2 is a flowchart of a vehicle automatic driving method provided by an embodiment of the present application;

FIG. 3 is a flowchart of another vehicle automatic driving method provided by an embodiment of the present application;

FIG. 4 is a schematic structural diagram of a first neural network model provided by an embodiment of the present application;

FIG. 5 is a flowchart of a training method for a first neural network model provided by an embodiment of the present application;

FIG. 6 is a schematic structural diagram of a vehicle automatic driving device provided by an embodiment of the present application;

FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.

Detailed Description

To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in the present application without creative effort fall within the protection scope of the present application.

The main purposes of autonomous driving are to reduce labor costs, increase transportation efficiency, reduce energy consumption, and avoid traffic accidents. As the field of artificial intelligence develops, autonomous driving technology for motor vehicles is maturing. Early research and development focused mainly on a safe and smooth driving experience. In recent years, how to improve driving efficiency and reduce fuel consumption without compromising safety has attracted wider attention. It remains very challenging for an autonomous vehicle to make reasonable and forward-looking driving decisions in a dynamic, complex environment so that it drives quickly and smoothly. Although traditional decision algorithms based on rules and finite states have succeeded in some autonomous driving tasks, one of their biggest drawbacks is that their decisions must be formulated for specific driving situations, which makes it difficult for them to generalize to more complex real traffic environments. On this basis, the embodiments of the present application provide a vehicle automatic driving method, device and electronic device, so as to improve the safety of vehicle automatic driving.

Refer to the schematic structural diagram of the electronic system 100 shown in FIG. 1. The electronic system can be used to implement the vehicle automatic driving method and device of the embodiments of the present application.

As shown in FIG. 1, the electronic system 100 includes one or more processing devices 102 and one or more storage devices 104. Optionally, the electronic system 100 may also include an input device 106, an output device 108, and one or more data acquisition devices 110. These components are interconnected by a bus system 112 and/or other forms of connection mechanism (not shown). It should be noted that the components and structure of the electronic system 100 shown in FIG. 1 are exemplary rather than limiting; as required, the electronic system may have only some of the components in FIG. 1, or may have other components and structures.

The processing device 102 may be a server, an intelligent terminal, or a device containing a central processing unit (CPU) or another form of processing unit with data processing capability and/or instruction execution capability. It can process data from other components in the electronic system 100 and can also control other components in the electronic system 100 to perform vehicle automatic driving functions.

The storage device 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and/or cache memory. Non-volatile memory may include, for example, read-only memory (ROM), hard disks, and flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processing device 102 may run the program instructions to implement the client functions (implemented by the processing device) in the embodiments of the present application described below and/or other desired functions. Various application programs and various data, such as data used and/or generated by the application programs, may also be stored in the computer-readable storage medium.

The input device 106 may be a device used by a user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.

The output device 108 may output various information (for example, images or sounds) to the outside (for example, to a user), and may include one or more of a display, a speaker, and the like.

The data acquisition device 110 may acquire data to be processed and store it in the storage device 104 for use by other components.

Exemplarily, the components used to implement the vehicle automatic driving method, device and electronic device according to the embodiments of the present application may be integrated or distributed. For example, the processing device 102, the storage device 104, the input device 106 and the output device 108 may be integrated into one unit, while the data acquisition device 110 is placed at a designated location where data can be collected. When the components of the electronic system are integrated, the electronic system can be implemented as an intelligent terminal such as a camera, a smartphone, a tablet computer, a computer, or a vehicle-mounted terminal.

FIG. 2 is a flowchart of a vehicle automatic driving method provided by an embodiment of the present application. As shown in FIG. 2, the method includes the following steps:

S202: Acquire first driving information of the target vehicle at the current moment, second driving information corresponding to related vehicles within a first specified range of the target vehicle, and lane information of related lanes within a second specified range of the target vehicle;

Here, the target vehicle is the vehicle controlled by the method provided in this embodiment, and the first driving information may include information such as the speed and position of the target vehicle. A related vehicle is a vehicle whose distance from the target vehicle is within the first specified range, for example the vehicles ahead of and behind the target vehicle in the same lane, or the vehicles ahead of and behind it in an adjacent lane. The second driving information includes the speed and position information of the related vehicles. The lane information includes information such as the relative distance between a lane and the target vehicle's lane.

S204: Predict, by means of the first neural network model, the first driving information, the second driving information and the lane information, the driving scores of the target vehicle traveling with a plurality of different predetermined driving strategies, wherein a driving score characterizes the driving condition and vehicle performance of the target vehicle when it travels with a predetermined driving strategy, and the first neural network model is trained with sample data corresponding to different traffic conditions;

The first neural network model is a driving score prediction model. It processes the input first driving information, second driving information and lane information to obtain a plurality of driving scores, each corresponding to one predetermined driving strategy. For example, if five driving strategies are preset and their driving scores are 10, 25, 80, 45 and 96 respectively, the scores determine which driving strategy the target vehicle adopts at the next moment. The training method of the first neural network model is described in detail later and is not repeated here.

S206: Determine a target driving strategy from the plurality of different predetermined driving strategies according to their respective driving scores, so that the target vehicle travels according to the target driving strategy at the moment following the current moment.
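Steps S202 to S206 can be sketched end to end as below. The model is passed in as a plain callable so that the example stays self-contained; the stub used in the usage note simply returns the five example scores from the text. The function name `decide`, the flat feature list, the strategy names and their order, and the rule of taking the highest score are assumptions of this sketch.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical names and ordering for five preset strategies.
STRATEGIES = [
    "accelerate_straight",
    "keep_speed_straight",
    "decelerate_straight",
    "change_lane_left",
    "change_lane_right",
]


def decide(
    model: Callable[[List[float]], Sequence[float]],
    first_info: Sequence[float],
    second_info: Sequence[float],
    lane_info: Sequence[float],
    strategies: Sequence[str] = STRATEGIES,
) -> Tuple[str, Sequence[float]]:
    """Return the chosen strategy and the raw scores."""
    # S202: the three kinds of acquired information form the model input.
    features = [*first_info, *second_info, *lane_info]
    # S204: one driving score per predetermined strategy.
    scores = model(features)
    if len(scores) != len(strategies):
        raise ValueError("model must return one score per strategy")
    # S206: here the highest-scoring strategy is taken as the target strategy.
    best = max(range(len(scores)), key=lambda i: scores[i])
    return strategies[best], scores
```

Calling `decide(lambda f: [10, 25, 80, 45, 96], [20.0], [0.0] * 4, [3.5, 5.0])` selects the fifth strategy in the list, since 96 is the highest of the example scores.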

The vehicle automatic driving method provided by this embodiment first obtains the driving information of the target vehicle, the driving information of the related vehicles, and the lane information; then predicts driving scores with the first neural network model to obtain the driving scores of a plurality of different driving strategies; and finally determines the target driving strategy according to the driving scores. In this technique, a plurality of driving strategies are first determined through driving decisions, and the neural network model then predicts a score for each of them, so that the finally determined target driving strategy combines a traditional decision algorithm with the neural network's prediction of the target vehicle's driving performance. Because the first neural network model is trained with sample data from a variety of traffic conditions, the target driving strategy determined from its predicted scores is closer to actual complex traffic conditions, which effectively improves the safety of vehicle automatic driving.

In some possible implementations, the related vehicles within the specified range of the target vehicle include at least one of the following:

(1) a first related vehicle in the lane where the target vehicle is located, whose distance from the target vehicle is less than a first distance threshold and which is ahead of the target vehicle in its driving direction;

(2) a second related vehicle in the lane to the right of the target vehicle's lane, whose distance from the target vehicle is less than a second distance threshold and which is ahead of the target vehicle in its driving direction;

(3) a third related vehicle in the lane to the right of the target vehicle's lane, whose distance from the target vehicle is less than a third distance threshold and which is behind the target vehicle in its driving direction;

(4) a fourth related vehicle in the lane to the left of the target vehicle's lane, whose distance from the target vehicle is less than a fourth distance threshold and which is ahead of the target vehicle in its driving direction;

(5) a fifth related vehicle in the lane to the left of the target vehicle's lane, whose distance from the target vehicle is less than a fifth distance threshold and which is behind the target vehicle in its driving direction.

FIG. 3 is a flowchart of another vehicle automatic driving method provided by an embodiment of the present application. This method focuses on how the second driving information corresponding to the related vehicles is obtained. As shown in FIG. 3, the method specifically includes the following steps:

S302: Acquire the first driving information of the target vehicle at the current moment and the lane information of the relevant lanes within a second specified range of the target vehicle;

The first driving information of the target vehicle at the current moment includes the first position and the first speed of the target vehicle.

S304: Acquire the second position and the second speed of the related vehicle at the current moment;

S306: Calculate the relative position of the second position with respect to the first position, and the relative speed of the second speed with respect to the first speed;

S308: Determine the relative position and the relative speed as the second driving information corresponding to the related vehicle.

In some examples, if the lane where the target vehicle is located is the rightmost lane of the road, the second driving information of the related vehicles in the lane to the right of the target vehicle's lane is set to 0;

In other examples, if the lane where the target vehicle is located is the leftmost lane of the road, the second driving information of the related vehicles in the lane to the left of the target vehicle's lane is set to 0.

The right lane of the rightmost lane and the left lane of the leftmost lane are lanes beyond the road boundary, meaning that the lane under consideration does not exist on the real road. For example, when driving in the rightmost lane, obstacle information must be considered for three lanes: the ego lane, the left lane and the right lane. In this case the ego lane and the left lane really exist, so their obstacle information is real. The right lane, however, does not exist and is a virtual concept. For consistency, it is assumed that in this case the right lane is full of virtual vehicles, which ensures that the lane beyond the boundary cannot be entered.

S310: Predict, using the first neural network model, the first driving information, the second driving information and the lane information, the driving scores of the target vehicle driving under a plurality of different predetermined driving strategies;

S312: Determine a target driving strategy from the plurality of different predetermined driving strategies according to their respective driving scores, so that the target vehicle drives according to the target driving strategy at the moment following the current moment.

With the method provided by the above embodiment, the information of the target vehicle, the related vehicles and the lanes is fully taken into account when predicting the driving strategy, so that the driving score of each driving strategy better reflects actual traffic conditions.
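Steps S304 to S308 and the rightmost-lane rule can be sketched as follows. The `VehicleState` type, the road-aligned coordinate frame and the function names are assumptions made for illustration; the text only specifies that relative position and relative speed form the second driving information and that it is set to 0 for a lane that does not exist.

```python
from dataclasses import dataclass


@dataclass
class VehicleState:
    x: float  # longitudinal position in metres (assumed road-aligned frame)
    y: float  # lateral position in metres
    v: float  # speed in m/s


def second_driving_info(target, other):
    """Relative position and relative speed of `other` with respect to `target`."""
    return (other.x - target.x, other.y - target.y, other.v - target.v)


def right_lane_info(target, right_vehicle, in_rightmost_lane):
    """Return zeros when the target already occupies the rightmost lane."""
    if in_rightmost_lane:
        return (0.0, 0.0, 0.0)
    return second_driving_info(target, right_vehicle)


ego = VehicleState(x=100.0, y=0.0, v=25.0)
neighbour = VehicleState(x=130.0, y=3.5, v=22.0)
relative = second_driving_info(ego, neighbour)
```

Here `relative` holds a 30 m longitudinal offset, a 3.5 m lateral offset and a relative speed of -3 m/s.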

In some possible implementations, the relevant lanes include the lane where the target vehicle is located, the lane to its right and the lane to its left. On this basis, the lane information of the relevant lanes can be obtained as follows:

(1) Take the position of the vehicle at the current moment as the origin position;

(2) Calculate the lateral distance and the longitudinal distance between the origin position and each of a first preset number of consecutive positions in the relevant lane, where the distance between every two adjacent positions among the first preset number of consecutive positions is equal;

(3) Determine the set formed by the lateral distance and the longitudinal distance corresponding to each of the first preset number of consecutive positions as the lane information of the relevant lanes within the second specified range of the target vehicle.
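The three steps above can be sketched as below, assuming the lane centerline is already available as equally spaced points in a road-aligned frame (x longitudinal, y lateral). The function name, the 3.5 m lateral offset and the point layout are illustrative assumptions; the 80 points at 1 m spacing follow the practical example given later in the description.

```python
def lane_info(centerline_points, origin, count):
    """Lateral and longitudinal offsets of the first `count` points from `origin`.

    centerline_points: list of (x, y) tuples, assumed equally spaced.
    Returns a list of (lateral_distance, longitudinal_distance) tuples.
    """
    ox, oy = origin
    return [(py - oy, px - ox) for px, py in centerline_points[:count]]


# 80 points at 1 m spacing on a lane whose centerline is 3.5 m to the side.
points = [(float(i), 3.5) for i in range(1, 81)]
info = lane_info(points, origin=(0.0, 0.0), count=80)
```

Each relevant lane would contribute one such list to the model input.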

In some possible implementations, the plurality of driving strategies in the above embodiments includes at least any two of the following: accelerating straight ahead, going straight at the current speed, decelerating straight ahead, emergency braking in a straight line, changing to the left lane at constant speed, and changing to the right lane at constant speed.

The structure of the first neural network model and its training method are described below with reference to FIG. 4. As shown in FIG. 4, the input of the first neural network model consists of three parts: target vehicle features, related vehicle features and lane structure features. The target vehicle features and the related vehicle features are each passed through a multi-layer perceptron that outputs a feature vector of size 128. Each lane structure feature is passed through a series of one-dimensional convolutions that output a feature vector of size 1024. After the related vehicle feature vector and the lane feature vectors are combined, a fully connected neural network produces an output of size 6, corresponding to the driving score of each driving strategy. The driving score of a driving strategy indicates how good that strategy is. In practical applications, the driving strategy with the largest driving score should be selected as the optimal choice at the current moment, while the feasibility of that optimal strategy should also be checked with a traditional rule-based or finite-state decision scheme.

In the training process of the first neural network model, in order to simulate realistic highway traffic flow, the highway used for experiments and tests has four one-way lanes. Surrounding vehicles are randomly selected from 18 preset vehicle types (length: 2 to 18 m; width: 1.6 to 3 m), with at most 50 vehicles.

To ensure safe driving behavior, the surrounding vehicles are driven longitudinally by the Intelligent Driver Model (IDM) car-following model and laterally by the Minimizing Overall Braking Induced by Lane changes (MOBIL) controller.

The longitudinal IDM cruise controller is expressed as:

where v is the vehicle speed, a is the maximum desired acceleration, b is the desired deceleration, s₀ is the minimum gap between the two vehicles, s is the actual gap between the two vehicles, Δv is the speed difference between the two vehicles, v_set is the desired speed, and T_set is the desired time gap.
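The equation itself is not present in this text. For orientation, the form of the IDM commonly given in the traffic-modelling literature, written with the symbols defined above, is shown here. The exponent 4 is the conventional choice and is an assumption, as is taking Δv to be the ego speed minus the leading vehicle's speed; the original formulas may differ in detail.

```latex
\dot{v} = a\left[\,1 - \left(\frac{v}{v_{set}}\right)^{4}
          - \left(\frac{s^{*}(v,\Delta v)}{s}\right)^{2}\right],
\qquad
s^{*}(v,\Delta v) = s_{0} + v\,T_{set} + \frac{v\,\Delta v}{2\sqrt{a\,b}}
```

The first bracketed term drives the vehicle toward v_set on a free road, and the second brakes it as the actual gap s approaches the desired gap s*.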

The lateral MOBIL lane-change decision controller is expressed as:

where a_c, a_n and a_o denote, respectively, the acceleration of the ego vehicle, the acceleration of the following vehicle in the lane-change target lane, and the acceleration of the vehicle following the ego vehicle in its current lane. Their post-lane-change counterparts denote, respectively, the same three accelerations after the lane change task has been executed. b_safe is the maximum deceleration allowed for a vehicle, and Δa_th is the acceleration switching threshold. When conditions (3) and (4) are satisfied at the same time, the vehicle performs the lane change to the target lane.
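Conditions (3) and (4) are not present in this text. A sketch of the usual MOBIL safety and incentive criteria from the literature is given below using the quantities just defined. The `politeness` weight and the `_new` suffix for post-lane-change accelerations are assumptions introduced here, and the original conditions may be weighted differently.

```python
def mobil_should_change(a_c, a_n, a_o, a_c_new, a_n_new, a_o_new,
                        b_safe, delta_a_th, politeness=0.5):
    """Return True when both MOBIL-style criteria hold (assumed standard form)."""
    # Safety criterion: the new follower in the target lane must not be
    # forced to brake harder than b_safe.
    safe = a_n_new >= -b_safe
    # Incentive criterion: the ego vehicle's gain plus the weighted gain of
    # both followers must exceed the switching threshold.
    gain = (a_c_new - a_c) + politeness * ((a_n_new - a_n) + (a_o_new - a_o))
    return safe and gain > delta_a_th
```

For instance, `mobil_should_change(0.0, 0.5, 0.2, 1.0, 0.1, 0.4, b_safe=2.0, delta_a_th=0.2)` evaluates to `True`, because the ego vehicle gains 1.0 m/s² while the followers lose little.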

On the basis of the above model structure, the embodiments of the present application further provide a training method for the first neural network model. After the structure of the initial first neural network model has been built and the sample data has been acquired, the model can be trained according to the method shown in FIG. 5, which may specifically include the following steps:

S502: Acquire sample data, where the sample data includes the speed information and position information of a target sample vehicle, the speed information and position information of related sample vehicles associated with the target sample vehicle, and the lane information of the lanes corresponding to the target sample vehicle;

S504: Calculate, according to the sample data and a loss function, the loss value of the initial neural network model at each of a second preset number of consecutive moments, where the loss function includes one or more of a collision parameter, an energy consumption parameter, a lane change penalty parameter and a driving efficiency parameter;

S506: Adjust the parameters of the initial neural network model according to the loss value, and determine the initial neural network model at the time the training stop condition is satisfied as the first neural network model.

So that the ego vehicle drives efficiently and safely on the highway while also taking fuel consumption into account, the reward used for the first neural network model covers the following aspects:

①. Collision. If the ego vehicle collides with another vehicle or leaves the drivable lanes of the highway, a penalty of r_safe = -100 is applied and the task ends in failure. If no collision occurs, r_safe = 0.

②. Driving efficiency. For each executed action, the ego vehicle's speed directly determines how fast it travels. The reward is therefore chosen to be proportional to the current vehicle speed:

where v is the speed of the ego vehicle and v_des = 30 m/s is the desired speed.

③. Lane change penalty. Although changing lanes is a prerequisite for overtaking, changing lanes too frequently is not good driving behavior. To limit the lane change frequency, each lane change decision receives a penalty of r_change = -1.

④. Fuel consumption. After each macro action is executed, the fuel consumption p (L/km) during that action is obtained from the recorded speed curve combined with the engine fuel consumption model of the company's trucks. To reduce fuel consumption, the corresponding reward value is r_oil = -p/100.

Combining all of the above reward definitions, the total reward is

$$R = r_{safe} + r_{v} + r_{change} + r_{oil} \tag{6}$$
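The total reward of formula (6) can be sketched as follows. The efficiency term is an assumption: its formula is not present in this text, which only states that the term is proportional to speed with v_des = 30 m/s, so `speed / V_DES` is used as one plausible choice. The function and argument names are likewise illustrative.

```python
V_DES = 30.0  # desired speed in m/s, as stated in the text


def total_reward(collided, speed, changed_lane, fuel_l_per_km):
    """Sum of the four reward terms of formula (6)."""
    r_safe = -100.0 if collided else 0.0
    r_v = speed / V_DES  # ASSUMPTION: exact efficiency formula is not given
    r_change = -1.0 if changed_lane else 0.0
    r_oil = -fuel_l_per_km / 100.0
    return r_safe + r_v + r_change + r_oil
```

Under this assumption, a collision-free lane change at 30 m/s with a fuel consumption of 25 L/km yields 0 + 1 - 1 - 0.25 = -0.25.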

The first neural network model of the embodiments of the present application uses the DQN algorithm, which updates the action value at the current moment with the maximum action value in the state at the next moment. The loss function is defined as:

$$L = \left(R + \gamma \max_{a'} Q(s', a') - Q(s, a)\right)^{2} \tag{7}$$

where R is the reward, γ is the discount factor, and Q(s, a) is the value of selecting action a in state s. The loss is optimized with a gradient descent algorithm, and the update of the training parameters w satisfies

where α is the learning rate, s is a state variable, namely the observed state corresponding to Table 1, and a is an action, namely one of the six discrete macro decision behaviors.
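The loss for a single transition can be sketched as below, using the maximum next-state action value as the text describes. The terminal-state handling is an assumption motivated by the statement that a collision ends the task, and the discount factor in the example is a placeholder because no value is given.

```python
def td_target(reward, next_q_values, gamma, terminal=False):
    """Bootstrapped target: R + gamma * max over a' of Q(s', a')."""
    if terminal:  # ASSUMPTION: no bootstrapping once the task has ended
        return reward
    return reward + gamma * max(next_q_values)


def dqn_loss(q_sa, reward, next_q_values, gamma, terminal=False):
    """Squared temporal-difference error for one transition."""
    return (td_target(reward, next_q_values, gamma, terminal) - q_sa) ** 2


# Q(s, a) = 1.0, R = 0.5, three next-state action values, gamma = 0.5.
example_loss = dqn_loss(1.0, 0.5, [0.0, 2.0, 1.0], gamma=0.5)
```

In the example the target is 0.5 + 0.5 × 2.0 = 1.5, so the loss is (1.5 - 1.0)² = 0.25. In a gradient-based implementation this error would be backpropagated through the network parameters w with learning rate α.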

Further, in some examples, after the prediction scores for the next moment have been determined, the driving mode of the sample vehicle at the next moment can be determined as follows:

(1) Determine, through the initial neural network model, the prediction scores of the plurality of driving strategies at the moment following the current moment within the second preset number of consecutive moments;

(2) Take the driving strategy with the largest prediction score as the next-moment driving strategy, and acquire the next-moment sample data generated when the target sample vehicle drives under that strategy;

(3) Continue to train the initial neural network model based on the next-moment sample data.

At each moment in the DQN algorithm, each driving strategy is equivalent to a state, denoted s. The action values Q(s, a) in each state s are approximated by the first neural network model. For a given "next moment" in a training sample, the Q values of the six actions can therefore be computed, and the largest one is taken as the iteration target for the update at the "current moment". Although these values are assigned randomly at the start, they eventually converge through repeated iteration.

Each iteration updates only one particular action value, depending on the information recorded in the training sample. For example, a sample recorded during training may show that action a1 was selected in state s, and the action value of a1 is not necessarily the largest. The values of the different actions gradually converge through iteration. Selecting a different action with a certain probability during training is called exploration: the true value of each action is unknown at the beginning and can only be estimated after some trial.
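The exploration described above is often implemented as epsilon-greedy action selection. The text does not name a specific scheme, so the sketch below is an assumption, and the Q values are made up for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the largest Q value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


rng = random.Random(0)  # seeded so the example is reproducible
q_values = [0.1, 0.9, 0.3, 0.2, 0.0, 0.4]  # six illustrative action values
greedy_action = epsilon_greedy(q_values, 0.0, rng)    # never explores
explored_action = epsilon_greedy(q_values, 1.0, rng)  # always explores
```

With `epsilon = 0.0` the action with the largest value (index 1) is always chosen; with `epsilon = 1.0` any of the six actions may be returned.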

After training is completed, the driving strategy at run time is obtained by computing the six action values at the current moment and selecting the action with the largest value.

When the trained deep reinforcement learning agent is used, the corresponding feature information is first extracted from the driving characteristics of the related vehicles obtained by the perception module and from the road structure information given by the map positioning module. The neural network then produces each action value for the state at the current moment. The optimal action is selected according to the following rules: the actions are first sorted by action value, and the action with the largest value is preferred. A state machine then checks whether that action can be executed successfully. If it can, the action is executed; if the transition condition of the state machine is not satisfied, the action with the largest value among the remaining actions is selected. The state machine check is repeated until the condition is satisfied. If no action satisfies the condition, a return instruction is triggered.

In some possible implementations, after the driving scores of the plurality of different driving strategies have been determined, the target driving strategy can be selected as follows: the driving strategies whose driving scores are higher than a preset score threshold are determined as a set of candidate driving strategies, and the target driving strategy is determined from that set.

Specifically, it is first determined whether the driving strategy with the highest driving score in the candidate set satisfies a preset safe driving condition; if so, that driving strategy is determined as the target driving strategy; if not, it is deleted from the candidate set.
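The threshold filter and the repeated safety check can be sketched as a loop. The `is_safe` callable stands in for the state machine check, and the strategy names, scores and threshold are hypothetical; returning `None` represents the case in which no candidate passes.

```python
def choose_target_strategy(scores, score_threshold, is_safe):
    """Pick the highest-scoring candidate that passes the safety check.

    scores: dict mapping strategy name to driving score.
    is_safe: callable(name) -> bool, a stand-in for the state machine.
    """
    candidates = {name: s for name, s in scores.items() if s > score_threshold}
    while candidates:
        best = max(candidates, key=candidates.get)
        if is_safe(best):
            return best
        del candidates[best]  # drop the unsafe strategy and try the next one
    return None  # no candidate satisfied the safe driving condition


scores = {"accelerate": 10, "keep_speed": 25, "decelerate": 80,
          "emergency_brake": 45, "change_left": 96, "change_right": 5}
chosen = choose_target_strategy(scores, 40, lambda name: name != "change_left")
```

Here `change_left` has the highest score but fails the check, so `decelerate` is chosen as the next best candidate above the threshold.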

It should be noted that the safe driving condition in the embodiments of the present application may be evaluated by a state machine, which may be implemented with an existing decision-making algorithm; the embodiments of the present application do not specifically limit the form or implementation of the state machine. For example, when the decision model issues a lane change instruction, a state machine check is triggered. The state machine evaluates a series of conditions, such as whether the current distance to the rear vehicle meets the safe distance and whether the relative speed with respect to the front vehicle would lead to a collision. When all conditions are satisfied, the decision instruction is considered executable. The state machine currently used in the present invention is relatively simple; as a preliminary test, it is mainly used to verify the overall safety of the decision.

In some possible implementations, if the target driving strategy is determined to be a constant-speed lane change to the left lane or to the right lane, the vehicle is controlled to change lanes by the following formula:

The formula is expressed in terms of the steering wheel angle; θ_n and θ_f, which denote the angles formed with the ego vehicle position by a first position ahead and a second position ahead on the target lane, respectively; the time step Δt; and the driving-behavior constants k_f, k_n and k_I. The target lane is the lane to which the target vehicle is preparing to switch from its current lane.

For ease of understanding, the embodiments of the present application further provide a vehicle automatic driving method for a practical application scenario, which specifically includes the following steps:

Step 1: Acquire the first speed v1 and the first position d1 of the target vehicle at the current moment;

Step 2: Determine the related vehicles;

The related vehicles are, within the target vehicle's 100 m perception range: the two nearest vehicles ahead in the current lane of the target vehicle, the two nearest vehicles ahead and the two nearest behind in the left lane, and the two nearest vehicles ahead and the two nearest behind in the right lane, 10 vehicles in total.

Step 3: Determine the relevant lanes;

The relevant lanes are the lane where the target vehicle is located, the lane to its right and the lane to its left;

Step 4: Determine feature 1 of the target vehicle and feature 2 of the related vehicles;

Feature 1 of the target vehicle is its speed at the current moment and its distance from the lane centerline.

Feature 2 of each related vehicle is obtained by acquiring the vehicle's position and speed at the current moment from radar and positioning information and converting them into a position and a speed relative to the ego vehicle.

Step 5: Determine feature 3 corresponding to the lanes;

Feature 3: for each lane, 80 centerline trajectory points are taken at 1 m intervals over the 80 m ahead, and their lateral and longitudinal positions relative to the target vehicle are recorded. The sizes of the above features are shown in Table 1.

Table 1

It should be noted that if there is no vehicle within 100 m at a given position in a given direction, the corresponding relative position and speed entries are filled with the constant 0. A lane beyond the boundary is treated as full of vehicles and not enterable; accordingly, for vehicles in that lane the lateral relative position equals the lane width, the longitudinal relative position is 0, and the relative speed is 0.
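The padding rules can be sketched for one group of slots, for example the two nearest vehicles ahead in the left lane. The lane width value, the `lateral_sign` argument and the `(dx, dy, dv)` tuple layout are assumptions for illustration; the 100 m range and the two slots per direction follow the text.

```python
LANE_WIDTH = 3.75         # metres; hypothetical, the text gives no lane width
PERCEPTION_RANGE = 100.0  # metres, as stated in the text


def slot_features(relative_states, lane_exists, lateral_sign, slots=2):
    """Fixed-size features for one lane and direction.

    relative_states: list of (dx, dy, dv) tuples relative to the ego vehicle.
    lateral_sign: +1 or -1, the side of the lane (assumed convention).
    """
    if not lane_exists:
        # Lane beyond the boundary: treated as occupied right beside the ego
        # vehicle, with lateral offset equal to the lane width.
        return [(0.0, lateral_sign * LANE_WIDTH, 0.0)] * slots
    in_range = sorted(
        (s for s in relative_states if abs(s[0]) <= PERCEPTION_RANGE),
        key=lambda s: abs(s[0]),
    )[:slots]
    # Slots without a vehicle are filled with the constant 0.
    return in_range + [(0.0, 0.0, 0.0)] * (slots - len(in_range))


left_ahead = slot_features([(120.0, 3.75, -2.0), (40.0, 3.75, 1.0)], True, 1)
virtual_right = slot_features([], False, -1)
```

In the first call the vehicle 120 m away is outside the perception range, so one real entry and one zero entry are returned; in the second call the missing lane yields two virtual vehicles.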

Step 6: Determine the set of driving strategies

The driving strategies in this embodiment of the present application are: accelerating straight ahead, going straight at the current speed, decelerating straight ahead, emergency braking in a straight line, changing to the left lane at constant speed, and changing to the right lane at constant speed. The low-level control of straight-line acceleration and deceleration is implemented by changing the control parameters of the IDM adaptive cruise algorithm in formulas (1) and (2), as listed in Table 2.

Table 2

| Parameter | Longitudinal acceleration | Longitudinal deceleration |
| --- | --- | --- |
| a | 2.0 m/s² | 0.6 m/s² |
| b | 3.0 m/s² | 1.0 m/s² |
| s₀ | 0 m | 5 m |
| v_set | 30 m/s | 19 m/s |
| T_set | 1 s | 2 s |
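The two parameter sets of Table 2 can be plugged into an IDM acceleration function as sketched below. The function uses the IDM form common in the literature; the exponent `delta = 4`, the sign convention `dv = v - v_lead`, and reading s₀ as a distance in metres are assumptions, since formulas (1) and (2) are not present in this text.

```python
import math

# Parameter sets from Table 2.
ACCELERATE = dict(a=2.0, b=3.0, s0=0.0, v_set=30.0, T_set=1.0)
DECELERATE = dict(a=0.6, b=1.0, s0=5.0, v_set=19.0, T_set=2.0)


def idm_acceleration(v, s, dv, a, b, s0, v_set, T_set, delta=4):
    """IDM acceleration for speed v, gap s and closing speed dv."""
    # Desired gap: standstill gap + time-gap term + braking term.
    s_star = s0 + v * T_set + v * dv / (2.0 * math.sqrt(a * b))
    return a * (1.0 - (v / v_set) ** delta - (s_star / s) ** 2)


from_standstill = idm_acceleration(0.0, 50.0, 0.0, **ACCELERATE)
at_cruise = idm_acceleration(30.0, 1e9, 0.0, **ACCELERATE)
```

From standstill with a free road the vehicle accelerates at the full a = 2.0 m/s², and at the desired speed with an effectively infinite gap the acceleration is approximately zero. Switching to `DECELERATE` lowers the desired speed to 19 m/s, so a vehicle travelling faster than that brakes.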

The embodiments of the present application use a simplified two-point control model to control the lane change, as given by the following formula:

The lane change is performed by changing the steering wheel angle while maintaining a constant speed. The formula is expressed in terms of the steering wheel angle together with θ_n and θ_f, which denote the angles formed with the ego vehicle position by reference points 50 m and 100 m ahead on the target lane, respectively; Δt = 0.05 is the time step; and k_f = 20, k_n = 10 and k_I = 6 are constants characterizing driving behavior.
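The control formula is not present in this text. A common discrete form of two-point steering control, using the constants quoted above, is sketched below. The update rule itself, the symbol `phi` for the steering wheel angle and the use of radians are assumptions, so the original formula may differ.

```python
K_F, K_N, K_I = 20.0, 10.0, 6.0  # driving-behavior constants from the text
DT = 0.05                        # time step from the text


def steering_step(phi, theta_n, theta_f, prev_theta_n, prev_theta_f):
    """One update of the steering wheel angle phi (assumed two-point form).

    theta_n, theta_f: current angles to the near and far reference points.
    prev_theta_n, prev_theta_f: the same angles one time step earlier.
    """
    d_theta_n = theta_n - prev_theta_n
    d_theta_f = theta_f - prev_theta_f
    # The far point stabilises heading; the near point corrects lane position.
    return phi + K_F * d_theta_f + K_N * d_theta_n + K_I * theta_n * DT


steady = steering_step(0.0, 0.0, 0.0, 0.0, 0.0)
offset = steering_step(0.0, 0.02, 0.01, 0.02, 0.01)
```

When both angles are zero the steering angle is unchanged; a constant near-point angle of 0.02 adds k_I × θ_n × Δt = 0.006 per step, which steers the vehicle toward the target lane.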

Step 7: Determine the driving strategies whose driving scores are higher than the preset score threshold as the set of candidate driving strategies;

Step 8: Determine whether the driving strategy with the highest driving score in the candidate set satisfies the preset safe driving condition;

Step 9: If so, determine that driving strategy as the target driving strategy;

Step 10: When the target driving strategy is a constant-speed lane change to the left lane or to the right lane, the target vehicle drives according to the target driving strategy at the next moment.

Specifically, the vehicle is controlled to change lanes by the two-point control formula described in Step 6.

Based on the foregoing method embodiments, the embodiments of the present application further provide a vehicle automatic driving device. As shown in FIG. 6, the device includes:

an information acquisition module 602, configured to acquire the first driving information of the target vehicle at the current moment, the second driving information corresponding to the related vehicles within a first specified range of the target vehicle, and the lane information of the relevant lanes within a second specified range of the target vehicle;

a prediction module 604, configured to predict, using the first neural network model, the first driving information, the second driving information and the lane information, the driving scores of the target vehicle driving under a plurality of different predetermined driving strategies, where a driving score characterizes the driving condition and vehicle performance of the target vehicle when driving under a predetermined driving strategy, and the first neural network model is trained on sample data corresponding to different traffic conditions;

a target driving strategy determination module 606, configured to determine the target driving strategy from the plurality of different driving strategies according to their respective driving scores, so that the target vehicle drives according to the target driving strategy at the moment following the current moment.

The vehicle automatic driving device provided by the embodiments of the present application first acquires the driving information of the target vehicle, the driving information of the related vehicles and the lane information, then predicts driving scores with the first neural network model to obtain the driving scores of a plurality of different driving strategies, and finally determines the target driving strategy according to the driving scores. In this technique, a plurality of driving strategies is first determined through driving decision-making, and a neural network model then predicts a score for each driving strategy, so that the finally determined target driving strategy combines a traditional decision algorithm with the neural network's prediction of the target vehicle's driving performance. Because the first neural network model is trained on sample data from a variety of traffic conditions, the target driving strategy determined from its predicted scores is closer to actual complex traffic conditions, which effectively improves the safety of automatic driving.

The related vehicles within the specified range of the target vehicle include at least one of the following: a first related vehicle in the lane where the target vehicle is located, whose distance from the target vehicle is less than a first distance threshold and which is ahead of the target vehicle in its driving direction; a second related vehicle in the lane to the right of the target vehicle's lane, whose distance from the target vehicle is less than a second distance threshold and which is ahead of the target vehicle; a third related vehicle in the lane to the right of the target vehicle's lane, whose distance from the target vehicle is less than a third distance threshold and which is behind the target vehicle; a fourth related vehicle in the lane to the left of the target vehicle's lane, whose distance from the target vehicle is less than a fourth distance threshold and which is ahead of the target vehicle; and a fifth related vehicle in the lane to the left of the target vehicle's lane, whose distance from the target vehicle is less than a fifth distance threshold and which is behind the target vehicle.

The first driving information of the target vehicle at the current moment includes a first position and a first speed of the target vehicle. The information acquisition module 602 is further configured to: obtain a second position and a second speed of the related vehicle at the current moment; calculate the relative position of the second position with respect to the first position, and the relative speed of the second speed with respect to the first speed; and determine the relative position and the relative speed as the second driving information corresponding to the related vehicle.
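The relative-state computation described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `State` type, the choice of a two-dimensional position and velocity, and the axis convention (x longitudinal, y lateral) are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Hypothetical vehicle state; field names and units are assumptions."""
    x: float   # longitudinal position, m
    y: float   # lateral position, m
    vx: float  # longitudinal speed, m/s
    vy: float  # lateral speed, m/s


def relative_state(target: State, other: State) -> tuple:
    """Second driving information: the related vehicle's position and speed
    expressed relative to the target vehicle."""
    return (
        other.x - target.x,
        other.y - target.y,
        other.vx - target.vx,
        other.vy - target.vy,
    )


ego = State(x=100.0, y=0.0, vx=25.0, vy=0.0)
lead = State(x=130.0, y=0.0, vx=22.0, vy=0.0)
rel = relative_state(ego, lead)  # lead is 30 m ahead and 3 m/s slower
```

Expressing neighbors relative to the target vehicle keeps the model input independent of where on the road the vehicle happens to be.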

The above device is further configured to: if the lane where the target vehicle is located is the rightmost lane of the road, set the second driving information of the related vehicles in the lane to the right of the lane where the target vehicle is located to 0; and if the lane where the target vehicle is located is the leftmost lane of the road, set the second driving information of the related vehicles in the lane to the left of the lane where the target vehicle is located to 0.
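One way to read the zero-filling rule above is as a masking step applied to a fixed set of neighbor slots. The sketch below assumes a hypothetical layout in which each slot is keyed by `(side, where)` and lanes are indexed from 0 at the leftmost lane; neither convention comes from the application.

```python
ZERO = (0.0, 0.0, 0.0, 0.0)  # placeholder second driving information


def mask_missing_lanes(features, lane_index, lane_count):
    """Zero the slots that belong to a lane that does not exist.

    features: dict mapping (side, where) -> 4-tuple of relative state,
              with side in {"same", "left", "right"} (hypothetical layout).
    lane_index: 0 for the leftmost lane, lane_count - 1 for the rightmost.
    """
    no_left = lane_index == 0
    no_right = lane_index == lane_count - 1
    masked = {}
    for (side, where), value in features.items():
        missing = (side == "left" and no_left) or (side == "right" and no_right)
        masked[(side, where)] = ZERO if missing else value
    return masked


features = {
    ("same", "front"): (30.0, 0.0, -3.0, 0.0),
    ("right", "front"): (12.0, -3.5, 1.0, 0.0),
    ("left", "front"): (20.0, 3.5, 2.0, 0.0),
}
# Target vehicle in the rightmost of three lanes: right-lane slots become 0.
rightmost = mask_missing_lanes(features, lane_index=2, lane_count=3)
```

Keeping the slots and filling them with zeros, rather than dropping them, leaves the input vector at a fixed length regardless of which lane the target vehicle occupies.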

The relevant lanes include the lane where the target vehicle is located, the lane to the right of that lane, and the lane to the left of that lane. The information acquisition module 602 is further configured to: take the position of the target vehicle at the current moment as the origin position; calculate the lateral distance and the longitudinal distance between the origin position and each of a first preset number of consecutive positions in the relevant lane, wherein the distance between every two adjacent positions among the first preset number of consecutive positions is equal; and determine the set formed by the lateral distance and the longitudinal distance corresponding to each of the first preset number of consecutive positions as the lane information of the relevant lanes within the second specified range from the target vehicle.
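The lane-information step above can be sketched as equal-spacing sampling along a lane centerline. The polyline representation of the lane, the function name `sample_lane`, and the axis convention (x longitudinal, y lateral, with the target vehicle heading along +x) are assumptions made for this illustration only.

```python
import math


def sample_lane(centerline, origin, count, spacing):
    """Return up to `count` (lateral, longitudinal) offsets from `origin`.

    centerline: list of (x, y) vertices describing the lane as a polyline.
    Samples are taken every `spacing` metres of arc length, starting at the
    first vertex; fewer than `count` are returned if the polyline is too short.
    """
    offsets = []
    next_sample = 0.0  # arc length at which the next sample is taken
    travelled = 0.0    # arc length at the start of the current segment
    for (x0, y0), (x1, y1) in zip(centerline, centerline[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        while seg > 0 and next_sample <= travelled + seg and len(offsets) < count:
            t = (next_sample - travelled) / seg
            px = x0 + t * (x1 - x0)
            py = y0 + t * (y1 - y0)
            offsets.append((py - origin[1], px - origin[0]))
            next_sample += spacing
        travelled += seg
    return offsets


# A straight lane whose centerline lies 3.5 m to one side of the target vehicle.
offsets = sample_lane([(0.0, 3.5), (100.0, 3.5)], origin=(0.0, 0.0),
                      count=5, spacing=10.0)
```

Repeating the call for the current, left, and right lanes would give the three sets of offsets that the paragraph describes.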

The plurality of driving strategies includes at least any two of the following: accelerating straight ahead, maintaining speed straight ahead, decelerating straight ahead, emergency braking in a straight line, changing to the left lane at a constant speed, and changing to the right lane at a constant speed.

The first neural network model is obtained by training through the following steps: obtaining sample data, wherein the sample data includes speed information and position information of a target sample vehicle, speed information and position information of related sample vehicles associated with the target sample vehicle, and lane information of the lanes corresponding to the target sample vehicle; calculating, according to the sample data and a loss function, the loss value of an initial neural network model at each of a second preset number of consecutive moments, wherein the loss function includes one or more of a collision parameter, an energy consumption parameter, a lane change penalty parameter, and a driving efficiency parameter; and adjusting the parameters of the initial neural network model according to the loss value, and determining the initial neural network model obtained when a training stop condition is satisfied as the first neural network model.
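The application names the four terms of the loss function but does not give their form or weighting. The sketch below shows one plausible way such terms could be combined per moment and summed over consecutive moments; the `StepOutcome` fields, the weighted-sum form, and every weight value are assumptions, not values from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepOutcome:
    """Hypothetical summary of one simulated moment."""
    collided: bool
    energy_used: float  # e.g. litres of fuel consumed in this step
    changed_lane: bool
    speed: float        # m/s
    max_speed: float    # m/s


def step_loss(o, w_collision=100.0, w_energy=1.0, w_lane=0.1, w_eff=1.0):
    """Weighted sum of collision, energy, lane-change and efficiency terms.
    All weights are illustrative assumptions."""
    efficiency_shortfall = 1.0 - min(o.speed / o.max_speed, 1.0)
    return (
        w_collision * float(o.collided)
        + w_energy * o.energy_used
        + w_lane * float(o.changed_lane)
        + w_eff * efficiency_shortfall
    )


def episode_loss(outcomes):
    """Sum the per-moment loss over a run of consecutive moments."""
    return sum(step_loss(o) for o in outcomes)


safe = StepOutcome(False, 0.0, False, 30.0, 30.0)
crash = StepOutcome(True, 0.0, False, 30.0, 30.0)
```

A large collision weight relative to the other terms reflects the usual design intent that safety dominates efficiency and comfort.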

The above device is further configured to: determine, through the initial neural network model, the predicted scores respectively corresponding to the plurality of driving strategies at the moment following the current moment among the second preset number of consecutive moments; take the driving strategy with the largest predicted score as the next-moment driving strategy, and obtain the next-moment sample data generated when the target sample vehicle drives with the next-moment driving strategy; and continue to train the initial neural network model based on the next-moment sample data.
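The data-collection loop described above (score every strategy, drive with the best one, record the resulting sample) can be sketched as a greedy rollout. The strategy identifiers and the `score_fn` and `step_fn` callables are hypothetical stand-ins for the initial neural network model and the simulator; the toy versions at the bottom exist only to make the sketch runnable.

```python
STRATEGIES = (
    "accelerate", "keep_speed", "decelerate",
    "emergency_brake", "change_left", "change_right",
)


def greedy_rollout(state, score_fn, step_fn, steps):
    """Collect (state, strategy, next_state) samples by always driving with
    the strategy that has the largest predicted score."""
    transitions = []
    for _ in range(steps):
        scores = score_fn(state)  # one predicted score per strategy
        best = max(range(len(STRATEGIES)), key=lambda i: scores[i])
        next_state = step_fn(state, STRATEGIES[best])
        transitions.append((state, STRATEGIES[best], next_state))
        state = next_state
    return transitions


# Toy stand-ins: the state is just a speed in m/s.
def toy_scores(speed):
    return [1.0 if speed < 30.0 else 0.0, 0.5, 0.0, 0.0, 0.0, 0.0]


def toy_step(speed, strategy):
    return speed + 5.0 if strategy == "accelerate" else speed


samples = greedy_rollout(20.0, toy_scores, toy_step, steps=3)
```

In a real training setup the collected samples would be fed back into the loss computation so that the model keeps being updated on the data its own choices generate.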

The target driving strategy determination module 606 is further configured to: determine the driving strategies whose driving scores are higher than a preset score threshold as a candidate driving strategy set; and determine the target driving strategy from the candidate driving strategy set.

The process of determining the target driving strategy from the candidate driving strategy set includes: judging whether the driving strategy with the highest driving score in the candidate driving strategy set satisfies a preset safe driving condition; if so, determining that driving strategy as the target driving strategy; and if not, deleting that driving strategy from the candidate driving strategy set.
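The two-stage selection above (score threshold, then safety check on the best remaining candidate) can be sketched as follows. The text states only that an unsafe top candidate is deleted; repeating the check on the next-best candidate, and returning `None` when no candidate remains, are assumptions of this sketch, as is the `is_safe` callback.

```python
def choose_strategy(scores, threshold, is_safe):
    """Pick the highest-scoring strategy above `threshold` that passes
    the safety check.

    scores: dict mapping strategy name -> predicted driving score.
    is_safe: callable(name) -> bool, a stand-in for the preset safe
             driving condition.
    """
    candidates = {name: s for name, s in scores.items() if s > threshold}
    while candidates:
        best = max(candidates, key=candidates.get)
        if is_safe(best):
            return best
        del candidates[best]  # unsafe: remove and try the next best
    return None  # assumption: caller falls back to some default behavior


scores = {
    "accelerate": 0.9,
    "change_left": 0.8,
    "keep_speed": 0.7,
    "emergency_brake": 0.1,
}
target = choose_strategy(scores, threshold=0.5,
                         is_safe=lambda name: name != "accelerate")
```

Here the rule-based safety check acts as a veto on the learned scores, which is how the traditional decision logic and the neural network prediction are combined.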

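The above device is further configured to: if the target driving strategy is a constant-speed lane change to the left lane or a constant-speed lane change to the right lane, control the vehicle to change lanes by the following formula:

[formula not reproduced in the source text]

where [symbol not reproduced] denotes the steering wheel angle; θ_n and θ_f denote the angles that the first forward position and the second forward position on the target lane, respectively, form with the ego vehicle position; Δt is the time step; k_f, k_n and k_I are constants characterizing driving behavior; and the target lane is the lane to which the target vehicle is preparing to switch from its current lane.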
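The lane-change formula itself does not survive in this text. The symbols that are defined (a steering wheel angle, a near-point angle θ_n, a far-point angle θ_f, a time step Δt, and constants k_f, k_n, k_I) match the well-known two-point steering model of Salvucci and Gray, whose discrete form is Δφ = k_f·Δθ_f + k_n·Δθ_n + k_I·θ_n·Δt. The sketch below implements that published model purely as an illustration; that the application uses this exact form is an assumption, and the gain values are hypothetical.

```python
def steering_increment(theta_n, theta_f, prev_theta_n, prev_theta_f,
                       dt, k_f, k_n, k_i):
    """Change in steering angle over one time step under the two-point model.

    theta_n, theta_f: current near-point and far-point angles (rad).
    prev_theta_n, prev_theta_f: the same angles one time step earlier.
    dt: time step (s). k_f, k_n, k_i: driver-behavior constants.
    """
    d_theta_f = theta_f - prev_theta_f
    d_theta_n = theta_n - prev_theta_n
    return k_f * d_theta_f + k_n * d_theta_n + k_i * theta_n * dt


# Hypothetical gains; both angles are steady, so only the integral-like
# near-point term contributes.
delta = steering_increment(theta_n=0.1, theta_f=0.05,
                           prev_theta_n=0.1, prev_theta_f=0.05,
                           dt=0.1, k_f=20.0, k_n=9.0, k_i=10.0)
```

In this model the far-point term stabilizes heading toward the target lane, while the near-point terms pull the vehicle onto the lane center.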

The implementation principle and technical effects of the vehicle automatic driving device provided by the embodiments of the present application are the same as those of the foregoing method embodiments. For brevity, where the device embodiments do not mention a detail, reference may be made to the corresponding content in the foregoing embodiments of the vehicle automatic driving method.

To verify the feasibility of the first neural network model provided by the embodiments of the present application, the embodiments use the CARLA automatic driving simulator to build a highway driving environment. The initial speeds of the surrounding vehicles are distributed between 60 km/h and 120 km/h, the initial speed of the ego vehicle is 60 km/h, and its maximum speed is 110 km/h. The purpose is to test whether the model of the present invention can, while remaining safe, travel faster on the highway and at the same time reduce fuel consumption.

Results from 1000 sets of test road scenarios show that, with a traditional rule-based car-following model, the average time per kilometer is 44.76 s and the fuel consumption is 0.211 liters. With the vehicle automatic driving method provided by the embodiments of the present application, the ego vehicle reasonably chooses when to change lanes and overtake while driving, no collision occurs, the average time per kilometer is 35.24 s, and the fuel consumption is 0.205 liters. The average time is reduced by 21.27% and the fuel consumption is reduced by 2.3%.

An embodiment of the present application further provides an electronic device. FIG. 7 is a schematic structural diagram of the electronic device, which includes a processor 1501 and a memory 1502. The memory 1502 stores computer-executable instructions that can be executed by the processor 1501, and the processor 1501 executes the computer-executable instructions to implement the above vehicle automatic driving method.

In the embodiment shown in FIG. 7, the electronic device further includes a bus 1503 and a communication interface 1504, wherein the processor 1501, the communication interface 1504 and the memory 1502 are connected through the bus 1503.

The memory 1502 may include a high-speed random access memory (RAM), and may also include a non-volatile memory, such as at least one disk memory. The communication connection between this system's network element and at least one other network element is implemented through at least one communication interface 1504 (which may be wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, or the like. The bus 1503 may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 1503 may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one bidirectional arrow is shown in FIG. 7, but this does not mean that there is only one bus or one type of bus.

The processor 1501 may be an integrated circuit chip with signal processing capability. In implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 1501 or by instructions in the form of software. The processor 1501 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or it may be any conventional processor. The steps of the method disclosed in the embodiments of the present application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory; the processor 1501 reads the information in the memory and, in combination with its hardware, completes the steps of the vehicle automatic driving method of the foregoing embodiments.

An embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions. When invoked and executed by a processor, the computer-executable instructions cause the processor to implement the above vehicle automatic driving method. For the specific implementation, refer to the foregoing method embodiments; details are not repeated here.

The computer program product of the vehicle automatic driving method, device and electronic device provided by the embodiments of the present application includes a computer-readable storage medium storing program code. The instructions included in the program code can be used to execute the method described in the foregoing method embodiments. For the specific implementation, refer to the method embodiments; details are not repeated here.

Unless specifically stated otherwise, the relative arrangement of the components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present application.

If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a processor-executable, non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

In the description of the present application, it should be noted that orientations or positional relationships indicated by terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner" and "outer" are based on the orientations or positional relationships shown in the accompanying drawings. They are used only to facilitate and simplify the description of the present application, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore cannot be understood as limiting the present application. In addition, the terms "first", "second" and "third" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance.

Finally, it should be noted that the embodiments described above are only specific implementations of the present application, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with the technical field may, within the technical scope disclosed in the present application, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent replacements of some of the technical features. Such modifications, changes or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A vehicle automatic driving method, characterized in that the method comprises:

obtaining first driving information of a target vehicle at the current moment, second driving information corresponding to relevant vehicles within a first specified range from the target vehicle, and lane information of relevant lanes within a second specified range from the target vehicle;

predicting, by using a first neural network model, the first driving information, the second driving information and the lane information, the driving scores of the target vehicle when it travels with each of a plurality of different predetermined driving strategies; wherein the driving score is used to characterize the driving conditions and vehicle performance of the target vehicle traveling with the predetermined driving strategy, and the first neural network model is obtained by training on sample data corresponding to different traffic conditions;

A target driving strategy is determined from the plurality of different driving strategies according to the driving scores corresponding to the plurality of different driving strategies, so that the target vehicle drives according to the target driving strategy at the next moment of the current moment.

2. The method according to claim 1, wherein the relevant vehicles within a specified range from the target vehicle include at least one of the following:

a first related vehicle in the lane where the target vehicle is located and whose distance from the target vehicle is less than a first distance threshold and is located in front of the driving direction of the target vehicle;

a second related vehicle in the right lane of the lane where the target vehicle is located, the distance from the target vehicle being less than a second distance threshold and ahead of the direction of travel of the target vehicle;

A third related vehicle whose distance from the target vehicle is less than a third distance threshold and located behind the target vehicle in the direction of travel in the right lane of the lane where the target vehicle is located;

a fourth related vehicle whose distance from the target vehicle is less than a fourth distance threshold in the left lane of the lane where the target vehicle is located and is located in front of the direction of travel of the target vehicle;

A fifth related vehicle whose distance from the target vehicle is less than a fifth distance threshold and located behind the target vehicle in the direction of travel in the right lane of the lane where the target vehicle is located.

3. The method according to claim 1, wherein the first driving information of the target vehicle at the current moment comprises a first position and a first speed of the target vehicle;

The step of acquiring the second driving information corresponding to the relevant vehicle within a specified range of the target vehicle includes:

obtaining the second position and second speed of the relevant vehicle at the current moment;

calculating the relative position of the second position with respect to the first position, and the relative speed of the second speed with respect to the first speed;

The relative position and the relative speed are determined as second travel information corresponding to the relevant vehicle.

4. The method according to claim 3, wherein the method further comprises:

If the lane where the target vehicle is located is the rightmost lane in the road, set the second driving information of the relevant vehicle in the lane on the right side of the lane where the target vehicle is located to 0;

If the lane where the target vehicle is located is the leftmost lane in the road, the second travel information of the relevant vehicle located in the left lane of the lane where the target vehicle is located is set to 0.

5. The method according to claim 1, wherein the relevant lanes comprise a lane where the target vehicle is located, a right lane of the lane where the target vehicle is located, and a left lane of the lane where the target vehicle is located;

The step of acquiring lane information of a relevant lane within a second specified range from the target vehicle includes:

Taking the position of the vehicle at the current moment as the origin position;

calculating the lateral distance and the longitudinal distance between the origin position and each of a first preset number of consecutive positions in the relevant lane; wherein the distance between every two adjacent positions among the first preset number of consecutive positions is equal;

The set formed by the lateral distance and the longitudinal distance corresponding to each of the first preset number of consecutive positions is determined as lane information of a relevant lane within a second specified range from the target vehicle.

6. The method of claim 1, wherein the plurality of different driving strategies include at least any two of the following:

Accelerate straight, maintain speed straight, slow down straight, emergency braking in a straight line, uniform lane change to the left lane, and uniform lane change to the right lane.

7. The method according to claim 1, wherein the first neural network model is obtained by training through the following steps:

obtaining sample data; wherein the sample data includes speed information and position information of a target sample vehicle, speed information and position information of relevant sample vehicles related to the target sample vehicle, and lane information of the lane corresponding to the target sample vehicle;

calculating, according to the sample data and a loss function, the loss value of an initial neural network model corresponding to each of a second preset number of consecutive moments; wherein the loss function includes one or more of a collision parameter, an energy consumption parameter, a lane change penalty parameter, and a driving efficiency parameter;

The parameters of the initial neural network model are adjusted according to the loss value, and the corresponding initial neural network model when the training stop condition is satisfied is determined as the first neural network model.

8. The method according to claim 7, wherein the method further comprises:

Determine, by using the initial neural network model, the prediction scores corresponding to the plurality of different driving strategies at the next moment of the current moment in the second preset number of consecutive moments;

Taking the driving strategy with the largest predicted score as the driving strategy at the next moment, and acquiring the sample data of the next moment generated by the target sample vehicle running with the driving strategy at the next moment;

Based on the sample data at the next moment, continue to train the initial neural network model.

9. The method according to claim 1, wherein the step of determining a target driving strategy from the plurality of different driving strategies according to the driving scores corresponding to the plurality of different driving strategies respectively comprises:

determining the driving strategies whose driving scores are higher than a preset score threshold as a set of candidate driving strategies;

A target driving strategy is determined from the set of candidate driving strategies.

10. The method according to claim 9, wherein the step of determining a target driving strategy from the set of alternative driving strategies comprises:

Judging whether the driving strategy with the highest driving score in the candidate driving strategy set satisfies the preset safe driving condition;

If satisfied, determine the driving strategy as the target driving strategy;

If not, the driving strategy is deleted from the set of alternative driving strategies.

11. The method according to any one of claims 1-10, wherein the method further comprises:

if the target driving strategy is a constant-speed lane change to the left lane or a constant-speed lane change to the right lane, controlling the vehicle to change lanes by the following formula:

[formula not reproduced in the source text]

where [symbol not reproduced] represents the steering wheel angle; θ_n and θ_f respectively represent the angles that the first forward position and the second forward position on the target lane form with the position of the ego vehicle; Δt is the time step; k_f, k_n and k_I are constants representing driving behavior; and the target lane is the lane to which the target vehicle is preparing to switch from the lane where it is currently located.

12. A vehicle automatic driving device, characterized in that the device comprises:

an information acquisition module, configured to acquire first driving information of a target vehicle at the current moment, second driving information corresponding to relevant vehicles within a first specified range from the target vehicle, and lane information of relevant lanes within a second specified range from the target vehicle;

a prediction module, configured to predict, by using a first neural network model, the first driving information, the second driving information and the lane information, the driving scores of the target vehicle when it travels with each of a plurality of different predetermined driving strategies; wherein the driving score is used to characterize the driving conditions and vehicle performance of the target vehicle traveling with the predetermined driving strategy, and the first neural network model is obtained by training on sample data corresponding to different traffic conditions;

a target driving strategy determination module, configured to determine a target driving strategy from the plurality of different driving strategies according to the driving scores corresponding to the plurality of different driving strategies, so that the target vehicle drives according to the target driving strategy at the next moment after the current moment.

13. An electronic device, comprising a processor and a memory, the memory storing computer-executable instructions executable by the processor, the processor executing the computer-executable instructions to implement the method of any one of claims 1-11.

14. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions that, when invoked and executed by a processor, cause the processor to implement the method of any one of claims 1-11.