CN116279471A

CN116279471A - A Vehicle Acceleration Prediction Method Considering Driving Behavior Characteristics in Car-following Scenarios

Info

Publication number: CN116279471A
Application number: CN202211598432.8A
Authority: CN
Inventors: 汤书宁; 丁路洒; 邹亚杰; 张�浩
Original assignee: Tongji University
Current assignee: Tongji University
Priority date: 2022-12-12
Filing date: 2022-12-12
Publication date: 2023-06-23

Abstract

The invention relates to a vehicle acceleration prediction method considering driving behavior characteristics in a following scene, which comprises the following steps: step 1) obtaining a prediction data set, and carrying out data processing and variable extraction: screening and reserving the obtained prediction data set based on a preconfigured rule, and extracting characteristic variables from the effective following fragment data; step 2) constructing a driving behavior semantic division model based on CHMM, taking a characteristic variable as input of the driving behavior semantic division model, dividing the characteristic variable into segments with different behavior characteristics, and performing similarity evaluation on driving behavior semantics; and 3) constructing a CNN-LSTM acceleration prediction model, dividing the effective following segment data based on the similarity evaluation result to construct an input-output sample set, training the CNN-LSTM acceleration prediction model, and predicting the vehicle acceleration based on the trained CNN-LSTM acceleration prediction model. Compared with the prior art, the method has the advantages of high prediction precision and the like.

Description

A Vehicle Acceleration Prediction Method Considering Driving Behavior Characteristics in Car-following Scenarios

技术领域technical field

本发明涉及车辆智能驾驶和主动安全技术领域，尤其是涉及一种跟驰场景下考虑驾驶行为特征的车辆加速度预测方法。The invention relates to the technical fields of vehicle intelligent driving and active safety, in particular to a vehicle acceleration prediction method considering driving behavior characteristics in a car-following scene.

背景技术Background technique

近些年来，随着我国经济的快速发展和城市化进程的不断加快，城市区域一体化趋势加强，交通工具的保有量和规模也随之加快和扩大。同时，各种交通问题也随之出现。交通安全问题日益突出，严重影响着的交通出行质量和社会经济发展。In recent years, with the rapid development of my country's economy and the continuous acceleration of urbanization, the trend of urban regional integration has been strengthened, and the number and scale of transportation vehicles have also accelerated and expanded. At the same time, various traffic problems also appear thereupon. The problem of traffic safety is becoming more and more prominent, seriously affecting the quality of traffic travel and social and economic development.

驾驶行为是影响交通安全的重要因素之一，在由人、车辆和道路组成的复杂交通环境中，人为因素是导致交通事故的最重要因素。一些研究表明，70％以上的交通事故是由人为错误引起的，尤其是长途旅行，驾驶环境复杂，因此，对驾驶行为的研究是交通安全领域的研究热点；此外，车辆智能化和个性化服务是未来车辆技术发展的重要趋势，对不同驾驶习惯的驾驶员进行个性化预测有助于针对不同风格的驾驶员行为实现个性化建模，捕捉驾驶员行为特征，提高驾驶行为的预测精度。目前针对驾驶行为预测的研究主要分为两类：模型驱动方法和数据驱动方法。Driving behavior is one of the important factors affecting traffic safety. In the complex traffic environment composed of people, vehicles and roads, human factors are the most important factors leading to traffic accidents. Some studies have shown that more than 70% of traffic accidents are caused by human errors, especially for long-distance travel, and the driving environment is complex. Therefore, research on driving behavior is a research hotspot in the field of traffic safety; in addition, vehicle intelligence and personalized services It is an important trend in the development of vehicle technology in the future. Personalized prediction of drivers with different driving habits can help realize personalized modeling for different styles of driver behavior, capture driver behavior characteristics, and improve the prediction accuracy of driving behavior. Current research on driving behavior prediction is mainly divided into two categories: model-driven methods and data-driven methods.

模型驱动方法是一种基于特定假设的固定结构模型，其参数一般从经验数据中计算得到。模型驱动预测方法一般使用的数据较少，复杂性有限。因此，在处理大型数据集时，无法充分挖掘大型数据集中的有效信息，模型复杂程度也会增大。The model-driven approach is a fixed-structure model based on specific assumptions, whose parameters are generally calculated from empirical data. Model-driven forecasting methods generally use less data and have limited complexity. Therefore, when dealing with large data sets, the effective information in large data sets cannot be fully mined, and the complexity of the model will also increase.

数据驱动方法是一类没有固定结构和固定参数的模型，用于驾驶行为预测的数据驱动模型可以使用多个不同来源的数据集，从而预测性能。然而，这类预测方法大多没有考虑驾驶行为分类，无法挖掘驾驶行为的潜在异质性。此外，由于驾驶行为数据是随机和非线性的，难以使用传统的线性模型来进行准确描述，近年来，基于深度学习的方法在时间序列数据预测中得到了广泛应用。而单一深度学习模型拟合能力弱，对于高维度，相关性复杂的数据拟合效果差。Data-driven methods are a class of models without fixed structure and fixed parameters. Data-driven models for driving behavior prediction can use multiple datasets from different sources to predict performance. However, most of these prediction methods do not consider the classification of driving behavior and cannot tap the potential heterogeneity of driving behavior. In addition, since driving behavior data is random and non-linear, it is difficult to accurately describe it using traditional linear models. In recent years, methods based on deep learning have been widely used in time series data prediction. However, the fitting ability of a single deep learning model is weak, and the fitting effect is poor for data with high dimensions and complex correlations.

发明内容Contents of the invention

本发明的目的就是为了提供一种跟驰场景下考虑驾驶行为特征的车辆加速度预测方法，考虑驾驶员的行为特征以提高加速度预测精度。The purpose of the present invention is to provide a vehicle acceleration prediction method that considers driving behavior characteristics in a car-following scene, and considers the driver's behavior characteristics to improve the accuracy of acceleration prediction.

本发明的目的可以通过以下技术方案来实现：The purpose of the present invention can be achieved through the following technical solutions:

一种跟驰场景下考虑驾驶行为特征的车辆加速度预测方法，包括以下步骤：A vehicle acceleration prediction method considering driving behavior characteristics in a car-following scene, comprising the following steps:

步骤1)获取预测数据集并进行数据处理和变量提取：对获取的预测数据集，基于预配置的规则对有效跟驰片段进行筛选保留，从有效跟驰片段数据中提取特征变量，其中，所述预测数据集包括自车数据和前车数据；Step 1) Obtain the predicted data set and perform data processing and variable extraction: For the obtained predicted data set, filter and retain the effective car-following segments based on pre-configured rules, and extract feature variables from the effective car-following segment data. The prediction data set includes the vehicle data and the preceding vehicle data;

步骤2)构建基于CHMM的驾驶行为语义划分模型，以特征变量作为驾驶行为语义划分模型的输入，将特征变量分割成具有不同行为特征的片段，并对驾驶行为语义进行相似度评价；Step 2) Build a CHMM-based driving behavior semantic segmentation model, use feature variables as the input of the driving behavior semantic segmentation model, segment the feature variables into segments with different behavioral characteristics, and evaluate the similarity of driving behavior semantics;

步骤3)构建CNN-LSTM加速度预测模型，基于相似度评价结果划分有效跟驰片段数据构建输入输出样本集，对CNN-LSTM加速度预测模型进行训练，基于训练完成的CNN-LSTM加速度预测模型进行车辆加速度预测。Step 3) Construct a CNN-LSTM acceleration prediction model, divide the effective car-following segment data based on the similarity evaluation results to construct an input and output sample set, train the CNN-LSTM acceleration prediction model, and perform vehicle training based on the trained CNN-LSTM acceleration prediction model. Acceleration prediction.

所述步骤1)包括以下步骤：Described step 1) comprises the following steps:

步骤11)对预测数据集进行数据清洗，剔除异常值或大于1s的缺失帧，用线性插值弥补小于1s的缺失片段，获得有效数据集；Step 11) Carry out data cleaning on the predicted data set, remove outliers or missing frames greater than 1s, and use linear interpolation to make up for missing segments less than 1s to obtain a valid data set;

步骤12)对获得的有效数据集进行有效跟车片段的提取，根据两车之间的相对位置情况、目标车辆的速度大小、是否有超车事件发生以及跟驰事件持续时间长短对有效跟驰片段进行筛选保留；Step 12) Extracting the effective car-following segment from the obtained valid data set, according to the relative position between the two vehicles, the speed of the target vehicle, whether there is an overtaking event, and the duration of the car-following event. carry out screening retention;

步骤13)从有效跟驰片段数据中提取特征变量。Step 13) Extract feature variables from the effective car-following segment data.

所述步骤2)包括以下步骤：Described step 2) comprises the following steps:

步骤21)根据预配置的分段阈值把特征变量划分为不同的等级；Step 21) divide the feature variable into different grades according to the pre-configured segmentation threshold;

步骤22)以驾驶员为单位对特征变量序列分别进行标准化处理；Step 22) standardize the feature variable sequence with the driver as the unit;

步骤23)根据驾驶行为的时序性质以及数据分布特征，构建基于CHMM的驾驶行为语义划分模型，采用无监督学习的方法，以标准化后的特征变量作为驾驶行为语义划分模型的输入，将特征变量数据分割成具有不同行为特征的片段；Step 23) According to the temporal nature of driving behavior and data distribution characteristics, construct a CHMM-based semantic segmentation model of driving behavior, adopt the method of unsupervised learning, use the standardized characteristic variables as the input of the semantic segmentation model of driving behavior, and convert the characteristic variable data Segmentation into segments with different behavioral characteristics;

步骤24)基于标准化出现频率分布与JS散度计算方法，求解每对驾驶员两两之间行为语义持续时长最优分布、行为语义频率分布、特征变量分布的JS散度，加权求和得到驾驶行为相似程度的量化指标值。Step 24) Based on the standardized occurrence frequency distribution and JS divergence calculation method, solve the optimal distribution of behavior semantic duration, behavior semantic frequency distribution, and JS divergence of characteristic variable distribution between each pair of drivers, and obtain the driving Quantitative index value of behavioral similarity.

所述步骤22)中的标准化处理过程为：The standardization process in the step 22) is:

其中，x表示特征变量序列；m为驾驶员编号，M表示驾驶员数量；l为驾驶员m的稳定跟驰事件编号，L表示该驾驶员的稳定跟驰事件数；μ_m和σ_m分别表示驾驶员m的特征变量的均值和标准差；标准化处理后特征变量的均值为0，标准差等于1。Among them, x represents the sequence of characteristic variables; m is the number of drivers, and M represents the number of drivers; l is the number of stable car-following events of driver m, and L represents the number of stable car-following events of the driver; μ _m and σ _m are respectively Indicates the mean and standard deviation of the characteristic variables of the driver m; after standardization, the mean of the characteristic variables is 0, and the standard deviation is equal to 1.

所述基于CHMM的驾驶行为语义划分模型中，In the CHMM-based driving behavior semantic division model,

在HMM中，包含两种序列类型，其中观测序列X＝{x₁，x₂，...，x_T}可以观测得到，隐藏状态序列S＝{s₁，s₂，...，s_T}无法观测得到；HMM由状态转移矩阵A，观测概率矩阵B和初始状态概率分布П表示：λ＝(A，B，Π)，隐状态个数为I，观测变量的所有可能取值数为J；In HMM, there are two sequence types, where the observation sequence X={x ₁ , x ₂ ,...,x _T } can be observed, and the hidden state sequence S={s ₁ , s ₂ ,..., s _T } cannot be observed; HMM is represented by the state transition matrix A, the observation probability matrix B and the initial state probability distribution П: λ=(A, B, Π), the number of hidden states is I, and the number of all possible values of the observed variable for J;

其中，状态转移矩阵A＝[a_ij]_I×I，其状态转移概率为：Among them, the state transition matrix A＝[a _ij ] _I×I , the state transition probability is:

a_ij＝P(s_t+1＝q_j|s_t＝q_i)a _ij ＝P(s _t+1 ＝q _j |s _t ＝q _i )

观测概率矩阵B＝[b_ij]_I×J，观测值仅由当前时刻的隐状态决定：Observation probability matrix B=[b _ij ] _I×J , the observation value is only determined by the hidden state at the current moment:

b_ij＝P(x_t＝x_j|s_t＝q_i)b _ij =P(x _t =x _j |s _t =q _i )

初始状态概率分布∏＝[π_i]_1×I，表示初始时刻的隐状态的概率分布：Initial state probability distribution ∏=[π _i ] _1×I , which represents the probability distribution of the hidden state at the initial moment:

π_i＝P(s₁＝q_i)π _i =P(s ₁ =q _i )

CHMM中每条HMM链的隐状态个数均设为N，则CHMM模型隐状态个数为N^Q，其中，Q表示特征变量的个数；任意时刻模型的隐状态组合可以表示为

HMM由模型参数λ＝(A，B，∏)进行表示，其中，状态转移矩阵A＝{a_u，v}、观测值概率B＝b_u(X_t)和初始状态概率∏＝{π_u}的计算方法如下：The number of hidden states of each HMM chain in CHMM is set to N, then the number of hidden states of the CHMM model is N ^Q , where Q represents the number of characteristic variables; the combination of hidden states of the model at any time can be expressed as

HMM is represented by model parameters λ=(A, B, ∏), where state transition matrix A={a _{u, v} }, observation probability B= _bu (X _t ) and initial state probability ∏={π _u } is calculated as follows:

CHMM的隐状态层解码了特征变量间的依赖关系，考虑到特征变量均为连续型变量，引入高斯分布计算观测概率：The hidden state layer of CHMM decodes the dependencies between feature variables. Considering that the feature variables are all continuous variables, a Gaussian distribution is introduced to calculate the observation probability:

其中，μ_c，u和σ_c，u分别表示特征变量c对应的HMM链中隐状态为q_c，u时，概率分布的均值和标准差。Among them, μ _{c, u} and σ _{c, u} respectively represent the mean and standard deviation of the probability distribution when the hidden state in the HMM chain corresponding to the characteristic variable c is q _{c, u} .

所述步骤24)包括以下步骤：Described step 24) comprises the following steps:

步骤241)驾驶员D_m提取得到的每一个行为语义片段的持续时长为d_m，f(d_m)表示行为语义持续时长的分布，f(d_n)表示驾驶员D_n提取得到的行为语义持续时长的分布，通过量化分布f(d_m)和f(d_n)的接近程度来度量不同驾驶员驾驶模式转移规律的相似性

Step 241) The duration of each behavioral semantic segment extracted by the driver D _m is d _m , f(d _m ) represents the distribution of the duration of the behavioral semantics, and f(d _n ) represents the behavioral semantics extracted by the driver D _n The distribution of duration, by quantifying the closeness of the distributions f(d _m ) and f(d _n ), to measure the similarity of the driving pattern transfer laws of different drivers

步骤242)通过量化概率质量函数的接近程度来度量驾驶员D_m和驾驶员D_n在驾驶模式选择偏好的相似性

Step 242) Measure the similarity of the driver D _m and the driver D _n in the driving mode selection preference by quantifying the proximity of the probability mass function

考虑到每一种类型的跟驰距离工况下，行为语义出现概率和等于1，分别度量跟驰距离工况S_Δd下概率质量函数

和/>

的接近程度后求均值，得到/>

Considering that under each type of car-following distance condition, the sum of the occurrence probability of behavioral semantics is equal to 1, respectively measure the probability mass function under the car-following distance condition S _Δd

and />

After calculating the mean value of the closeness, get />

步骤243)根据预配置的区段划分阈值将行为语义持续时长划分为j个区段，分别统计每一区段行为语义片段的特征变量数据的均值和标准差，驾驶员D_m在区段d中特征变量x_k的均值和标准差记为

和/>

采用正态分布/>

简化驾驶员D_m在区段d中特征变量x_k的概率密度函数；依次计算每一个持续时长区段内的特征变量的概率密度函数对的接近程度，求均值得到驾驶员D_m和驾驶员D_n驾驶操作激进程度的相似性/>

Step 243) Divide the behavioral semantic duration into j segments according to the pre-configured segment threshold, and count the mean and standard deviation of the characteristic variable data of each segmental behavioral semantic segment respectively. Driver D _m is in segment d The mean and standard deviation of the characteristic variable x _k in are recorded as

and />

Using a normal distribution />

Simplify the probability density function of the characteristic variable x _k of the driver D _m in the segment d; calculate the closeness of the probability density function pairs of the characteristic variables in each duration segment in turn, and calculate the average value to obtain the driver D _m and the driver D _n similarity in aggressiveness of driving maneuvers />

其中，K表示特征变量的个数；Among them, K represents the number of feature variables;

步骤244)

和/>

分别从三个维度量化了驾驶员D_m和驾驶员D_n风格的相似程度，通过加权平均得到驾驶风格相似性的综合评价指标η_m，n：step 244)

and />

The similarity degree of driver D _m and driver D _n style is quantified from three dimensions respectively, and the comprehensive evaluation index η _m,n of driving style similarity is obtained by weighted average:

其中，w_i∈[0，1]且

分别表示三个驾驶风格相似性指标的权重；where w _i ∈ [0, 1] and

represent the weights of the three driving style similarity indicators respectively;

步骤245)给定两个标准化频率分布的离散型概率密度分布

和/>

采用JS散度来度量两个概率分布的相似性：Step 245) given two discrete probability density distributions of standardized frequency distributions

and />

JS divergence is used to measure the similarity of two probability distributions:

其中，离散型概率密度分布

和/>

的KL散度/>

的计算方法为：Among them, the discrete probability density distribution

and />

KL divergence />

The calculation method is:

和/>

分别表示在跟驰距离工况S_Δd下，驾驶员D_m和驾驶员D_n行为语义的标准化出现频率；

and />

Respectively represent the normalized occurrence frequency of driver D _m and driver D _n behavior semantics under the car-following distance condition S _Δd ;

JS散度的值域范围是[0,1]，当两个概率分布相同时，JS散度等于0，概率分布差别越大，JS值越接近1。The value range of JS divergence is [0,1]. When the two probability distributions are the same, the JS divergence is equal to 0. The greater the difference in probability distribution, the closer the JS value is to 1.

所述步骤3)包括以下步骤：Described step 3) comprises the following steps:

步骤31)根据相似性评价结果和预配置的相似度阈值，选择符合相似度条件的若干驾驶员作为一组，将有效跟驰片段按照驾驶员以预配置的比例分别归入训练集、测试集和验证集；Step 31) According to the similarity evaluation results and the pre-configured similarity threshold, select several drivers who meet the similarity conditions as a group, and classify the effective car-following segments into the training set and the test set according to the pre-configured ratio of the drivers and validation set;

步骤32)构建输入输出样本，将训练集和测试集中包含加速度及其他特征变量的数据转化为N维数组，每个输入样本由K个时刻的特征变量组成，输出标签样本为第K+1时刻的加速度数据；Step 32) Construct input and output samples, convert the data containing acceleration and other characteristic variables in the training set and test set into an N-dimensional array, each input sample is composed of characteristic variables at K moments, and the output label sample is the K+1th moment acceleration data;

步骤33)构建CNN-LSTM车辆加速度预测模型；Step 33) build CNN-LSTM vehicle acceleration prediction model;

步骤34)模型训练及超参数优化：使用验证集对训练过程中的模型进行实时评估，检验其训练效果，采用Adam优化器进行迭代，重复训练过程使网络参数趋于最优；Step 34) model training and hyperparameter optimization: use the verification set to evaluate the model in the training process in real time, check its training effect, use Adam optimizer to iterate, and repeat the training process to make the network parameters tend to be optimal;

步骤35)基于训练完成的CNN-LSTM车辆加速度预测模型对目标车辆的加速度进行预测。Step 35) Predict the acceleration of the target vehicle based on the trained CNN-LSTM vehicle acceleration prediction model.

所述CNN-LSTM车辆加速度预测模型包括依次连接的输入层、CNN层、LSTM层和全连接层，其中，The CNN-LSTM vehicle acceleration prediction model includes an input layer, a CNN layer, an LSTM layer and a fully connected layer connected in sequence, wherein,

所述输入层用于接收输入数据；The input layer is used to receive input data;

所述CNN层用于压缩和提取输入数据的空间特征，所述CNN层包括一层卷积层和一层最大池化层；The CNN layer is used to compress and extract the spatial features of the input data, and the CNN layer includes a convolution layer and a maximum pooling layer;

所述LSTM层用于提取CNN输出数据的时间序列特征，所述LSTM层中的LSTM网络为多对一LSTM；The LSTM layer is used to extract the time series features of CNN output data, and the LSTM network in the LSTM layer is many-to-one LSTM;

所述全连接层用于输出加速度预测结果；The fully connected layer is used to output acceleration prediction results;

模型网络中的每层均添加了Dropout层，且选用均方误差作为模型损失函数和预测准确度的评价。Each layer in the model network is added with a Dropout layer, and the mean square error is selected as the evaluation of the model loss function and prediction accuracy.

所述卷积层的计算公式如下：The calculation formula of the convolutional layer is as follows:

其中，σ为激活函数，

为第k层、第j维的特征图，M_j为输入图的集合，/>

为卷积核，/>

为偏置。Among them, σ is the activation function,

is the feature map of the kth layer and the jth dimension, M _j is the set of input maps, />

is the convolution kernel, />

for the bias.

所述LSTM网络的遗忘门参数更新公式为：The forget gate parameter update formula of the LSTM network is:

f_t＝σ(w_fh_t-1+w_fx_t+b_f)f _t ＝σ(w _f h _t-1 +w _f x _t +b _f )

其中，f_t表示t时刻的遗忘门状态，w_f和b_f分别是遗忘门的权值和偏置，σ表示sigmoid函数，输出值范围为0到1，h_t表示隐含层状态，x_t表示t时刻网络的输入值；Among them, f _t represents the state of the forget gate at time t, w _f and b _f are the weight and bias of the forget gate, respectively, σ represents the sigmoid function, and the output value ranges from 0 to 1, h _t represents the state of the hidden layer, x _t represents the input value of the network at time t;

LSTM网络的输入门用于更新输入细胞状态中的信息：The input gate of the LSTM network is used to update the information in the input cell state:

i_t＝σ(w_ix_t+w_ih_t-1+b_i)i _t ＝σ(w _i x _t +w _i h _t-1 +b _i )

其中，i_t表示t时刻的输入门状态，w_i和b_i分别是输入门的权值和偏置；Among them, i _t represents the state of the input gate at time t, and w _i and b _i are the weight and bias of the input gate, respectively;

LSTM网络记忆细胞状态c_t，由存储在前一个记忆细胞中的信息c_t-1和新的候选信息进行更新：The LSTM network memory cell state c _t is updated by the information c _t-1 stored in the previous memory cell and the new candidate information:

其中，

是t时刻的候选细胞状态，c_t是更新后的细胞状态，w_c和b_c分别是细胞状态的权值和偏置，.代表点积；in,

is the candidate cell state at time t, c _t is the updated cell state, w _c and b _c are the weight and bias of the cell state respectively, . represents the dot product;

LSTM的输出门用于控制细胞状态中的信息输出，根据记忆细胞状态c_t和输出门状态o_t计算输出隐含层状态h_t：The output gate of LSTM is used to control the information output in the cell state, and the output hidden layer state h t is calculated according to the memory cell state c _t and the output gate state o _t _:

o_t＝σ(w_ox_t+w_oh_t-1+b_o)o _t ＝σ(w _o x _t +w _o h _t-1 +b _o )

h_t＝o_t·tanh(c_t)h _t ＝o _t ·tanh(c _t )

与现有技术相比，本发明具有以下有益效果：Compared with the prior art, the present invention has the following beneficial effects:

(1)本发明基于CHMM的驾驶行为语义划分模型将特征变量数据分割成具有不同行为特征的片段，这是对HMM模型的一种拓展，通过建立多条HMM链的隐状态变量间的相互关联，实现HMM链结构的耦合，从而描述多特征变量数据的综合统计特性；除此之外，CHMM还充分利用了HMM的数学结构和概率推理能力，同时建立了多通道隐状态间的影响关系，非常适用于融合多变量数据信息，能够挖掘不同驾驶员之间驾驶行为的异质性，提高预测精度。(1) The CHMM-based driving behavior semantic segmentation model of the present invention divides the feature variable data into segments with different behavioral characteristics, which is an extension of the HMM model, by establishing the correlation between the hidden state variables of multiple HMM chains , to realize the coupling of the HMM chain structure, so as to describe the comprehensive statistical characteristics of multi-characteristic variable data; in addition, CHMM also makes full use of the mathematical structure and probabilistic reasoning ability of HMM, and establishes the influence relationship between multi-channel hidden states, It is very suitable for the fusion of multivariate data information, which can mine the heterogeneity of driving behavior among different drivers and improve the prediction accuracy.

(2)本发明使用CNN-LSTM车辆加速度预测模型对分组后的驾驶员跟驰数据进行加速度预测，获取了数据之间的空间特征，克服了LSTM单个模型的局限性，结合CNN在空间特征提取和LSTM在时序学习方面的优势，充分捕捉数据之间的空间特征和时间特征，有效提升了模型的预测精度。(2) The present invention uses the CNN-LSTM vehicle acceleration prediction model to carry out acceleration prediction on the grouped driver's car-following data, obtains the spatial features between the data, overcomes the limitations of the LSTM single model, and combines CNN in spatial feature extraction With the advantages of LSTM in time series learning, it fully captures the spatial and temporal features between data, and effectively improves the prediction accuracy of the model.

(3)本发明提出的跟驰场景下考虑驾驶行为特征的车辆加速度预测方法，能够合理地将车辆跟驰数据划分行为语义段，通过相似性评价将同质驾驶员归为一组，有效捕捉数据的空间特征和时间特征，从而实现高精度的加速度预测，促进高级驾驶辅助系统(ADAS)的发展，改善交通安全问题。(3) The car-following scene proposed by the present invention considers the vehicle acceleration prediction method of driving behavior characteristics, which can reasonably divide the vehicle-following data into behavioral semantic segments, group homogeneous drivers into one group through similarity evaluation, and effectively capture Spatial and temporal characteristics of the data, so as to achieve high-precision acceleration prediction, promote the development of advanced driver assistance systems (ADAS), and improve traffic safety issues.

附图说明Description of drawings

图1为本发明的方法流程图；Fig. 1 is method flowchart of the present invention;

图2为本发明实施例的驾驶行为语义分割图；FIG. 2 is a semantic segmentation diagram of driving behavior according to an embodiment of the present invention;

图3为本发明实施例的驾驶行为语义相似度评价图；Fig. 3 is the driving behavior semantic similarity evaluation figure of the embodiment of the present invention;

图4为本发明的CNN-LSTM车辆加速度预测模型结构图；Fig. 4 is the structure diagram of CNN-LSTM vehicle acceleration prediction model of the present invention;

图5为本发明实施例的车辆加速度预测结果与真实值对比图。Fig. 5 is a comparison chart of the vehicle acceleration prediction result and the real value according to the embodiment of the present invention.

具体实施方式Detailed ways

下面结合附图和具体实施例对本发明进行详细说明。本实施例以本发明技术方案为前提进行实施，给出了详细的实施方式和具体的操作过程，但本发明的保护范围不限于下述的实施例。The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments. This embodiment is carried out on the premise of the technical solution of the present invention, and detailed implementation and specific operation process are given, but the protection scope of the present invention is not limited to the following embodiments.

本实施例提供一种跟驰场景下考虑驾驶行为特征的车辆加速度预测方法，如图1所示，包括以下步骤：This embodiment provides a vehicle acceleration prediction method considering driving behavior characteristics in a car-following scene, as shown in FIG. 1 , including the following steps:

步骤1)获取预测数据集并进行数据处理和变量提取：对获取的预测数据集，基于预配置的规则对有效跟驰片段进行筛选保留，从有效跟驰片段数据中提取特征变量，其中，所述预测数据集包括自车数据和前车数据。Step 1) Obtain the predicted data set and perform data processing and variable extraction: For the obtained predicted data set, filter and retain the effective car-following segments based on pre-configured rules, and extract feature variables from the effective car-following segment data. The above prediction data set includes self-vehicle data and front-vehicle data.

步骤11)对预测数据集进行数据清洗，剔除异常值或大于1s的缺失帧，用线性插值弥补小于1s的缺失片段，获得有效数据集。Step 11) Perform data cleaning on the prediction data set, remove outliers or missing frames longer than 1s, and use linear interpolation to make up for missing segments less than 1s to obtain a valid data set.

实施例选取SPMD数据集中30名具有相似驾驶场景的驾驶员的数据作为预测数据集进行分析。数据集中的重要变量信息如下：The embodiment selects the data of 30 drivers with similar driving scenes in the SPMD data set as the prediction data set for analysis. The important variable information in the dataset is as follows:

表1SPMD数据说明Table 1 SPMD data description

步骤12)对获得的有效数据集进行有效跟车片段的提取，根据两车之间的相对位置情况、目标车辆的速度大小、是否有超车事件发生以及跟驰事件持续时间长短对有效跟驰片段进行筛选保留。Step 12) Extracting the effective car-following segment from the obtained valid data set, according to the relative position between the two vehicles, the speed of the target vehicle, whether there is an overtaking event, and the duration of the car-following event. Keep screening.

对于数据清洗后的数据集，本实施例根据以下条件对符合要求的有效跟驰片段进行保留：For the data set after data cleaning, this embodiment retains the valid car-following segments that meet the requirements according to the following conditions:

(1)主车与前车位于同一车道；(1) The main vehicle and the vehicle in front are in the same lane;

(2)沿车道方向上的相对距离大于5m，且小于120m；(2) The relative distance along the lane is greater than 5m and less than 120m;

(3)车辆速度大于5m/s；(3) The vehicle speed is greater than 5m/s;

(4)当邻车切入，即前车ID发生变化时，跟驰事件终止；(4) When the adjacent vehicle cuts in, that is, when the ID of the preceding vehicle changes, the car-following event is terminated;

(5)单个跟驰事件的持续时长大于50s。(5) The duration of a single car-following event is longer than 50s.

本实施例中，为了研究主车的跟驰行为特征，选取主车加速度、相对速度和相对距离三个特征变量描述跟驰场景：In this embodiment, in order to study the car-following behavior characteristics of the main vehicle, three characteristic variables of the main vehicle acceleration, relative speed and relative distance are selected to describe the car-following scene:

(1)主车加速度(a₁)：可以直接体现驾驶意图和驾驶员行为偏好。(1) Acceleration of the main vehicle (a ₁ ): it can directly reflect the driving intention and the driver's behavior preference.

(2)相对距离(Δd)：主车和前车沿前进方向的位置差，Δd＝x₂-x₁，恒大于0。(2) Relative distance (Δd): The position difference between the main vehicle and the leading vehicle along the forward direction, Δd=x ₂ -x ₁ , always greater than 0.

(3)相对速度(Δv)：主车和前车沿前进方向的速度差，Δv＝v₂-v₁，当相对速度大于0时，前车速度更快，两车逐渐远离；当相对距离小于0时，后车速度更快，后车越来越靠近前车。(3) Relative speed (Δv): the speed difference between the main vehicle and the front vehicle along the forward direction, Δv=v ₂ -v ₁ , when the relative speed is greater than 0, the speed of the front vehicle is faster, and the two vehicles gradually move away; when the relative distance When it is less than 0, the speed of the rear vehicle is faster, and the rear vehicle is getting closer to the front vehicle.

步骤2)构建基于CHMM的驾驶行为语义划分模型，以特征变量作为驾驶行为语义划分模型的输入，将特征变量分割成具有不同行为特征的片段，并对驾驶行为语义进行相似度评价。Step 2) Construct a CHMM-based semantic segmentation model of driving behavior, use feature variables as the input of the semantic segmentation model of driving behavior, segment the feature variables into segments with different behavioral characteristics, and evaluate the similarity of driving behavior semantics.

步骤21)根据预配置的分段阈值把特征变量划分为不同的等级。Step 21) Divide the feature variable into different levels according to the pre-configured segmentation threshold.

根据特征变量的频率分布直方图、累积分布图得到数据的分位点，结合分位点和驾驶员的感知舒适阈值，将相对距离划分为{远距离LD,中等距离ND,近距离CD}三个等级，相对速度和加速度分别划分为{快速远离RFB,缓慢远离FB,保持KE,缓慢靠近CI,快速靠近RCI}和{急加速AA,缓加速GA,无加速NA,缓减速GD,急减速AD}五个等级。According to the frequency distribution histogram and cumulative distribution map of the characteristic variables, the quantile points of the data are obtained, combined with the quantile points and the driver's perceived comfort threshold, the relative distance is divided into {long-distance LD, medium-distance ND, short-distance CD} three The relative speed and acceleration are divided into {fast away from RFB, slowly away from FB, keep KE, slowly close to CI, fast close to RCI} and {fast acceleration AA, slow acceleration GA, no acceleration NA, slow deceleration GD, rapid deceleration AD} five levels.

分段信息如表2所示：The segmentation information is shown in Table 2:

表2变量分段信息说明Table 2 Description of variable segmentation information

步骤22)以驾驶员为单位对特征变量序列分别进行标准化处理：Step 22) standardize the sequence of feature variables with the driver as the unit:

其中，x＝{Δd,Δv,a}^T表示相对距离、相对速度和加速度序列；m为驾驶员编号，M表示驾驶员数量，本实施例中，M＝30；l为驾驶员m的稳定跟驰事件编号，L表示该驾驶员的稳定跟驰事件数；μ_m和σ_m分别表示驾驶员m的特征变量的均值和标准差；标准化处理后特征变量的均值为0，标准差等于1。Wherein, x={Δd, Δv, a} ^T represents relative distance, relative speed and acceleration sequence; m is the driver number, M represents the number of drivers, in the present embodiment, M=30; l is the stability of driver m number of car-following events, L represents the number of stable car-following events of the driver; μ _m and σ _m represent the mean and standard deviation of the characteristic variables of driver m respectively; the mean value of the characteristic variables after standardization is 0, and the standard deviation is equal to 1 .

步骤23)根据驾驶行为的时序性质以及数据分布特征，构建基于CHMM的驾驶行为语义划分模型，采用无监督学习的方法，以标准化后的特征变量作为驾驶行为语义划分模型的输入，将特征变量数据分割成具有不同行为特征的片段。Step 23) According to the temporal nature of driving behavior and data distribution characteristics, construct a CHMM-based semantic segmentation model of driving behavior, adopt the method of unsupervised learning, use the standardized characteristic variables as the input of the semantic segmentation model of driving behavior, and convert the characteristic variable data Segmentation into segments with different behavioral characteristics.

在HMM(Hidden Markov Model，隐马尔可夫模型)中，包含两种序列类型，其中观测序列X＝{x₁,x₂,...,x_T}可以观测得到，隐藏状态序列S＝{s₁,s₂,...,s_T}无法观测得到。HMM可以由状态转移矩阵A，观测概率矩阵B和初始状态概率分布Π表示，λ＝(A,B,Π)。隐状态个数为I，观测变量的所有可能取值数为J。In HMM (Hidden Markov Model, Hidden Markov Model), there are two sequence types, where the observation sequence X={x ₁ ,x ₂ ,...,x _T } can be observed, and the hidden state sequence S={ s ₁ ,s ₂ ,...,s _T } cannot be observed. HMM can be represented by state transition matrix A, observation probability matrix B and initial state probability distribution Π, λ=(A, B, Π). The number of hidden states is I, and the number of all possible values of the observed variable is J.

其状态转移矩阵A＝[a_ij]_I×I，其状态转移概率为：Its state transition matrix A=[a _ij ] _I×I , and its state transition probability is:

a_ij＝P(s_t+1＝q_j|s_t＝q_i)a _ij ＝P(s _t+1 ＝q _j |s _t ＝q _i )

b_ij＝P(x_t＝x_j|s_t＝q_i)b _ij =P(x _t =x _j |s _t =q _i )

初始状态概率分布Π＝[π_i]_1×I，表示初始时刻的隐状态的概率分布：The initial state probability distribution Π=[π _i ] _1×I represents the probability distribution of the hidden state at the initial moment:

π_i＝P(s₁＝q_i)π _i =P(s ₁ =q _i )

所述基于CHMM的驾驶行为语义划分模型为基于HMM的改进模型，其中，The CHMM-based driving behavior semantic division model is an improved model based on HMM, wherein,

CHMM中每条HMM链的隐状态个数均设为N，则CHMM模型隐状态个数为N³，其中，3表示本实施例中选取的特征变量的个数；任意时刻模型的隐状态组合可以表示为

HMM由模型参数λ＝(A,B,Π)进行表示，其中，状态转移矩阵A＝{a_u,v}、观测值概率B＝b_u(X_t)和初始状态概率Π＝{π_u}的计算方法如下：The number of hidden states of each HMM chain in CHMM is set to N, then the number of hidden states of the CHMM model is N ³ , where 3 represents the number of characteristic variables selected in this embodiment; the hidden state combination of the model at any time It can be expressed as

HMM is represented by model parameters λ=(A,B,Π), where state transition matrix A={a _u,v }, observation probability B= _bu (X _t ) and initial state probability Π={π _u } is calculated as follows:

其中，μ_c,u和σ_c,u分别表示特征变量c对应的HMM链中隐状态为q_c,u时，概率分布的均值和标准差。Among them, μ _{c, u} and σ _{c, u} respectively represent the mean value and standard deviation of the probability distribution when the hidden state in the HMM chain corresponding to the characteristic variable c is q _{c, u} .

本实施例中，1号驾驶员的代表性轨迹分割结果如图2所示。横坐标为时间，实线、点划线和虚线三条折线分别展示了特征变量的原始数据信息，背景色块为提取得到的驾驶行为序列单元，相同颜色表示同一类型的行为语义。In this embodiment, the representative trajectory segmentation results of driver No. 1 are shown in FIG. 2 . The abscissa is time, and the solid line, dot-dash line, and dotted line respectively show the original data information of the feature variables. The background color block is the extracted driving behavior sequence unit, and the same color indicates the same type of behavior semantics.

CHMM的行为语义分割结果与数据波动情况基本吻合，能够精准地根据特征变量数据潜在的行为特征，提取有效行为语义序列单元。同时，模型可以有效克服数据噪声的干扰，避免出现过短的行为语义片段。The behavioral semantic segmentation results of CHMM are basically consistent with the data fluctuations, and can accurately extract effective behavioral semantic sequence units based on the potential behavioral characteristics of feature variable data. At the same time, the model can effectively overcome the interference of data noise and avoid too short behavioral semantic fragments.

和/>

的接近程度后求均值，得到/>

and />

After calculating the mean value of the closeness, get />

步骤243)根据预配置的区段划分阈值将行为语义持续时长划分为8个区段：{＜1,[1,5),[5,10),[10,15),[15,20),[20,25),[25,30),≥30}，分别统计每一区段行为语义片段的特征变量数据的均值和标准差，驾驶员D_m在区段d中特征变量x_k的均值和标准差记为

和/>

Step 243) Divide the semantic duration of the behavior into 8 segments according to the pre-configured segment division threshold: {<1, [1,5), [5,10), [10,15), [15,20) _. _{_} The mean and standard deviation are recorded as

and />

由于当数据量比较大时，绝大部分分布都可以用正态分布近似，因此本发明采用正态分布

简化驾驶员D_m在区段d中特征变量x_k的概率密度函数。同一个持续时长区段内的相同特征变量的分布规律的相似程度具有可比性，因此，本发明依次计算每一个持续时长区段内三个特征变量的概率密度函数对的接近程度，求均值得到驾驶员D_m和驾驶员D_n驾驶操作激进程度的相似性/>

Since most of the distributions can be approximated by the normal distribution when the amount of data is relatively large, the present invention adopts the normal distribution

Simplify the probability density function of the characteristic variable x _k of the driver D _m in the section d. The similarity of the distribution laws of the same characteristic variable in the same duration segment is comparable, therefore, the present invention calculates the closeness of the probability density function pairs of the three characteristic variables in each duration segment in turn, and calculates the mean value to obtain The similarity between the aggressiveness of driver D _m and driver D _n driving maneuvers />

其中，K表示特征变量的个数。Among them, K represents the number of feature variables.

步骤244)

和/>

分别从三个维度量化了驾驶员D_m和驾驶员D_n风格的相似程度，通过加权平均得到驾驶风格相似性的综合评价指标η_m,n：step 244)

and />

其中，w_i∈[0,1]且

分别表示三个驾驶风格相似性指标的权重，可以根据应用场景的需要进行调整。本实施例认为三个维度相似程度的重要性一致，设定

where w _i ∈ [0,1] and

Respectively represent the weights of the three driving style similarity indicators, which can be adjusted according to the needs of the application scenario. This embodiment considers that the importance of the similarity of the three dimensions is the same, setting

步骤245)给定两个标准化频率分布的离散型概率密度分布

和/>

and />

其中，离散型概率密度分布

和/>

的KL散度/>

的计算方法为：Among them, the discrete probability density distribution

and />

KL divergence />

The calculation method is:

和/>

分别表示在跟驰距离工况S_Δd下，驾驶员D_m和驾驶员D_n行为语义的标准化出现频率。

and />

Respectively represent the normalized occurrence frequency of driver D _m and driver D _n behavior semantics under the car-following distance condition S _Δd .

JS散度具有对称性，且值域范围是[0,1]，当两个概率分布相同时，JS散度等于0，概率分布差别越大，JS值越接近1。The JS divergence is symmetric, and the value range is [0,1]. When the two probability distributions are the same, the JS divergence is equal to 0, and the greater the difference between the probability distributions, the closer the JS value is to 1.

本实施例中，综合三方面相似程度分析的JS散度结果，计算加权均值得到多维度全方面的驾驶风格相似性量化评估结果，采用热力图直观展示结果如图3所示。深色区域的JS散度值更大，差异性更加显著；斜对角线JS散度值等于0，采用白色背景绘制。In this embodiment, the JS divergence results of the three-aspect similarity analysis are integrated, and the weighted average is calculated to obtain the multi-dimensional and comprehensive quantitative evaluation results of the driving style similarity. The heat map is used to visually display the results, as shown in Figure 3. The JS divergence value of the dark area is larger, and the difference is more significant; the JS divergence value of the diagonal line is equal to 0, drawn with a white background.

如图3所示，驾驶员#16与大多数驾驶员的驾驶风格差异性比较大，驾驶员#16的驾驶风格特别保守，驾驶模式转换频率低，期望跟驰间距比较大。驾驶员#24与#25分别与四位驾驶员的驾驶风格区别明显，这几位驾驶员在驾驶模式选择偏好与驾驶操作激进程度方面差别比较显著。As shown in Figure 3, the driving style of driver #16 is quite different from most drivers, and the driving style of driver #16 is particularly conservative, the frequency of driving mode switching is low, and the expected car-following distance is relatively large. The driving styles of drivers #24 and #25 are significantly different from those of the four drivers, and these drivers have significant differences in driving mode selection preferences and driving aggressiveness.

步骤3)构建CNN-LSTM加速度预测模型，其结构如图4所示，基于相似度评价结果划分有效跟驰片段数据构建输入输出样本集，对CNN-LSTM加速度预测模型进行训练，基于训练完成的CNN-LSTM加速度预测模型进行车辆加速度预测。Step 3) Construct the CNN-LSTM acceleration prediction model, the structure of which is shown in Figure 4. Based on the similarity evaluation results, divide the effective car-following segment data to construct an input and output sample set, and train the CNN-LSTM acceleration prediction model. CNN-LSTM acceleration prediction model for vehicle acceleration prediction.

步骤31)根据相似性评价结果和预配置的相似度阈值，选择符合相似度条件的10个驾驶员作为一组，将有效跟驰片段按照驾驶员以7：2：1的比例分别归入训练集、测试集和验证集。Step 31) According to the similarity evaluation results and the pre-configured similarity threshold, select 10 drivers who meet the similarity conditions as a group, and put the effective car-following segments into the training according to the ratio of 7:2:1 drivers set, test set, and validation set.

步骤32)构建输入输出样本，将训练集和测试集中包含加速度及其他特征变量的数据转化为N维数组，每个输入样本由K个时刻的特征变量组成，输出标签样本为第K+1时刻的加速度数据。Step 32) Construct input and output samples, convert the data containing acceleration and other characteristic variables in the training set and test set into an N-dimensional array, each input sample is composed of characteristic variables at K moments, and the output label sample is the K+1th moment acceleration data.

本实施例中模型的输入为主车加速度、相对距离、相对速度三个特征变量，输出为未来时刻的主车加速度。通过调试，选定时间窗长度为80，预测长度为1，对训练集，验证集，测试集进行样本划分，如表3所示，得到模型的输入样本和标签样本。The input of the model in this embodiment is the three characteristic variables of main vehicle acceleration, relative distance and relative speed, and the output is the main vehicle acceleration at a future moment. Through debugging, the selected time window length is 80, and the prediction length is 1. The training set, verification set, and test set are divided into samples. As shown in Table 3, the input samples and label samples of the model are obtained.

表3数据集用途及数量Table 3 Data set usage and quantity

步骤33)构建CNN-LSTM车辆加速度预测模型。Step 33) Construct a CNN-LSTM vehicle acceleration prediction model.

所述LSTM层用于提取CNN输出数据的时间序列特征，所述LSTM层中的LSTM网络为多对一LSTM，，隐藏层中含有若干个LSTM单元；The LSTM layer is used to extract the time series features of the CNN output data, the LSTM network in the LSTM layer is a many-to-one LSTM, and the hidden layer contains several LSTM units;

模型网络中的每层均添加了Dropout层。Each layer in the model network is added with a dropout layer.

其中，σ为激活函数，

为第k层、第j维的特征图，M_j为输入图的集合，/>

为卷积核，/>

为偏置。Among them, σ is the activation function,

is the convolution kernel, />

for the bias.

f_t＝σ(w_fh_t-1+w_fx_t+b_f)f _t ＝σ(w _f h _t-1 +w _f x _t +b _f )

i_t＝σ(w_ix_t+w_ih_t-1+b_i)i _t ＝σ(w _i x _t +w _i h _t-1 +b _i )

其中，

o_t＝σ(w_ox_t+w_oh_t-1+b_o)o _t ＝σ(w _o x _t +w _o h _t-1 +b _o )

h_t＝o_t·tanh(c_t)h _t ＝o _t ·tanh(c _t )

本实施例的卷积层激活函数选用ReLU，实现如果输入大于0，则输出与输入相等，否则输出为0的功能。池化层选用最大池化，可通过消除非最大值减少上层的计算量。同时利于提取不同区域的局部依赖关系，保留最显著的信息，将获得的区域向量作为LSTM网络的输入，LSTM层选用激活函数tanh，全连接层作为模型输出层。本实施例的优化器选用Adam优化器，Adam将自适应学习率的梯度下降算法和动量梯度下降算法相结合，从而适应稀疏梯度和梯度震荡的问题。The activation function of the convolutional layer in this embodiment uses ReLU to realize the function that if the input is greater than 0, the output is equal to the input, otherwise the output is 0. The pooling layer chooses the maximum pooling, which can reduce the calculation amount of the upper layer by eliminating the non-maximum value. At the same time, it is beneficial to extract the local dependencies of different regions, retain the most significant information, and use the obtained region vector as the input of the LSTM network. The LSTM layer uses the activation function tanh, and the fully connected layer is used as the model output layer. The optimizer of this embodiment selects the Adam optimizer, and Adam combines the gradient descent algorithm of the adaptive learning rate and the momentum gradient descent algorithm, so as to adapt to the problem of sparse gradient and gradient oscillation.

步骤34)模型训练及超参数优化：使用验证集对训练过程中的模型进行实时评估，检验其训练效果，采用Adam优化器进行迭代，重复训练过程使网络参数趋于最优。Step 34) Model training and hyperparameter optimization: use the verification set to evaluate the model in the training process in real time, check the training effect, use the Adam optimizer to iterate, and repeat the training process to make the network parameters tend to be optimal.

针对CNN-LSTM中可能出现的过拟合问题，本发明在网络中每层添加Dropout。Dropout为一种正则化方法，它会随机屏蔽神经元，被屏蔽的神经元不会被视为网络中的一部分，即不会参与正向传播的运算。Aiming at the possible overfitting problem in CNN-LSTM, the present invention adds Dropout to each layer in the network. Dropout is a regularization method that randomly shields neurons, and the shielded neurons will not be considered as part of the network, that is, they will not participate in the forward propagation operation.

本发明选用均方误差(Mean Square Error，MSE)作为模型损失函数和预测准确度的评价：MSE的值越小，说明模型预测具有更好的精确度。The present invention uses Mean Square Error (Mean Square Error, MSE) as the evaluation of the model loss function and prediction accuracy: the smaller the value of MSE, the better the accuracy of model prediction.

本实施例中，CNN-LSTM模型需要调整的超参数包括批大小、训练轮次、学习率、LSTM神经元个数等。选择模型收敛后损失函数最小的超参数组合作为最优超参数组合，通过调整，得到CNN-LSTM车辆加速度预测模型的超参数设置如表4所示：In this embodiment, the hyperparameters that need to be adjusted for the CNN-LSTM model include batch size, training rounds, learning rate, number of LSTM neurons, and the like. Select the hyperparameter combination with the smallest loss function after the model converges as the optimal hyperparameter combination. After adjustment, the hyperparameter settings of the CNN-LSTM vehicle acceleration prediction model are obtained as shown in Table 4:

表4超参数设置表Table 4 hyperparameter setting table

基于预测值和预测样本的真实值，本实施例选用均方误差(mean squared error，MSE)、均方根误差(root mean squared error，RMSE)、平均绝对误差(mean absoluteerror，MAE)来衡量预测的精度。下面给出的是RMSE和MAE的计算公式。Based on the predicted value and the true value of the predicted sample, the present embodiment selects mean squared error (mean squared error, MSE), root mean squared error (root mean squared error, RMSE), mean absolute error (mean absolute error, MAE) to measure the prediction accuracy. Given below are the calculation formulas for RMSE and MAE.

其中，Y_i为真实值，

为预测值。Among them, Y _i is the real value,

for the predicted value.

为了展示模型预测效果，将风格相似的驾驶员小组与余下的20位驾驶员小组(风格不一定相似)的加速度预测结果进行对比，将CNN-LSTM模型与传统LSTM进行对比，得到结果如下：In order to demonstrate the prediction effect of the model, the acceleration prediction results of the driver group with similar style and the remaining 20 driver groups (not necessarily similar in style) are compared, and the CNN-LSTM model is compared with the traditional LSTM, and the results are as follows:

表5预测结果对比表Table 5 Comparison table of prediction results

可以看出，驾驶风格相似小组1两种模型的预测结果都优于驾驶风格不一定相似的小组2，其中CNN-LSTM模型小组1的预测误差比小组2降低47.5％，说明本发明所采用的模型对于提高模型训练和预测性能具有显著优势。其次，本发明所使用的CNN-LSTM模型在两个小组的预测任务上效果都优于传统LSTM方法。在小组1的加速度预测上，两种模型差异不大，因为小组1的跟驰片段之间相似性较高，基本的LSTM模型也能实现较好的拟合效果。在小组2中，CNN-LSTM模型比LSTM的精度提高47.1％，说明CNN-LSTM在处理复杂预测任务时的性能要好于LSTM。It can be seen that the prediction results of the two models of group 1 with similar driving styles are better than those of group 2 with not necessarily similar driving styles, and the prediction error of group 1 of the CNN-LSTM model is 47.5% lower than that of group 2, which shows that the method used in the present invention Models have significant advantages for improving model training and predictive performance. Secondly, the CNN-LSTM model used in the present invention is better than the traditional LSTM method in the prediction tasks of the two groups. In the acceleration prediction of group 1, there is not much difference between the two models, because the similarity between the car-following segments of group 1 is relatively high, and the basic LSTM model can also achieve a better fitting effect. In panel 2, the CNN-LSTM model achieves 47.1% higher accuracy than LSTM, indicating that CNN-LSTM performs better than LSTM in handling complex prediction tasks.

图5绘制的是小组1部分跟驰片段的真实值与CNN-LSTM模型的预测值的对比图。从图中可以看出本发明的预测结果较好拟合了车辆加速度变化的趋势，预测效果良好。Figure 5 shows the comparison between the actual value of part of the car-following segment of group 1 and the predicted value of the CNN-LSTM model. It can be seen from the figure that the prediction result of the present invention better fits the trend of vehicle acceleration variation, and the prediction effect is good.

以上详细描述了本发明的较佳具体实施例。应当理解，本领域的普通技术人员无需创造性劳动就可以根据本发明的构思做出诸多修改和变化。因此，凡本技术领域中技术人员依据本发明的构思在现有技术的基础上通过逻辑分析、推理、或者有限的实验可以得到的技术方案，皆应在权利要求书所确定的保护范围内。The preferred specific embodiments of the present invention have been described in detail above. It should be understood that those skilled in the art can make many modifications and changes according to the concept of the present invention without creative effort. Therefore, all technical solutions that can be obtained by those skilled in the art based on the concept of the present invention through logical analysis, reasoning, or limited experiments on the basis of the prior art shall be within the scope of protection defined in the claims.

Claims

1. a vehicle acceleration prediction method considering driving behavior characteristics under a car-following scene, is characterized in that, comprises the following steps:

Step 1) Obtain the predicted data set and perform data processing and variable extraction: For the obtained predicted data set, filter and retain the effective car-following segments based on pre-configured rules, and extract feature variables from the effective car-following segment data. The prediction data set includes the vehicle data and the preceding vehicle data;

Step 2) Build a CHMM-based driving behavior semantic segmentation model, use feature variables as the input of the driving behavior semantic segmentation model, segment the feature variables into segments with different behavioral characteristics, and evaluate the similarity of driving behavior semantics;

Step 3) Construct a CNN-LSTM acceleration prediction model, divide the effective car-following segment data based on the similarity evaluation results to construct an input and output sample set, train the CNN-LSTM acceleration prediction model, and perform vehicle training based on the trained CNN-LSTM acceleration prediction model. Acceleration prediction.

2. the vehicle acceleration prediction method considering driving behavior characteristics under a kind of car-following scene according to claim 1, is characterized in that, described step 1) comprises the following steps:

Step 11) Carry out data cleaning on the predicted data set, remove outliers or missing frames greater than 1s, and use linear interpolation to make up for missing segments less than 1s to obtain a valid data set;

Step 12) Extracting the effective car-following segment from the obtained valid data set, according to the relative position between the two vehicles, the speed of the target vehicle, whether there is an overtaking event, and the duration of the car-following event. carry out screening retention;

Step 13) Extract feature variables from the effective car-following segment data.

3. the vehicle acceleration prediction method considering driving behavior characteristics under a kind of car-following scene according to claim 1, is characterized in that, described step 2) comprises the following steps:

Step 21) divide the feature variable into different grades according to the pre-configured segmentation threshold;

Step 22) standardize the feature variable sequence with the driver as the unit;

Step 23) According to the temporal nature of driving behavior and data distribution characteristics, construct a CHMM-based semantic segmentation model of driving behavior, adopt the method of unsupervised learning, use the standardized characteristic variables as the input of the semantic segmentation model of driving behavior, and convert the characteristic variable data Segmentation into segments with different behavioral characteristics;

Step 24) Based on the standardized occurrence frequency distribution and JS divergence calculation method, solve the optimal distribution of behavior semantic duration, behavior semantic frequency distribution, and JS divergence of characteristic variable distribution between each pair of drivers, and obtain the driving Quantitative index value of behavioral similarity.

4. the vehicle acceleration prediction method considering driving behavior characteristics under a kind of car-following scene according to claim 3, is characterized in that, the standardization process in described step 22) is:

Among them, x represents the sequence of characteristic variables; m is the number of drivers, and M represents the number of drivers; l is the number of stable car-following events of driver m, and L represents the number of stable car-following events of the driver; μ _m and σ _m are respectively Indicates the mean and standard deviation of the characteristic variables of the driver m; after standardization, the mean of the characteristic variables is 0, and the standard deviation is equal to 1.

5. the vehicle acceleration prediction method considering driving behavior characteristics under a kind of car-following scene according to claim 3, is characterized in that, in the described CHMM-based driving behavior semantic division model,

In HMM, there are two sequence types, where the observation sequence X={x ₁ ,x ₂ ,...,x _T } is obtained by observation, and the hidden state sequence S={s ₁ ,s ₂ ,...,s _T } cannot be observed; HMM is represented by the state transition matrix A, the observation probability matrix B and the initial state probability distribution Π: λ=(A,B,Π), the number of hidden states is I, and the number of all possible values of the observed variable for J;

Among them, the state transition matrix A＝[a _ij ] _I×I , the state transition probability is:

a _ij ＝P(s _t+1 ＝q _j |s _t ＝q _i )

Observation probability matrix B=[b _ij ] _I×J , the observation value is only determined by the hidden state at the current moment:

b _ij =P(x _t =x _j |s _t =q _i )

The initial state probability distribution Π=[π _i ] _1×I represents the probability distribution of the hidden state at the initial moment:

π _i =P(s ₁ =q _i )

The number of hidden states of each HMM chain in CHMM is set to N, then the number of hidden states of the CHMM model is N ^Q , where Q represents the number of characteristic variables; the combination of hidden states of the model at any time can be expressed as

The hidden state layer of CHMM decodes the dependencies between feature variables. Considering that the feature variables are all continuous variables, a Gaussian distribution is introduced to calculate the observation probability:

Among them, μ _{c, u} and σ _{c, u} respectively represent the mean value and standard deviation of the probability distribution when the hidden state in the HMM chain corresponding to the characteristic variable c is q _{c, u} .

6. the vehicle acceleration prediction method considering driving behavior characteristics under a kind of car-following scene according to claim 3, is characterized in that, described step 24) comprises the following steps:

and />

After calculating the mean value of the closeness, get />

S _Δd ∈ {LD,ND,CD}

and />

Using a normal distribution />

Among them, K represents the number of feature variables;

step 244)

and />

The similarity between driver D _m and driver D _n is quantified from three dimensions respectively, and the comprehensive evaluation index η _m,n of driving style similarity is obtained by weighted average:

where w _i ∈ [0,1] and

Step 245) given two discrete probability density distributions of standardized frequency distributions

and />

Among them, the discrete probability density distribution

and />

KL divergence />

The calculation method is:

and />

The value range of JS divergence is [0,1]. When the two probability distributions are the same, the JS divergence is equal to 0. The greater the difference in probability distribution, the closer the JS value is to 1.

7. the vehicle acceleration prediction method considering driving behavior characteristics under a kind of car-following scene according to claim 1, is characterized in that, described step 3) comprises the following steps:

Step 31) According to the similarity evaluation results and the pre-configured similarity threshold, select several drivers who meet the similarity conditions as a group, and classify the effective car-following segments into the training set and the test set according to the pre-configured ratio of the drivers and validation set;

Step 32) Construct input and output samples, convert the data containing acceleration and other characteristic variables in the training set and test set into an N-dimensional array, each input sample is composed of characteristic variables at K moments, and the output label sample is the K+1th moment acceleration data;

Step 33) build CNN-LSTM vehicle acceleration prediction model;

Step 34) model training and hyperparameter optimization: use the verification set to evaluate the model in the training process in real time, check its training effect, use Adam optimizer to iterate, and repeat the training process to make the network parameters tend to be optimal;

Step 35) Predict the acceleration of the target vehicle based on the trained CNN-LSTM vehicle acceleration prediction model.

8. the vehicle acceleration prediction method considering driving behavior characteristics under a kind of car-following scene according to claim 7, is characterized in that, described CNN-LSTM vehicle acceleration prediction model comprises input layer, CNN layer, LSTM layer connected successively and a fully connected layer, where,

The input layer is used to receive input data;

The CNN layer is used to compress and extract the spatial features of the input data, and the CNN layer includes a convolution layer and a maximum pooling layer;

The LSTM layer is used to extract the time series features of CNN output data, and the LSTM network in the LSTM layer is many-to-one LSTM;

The fully connected layer is used to output acceleration prediction results;

Each layer in the model network is added with a Dropout layer, and the mean square error is selected as the evaluation of the model loss function and prediction accuracy.

9. the vehicle acceleration prediction method considering driving behavior characteristics under a kind of car-following scene according to claim 8, is characterized in that, the calculation formula of described convolution layer is as follows:

Among them, σ is the activation function,

is the convolution kernel, />

for the bias.

10. the vehicle acceleration prediction method considering driving behavior characteristics under a kind of car-following scene according to claim 8, is characterized in that, the forget gate parameter update formula of described LSTM network is:

f _t ＝σ(w _f h _t-1 +w _f x _t +b _f )

Among them, f _t represents the state of the forget gate at time t, w _f and b _f are the weight and bias of the forget gate, respectively, σ represents the sigmoid function, and the output value ranges from 0 to 1, h _t represents the state of the hidden layer, x _t represents the input value of the network at time t;

The input gate of the LSTM network is used to update the information in the input cell state:

i _t ＝σ(w _i x _t +w _i h _t-1 +b _i )

Among them, i _t represents the state of the input gate at time t, and w _i and b _i are the weight and bias of the input gate, respectively;

The LSTM network memory cell state c _t is updated by the information c _t-1 stored in the previous memory cell and the new candidate information:

in,

The output gate of LSTM is used to control the information output in the cell state, and the output hidden layer state h t is calculated according to the memory cell state c _t and the output gate state o _t _:

o _t ＝σ(w _o x _t +w _o h _t-1 +b _o )

h _t =o _t ·tanh(c _t ).