
CN112590774B - A deep reinforcement learning-based drift parking control method for intelligent electric vehicles

Info

Publication number: CN112590774B
Authority: CN (China)
Prior art keywords: vehicle, wheel, force, center, tire
Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Application number: CN202011530836.4A
Other languages: Chinese (zh)
Other versions: CN112590774A (en)
Inventors: 冷搏, 刘铭, 熊璐, 余卓平
Current assignee: Tongji University (the listed assignees may be inaccurate)
Original assignee: Tongji University
Application filed by: Tongji University
Priority to: CN202011530836.4A
Publication of CN112590774A
Application granted
Publication of CN112590774B

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60W: CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W30/00: Purposes of road vehicle drive control systems not related to the control of a particular sub-unit, e.g. of systems using conjoint control of vehicle sub-units
    • B60W30/06: Automatic manoeuvring for parking
    • B60W50/00: Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W2050/0001: Details of the control system
    • B60W2050/0019: Control system elements or transfer functions
    • B60W2050/0028: Mathematical models, e.g. for simulation
    • B60W2050/0031: Mathematical model of the vehicle
    • B60W2050/0034: Multiple-track, 2D vehicle model, e.g. four-wheel model

Landscapes

  • Engineering & Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Human Computer Interaction (AREA)
  • Control Of Driving Devices And Active Controlling Of Vehicle (AREA)

Abstract

The invention relates to a deep reinforcement learning-based drift parking control method for an intelligent electric vehicle, comprising the following steps: 1) constructing a vehicle dynamics model for deep reinforcement learning and a tire model for the tire-force saturation condition; 2) using a TD3 algorithm designed for drift parking control to make the intelligent electric vehicle drift into the parking space. Compared with the prior art, the invention offers high control accuracy and good robustness and enables the vehicle to complete the drift parking maneuver accurately. During the drift, the steering-wheel angle is adjusted continuously so that the vehicle reaches the parking space accurately, and the center position of the parking space can be actively changed while the vehicle is drifting, so that the vehicle drifts toward the updated parking-space position.

Description

A deep reinforcement learning-based drift parking control method for intelligent electric vehicles

Technical Field

The invention relates to the field of vehicle parking control, and in particular to a deep reinforcement learning-based drift parking control method for an intelligent electric vehicle.

Background

Drifting is driving in which the vehicle is held continuously in a state where the rear tire forces are saturated and the rear axle slides sideways. There are two different drift states:

(1) Rear-axle drive with the rear wheels spinning. In this state the vehicle's center-of-mass sideslip angle and speed can be held at constant values by controlling the rear-axle driving force and the front-wheel steering angle, so that the vehicle is in a stable state. Because the vast majority of cars on the market are front-axle driven, research on drifting in this state is of relatively little value.

(2) Reproducing the drift maneuver with an open-loop control law is vulnerable to disturbances from the external environment and from the vehicle's own state, and the vehicle may then fail to drift into the parking space. For example, because of lateral position error and heading error while approaching the parking space, the vehicle does not exactly match the preset trigger pose when the drift is triggered, and an open-loop controller carries this deviation through to the end of the drift. In addition, because of the response limits of the low-level actuators, open-loop control cannot guarantee that the actuators respond identically every time, and when the response deviates the vehicle leaves the preset drift trajectory. Finally, a non-uniform road surface causes abrupt changes in tire force during the drift, which alter the drift path.

Summary of the Invention

The purpose of the invention is to overcome the above defects of the prior art by providing a deep reinforcement learning-based drift parking control method for an intelligent electric vehicle. Building on research into, and implementation of, drift parking maneuvers for driverless vehicles based on deep reinforcement learning, the invention designs a drift controller that adjusts the steering-wheel angle according to the relative position between the vehicle and the parking space and the vehicle state parameters, so that the vehicle drifts into the parking space.

The purpose of the invention can be achieved through the following technical solution:

1. A deep reinforcement learning-based drift parking control method for an intelligent electric vehicle, characterized by comprising the following steps:

1) constructing a vehicle dynamics model for deep reinforcement learning and a tire model for the tire-force saturation condition;

2) using a TD3 algorithm designed for drift parking control to make the intelligent electric vehicle drift into the parking space.

In step 1), the vehicle dynamics model is a four-wheel, three-degree-of-freedom model that accounts for front-rear and left-right load transfer. The three degrees of freedom are the speed at the vehicle's center of mass v_m, the center-of-mass sideslip angle β, and the yaw rate ω.

In this model, the vertical forces on the four wheels, taking longitudinal and lateral acceleration into account, are given by:

[Equation images BDA0002852010660000021 through BDA0002852010660000026 are not reproduced in this text.]

where h_m is the height of the center of mass; b_f and b_r are the front and rear track widths; a_x and a_y are the longitudinal and lateral accelerations at the center of mass, neglecting the effect of body rotation; F_zFL, F_zFR, F_zRL and F_zRR are the vertical forces on the left front, right front, left rear and right rear wheels; m is the mass of the electric vehicle; g is the gravitational acceleration; l is the wheelbase; l_f and l_r are the distances from the front and rear axles to the center of mass; F_xFL, F_xFR, F_xRL and F_xRR are the longitudinal forces on the left front, right front, left rear and right rear wheels; F_yFL, F_yFR, F_yRL and F_yRR are the corresponding lateral forces; and δ is the front-wheel steering angle.
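Because the vertical-force equations survive only as images, the following is a minimal sketch of a common quasi-static load-transfer model written with the symbols defined above. The specific formulas, the sign convention (a_x positive forward, a_y positive to the left) and the function name `wheel_vertical_loads` are assumptions for illustration, not the patent's own expressions.

```python
def wheel_vertical_loads(m, g, l_f, l_r, h_m, b_f, b_r, a_x, a_y):
    """Quasi-static wheel loads with longitudinal and lateral load transfer.

    Assumed convention (not from the source): a_x > 0 is forward acceleration,
    a_y > 0 points to the left, so a left turn shifts load to the right wheels.
    """
    l = l_f + l_r                                  # wheelbase
    static_front = m * g * l_r / (2.0 * l)         # static load per front wheel
    static_rear = m * g * l_f / (2.0 * l)          # static load per rear wheel
    long_shift = m * a_x * h_m / (2.0 * l)         # front-rear transfer per wheel
    lat_front = m * a_y * h_m * l_r / (l * b_f)    # left-right transfer, front axle
    lat_rear = m * a_y * h_m * l_f / (l * b_r)     # left-right transfer, rear axle
    return {
        "FL": static_front - long_shift - lat_front,
        "FR": static_front - long_shift + lat_front,
        "RL": static_rear + long_shift - lat_rear,
        "RR": static_rear + long_shift + lat_rear,
    }


if __name__ == "__main__":
    # Hypothetical vehicle parameters, braking while turning left.
    print(wheel_vertical_loads(1500.0, 9.81, 1.2, 1.4, 0.5, 1.6, 1.6, -3.0, 4.0))
```

In this sketch the four loads always sum to m·g, which is a convenient sanity check when substituting other load-transfer expressions.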

During the drift, excessive load transfer may lift one wheel off the ground, in which case that wheel's vertical load drops to 0 and the load transfer reaches its upper limit. When the steering wheel is turned to the left for the drift, load transfers to the right; if the left rear wheel leaves the ground, its vertical force is 0. The excess transferred load is then redistributed to the left front and right rear wheels according to the longitudinal and lateral accelerations, the wheelbase and the track widths:

ΔF_trans = |F_zRL|

F′_zRL = 0

[Equation images BDA0002852010660000031 and BDA0002852010660000032 are not reproduced in this text.]

where ΔF_trans is the excess transferred load, F′_zRL is the vertical force on the left rear wheel after redistribution, F′_zRR is the vertical force on the right rear wheel after redistribution, and F′_zFL is the vertical force on the left front wheel after redistribution.

A force analysis of the four-wheel, three-degree-of-freedom vehicle dynamics model with front-rear and left-right load transfer gives the vehicle dynamic balance equations:

[Equation images BDA0002852010660000033 and BDA0002852010660000034 are not reproduced in this text.]

φ = β + ψ

[Equation images BDA0002852010660000035 through BDA0002852010660000037 are not reproduced in this text.]

From these, the vehicle's longitudinal speed v_mx and lateral speed v_my are computed as:

v_mx = v_m·cos β

v_my = v_m·sin β

where dv_m/dt is the rate of change of the speed at the vehicle's center of mass, dβ/dt is the center-of-mass sideslip angular rate, φ is the global azimuth of the velocity at the center of mass, dφ/dt is the rate of change of that azimuth, dω/dt is the rate of change of the yaw rate, ψ is the global heading angle of the vehicle, I_z is the yaw moment of inertia, v_x is the vehicle's longitudinal speed, and v_y is the vehicle's lateral speed.
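The balance equations themselves are images in the source. As a minimal sketch of how the three states (v_m, β, ω) could be propagated, the code below uses the standard tangential/normal decomposition of planar motion together with φ = β + ψ. The inputs `F_x`, `F_y` (resultant body-frame forces) and `M_z` (resultant yaw moment), and the function names, are assumptions introduced for illustration.

```python
import math


def state_derivatives(v_m, beta, omega, F_x, F_y, M_z, m, I_z):
    """Time derivatives of (v_m, beta, omega) for a planar rigid vehicle.

    F_x, F_y: resultant tire forces along the body longitudinal/lateral axes
    M_z: resultant yaw moment about the center of mass (hypothetical inputs).
    Requires v_m > 0, since the sideslip angle is undefined at standstill.
    """
    dv_m = (F_x * math.cos(beta) + F_y * math.sin(beta)) / m             # along velocity
    dphi = (-F_x * math.sin(beta) + F_y * math.cos(beta)) / (m * v_m)    # velocity direction
    dbeta = dphi - omega                                                 # from phi = beta + psi
    domega = M_z / I_z
    return dv_m, dbeta, domega


def planar_speeds(v_m, beta):
    """Longitudinal and lateral speeds: v_mx = v_m cos(beta), v_my = v_m sin(beta)."""
    return v_m * math.cos(beta), v_m * math.sin(beta)


def euler_step(v_m, beta, omega, F_x, F_y, M_z, m, I_z, dt):
    """One explicit-Euler integration step of the three states."""
    dv_m, dbeta, domega = state_derivatives(v_m, beta, omega, F_x, F_y, M_z, m, I_z)
    return v_m + dv_m * dt, beta + dbeta * dt, omega + domega * dt
```

Explicit Euler is used only to keep the sketch short; a training simulator would normally use a smaller step or a higher-order integrator.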

In step 1), the tire model used for deep reinforcement learning training comprises a front-wheel tire force model and a rear-wheel tire force model.

For the rear-wheel tire force model: during the drift the rear wheels are locked by braking and slide on the road in pure friction, so the tire force on each rear wheel points opposite to the instantaneous velocity of its wheel center. A force analysis of the rear wheels gives the longitudinal and lateral tire force components.

For the left rear wheel:

[Equation image BDA0002852010660000045 is not reproduced in this text.]

For the right rear wheel:

[Equation image BDA0002852010660000046 is not reproduced in this text.]

F_r_sat = μ_1 F_z

where v_xRL and v_yRL are the longitudinal and lateral velocities at the left rear wheel center; v_xRR and v_yRR are the longitudinal and lateral velocities at the right rear wheel center; λ_L and λ_R are the sideslip angles at the left and right rear wheel centers; F_xRL and F_yRL are the longitudinal and lateral forces on the left rear wheel; F_xRR and F_yRR are the longitudinal and lateral forces on the right rear wheel; F_rRL_sat and F_rRR_sat are the saturated horizontal tire forces of the left and right rear wheels; F_r_sat denotes the saturated horizontal tire force of the wheel in question; μ_1 is the utilized road adhesion coefficient when the wheel is locked; and F_z is the vertical force on the wheel in question.
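The locked-wheel rule stated above (force magnitude μ_1·F_z, direction opposite to the wheel-center velocity) can be sketched as follows. The per-wheel equations are images in the source, so the component form, the rigid-body expression for the wheel-center velocity, and the sign convention (y positive to the left, ω positive counterclockwise) are assumptions.

```python
import math


def rear_wheel_center_velocity(v_mx, v_my, omega, l_r, b_r, side):
    """Velocity of a rear wheel center from rigid-body kinematics.

    side = +1 for the left wheel, -1 for the right wheel (assumed convention:
    body y-axis points left, omega is positive counterclockwise).
    """
    v_x = v_mx - omega * side * b_r / 2.0
    v_y = v_my - omega * l_r
    return v_x, v_y


def locked_wheel_force(v_x, v_y, F_z, mu_1):
    """Longitudinal and lateral force on a locked, sliding wheel.

    The resultant has magnitude mu_1 * F_z and opposes the wheel-center velocity.
    A lifted wheel (F_z <= 0) or a wheel at rest produces no sliding force here.
    """
    speed = math.hypot(v_x, v_y)
    F_sat = mu_1 * max(F_z, 0.0)
    if speed < 1e-9:
        return 0.0, 0.0
    return -F_sat * v_x / speed, -F_sat * v_y / speed
```

Writing the direction as (v_x, v_y)/speed avoids evaluating the wheel-center sideslip angle explicitly; it is equivalent to using the cosine and sine of that angle.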

For the front-wheel tire force model: during the drift the front tire forces are not yet saturated, so an improved Burckhardt tire model is fitted to the tire force to express the relationship between lateral force and slip angle:

[Equation image BDA0002852010660000051 is not reproduced in this text.]

where θ_1 to θ_5 are fitting parameters and α is the front-wheel slip angle.

The left-wheel slip angle α_L and the right-wheel slip angle α_R are obtained from:

[Equation images BDA0002852010660000052 and BDA0002852010660000053 are not reproduced in this text.]

Because no braking or driving force is applied to the front wheels, they roll freely and F_xFL = 0, F_xFR = 0. Only the lateral force is considered when determining the direction of the front tire force, so that direction is perpendicular to the tire plane and is determined by the front-wheel steering angle.
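The slip-angle formulas are images in the source. A minimal sketch of one common definition is given below: the slip angle is the steering angle minus the direction of the wheel-center velocity in the body frame. The definition, its sign convention (y positive to the left, ω positive counterclockwise) and the function name are assumptions and may differ from the patent's expressions.

```python
import math


def front_slip_angles(v_mx, v_my, omega, delta, l_f, b_f):
    """Left and right front-wheel slip angles (alpha_L, alpha_R) in radians.

    Assumed definition: alpha = delta - atan2(v_y_wheel, v_x_wheel), where the
    wheel-center velocity follows from rigid-body kinematics.
    """
    v_y_front = v_my + omega * l_f            # same for both front wheels
    v_x_left = v_mx - omega * b_f / 2.0       # left wheel center
    v_x_right = v_mx + omega * b_f / 2.0      # right wheel center
    alpha_left = delta - math.atan2(v_y_front, v_x_left)
    alpha_right = delta - math.atan2(v_y_front, v_x_right)
    return alpha_left, alpha_right
```

With the front wheels rolling freely (F_xFL = F_xFR = 0), these slip angles would be the only input to the fitted lateral-force curve besides the vertical load.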

Step 2) specifically comprises the following steps:

21) Design the TD3 algorithm for drift parking control and build the Actor network and the Critic network, as follows:

Both the Critic network and the Actor network are BP neural networks composed of fully connected layers. The input of the Critic network is the vehicle state and the action, and its output is the Q value; the input of the Actor network is the vehicle state, and its output is the action. The vehicle state consists of the parameters that characterize the vehicle during the drift: the parking-space coordinates (e_x, e_y) and the parking-space orientation (whose symbol appears only as an image in the source), both expressed in a relative coordinate system whose origin is the vehicle's center of mass and whose positive y-axis points along the vehicle's heading, together with the speed at the center of mass v_m, the center-of-mass sideslip angle β and the yaw rate ω. The action is the steering-wheel angle.

22) Construct the reward function r(k):

[Equation image BDA0002852010660000054 is not reproduced in this text.]

where w_x, w_y and a third weight (shown as an image in the source) are the weights of e_x, e_y and the parking-space orientation, respectively, and k is the time.

23) Train the Actor network and the Critic network, and use them to perform drift parking of the intelligent electric vehicle.

In step 23), before the Actor and Critic networks are trained, the boundary of the drift parking controller is determined, and the target parking-space position for each drift is drawn at random within that boundary. In each training iteration, the vehicle state is computed from the randomly selected target position and orientation, and the Critic and Actor networks are trained on it. Randomly updating the target parking-space position during training expands the training data set and improves generalization.

Compared with the prior art, the invention has the following advantages:

1. A control method for drift parking of an intelligent electric vehicle is designed on the basis of the deep reinforcement learning TD3 algorithm. It improves control accuracy and overcomes the drift parking errors caused by a non-uniform road surface. The center point of the parking space can also be changed so that the vehicle moves toward the updated parking-space position, which improves the robustness of the control system.

2. During drift parking, the vehicle's pose can be corrected by continuously adjusting the steering-wheel angle, so that the vehicle drifts into the parking space accurately.

Brief Description of the Drawings

Figure 1 is a flow chart of the method of the invention.

Figure 2 is a schematic diagram defining some of the state parameters of the drift process.

Figure 3 shows the flow of the drift control algorithm based on deep reinforcement learning.

Detailed Description

The invention is described in detail below with reference to the drawings and specific embodiments.

As shown in Figure 1, the invention provides a deep reinforcement learning-based drift parking control method for an intelligent electric vehicle, comprising the following steps:

1) Build the vehicle dynamics model and the tire model used for deep reinforcement learning training, specifically as follows:

11) Build the vehicle dynamics model for deep reinforcement learning

A four-wheel, three-degree-of-freedom vehicle dynamics model that accounts for front-rear and left-right load transfer is used. The three degrees of freedom are the magnitude of the speed at the vehicle's center of mass v_m, the center-of-mass sideslip angle β and the yaw rate ω.

Because both the longitudinal and lateral accelerations are large during a drift, the effect of front-rear and left-right load transfer on the vertical tire forces must be taken into account. The four wheel vertical forces, taking longitudinal and lateral acceleration into account, are given by equation (1):

[Equation image BDA0002852010660000071, equation (1), is not reproduced in this text.]

where h_m is the height of the center of mass, b_f and b_r are the front and rear track widths, and a_x and a_y are the longitudinal and lateral accelerations at the center of mass, neglecting the effect of body rotation, obtained from equation (2):

[Equation image BDA0002852010660000072, equation (2), is not reproduced in this text.]

During the drift, it is necessary to consider the case in which excessive load transfer lifts one wheel off the ground, so that the vertical load on that wheel drops to 0 and the load transfer reaches its upper limit. Because the maneuver is a tail-swing under braking, load transfers to the front axle, so only the possibility of a rear wheel lifting is considered. If the steering wheel is turned to the left for the drift, load transfers to the right and the left rear wheel may lift. When the formula yields F_zRL < 0, the vertical force on that wheel is set to 0, and the excess transferred load is redistributed to the left front and right rear wheels according to the longitudinal and lateral accelerations, the wheelbase and the track widths:

[Equation image BDA0002852010660000073 is not reproduced in this text.]

A force analysis of the vehicle model gives the vehicle dynamic balance equations:

[Equation images BDA0002852010660000081 through BDA0002852010660000084 are not reproduced in this text.]

where δ is the front-wheel steering angle; φ is the global azimuth of the velocity at the center of mass, and dφ/dt is the rate of change of the velocity direction; ψ is the global heading angle of the vehicle, and dψ/dt is the vehicle's yaw rate, that is, the rate of change of the heading. Using φ = β + ψ and integrating the equations above gives the center-of-mass speed v_m, the center-of-mass sideslip angle β and the yaw rate ω at each instant; the vehicle's longitudinal and lateral speeds then follow from equation (8):

[Equation image BDA0002852010660000087, equation (8), is not reproduced in this text.]

12) Build the tire model for deep reinforcement learning under the tire-force saturation condition

Unlike normal driving conditions, during a drift the rear tire forces are saturated, the body's lateral speed and the center-of-mass sideslip angle are large, and both the longitudinal and lateral speeds change rapidly. The vehicle is therefore a strongly nonlinear, time-varying system with highly coupled longitudinal and lateral dynamics. The real-time saturated tire force is obtained from equation (9):

F_r_sat = μ_1 F_z (9)

where the quantity shown in image BDA0002852010660000088 of the source is the resultant horizontal tire force, and μ_1 is the utilized road adhesion coefficient when the wheel is locked, μ_1 = 0.9 μ_max, that is, 0.9 times the peak adhesion coefficient; the peak adhesion coefficient is taken as 1.

121) Rear-wheel tire force model

While the rear axle is braked with the wheels locked, the tire force is saturated, and the magnitude of the resultant of the longitudinal and lateral forces stays constant regardless of how the slip angle changes. The change in slip angle can therefore be ignored when computing the horizontal rear tire force during a drift, and the magnitude of the rear-axle tire force in the drift state can be computed directly.

Because the rear wheels are locked by braking and slide on the road in pure friction, the direction of the tire force is determined by the direction of the wheel-center velocity: the tire force points opposite to the instantaneous velocity of the wheel center. A force analysis of the rear wheels during the drift gives the longitudinal and lateral tire force components.

Left rear wheel:

[Equation image BDA0002852010660000091 is not reproduced in this text.]

Right rear wheel:

[Equation image BDA0002852010660000092 is not reproduced in this text.]

where v_xRL and v_yRL are the longitudinal and lateral velocities at the left rear wheel center; v_xRR and v_yRR are the longitudinal and lateral velocities at the right rear wheel center; λ_L and λ_R are the sideslip angles at the left and right rear wheel centers; F_xRL and F_yRL are the longitudinal and lateral forces on the left rear wheel; F_xRR and F_yRR are the longitudinal and lateral forces on the right rear wheel; and F_rRL_sat and F_rRR_sat are the saturated horizontal tire forces of the left and right rear wheels.

122) Front-wheel tire force model

The front tire forces are not yet saturated, so the longitudinal and lateral directions are decoupled and the lateral tire force is computed with a tire model suited to quasi-static conditions. An improved Burckhardt tire model is fitted to the tire force to express the relationship between lateral force and slip angle:

[Equation image BDA0002852010660000093 is not reproduced in this text.]

where θ_1 to θ_5 are fitting parameters and α is the front-wheel slip angle. The left and right wheel slip angles are obtained from equations (13) and (14):

[Equation images BDA0002852010660000101 and BDA0002852010660000102, equations (13) and (14), are not reproduced in this text.]

Because no braking or driving force is applied, the front wheels are taken to be rolling freely and their longitudinal force is approximately 0, that is, F_xFL = 0 and F_xFR = 0. Only the lateral force is considered when determining the direction of the front tire force, so that direction is perpendicular to the tire plane and is determined by the front-wheel steering angle.

2) Design of the TD3 algorithm for drift parking control

During the drift, a deep reinforcement learning algorithm is used. Building on the drift vehicle dynamics model constructed above, an end-to-end drift controller makes the vehicle drift accurately into the parking space, as follows:

In the TD3 algorithm, the input of the Critic network is the vehicle state and the action, and its output is the Q value; the input of the Actor network is the vehicle state, and its output is the action, that is, the steering-wheel angle.
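The source does not spell out the TD3 update itself. As a reminder of what distinguishes TD3, the sketch below computes its critic target for a single transition: target-policy smoothing with clipped noise and the minimum over two target critics. The networks are passed in as plain callables, and all names and default hyperparameters (`gamma`, `noise_std`, `noise_clip`, `max_steer`) are illustrative assumptions rather than values from the patent.

```python
import random


def clip(value, low, high):
    """Clamp value to the interval [low, high]."""
    return max(low, min(high, value))


def td3_target(reward, next_state, done, actor_target, critic1_target,
               critic2_target, gamma=0.99, noise_std=0.2, noise_clip=0.5,
               max_steer=1.0, rng=random):
    """TD3 critic target y = r + gamma * min(Q1', Q2')(s', a') for one transition.

    actor_target(state) -> steering action (normalized to [-max_steer, max_steer])
    critic*_target(state, action) -> Q value
    """
    # Target-policy smoothing: add clipped Gaussian noise to the target action.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(actor_target(next_state) + noise, -max_steer, max_steer)
    # Clipped double-Q: take the smaller of the two target critics' estimates.
    q_min = min(critic1_target(next_state, next_action),
                critic2_target(next_state, next_action))
    return reward + (0.0 if done else gamma * q_min)
```

Both critics are regressed toward this target, while the Actor is updated less frequently than the critics (delayed policy updates), which is the third element of TD3.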

The parameters that characterize the vehicle state during the drift are selected as the inputs of the Critic and Actor networks. This set of parameters should uniquely describe the vehicle state at any instant of the drift and should be dynamically related to the steering-wheel angle input. The six state parameters are: the parking-space coordinates e_x and e_y and the parking-space orientation (whose symbol appears only as an image in the source), all expressed in a relative coordinate system whose origin is the vehicle's center of mass and whose positive y-axis points along the vehicle's heading; the resultant speed v_m of the vehicle's longitudinal and lateral speeds; the center-of-mass sideslip angle β; and the yaw rate ω. As shown in Figure 2, e_x, e_y and the orientation term reflect the difference between the vehicle's current position and heading and the desired state during the drift, while v_m, β and ω characterize the rates of change of those three quantities.
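The relative-frame quantities described above can be computed from global poses with a plane rotation. The sketch below is one way to do this, assuming the global heading ψ is measured counterclockwise from the global X-axis, that the relative frame is right-handed (so its x-axis points to the vehicle's right), and that the orientation term is the wrapped heading difference. The name `e_psi` and the function name are placeholders, since the source shows that symbol only as an image.

```python
import math


def target_in_vehicle_frame(X, Y, psi, X_aim, Y_aim, psi_aim):
    """Parking-space pose relative to the vehicle.

    Returns (e_x, e_y, e_psi): coordinates of the target in a frame centered at
    the vehicle's center of mass with +y along the heading, plus the heading
    difference wrapped to [-pi, pi).
    """
    dx, dy = X_aim - X, Y_aim - Y
    e_x = dx * math.sin(psi) - dy * math.cos(psi)   # component to the vehicle's right
    e_y = dx * math.cos(psi) + dy * math.sin(psi)   # component along the heading
    e_psi = (psi_aim - psi + math.pi) % (2.0 * math.pi) - math.pi
    return e_x, e_y, e_psi


def build_state(X, Y, psi, X_aim, Y_aim, psi_aim, v_m, beta, omega):
    """Six-element state vector fed to the Actor and Critic networks."""
    e_x, e_y, e_psi = target_in_vehicle_frame(X, Y, psi, X_aim, Y_aim, psi_aim)
    return [e_x, e_y, e_psi, v_m, beta, omega]
```

Expressing the target in the vehicle frame keeps the state independent of where the maneuver happens in global coordinates, so one trained policy can serve any parking-space location inside the controller boundary.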

确定了强化学习算法所训练的深度神经网络后,对奖励函数进行设计,以计算车辆在漂移过程中不同状态所对应的奖励值。奖励函数设计如下:After determining the deep neural network trained by the reinforcement learning algorithm, the reward function is designed to calculate the reward value corresponding to the different states of the vehicle during the drifting process. The reward function is designed as follows:

Figure BDA0002852010660000105
Figure BDA0002852010660000105

In the formula, w_x, w_y and
Figure BDA0002852010660000106
are the weights of e_x, e_y and
Figure BDA0002852010660000107
respectively. What matters is the displacement error from the center of the parking space and the heading angle error relative to the parking space orientation once the vehicle has come to rest. The cube of the vehicle speed is therefore placed in the denominator, so that the lower the vehicle speed and the closer the vehicle is to stopping, the larger the absolute value of the reward computed from the longitudinal and lateral displacement errors and the heading angle error. Under this design, a vehicle that finally stops far from the parking space receives a very small (strongly negative) reward value, whereas a vehicle that stops near the center of the parking space receives a reward close to 0, which makes the target Q values of the preceding states and actions larger. When computing the steering wheel angle from the vehicle state, the Actor network seeks to maximize the Q value, so that the vehicle ultimately stops in the parking space. In principle, the controlled quantity should also appear in the reward function. During drift parking, however, the steering wheel angle is adjusted continuously, and the effect of any single steering input on the final result cannot be isolated, so its weight coefficient is set to 0.
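The qualitative behavior of such a reward can be sketched as follows. The exact reward formula is given only as a figure, so the functional form here is an assumption: weighted absolute errors divided by the cube of the speed, with a negative sign so that the reward approaches 0 near the target. The small constant `eps`, which avoids division by zero at standstill, and the default weights are likewise hypothetical.

```python
def drift_parking_reward(e_x, e_y, e_psi, v_m,
                         w_x=1.0, w_y=1.0, w_psi=1.0, eps=1e-3):
    """Hypothetical reward: non-positive, error-weighted, amplified at low speed.

    The steering wheel angle does not appear, matching the text's choice
    of a zero weight for the controlled quantity.
    """
    weighted_error = w_x * abs(e_x) + w_y * abs(e_y) + w_psi * abs(e_psi)
    # Cube of the speed in the denominator: errors matter most near standstill.
    return -weighted_error / (abs(v_m) ** 3 + eps)
```

For the same position and heading errors, the reward at 0.5 m/s is far more negative than at 5 m/s, which concentrates the penalty on the final resting pose rather than on the approach.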

Before network training, the "boundary" of the on-board drift parking controller is determined first: it is assumed that, whatever steering wheel angle is applied, the vehicle's final position and final heading angle will not exceed this boundary.

The target parking space position for each drift is drawn at random within the controller boundary. After each complete drift, a new random target parking space position (X_aim, Y_aim) and orientation ψ_aim are set, subject to the controller boundary constraints above.
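Drawing a new target inside the controller boundary after each episode can be sketched as below. The rectangular form of the boundary, the uniform distribution and all numeric limits are assumptions for illustration; the text does not state how the boundary is parameterized.

```python
import math
import random

# Hypothetical controller boundary (units: m, m, rad).
BOUNDS = {
    "X": (-2.0, 6.0),
    "Y": (10.0, 25.0),
    "psi": (-math.pi / 2, math.pi / 2),
}

def sample_target(bounds=BOUNDS, rng=random):
    """Draw a random target (X_aim, Y_aim, psi_aim) inside the boundary."""
    X_aim = rng.uniform(*bounds["X"])
    Y_aim = rng.uniform(*bounds["Y"])
    psi_aim = rng.uniform(*bounds["psi"])
    return X_aim, Y_aim, psi_aim
```

Calling `sample_target()` at the start of every training episode gives each drift a different goal, which is the mechanism the text relies on to broaden the training data.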

In iterative training, the vehicle states e_x, e_y and
Figure BDA0002852010660000111
are computed from this target parking space position and orientation, and the Critic network and the Actor network are trained on them. Randomly updating the target parking space position during training expands the training data set and can improve the generalization ability of the networks.
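The claims name TD3 as the training algorithm. In TD3, the Critic is trained toward a target built from the smaller of two target-critic estimates, and the target action is perturbed with clipped noise. The sketch below shows only that target computation; the discount factor, noise parameters and action limit are illustrative assumptions, and the Q values are passed in as plain numbers rather than computed by networks.

```python
import random

def smoothed_target_action(action, a_max=1.0, noise_std=0.2,
                           noise_clip=0.5, rng=random):
    """Target policy smoothing: add clipped Gaussian noise, then clip the action."""
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    return max(-a_max, min(a_max, action + noise))

def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: r + gamma * min(Q1', Q2') for non-terminal steps."""
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * min(q1_next, q2_next)
```

Taking the minimum of the two critic estimates counteracts overestimation of Q values, which is the main difference between TD3 and a single-critic actor-critic update.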

Example

In this example, the drift parking control method implemented according to the above method is as follows:

Step 1: Build the four-wheel, three-degree-of-freedom vehicle dynamics model used for deep-reinforcement-learning-based drift parking, together with a tire model for tire force saturation conditions. The vehicle model accounts for front-rear and left-right load transfer. Its three degrees of freedom are the magnitude of the velocity at the vehicle's center of mass v_m, the center-of-mass sideslip angle β, and the yaw rate ω.
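Two pieces of this model are simple enough to sketch directly from the claims: the decomposition of v_m into longitudinal and lateral speeds, and the saturated force on a locked rear wheel, which acts opposite to the wheel-center velocity with magnitude μ_1·F_z. The function names, the default friction coefficient and the standstill threshold `v_eps` are assumptions for this example; the wheel-center velocities are taken as given inputs.

```python
import math

def velocity_components(v_m, beta):
    """Split the center-of-mass speed into v_mx = v_m*cos(beta), v_my = v_m*sin(beta)."""
    return v_m * math.cos(beta), v_m * math.sin(beta)

def locked_wheel_force(v_x_wheel, v_y_wheel, F_z, mu_1=0.8, v_eps=1e-6):
    """Longitudinal and lateral force on a locked, sliding wheel.

    The force has magnitude mu_1 * F_z and points opposite to the
    wheel-center velocity. Below v_eps the wheel is treated as at rest.
    """
    speed = math.hypot(v_x_wheel, v_y_wheel)
    if speed < v_eps:
        return 0.0, 0.0
    F_sat = mu_1 * F_z
    return -F_sat * v_x_wheel / speed, -F_sat * v_y_wheel / speed
```

Because the force direction depends only on the direction of sliding, a locked wheel moving mostly sideways produces mostly lateral force, which is what lets the rear axle swing out during the drift.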

Step 2: Design the Critic network, the Actor network and the reward function on the basis of the vehicle dynamics model used for deep reinforcement learning. The input of the Critic network is the vehicle state and the action, and its output is the Q value; the input of the Actor network is the vehicle state, and its output is the action. Because the numbers of inputs and outputs are small and the mapping between them is relatively simple, both networks are built as BP (backpropagation) neural networks composed of fully connected layers. The flow of the deep-reinforcement-learning-based drift control algorithm is shown in Figure 3.
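The input and output shapes of the two networks can be illustrated with a small fully connected forward pass in plain Python. The text specifies only the inputs (six state values for the Actor, state plus action for the Critic) and outputs; the layer widths, ReLU and tanh activations, weight initialization and the steering limit `delta_max` are assumptions for this sketch, and training by backpropagation is omitted.

```python
import math
import random

def init_mlp(sizes, rng):
    """Create fully connected layers as (weights, biases) pairs."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        W = [[rng.uniform(-scale, scale) for _ in range(n_in)]
             for _ in range(n_out)]
        layers.append((W, [0.0] * n_out))
    return layers

def forward(layers, x, out_act=None):
    """Forward pass with ReLU on hidden layers and an optional output activation."""
    for i, (W, b) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return [out_act(v) for v in x] if out_act else x

def actor(layers, state, delta_max):
    """Map the 6-element state to a steering wheel angle in [-delta_max, delta_max]."""
    return delta_max * forward(layers, state, math.tanh)[0]

def critic(layers, state, action):
    """Map the state and action (7 inputs) to a scalar Q value."""
    return forward(layers, state + [action])[0]
```

A usage sketch: `actor_net = init_mlp([6, 32, 32, 1], random.Random(0))` and `critic_net = init_mlp([7, 32, 32, 1], random.Random(1))` create the two networks, after which `critic(critic_net, state, actor(actor_net, state, 8.0))` evaluates the Actor's own action.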

Claims (1)

1. A deep reinforcement learning-based drift parking control method for intelligent electric vehicles, characterized by comprising the following steps: 1) Build a vehicle dynamics model for deep reinforcement learning and a tire model for tire force saturation conditions. The vehicle dynamics model is a four-wheel, three-degree-of-freedom model that accounts for front-rear and left-right load transfer; the three degrees of freedom are the velocity v_m at the vehicle's center of mass, the center-of-mass sideslip angle β and the yaw rate ω. In this model, the vertical forces on the four wheels, taking the longitudinal and lateral accelerations into account, are expressed as:
Figure FDA0003316525890000011
Figure FDA0003316525890000012
Figure FDA0003316525890000013
Figure FDA0003316525890000014
Figure FDA0003316525890000015
Figure FDA0003316525890000016
where h_m is the height of the center of mass; b_f and b_r are the front and rear track widths; a_x and a_y are the longitudinal and lateral accelerations at the center of mass, neglecting the effect of body rotation; F_zFL, F_zFR, F_zRL and F_zRR are the vertical forces on the left front, right front, left rear and right rear wheels, respectively; m is the mass of the electric vehicle; g is the gravitational acceleration; l is the wheelbase; l_f and l_r are the distances from the front and rear axles to the center of mass; F_xFL, F_xFR, F_xRL and F_xRR are the longitudinal forces on the left front, right front, left rear and right rear wheels, respectively; F_yFL, F_yFR, F_yRL and F_yRR are the lateral forces on the left front, right front, left rear and right rear wheels, respectively; and δ is the front wheel steering angle;
During drifting, excessive load transfer may lift one wheel off the ground, so that its vertical load drops to 0 and the load transfer reaches its upper limit. When the steering wheel is turned left to drift, load transfers to the right side; if the left rear wheel leaves the ground, its vertical force is 0. The excess transferred load is then redistributed to the left front and right rear wheels according to the longitudinal and lateral accelerations, the wheelbase and the track width, which gives:
ΔF_trans = |F_zRL|
F′_zRL = 0
Figure FDA0003316525890000021
Figure FDA0003316525890000022
where ΔF_trans is the excess transferred load, F′_zRL is the vertical force on the left rear wheel after redistribution, F′_zRR is the vertical force on the right rear wheel after redistribution, and F′_zFL is the vertical force on the left front wheel after redistribution; A force analysis of the four-wheel, three-degree-of-freedom vehicle dynamics model with front-rear and left-right load transfer gives the vehicle dynamic balance equations:
Figure FDA0003316525890000023
Figure FDA0003316525890000024
φ = β + ψ
Figure FDA0003316525890000025
Figure FDA0003316525890000026
Figure FDA0003316525890000027
From these, the longitudinal vehicle speed v_mx and the lateral vehicle speed v_my are obtained as:
v_mx = v_m·cos β
v_my = v_m·sin β
where
Figure FDA0003316525890000031
is the rate of change of the velocity at the vehicle's center of mass,
Figure FDA0003316525890000032
is the rate of change of the center-of-mass sideslip angle, φ is the global azimuth of the velocity at the center of mass,
Figure FDA0003316525890000033
is the rate of change of that global azimuth,
Figure FDA0003316525890000034
is the rate of change of the yaw rate, ψ is the global azimuth of the vehicle heading, I_z is the yaw moment of inertia, v_x is the longitudinal vehicle speed, and v_y is the lateral vehicle speed;
The tire model used for deep reinforcement learning training comprises a front tire force model and a rear tire force model. For the rear tire force model, the rear wheels are braked until they lock during drifting and slide on the road surface in pure friction, so the tire force on each rear wheel points opposite to the instantaneous velocity of its wheel center. A force analysis of the rear wheels gives the longitudinal and lateral tire force components as follows. For the left rear wheel:
Figure FDA0003316525890000035
For the right rear wheel:
Figure FDA0003316525890000036
F_r_sat = μ_1·F_z
where v_xRL and v_yRL are the longitudinal and lateral velocities at the center of the left rear wheel; v_xRR and v_yRR are the longitudinal and lateral velocities at the center of the right rear wheel; λ_L and λ_R are the slip angles at the centers of the left and right rear wheels; F_xRL and F_yRL are the longitudinal and lateral forces on the left rear wheel; F_xRR and F_yRR are the longitudinal and lateral forces on the right rear wheel; F_rRL_sat and F_rRR_sat are the horizontal saturated tire forces of the left and right rear wheels; F_r_sat denotes the horizontal saturated tire force of the corresponding wheel; μ_1 is the utilized road adhesion coefficient when the wheel is locked; and F_z denotes the vertical force on the corresponding wheel;
For the front tire force model, the front tire forces are not yet saturated during drifting, so an improved Burckhardt tire model is fitted to the tire force to describe the relationship between lateral force and slip angle:
Figure FDA0003316525890000041
where θ_1 to θ_5 are fitting parameters and α is the front wheel slip angle; the left wheel slip angle α_L and the right wheel slip angle α_R are obtained from:
Figure FDA0003316525890000042
Figure FDA0003316525890000043
Since no braking or driving force is applied to the front wheels, they roll freely, so F_xFL = 0 and F_xFR = 0. Only the lateral force is considered when determining the direction of the front tire force, which is therefore perpendicular to the tire plane and determined by the front wheel steering angle;
2) Use a TD3 algorithm oriented to drift parking control to park the intelligent electric vehicle by drifting, which comprises the following steps:
21) Design the TD3 algorithm oriented to drift parking control and construct the Actor network and the Critic network. Both networks are BP (backpropagation) neural networks composed of fully connected layers. The input of the Critic network is the vehicle state and the action, and its output is the Q value; the input of the Actor network is the vehicle state, and its output is the action. The vehicle state consists of the parameters that characterize the vehicle during drifting: the parking space coordinates (e_x, e_y) and the parking space orientation
Figure FDA0003316525890000044
in a relative coordinate system with the vehicle's center of mass as the origin and the vehicle heading as the positive y-axis, together with the velocity v_m at the vehicle's center of mass, the center-of-mass sideslip angle β and the yaw rate ω. The action is the steering wheel angle;
22) Construct the reward function r(k):
Figure FDA0003316525890000045
where w_x, w_y and
Figure FDA0003316525890000046
are the weights of e_x, e_y and
Figure FDA0003316525890000047
respectively, and k is the time;
23) Train the Actor network and the Critic network and use them to park the intelligent electric vehicle by drifting. Before training, the boundary of the drift parking controller is determined, and the target parking space position for each drift is drawn at random within this boundary. In iterative training, the vehicle state is computed from the randomly selected target parking space position and orientation, and the Critic network and the Actor network are trained accordingly. Randomly updating the target parking space position during training expands the training data set and improves generalization ability.
CN202011530836.4A 2020-12-22 2020-12-22 A deep reinforcement learning-based drift storage control method for smart electric vehicles Active CN112590774B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011530836.4A CN112590774B (en) 2020-12-22 2020-12-22 A deep reinforcement learning-based drift storage control method for smart electric vehicles

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011530836.4A CN112590774B (en) 2020-12-22 2020-12-22 A deep reinforcement learning-based drift storage control method for smart electric vehicles

Publications (2)

Publication Number Publication Date
CN112590774A CN112590774A (en) 2021-04-02
CN112590774B true CN112590774B (en) 2022-02-18

Family

ID=75200195

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011530836.4A Active CN112590774B (en) 2020-12-22 2020-12-22 A deep reinforcement learning-based drift storage control method for smart electric vehicles

Country Status (1)

Country Link
CN (1) CN112590774B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114852043B (en) * 2022-03-23 2024-06-18 武汉理工大学 A HEV energy management method and system based on TD3

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN205531568U (en) * 2015-12-31 2016-08-31 肖全景 Multi -functional device of parking
CN206528518U (en) * 2017-01-11 2017-09-29 陈生泰 The system that wheel can be made to turn to any angle and the different steering angles of coordination with one another
KR20180052983A (en) * 2016-11-11 2018-05-21 현대자동차주식회사 Control method for drift logic of vehicle equipped with epb
CN108202726A (en) * 2016-12-20 2018-06-26 通用汽车环球科技运作有限责任公司 The vehicle braking mode driven for racing
CN111890951A (en) * 2020-08-07 2020-11-06 吉林大学 Intelligent electric vehicle trajectory tracking and motion control method


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
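Autonomous drift parking using a switched control strategy with onboard sensor; E. Jelavic et al.; IFAC-PapersOnLine; 20170731; full text *
Trajectory planning and motion control of autonomous vehicles under extreme conditions (极限工况下自动驾驶车辆的轨迹规划与运动控制); Zhang Fang (张放); China Doctoral Dissertations Full-text Database (Electronic Journal); 20200415; full text *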

Also Published As

Publication number Publication date
CN112590774A (en) 2021-04-02

Similar Documents

Publication Publication Date Title
Li et al. Comprehensive tire–road friction coefficient estimation based on signal fusion method under complex maneuvering operations
CN110116732B (en) Vehicle lateral stability control method considering tire cornering stiffness change
Cho et al. Estimation of tire forces for application to vehicle stability control
CN105253141B (en) A kind of vehicle handling stability control method adjusted based on wheel longitudinal force
US6508102B1 (en) Near real-time friction estimation for pre-emptive vehicle control
CN104773169B (en) Vehicle yaw stability integrating control method based on tire slip angle
Han et al. Estimation of the tire cornering stiffness as a road surface classification indicator using understeering characteristics
WO2022266824A1 (en) Steering control method and apparatus
CN108594652A (en) A kind of vehicle-state fusion method of estimation based on observer information iteration
JP6286091B1 (en) Vehicle state estimation device, control device, suspension control device, and suspension device.
US6745112B2 (en) Method of estimating quantities that represent state of vehicle
CN106004870A (en) Vehicle stability integrated control method based on variable-weight model prediction algorithm
CN111267856A (en) Vehicle automatic drift control method and system based on longitudinal force pre-distribution
CN103407451A (en) Method for estimating longitudinal adhesion coefficient of road
JP2004131070A (en) Confirmation of lifting and grounding of automobile wheel
Tian et al. Integrated control with DYC and DSS for 4WID electric vehicles
CN105936273A (en) Vehicle active torque inter-wheel and inter-axis distribution method
Tseng et al. Technical challenges in the development of vehicle stability control system
CN108394413B (en) A kind of electronic vehicle attitude and parameter correcting method of four motorized wheels and steering
CN113247004A (en) Joint estimation method for vehicle mass and road transverse gradient
CN113650621B (en) State Parameter Estimation Method for Distributed Drive Electric Vehicles for Complex Operating Conditions
CN113147420A (en) Target optimization torque distribution method based on road adhesion coefficient identification
CN116061934A (en) Architecture for managing chassis and driveline actuators and model-based predictive control method
CN115712949A (en) Virtual verification and verification model structure for motion control
CN115214697A (en) Adaptive second-order sliding mode control intelligent automobile transverse control method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant