CN114967465B - Trajectory planning method, device, electronic device and storage medium - Google Patents

Info

Publication number
CN114967465B
Authority
CN
China
Prior art keywords
state information
motion state
model
target
initial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210624231.4A
Other languages
Chinese (zh)
Other versions
CN114967465A (en
Inventor
冷晓琨
常琳
吴雨璁
白学林
柯真东
王松
何治成
黄贤贤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Leju Shenzhen Robotics Co Ltd
Original Assignee
Leju Shenzhen Robotics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Leju Shenzhen Robotics Co Ltd
Priority to CN202210624231.4A
Publication of CN114967465A
Application granted
Publication of CN114967465B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Feedback Control In General (AREA)

Abstract

The application provides a trajectory planning method, a trajectory planning device, an electronic device and a storage medium, and relates to the technical field of motion control. The method comprises: generating initial estimated motion state information of a target object from the motion state information of the target object at the current moment, using a pre-trained target trajectory model; and adjusting the initial estimated motion state information using a model predictive control algorithm to obtain target motion state information of the target object, wherein the motion trajectory of the target object is formed by the target motion state information of at least two consecutive moments. Because the target trajectory model is trained on historical trajectory data of sample objects and the initial estimated motion state information is computed by that model, the initial estimate is generated efficiently, which speeds up the model predictive control computation and improves trajectory generation efficiency.

Description

Track planning method, track planning device, electronic equipment and storage medium
Technical Field
The present application relates to the field of motion control technologies, and in particular, to a track planning method, a track planning device, an electronic device, and a storage medium.
Background
Humanoid robots are complex systems with multiple degrees of freedom. Because they can imitate human motion, they have developed rapidly and are widely used in manufacturing to reduce labor requirements. A precisely planned motion trajectory is a prerequisite for a humanoid robot to work accurately in these applications.
In the prior art, trajectory planning is generally performed by online trajectory generation, which typically runs a model predictive control (MPC) algorithm. Because the initial input data obtained for the MPC in existing methods is of low accuracy, trajectory planning with the MPC is slow.
Disclosure of Invention
The application aims to provide a track planning method, a device, electronic equipment and a storage medium aiming at the defects in the prior art so as to solve the problem of low track planning efficiency in the prior art.
In order to achieve the above purpose, the technical scheme adopted by the embodiment of the application is as follows:
In a first aspect, an embodiment of the present application provides a track planning method, including:
generating initial estimated motion state information of a target object by adopting a target track model obtained by training in advance according to motion state information of the target object at the current moment;
and according to the initial estimated motion state information of the target object, adjusting the estimated motion state information by adopting a model predictive control algorithm to obtain target motion state information of the target object, wherein the motion trail of the target object is composed of at least two pieces of target motion state information at continuous moments.
Optionally, the target track model is obtained by training in the following manner:
acquiring motion state information of a plurality of sample objects at a plurality of historical moments from a pre-constructed motion state information base, wherein the motion state information comprises position information of the objects, speed information of the objects and track remaining time;
and carrying out iterative training on the initial track model based on the motion state information of a plurality of historical moments of each sample object to obtain the target track model.
Optionally, the performing iterative training on the initial trajectory model based on the motion state information of the plurality of historical moments of each sample object to obtain the target trajectory model includes:
inputting motion state information of a plurality of historical moments of each sample object into the initial track model to obtain estimated motion state information of each sample object output by the initial track model;
calculating model loss of the initial track model according to the estimated motion state information and the actual motion state information, correcting model parameters of the initial track model according to the model loss, and estimating motion state information of a sample object again based on the corrected initial track model;
and iteratively executing the steps until the model loss of the initial track model meets a preset condition, and taking the initial track model meeting the preset condition as the target track model.
Optionally, the calculating the model loss of the initial trajectory model according to the estimated motion state information and the actual motion state information includes:
calculating the actual motion state information of each sample object by using a model predictive control algorithm according to the motion state information of each sample object at each historical moment;
and comparing the estimated motion state information of each sample object with the actual motion state information to obtain the model loss of the initial track model.
Optionally, the adjusting the estimated motion state information by using a model predictive control algorithm according to the initial estimated motion state information of the target object to obtain target motion state information of the target object includes:
and inputting the initial estimated motion state information of the target object into the model predictive control algorithm as an input parameter, and calculating to obtain the target motion state information of the target object.
Optionally, before acquiring the motion state information of the plurality of historical moments of the plurality of sample objects from the pre-constructed motion state information base, the method further includes:
acquiring a plurality of historical motion tracks of a plurality of sample objects, wherein each historical motion track is composed of motion state information of at least two continuous historical moments of each sample object;
splitting each historical motion track of each sample object to obtain motion state information of a plurality of historical moments of each sample object;
and constructing the motion state information base from the motion state information of a plurality of historical moments of each sample object.
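The splitting and library-construction steps above can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the function name `build_state_library`, the record layout `(sample_id, track_index, time_index, state)` and the tuple-valued states are all assumptions, since the source does not specify how the motion state information base is stored.

```python
from typing import Dict, List, Tuple

State = Tuple[float, ...]                      # hypothetical flat state vector
Record = Tuple[str, int, int, State]           # (sample_id, track_index, time_index, state)


def build_state_library(tracks: Dict[str, List[List[State]]]) -> List[Record]:
    """Split every historical track into per-moment records (assumed layout)."""
    library: List[Record] = []
    for sample_id, sample_tracks in tracks.items():
        for track_index, track in enumerate(sample_tracks):
            # The source requires each track to hold at least two consecutive moments.
            if len(track) < 2:
                raise ValueError("a historical track needs at least two states")
            for time_index, state in enumerate(track):
                library.append((sample_id, track_index, time_index, state))
    return library
```

Keeping the sample and time indices next to each state is one possible design choice; it lets later steps recover which states were consecutive in the original track.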
Optionally, the position information of the object comprises the position of the object and the direction of the object, and the speed information of the object comprises the three-dimensional speed and the three-dimensional acceleration of the object.
In a second aspect, the embodiment of the application also provides a track planning device, which comprises a generating module and a correcting module;
The generation module is used for generating initial estimated motion state information of the target object by adopting a target track model obtained by training in advance according to motion state information of the target object at the current moment;
The correction module is used for adjusting the estimated motion state information by adopting a model predictive control algorithm according to the initial estimated motion state information of the target object to obtain target motion state information of the target object, and the motion trail of the target object is composed of at least two continuous target motion state information.
Optionally, the apparatus further comprises a model training module;
the model training module is used for acquiring motion state information of a plurality of sample objects at a plurality of historical moments from a pre-constructed motion state information base, wherein the motion state information comprises position information of the objects, speed information of the objects and track remaining time;
And carrying out iterative training on the initial track model based on the motion state information of a plurality of historical moments of each sample object to obtain the target track model.
The model training module is specifically configured to input motion state information of a plurality of historical moments of each sample object into the initial track model, so as to obtain estimated motion state information of each sample object output by the initial track model;
Calculating model loss of the initial track model according to the estimated motion state information and the actual motion state information, correcting model parameters of the initial track model according to the model loss, and estimating motion state information of a sample object again based on the corrected initial track model;
And iteratively executing the steps until the model loss of the initial track model meets a preset condition, and taking the initial track model meeting the preset condition as the target track model.
The model training module is specifically configured to calculate, according to motion state information of each sample object at each historical moment, the actual motion state information of each sample object by using a model predictive control algorithm;
and comparing the estimated motion state information of each sample object with the actual motion state information to obtain the model loss of the initial track model.
Optionally, the correction module is specifically configured to input the initial estimated motion state information of the target object as an input parameter into the model predictive control algorithm, and calculate to obtain target motion state information of the target object.
Optionally, the device further comprises a construction module;
the construction module is used for acquiring a plurality of historical motion tracks of a plurality of sample objects, wherein each historical motion track is composed of motion state information of at least two continuous historical moments of each sample object;
splitting each historical motion trail of each sample object to obtain motion state information of a plurality of historical moments of each sample object;
and constructing the motion state information base by motion state information of a plurality of historical moments of each sample object.
Optionally, the position information of the object comprises the position of the object and the direction of the object, and the speed information of the object comprises the three-dimensional speed and the three-dimensional acceleration of the object.
In a third aspect, an embodiment of the application provides an electronic device comprising a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating over the bus when the electronic device is in operation, the processor executing the machine-readable instructions to perform the steps of the track planning method as provided in the first aspect.
In a fourth aspect, an embodiment of the present application provides a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the trajectory planning method as provided in the first aspect.
The beneficial effects of the application are as follows:
The application provides a trajectory planning method, a trajectory planning device, an electronic device and a storage medium. The method comprises: generating initial estimated motion state information of a target object from the motion state information of the target object at the current moment, using a pre-trained target trajectory model; and adjusting the initial estimated motion state information using a model predictive control algorithm to obtain target motion state information of the target object, wherein the motion trajectory of the target object is composed of the target motion state information of at least two consecutive moments. In this method, the initial estimated motion state information is computed by the trained target trajectory model, which is itself trained on historical trajectory data of sample objects, so the initial estimate is generated both efficiently and accurately. Using this accurate estimate as the initial input value of the model predictive control algorithm speeds up the algorithm's computation and thereby improves trajectory generation efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a track planning system according to an embodiment of the present application;
Fig. 2 is a schematic flow chart of a track planning method according to an embodiment of the present application;
Fig. 3 is a schematic flow chart of another track planning method according to an embodiment of the present application;
Fig. 4 is a schematic flow chart of another track planning method according to an embodiment of the present application;
Fig. 5 is a schematic flow chart of another track planning method according to an embodiment of the present application;
Fig. 6 is a schematic flow chart of another track planning method according to an embodiment of the present application;
Fig. 7 is a schematic diagram of a track planning apparatus according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described with reference to the accompanying drawings in the embodiments of the present application, and it should be understood that the drawings in the present application are for the purpose of illustration and description only and are not intended to limit the scope of the present application. In addition, it should be understood that the schematic drawings are not drawn to scale. A flowchart, as used in this disclosure, illustrates operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be implemented out of order and that steps without logical context may be performed in reverse order or concurrently. Moreover, one or more other operations may be added to or removed from the flow diagrams by those skilled in the art under the direction of the present disclosure.
In addition, the described embodiments are only some, but not all, embodiments of the application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present application.
In order to enable a person skilled in the art to use the present disclosure, in connection with a specific application scenario "trajectory planning of a humanoid robot", the following embodiments are presented. It will be apparent to those having ordinary skill in the art that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the application. Although the application is mainly described around trajectory planning of a humanoid robot, it should be understood that this is only one exemplary embodiment. The present application may be applied to track planning scenarios for any other object. For example, the present application may be applied to vehicle trajectory planning and the like.
It should be noted that the term "comprising" will be used in embodiments of the application to indicate the presence of the features stated hereafter, but not to exclude the addition of other features.
First, prior to describing the embodiments of the present application, the related art to which the present application relates will be described:
The humanoid robot can simulate the behavior of human beings, so that the humanoid robot can be applied to a plurality of fields to replace the human beings to perform work, thereby improving the work efficiency. In the process of controlling the robot to operate, accurate track planning is performed on the robot, so that the accuracy of an operation result is higher.
Existing trajectory planning for humanoid robots mainly takes two forms. The first is offline trajectory generation, which can take the robot's multi-rigid-body model into account; however, because the model is complex and computationally expensive, generating a trajectory that lasts a few seconds generally takes several minutes. The second is online trajectory re-planning (generation), which generates a motion trajectory from the robot's current motion state. Because the trajectory must be generated in real time (typically with a computation time within 50 ms), it can only use a simplified robot dynamics model, such as a point-mass model in which the mass is concentrated at a single point.
In the online trajectory generation mode, a model predictive control (MPC) algorithm is generally used to compute the trajectory. The algorithm is sensitive to its input data: given a good initial input value, it can complete the computation quickly.
In the existing method, when the initial input value of the model predictive control algorithm is obtained, a large-scale track library is generally constructed according to the motion state information of the historical moments of a large number of randomly selected robots, the optimal motion state information of the target moment of the robots is estimated from a plurality of tracks in the track library by adopting a linear interpolation or spline interpolation mode aiming at the tracks in the track library, and the optimal motion state information of the target moment is used as the initial input value of the model predictive control algorithm.
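To make the interpolation step of the existing method concrete, the following minimal Python sketch linearly interpolates one scalar state component between stored samples of a single library track. The function name `interpolate_state`, the scalar state and the clamping at the track ends are assumptions made for illustration; the source does not describe how the track library is stored or queried.

```python
from bisect import bisect_right
from typing import Sequence


def interpolate_state(times: Sequence[float], values: Sequence[float], t: float) -> float:
    """Linearly interpolate a scalar state at time t (times must be strictly increasing)."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two samples with matching lengths")
    # Clamp queries outside the stored track to its end points (an assumption).
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    hi = bisect_right(times, t)        # first stored time strictly greater than t
    lo = hi - 1
    weight = (t - times[lo]) / (times[hi] - times[lo])
    return (1.0 - weight) * values[lo] + weight * values[hi]
```

Each query touches a single track here; with a large library, the search for suitable tracks to interpolate between is repeated for every query, which is the cost the following paragraph refers to.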
However, when the scale of the track library is large, the method adopts an interpolation mode to obtain the initial input value of the model predictive control algorithm, so that the real-time performance and the accuracy are poor, and the obtained initial input value of the model predictive control algorithm is not ideal.
The method combines an off-line track generation mode and an on-line track generation mode, trains and acquires a track model in an off-line mode, calculates and acquires estimated motion state information of the robot in an on-line mode by adopting the track model, takes the estimated motion state information as input of a model predictive control algorithm, and corrects the estimated motion state information by the model predictive control algorithm to acquire target motion state information. The initial input value of the model predictive control algorithm is generated by adopting the track model obtained through training, so that the accuracy and the generation efficiency of the generated initial input value can be effectively improved.
Fig. 1 is a schematic diagram of the architecture of a trajectory planning system according to an embodiment of the present application. As shown in Fig. 1, the trajectory planning system may include an offline training module and an online running module. The offline training module trains an initial trajectory model on a motion state information base created in advance, and iteratively corrects the initial trajectory model using actual motion state information generated by a model predictive control algorithm, so as to obtain a target trajectory model. The online running module takes the collected current motion state information of the robot as the input of the target trajectory model to generate initial estimated motion state information of the robot. This information is used as the initial input value of the model predictive control algorithm, which computes the target motion state information of the robot, and the motion trajectory is then generated from the target motion state information.
Fig. 2 is a flow chart of a track planning method according to an embodiment of the present application, where an execution subject of the method may be a computer device, as shown in fig. 2, and the method may include:
S101, generating initial estimated motion state information of a target object by adopting a target track model obtained by training in advance according to motion state information of the target object at the current moment.
In this embodiment, the target object may refer to a humanoid robot, and in practical application, the target object may also be a vehicle, an unmanned aerial vehicle, a virtual character in a game, etc., and the method may be used for track planning of any object with motion behavior.
Optionally, the track of the object may be formed by motion state information of the object at different moments, and track planning may be understood as estimating motion state information of the object at the next moment in real time according to motion state information of the object at the current moment, and forming a small track of the object by the motion state information of the current moment and the motion state information of the next moment.
Alternatively, the motion state information of the target object at the current moment may be acquired in real time, or may be generated by the target track model. The target trajectory model may be trained from historical trajectory data of the sample object.
If the current time is at the first time, the motion state information of the current time can be acquired in real time, and if the current time is at any time after the first time, for example, at the second time, the motion state information of the current time can be initial estimated motion state information generated by adopting a target track model according to the acquired motion state information of the first time. That is, if the current time is any time after the first time, the motion state information of the current time may be estimated by the target track model according to the motion state information of the previous time.
S102, according to initial estimated motion state information of a target object, adjusting the estimated motion state information by adopting a model predictive control algorithm to obtain target motion state information of the target object, wherein a motion track of the target object is composed of at least two continuous target motion state information.
In some embodiments, the initial estimated motion state information of the target object generated by the target track model may have some deviation, where the model predictive control algorithm may be used to correct the initial estimated motion state information, that is, correct the deviation, so as to obtain the target motion state information of the target object.
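The effect of a good initial input value on the correction step can be illustrated with a toy warm-started solver. The Python sketch below is an assumption-laden stand-in, not the model predictive control algorithm of the application: it uses a one-dimensional single-integrator model `x[k+1] = x[k] + dt * u[k]`, a quadratic tracking cost, and plain gradient descent as the solver. The names `rollout`, `cost_gradient` and `solve_mpc` and every numeric default are hypothetical.

```python
from typing import List, Sequence, Tuple


def rollout(x0: float, controls: Sequence[float], dt: float) -> List[float]:
    """Integrate x[k+1] = x[k] + dt * u[k]; returns the states x[1]..x[N]."""
    states, x = [], x0
    for u in controls:
        x = x + dt * u
        states.append(x)
    return states


def cost_gradient(x0: float, target: float, controls: Sequence[float],
                  dt: float, effort_weight: float) -> List[float]:
    """Gradient of sum((x[j] - target)^2) + effort_weight * sum(u[k]^2) w.r.t. u."""
    errors = [x - target for x in rollout(x0, controls, dt)]
    tail = sum(errors)                 # errors of all states that u[k] influences
    grad = []
    for k, u in enumerate(controls):
        grad.append(2.0 * dt * tail + 2.0 * effort_weight * u)
        tail -= errors[k]              # u[k+1] no longer influences x[k+1]
    return grad


def solve_mpc(x0: float, target: float, u_init: Sequence[float], dt: float = 1.0,
              effort_weight: float = 0.1, step: float = 0.05, tol: float = 1e-6,
              max_iters: int = 10_000) -> Tuple[List[float], int]:
    """Refine the initial control guess; returns (controls, iterations used).

    The fixed step size is tuned for a horizon of about five steps with dt = 1.
    """
    controls = list(u_init)
    for iteration in range(max_iters):
        grad = cost_gradient(x0, target, controls, dt, effort_weight)
        if max(abs(g) for g in grad) < tol:
            return controls, iteration
        controls = [u - step * g for u, g in zip(controls, grad)]
    return controls, max_iters
```

Calling `solve_mpc(0.0, 1.0, [0.0] * 5)` starts from an uninformed guess, while passing a guess close to the converged controls as `u_init` reaches the same solution in fewer iterations; in the scheme described here, the target track model plays the role of supplying that guess.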
It should be noted that the generated initial estimated motion state information of the target object may be that of a time subsequent to the current time. The subsequent time may comprise one time or several times. When it comprises one time, it may be the time immediately following the current time; when it comprises several times, they may be several consecutive times following the current time, such as the next time, the time after that, and so on.
A trajectory is generally composed of the information of a plurality of points; that is, the motion trajectory of the target object may be composed of the target motion state information of at least two consecutive times. The more consecutive times are included, the longer the resulting trajectory.
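Steps S101 and S102, applied over consecutive times, can be sketched as a simple loop. In this minimal Python illustration the names `plan_trajectory`, `trajectory_model` and `mpc_refine` are hypothetical, and both callables are placeholders for the trained target track model and the model predictive control step. Following the description above, the state fed to the model at later times is the model's own previous estimate; using the refined state instead would be an equally plausible reading.

```python
from typing import Callable, List, TypeVar

S = TypeVar("S")  # type of one motion state


def plan_trajectory(measured_state: S,
                    trajectory_model: Callable[[S], S],
                    mpc_refine: Callable[[S], S],
                    num_steps: int) -> List[S]:
    """Build a trajectory from target states at num_steps consecutive times."""
    if num_steps < 2:
        raise ValueError("a trajectory needs target states at two or more times")
    current = measured_state              # first time: acquired in real time
    trajectory: List[S] = []
    for _ in range(num_steps):
        initial_estimate = trajectory_model(current)      # S101
        trajectory.append(mpc_refine(initial_estimate))   # S102
        current = initial_estimate        # later times: previous model estimate
    return trajectory
```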
In summary, the trajectory planning method provided by this embodiment comprises: generating initial estimated motion state information of a target object from the motion state information of the target object at the current moment, using a pre-trained target trajectory model; and adjusting the initial estimated motion state information using a model predictive control algorithm to obtain target motion state information of the target object, wherein the motion trajectory of the target object is formed by the target motion state information of at least two consecutive moments. In this method, the initial estimated motion state information is computed by the trained target trajectory model, which is itself trained on historical trajectory data of sample objects, so the initial estimate is generated both efficiently and accurately. Using this accurate estimate as the initial input value of the model predictive control algorithm speeds up the algorithm's computation and thereby improves trajectory generation efficiency.
Fig. 3 is a flow chart of another trajectory planning method according to an embodiment of the present application, and optionally, the target trajectory model may be trained by:
S201, acquiring motion state information of a plurality of historical moments of a plurality of sample objects from a pre-constructed motion state information base, wherein the motion state information comprises position information of the objects, speed information of the objects and track remaining time.
The motion state information base can comprise motion state information of a plurality of sample objects at a plurality of historical moments, optionally, the plurality of sample objects can be randomly selected from the motion state information base, and the motion state information of the plurality of sample objects at the plurality of historical moments is obtained as sample data of model training.
The motion state information of any moment can comprise position information of an object at the moment, speed information of the object at the moment and track remaining time, wherein the track remaining time can be used for representing execution time of a track, and the longer the track remaining time is, the longer the length of time required for completing the track is, so that the longer the track can be described to a certain extent.
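One possible in-memory representation of the motion state information at a single moment is sketched below in Python. The class name `MotionState`, the field names and the roll/pitch/yaw encoding of the direction are assumptions; the source only states that the position information covers position and direction, that the speed information covers three-dimensional speed and acceleration, and that the track remaining time is included.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class MotionState:
    """Motion state information of an object at one moment (hypothetical layout)."""
    position: Vec3         # position of the object
    orientation: Vec3      # direction of the object, assumed roll/pitch/yaw
    velocity: Vec3         # three-dimensional speed
    acceleration: Vec3     # three-dimensional acceleration
    remaining_time: float  # track remaining time, assumed to be in seconds

    def to_vector(self) -> List[float]:
        """Flatten the state into a 13-element feature vector for a model."""
        return [*self.position, *self.orientation,
                *self.velocity, *self.acceleration, self.remaining_time]
```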
S202, performing iterative training on the initial track model based on the motion state information of a plurality of historical moments of each sample object to obtain a target track model.
Optionally, an initial trajectory model can be obtained by training on the motion state information of each sample object at a plurality of historical moments. A cross-entropy loss is then calculated from the output of the initial trajectory model to give the model loss, and the model parameters of the initial trajectory model are corrected according to that loss.
The model loss calculation is carried out iteratively, so that the initial trajectory model is corrected step by step until the target trajectory model is obtained.
Fig. 4 is a flowchart of another trajectory planning method according to an embodiment of the present application. Optionally, in step S202, performing iterative training on the initial trajectory model based on the motion state information of each sample object to obtain the target trajectory model may include:
S301, motion state information of a plurality of historical moments of each sample object is input into an initial trajectory model, and estimated motion state information of each sample object output by the initial trajectory model is obtained.
Here, the parameters of the initial trajectory model are unknown before the motion state information of the sample objects at the plurality of historical moments has been input. The initial trajectory model is obtained by training with this motion state information as input data, after which the initial trajectory model has processing capability.
In the training process of the initial trajectory model, the motion state information of the first historical moment of the sample object may be used as input data of the initial trajectory model, and the motion state information of at least one continuous second historical moment after the first historical moment of the sample object may be used as output data of the initial trajectory model, so as to obtain the initial trajectory model through training.
For example, assume that the motion state information at a plurality of historic times of one sample object includes motion state information at time 1, motion state information at time 2, motion state information at time 3, and motion state information at time 4, wherein time 1, time 2, time 3, and time 4 are consecutive times. Then, for any sample object, the motion state information at time 1 can be used as input data of the initial trajectory model, and the motion state information at time 2 can be used as output data of the initial trajectory model, so as to perform initial trajectory model training.
Or for any sample object, the motion state information at the moment 1 can be used as input data of an initial track model, and the motion state information at the moment 2, the moment 3 and the moment 4 can be used as output data of the initial track model so as to train the initial track model.
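The two pairing schemes described above (one following moment, or several consecutive following moments, as the output) can be sketched as follows. This is a minimal illustration only: the function name `make_training_pairs`, the `horizon` parameter and the one-number state encoding are hypothetical and do not come from the embodiment.

```python
from typing import List, Sequence, Tuple

State = Tuple[float, ...]  # hypothetical flat encoding of one motion state


def make_training_pairs(
    states: Sequence[State], horizon: int = 1
) -> List[Tuple[State, List[State]]]:
    """Pair each state with the `horizon` consecutive states that follow it."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    pairs = []
    # Stop early enough that every input has a full block of target states.
    for i in range(len(states) - horizon):
        pairs.append((states[i], list(states[i + 1 : i + 1 + horizon])))
    return pairs


# Four consecutive moments, each reduced to a single number for brevity.
moments = [(1.0,), (2.0,), (3.0,), (4.0,)]
one_step = make_training_pairs(moments, horizon=1)    # moment 1 -> moment 2, ...
three_step = make_training_pairs(moments, horizon=3)  # moment 1 -> moments 2, 3, 4
```

With `horizon=1` every moment except the last becomes an input; with `horizon=3` only moment 1 has enough following moments to form a pair.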
Optionally, based on the initial track model obtained by training, when the motion state information of a plurality of historical moments of each sample object is input into the initial track model, the estimated motion state information of each sample object output by the initial track model can be obtained. Here, the estimated state information of each sample object may be the motion state information of the next time after each historical time, or may be the motion state information of a plurality of continuous times after each historical time, that is, the initial trajectory model may be used to estimate the trajectory in a shorter time or in a longer time.
In some embodiments, the initial trajectory model may calculate several candidate estimates of the motion state information at the moment following the current moment of each sample object. In that case, the trajectory remaining time corresponding to each candidate can be compared, and the candidate with the minimum trajectory remaining time can be output as the actual estimated motion state information.
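Read as "choose the candidate whose trajectory remaining time is smallest", the selection step can be sketched as below. The `Candidate` class and its fields are hypothetical names used only for illustration, and this reading of the selection rule is itself an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """One candidate estimate for the next moment (hypothetical structure)."""
    position: float
    remaining_time: float  # trajectory remaining time, in seconds


def pick_candidate(candidates: List[Candidate]) -> Candidate:
    """Return the candidate with the smallest trajectory remaining time."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return min(candidates, key=lambda c: c.remaining_time)


options = [Candidate(0.4, 2.5), Candidate(0.5, 1.8), Candidate(0.3, 3.1)]
chosen = pick_candidate(options)
```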
S302, calculating model loss of an initial track model according to the estimated motion state information and the actual motion state information, correcting model parameters of the initial track model according to the model loss, and estimating motion state information of the sample object again based on the corrected initial track model.
The actual motion state information can be the motion state information at the moments following each historical moment of each sample object, that is, motion state information contained in the actual trajectory finally completed by the sample object. The deviation of the model can be corrected by comparing the actual motion state information with the estimated motion state information, so that the model parameters of the initial trajectory model are adjusted continuously and the estimated motion state information output by the corrected model approaches the actual motion state information as closely as possible, which improves the accuracy of the trained model.
S303, iteratively executing the steps until the model loss of the initial track model meets the preset condition, and taking the initial track model meeting the preset condition as the target track model.
Optionally, by continuously performing the step S302 in an iterative manner, the model parameters of the initial trajectory model may be continuously adjusted, and when the model loss of the current initial trajectory model meets the preset loss threshold, the correction of the initial trajectory model may be ended, and the initial trajectory model at this time is used as the target trajectory model.
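The loop of steps S301 to S303 (estimate, compute loss, correct parameters, stop once the loss meets a preset threshold) can be sketched with a deliberately tiny stand-in model. Everything here is an assumption made for illustration: the one-dimensional linear model, the mean squared error (used instead of the cross-entropy loss mentioned earlier), plain gradient descent, and all names and constants.

```python
from typing import List, Tuple

Pair = Tuple[float, float]  # (state at a historical moment, actual next state)


def mse(w: float, b: float, pairs: List[Pair]) -> float:
    """Mean squared difference between estimated and actual next states."""
    return sum((w * x + b - y) ** 2 for x, y in pairs) / len(pairs)


def train(
    pairs: List[Pair],
    lr: float = 0.1,
    loss_threshold: float = 1e-6,
    max_iters: int = 10_000,
) -> Tuple[float, float, float]:
    """Correct (w, b) repeatedly until the loss meets the preset threshold."""
    w, b = 0.0, 0.0  # untrained initial model
    for _ in range(max_iters):
        if mse(w, b, pairs) <= loss_threshold:
            break  # preset condition met: this model becomes the target model
        n = len(pairs)
        grad_w = sum(2 * (w * x + b - y) * x for x, y in pairs) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in pairs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, mse(w, b, pairs)


# Synthetic samples that follow next_state = 0.5 * state + 1.0.
samples = [(0.0, 1.0), (1.0, 1.5), (2.0, 2.0), (3.0, 2.5)]
w, b, final_loss = train(samples)
```

The `max_iters` cap is a safeguard added in this sketch so the loop terminates even if the threshold is never reached.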
Fig. 5 is a flow chart of another trajectory planning method according to an embodiment of the present application, optionally, in step S302, calculating a model loss of an initial trajectory model according to estimated motion state information and actual motion state information may include:
S401, calculating actual motion state information of each sample object by using a model prediction control algorithm according to motion state information of each sample object at each historical moment.
Model predictive control is a model-based, closed-loop optimization control strategy in which the control action is determined by using a model to predict the future behavior of the system.
The model predictive control algorithm mainly comprises three parts: a prediction model, a reference trajectory and a control algorithm. The reference trajectory may be added as a constraint on future input, output or state variables.
In this embodiment, the motion state information of each sample object at each historical moment can be used both as the reference trajectory and as the input data of the model predictive control algorithm. The prediction model generates predicted actual motion state information for each object, and the control algorithm controls each object to move along the trajectory indicated by that predicted information. The motion state information fed back after the movement is then compared with the reference trajectory. If the deviation does not meet the condition, the fed-back motion state information is used again as input data of the model predictive control algorithm, and these steps are repeated until the fed-back motion state information approaches the reference trajectory. At that point, the currently fed-back motion state information can be taken as the actual motion state information of each sample object calculated by the model predictive control algorithm.
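The predict, act, feed back and compare cycle can be sketched for a one-dimensional point as below. This is a toy under stated assumptions, not the controller of the embodiment: the prediction model `x + u * DT`, the three discrete commands in `CONTROLS`, the exhaustive search over a three-step horizon, and the use of the predicted state as the "fed-back" measurement are all hypothetical simplifications.

```python
from itertools import product

DT = 0.1                     # time step in seconds (assumed)
CONTROLS = (-1.0, 0.0, 1.0)  # hypothetical velocity commands


def predict(x: float, u: float) -> float:
    """Prediction model: position after applying velocity u for one step."""
    return x + u * DT


def plan(x: float, reference: float, horizon: int = 3) -> float:
    """Return the first command of the sequence that best tracks the reference."""
    best_cost, best_first = float("inf"), 0.0
    for seq in product(CONTROLS, repeat=horizon):
        xp, cost = x, 0.0
        for u in seq:
            xp = predict(xp, u)
            cost += (xp - reference) ** 2  # deviation from the reference
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first


def run_mpc(x0: float, reference: float, tol: float = 0.05,
            max_steps: int = 200) -> float:
    """Repeat plan/act/feedback until the state is close to the reference."""
    x = x0
    for _ in range(max_steps):
        if abs(x - reference) <= tol:
            break  # deviation meets the condition
        u = plan(x, reference)
        x = predict(x, u)  # stands in for the measured, fed-back state
    return x


final_state = run_mpc(0.0, 1.0)
```

Only the first command of each planned sequence is applied before re-planning, which is what makes the loop closed: every iteration starts from the most recent fed-back state.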
The model predictive control algorithm herein may be understood with reference to existing methods.
S402, comparing and analyzing the estimated motion state information of each sample object with the actual motion state information to obtain the model loss of the initial track model.
In some embodiments, the model loss of the initial trajectory model may be obtained by calculating the difference between the estimated motion state information and the actual motion state information of each sample object.
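One common way to reduce such a difference to a single number, given here only as an illustrative assumption because the embodiment does not fix a formula, is the mean squared difference over N samples, where \hat{s}_i denotes the estimated motion state vector of sample i and s_i the actual motion state vector calculated by the model predictive control algorithm:

```latex
L = \frac{1}{N} \sum_{i=1}^{N} \left\lVert \hat{s}_i - s_i \right\rVert_2^2
```

L is zero only when every estimate matches its actual value, so driving L below a preset threshold corresponds to the stopping condition of step S303.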
Optionally, in step S102, according to the initial estimated motion state information of the target object, a model predictive control algorithm is adopted to adjust the estimated motion state information to obtain the target motion state information of the target object, which may include inputting the initial estimated motion state information of the target object as an input parameter into the model predictive control algorithm, and calculating to obtain the target motion state information of the target object.
In the above embodiment, the control principle of the model predictive control algorithm is described in detail, where the initial estimated motion state information of the target object can be understood as the reference track, and the target motion state information of the target object can be calculated based on the control algorithm.
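Why an accurate initial estimate can shorten the refinement step may be illustrated with a generic iterative solver. The sketch below is not the model predictive control algorithm itself: it uses gradient descent on the quadratic cost (x - target)^2 as a stand-in for any iterative optimizer, and `refine`, `step` and `tol` are hypothetical names.

```python
from typing import Tuple


def refine(initial_guess: float, target: float, step: float = 0.2,
           tol: float = 1e-3, max_iters: int = 1000) -> Tuple[float, int]:
    """Iteratively move a guess toward the minimizer of (x - target)^2.

    Returns the refined value and the number of iterations used.
    """
    x, iters = initial_guess, 0
    while abs(x - target) > tol and iters < max_iters:
        x -= step * 2 * (x - target)  # gradient step on the quadratic cost
        iters += 1
    return x, iters


# Cold start: no prior estimate. Warm start: a model-provided estimate
# that is already close to the optimum.
cold_value, cold_iters = refine(0.0, 5.0)
warm_value, warm_iters = refine(4.9, 5.0)
```

Both runs reach the same tolerance, but the warm start needs fewer iterations because it begins closer to the optimum; this mirrors the role the trained model's output plays as the initial input value.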
It should be noted that, the target object may refer to an object that is currently subjected to trajectory planning, and the sample object refers to an object used for performing sample training, where the sample object is an object that has already performed a specific trajectory.
Fig. 6 is a flowchart of another trajectory planning method according to an embodiment of the present application. Optionally, before the motion state information of a plurality of sample objects at a plurality of historical moments is acquired from the pre-constructed motion state information base in step S201, the method of the present application may further include:
S501, acquiring a plurality of historical motion trajectories of a plurality of sample objects, wherein each historical motion trajectory is composed of motion state information of at least two continuous historical moments of each sample object.
In some embodiments, a plurality of historical motion trajectories of a plurality of sample objects may be extracted from motion trajectories of a large number of objects stored in a server background, each of the historical motion trajectories being composed of motion state information of at least two consecutive historical moments of the sample objects.
S502, splitting each historical motion trail of each sample object to obtain motion state information of each sample object at a plurality of historical moments.
Alternatively, since each historical motion trail of each sample object is composed of motion state information of some historical moments, each historical motion trail of each sample object may be split into motion state information of a plurality of historical moments of each sample object, and the motion state information of each moment in each historical motion trail is used as motion state information of one historical moment.
S503, constructing a motion state information base according to motion state information of a plurality of historical moments of each sample object.
The motion state information base may include motion state information at a plurality of historic times of each split sample object.
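Steps S501 to S503 amount to flattening each object's trajectories into per-moment records. A minimal sketch follows, in which the dictionary layout, the three-field state tuple and the function name `build_state_library` are hypothetical choices made only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical per-moment record: (position, speed, trajectory remaining time).
State = Tuple[float, float, float]


def build_state_library(
    trajectories: Dict[str, List[List[State]]]
) -> Dict[str, List[State]]:
    """Split every historical trajectory into its per-moment states."""
    library: Dict[str, List[State]] = {}
    for object_id, object_trajectories in trajectories.items():
        states: List[State] = []
        for trajectory in object_trajectories:
            # A trajectory consists of at least two consecutive moments.
            if len(trajectory) < 2:
                continue
            states.extend(trajectory)
        library[object_id] = states
    return library


history = {
    "robot_a": [
        [(0.0, 0.5, 2.0), (0.1, 0.5, 1.0), (0.2, 0.0, 0.0)],
        [(1.0, 0.3, 1.0), (1.1, 0.0, 0.0)],
    ],
    "robot_b": [[(5.0, 0.0, 0.0)]],  # too short to count as a trajectory
}
library = build_state_library(history)
```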
Alternatively, in the motion state information, the position information of the object may include a position of the object and an orientation of the object, and the speed information of the object may include a three-dimensional speed and a three-dimensional acceleration of the object.
The position information of the object at any time may include position information and orientation information of the object at the time, where taking the target object as an example of the robot, the position information of the object may be calculated according to coordinates of the centers of the two feet of the robot, and the orientation information of the object may refer to the orientation of the current toe of the robot.
And because the track of the target object is planned in a three-dimensional space, track estimation is needed by referring to the three-dimensional speed and three-dimensional acceleration information of the target object.
Based on this, the velocity information in the motion state information at a plurality of historic times of each sample object included in the motion state information base constructed as described above can be split into a plurality of items, whereby a large-scale motion state information base can be constructed.
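The fields listed above can be gathered into one record per moment. The sketch below is an assumption-laden illustration: the class `MotionState`, the scalar `heading` in radians, the flattening order in `to_vector`, and the use of the midpoint of the two foot centers as the position are not specified by the embodiment.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class MotionState:
    """Motion state of an object at one moment (hypothetical layout)."""
    position: Vec3         # position of the object
    heading: float         # orientation, e.g. toe direction, in radians (assumed)
    velocity: Vec3         # three-dimensional speed
    acceleration: Vec3     # three-dimensional acceleration
    remaining_time: float  # trajectory remaining time in seconds

    def to_vector(self) -> List[float]:
        """Flatten the state into the list a model could take as input."""
        return [*self.position, self.heading, *self.velocity,
                *self.acceleration, self.remaining_time]


def foot_center(left: Vec3, right: Vec3) -> Vec3:
    """Midpoint of the two foot centers, assumed here as the robot position."""
    x, y, z = ((l + r) / 2 for l, r in zip(left, right))
    return (x, y, z)


state = MotionState(
    position=foot_center((0.0, 0.0, 0.0), (2.0, 4.0, 0.0)),
    heading=0.0,
    velocity=(0.1, 0.0, 0.0),
    acceleration=(0.0, 0.0, 0.0),
    remaining_time=1.5,
)
features = state.to_vector()
```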
The following describes a device, equipment, a storage medium, etc. for executing the track planning method provided by the present application, and specific implementation processes and technical effects thereof are referred to above, and are not described in detail below.
Fig. 7 is a schematic diagram of a trajectory planning apparatus according to an embodiment of the present application; the functions implemented by the apparatus correspond to the steps executed by the method described above. The apparatus may be understood as a computer device, a server, or a processor of a server, or as a component that is independent of the server or processor and implements the functions of the present application under the control of the server. As shown in Fig. 7, the apparatus may include a generating module 710 and a correction module 720;
the generating module 710 is configured to generate initial estimated motion state information of the target object by using a target track model obtained by training in advance according to motion state information of the target object at the current moment;
the correction module 720 is configured to adjust the estimated motion state information by using a model predictive control algorithm according to the initial estimated motion state information of the target object, so as to obtain target motion state information of the target object, where a motion track of the target object is formed by at least two target motion state information at consecutive moments.
Optionally, the apparatus further comprises a model training module;
The model training module is used for acquiring motion state information of a plurality of sample objects at a plurality of historical moments from a pre-constructed motion state information base, wherein the motion state information comprises position information of the objects, speed information of the objects and track remaining time;
And carrying out iterative training on the initial track model based on the motion state information of a plurality of historical moments of each sample object to obtain a target track model.
The model training module is specifically used for inputting motion state information of a plurality of historical moments of each sample object into the initial track model to obtain estimated motion state information of each sample object output by the initial track model;
Calculating model loss of the initial track model according to the estimated motion state information and the actual motion state information, correcting model parameters of the initial track model according to the model loss, and estimating motion state information of the sample object again based on the corrected initial track model;
And iteratively executing the steps until the model loss of the initial track model meets the preset condition, and taking the initial track model meeting the preset condition as the target track model.
The model training module is specifically used for calculating the actual motion state information of each sample object by using a model prediction control algorithm according to the motion state information of each sample object at each historical moment;
And comparing and analyzing the estimated motion state information of each sample object with the actual motion state information to obtain the model loss of the initial track model.
Optionally, the correction module 720 is specifically configured to input the initial estimated motion state information of the target object as an input parameter to a model prediction control algorithm, and calculate to obtain the target motion state information of the target object.
Optionally, the device further comprises a construction module;
the construction module is used for acquiring a plurality of historical motion tracks of a plurality of sample objects, wherein each historical motion track is composed of motion state information of at least two continuous historical moments of each sample object;
splitting each historical motion trail of each sample object to obtain motion state information of a plurality of historical moments of each sample object;
And constructing a motion state information base according to the motion state information of a plurality of historical moments of each sample object.
Optionally, the position information of the object comprises the position of the object and the orientation of the object, and the speed information of the object comprises the three-dimensional speed and the three-dimensional acceleration of the object.
The foregoing apparatus is used for executing the method provided in the foregoing embodiment, and its implementation principle and technical effects are similar, and are not described herein again.
The above modules may be one or more integrated circuits configured to implement the above methods, for example one or more application specific integrated circuits (Application Specific Integrated Circuit, ASIC), one or more digital signal processors (Digital Signal Processor, DSP), or one or more field programmable gate arrays (Field Programmable Gate Array, FPGA). As another example, when one of the above modules is implemented as program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (Central Processing Unit, CPU) or another processor that can invoke the program code. As a further example, the modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).
The modules may be connected to or communicate with each other via wired or wireless connections. The wired connection may include a metal cable, an optical cable, a hybrid cable, or the like, or any combination thereof. The wireless connection may include a connection through a LAN, WAN, Bluetooth, ZigBee, or NFC, or any combination thereof. Two or more modules may be combined into a single module, and any one module may be divided into two or more units. It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working procedures of the above-described system and apparatus may refer to the corresponding procedures in the method embodiments, and are not repeated in the present disclosure.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application, where the electronic device may be a computing device with a data processing function.
The device may include a processor 801 and a memory 802.
The memory 802 is used for storing a program, and the processor 801 calls the program stored in the memory 802 to execute the above-described method embodiment. The specific implementation manner and the technical effect are similar, and are not repeated here.
The memory 802 stores program code that, when executed by the processor 801, causes the processor 801 to perform the steps of the trajectory planning method according to the various exemplary embodiments of the present application described in the "exemplary method" section of this specification.
The processor 801 may be a general-purpose processor, such as a central processing unit (CPU), a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the application. The general-purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be executed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
The memory 802, as a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory may include at least one type of storage medium, for example flash memory, a hard disk, a multimedia card, card-type memory, random access memory (Random Access Memory, RAM), static random access memory (Static Random Access Memory, SRAM), programmable read-only memory (Programmable Read Only Memory, PROM), read-only memory (Read-Only Memory, ROM), electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), magnetic memory, a magnetic disk, an optical disk, and the like. The memory may also be, without limitation, any other medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 802 of embodiments of the present application may also be circuitry or any other device capable of performing storage functions for storing program instructions and/or data.
Optionally, the present application also provides a program product, such as a computer readable storage medium, comprising a program for performing the above-described method embodiments when being executed by a processor.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in hardware plus software functional units.
The integrated units implemented in the form of software functional units described above may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to perform some of the steps of the methods according to the embodiments of the application. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk.

Claims (7)

1.一种轨迹规划方法,其特征在于,包括:1. A trajectory planning method, comprising: 根据目标对象当前时刻的运动状态信息,采用预先训练得到的目标轨迹模型,生成所述目标对象的初始预估运动状态信息;According to the motion state information of the target object at the current moment, the target trajectory model obtained by pre-training is used to generate the initial estimated motion state information of the target object; 根据所述目标对象的初始预估运动状态信息,采用模型预测控制算法,调整所述预估运动状态信息,得到所述目标对象的目标运动状态信息,所述目标对象的运动轨迹由至少两个连续时刻的目标运动状态信息构成;According to the initial estimated motion state information of the target object, a model predictive control algorithm is used to adjust the estimated motion state information to obtain target motion state information of the target object, wherein the motion trajectory of the target object is composed of target motion state information at at least two consecutive moments; 目标轨迹模型采用如下方式训练得到:The target trajectory model is trained in the following way: 从预先构建的运动状态信息库中获取多个样本对象的多个历史时刻的运动状态信息,运动状态信息包括:对象的位置信息、对象的速度信息及轨迹剩余时间;Obtaining motion state information of multiple sample objects at multiple historical moments from a pre-built motion state information library, the motion state information including: object position information, object speed information, and trajectory remaining time; 基于各样本对象的多个历史时刻的运动状态信息对初始轨迹模型进行迭代训练,得到所述目标轨迹模型;Iteratively training the initial trajectory model based on the motion state information of each sample object at multiple historical moments to obtain the target trajectory model; 所述基于各样本对象的多个历史时刻的运动状态信息对初始轨迹模型进行迭代训练,得到所述目标轨迹模型,包括:The iterative training of the initial trajectory model based on the motion state information of each sample object at multiple historical moments to obtain the target trajectory model includes: 将各样本对象的多个历史时刻的运动状态信息输入所述初始轨迹模型中,得到所述初始轨迹模型输出的各样本对象的预估运动状态信息;Inputting motion state information of each sample object at multiple historical moments into the initial trajectory model to obtain estimated motion state information of each sample object output by the initial trajectory model; 
根据所述预估运动状态信息以及实际运动状态信息,计算所述初始轨迹模型的模型损失,根据所述模型损失修正所述初始轨迹模型的模型参数,并基于修正后的初始轨迹模型重新对样本对象进行运动状态信息预估;Calculating a model loss of the initial trajectory model according to the estimated motion state information and the actual motion state information, correcting a model parameter of the initial trajectory model according to the model loss, and re-estimating the motion state information of the sample object based on the corrected initial trajectory model; 迭代执行上述步骤,直至所述初始轨迹模型的模型损失满足预设条件,并将满足预设条件的初始轨迹模型作为所述目标轨迹模型;Iteratively perform the above steps until the model loss of the initial trajectory model meets the preset conditions, and use the initial trajectory model that meets the preset conditions as the target trajectory model; 所述根据所述预估运动状态信息以及实际运动状态信息,计算所述初始轨迹模型的模型损失,包括:The calculating the model loss of the initial trajectory model according to the estimated motion state information and the actual motion state information includes: 根据各所述样本对象在各历史时刻的运动状态信息,使用模型预测控制算法,计算各所述样本对象的所述实际运动状态信息;According to the motion state information of each sample object at each historical moment, using a model predictive control algorithm, calculating the actual motion state information of each sample object; 对各所述样本对象的所述预估运动状态信息以所述实际运动状态信息进行比对分析,得到所述初始轨迹模型的模型损失。The estimated motion state information of each of the sample objects is compared and analyzed with the actual motion state information to obtain a model loss of the initial trajectory model. 2.根据权利要求1所述的方法,其特征在于,所述根据所述目标对象的初始预估运动状态信息,采用模型预测控制算法,调整所述预估运动状态信息,得到所述目标对象的目标运动状态信息,包括:2. 
The method according to claim 1, characterized in that adjusting the estimated motion state information according to the initial estimated motion state information of the target object, by using a model predictive control algorithm, to obtain the target motion state information of the target object comprises:
inputting the initial estimated motion state information of the target object into the model predictive control algorithm as an input parameter, and calculating the target motion state information of the target object.

3. The method according to claim 1, characterized in that, before acquiring the motion state information of multiple sample objects at multiple historical moments from the pre-built motion state information library, the method further comprises:
acquiring multiple historical motion trajectories of multiple sample objects, each historical motion trajectory being composed of the motion state information of a sample object at at least two consecutive historical moments;
splitting each historical motion trajectory of each sample object to obtain the motion state information of each sample object at multiple historical moments; and
constructing the motion state information library from the motion state information of each sample object at multiple historical moments.

4. The method according to claim 1, characterized in that the position information of the object comprises the position and the orientation of the object, and the speed information of the object comprises the three-dimensional velocity and the three-dimensional acceleration of the object.

5. A trajectory planning device, characterized by comprising: a generation module, a correction module and a model training module;
the generation module is used to generate the initial estimated motion state information of the target object according to the motion state information of the target object at the current moment, using a target trajectory model obtained by pre-training;
the correction module is used to adjust the estimated motion state information according to the initial estimated motion state information of the target object, using a model predictive control algorithm, to obtain the target motion state information of the target object, wherein the motion trajectory of the target object is composed of the target motion state information of at least two consecutive moments;
the model training module is used to acquire the motion state information of multiple sample objects at multiple historical moments from a pre-built motion state information library, the motion state information comprising: position information of the object, speed information of the object and the remaining trajectory time; and to iteratively train an initial trajectory model based on the motion state information of each sample object at multiple historical moments, to obtain the target trajectory model;
the model training module is specifically used to input the motion state information of each sample object at multiple historical moments into the initial trajectory model, to obtain the estimated motion state information of each sample object output by the initial trajectory model; to calculate a model loss of the initial trajectory model according to the estimated motion state information and the actual motion state information, correct the model parameters of the initial trajectory model according to the model loss, and re-estimate the motion state information of the sample objects based on the corrected initial trajectory model; and to perform the above steps iteratively until the model loss of the initial trajectory model meets a preset condition, and use the initial trajectory model that meets the preset condition as the target trajectory model;
the model training module is specifically used to calculate the actual motion state information of each sample object according to the motion state information of each sample object at each historical moment, using a model predictive control algorithm; and to compare the estimated motion state information of each sample object with the actual motion state information, to obtain the model loss of the initial trajectory model.

6. An electronic device, characterized by comprising: a processor, a storage medium and a bus, wherein the storage medium stores program instructions executable by the processor; when the electronic device is running, the processor communicates with the storage medium through the bus, and the processor executes the program instructions to perform the steps of the trajectory planning method according to any one of claims 1 to 4.

7. A computer-readable storage medium, characterized in that a computer program is stored on the storage medium, and when the computer program is run by a processor, the steps of the trajectory planning method according to any one of claims 1 to 4 are performed.
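As an informal illustration only, and not the patented implementation, the library construction of claim 3 and the iterative training of claim 5 might be sketched as follows. Every name here (`MotionState`, `build_state_library`, `reference_next_position`, `train`) is hypothetical. The state is reduced to one dimension, the trajectory model is a two-weight linear predictor, and a constant-velocity step stands in for the model predictive control computation of the "actual" motion state, which the claims do not specify in detail.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MotionState:
    position: float        # 1-D stand-in for position and orientation
    velocity: float        # 1-D stand-in for 3-D velocity and acceleration
    time_remaining: float  # remaining trajectory time (unused by this toy model)


def build_state_library(trajectories: List[List[MotionState]]) -> List[MotionState]:
    """Split each historical trajectory into its per-moment states."""
    library: List[MotionState] = []
    for trajectory in trajectories:
        if len(trajectory) < 2:
            raise ValueError("a trajectory needs at least two consecutive states")
        library.extend(trajectory)
    return library


def reference_next_position(state: MotionState, dt: float = 0.1) -> float:
    """Assumed placeholder for the MPC-computed 'actual' state."""
    return state.position + state.velocity * dt


def train(library: List[MotionState], lr: float = 0.1,
          loss_threshold: float = 1e-6,
          max_iters: int = 10_000) -> Tuple[float, float, float]:
    """Gradient descent on mean squared error until the loss meets the threshold."""
    if not library:
        raise ValueError("the motion state library is empty")
    w_p, w_v = 0.0, 0.0  # initial trajectory model: w_p * position + w_v * velocity
    loss = float("inf")
    n = len(library)
    for _ in range(max_iters):
        loss = grad_p = grad_v = 0.0
        for s in library:
            # Estimated state minus the reference ("actual") state.
            err = (w_p * s.position + w_v * s.velocity) - reference_next_position(s)
            loss += err * err
            grad_p += 2.0 * err * s.position
            grad_v += 2.0 * err * s.velocity
        loss /= n
        if loss <= loss_threshold:  # preset condition on the model loss
            break
        # Correct the model parameters according to the loss gradient.
        w_p -= lr * grad_p / n
        w_v -= lr * grad_v / n
    return w_p, w_v, loss
```

With a reference step of `dt = 0.1`, the weights should approach `w_p = 1` and `w_v = 0.1`. A real system would replace both the linear model and the reference function with whatever trajectory model and model predictive control solver the implementer chooses.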
CN202210624231.4A 2022-06-02 2022-06-02 Trajectory planning method, device, electronic device and storage medium Active CN114967465B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210624231.4A CN114967465B (en) 2022-06-02 2022-06-02 Trajectory planning method, device, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210624231.4A CN114967465B (en) 2022-06-02 2022-06-02 Trajectory planning method, device, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN114967465A CN114967465A (en) 2022-08-30
CN114967465B true CN114967465B (en) 2025-04-08

Family

ID=82960257

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210624231.4A Active CN114967465B (en) 2022-06-02 2022-06-02 Trajectory planning method, device, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN114967465B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116125819B (en) * 2023-04-14 2023-07-07 智道网联科技(北京)有限公司 Track correction method, track correction device, electronic device and computer-readable storage medium
CN116989790A (en) * 2023-07-05 2023-11-03 中国电信股份有限公司技术创新中心 Positioning method, positioning device, computer equipment and storage medium
CN117874529B (en) * 2024-03-12 2024-05-28 浪潮电子信息产业股份有限公司 Motion trajectory prediction method, model training method, device, equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111121777A (en) * 2019-11-26 2020-05-08 北京三快在线科技有限公司 Unmanned equipment trajectory planning method and device, electronic equipment and storage medium
CN112306059A (en) * 2020-10-15 2021-02-02 北京三快在线科技有限公司 A training method, control method and device for a control model

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3955080B1 (en) * 2020-08-12 2024-01-17 Robert Bosch GmbH Method and device for socially aware model predictive control of a robotic device using machine learning
CN113867334B (en) * 2021-09-07 2023-05-05 华侨大学 Unmanned path planning method and system for mobile machinery
CN114527769B (en) * 2022-03-11 2024-10-01 深圳市优必选科技股份有限公司 Track planning method, track planning device, sports equipment and computer-readable storage medium

Also Published As

Publication number Publication date
CN114967465A (en) 2022-08-30

Similar Documents

Publication Publication Date Title
CN114967465B (en) Trajectory planning method, device, electronic device and storage medium
Wu et al. Human-guided reinforcement learning with sim-to-real transfer for autonomous navigation
Nguyen-Tuong et al. Using model knowledge for learning inverse dynamics
US11403513B2 (en) Learning motor primitives and training a machine learning system using a linear-feedback-stabilized policy
Chen et al. Delay-aware multi-agent reinforcement learning for cooperative and competitive environments
CN111638646B (en) Training method and device for walking controller of quadruped robot, terminal and storage medium
CN113050640A (en) Industrial robot path planning method and system based on generative adversarial network
CN117086886B (en) Robot dynamic error prediction method and system based on mechanism data hybrid driving
CN114115341A (en) A method and system for cooperative motion of an agent cluster
CN116483108A (en) State correction method and device for quadruped robot, electronic equipment and storage medium
Fahimi et al. An alternative closed-loop vision-based control approach for Unmanned Aircraft Systems with application to a quadrotor
CN116578080A (en) Local path planning method based on deep reinforcement learning
Tao et al. A multiobjective collaborative deep reinforcement learning algorithm for jumping optimization of bipedal robot
CN118752484A (en) Space trajectory planning method for manipulator joints based on model-predicted path integration
CN113485107A (en) Reinforcement learning robot control method and system based on consistency constraint modeling
Yang et al. Suicidal pedestrian: Generation of safety-critical scenarios for autonomous vehicles
Flad et al. Individual driver modeling via optimal selection of steering primitives
Zhang et al. Inverse model predictive control: Learning optimal control cost functions for MPC
CN118700133A (en) Path planning method for redundantly driven robotic arms based on DSAW offline reinforcement learning algorithm
CN118112991A (en) Position force hybrid control method for tracking unknown curved surface by robot
CN115576317B (en) Multi-pretightening point path tracking control method and system based on neural network
CN115293334B (en) Unmanned equipment control method based on model-based high-sample rate deep reinforcement learning
Baldauf et al. Iterative learning-based model predictive control for mobile robots in space applications
Zhao et al. Cooperation with Humans of Unknown Intentions in Confined Spaces Using the Stackelberg Friend-or-Foe Game
WO2023157235A1 (en) Arithmetic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant