CN114460841B - Foot robot multi-step controller generation method and computer readable storage medium - Google Patents
- Publication number
- CN114460841B
- Authority
- CN
- China
- Prior art keywords
- robot
- generating
- foot
- controller
- motion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
- G05B13/042—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Automation & Control Theory (AREA)
- Feedback Control In General (AREA)
- Manipulator (AREA)
Abstract
The application relates to the technical field of robot control and discloses a method for generating a multi-step controller of a foot robot, and a computer readable storage medium. The method comprises the following steps. S1: establishing a simulation environment using any physics engine. S2: inputting initial parameters and generating a reference motion using a function. S3: performing reinforcement learning training in the simulation environment using an optimization algorithm, with a reward function added so that the robot imitates the motion generated in step S2; the reward function includes a weighted term for the deviation between the actual state of the robot and the reference state from step S2. The controller is then output. Because reinforcement learning is highly random, it easily yields control algorithms that produce very unnatural motion. Generating a reference motion in step S2, and making the difference between the robot's actual motion and the reference motion one of the objectives in step S3, effectively avoids this problem.
Description
Technical Field
The invention relates to the technical field of robot control, and in particular to a method for generating a multi-step controller of a foot robot and a computer readable storage medium.
Background
Gait is the rhythm with which a legged animal or robot lifts and sets down its different legs during movement. Animals in nature adopt different optimal gaits in different environments and at different movement speeds. For quadrupedal locomotion, the number of legs and the number and complexity of possible gaits make it difficult to design a controller that drives a quadruped robot through several different gaits. Most robots, domestic and foreign, can be controlled in only one gait and can hardly adjust their gait automatically according to the environment, so their adaptability to different environments is insufficient. The present application is proposed to solve these problems.
Disclosure of Invention
The present invention provides a method for generating a multi-step controller of a foot robot and a computer readable storage medium to solve the above problems.
The invention is realized by the following technical scheme.
The invention relates to a method for generating a multi-step controller of a foot robot, which comprises the following steps:
S1: establishing a simulation environment using any physics engine;
S2: inputting initial parameters and generating a reference motion using a function;
S3: performing reinforcement learning training in the simulation environment using an optimization algorithm, adding a reward function, imitating the motion generated in step S2, taking the difference between the actual motion of the robot and the reference motion from step S2 as one of the objectives, and then outputting the controller.
Further, step S2 uses a plurality of phases, each phase corresponding to the periodic motion of one leg; the phases may be generated in the same way or in different ways.
Further, the function for generating the reference motion in step S2 is:
[Equation image not present in the source text.]
where $p_i$ is the position of the foot, given as the three-dimensional coordinates of the foot; the other quantities in the function are the phase, the step length, the period $T$, the ground-contact time duty ratio $\beta$, the movement speeds $v_x$ and $v_y$, and the leg-lift height $h$.
Further, the policy trained by reinforcement learning in step S3 is a long short-term memory (LSTM) neural network.
Further, the optimization algorithm in step S3 is one of a policy gradient algorithm, Q-Learning, or DQN.
Further, the policy gradient algorithm includes the PPO or TRPO algorithm.
Further, the reward function in step S3 is the sum of the products of several motion state values of the robot and their weight coefficients.
Further, the reward function is specifically $r = w_1 r_\tau + w_2 r_v + w_3 r_j + w_4 r_b + w_5 r_h + w_6 r_c$, where $w_1$ to $w_6$ are weights, $r_j$ is the deviation term between the actual state of the robot and the reference state generated in step S2, $r_\tau$ is a term encouraging the robot to save motor torque, $r_v$ is a term encouraging the actual speed of the robot to follow the commanded speed, $r_b$ is a term encouraging a stable body posture, $r_h$ is a term encouraging the body height to remain stable at a set value, and $r_c$ is a term encouraging the ground-contact state of each foot to match the contact state of the reference motion.
Further, in step S3, the size and mass of each component of the robot, the contact friction, and the time delay are randomized.
A computer readable storage medium having stored therein a computer program or code set which when executed by a processor implements the foot robot multi-step controller generation method described above.
The invention has the beneficial effects that:
Because reinforcement learning is highly random, it easily yields control algorithms that produce very unnatural motion. Generating a reference motion in step S2, and making the difference between the robot's actual motion and the reference motion one of the objectives in step S3, effectively avoids this problem.
The method for generating the reference motion in step S2 is very simple. The prior art holds that control based on such a simple geometric reference motion performs poorly, and that a simplified dynamics model of the robot must be built and a model-based control algorithm used to obtain a high-quality controller; building such a controller, however, is difficult.
The application uses a long short-term memory (LSTM) network, whose structure can fully mine the information implicit in time-series data. For this problem, the LSTM network can automatically identify the system characteristics of the robot from the time-series information produced as the robot moves, so the controller can be transferred directly from the simulation environment to the physical robot without additional tuning.
Drawings
In order to illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.
The invention will be further described with reference to the drawings and examples.
Fig. 1 is a schematic flow chart of a method for generating a multi-step controller of a foot robot according to the present invention.
Detailed Description
The present invention will be described in detail with reference to fig. 1.
The invention relates to a method for generating a multi-step controller of a foot robot, which comprises the following steps:
S1: establishing a simulation environment using any physics engine. For example, a fast simulation environment for multi-legged robots is built with the open-source Bullet physics engine. It contains several sets of robot physical models and the physical properties of the surrounding environment, and the sensor information of the robots is added to the simulation environment as plug-ins and displayed visually.
S2: inputting initial parameters and generating a reference motion using a function. The function for generating the reference motion is:
[Equation image not present in the source text.]
where $p_i$ is the position of the foot, given as the three-dimensional coordinates of the foot; the other quantities in the function are the phase, the step length, the period $T$, the ground-contact time duty ratio $\beta$, the movement speeds $v_x$ and $v_y$, and the leg-lift height $h$. The initial parameters input in step S2 comprise the robot's sensor data, control commands, and phase information. The sensor data include motor positions and velocities, the robot's attitude, and the velocity of the robot's center of mass. The control commands include the target linear velocity and angular velocity. For the phase information, taking a quadruped robot as an example, 4 phases are used, each corresponding to the periodic motion of one leg. During operation each phase cycles continuously from 0 to 2π, and the sin and cos of each phase are computed to give the final input quantities.
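The reference-motion equation itself is not reproduced in this text, so the following is only a minimal sketch of what a phase-driven reference foot trajectory and the sin/cos phase encoding could look like. The stance/swing shape, the step length taken as speed times stance duration, and every function name are assumptions made for illustration, not the patented function.

```python
import math

TWO_PI = 2.0 * math.pi


def advance_phase(phase, dt, period):
    """Advance a leg phase by dt seconds, wrapping it into [0, 2*pi)."""
    return (phase + TWO_PI * dt / period) % TWO_PI


def encode_phase(phase):
    """Return the (sin, cos) pair used as policy input instead of the raw phase."""
    return (math.sin(phase), math.cos(phase))


def reference_foot_position(phase, period, beta, vx, vy, h):
    """Hypothetical reference foot offset (x, y, z) around a neutral point.

    Assumed shape (not taken from the patent), valid for 0 < beta < 1:
    - stance, the first `beta` of the cycle: the foot stays on the ground and
      slides from +half a step to -half a step relative to the body;
    - swing, the rest of the cycle: the foot returns forward while being
      lifted along a half sine wave with peak height `h`.
    The step length is assumed to be speed * stance duration.
    """
    s = phase / TWO_PI                 # cycle fraction in [0, 1)
    step_x = vx * beta * period        # assumed step length along x
    step_y = vy * beta * period        # assumed step length along y
    if s < beta:                       # stance
        u = s / beta
        k = 0.5 - u
        z = 0.0
    else:                              # swing
        u = (s - beta) / (1.0 - beta)
        k = u - 0.5
        z = h * math.sin(math.pi * u)
    return (k * step_x, k * step_y, z)
```

With a 0.5 s period, a duty ratio of 0.5, and a forward speed of 1 m/s, the foot in this sketch starts the stance 0.125 m ahead of its neutral point and reaches the full lift height `h` halfway through the swing. Feeding `encode_phase` outputs to the policy avoids the discontinuity of the raw phase at the 2π wrap-around.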
S3: performing reinforcement learning training in the simulation environment using an optimization algorithm. A complete reinforcement learning setup comprises the simulation environment, a policy (also called the controller or control algorithm, i.e. the mathematical mapping from input to output), a state space and an action space (i.e. the input and output of the policy), a reward function (or objective function), and the optimization algorithm. With reinforcement learning, trial and error is repeated in the simulation environment and the optimization algorithm determines the parameters automatically.
The optimization algorithm is specifically a policy gradient algorithm such as PPO or TRPO; Q-Learning, DQN, and similar algorithms can also be used. A reward function is added, which is the sum of the products of several motion state values of the robot and their weight coefficients. Preferably, the reward function is $r = w_1 r_\tau + w_2 r_v + w_3 r_j + w_4 r_b + w_5 r_h + w_6 r_c$, where $w_1$ to $w_6$ are weights, $r_j$ is the deviation term between the actual state of the robot and the reference state generated in step S2, $r_\tau$ is a term encouraging the robot to save motor torque, $r_v$ is a term encouraging the actual speed of the robot to follow the commanded speed, $r_b$ is a term encouraging a stable body posture, $r_h$ is a term encouraging the body height to remain stable at a set value, and $r_c$ is a term encouraging the ground-contact state of each foot to match the contact state of the reference motion. In this way the difference between the robot's actual motion and the reference motion from step S2 is one of the objectives, and the controller is then output.
The policy trained by reinforcement learning in step S3 is a long short-term memory (LSTM) neural network, whose structure can fully mine the information implicit in time-series data. For this problem, the LSTM network can automatically identify the system characteristics of the robot from the time-series information produced as the robot moves, so the controller can be transferred directly from the simulation environment to the physical robot without additional tuning.
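The weighted-sum reward described for step S3 can be sketched as follows. The text gives only the overall form $r = \sum_k w_k r_k$ and the purpose of each term, so the concrete shapes of the two example terms (an exponential of the squared deviation for $r_j$, a fraction of matching feet for $r_c$) and all names are assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class RewardWeights:
    torque: float     # w1, weight on r_tau
    velocity: float   # w2, weight on r_v
    reference: float  # w3, weight on r_j
    posture: float    # w4, weight on r_b
    height: float     # w5, weight on r_h
    contact: float    # w6, weight on r_c


def total_reward(w, r_tau, r_v, r_j, r_b, r_h, r_c):
    """Sum of each motion-state term multiplied by its weight."""
    return (w.torque * r_tau + w.velocity * r_v + w.reference * r_j
            + w.posture * r_b + w.height * r_h + w.contact * r_c)


def reference_tracking_term(actual, reference, scale=1.0):
    """Hypothetical r_j: 1.0 when the state equals the reference, decaying
    toward 0 as the squared deviation grows."""
    squared_error = sum((a - b) ** 2 for a, b in zip(actual, reference))
    return math.exp(-scale * squared_error)


def contact_match_term(actual_contacts, reference_contacts):
    """Hypothetical r_c: fraction of feet whose ground-contact flag matches
    the contact flag of the reference motion."""
    matches = sum(a == b for a, b in zip(actual_contacts, reference_contacts))
    return matches / len(reference_contacts)
```

Keeping each term bounded in [0, 1], as the two example terms are, makes the weights directly comparable; whether the patent's terms are bounded this way is not stated in the text.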
S4: transferring the controller obtained in step S3 to the physical robot. The controller outputs the target position of each joint of the robot, which realizes control of the robot: the robot follows the speed command in the gait specified by the phases and can switch between multiple gaits.
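One way to read "the gait specified by the phases" is that each gait is a set of fixed phase offsets between the legs, so switching gait only means switching offsets. The sketch below illustrates that reading; the offset table, the leg order, and the function names are assumptions and do not come from the patent text.

```python
import math

TWO_PI = 2.0 * math.pi

# Hypothetical per-leg phase offsets, as fractions of one cycle.
# Leg order: front-left, front-right, rear-left, rear-right.
GAIT_OFFSETS = {
    "trot": (0.0, 0.5, 0.5, 0.0),   # diagonal legs move together
    "pace": (0.0, 0.5, 0.0, 0.5),   # legs on the same side move together
    "bound": (0.0, 0.0, 0.5, 0.5),  # front pair and rear pair alternate
}


def leg_phases(base_phase, gait):
    """Return the four leg phases in [0, 2*pi) for the chosen gait."""
    return tuple((base_phase + TWO_PI * offset) % TWO_PI
                 for offset in GAIT_OFFSETS[gait])


def reference_contacts(phases, beta):
    """A foot is in ground contact while its cycle fraction is below the
    ground-contact duty ratio beta."""
    return tuple(phase / TWO_PI < beta for phase in phases)
```

For example, `reference_contacts(leg_phases(0.1, "trot"), 0.5)` marks the front-left and rear-right feet as in contact, while the same base phase with `"pace"` marks the two left feet.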
In step S3, parameters such as the size and mass of each component of the robot, the contact friction, and the time delay are randomized to simulate real conditions such as hardware manufacturing errors, ground conditions, and system latency. This enhances robustness and raises the success rate of transfer from simulation to the physical robot.
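The randomization described above can be sketched as sampling a fresh set of simulation parameters at the start of every training episode. The parameter names and the numeric ranges below are assumptions chosen for illustration; the text names only the randomized quantities, not their ranges.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    link_scale: float  # multiplier on component dimensions
    mass_scale: float  # multiplier on component masses
    friction: float    # foot-ground friction coefficient
    latency_s: float   # delay between command and actuation, in seconds


def sample_sim_params(rng):
    """Draw one randomized parameter set (all ranges are assumed values)."""
    return SimParams(
        link_scale=rng.uniform(0.95, 1.05),
        mass_scale=rng.uniform(0.8, 1.2),
        friction=rng.uniform(0.4, 1.2),
        latency_s=rng.uniform(0.0, 0.04),
    )
```

Passing an explicit `random.Random(seed)` keeps a training run reproducible: the same seed yields the same sequence of randomized environments.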
A computer readable storage medium having stored therein a computer program or code set which when executed by a processor implements the foot robot multi-step controller generation method described above.
The above embodiments are only for illustrating the technical concept and features of the present invention, and are intended to enable those skilled in the art to understand the present invention and implement it without limiting the scope of the present invention. All equivalent changes or modifications made in accordance with the spirit of the present invention should be construed to be included in the scope of the present invention.
Claims (8)
1. A method for generating a multi-step controller of a foot robot, characterized by comprising the following steps:
S1: establishing a simulation environment using any physics engine;
S2: inputting initial parameters and generating a reference motion using a function;
S3: performing reinforcement learning training in the simulation environment using an optimization algorithm, adding a reward function, and imitating the motion generated in step S2, wherein the reward function includes a weighted deviation between the actual state of the robot and the reference state from step S2, and then outputting the controller;
wherein step S2 uses a plurality of phases, each phase corresponding to the periodic motion of one leg, and the phases are generated in the same way or in different ways;
and the function for generating the reference motion in step S2 is:
[Equation image not present in the source text.]
where $p_i$ is the position of the foot, given as the three-dimensional coordinates of the foot; the other quantities in the function are the phase, the step length, the period $T$, the ground-contact time duty ratio $\beta$, the movement speeds $v_x$ and $v_y$, and the leg-lift height $h$.
2. The method for generating a multi-step controller of a foot robot according to claim 1, characterized in that: the policy trained by reinforcement learning in step S3 is a long short-term memory neural network.
3. The method for generating a multi-step controller of a foot robot according to claim 2, characterized in that: the optimization algorithm in step S3 is one of a policy gradient algorithm, Q-Learning, or DQN.
4. The method for generating a multi-step controller of a foot robot according to claim 3, characterized in that: the policy gradient algorithm comprises the PPO or TRPO algorithm.
5. The method for generating a multi-step controller of a foot robot according to claim 4, characterized in that: the reward function in step S3 is the sum of the products of several motion state values of the robot and their weight coefficients.
6. The method for generating a multi-step controller of a foot robot according to claim 5, characterized in that: the reward function is specifically $r = w_1 r_\tau + w_2 r_v + w_3 r_j + w_4 r_b + w_5 r_h + w_6 r_c$, where $w_1$ to $w_6$ are weights, $r_j$ is the deviation term between the actual state of the robot and the reference state generated in step S2, $r_\tau$ is a term encouraging the robot to save motor torque, $r_v$ is a term encouraging the actual speed of the robot to follow the commanded speed, $r_b$ is a term encouraging a stable body posture, $r_h$ is a term encouraging the body height to remain stable at a set value, and $r_c$ is a term encouraging the ground-contact state of each foot to match the contact state of the reference motion.
7. The method for generating a multi-step controller of a foot robot according to any one of claims 3 to 6, characterized in that: in step S3, the size and mass of each component of the robot, the contact friction, and the time delay are randomized.
8. A computer readable storage medium, characterized in that: the computer readable storage medium has stored therein a computer program or code set which, when executed by a processor, implements the method for generating a multi-step controller of a foot robot according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111534719.XA CN114460841B (en) | 2021-12-15 | 2021-12-15 | Foot robot multi-step controller generation method and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111534719.XA CN114460841B (en) | 2021-12-15 | 2021-12-15 | Foot robot multi-step controller generation method and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114460841A (en) | 2022-05-10 |
CN114460841B (en) | 2024-10-15 |
Family
ID=81406630
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111534719.XA Active CN114460841B (en) | 2021-12-15 | 2021-12-15 | Foot robot multi-step controller generation method and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114460841B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115016325A (en) * | 2022-07-04 | 2022-09-06 | 北京化工大学 | Gait self-learning method for foot type robot |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112596534A (en) * | 2020-12-04 | 2021-04-02 | 杭州未名信科科技有限公司 | Gait training method and device for quadruped robot based on deep reinforcement learning, electronic equipment and medium |
CN112684794A (en) * | 2020-12-07 | 2021-04-20 | 杭州未名信科科技有限公司 | Foot type robot motion control method, device and medium based on meta reinforcement learning |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3768472A1 (en) * | 2018-04-22 | 2021-01-27 | Google LLC | Systems and methods for learning agile locomotion for multiped robots |
CN111913490B (en) * | 2020-08-18 | 2023-11-14 | 山东大学 | Four-foot robot dynamic gait stability control method and system based on foot falling adjustment |
CN112631131A (en) * | 2020-12-19 | 2021-04-09 | 北京化工大学 | Motion control self-generation and physical migration method for quadruped robot |
CN113771081B (en) * | 2021-07-06 | 2024-04-30 | 清华大学 | Physical-based virtual human hand automatic grabbing method and device |
Also Published As
Publication number | Publication date |
---|---|
CN114460841A (en) | 2022-05-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||