
CN114460841B - Foot robot multi-step controller generation method and computer readable storage medium - Google Patents

Info

Publication number
CN114460841B
Authority
CN
China
Prior art keywords
robot
generating
foot
controller
motion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111534719.XA
Other languages
Chinese (zh)
Other versions
CN114460841A (en)
Inventor
王宏涛
邵烨程
金永斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZJU Hangzhou Global Scientific and Technological Innovation Center
Original Assignee
ZJU Hangzhou Global Scientific and Technological Innovation Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZJU Hangzhou Global Scientific and Technological Innovation Center
Priority to CN202111534719.XA
Publication of CN114460841A
Application granted
Publication of CN114460841B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion, electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion, electric, involving the use of models or simulators
    • G05B13/042 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion, electric, involving the use of models or simulators, in which a parameter or coefficient is automatically adjusted to optimise the performance

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Feedback Control In General (AREA)
  • Manipulator (AREA)

Abstract

The application relates to the technical field of robot control and discloses a method for generating a multi-step controller of a foot robot, together with a computer readable storage medium. The method comprises the following steps: S1: establishing a simulation environment using any physics engine; S2: inputting initial parameters and generating a reference motion using a function; S3: performing reinforcement learning training in the simulation environment using an optimization algorithm, with a reward function added so that the robot imitates the motion generated in step S2, the reward function including a weighted term for the deviation of the robot's actual state from the reference state of step S2, and then outputting the controller. Because reinforcement learning is highly random, it easily produces a control algorithm whose motion is very unnatural. Generating a reference motion in step S2, and taking the difference between the robot's actual motion and the reference motion as one of the objectives in step S3, effectively avoids this problem.

Description

Foot robot multi-step controller generation method and computer readable storage medium
Technical Field
The invention relates to the technical field of robot control, in particular to a method for generating a multi-step controller of a foot robot and a computer readable storage medium.
Background
The rhythm with which a legged animal or robot lifts and lands its different legs during movement is called its gait. Animals in nature adopt different optimal gaits in different environments and at different movement speeds. In quadrupedal locomotion, the large number of legs makes the possible gaits numerous and complex, so it is difficult to design a controller that drives a quadruped robot through several different gaits. Most robots at home and abroad can control motion in only one gait, and it is difficult for them to adjust their gait automatically according to the environment, which leaves them poorly adapted to different environments. The present application is provided to solve these problems.
Disclosure of Invention
The present invention is directed to a method for generating a multi-step controller of a foot robot and a computer readable storage medium for solving the above problems.
The invention is realized by the following technical scheme.
The invention relates to a method for generating a multi-step controller of a foot robot, which comprises the following steps:
S1: establishing a simulation environment using any physics engine;
S2: inputting initial parameters and generating a reference motion using a function;
S3: performing reinforcement learning training in the simulation environment using an optimization algorithm, with a reward function added so that the robot imitates the motion generated in step S2, taking the difference between the actual motion of the robot and the reference motion of step S2 as one of the objectives, and then outputting the controller.
Further, in the step S2, a plurality of phases are used, each phase corresponds to a periodic motion of one leg, and each phase is generated in the same or different manners.
Further, the function for generating the reference motion in step S2 is:
[Equation not preserved in this text.]
The quantities in the function are: $p_i$, the position of the foot, given as three-dimensional coordinates; the phase; the step length; $T$, the period; $\beta$, the fraction of the period during which the foot is on the ground; $v_x$ and $v_y$, the components of the movement speed; and $h$, the leg-lift height.
Further, the policy trained by reinforcement learning in step S3 is a long short-term memory (LSTM) neural network.
Further, the optimization algorithm in step S3 is one of a policy gradient algorithm, Q-Learning, or DQN.
Further, the policy gradient algorithm includes the PPO or TRPO algorithm.
Further, the reward function in step S3 is the sum of the products of several motion state values of the robot and their weight coefficients.
Further, the reward function is specifically $r = w_1 r_\tau + w_2 r_v + w_3 r_j + w_4 r_b + w_5 r_h + w_6 r_c$, where $w_1$ to $w_6$ are weights, $r_j$ is a term for the deviation of the robot's actual state from the reference state generated in step S2, $r_\tau$ is a term encouraging the robot to save motor torque, $r_v$ is a term encouraging the robot's actual speed to follow the commanded speed, $r_b$ is a term encouraging a stable body posture, $r_h$ is a term encouraging the body height to stay at a set value, and $r_c$ is a term encouraging the contact state of each foot with the ground to match the contact state of the reference motion.
Further, in step S3, the size and mass of each component of the robot, the contact friction, and the time delay are randomized.
A computer readable storage medium having stored therein a computer program or code set which when executed by a processor implements the foot robot multi-step controller generation method described above.
The invention has the beneficial effects that:
Because reinforcement learning is highly random, it easily produces a control algorithm whose motion is very unnatural. Generating a reference motion in step S2, and taking the difference between the robot's actual motion and the reference motion as one of the objectives in step S3, effectively avoids this problem.
The method for generating the reference motion in step S2 is very simple. The prior art holds that a reference motion built from simple geometry gives poor control, and that a simplified dynamics model of the robot must be established and a model-based control algorithm used to obtain a high-quality controller; such a controller, however, is difficult to build.
The application uses a long short-term memory network, a structure that can fully mine the implicit information in time-series data. For this problem, the network can use the time-series information produced while the robot moves to identify the system characteristics of the robot automatically, so that the controller can be transferred directly from the simulation environment to the physical robot without additional tuning.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are used in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below are only some embodiments of the invention, and that other drawings can be obtained from these drawings without inventive effort for a person skilled in the art.
The invention will be further described with reference to the drawings and examples.
Fig. 1 is a schematic flow chart of a method for generating a multi-step controller of a foot robot according to the present invention.
Detailed Description
The present invention will be described in detail with reference to fig. 1.
The invention relates to a method for generating a multi-step controller of a foot robot, which comprises the following steps:
S1: Establish a simulation environment using any physics engine. For example, a fast simulation environment for multi-legged robots is built with the open-source Bullet physics engine. It contains several sets of robot physical models and the physical properties of the surrounding environment, and the sensor information of the robots is added to the simulation environment as plug-ins and displayed visually.
S2: Input initial parameters and generate a reference motion using a function. The function for generating the reference motion is:
[Equation not preserved in this text.]
The quantities in the function are: $p_i$, the position of the foot, given as three-dimensional coordinates; the phase; the step length; $T$, the period; $\beta$, the fraction of the period during which the foot is on the ground; $v_x$ and $v_y$, the components of the movement speed; and $h$, the leg-lift height.
The initial parameters input in step S2 comprise the robot's sensor data, control commands, and phase information. The sensor data comprise motor positions and velocities, the robot's attitude, and the velocity of its center of mass. The control commands comprise the target linear velocity and angular velocity. For the phase information, taking a quadruped robot as an example, 4 phases are used, each corresponding to the periodic motion of one leg. During operation each phase cycles continuously from 0 to 2π, and the sin and cos of each phase are computed to obtain the final input quantities.
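As a rough illustration of the phase inputs described in step S2, the sketch below advances one phase per leg, wraps it to [0, 2π), and encodes each phase as a (sin, cos) pair. The leg order, the per-leg offsets (a trot here), and all function names are assumptions made for illustration. The patent's own reference-motion function is not reproduced in this text, so `reference_foot_position` is only a stand-in: a simple geometric trajectory with a straight-line stance and a half-sine swing, built from the period T, the duty ratio β, the speed (v_x, v_y), and the lift height h.

```python
import math

TWO_PI = 2.0 * math.pi

# Hypothetical per-leg phase offsets for a trot: diagonal legs move together.
# Leg order (assumed): front-left, front-right, rear-left, rear-right.
TROT_OFFSETS = (0.0, math.pi, math.pi, 0.0)


def leg_phases(t, period, offsets=TROT_OFFSETS):
    """Return one phase per leg, each cycling continuously over [0, 2*pi)."""
    base = TWO_PI * (t / period)
    return [(base + off) % TWO_PI for off in offsets]


def encode_phases(phases):
    """Encode each phase as (sin, cos) so the input is continuous at wrap-around."""
    encoded = []
    for phi in phases:
        encoded.extend((math.sin(phi), math.cos(phi)))
    return encoded


def reference_foot_position(phi, period, beta, vx, vy, h):
    """Stand-in geometric reference (NOT the patent's function).

    The foot is on the ground for the first `beta` fraction of the cycle and
    moves backward relative to the body; it then swings forward along a
    half-sine arc of height `h`. Returns (x, y, z) relative to the foot's
    neutral position.
    """
    s = phi / TWO_PI                   # normalized cycle position in [0, 1)
    step_x = vx * beta * period        # distance covered during stance
    step_y = vy * beta * period
    if s < beta:                       # stance: slide from +step/2 to -step/2
        u = s / beta
        return (step_x * (0.5 - u), step_y * (0.5 - u), 0.0)
    u = (s - beta) / (1.0 - beta)      # swing: return from -step/2 to +step/2
    return (step_x * (u - 0.5), step_y * (u - 0.5), h * math.sin(math.pi * u))


# Example: policy inputs and one foot target 0.1 s into a 0.5 s gait cycle.
phases = leg_phases(t=0.1, period=0.5)
print(encode_phases(phases))
print(reference_foot_position(phases[0], 0.5, 0.5, 0.4, 0.0, 0.05))
```

The (sin, cos) pair avoids the jump from 2π back to 0 that a raw phase value would present to the network, which is one plausible reason for the encoding named in the text.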
S3: Perform reinforcement learning training in the simulation environment using an optimization algorithm. A complete reinforcement learning setup comprises the simulation environment; the policy (also called the controller or control algorithm, i.e. the mathematical form of the mapping from input to output); the state space and action space (i.e. the input and output of the policy); the reward function (or objective function); and the optimization algorithm. The reinforcement learning method repeats trial and error in the simulation environment, and the optimization algorithm determines the parameters automatically.
The optimization algorithm is specifically a policy gradient algorithm such as PPO or TRPO; algorithms such as Q-Learning or DQN can also be used. A reward function is added, which is the sum of the products of several motion state values of the robot and their weight coefficients. Preferably, the reward function is $r = w_1 r_\tau + w_2 r_v + w_3 r_j + w_4 r_b + w_5 r_h + w_6 r_c$, where $w_1$ to $w_6$ are weights, $r_j$ is a term for the deviation of the robot's actual state from the reference state generated in step S2, $r_\tau$ is a term encouraging the robot to save motor torque, $r_v$ is a term encouraging the robot's actual speed to follow the commanded speed, $r_b$ is a term encouraging a stable body posture, $r_h$ is a term encouraging the body height to stay at a set value, and $r_c$ is a term encouraging the contact state of each foot with the ground to match the contact state of the reference motion. In this way the difference between the robot's actual motion and the reference motion of step S2 is taken as one of the objectives, and the controller is then output.
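The weighted-sum reward can be sketched as follows. Only the form of the sum and the meaning of the six terms come from the text; the weight values, the dictionary keys, and the shapes chosen for the imitation term r_j and the contact term r_c are assumptions made for illustration.

```python
# Hypothetical weights w1..w6; the patent does not give numeric values.
WEIGHTS = {
    "torque": 0.05,     # w1, multiplies r_tau (motor torque saving)
    "velocity": 1.0,    # w2, multiplies r_v (commanded speed tracking)
    "imitation": 0.5,   # w3, multiplies r_j (deviation from reference state)
    "posture": 0.3,     # w4, multiplies r_b (body posture stability)
    "height": 0.3,      # w5, multiplies r_h (body height stability)
    "contact": 0.2,     # w6, multiplies r_c (contact state matches reference)
}


def imitation_term(actual_state, reference_state):
    """Assumed shape for r_j: negative squared deviation from the reference."""
    return -sum((a - b) ** 2 for a, b in zip(actual_state, reference_state))


def contact_term(actual_contacts, reference_contacts):
    """Assumed shape for r_c: fraction of feet whose contact state matches."""
    matches = sum(a == b for a, b in zip(actual_contacts, reference_contacts))
    return matches / len(reference_contacts)


def total_reward(terms, weights=WEIGHTS):
    """Sum of each term multiplied by its weight coefficient."""
    return sum(weights[name] * value for name, value in terms.items())


# Example step: the other four term values are placeholders.
terms = {
    "torque": -0.2,
    "velocity": 0.9,
    "imitation": imitation_term([0.10, 0.52], [0.12, 0.50]),
    "posture": 0.8,
    "height": 0.95,
    "contact": contact_term([True, False, False, True], [True, False, True, True]),
}
print(total_reward(terms))
```

Raising the weight on the imitation term pulls the learned motion toward the reference, while lowering it leaves more freedom to the other objectives; how the authors balanced these is not stated.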
The policy trained by reinforcement learning in step S3 is a long short-term memory (LSTM) neural network. This network structure can fully mine the implicit information in time-series data. For this problem, the network can use the time-series information produced while the robot moves to identify the system characteristics of the robot automatically, so that the controller can be transferred directly from the simulation environment to the physical robot without additional tuning.
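To illustrate how a recurrent policy carries information across time steps, the sketch below runs a single LSTM cell over a short sequence of observations in plain Python. The sizes, the random weights, and the function names are assumptions. A real policy would be trained with a deep learning library, and the hidden state would feed a further layer that produces the joint targets.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm(input_size, hidden_size, seed=0):
    """Random weights for the four gates: input, forget, candidate, output."""
    rng = random.Random(seed)
    cols = input_size + hidden_size
    return {
        gate: {
            "W": [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                  for _ in range(hidden_size)],
            "b": [0.0] * hidden_size,
        }
        for gate in ("i", "f", "g", "o")
    }


def lstm_step(params, x, h, c):
    """One time step: returns the new hidden state h and cell state c."""
    z = list(x) + list(h)  # concatenate current input and previous hidden state

    def affine(gate):
        p = params[gate]
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(p["W"], p["b"])]

    i = [sigmoid(a) for a in affine("i")]    # how much new information to write
    f = [sigmoid(a) for a in affine("f")]    # how much old memory to keep
    g = [math.tanh(a) for a in affine("g")]  # candidate memory content
    o = [sigmoid(a) for a in affine("o")]    # how much memory to expose
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


# Feed a short observation sequence; the state (h, c) carries history forward.
params = make_lstm(input_size=3, hidden_size=4)
h, c = [0.0] * 4, [0.0] * 4
for obs in ([0.1, 0.0, -0.2], [0.2, 0.1, -0.1], [0.3, 0.1, 0.0]):
    h, c = lstm_step(params, obs, h, c)
print(h)
```

Because the same observation yields a different output depending on the stored state, the network can in principle respond to how the robot has been behaving over recent steps, not only to its current reading.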
S4: Transfer the controller obtained in step S3 to the physical robot. The controller outputs the target position of each joint of the robot, which realizes control of the robot: the robot follows the speed command in the gait specified by the phases and can switch among multiple gaits.
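One way to picture gait switching through the phase inputs is sketched below. The gait names, the offset table, and the rule that a foot is in stance during the first β fraction of its cycle are assumptions drawn from common quadruped conventions, not values given in the patent.

```python
import math

TWO_PI = 2.0 * math.pi

# Assumed leg order: front-left, front-right, rear-left, rear-right.
# Offsets are fractions of a cycle; these are textbook quadruped patterns,
# not values taken from the patent.
GAITS = {
    "trot": (0.0, 0.5, 0.5, 0.0),   # diagonal pairs alternate
    "pace": (0.0, 0.5, 0.0, 0.5),   # left pair and right pair alternate
    "bound": (0.0, 0.0, 0.5, 0.5),  # front pair and rear pair alternate
}


def leg_phases(base_phase, gait):
    """Phase of each leg in [0, 2*pi) for the selected gait."""
    return [(base_phase + TWO_PI * off) % TWO_PI for off in GAITS[gait]]


def reference_contacts(phases, beta):
    """A foot is in stance during the first `beta` fraction of its cycle."""
    return [phi / TWO_PI < beta for phi in phases]


# Switching gait only changes which offsets feed the phase inputs.
for gait in ("trot", "pace", "bound"):
    print(gait, reference_contacts(leg_phases(0.25 * TWO_PI, gait), beta=0.5))
```

Under these assumptions a single trained controller receives a different phase pattern for each gait, and the reference contact pattern it is rewarded for matching changes accordingly.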
In step S3, parameters such as the size and mass of each component of the robot, the contact friction, and the delay are randomized to simulate real conditions such as hardware manufacturing errors, ground conditions, and system delay. This enhances robustness and improves the success rate of transfer from simulation to the physical robot.
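A minimal sketch of this per-episode randomization follows, assuming hypothetical ranges (the patent names the randomized quantities but gives no distributions). Each training episode draws a fresh set of physical parameters that would then be applied to the simulated robot before the rollout.

```python
import random
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    link_scale: float   # multiplier on nominal component size
    mass_scale: float   # multiplier on nominal component mass
    friction: float     # foot-ground contact friction coefficient
    latency_s: float    # control delay in seconds


# Hypothetical sampling ranges; tune to the hardware tolerances at hand.
RANGES = {
    "link_scale": (0.95, 1.05),
    "mass_scale": (0.8, 1.2),
    "friction": (0.4, 1.2),
    "latency_s": (0.0, 0.04),
}


def sample_episode_params(rng):
    """Draw one independent uniform sample for every randomized quantity."""
    return EpisodeParams(**{k: rng.uniform(lo, hi) for k, (lo, hi) in RANGES.items()})


rng = random.Random(42)  # seeded so a training run is reproducible
for episode in range(3):
    params = sample_episode_params(rng)
    # In training, these values would be written into the simulator here.
    print(episode, params)
```

Uniform sampling is only one choice; the key point from the text is that the policy never sees exactly the same robot twice, so it cannot overfit to one set of simulated parameters.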
A computer readable storage medium having stored therein a computer program or code set which when executed by a processor implements the foot robot multi-step controller generation method described above.
The above embodiments are only for illustrating the technical concept and features of the present invention, and are intended to enable those skilled in the art to understand the present invention and implement it without limiting the scope of the present invention. All equivalent changes or modifications made in accordance with the spirit of the present invention should be construed to be included in the scope of the present invention.

Claims (8)

1. A method for generating a multi-step controller of a foot robot, characterized by comprising the following steps:
S1: establishing a simulation environment using any physics engine;
S2: inputting initial parameters and generating a reference motion using a function;
S3: performing reinforcement learning training in the simulation environment using an optimization algorithm, with a reward function added so that the robot imitates the motion generated in step S2, the reward function including a weighted term for the deviation of the robot's actual state from the reference state of step S2, and then outputting the controller;
in step S2, a plurality of phases are used, each phase corresponding to the periodic motion of one leg, and the phases being generated in the same or different manners;
the function for generating the reference motion in step S2 is:
[Equation not preserved in this text.]
where the quantities in the function are: $p_i$, the position of the foot, given as three-dimensional coordinates; the phase; the step length; $T$, the period; $\beta$, the fraction of the period during which the foot is on the ground; $v_x$ and $v_y$, the components of the movement speed; and $h$, the leg-lift height.
2. The method for generating the multi-step controller of the foot robot according to claim 1, wherein the policy trained by reinforcement learning in step S3 is a long short-term memory neural network.
3. The method for generating the multi-step controller of the foot robot according to claim 2, wherein the optimization algorithm in step S3 is one of a policy gradient algorithm, Q-Learning, or DQN.
4. The method for generating the multi-step controller of the foot robot according to claim 3, wherein the policy gradient algorithm comprises the PPO or TRPO algorithm.
5. The method for generating the multi-step controller of the foot robot according to claim 4, wherein the reward function in step S3 is the sum of the products of several motion state values of the robot and their weight coefficients.
6. The method for generating the multi-step controller of the foot robot according to claim 5, wherein the reward function is specifically $r = w_1 r_\tau + w_2 r_v + w_3 r_j + w_4 r_b + w_5 r_h + w_6 r_c$, where $w_1$ to $w_6$ are weights, $r_j$ is a term for the deviation of the robot's actual state from the reference state generated in step S2, $r_\tau$ is a term encouraging the robot to save motor torque, $r_v$ is a term encouraging the robot's actual speed to follow the commanded speed, $r_b$ is a term encouraging a stable body posture, $r_h$ is a term encouraging the body height to stay at a set value, and $r_c$ is a term encouraging the contact state of each foot with the ground to match the contact state of the reference motion.
7. The method for generating the multi-step controller of the foot robot according to any one of claims 3 to 6, wherein in step S3 the size and mass of each component of the robot, the contact friction, and the time delay are randomized.
8. A computer-readable storage medium, characterized by: the computer readable storage medium has stored therein a computer program or a set of codes, which when executed by a processor, implements the method for generating a multi-step controller of a foot robot according to any one of claims 1-7.
CN202111534719.XA 2021-12-15 2021-12-15 Foot robot multi-step controller generation method and computer readable storage medium Active CN114460841B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111534719.XA CN114460841B (en) 2021-12-15 2021-12-15 Foot robot multi-step controller generation method and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111534719.XA CN114460841B (en) 2021-12-15 2021-12-15 Foot robot multi-step controller generation method and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN114460841A (en) 2022-05-10
CN114460841B (en) 2024-10-15

Family

ID=81406630

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111534719.XA Active CN114460841B (en) 2021-12-15 2021-12-15 Foot robot multi-step controller generation method and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114460841B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115016325A (en) * 2022-07-04 2022-09-06 北京化工大学 Gait self-learning method for foot type robot

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112596534A (en) * 2020-12-04 2021-04-02 杭州未名信科科技有限公司 Gait training method and device for quadruped robot based on deep reinforcement learning, electronic equipment and medium
CN112684794A (en) * 2020-12-07 2021-04-20 杭州未名信科科技有限公司 Foot type robot motion control method, device and medium based on meta reinforcement learning

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3768472A1 (en) * 2018-04-22 2021-01-27 Google LLC Systems and methods for learning agile locomotion for multiped robots
CN111913490B (en) * 2020-08-18 2023-11-14 山东大学 Four-foot robot dynamic gait stability control method and system based on foot falling adjustment
CN112631131A (en) * 2020-12-19 2021-04-09 北京化工大学 Motion control self-generation and physical migration method for quadruped robot
CN113771081B (en) * 2021-07-06 2024-04-30 清华大学 Physical-based virtual human hand automatic grabbing method and device

Also Published As

Publication number Publication date
CN114460841A (en) 2022-05-10

Similar Documents

Publication Publication Date Title
CN110861084B (en) A self-reset control method for quadruped robot fall based on deep reinforcement learning
Gams et al. Adaptation and coaching of periodic motion primitives through physical and visual interaction
CN113031528B (en) Multi-legged robot non-structural ground motion control method based on depth certainty strategy gradient
WO2019209681A1 (en) Systems and methods for learning agile locomotion for multiped robots
CN113190029B (en) Autonomous generation of adaptive gaits for quadruped robots based on deep reinforcement learning
De Sapio et al. Simulating the task-level control of human motion: a methodology and framework for implementation
CN114326722B (en) Six-foot robot self-adaptive gait planning method, system, device and medium
CN112276947B (en) Robot motion simulation method, device, equipment and storage medium
CN117215204B (en) Robot gait training method and system based on reinforcement learning
CN117555339B (en) Strategy network training method and human-shaped biped robot gait control method
Mukovskiy et al. Adaptive synthesis of dynamically feasible full-body movements for the humanoid robot HRP-2 by flexible combination of learned dynamic movement primitives
CN112749515A (en) Damaged robot gait self-learning integrating biological inspiration and deep reinforcement learning
CN114460841B (en) Foot robot multi-step controller generation method and computer readable storage medium
CN118012077A (en) Four-foot robot motion control method and system based on reinforcement learning motion simulation
CN106094817A (en) Intensified learning humanoid robot gait's planing method based on big data mode
Mohan et al. A biomimetic, force-field based computational model for motion planning and bimanual coordination in humanoid robots
CN116604532A (en) Intelligent control method for upper limb rehabilitation robot
CN114740875B (en) Robot rhythmic motion control method and system based on neural oscillator
Allen et al. Evolved controllers for simulated locomotion
CN118809606A (en) A torque control method for humanoid robots based on position loop pre-training
CN110515297A (en) A Phased Motion Control Method Based on Redundant Musculoskeletal System
Lee et al. Combining GRN modeling and demonstration-based programming for robot control
Li et al. Experience-learning inspired two-step reward method for efficient legged locomotion learning towards natural and robust gaits
Dobrynin Gait synthesis of a home quadruped robot
Beal et al. A manifold operator representation for adaptive design

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant