
CN119204085B - Multi-source video data robot skill learning method and system - Google Patents


Info

Publication number
CN119204085B
CN119204085B (application CN202411699867.0A)
Authority
CN
China
Prior art keywords
robot
camera
strategy
video
virtual
Prior art date
Legal status
Active
Application number
CN202411699867.0A
Other languages
Chinese (zh)
Other versions
CN119204085A (en)
Inventor
张博源
张振亮
Current Assignee
Beijing General Artificial Intelligence Research Institute
Original Assignee
Beijing General Artificial Intelligence Research Institute
Priority date
Filing date
Publication date
Application filed by Beijing General Artificial Intelligence Research Institute
Priority application: CN202411699867.0A
Publication of CN119204085A
Application granted
Publication of CN119204085B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/004 - Artificial life, i.e. computing arrangements simulating life
    • G06N3/008 - Artificial life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/0464 - Convolutional networks [CNN, ConvNet]
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Robotics (AREA)
  • Manipulator (AREA)

Abstract

The invention provides a multi-source video data robot skill learning method and system. An example video collection module automatically gathers example videos related to a skill according to its motor-skill text description and performs data expansion to obtain motion example video data. A virtual robot and a virtual camera are constructed and instantiated and, together with a robot control strategy and a camera movement strategy, combined into an agent that generates and records robot motion video recording data. A motor-skill video scoring module builds an intelligent video scoring model that generates a scoring result for the recorded robot motion video. An agent learning module then uses reward feedback from the neural network scoring model to co-optimize the robot control strategy and the camera movement strategy, and updates both strategies into the agent.

Description

Multi-source video data robot skill learning method and system
Technical Field
The invention relates to the technical field of machine-learning-based intelligent data processing and control, and in particular to a multi-source video data robot skill learning method and system.
Background
Skill learning for virtual robots aims to endow them with a variety of motor skills (e.g., walking, running, grasping objects) that can be used to generate robot character animations consistent with human motion patterns and physical laws. Because robot motion typically requires the joint control of many joints, and collision interactions with the environment during motion are generally non-differentiable, existing methods usually learn the robot's control strategy with reinforcement learning. The training signal in reinforcement learning is the reward feedback obtained after the robot executes an action; the control strategy is gradually adjusted according to this feedback to increase the expected reward. Imitation learning provides a simple and effective way to compute the reward: it is based on the similarity between the robot's action sequence and an example action sequence (usually a human demonstration), where higher similarity yields a larger reward and lower similarity a smaller one.
Two methods are commonly used to compute the similarity between the robot's motion sequence and the example sequence: (1) tracking error, i.e., the average of the L2 distances between each robot joint and the corresponding example joint at corresponding times; and (2) distance between distributions, i.e., the distance between the state-transition distribution generated by the moving robot and that generated by the example individual, which can be estimated quantitatively by a discriminator as in adversarial learning. Both rewards presuppose three-dimensional motion trajectories for every joint of the example individual, data usually recovered with motion-capture equipment or a motion-reconstruction algorithm. Existing video-based skill learning methods therefore involve two steps: first reconstruct the example individual's motion pose sequence from the example video, then compute the robot's reward signal from the tracking error or the inter-distribution distance. Because motion capture and motion reconstruction place high demands on how the training video is acquired, most existing video skill learning methods only suit videos recorded from a fixed viewpoint with little character movement.
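The tracking-error reward described here can be sketched in a few lines; the exponential kernel and its scale are illustrative choices for turning a distance into a bounded reward, not part of the prior-art methods being described.

```python
import numpy as np

def tracking_error_reward(robot_joints, demo_joints, scale=2.0):
    """Reward from the mean L2 distance between robot and demonstration
    joint positions at corresponding timesteps.

    robot_joints, demo_joints: arrays of shape (T, J, 3) holding 3-D
    positions of J joints over T timesteps.
    scale: illustrative temperature mapping distance to a reward in (0, 1].
    """
    # Per-timestep, per-joint Euclidean distance, then the overall average.
    err = np.linalg.norm(robot_joints - demo_joints, axis=-1).mean()
    # Smaller error -> larger reward (an exponential kernel is a common choice).
    return float(np.exp(-scale * err))
```

When the robot exactly tracks the demonstration the reward is 1; it decays smoothly toward 0 as the joint trajectories diverge.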
Improving existing video skill learning methods to better exploit massive internet data is therefore an important research challenge. Existing methods generally rely on a three-dimensional pose sequence for training, i.e., the reward signal is computed in three-dimensional space to guide the robot's learning. However, the three-dimensional pose sequence is usually recovered with complex motion-capture equipment or a motion-reconstruction algorithm, both of which place high demands on the acquisition of the training video. This limitation makes it difficult for existing algorithms to exploit the massive motor-skill demonstration videos on the internet, so the range of skills a robot can learn remains limited. It is therefore necessary to provide a multi-source video data robot skill learning method and system that at least partially solve these problems in the prior art.
Disclosure of Invention
This summary introduces a selection of concepts in simplified form that are described in further detail in the detailed description. It is not intended to identify key or essential features of the claimed subject matter, nor to limit the scope of the claimed subject matter.
To at least partially solve the above problems, the present invention provides a multi-source video data robot skill learning method, comprising:
S100, automatically collecting example videos related to the skill according to the motor-skill text description through an example video collection module, and performing data expansion to obtain motion example video data;
S200, constructing and instantiating a virtual robot and a virtual camera, combining them with a robot control strategy and a camera movement strategy into an agent, and generating and recording robot motion video recording data;
S300, constructing an intelligent video scoring model through a motor-skill video scoring module, and generating a scoring result for the robot motion video recording data;
S400, through an agent learning module, using reward feedback from the neural network scoring model to co-optimize the robot control strategy and the camera movement strategy, and updating both strategies into the agent.
Preferably, S100 includes:
S101, setting a motor-skill text description, and extracting the motor-skill keywords from it with a keyword extraction algorithm combined with a large language model;
S102, performing tag content aggregation on the motor-skill keywords to obtain motor-skill tags;
S103, gathering motion example videos according to the motor-skill keywords and tags and performing data expansion to obtain motion example video data and motion video expansion data.
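Steps S101 to S103 can be sketched as a small pipeline; the LLM interface, the keyword-to-tag mapping, and the matching-degree threshold below are illustrative assumptions, not the patent's implementation.

```python
def extract_keywords(description, llm=None):
    """S101: extract motor-skill keywords from the text description.
    A real system combines a keyword-extraction algorithm with a large
    language model; without an LLM client we fall back to tokenisation."""
    if llm is not None:
        return llm.extract(description)  # hypothetical LLM interface
    return [w.strip(".,") for w in description.lower().split() if len(w) > 3]

def aggregate_tags(keywords):
    """S102: aggregate keywords into coarser skill tags (illustrative map)."""
    table = {"walking": "locomotion", "running": "locomotion",
             "grabbing": "manipulation"}
    return sorted({table.get(k, k) for k in keywords})

def gather_and_expand(tags, dataset, min_match=0.5):
    """S103: keep dataset videos whose tag overlap with the query exceeds
    a reference matching degree, forming the expansion data."""
    expanded = []
    for video_tags, clip in dataset:
        overlap = len(set(tags) & set(video_tags)) / max(len(tags), 1)
        if overlap >= min_match:
            expanded.append(clip)
    return expanded
```

A query like "a person walking and running outdoors" thus yields locomotion keywords and tags, which are matched against the reference tags of an existing motion dataset to pull in additional clips.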
Preferably, S200 includes:
S201, performing virtual modeling of the robot on a simulation platform according to the robot parameter information, and constructing a virtual robot whose structure is a combination of multiple rigid bodies;
S202, modeling the camera according to the camera parameter information, and constructing a pinhole-model camera without collision attributes to obtain a virtual camera;
S203, instantiating the virtual robot and virtual camera, combining them with the robot control strategy and camera movement strategy into an agent, performing robot control and motion video recording, generating and recording the robot motion video captured by the camera in the simulation environment, and transmitting the resulting robot motion video recording data to the motor-skill video scoring module.
Preferably, S300 includes:
S301, extracting and storing features of the robot motion video recording data, the motion example video data, and the motion video expansion data;
S302, constructing a neural network scoring model, comparing the feature similarity of the robot motion video recording data against the motion example video data and the motion video expansion data, and generating a scoring result for the recorded motion video.
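As a rough sketch of S302, the score can be proxied by feature similarity; a trained neural scoring network would replace this cosine-similarity stand-in, and the function name and shapes are illustrative assumptions.

```python
import numpy as np

def score_recording(recorded_feat, example_feats, expansion_feats):
    """Score a robot recording by its best cosine similarity against the
    pooled example and expansion video features.

    recorded_feat: (D,) feature vector of the recorded robot video.
    example_feats, expansion_feats: (N, D) feature banks.
    """
    bank = np.vstack([example_feats, expansion_feats])
    r = recorded_feat / (np.linalg.norm(recorded_feat) + 1e-8)
    b = bank / (np.linalg.norm(bank, axis=1, keepdims=True) + 1e-8)
    sims = b @ r                 # cosine similarity with each bank entry
    return float(sims.max())     # highest similarity -> score in [-1, 1]
```

A recording whose features match any demonstration closely scores near 1, giving the agent a dense learning signal without any 3-D pose reconstruction.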
Preferably, S400 includes:
S401, based on the scoring-result feedback from the neural network scoring model, the agent learning module co-optimizes the robot control strategy and the camera movement strategy through the model's reward feedback, obtaining optimized versions of both strategies;
S402, iteratively optimizing the agent's strategies according to the optimized robot control strategy and camera movement strategy, and updating them into the combined virtual robot and virtual camera agent.
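S401 and S402 can be sketched as a cooperative two-agent training loop; the `env.rollout` and policy `update` interfaces below are hypothetical placeholders for whatever reinforcement learning algorithm sits behind them.

```python
def co_optimize(control_policy, camera_policy, env, scorer, iterations=100):
    """Co-optimize the robot control and camera movement strategies.

    Both agents share a single reward from the video scoring model, making
    this a cooperative multi-agent reinforcement learning setup.
    """
    for _ in range(iterations):
        # Roll out: the control policy moves the robot while the camera
        # policy records it; the rollout yields a video of the motion.
        video, ctrl_traj, cam_traj = env.rollout(control_policy, camera_policy)
        reward = scorer(video)            # shared reward from the scorer
        # Each agent updates its own strategy toward the shared reward.
        control_policy.update(ctrl_traj, reward)
        camera_policy.update(cam_traj, reward)
    return control_policy, camera_policy
```

The key design choice mirrored here is that the camera is itself a learning agent: better framing raises the recording's score, which in turn sharpens the reward the robot receives.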
The invention further provides a multi-source video data robot skill learning system, comprising:
a video collection and data expansion subsystem, which automatically collects example videos related to the skill according to the motor-skill text description through the example video collection module and performs data expansion to obtain motion example video data;
a virtual construction and control strategy subsystem, which constructs and instantiates a virtual robot and a virtual camera, combines them with a robot control strategy and a camera movement strategy into an agent, and generates and records robot motion video recording data;
a motor-skill intelligent scoring subsystem, which constructs an intelligent video scoring model through the motor-skill video scoring module to generate scoring results for the robot motion video recording data;
a strategy optimization and agent update subsystem, which uses reward feedback from the neural network scoring model, via the agent learning module, to co-optimize the robot control strategy and the camera movement strategy and update both strategies into the agent.
Preferably, the video collection and data expansion subsystem comprises:
a keyword extraction and analysis unit, which sets a motor-skill text description and extracts the motor-skill keywords from it with a keyword extraction algorithm combined with a large language model;
a tag content aggregation unit, which performs tag content aggregation on the motor-skill keywords to obtain motor-skill tags;
a video tag expansion unit, which gathers motion example videos according to the motor-skill keywords and tags and performs data expansion to obtain motion example video data and motion video expansion data.
Preferably, the virtual construction and control strategy subsystem comprises:
a robot structure modeling unit, which performs virtual modeling of the robot on a simulation platform according to the robot parameter information, constructing a virtual robot whose structure is a combination of multiple rigid bodies;
a virtual camera modeling unit, which models the camera according to the camera parameter information, constructing a pinhole-model camera without collision attributes to obtain a virtual camera;
an instantiation and recording unit, which instantiates the virtual robot and virtual camera, combines them with the robot control strategy and camera movement strategy into an agent, performs robot control and motion video recording, generates and records the robot motion video captured by the camera in the simulation environment, and transmits the resulting robot motion video recording data to the motor-skill video scoring module.
Preferably, the motor-skill intelligent scoring subsystem comprises:
a feature extraction and storage unit, which extracts and stores features of the robot motion video recording data, the motion example video data, and the motion video expansion data;
a scoring model feature scoring unit, which constructs a neural network scoring model, compares the feature similarity of the robot motion video recording data against the motion example video data and the motion video expansion data, and generates a scoring result for the recorded motion video.
Preferably, the strategy optimization and agent update subsystem comprises:
an agent learning optimization unit, which, based on the scoring-result feedback from the neural network scoring model, co-optimizes the robot control strategy and the camera movement strategy through the model's reward feedback, obtaining optimized versions of both strategies;
an agent strategy iterative update unit, which iteratively optimizes the agent's strategies according to the optimized robot control strategy and camera movement strategy and updates them into the combined virtual robot and virtual camera agent.
Compared with the prior art, the invention provides at least the following beneficial effects:
The invention discloses a multi-source video data robot skill learning method and system. The example video collection module automatically gathers skill-related example videos from the motor-skill text description and expands the data to obtain motion example video data; a virtual robot and a virtual camera are constructed, instantiated, and combined with a robot control strategy and a camera movement strategy into an agent that generates and records robot motion video recording data; the motor-skill video scoring module builds an intelligent video scoring model that scores the recorded video; and the agent learning module uses reward feedback from the neural network scoring model to co-optimize the robot control strategy and the camera movement strategy, updating both into the agent. The resulting skill learning method aims to let the virtual robot learn a unified motion control strategy from multi-source video examples.
The method simplifies the complex acquisition and preprocessing pipeline of traditional approaches and improves the efficiency of handling training data. It models skill learning from multi-source video data as a multi-agent reinforcement learning problem: while the robot motion control strategy is trained, a camera movement strategy is trained alongside it to assist skill learning. The trained camera strategy can adjust its recording trajectory to the robot's motion without manual control, naturally showcasing the virtual robot's learned skills. Because the method supervises the virtual robot through imitation learning and does not require training videos acquired in any specific way, it can exploit the diverse skill demonstration videos available on the internet, broadening the range of skills the robot can learn while intuitively displaying the learning results.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
Fig. 1 is a diagram of a multi-source video data robot skill learning system according to an embodiment of the invention.
Fig. 2 is a diagram of an embodiment of a multi-source video data robot skill learning method according to the present invention.
Fig. 3 is a diagram illustrating an exemplary embodiment of a multi-source video data robot skill learning method and system according to the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples, to enable one skilled in the art to practice it. As shown in the drawings, the invention provides a multi-source video data robot skill learning method, comprising:
S100, automatically collecting example videos related to the skill according to the motor-skill text description through an example video collection module, and performing data expansion to obtain motion example video data;
S200, constructing and instantiating a virtual robot and a virtual camera, combining them with a robot control strategy and a camera movement strategy into an agent, and generating and recording robot motion video recording data;
S300, constructing an intelligent video scoring model through a motor-skill video scoring module, and generating a scoring result for the robot motion video recording data;
S400, through an agent learning module, using reward feedback from the neural network scoring model to co-optimize the robot control strategy and the camera movement strategy, and updating both strategies into the agent.
The principle and effect of this technical scheme are as follows. The method automatically collects skill-related example videos from the motor-skill text description through the example video collection module and expands the data to obtain motion example video data. A virtual robot and a virtual camera are constructed and instantiated and, combined with a robot control strategy and a camera movement strategy, form an agent that generates and records robot motion video recording data. The motor-skill video scoring module builds an intelligent video scoring model that scores the recorded video, and the agent learning module uses reward feedback from the neural network scoring model to co-optimize the robot control strategy and the camera movement strategy, updating both into the agent. The method aims to let the virtual robot learn a unified motion control strategy from multi-source video examples; the camera movement strategy is trained to record the robot so that the scoring model can score the recording directly, which removes the need for complex preprocessing of massive video data before training.
The method simplifies the complex acquisition and preprocessing pipeline of traditional approaches and improves training-data efficiency. Skill learning from multi-source video data is modeled as a multi-agent reinforcement learning problem: while the robot motion control strategy is trained, a camera movement strategy is trained to assist skill learning. The trained camera strategy adjusts its recording trajectory to the robot's motion trajectory without manual control and naturally displays the virtual robot's learning results. The example video collection module also analyzes the motion keyword information of each example video, which comprises motion description keywords and motion action feature keywords: the video's text description and subtitle text are analyzed to obtain a video text description analysis result, which is compared with the motion action feature keywords to judge whether the text description is consistent with the motion actions. The motion example video data are then transmitted to the motor-skill video scoring module, which in parallel receives the robot motion video recording data recorded in the 3D simulation environment, where the virtual robot and virtual camera are combined into a joint agent that records the robot's motion. The scoring module scores the motion example video data against the robot motion video recording data and feeds the scoring information back to the agent, which is continuously trained and optimized in the training environment.
In one embodiment, S100 comprises:
S101, setting a motor-skill text description, and extracting the motor-skill keywords from it with a keyword extraction algorithm combined with a large language model;
S102, performing tag content aggregation on the motor-skill keywords to obtain motor-skill tags;
S103, gathering motion example videos according to the motor-skill keywords and tags and performing data expansion to obtain motion example video data and motion video expansion data.
The principle and effect of this technical scheme are as follows. A motor-skill text description is set, and the motor-skill keywords are extracted from it with a keyword extraction algorithm combined with a large language model. Tag content aggregation is performed on the keywords to obtain motor-skill tags. Motion example videos are then gathered and expanded according to the keywords and tags: example videos are collected by data capture, the motor-skill tags are matched against the reference motion tags in existing motion datasets, and motion videos whose matching degree exceeds a set reference matching degree are extracted for data expansion, yielding motion example video data and motion video expansion data that are transmitted to the motor-skill video scoring module. The overall flow is: analyze the skill text description, build the simulation environment and instantiate the robot and camera, initialize or update the robot control and camera movement strategies, and optimize the training strategies by combining the agent with the scoring model.
In one embodiment, S200 includes:
S201, performing virtual modeling of the robot on a simulation platform according to the robot parameter information, and constructing a virtual robot whose structure is a combination of multiple rigid bodies;
S202, modeling the camera according to the camera parameter information, and constructing a pinhole-model camera without collision attributes to obtain a virtual camera;
S203, instantiating the virtual robot and virtual camera, combining them with the robot control strategy and camera movement strategy into an agent, performing robot control and motion video recording, generating and recording the robot motion video captured by the camera in the simulation environment, and transmitting the resulting robot motion video recording data to the motor-skill video scoring module.
The principle and effect of this technical scheme are as follows. The robot is virtually modeled on a simulation platform according to its parameter information, constructing a virtual robot 211 whose structure combines multiple rigid bodies; the parts of the structure are connected by virtual joints, with 1-degree-of-freedom revolute joints for the knees and elbows and 3-degree-of-freedom spherical joints everywhere else. The robot's virtual physical parameters include a total mass of 45 kg and a height of 1.62 m, and its state space has 197 dimensions. The robot control strategy is modeled as a fully connected neural network whose input is the 197-dimensional robot state and whose output is the torque values for controlling the joints, giving a control strategy network with input and output dimensions of 197 and 36, respectively. The training scene is a randomly initialized ground surface on which the robot performs different actions; several mesh types and physical parameters are sampled at initialization to improve the robustness of the control strategy.
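The control-strategy network described here (197-dimensional state in, 36 torque values out, fully connected) can be sketched as follows; the hidden layer sizes, the initialization scheme, and the tanh output squashing are illustrative assumptions not specified in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

class ControlPolicyMLP:
    """Fully connected control-policy network with the dimensions given in
    the text: 197-D robot state in, 36 joint-torque values out."""

    def __init__(self, state_dim=197, action_dim=36, hidden=(256, 256)):
        dims = (state_dim, *hidden, action_dim)
        # He-style initialization for the ReLU hidden layers (illustrative).
        self.layers = [(rng.standard_normal((i, o)) * np.sqrt(2.0 / i),
                        np.zeros(o)) for i, o in zip(dims[:-1], dims[1:])]

    def __call__(self, state):
        x = np.asarray(state, dtype=float)
        for W, b in self.layers[:-1]:
            x = np.maximum(x @ W + b, 0.0)   # ReLU hidden layers
        W, b = self.layers[-1]
        return np.tanh(x @ W + b)            # bounded torque commands
```

In training, the policy's 36 outputs would be scaled to each joint's torque limits before being applied in the simulator.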
Camera modeling is performed according to the camera parameter information, constructing a pinhole model camera without collision attributes to obtain the virtual camera 212. The camera intrinsic parameters are randomly sampled within a certain range so as to cover the range of camera intrinsics used when the multi-source video data were collected. A camera lens-operating strategy network model is constructed; its input includes the world poses of the camera and the robot over the past several frames, the relative pose between the camera and the robot, and the pixel coordinates of the robot joints projected in the picture; its output is the camera's next lens-operating command, which is decomposed into a lens rotation command and a lens displacement command. The virtual robot and virtual camera are instantiated, the robot control strategy and camera lens-operating strategy are coordinated, and the instantiated virtual robot and virtual camera are combined into an agent for virtual robot control and motion video recording. This comprises: instantiating the robot and camera for the first time, with the parameters of the robot control strategy network model and the camera lens-operating strategy network model randomly initialized; in the subsequent learning process, the policy networks of both models are gradually optimized and updated; the robot control strategy network model enables the robot to accurately reproduce the action skills in the example video, while the camera lens-operating strategy network model records the motion video from an angle similar to the example video.
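The camera-policy input described above relies on projecting the robot joints into the image through the pinhole model. A minimal sketch of that projection follows, assuming standard pinhole intrinsics (fx, fy, cx, cy) and a world-to-camera rotation R_wc; the names and the sample values are illustrative, not taken from the patent.

```python
import numpy as np

def project_joints(points_world, cam_pos, R_wc, fx, fy, cx, cy):
    """Pinhole-camera projection of robot joint positions (world frame)
    into pixel coordinates, as used to build the camera-policy input.
    R_wc rotates world coordinates into the camera frame; the intrinsics
    (fx, fy, cx, cy) are randomly sampled in the patent's setup."""
    p_cam = (np.asarray(points_world, dtype=float) - cam_pos) @ R_wc.T
    z = p_cam[:, 2]                      # depth along the optical axis
    u = fx * p_cam[:, 0] / z + cx
    v = fy * p_cam[:, 1] / z + cy
    return np.stack([u, v], axis=1)      # (N, 2) pixel coordinates

# 13 joints placed 2 m in front of a camera at the origin looking down +Z
joints = np.tile([0.0, 0.0, 2.0], (13, 1))
pix = project_joints(joints, cam_pos=np.zeros(3), R_wc=np.eye(3),
                     fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(pix.shape)  # (13, 2); each joint projects to the principal point (320, 240)
```

Stacking these projected coordinates over the past several frames, together with the camera and robot poses, yields the observation vector fed to the lens-operating strategy network.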
In one embodiment, S300 includes:
s301, extracting and storing characteristics of robot motion video recording data, robot motion example video data and motion video expansion data;
s302, a neural network scoring model is built, feature similarity of the robot motion video recording data, the robot motion example video data and the motion video expansion data is compared, and a motion video recording data scoring result is generated.
The principle and the effect of the technical scheme are that the characteristic extraction and storage are carried out on the robot motion video recording data, the robot motion example video data and the motion video expansion data;
A neural network scoring model is constructed to compare the feature similarity of the robot motion video recording data with the robot motion example video data and the motion video expansion data. The input of the neural network scoring model is the projection features, on the image plane, of the robot's state transition between two consecutive frames; its output is a score value between 0 and 1. The model is trained to output 1 for state transitions sampled from the example data distribution, and to output 0 for state transitions recorded during the robot's motion. Because the neural network scoring model makes its judgment in image space, the projection coordinates of the robot joints on the image plane, or optical flow information extracted from the images, are used as features. The scoring model is implemented with a fully connected neural network: the input is a 52-dimensional observation vector comprising the image projection coordinates of each robot joint (13 x 2) and the optical flow information at the projected positions (13 x 2); after several fully connected layers, a 1-dimensional tensor is produced and passed through a sigmoid activation function to output a score between 0 and 1.
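The scoring model above can be sketched as follows. The 52-dimensional observation layout (13 joint projections x 2 coordinates, plus 13 optical-flow vectors x 2) and the final sigmoid follow the description; the hidden sizes and ReLU activation are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ScoringModel:
    """Sketch of the fully connected scoring network: a 52-dim observation
    passes through fully connected layers and a final sigmoid, yielding a
    score in (0, 1). Trained (not shown) to output 1 for example-data
    transitions and 0 for recorded robot transitions."""

    def __init__(self, in_dim=52, hidden=(128, 64), seed=0):
        rng = np.random.default_rng(seed)
        dims = (in_dim, *hidden, 1)
        self.weights = [rng.standard_normal((i, o)) / np.sqrt(i)
                        for i, o in zip(dims[:-1], dims[1:])]
        self.biases = [np.zeros(o) for o in dims[1:]]

    def score(self, obs):
        x = np.asarray(obs, dtype=float)
        for w, b in zip(self.weights[:-1], self.biases[:-1]):
            x = np.maximum(x @ w + b, 0.0)          # ReLU hidden layers
        return float(sigmoid(x @ self.weights[-1] + self.biases[-1]))

proj = np.zeros((13, 2))   # image projection of each joint
flow = np.zeros((13, 2))   # optical flow at each projected position
obs = np.concatenate([proj.ravel(), flow.ravel()])   # 52-dim observation
s = ScoringModel().score(obs)
print(0.0 < s < 1.0)  # True
```

The design mirrors an adversarial-imitation discriminator: its output serves directly as the imitation reward in the agent learning module.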
In one embodiment, S400 includes:
S401, based on the scoring-result feedback of the neural network scoring model on the motion video recording data, the agent learning module cooperatively optimizes the robot control strategy and the camera lens-operating strategy through the reward feedback of the neural network scoring model, obtaining an optimized robot control strategy and an optimized camera lens-operating strategy;
S402, according to the optimized robot control strategy and the optimized camera lens-operating strategy, iteratively optimizing the agent strategy and updating it into the combined virtual robot and virtual camera agent.
The principle and effect of this technical scheme are as follows. Based on the scoring-result feedback of the neural network scoring model on the motion video recording data, the agent learning module cooperatively optimizes the robot control strategy and the camera lens-operating strategy through the reward feedback of the neural network scoring model, obtaining an optimized robot control strategy and an optimized camera lens-operating strategy;
According to the optimized robot control strategy and the optimized camera lens-operating strategy, the agent strategy is iteratively optimized and updated into the combined virtual robot and virtual camera agent. In the initial stage of optimizing the robot control strategy and the camera lens-operating strategy, auxiliary reward signals are introduced to assist algorithm convergence and achieve a more robust training effect. A policy network and a reward function are defined, and the policy network is updated with a reinforcement learning algorithm; the updated camera lens-operating strategy and robot control strategy are then used for a new round of data collection and iterative training, so that the agent strategy is iteratively optimized. The defined reward function includes terms that prevent the robot from falling, limit the robot's action amplitude, and ensure that the camera always keeps the robot's whole body, or at least part of its trunk, in the recorded frame.
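The auxiliary reward terms above (fall prevention, action-amplitude limits, keeping the robot in frame) might be combined with the scoring-model output roughly as follows; the weights and functional forms are illustrative assumptions, not values given in the patent.

```python
def shaped_reward(score, fallen, action_norm, joints_in_frame,
                  w_score=1.0, w_action=0.01):
    """Illustrative shaping of the training reward described above:
    the scoring-model output is the main imitation reward, augmented by
    auxiliary terms that (a) penalize falling, (b) limit action
    magnitude, and (c) reward keeping the robot's body in frame.
    All weights here are assumptions for the sketch."""
    reward = w_score * score
    if fallen:                          # hard penalty when the robot falls
        reward -= 1.0
    reward -= w_action * action_norm    # discourage large action amplitudes
    reward += 0.1 * joints_in_frame     # fraction of joints visible, in [0, 1]
    return reward

r_good = shaped_reward(score=0.9, fallen=False, action_norm=2.0, joints_in_frame=1.0)
r_bad = shaped_reward(score=0.1, fallen=True, action_norm=10.0, joints_in_frame=0.2)
print(r_good > r_bad)  # True
```

In the described multi-agent setup, both the robot control policy and the camera lens-operating policy would be updated against such a shaped reward by the reinforcement learning algorithm.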
The invention further provides a multi-source video data robot skill learning system, comprising:
a video collection data expansion subsystem, which, through the example video collection module, automatically collects example videos related to a skill according to the motor skill text description and performs data expansion to obtain motor example video data;
a virtual construction control strategy subsystem, which constructs and instantiates a virtual robot and a virtual camera, coordinates the robot control strategy with the camera lens-operating strategy, combines them into an agent, and generates and records robot motion video recording data;
a motor skill intelligent scoring subsystem, which constructs an intelligent video scoring model through the motor skill video scoring module and generates scoring results for the robot motion video recording data;
and a strategy optimization agent updating subsystem, which, through the agent learning module, sets the reward feedback of the neural network scoring model to cooperatively optimize the robot control strategy and the camera lens-operating strategy, and updates them into the agent.
The principle and effect of this technical scheme are as follows. The invention provides a multi-source video data robot skill learning system comprising the four subsystems listed above: the video collection data expansion subsystem automatically collects example videos related to a skill according to its motor skill text description and performs data expansion; the virtual construction control strategy subsystem constructs and instantiates a virtual robot and virtual camera, coordinates the robot control strategy with the camera lens-operating strategy, and combines them into an agent that generates and records robot motion video recording data; the motor skill intelligent scoring subsystem constructs an intelligent video scoring model and generates scoring results for the robot motion video recording data; and the strategy optimization agent updating subsystem sets the reward feedback of the neural network scoring model to cooperatively optimize the robot control strategy and the camera lens-operating strategy and updates them into the agent.
The invention enables a virtual robot to learn a unified motion control strategy from multi-source video examples. Unlike traditional methods, it does not require training videos collected in a specific modality, nor does it require complex motion-capture equipment or preprocessed three-dimensional motion data, enabling the virtual robot to learn unified motor skills from massive example videos of different sources. This simplifies the complex workflow of traditional methods in the training data acquisition and preprocessing stages and improves the processing efficiency of multi-source training video data.
The skill learning problem based on multi-source video data is modeled as a multi-agent reinforcement learning problem: while the robot motion control strategy is trained, a camera lens-operating strategy is trained to assist the robot in completing skill learning. The camera lens-operating strategy trained in this way can adjust its recording trajectory according to the robot's motion trajectory without manual control, naturally displaying the skill learning results of the virtual robot. By combining the training of the robot motion control strategy and the camera lens-operating strategy, the method supervises the virtual robot to acquire new motor skills through imitation learning.
In operation, the example video collection module collects motor example videos and analyzes their motion keyword information, which includes motion description keywords and motion action feature keywords. The motion description keywords are obtained by parsing the text description content and subtitle text of the motor example video; the resulting video text description analysis result is compared with the motion action feature keywords to judge whether the video text description is consistent with the motion action features, and the motor example video data are transmitted to the skill video scoring module. In parallel, the skill video scoring module receives the robot motion video recording data recorded by the skill video recording module in the 3D simulation environment system, where the virtual robot and virtual camera are combined into a joint agent that records the virtual robot's video. The skill video scoring module scores the robot motion video recording data against the motor example video data and transmits the scoring information to the agent learning module, which continuously trains and optimizes the agent in the training environment.
In one embodiment, a video gathering data expansion subsystem, comprising:
the keyword extraction analysis unit, which sets the motor skill text description and, using a keyword extraction algorithm combined with a large language model, parses and extracts the motor skill keywords from the motor skill text description;
the label content aggregation unit is used for carrying out label content aggregation on the motor skill keywords to obtain motor skill labels;
and the video tag expansion unit is used for collecting the motion example video and carrying out data expansion according to the motion skill keywords and the motion skill tags to obtain the motion example video data and the motion video expansion data.
The principle and effect of this technical scheme are as follows. The video collection data expansion subsystem comprises: a keyword extraction analysis unit, which sets the motor skill text description and, using a keyword extraction algorithm combined with a large language model, parses and extracts the motor skill keywords from the text description; a tag content aggregation unit, which aggregates the tag content of the motor skill keywords to obtain motor skill tags; and a video tag expansion unit, which collects motor example videos and performs data expansion according to the motor skill keywords and motor skill tags to obtain motor example video data and motion video expansion data. Specifically, example videos are collected through data capture; the motor skill tags are matched against reference motion tags in existing motion data sets, and motion videos whose matching degree exceeds a set reference matching degree are used for data expansion. The motor example video data and motion video expansion data are transmitted to the skill video scoring module. By analyzing the text of the skill description, example video data conforming to the description are collected; a simulation environment is constructed and the robot and camera are initialized; and the robot control and camera lens-operating strategies of the agent are trained and optimized based on the scoring model built from the motor example videos, with the optimized strategies updated into the simulation environment.
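As a toy illustration of the keyword extraction and tag aggregation steps above: the patent uses a keyword extraction algorithm combined with a large language model, whereas the vocabulary-matching stand-in and the keyword-to-tag mapping below are simplifying assumptions made only for this sketch.

```python
from collections import Counter
import re

def extract_skill_keywords(description, vocabulary):
    """Toy stand-in for the keyword-extraction step: match a known
    motor-skill vocabulary (an assumption) against the text description."""
    words = re.findall(r"[a-z]+", description.lower())
    return [w for w in words if w in vocabulary]

def aggregate_tags(keywords, tag_map):
    """Aggregate extracted keywords into motor-skill tags via an
    illustrative keyword-to-tag mapping, ordered by keyword support."""
    tags = Counter(tag_map[k] for k in keywords if k in tag_map)
    return [tag for tag, _ in tags.most_common()]

vocab = {"backflip", "run", "jump", "cartwheel"}
tag_map = {"backflip": "acrobatics", "cartwheel": "acrobatics",
           "run": "locomotion", "jump": "locomotion"}
kws = extract_skill_keywords("A robot learns to run, jump and backflip", vocab)
print(aggregate_tags(kws, tag_map))  # ['locomotion', 'acrobatics']
```

The resulting tags would then drive the example-video search and the matching against reference motion tags in existing motion data sets.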
In one embodiment, the virtual build control strategy subsystem comprises:
The robot structure modeling unit is used for performing virtual modeling on the robot according to the robot parameter information by using a simulation experiment platform, and constructing a virtual robot with a virtual structure formed by combining a plurality of rigid bodies;
The virtual camera modeling unit is used for carrying out camera modeling according to the camera parameter information, constructing a pinhole model camera without collision attribute and obtaining a virtual camera;
The instantiation agent combination unit instantiates the virtual robot and the virtual camera, coordinates the robot control strategy with the camera lens-operating strategy, and combines the instantiated virtual robot and virtual camera into an agent for virtual robot control and motion video recording; it generates and records the robot motion video captured by the camera in the simulation environment, acquires the robot motion video recording data, and transmits the robot motion video recording data to the motor skill video scoring module.
The principle and effect of this technical scheme are as follows. The virtual construction control strategy subsystem comprises: a robot structure modeling unit, which uses the simulation experiment platform to perform virtual modeling of the robot according to the robot parameter information and constructs a virtual robot whose virtual structure is composed of a plurality of rigid bodies; a virtual camera modeling unit, which performs camera modeling according to the camera parameter information and constructs a pinhole model camera without collision attributes to obtain a virtual camera; and an instantiation agent combination unit, which instantiates the virtual robot and virtual camera, coordinates the robot control strategy with the camera lens-operating strategy, combines them into an agent for virtual robot control and motion video recording, generates and records the robot motion video captured by the camera in the simulation environment, and transmits the robot motion video recording data to the motor skill video scoring module.
Specifically, the simulation experiment platform constructs the virtual robot 211 from a plurality of rigid bodies according to the robot parameter information. The parts of the virtual structure are connected through virtual joints: the knee and elbow joints use revolute joints with 1 degree of freedom, and all other joints are set as spherical joints with 3 degrees of freedom; each robot character comprises 13 joints and 34 degrees of freedom. The virtual association parameters of the robot entity are set as follows: a total robot mass of 45 kg, a robot height of 1.62 m, a robot state space dimension of 197, and a robot action space dimension of 36. A robot control strategy network model is constructed that takes the robot state information as input and outputs the torque values for controlling each joint, where the state information comprises the spatial position and motion speed of each robot joint. The control strategy is modeled by a fully connected neural network with input and output dimensions of 197 and 36 respectively, yielding the robot control strategy network model. The training scene is a randomly initialized ground surface on which the robot performs different actions; multiple mesh types and physical parameters are applied when the ground mesh is initialized so as to improve the robustness of the robot control strategy, where the mesh types include varying degrees of flatness and the physical parameters include varying friction coefficients;
Camera modeling is performed according to the camera parameter information, constructing a pinhole model camera without collision attributes to obtain the virtual camera 212. The camera intrinsic parameters are randomly sampled within a certain range so as to cover the range of camera intrinsics used when the multi-source video data were collected. A camera lens-operating strategy network model is constructed; its input includes the world poses of the camera and the robot over the past several frames, the relative pose between the camera and the robot, and the pixel coordinates of the robot joints projected in the picture; its output is the camera's next lens-operating command, which is decomposed into a lens rotation command and a lens displacement command. The virtual robot and virtual camera are instantiated, the robot control strategy and camera lens-operating strategy are coordinated, and the instantiated virtual robot and virtual camera are combined into an agent for virtual robot control and motion video recording. This comprises: instantiating the robot and camera for the first time, with the parameters of the robot control strategy network model and the camera lens-operating strategy network model randomly initialized; in the subsequent learning process, the policy networks of both models are gradually optimized and updated; the robot control strategy network model enables the robot to accurately reproduce the action skills in the example video, while the camera lens-operating strategy network model records the motion video from an angle similar to the example video.
In one embodiment, the motor skills intelligence scoring subsystem includes:
The feature extraction storage unit is used for extracting and storing features of robot motion video recording data, robot motion example video data and motion video expansion data;
And the scoring model feature scoring unit is used for constructing a neural network scoring model, comparing the feature similarity of the robot motion video recording data, the robot motion example video data and the motion video expansion data, and generating a motion video recording data scoring result.
The principle and effect of this technical scheme are as follows. The motor skill intelligent scoring subsystem comprises: a feature extraction storage unit, which extracts and stores features of the robot motion video recording data, the robot motion example video data, and the motion video expansion data; and a scoring model feature scoring unit, which constructs a neural network scoring model, compares the feature similarity of the robot motion video recording data with the robot motion example video data and the motion video expansion data, and generates a scoring result for the motion video recording data. The input of the neural network scoring model is the projection features, on the image plane, of the robot's state transition between two consecutive frames; its output is a score value between 0 and 1. The model is trained to output 1 for state transitions sampled from the example data distribution, and to output 0 for state transitions recorded during the robot's motion; the neural network scoring model thus makes its judgment in image space. The scoring model is implemented with a fully connected neural network: the input is a 52-dimensional observation vector comprising the image projection coordinates of each robot joint (13 x 2) and the optical flow information at the projected positions (13 x 2); after several fully connected layers, a 1-dimensional tensor is produced and passed through a sigmoid activation function to output a score between 0 and 1.
In one embodiment, a policy optimization agent update subsystem includes:
The intelligent learning optimization unit is used for acquiring an optimized robot control strategy and an optimized camera lens operation strategy by cooperatively optimizing the robot control strategy and the camera lens operation strategy through rewarding feedback of the neural network scoring model based on scoring result feedback of the motion video recording data of the neural network scoring model;
and the agent strategy iteration updating unit is used for iteratively optimizing the agent strategy according to the optimized robot control strategy and the optimized camera lens-operating strategy, and updating the agent strategy into the virtual robot and virtual camera combined agent.
The principle and effect of this technical scheme are as follows. The strategy optimization agent updating subsystem comprises: an agent learning optimization unit, which, based on the scoring-result feedback of the neural network scoring model on the motion video recording data, cooperatively optimizes the robot control strategy and the camera lens-operating strategy through the reward feedback of the neural network scoring model, obtaining an optimized robot control strategy and an optimized camera lens-operating strategy; and an agent strategy iteration updating unit, which iteratively optimizes the agent strategy according to the optimized strategies and updates it into the combined virtual robot and virtual camera agent. In the initial stage of optimizing the robot control strategy and the camera lens-operating strategy, auxiliary reward signals are introduced to assist algorithm convergence and achieve a more robust training effect. A policy network and a reward function are defined, and the policy network is updated with a reinforcement learning algorithm; the updated camera lens-operating strategy and robot control strategy are used for a new round of data collection and iterative training. The defined reward function includes terms that prevent the robot from falling, limit the robot's action amplitude, and ensure that the camera always keeps the robot's whole body, or at least part of its trunk, in the recorded frame.
Although embodiments of the present invention have been disclosed above, the invention is not limited to the specific details and embodiments shown and described; it is well suited to various fields of use that will be readily apparent to those skilled in the art, and modifications may be made without departing from the general concepts defined in the claims and their equivalents.

Claims (8)

1.一种多源视频数据机器人技能学习方法,其特征在于,包括:1. A method for robot skill learning based on multi-source video data, comprising: S100,通过示例视频搜集模块根据运动技能文本描述,自动搜集与该技能相关的示例视频并进行数据扩充,获取运动示例视频数据;S100, automatically collecting sample videos related to the skill according to the text description of the sport skill through a sample video collection module and performing data expansion to obtain sport sample video data; S200,构建虚拟机器人及虚拟摄像机并实例化,协同机器人控制策略与摄像机运镜策略并组合智能体,生成并录制机器人运动视频录制数据;S200, constructing and instantiating a virtual robot and a virtual camera, coordinating a robot control strategy and a camera movement strategy and combining intelligent agents, and generating and recording robot motion video recording data; S300,通过运动技能视频打分模块,构建视频智能打分模型,生成对机器人运动视频录制数据的评分结果;S300, through the sports skill video scoring module, builds a video intelligent scoring model to generate scoring results for the robot sports video recording data; S400,通过智能体学习模块,设置神经网络打分模型的奖励反馈协同优化机器人控制策略和摄像机运镜策略,并更新到智能体中;S400, through the agent learning module, sets the reward feedback of the neural network scoring model to collaboratively optimize the robot control strategy and camera movement strategy, and updates them to the agent; S200包括:S200 includes: S201,利用仿真实验平台,根据机器人参数信息,进行机器人虚拟建模,构建由多个刚体组合虚拟结构的虚拟机器人;S201, using the simulation experiment platform, based on the robot parameter information, performing robot virtual modeling, and constructing a virtual robot composed of a plurality of rigid bodies and a virtual structure; S202,根据摄像机参数信息,进行摄像机建模,构建无碰撞属性的针孔模型摄像机,获取虚拟摄像机;S202, performing camera modeling according to the camera parameter information, constructing a pinhole model camera with a collision-free property, and obtaining a virtual camera; S203,将虚拟机器人和虚拟摄像机进行实例化,协同机器人控制策略与摄像机运镜策略并将实例化后的虚拟机器人和虚拟摄像机组合为智能体,进行虚拟机器人控制和运动视频录制,生成并录制仿真环境中摄像机捕捉到的机器人运动视频,获取机器人运动视频录制数据;将机器人运动视频录制数据传输到运动技能视频打分模块;S203, instantiating the virtual robot and the virtual camera, coordinating the robot control strategy with the camera movement 
strategy, and combining the instantiated virtual robot and the virtual camera into an intelligent body, performing virtual robot control and motion video recording, generating and recording the robot motion video captured by the camera in the simulation environment, and obtaining the robot motion video recording data; transmitting the robot motion video recording data to the sports skill video scoring module; 通过全连接神经网络进行控制策略建模,获取机器人控制策略网络模型;The control strategy is modeled through a fully connected neural network to obtain the robot control strategy network model; 构建摄像机运镜策略网络模型;摄像机运镜策略网络模型的输入包括摄像机与机器人在过去若干帧中的世界位姿,和摄像机与机器人的相对位姿,以及机器人关节在画面中投影的像素坐标;摄像机运镜策略网络模型的输出为摄像机下一步运镜指令,并下一步运镜指令分解为运镜旋转指令和运镜位移指令;将虚拟机器人和虚拟摄像机进行实例化,协同机器人控制策略与摄像机运镜策略并将实例化后的虚拟机器人和虚拟摄像机组合为智能体,进行虚拟机器人控制和运动视频录制包括:将机器人和摄像机首次实例化,机器人控制策略网络模型及摄像机运镜策略网络模型的参数进行随机初始化;在随后的学习过程中,机器人控制策略网络模型及摄像机运镜策略网络模型的策略网络逐步优化更新;机器人控制策略网络模型使机器人精确地复现示例视频中的动作技能,而摄像机运镜策略网络模型从与示例视频相似的角度录制运动视频。A camera movement strategy network model is constructed; the input of the camera movement strategy network model includes the world poses of the camera and the robot in the past several frames, the relative poses of the camera and the robot, and the pixel coordinates of the robot joints projected in the picture; the output of the camera movement strategy network model is the next camera movement instruction, and the next camera movement instruction is decomposed into a camera rotation instruction and a camera displacement instruction; the virtual robot and the virtual camera are instantiated, the robot control strategy and the camera movement strategy are coordinated, and the instantiated virtual robot and the virtual camera are combined into an intelligent body, and the virtual robot control and motion video recording are performed, including: the robot and the camera are instantiated for the first time, and the parameters of the robot control strategy network model and the camera movement strategy network model are randomly initialized; in the 
subsequent learning process, the strategy networks of the robot control strategy network model and the camera movement strategy network model are gradually optimized and updated; the robot control strategy network model enables the robot to accurately reproduce the action skills in the example video, and the camera movement strategy network model records the motion video from an angle similar to the example video. 2.根据权利要求1所述的一种多源视频数据机器人技能学习方法,其特征在于,S100包括:2. The method for learning robot skills using multi-source video data according to claim 1, wherein S100 comprises: S101,设置运动技能文本描述;通过关键词提取算法,结合大语言模型从运动技能文本描述中解析提取运动文本描述中的运动技能关键词;S101, setting a sports skill text description; parsing and extracting sports skill keywords from the sports skill text description by using a keyword extraction algorithm in combination with a large language model; S102,对运动技能关键词进行标签内容聚合,获取运动技能标签;S102, performing tag content aggregation on sports skill keywords to obtain sports skill tags; S103,根据运动技能关键词及运动技能标签,搜集运动示例视频并进行数据扩充,获取运动示例视频数据及运动视频扩充数据。S103, according to the sports skill keywords and sports skill tags, collect sports example videos and perform data expansion to obtain sports example video data and sports video expansion data. 3.根据权利要求1所述的一种多源视频数据机器人技能学习方法,其特征在于,S300包括:3. The method for learning robot skills using multi-source video data according to claim 1, wherein S300 comprises: S301,对机器人运动视频录制数据、机器人运动示例视频数据及运动视频扩充数据进行特征提取及存储;S301, extracting and storing features of robot motion video recording data, robot motion example video data, and motion video expansion data; S302,构建神经网络打分模型,对比机器人运动视频录制数据与机器人运动示例视频数据及运动视频扩充数据的特征相似度,生成运动视频录制数据评分结果。S302, constructing a neural network scoring model, comparing the feature similarity between the robot motion video recording data and the robot motion example video data and the motion video expansion data, and generating a scoring result for the motion video recording data. 4.根据权利要求1所述的一种多源视频数据机器人技能学习方法,其特征在于,S400包括:4. 
The multi-source video data robot skill learning method according to claim 1, wherein S400 comprises:
S401, the agent learning module, based on the scoring-result feedback of the neural network scoring model on the recorded motion video data, jointly optimizing the robot control strategy and the camera movement strategy through the reward feedback of the scoring model, and obtaining an optimized robot control strategy and an optimized camera movement strategy;
S402, iteratively optimizing the agent strategy according to the optimized robot control strategy and the optimized camera movement strategy, and updating it into the combined virtual robot and virtual camera agent.
5. A multi-source video data robot skill learning system, characterized by comprising:
a video collection and data expansion subsystem, which, through an example video collection module and according to a sports skill text description, automatically collects example videos related to the skill and performs data expansion to obtain motion example video data;
a virtual construction and control strategy subsystem, which constructs and instantiates a virtual robot and a virtual camera, coordinates the robot control strategy with the camera movement strategy, combines them into an agent, and generates and records robot motion video recording data;
a sports skill intelligent scoring subsystem, which, through a sports skill video scoring module, constructs an intelligent video scoring model and generates scoring results for the robot motion video recording data;
a strategy optimization and agent update subsystem, which, through an agent learning module, sets the reward feedback of the neural network scoring model to jointly optimize the robot control strategy and the camera movement strategy and updates them into the agent;
wherein the virtual construction and control strategy subsystem comprises:
a robot structure modeling unit, which uses a simulation experiment platform to perform virtual modeling of the robot according to robot parameter information, constructing a virtual robot assembled from multiple rigid bodies;
a virtual camera modeling unit, which performs camera modeling according to camera parameter information, constructing a collision-free pinhole-model camera to obtain the virtual camera;
an agent instantiation and combination unit, which instantiates the virtual robot and the virtual camera, coordinates the robot control strategy with the camera movement strategy, combines the instantiated virtual robot and virtual camera into an agent, performs virtual robot control and motion video recording, generates and records the robot motion video captured by the camera in the simulation environment to obtain the robot motion video recording data, and transmits the robot motion video recording data to the sports skill video scoring module;
wherein the control strategy is modeled with a fully connected neural network to obtain the robot control strategy network model;
a camera movement strategy network model is constructed, whose input comprises the world poses of the camera and the robot over the past several frames, the relative pose between the camera and the robot, and the pixel coordinates of the robot joints projected into the frame, and whose output is the next camera movement instruction, decomposed into a rotation instruction and a displacement instruction; instantiating the virtual robot and the virtual camera, coordinating the robot control strategy with the camera movement strategy, combining the instantiated virtual robot and virtual camera into an agent, and performing virtual robot control and motion video recording comprises: instantiating the robot and the camera for the first time, with the parameters of the robot control strategy network model and the camera movement strategy network model randomly initialized; in the subsequent learning process, gradually optimizing and updating both strategy networks; the robot control strategy network model enables the robot to accurately reproduce the action skills in the example video, while the camera movement strategy network model records the motion video from an angle similar to that of the example video.
6. The multi-source video data robot skill learning system according to claim 5, wherein the video collection and data expansion subsystem comprises:
a keyword extraction and parsing unit, which sets a sports skill text description and parses and extracts sports skill keywords from it with a keyword extraction algorithm combined with a large language model;
a tag content aggregation unit, which performs tag content aggregation on the sports skill keywords to obtain sports skill tags;
a video collection and tag expansion unit, which collects motion example videos according to the sports skill keywords and sports skill tags and performs data expansion to obtain motion example video data and motion video expansion data.
7. The multi-source video data robot skill learning system according to claim 5, wherein the sports skill intelligent scoring subsystem comprises:
a feature extraction and storage unit, which extracts and stores features of the robot motion video recording data, the motion example video data, and the motion video expansion data;
a scoring model feature evaluation unit, which constructs a neural network scoring model, compares the feature similarity of the robot motion video recording data against the motion example video data and the motion video expansion data, and generates a scoring result for the recorded motion video data.
8. The multi-source video data robot skill learning system according to claim 5, wherein the strategy optimization and agent update subsystem comprises:
an agent learning optimization unit, in which the agent learning module, based on the scoring-result feedback of the neural network scoring model on the recorded motion video data, jointly optimizes the robot control strategy and the camera movement strategy through the reward feedback of the scoring model to obtain an optimized robot control strategy and an optimized camera movement strategy;
an agent strategy iterative update unit, which iteratively optimizes the agent strategy according to the optimized robot control strategy and the optimized camera movement strategy and updates it into the combined virtual robot and virtual camera agent.
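Read as an algorithm rather than claim language, the claims above describe a closed loop: a fully connected robot control policy and a camera movement policy (whose output splits into a rotation command and a displacement command) are randomly initialized, roll out in simulation, and are then jointly improved using the scoring model's feature-similarity reward. The sketch below renders that loop in Python under stated assumptions: all function names and dimensions are illustrative rather than taken from the patent, the cosine-similarity scorer stands in for the unspecified neural network scoring model, and a gradient-free perturb-and-keep step stands in for whatever reinforcement-learning optimizer an actual implementation would use.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_init(sizes):
    """Randomly initialize a fully connected policy (the claims state that
    both strategy networks start from randomly initialized parameters)."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """Plain fully connected forward pass with tanh hidden activations."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

def camera_command(cam_params, obs):
    """Camera movement policy: obs packs past camera/robot world poses, the
    relative pose, and joint pixel coordinates; the 6-D output is split into
    a rotation command and a displacement command, as the claims describe."""
    out = mlp_forward(cam_params, obs)
    return out[:3], out[3:]

def similarity_score(recorded_feat, example_feat):
    """Scoring-model stand-in: cosine similarity between features of the
    recorded video and the example video, in [-1, 1]."""
    den = np.linalg.norm(recorded_feat) * np.linalg.norm(example_feat) + 1e-8
    return float(recorded_feat @ example_feat / den)

def perturb_and_keep(params, score_fn, sigma=0.01):
    """Gradient-free stand-in for the reward-feedback update (S401/S402):
    perturb the policy parameters and keep the trial only if the scoring
    model's reward does not decrease."""
    trial = [(W + rng.normal(0.0, sigma, W.shape),
              b + rng.normal(0.0, sigma, b.shape)) for W, b in params]
    return trial if score_fn(trial) >= score_fn(params) else params

# Illustrative dimensions: a 24-D joint state mapped to 12 joint targets,
# and a 60-D camera observation mapped to a 6-D movement command.
robot_pi = mlp_init([24, 64, 12])
cam_pi = mlp_init([60, 64, 6])
rot_cmd, disp_cmd = camera_command(cam_pi, rng.normal(size=60))
```

In a full system, `score_fn` would roll the agent out in simulation, render the recorded video, extract its features, and compare them to the example-video features; here it is simply whatever callable maps policy parameters to the scoring model's reward.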
CN202411699867.0A 2024-11-26 2024-11-26 Multi-source video data robot skill learning method and system Active CN119204085B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411699867.0A CN119204085B (en) 2024-11-26 2024-11-26 Multi-source video data robot skill learning method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411699867.0A CN119204085B (en) 2024-11-26 2024-11-26 Multi-source video data robot skill learning method and system

Publications (2)

Publication Number Publication Date
CN119204085A CN119204085A (en) 2024-12-27
CN119204085B true CN119204085B (en) 2025-02-18

Family

ID=94042393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411699867.0A Active CN119204085B (en) 2024-11-26 2024-11-26 Multi-source video data robot skill learning method and system

Country Status (1)

Country Link
CN (1) CN119204085B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119830993B (en) * 2025-03-14 2025-06-20 Beijing General Artificial Intelligence Research Institute Multi-view video reward mechanism learning system and its construction method

Citations (2)

Publication number Priority date Publication date Assignee Title
CN114748169A (en) * 2022-03-31 2022-07-15 华中科技大学 Autonomous endoscope moving method of laparoscopic surgery robot based on image experience
CN115396595A (en) * 2022-08-04 2022-11-25 北京通用人工智能研究院 Video generation method and device, electronic equipment and storage medium

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US10766136B1 (en) * 2017-11-03 2020-09-08 Amazon Technologies, Inc. Artificial intelligence system for modeling and evaluating robotic success at task performance
KR102619004B1 (en) * 2018-12-14 2023-12-29 삼성전자 주식회사 Robot control apparatus and method for learning task skill of the robot
US11524402B2 (en) * 2020-05-21 2022-12-13 Intrinsic Innovation Llc User feedback for robotic demonstration learning
CN115442519B (en) * 2022-08-08 2023-12-15 珠海普罗米修斯视觉技术有限公司 Video processing method, device and computer-readable storage medium
CN118003321A (en) * 2023-07-31 2024-05-10 重庆越千创新科技有限公司 Real-time control method and system for photographic robot
US20240342557A1 (en) * 2024-06-18 2024-10-17 Archana Balkrishna Yadav AI-Powered Robotic Defender for Performance Analysis and Advanced Sports Training

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN114748169A (en) * 2022-03-31 2022-07-15 华中科技大学 Autonomous endoscope moving method of laparoscopic surgery robot based on image experience
CN115396595A (en) * 2022-08-04 2022-11-25 北京通用人工智能研究院 Video generation method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN119204085A (en) 2024-12-27

Similar Documents

Publication Publication Date Title
US12236513B2 (en) Virtual character posture adjustment
WO2021143289A1 (en) Animation processing method and apparatus, and computer storage medium and electronic device
CN119204085B (en) Multi-source video data robot skill learning method and system
CN110599573A (en) Method for realizing real-time human face interactive animation based on monocular camera
US11945125B2 (en) Auxiliary photographing device for dyskinesia analysis, and control method and apparatus for auxiliary photographing device for dyskinesia analysis
CN111645065A (en) Mechanical arm motion planning method based on deep reinforcement learning
CN111028317B (en) Animation generation method, device and equipment for virtual object and storage medium
CN114888801B (en) Mechanical arm control method and system based on offline strategy reinforcement learning
CN110796593A (en) Image processing method, device, medium and electronic equipment based on artificial intelligence
CN116977506A (en) Model action redirection method, device, electronic equipment and storage medium
Lin et al. Balancing and reconstruction of segmented postures for humanoid robots in imitation of motion
CN116719409A (en) Operating skill learning method based on active interaction of intelligent agents
Xue et al. Learning to simulate complex scenes for street scene segmentation
Ramachandruni et al. Attentive task-net: Self supervised task-attention network for imitation learning using video demonstration
Liu et al. Differentiable robot rendering
Dai et al. Research on 2D Animation Simulation Based on Artificial Intelligence and Biomechanical Modeling.
CN115648203B (en) Method for realizing real-time mirror image behavior of robot based on lightweight neural network
CN117218713A (en) Action resolving method, device, equipment and storage medium
Liang et al. Interactive experience design of traditional dance in new media era based on action detection
Chen et al. Fast adaptive character animation synthesis algorithm based on depth image sequence
Wang A survey of visual analysis of human motion and its applications
Wu et al. Video driven adaptive grasp planning of virtual hand using deep reinforcement learning
CN111311648A (en) Hand-object interaction process tracking method based on cooperative differential evolution filtering
CN118429390B (en) Self-supervision target tracking method and system based on image synthesis and domain countermeasure learning
Chen YOLO Algorithm in Analysis and Design of Athletes’ Actions in College Physical Education

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant