Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, a first objective of the present invention is to provide a hand motion reconstruction method, which can obtain a more accurate three-dimensional reconstruction result and obtain a better object-hand interaction reconstruction result under a simpler hardware condition.
A second object of the invention is to propose a hand motion reconstruction device.
A third object of the invention is to propose a computer device.
A fourth object of the invention is to propose a non-transitory computer-readable storage medium.
To achieve the above objects, a first aspect of the present invention provides a hand motion reconstruction method, including: acquiring a hand depth data set, wherein each hand depth data in the hand depth data set comprises hand depth picture information and hand skeleton coordinates corresponding to the hand depth picture information; controlling a preset hand model to adjust its posture according to each hand depth data in the hand depth data set, and obtaining posture parameters of the fitted hand model after the posture adjustment; and obtaining a Gaussian distribution function of the posture parameters of the fitted hand model corresponding to each hand depth data, and reconstructing hand motion according to the Gaussian distribution function.
The hand motion reconstruction method addresses the technical problem in the prior art that the movement of a human hand is flexible and complex and is often accompanied by severe occlusion during interaction with an object. It can obtain a more accurate three-dimensional reconstruction result through isomorphic single-viewpoint RGB-D data, and can obtain a better object and hand interaction reconstruction result under a simpler hardware condition.
In an embodiment of the present invention, the controlling the preset hand model to adjust the posture according to each hand depth data in the hand depth data set, and obtain posture parameters of the fitting hand model after posture adjustment includes: acquiring first hand depth data meeting preset conditions in the hand depth data set; determining a first posture parameter of the preset hand model according to a point cloud matching algorithm and hand depth information in the first hand depth data; constructing a regression matrix of the first posture parameter through a gradient descent iterative algorithm and hand skeleton coordinates in the first hand depth data; determining second hand depth data in the hand depth data set except the first hand depth data, and calculating fitting skeleton coordinates of the preset hand model according to the regression matrix and hand skeleton coordinates corresponding to the second hand depth data; and calculating a second posture parameter of the preset hand model according to the fitting skeleton coordinate.
In an embodiment of the present invention, the acquiring first hand depth data in the hand depth data set that meets a preset condition includes: determining a reference posture parameter corresponding to each hand depth data; calculating a difference value between the reference posture parameter and a preset initial posture parameter; and determining the hand depth data whose difference value is smaller than a preset threshold value as the first hand depth data.
In an embodiment of the present invention, the hand motion reconstruction method further includes: acquiring continuous multi-frame images of interaction between a user hand and an object based on a preset RGB-D camera; extracting first color information and first depth information of the hand of the user and second color information and second depth information of the object according to the continuous multi-frame images; acquiring the motion state information of the object according to the second color information and the second depth information; extracting depth information of a first key point of the hand of the user according to the first depth information; estimating the depth information of a second key point of the hand of the user according to the depth information of the first key point and the Gaussian distribution function; and simulating the interactive animation of the object and the hand of the user according to the depth information of the first key point, the depth information of the second key point, the first color information and the motion state information.
In an embodiment of the present invention, the estimating depth information of a second key point of the hand of the user according to the depth information of the first key point and the Gaussian distribution function includes: determining estimated depth information of the second key point according to the depth information of the first key point and a preset algorithm; calculating the confidence of the estimated depth information according to the Gaussian distribution function; and detecting whether the confidence is greater than a preset threshold value, and if not, modifying the estimated depth information until the confidence is greater than the preset threshold value, and taking the modified estimated depth information as the depth information of the second key point.
In an embodiment of the present invention, before the acquiring, based on the preset RGB-D camera, a continuous multi-frame image of a user's hand interacting with an object, the method includes: acquiring internal parameters and external parameters of an RGB module in the RGB-D camera; acquiring human body action images shot by the RGB module and the depth module in the RGB-D camera simultaneously; and correcting the internal parameters and the external parameters according to a preset function and the human motion images shot at the same time.
In order to achieve the above objects, a second aspect of the present invention provides a hand motion reconstruction device, including: a first acquisition module, used for acquiring a hand depth data set, wherein each hand depth data in the hand depth data set comprises hand depth picture information and hand skeleton coordinates corresponding to the hand depth picture information; a control module, used for controlling a preset hand model to adjust its posture according to each hand depth data in the hand depth data set; a second acquisition module, used for acquiring the posture parameters of the fitted hand model after the posture adjustment; and a reconstruction module, used for reconstructing hand motion according to the Gaussian distribution function of the posture parameters of the fitted hand model corresponding to each hand depth data.
In an embodiment of the present invention, the second acquisition module includes: a determining unit, used for determining a first posture parameter of the preset hand model according to a point cloud matching algorithm and hand depth information in the first hand depth data; and a calculation unit, used for calculating a second posture parameter of the preset hand model according to the fitted skeleton coordinates.
Through the first acquisition module, the control module, the second acquisition module and the reconstruction module, the hand motion reconstruction device addresses the technical problem in the prior art that the movement of a human hand is flexible and complex and is often accompanied by severe occlusion during interaction with an object. A more accurate three-dimensional reconstruction result can be obtained through isomorphic single-viewpoint RGB-D data, and a better object and hand interaction reconstruction result can be obtained under a simpler hardware condition.
To achieve the above object, a third aspect of the present invention provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the hand motion reconstruction method according to the first aspect of the present invention is implemented.
In order to achieve the above objects, a fourth aspect of the present invention provides a non-transitory computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the hand motion reconstruction method according to the first aspect of the present invention.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer throughout to the same or similar elements or to elements having the same or similar function. The embodiments described below with reference to the drawings are exemplary, are intended to explain the invention, and are not to be construed as limiting the invention.
The hand motion reconstruction method and apparatus of the embodiments of the present invention are described below with reference to the drawings.
Fig. 1 is a schematic flow chart of a hand motion reconstruction method according to an embodiment of the present invention.
In view of the foregoing embodiments, an embodiment of the present invention provides a hand motion reconstruction method, as shown in fig. 1, the hand motion reconstruction method includes the following steps:
Step 101, acquiring a hand depth data set, wherein each hand depth data in the hand depth data set comprises hand depth picture information and hand skeleton coordinates corresponding to the hand depth picture information.
Specifically, publicly available hand depth data sets are obtained. The data sets are defined differently, but each contains a large amount of depth picture information and the corresponding hand skeleton coordinates, which can be a depth map of a human hand and the matching three-dimensional hand skeleton coordinate annotations. In order to unify the data, a MANO hand model needs to be fitted to each of the data sets.
Step 102, controlling the preset hand model to adjust its posture according to each hand depth data in the hand depth data set, and obtaining posture parameters of the fitted hand model after the posture adjustment.
Specifically, the preset hand model is controlled to adjust its posture according to each hand depth data in the hand depth data set, wherein the posture may include the length, thickness, etc. of the fingers, and the posture parameters of the fitted hand model after the posture adjustment are obtained, wherein the posture parameters may be the rotation angle, inclination direction, etc. of each finger joint. The specific fitting process is: acquiring first hand depth data meeting preset conditions in the hand depth data set; determining a first posture parameter of the preset hand model according to a point cloud matching algorithm and hand depth information in the first hand depth data; constructing a regression matrix of the first posture parameter through a gradient descent iterative algorithm and hand skeleton coordinates in the first hand depth data; determining second hand depth data in the hand depth data set other than the first hand depth data, and calculating fitted skeleton coordinates of the preset hand model according to the regression matrix and the hand skeleton coordinates corresponding to the second hand depth data; and calculating a second posture parameter of the preset hand model according to the fitted skeleton coordinates.
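As a non-limiting illustration of the point cloud matching step, the following is a minimal sketch that assumes an iterative closest point (ICP) scheme and, for brevity, a purely rigid alignment. Fitting an articulated hand model would additionally optimise joint parameters, which is not shown. The function names `best_rigid_transform` and `icp`, and the brute-force nearest-neighbour search, are hypothetical choices made for this example.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that R is a proper rotation.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(model_pts, depth_pts, n_iters=20):
    """Rigidly align model_pts to depth_pts; returns the moved model points."""
    src = np.asarray(model_pts, dtype=float).copy()
    dst = np.asarray(depth_pts, dtype=float)
    for _ in range(n_iters):
        # Brute-force nearest neighbour in the depth point cloud.
        d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
        nearest = dst[d2.argmin(axis=1)]
        R, t = best_rigid_transform(src, nearest)
        src = src @ R.T + t
    return src
```

In practice the depth point cloud would be back-projected from the depth picture, and a spatial index would replace the brute-force search for larger clouds.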
Further, to acquire the first hand depth data meeting the preset conditions in the hand depth data set, a reference posture parameter corresponding to each hand depth data is first determined, wherein it needs to be judged whether the data volume of the depth data is large, for example from the number of pixel points and the depth smoothness among the pixel points. Then a difference value between the reference posture parameter and a preset initial posture parameter is calculated, and the hand depth data whose difference value is smaller than a preset threshold value is determined to be the first hand depth data.
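The selection rule above can be sketched as follows. This is only an illustration: the Euclidean norm is assumed as the difference measure, and the function name `select_first_hand_data` and its arguments are hypothetical.

```python
import numpy as np

def select_first_hand_data(reference_poses, initial_pose, threshold):
    """Split samples into 'first' (close to the initial pose) and 'second' (the rest).

    reference_poses: (n_samples, n_params) reference posture parameters.
    initial_pose:    (n_params,) preset initial posture parameters.
    threshold:       preset threshold on the difference value.
    """
    reference_poses = np.asarray(reference_poses, dtype=float)
    initial_pose = np.asarray(initial_pose, dtype=float)
    # Difference value per sample; the Euclidean norm is an assumed metric.
    diffs = np.linalg.norm(reference_poses - initial_pose, axis=1)
    first_idx = np.flatnonzero(diffs < threshold)
    second_idx = np.flatnonzero(diffs >= threshold)
    return first_idx, second_idx, diffs
```

The returned index arrays identify the first hand depth data, which is fitted directly, and the second hand depth data, which is handled through the regression matrix.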
It can be understood that, as a possible implementation of the embodiment of the present invention, first hand depth data meeting the preset conditions, i.e., several groups of data with simple actions, are first selected, and an ICP point cloud matching method is used to fit a MANO hand model to the depth data. Secondly, a regression matrix between the two skeleton definitions is learned from these few groups of data. Because the data volume is small, directly solving for the regression matrix may be underdetermined; however, because some consistency exists among different hand skeleton definitions, an L1-norm constraint can be added, and the regression matrix between the hand skeleton of the MANO model and the hand skeleton of the data set can be estimated through a gradient descent iterative algorithm. Finally, the MANO-defined skeletons of the remaining data are obtained through the regression matrix, and the posture parameters are obtained by combining the depth images.
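A minimal sketch of estimating such a regression matrix is given below. It assumes that the skeleton coordinates of each sample are flattened into a row vector and that the L1-norm constraint is applied as a penalty solved by proximal gradient descent (ISTA), which is one possible gradient descent variant and not necessarily the one used in the embodiment. The names `X`, `Y`, `lam` and `fit_regression_matrix` are hypothetical.

```python
import numpy as np

def soft_threshold(a, t):
    """Proximal operator of the L1 norm."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def fit_regression_matrix(X, Y, lam=1e-3, n_iters=500):
    """Estimate W minimising 0.5/n * ||X W - Y||_F^2 + lam * ||W||_1.

    X: (n_samples, d_in)  flattened data-set skeleton coordinates.
    Y: (n_samples, d_out) flattened MANO skeleton coordinates from the fitted model.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[0]
    W = np.zeros((X.shape[1], Y.shape[1]))
    # Step size from the Lipschitz constant of the smooth term.
    lipschitz = np.linalg.norm(X, 2) ** 2 / n
    step = 1.0 / max(lipschitz, 1e-12)
    for _ in range(n_iters):
        grad = X.T @ (X @ W - Y) / n
        W = soft_threshold(W - step * grad, step * lam)
    return W
```

For the remaining data, the MANO-defined skeleton would then be predicted as `X_rest @ W`. The L1 penalty favours a sparse matrix, which reflects the assumption that each MANO joint depends on only a few joints of the data-set skeleton.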
Step 103, obtaining a Gaussian distribution function of the posture parameters of the fitted hand model corresponding to each hand depth data, and reconstructing hand motion according to the Gaussian distribution function.
Specifically, to obtain the Gaussian distribution function of the posture parameters of the fitted hand model corresponding to each hand depth data, the posture parameters of the hand model can be analyzed by a statistical method to obtain a Gaussian mixture distribution of the posture parameters, which is used as the posture prior distribution of the hand and thereby facilitates hand motion reconstruction.
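As an illustration of building such a prior, the sketch below fits a single Gaussian to the posture parameters and evaluates its log-density. A single component is used here only to keep the example short; a Gaussian mixture would typically be fitted with an EM procedure instead. The function names and the regularisation term `reg` are assumptions of this example.

```python
import numpy as np

def fit_gaussian_prior(poses, reg=1e-6):
    """Estimate mean and covariance of posture parameters (n_samples, n_params)."""
    poses = np.asarray(poses, dtype=float)
    mean = poses.mean(axis=0)
    centered = poses - mean
    cov = centered.T @ centered / max(len(poses) - 1, 1)
    # Small diagonal term keeps the covariance invertible for scarce data.
    cov += reg * np.eye(poses.shape[1])
    return mean, cov

def log_density(pose, mean, cov):
    """Log of the multivariate normal density at the given posture."""
    diff = np.asarray(pose, dtype=float) - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (mean.shape[0] * np.log(2.0 * np.pi) + logdet + maha)
```

A posture with a higher log-density is more plausible under the prior, which is how the distribution can constrain the reconstruction of poorly observed hand postures.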
After the Gaussian distribution function of the posture parameters of the fitted hand model corresponding to each hand depth data is obtained, hand motion reconstruction is performed according to the Gaussian distribution function. Specifically, an embodiment of the present invention provides a hand motion reconstruction method which, as shown in fig. 2, includes the following steps:
Step 201, acquiring continuous multi-frame images of the interaction between the hand of the user and the object based on a preset RGB-D camera.
Specifically, after calibration is completed, the preset RGB-D camera collects continuous multi-frame images of the interaction between a human hand and an object, which may be an RGB-D sequence in this example.
It should be noted that, before the continuous multi-frame images of the user hand interacting with the object are acquired based on the preset RGB-D camera, the internal parameters and external parameters of the RGB module in the RGB-D camera need to be acquired; human body action images shot simultaneously by the RGB module and the depth module in the RGB-D camera are acquired; and the internal parameters and the external parameters are corrected according to a preset function and the human body action images shot simultaneously.
Specifically, there is a certain viewing angle difference between the color (RGB) picture and the depth picture of the depth camera, so the camera needs to be calibrated by a checkerboard method or another method. Because the Kinect camera has a human body recognition function, this function can be used to carry out the camera calibration:
Let the internal parameters of the RGB camera be given by equation (1) and the external parameters by equation (2):
The projection matrix from three-dimensional space to the two-dimensional plane of the RGB image is given by equation (3):
The depth camera and the color camera shoot human body actions synchronously, and the correspondence between the RGB picture and the depth picture can be obtained by using the map function provided in the SDK of the Kinect camera, so that the camera can be calibrated through this correspondence.
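For orientation, the sketch below composes a projection matrix from internal and external parameters and projects three-dimensional points onto the image plane. It assumes the standard pinhole camera model without lens distortion, which may differ from the exact form of equations (1) to (3). The parameter names `fx`, `fy`, `cx`, `cy`, `R` and `t` are the conventional ones and are assumptions of this example.

```python
import numpy as np

def projection_matrix(fx, fy, cx, cy, R, t):
    """3x4 projection matrix P = K [R | t] under the pinhole model."""
    K = np.array([[fx, 0.0, cx],
                  [0.0, fy, cy],
                  [0.0, 0.0, 1.0]])                    # internal parameters
    Rt = np.hstack([np.asarray(R, dtype=float),
                    np.asarray(t, dtype=float).reshape(3, 1)])  # external parameters
    return K @ Rt

def project(P, points_3d):
    """Project (n, 3) points to (n, 2) pixel coordinates."""
    pts = np.asarray(points_3d, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    uvw = homogeneous @ P.T
    return uvw[:, :2] / uvw[:, 2:3]                    # perspective division
```

Given pixel correspondences between the RGB picture and the depth picture, the internal and external parameters could be corrected by minimising the reprojection error computed with `project`.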
Step 202, extracting first color information and first depth information of the hand of the user and second color information and second depth information of the object according to the continuous multi-frame images.
Specifically, image information and depth information of the hand of the user and image information and depth information of the object are extracted from the RGB-D sequence. RGB-D pictures of the object from different viewpoints are obtained, and a three-dimensional model of the object can be reconstructed by an existing multi-viewpoint reconstruction algorithm, such as the KinectFusion algorithm provided with the open-source computer vision library OpenCV 4.0, or the commercial software Agisoft Metashape Pro.
Step 203, acquiring the motion state information of the object according to the second color information and the second depth information.
Specifically, the motion state of the object is identified from the RGB-D data by a deep learning method according to the image information and the depth information of the object. The motion state information of the object includes the motion position of the object, the shape of the object, and the like.
Step 204, extracting the depth information of the first key point of the hand of the user according to the first depth information.
Specifically, according to the depth information of the hand of the user, the depth information of the sparse key points of the hand that are captured, i.e., not occluded, is identified from the RGB-D data by a deep learning method such as OpenPose.
Step 205, estimating the depth information of the second key point of the hand of the user according to the depth information of the first key point and the Gaussian distribution function.
Specifically, estimated depth information of the second key point is determined according to the depth information of the first key point and a preset algorithm, the confidence of the estimated depth information is calculated according to the Gaussian distribution function, and it is detected whether the confidence is greater than a preset threshold. If not, the estimated depth information is modified until the confidence is greater than the preset threshold, and the modified estimated depth information is used as the depth information of the second key point. In this example, the second key point may be a point that is not captured or is occluded. Since such a point is estimated by means of the Gaussian function, its confidence needs to be determined, and the estimated depth information is modified until the confidence is greater than the preset threshold and is then used as the depth information of the occluded point.
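The estimate-check-modify loop can be sketched as follows. The example assumes that the confidence is the unnormalised Gaussian density of the estimated depths (a value between 0 and 1) and that an estimate is modified by moving it a fraction of the way toward the prior mean. Both choices, as well as the names `confidence`, `refine_depth`, `step` and `max_iters`, are hypothetical, and the preset algorithm that produces the initial estimate is not shown.

```python
import numpy as np

def confidence(z, mean, cov):
    """Unnormalised Gaussian density in (0, 1]; equals 1 at the prior mean."""
    diff = z - mean
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)))

def refine_depth(initial, mean, cov, threshold=0.5, step=0.2, max_iters=100):
    """Modify the estimated depths until their confidence exceeds the threshold."""
    z = np.asarray(initial, dtype=float).copy()
    mean = np.asarray(mean, dtype=float)
    for _ in range(max_iters):
        if confidence(z, mean, cov) > threshold:
            break
        # Assumed modification rule: pull the estimate toward the prior mean.
        z += step * (mean - z)
    return z, confidence(z, mean, cov)
```

The iteration cap prevents an endless loop when the threshold cannot be reached, in which case the caller can inspect the returned confidence.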
Step 206, simulating the interactive animation of the object and the hand of the user according to the depth information of the first key point, the depth information of the second key point, the first color information and the motion state information.
Specifically, the interactive animation of the object and the hand of the user is simulated according to the depth information of the captured points, the depth information of the occluded points, the image information of the hand of the user and the motion state of the object estimated by a preset algorithm.
The hand motion reconstruction method provided by the embodiment of the invention addresses the technical problem in the prior art that the movement of a human hand is flexible and complex and is often accompanied by severe occlusion during interaction with an object. It can obtain a more accurate three-dimensional reconstruction result through isomorphic single-viewpoint RGB-D data, and can obtain a better object and hand interaction reconstruction result under a simpler hardware condition.
In order to implement the above embodiments, the invention further provides a hand motion reconstruction device.
Fig. 3 is a schematic structural diagram of a hand motion reconstruction device according to an embodiment of the present invention.
As shown in fig. 3, the hand motion reconstruction device includes a first acquisition module 10, a control module 20, a second acquisition module 30 and a reconstruction module 40. The first acquisition module 10 is used for acquiring a hand depth data set, wherein each hand depth data in the hand depth data set comprises hand depth picture information and hand skeleton coordinates corresponding to the hand depth picture information. The control module 20 then controls a preset hand model to adjust its posture according to each hand depth data in the hand depth data set, and the second acquisition module 30 acquires the posture parameters of the fitted hand model after the posture adjustment. As shown in fig. 4, on the basis of fig. 3, the second acquisition module 30 further includes: a determining unit 31, configured to determine a first posture parameter of the preset hand model according to a point cloud matching algorithm and hand depth information in the first hand depth data; and a calculating unit 32, configured to calculate a second posture parameter of the preset hand model according to the fitted skeleton coordinates. Finally, the reconstruction module 40 is used for reconstructing hand motion according to the Gaussian distribution function of the posture parameters of the fitted hand model corresponding to each hand depth data.
It should be noted that the explanation of the hand motion reconstruction method embodiment is also applicable to the hand motion reconstruction device of this embodiment, and is not repeated here.
In order to implement the above embodiments, the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the hand motion reconstruction method as described in the above embodiments is implemented.
In order to implement the above embodiments, the present invention further proposes a non-transitory computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the hand motion reconstruction method as described in the above embodiments.
In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, various embodiments or examples and features of different embodiments or examples described in this specification can be combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be implemented by a program instructing the relevant hardware, that the program may be stored in a computer-readable storage medium, and that the program, when executed, performs one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.