
CN114401446B - Human body posture migration method, device and system, electronic equipment and storage medium - Google Patents


Info

Publication number
CN114401446B
CN114401446B
Authority
CN
China
Prior art keywords
human body
migrated
image
posture
acquiring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111547521.5A
Other languages
Chinese (zh)
Other versions
CN114401446A (en)
Inventor
杨城
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Cubesili Information Technology Co Ltd
Original Assignee
Guangzhou Cubesili Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Cubesili Information Technology Co Ltd filed Critical Guangzhou Cubesili Information Technology Co Ltd
Priority to CN202111547521.5A priority Critical patent/CN114401446B/en
Publication of CN114401446A publication Critical patent/CN114401446A/en
Application granted granted Critical
Publication of CN114401446B publication Critical patent/CN114401446B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/441: Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21: Server components or server architectures
    • H04N21/218: Source of audio or video content, e.g. local disk arrays
    • H04N21/2187: Live feed
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81: Monomedia components thereof
    • H04N21/816: Monomedia components thereof involving special video data, e.g. 3D video

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a human body posture migration method, device and system, an electronic device, and a storage medium. The human body posture migration method comprises the following steps: acquiring texture features of a target object in a human body image to be migrated and action features of a source object in a source video; acquiring a human body posture conversion flow based on the source video and the human body image to be migrated; obtaining vector features of the target object based on the texture features and the human body posture conversion flow; determining whether an error value between the vector features and a preset true value is greater than a first preset value; and if not, generating a target video based on the vector features and the action features. The application can simply and efficiently obtain a stable conversion flow prediction result, making the final generated result clearer and more stable.

Description

Human body posture migration method, device and system, electronic equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a human body posture migration method, a human body posture migration apparatus, a live broadcast system, an electronic device, and a computer readable storage medium.
Background
Prior-art approaches to human body posture migration fall into two major categories: driving a 3D reconstructed human body model to generate the result, and directly applying geometric warping (or network-learned warping) based on 2D key points. The 3D reconstruction scheme requires an excessive amount of computation and consumes a large amount of hardware resources, and its parameterized model makes the generated postures stiff and uncoordinated, so that the result looks unrealistic; the generated result of the 2D key-point scheme is strongly affected by how well the conversion flow is learned.
Disclosure of Invention
The application provides at least a human body posture migration method, a human body posture migration device, a live broadcast system, electronic equipment and a computer readable storage medium.
The first aspect of the present application provides a human body posture migration method, which includes:
acquiring texture features of a target object in a human body image to be migrated and action features of a source object in a source video;
acquiring a human body posture conversion flow based on the source video and the human body image to be migrated;
obtaining vector features of the target object based on the texture features and the human body posture conversion flow;
determining whether an error value between the vector features and a preset true value is greater than a first preset value;
and if not, generating a target video based on the vector features and the action features.
The human body image to be migrated comprises at least a first human body image to be migrated and a second human body image to be migrated, the action of the target object in the first human body image to be migrated differing from the action of the target object in the second human body image to be migrated, and the method further comprises:
acquiring a first action feature of the target object in the first human body image to be migrated;
acquiring a second action feature of the target object in the second human body image to be migrated;
and calculating the preset true value based on the first action feature and the second action feature.
After the step of generating the target video based on the vector features and the action features, the human body posture migration method further comprises:
determining whether a confidence of the target video is greater than a second preset value;
if yes, outputting the target video;
if not, updating the preset true value based on the target video, and returning to the step of acquiring the texture features of the target object in the human body image to be migrated and the action features of the source object in the source video.
The human body posture migration method further comprises the following steps:
performing posture estimation on the human body image to be migrated to obtain a first posture estimation map;
acquiring first motion data based on the first posture estimation map, the first motion data comprising position information of at least one human body key point of the target object;
obtaining a first transformation parameter based on the first motion data and the standard human body posture points;
and obtaining a new human body image to be migrated based on the first transformation parameter and the first posture estimation map, the target object in the new human body image to be migrated being located at the center of the new human body image to be migrated.
The human body posture migration method further comprises the following steps:
acquiring an image of a specific frame in the source video, and performing posture estimation on the image of the specific frame to obtain a second posture estimation map, the image of the specific frame comprising a still image of the source object making at least one action;
acquiring second motion data based on the second posture estimation map, the second motion data comprising position information of at least one human body key point of the source object;
obtaining a second transformation parameter based on the second motion data and the standard human body posture points;
and obtaining a new image based on the second transformation parameter and the second posture estimation map, the source object in the new image being located at the center of the new image.
The human body posture migration method further comprises the following steps:
acquiring images of all training objects in a training set, and performing posture estimation on the images of the training objects to obtain at least one corresponding posture estimation map;
collecting motion data of each posture estimation map based on the at least one corresponding posture estimation map, the motion data comprising position information of at least one human body key point of the training object corresponding to that posture estimation map;
and averaging the motion data of all the posture estimation maps to obtain the standard human body posture points.
A second aspect of the present application provides a human body posture transfer apparatus, comprising:
an acquisition module, configured to acquire texture features of a target object in a human body image to be migrated, action features of a source object in a source video, a preset true value, and a human body posture conversion flow based on the source video and the human body image to be migrated;
a first calculation module, configured to obtain vector features of the target object based on the texture features and the human body posture conversion flow;
a determination module, configured to determine whether an error value between the vector features and the preset true value is greater than a first preset value;
and a second calculation module, configured to obtain a target video based on the vector features and the action features.
A third aspect of the present application provides a live broadcast system comprising an anchor end, a viewer end, and a server. A human body image to be migrated and a source video are input through the anchor end or the viewer end, and the server obtains a target video from the human body image to be migrated and the source video by the human body posture migration method of the first aspect.
A fourth aspect of the present application provides an electronic device, including a memory and a processor coupled to each other, where the processor is configured to execute program instructions stored in the memory, so as to implement the human posture migration method in the first aspect.
A fifth aspect of the present application provides a computer-readable storage medium having stored thereon program instructions which, when executed by a processor, implement the human posture migration method of the first aspect described above.
Compared with the prior art, the present application adds a preset true value so that supervised training is incorporated into the human body posture migration method; a stable conversion flow prediction result can thus be obtained simply and efficiently, making the final generated result clearer and more stable.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic view of a first process according to an embodiment of the human body posture migration method of the present application;
FIG. 2 is a schematic diagram illustrating a structure of an embodiment of a picture style migration model according to the present application;
FIG. 3 is a schematic view of a second process according to an embodiment of the human body posture migration method of the present application;
FIG. 4 is a specific flowchart of obtaining the preset true value before step S14 in FIG. 1;
FIG. 5 is a flow chart of another embodiment of a human body posture migration method of the present application;
FIG. 6 is a specific flowchart of acquiring the standard human body posture points before step S33 and step S43 in FIG. 5;
FIG. 7 is a schematic diagram of a frame of an embodiment of a live system of the present application;
FIG. 8 is a schematic diagram of a frame of an embodiment of a human body posture transfer apparatus of the present application;
FIG. 9 is a schematic diagram of a frame of an embodiment of an electronic device of the present application;
FIG. 10 is a schematic diagram of a frame of an embodiment of a computer-readable storage medium of the present application.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions of the present application, the human body posture migration method, the human body posture migration apparatus, the live broadcast system, the electronic device and the computer readable storage medium provided by the present application are described in further detail below with reference to the accompanying drawings and the detailed description. It is to be understood that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of the application.
The terms "first," "second," and the like in this disclosure are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Referring to fig. 1 and fig. 2, fig. 1 is a first flow diagram of an embodiment of a human body posture migration method according to the present application, and fig. 2 is a schematic structural diagram of an embodiment of a picture style migration model according to the present application.
The execution subject of the human body posture migration method of the present application may be a human body posture migration apparatus; for example, the method may be executed by a terminal device, a server, or other processing device. The apparatus may be user equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. In some possible implementations, the human body posture migration method may be implemented by a processor invoking computer-readable instructions stored in a memory.
Specifically, the human body posture migration method of the embodiment of the present disclosure may include the steps of:
Step S11: and obtaining texture characteristics of a target object in the human body image to be migrated and action characteristics of a source object in the source video.
The human body image to be migrated and the source video are provided by a user, and are used for replacing a source object which makes a specific action in the source video with a target object in the human body image to be migrated, so that the user obtains the video of the target object which makes the specific action through a human body gesture migration method.
To replace the source object with the target object, the action features of the source object and the texture features of the target object must first be confirmed. Specifically, the action features are specific actions, for example specific dance movements; the texture features are features of the target object that distinguish it from the source object, and may specifically be skin color, facial features, hairstyle, skin condition, and the like.
Alternatively, the body image to be migrated may be a photograph of the body, containing all the texture features of the target object, such as a frontal whole-body photograph of a person. The source video may be a dance video, in particular, a video of dancing by a single person or multiple persons, including a source object to be replaced.
When the source video contains a single person, that person is the source object; when it contains multiple persons, the source object needs to be determined. Specifically, human body image recognition may be performed on the source video to obtain all the human body images it contains, and action recognition may be performed on the object in each human body image, so that the object performing the specific action is determined as the source object.
In other words, the human body posture migration method determines the object to be replaced by acquiring the dance movements of the source object, and replaces the texture features of the source object with the corresponding texture features acquired from the target object, thereby achieving human body posture migration.
Specifically, this embodiment extracts the texture feature X′ of the target object from the human body image to be migrated through the E_src network, and extracts the motion feature D′ of the source object from the source video through the E_tgt network.
Step S12: and acquiring a human body posture conversion stream based on the source video and the human body image to be migrated.
The human body posture conversion flow contains information about the conversion from the user's posture to the dance posture, and is an important parameter in human body posture migration.
Step S13: and obtaining the vector features of the target object based on the texture features and the human body posture conversion flow.
In this embodiment, the human body posture conversion flow T is extracted from the source video and the human body image to be migrated through the E_flow network, and the conversion flow T is then applied to the texture feature X′ of the target object to obtain the vector feature X_T of the converted target object.
Step S14: and judging whether an error value between the vector characteristic and a preset true value is larger than a first preset value.
In this embodiment, a preset true value is added, and an error value is calculated between the generated result of the E_flow network, that is, the vector feature X_T of the target object, and the preset true value, so as to perform constrained training and thereby incorporate supervision into the training.
Specifically, this embodiment computes an L2-norm loss (L2-Loss) from the vector feature X_T of the target object and the preset true value, that is, the least-squares error (LSE): the sum of squared differences between the target value and the estimated value, here the sum of squared differences between the vector feature X_T and the preset true value, which training seeks to minimize.
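The least-squares error of this step can be sketched in a few lines of numpy (a minimal illustration; the feature values, the preset true value, and the first preset value below are made-up toy numbers, not values from the patent):

```python
import numpy as np

def l2_loss(vector_feature, preset_true_value):
    """Least-squares error: sum of squared differences between the
    warped vector feature X_T and the preset true value."""
    diff = np.asarray(vector_feature, dtype=float) - np.asarray(preset_true_value, dtype=float)
    return float(np.sum(diff ** 2))

# Toy example: the error decides whether the method proceeds to
# generation (step S15) or the features are re-acquired.
x_t = np.array([0.9, 1.1, 2.0])
true_value = np.array([1.0, 1.0, 2.0])
first_preset_value = 0.5
error = l2_loss(x_t, true_value)          # 0.02 on these toy numbers
proceed_to_generation = error <= first_preset_value
```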
If the error value between the vector feature and the preset true value is greater than the first preset value, the method returns to step S11; if not, step S15 is executed.
Step S15: if not, generating a target video based on the vector feature X_T and the action feature.
When the least-squares error between the vector feature X_T of the target object and the preset true value is less than or equal to the first preset value, the deviation of the vector feature X_T is within a controllable range, and a target video is then generated based on the vector feature X_T and the action feature of the source object through the D_tgt network.
When the least-squares error between the vector feature X_T of the target object and the preset true value is greater than the first preset value, a large error exists in the vector feature X_T; if it were still used, the texture sharpness and stability of the generated target video would suffer. The method therefore returns to step S11, and the texture feature of the target object, the action feature of the source object, the human body posture conversion flow, and the vector feature X_T of the target object are acquired again.
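Steps S11-S15 and the retry branch can be summarized as a small control loop. The sketch below injects the networks (E_src/E_tgt, E_flow, D_tgt) as plain callables, since their architectures are not specified in the patent; the stubs in the usage example are purely illustrative:

```python
import numpy as np

def pose_migration(extract_features, extract_flow, apply_flow,
                   generate_video, preset_true_value,
                   first_preset_value, max_iters=10):
    """Control loop for steps S11-S15: keep re-acquiring features until
    the warped feature X_T is close enough to the preset true value,
    then generate the target video."""
    for _ in range(max_iters):
        texture, action = extract_features()    # step S11 (E_src / E_tgt)
        flow = extract_flow()                   # step S12 (E_flow)
        x_t = apply_flow(texture, flow)         # step S13
        error = float(np.sum((np.asarray(x_t) - np.asarray(preset_true_value)) ** 2))  # step S14
        if error <= first_preset_value:
            return generate_video(x_t, action)  # step S15 (D_tgt)
    return None  # gave up after max_iters re-acquisitions

# Illustrative stubs standing in for the real networks.
result = pose_migration(
    extract_features=lambda: (np.array([1.0, 2.0]), "dance"),
    extract_flow=lambda: np.array([0.0, 0.0]),
    apply_flow=lambda tex, flow: tex + flow,
    generate_video=lambda x_t, action: (tuple(x_t), action),
    preset_true_value=np.array([1.0, 2.0]),
    first_preset_value=0.5,
)
```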
Before step S14 is performed, the preset true value needs to be obtained. The specific acquisition process is shown in fig. 4, which is a specific flowchart of obtaining the preset true value before step S14 in fig. 1.
Specifically, the method comprises the following steps:
Step S21: a first motion feature of a target object in a first human body image to be migrated is acquired.
Step S22: and acquiring a second action characteristic of the target object in the second human body image to be migrated.
Step S23: and calculating a preset true value based on the first action characteristic and the second action characteristic.
The human body images to be migrated comprise a plurality of human body images to be migrated, and actions of target objects in each human body image to be migrated can be the same or different.
Specifically, the embodiment includes a first to-be-migrated human body image and a second to-be-migrated human body image, and the motion of the target object in the first to-be-migrated human body image is different from the motion of the target object in the second to-be-migrated human body image. Optionally, the motion of the target object in the first human body image to be migrated is different from the motion of the source object, and the motion of the target object in the second human body image to be migrated is the same as the motion of the source object; or the action of the target object in the first human body image to be migrated is the same as the action of the source object, and the action of the target object in the second human body image to be migrated is different from the action of the source object.
In this embodiment, the first action feature and the second action feature of the target object in the first and the second human body image to be migrated are respectively acquired through the flow_net network, and the preset true value is calculated based on the first action feature and the second action feature.
Optionally, in other embodiments, the plurality of human body images to be migrated includes a plurality of target objects with different actions, and the preset true value is calculated according to the plurality of action features by using different action features of the target objects in the plurality of human body images to be migrated. Wherein at least one of the plurality of motion features is the same as a motion feature of the source object.
By setting the preset true value, this embodiment realizes pre-training of the E_flow network, which improves the accuracy of the E_flow network, and a stable human body posture conversion flow prediction result can be obtained simply and efficiently by combining supervision with training.
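The patent does not spell out the formula by which the preset true value is computed from the two action features; one plausible sketch, assuming the action features are 2-D key-point arrays and the true value is their per-point displacement, is:

```python
import numpy as np

def preset_true_value(first_action_feature, second_action_feature):
    """Hypothetical ground-truth flow: per-key-point displacement from
    the first action to the second.  This difference is an assumption
    for illustration; the patent leaves the formula unspecified."""
    a = np.asarray(first_action_feature, dtype=float)
    b = np.asarray(second_action_feature, dtype=float)
    return b - a

# Two toy poses with two key points each.
truth = preset_true_value([[0.0, 0.0], [1.0, 1.0]],
                          [[1.0, 0.0], [1.0, 2.0]])
```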
Based on the above embodiments, the confidence level of the target video may be further determined, referring to fig. 3, and fig. 3 is a second flow chart of an embodiment of the human body posture migration method according to the present application.
Specifically, the human body posture migration method of the embodiment of the present disclosure may further include the steps of:
step S16: and judging whether the confidence coefficient of the target video is larger than a second preset value.
When the target video is obtained through the d_tgt network, the confidence level of the target video needs to be further determined to determine whether the target video is "real". If the determination result is yes, step S18 is executed, and if the determination result is no, step S17 is executed.
Step S17: if not, based on the target video, updating the preset true value, and returning to the step S11.
Step S18: if yes, outputting the target video.
Specifically, this embodiment performs the confidence determination through the Dec network. The input of the Dec network is the target video, and its output represents the probability that the target video is a real video: an output of 1 means the video is certainly real, and an output of 0 means it cannot be a real video.
Optionally, the second preset value in this embodiment is set to 0.5. When the output of the Dec network is greater than 0.5, the Dec network judges the target video to be a real video, and the target video is output directly. When the output of the Dec network is less than or equal to 0.5, the Dec network judges that the target video is unlikely to be a real video; the preset true value is then updated based on the target video, and the method returns to step S11 to obtain a new target video.
In this embodiment, the confidence determination is performed on the target video through the Dec network, and the preset true value is updated according to the output of the Dec network, so that the output target video is more "real".
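The Dec-network decision reduces to a threshold test. A minimal sketch (the rule for updating the preset true value on rejection is not detailed in the patent, so only the gate itself is shown):

```python
def confidence_gate(confidence, second_preset_value=0.5):
    """Return True when the Dec network's output (probability that the
    target video is real) exceeds the second preset value, meaning the
    video should be output; False means the preset true value is
    updated and the method retries from step S11."""
    return confidence > second_preset_value
```

Note that the boundary value 0.5 itself is rejected, matching "less than or equal to 0.5" above.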
In order to further enhance the learning effect of the E_flow network, another embodiment of the present application is provided; please refer to fig. 5, which is a flow chart of another embodiment of the human body posture migration method of the present application. Specifically, the human body posture migration method of this embodiment of the present disclosure may include the following steps:
Step S31: and carrying out gesture estimation on the human body image to be migrated to obtain a first gesture estimation graph.
After the human body image to be migrated is acquired, the posture of the human body image to be migrated needs to be estimated, wherein the posture estimation method can be various, can be a two-dimensional posture estimation method or a three-dimensional posture estimation method, and the embodiment is not particularly limited to this.
Step S32: first motion data is collected based on the first pose estimation map.
According to the embodiment, a corresponding first gesture T 1 is obtained according to the first gesture estimation graph, and first motion data of the target object is collected from the first gesture T 1, where the first motion data includes position information of at least one human body key point of the target object.
The position information of the key points of the human body is used for positioning the head position and limb position of the target object in the migrated human body image, including but not limited to the top of the head, the five sense organs, the neck, the main joints of the limbs and the like. Optionally, in this embodiment, the human body key points mainly include seven posture key points such as neck and limb main joints.
Step S33: and obtaining a first transformation parameter based on the first motion data and the standard human body posture point.
The embodiment performs affine transformation parameter calculation based on the first motion data and the standard human body posture point to obtain a first transformation parameter θ 1.
Step S34: and obtaining a new human body image to be migrated based on the first transformation parameters and the first posture estimation diagram.
In this embodiment, the human body image to be migrated is aligned and changed by the formula (1), so that the target object in the new human body image to be migrated is located at the center of the new human body image to be migrated. The formula (1) is specifically shown as follows:
X2=T(X1,θ) (1)
Wherein X 2 is an image after change, X 1 is an image before change, θ is a change parameter, and when using formula (1), in this embodiment, the first change parameter θ 1 and the first gesture T 1 are respectively input into formula (1) to obtain a new image of a human body to be migrated.
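Formula (1) can be illustrated on key-point coordinates. The sketch below assumes θ is a 2x3 affine matrix (the patent does not fix its representation) and builds a pure-translation θ that moves the key-point centroid to the image centre, mimicking the centering of steps S34 and S44:

```python
import numpy as np

def affine_transform(points, theta):
    """Formula (1): X2 = T(X1, theta) applied to N 2-D key points,
    with theta assumed to be a 2x3 affine matrix."""
    pts = np.asarray(points, dtype=float)               # (N, 2)
    homo = np.hstack([pts, np.ones((len(pts), 1))])     # homogeneous (N, 3)
    return homo @ np.asarray(theta, dtype=float).T      # (N, 2)

def centering_theta(points, image_center):
    """Pure-translation theta moving the key-point centroid to the
    image centre (an illustrative stand-in for the affine parameters
    computed in steps S33/S43)."""
    pts = np.asarray(points, dtype=float)
    tx, ty = np.asarray(image_center, dtype=float) - pts.mean(axis=0)
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty]])

# Key points with centroid (1, 1), moved to the centre of a 10x10 image.
theta = centering_theta([[0.0, 0.0], [2.0, 2.0]], image_center=(5.0, 5.0))
centered = affine_transform([[0.0, 0.0], [2.0, 2.0]], theta)
```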
Step S41: and acquiring an image of a specific frame in the source video, and carrying out attitude estimation on the image of the specific frame to obtain a second attitude estimation graph.
The source video is a dance video, which includes multiple frames of images, but not every frame of image has a source object to be replaced, so that multiple frames of images need to be screened to obtain images of a specific frame. Specifically, the image of the specific frame includes a static image of at least one action made by the source object, and the pose estimation is performed on the image of the specific frame, so as to obtain a second pose estimation graph.
The posture estimation method may be a two-dimensional posture estimation method or a three-dimensional posture estimation method, and this embodiment is not particularly limited herein.
Step S42: second motion data is acquired based on the second pose estimation map.
The embodiment obtains a corresponding second gesture T 2 according to the second gesture estimation diagram, and collects second motion data of the source object from the second gesture T 2, where the second motion data includes position information of at least one human body key point of the source object.
The position information of the key points of the human body is used for positioning the head position and limb position of the source object in the static image, including but not limited to the top of the head, the five sense organs, the neck, the main joints of the limbs and the like. Optionally, in this embodiment, the human body key points mainly include seven posture key points such as neck and limb main joints.
Step S43: obtaining a second transformation parameter based on the second motion data and the standard human body pose points.
This embodiment computes affine transformation parameters from the second motion data and the standard human body pose points to obtain the second transformation parameter θ_2.
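One common way to compute affine parameters from two point sets is a least-squares fit mapping the detected key points onto the standard pose points. The sketch below shows this with NumPy; it is an assumed method, since the patent does not specify how the affine parameters are solved.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares fit of a 2x3 affine matrix theta mapping detected
    keypoints `src` onto standard pose points `dst` (both (N, 2))."""
    A = np.hstack([src, np.ones((len(src), 1))])    # (N, 3) homogeneous
    theta, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return theta.T                                  # (2, 3) affine matrix

# A pure translation between the point sets is recovered exactly.
src = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
dst = src + np.array([5.0, -3.0])
theta = fit_affine(src, dst)
```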
Step S44: obtaining a new image based on the second transformation parameter and the second pose estimation graph.
In this embodiment, the images of the specific frames are aligned by formula (1) so that the source object in each new image is located at the center of that image. When applying formula (1), this embodiment inputs the second transformation parameter θ_2 and the second pose T_2 into formula (1) to obtain the new image.
Optionally, the present application may execute steps S31 and S41 in parallel, or sequentially in either order: for example, steps S31-S34 first and then steps S41-S44, or steps S41-S44 first and then steps S31-S34; the present application is not limited in this respect.
In addition, before step S33 and step S43 are executed, the standard human body pose points need to be acquired. The specific acquisition process is shown in fig. 6, which is a flowchart of acquiring the standard human body pose points before steps S33 and S43 of fig. 4. Specifically, the method comprises the following steps:
Step S51: acquiring the images of all training objects in the training set, and performing pose estimation on each image to obtain at least one corresponding pose estimation graph.
The training set is the union of the target objects in all input human body images to be migrated and the objects contained in all source videos; any one of these objects is a training object. This embodiment performs pose estimation on the image of each training object to obtain at least one corresponding pose estimation graph.
Step S52: motion data for each pose estimation map is collected based on at least one corresponding pose estimation map.
According to the embodiment, the corresponding gesture of each training object is obtained according to at least one corresponding gesture estimation graph, and the motion data of the training object is collected from the gesture, wherein the motion data comprises the position information of at least one human body key point of the training object.
The position information of the human body key points locates the head and limbs of the training object in the static image, including but not limited to the top of the head, the facial features, the neck, and the main joints of the limbs. Optionally, in this embodiment the human body key points mainly comprise seven pose key points: the neck and the main joints of the limbs.
Step S53: averaging the motion data of all the pose estimation graphs to obtain the standard human body pose points.
Step S52 yields the motion data of multiple groups of training objects; performing weighted average processing on this motion data yields the standard human body pose points.
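The averaging in step S53 can be sketched in a few lines: stack the per-object key-point arrays and take a (possibly weighted) mean over the object axis. The (7, 2) shape assumes the seven key points discussed earlier.

```python
import numpy as np

def standard_pose_points(all_motion_data, weights=None):
    """Average the key-point arrays of all training objects (each (7, 2))
    into the standard human body pose points; `weights` enables the weighted
    average mentioned in step S53 (uniform when omitted)."""
    stack = np.stack(all_motion_data)            # (num_objects, 7, 2)
    return np.average(stack, axis=0, weights=weights)

# Two toy training objects whose poses differ only by a shift.
a = np.zeros((7, 2))
b = np.ones((7, 2)) * 4.0
standard = standard_pose_points([a, b])
```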
In this embodiment, pose estimation is performed on the input human body image to be migrated and on the source video, human body key points are extracted, and affine transformation is applied, so that the target object in the input human body image to be migrated and the source object in the source video are both located at the center of their images. This reduces the learning load of the E_flow network, improves its learning effect, and makes the finally generated target video clearer and more stable.
The present application also provides a live broadcast device; refer to fig. 7, which is a schematic frame diagram of an embodiment of the live broadcast device of the present application. As shown in fig. 7, the live broadcast device 60 includes a host end 61, an audience end 62 and a server 63. The human body image to be migrated and the source video are input through the host end 61 or the audience end 62 and stored in the server 63. The server 63 performs human body posture migration on each frame of the source video containing the source object by the human body posture migration method of any of the above embodiments, so that the source object is replaced by the target object, and outputs the resulting target video to the host end 61 or the audience end 62 for the corresponding user to view.
For example, when the user is a host, the user inputs the human body image to be migrated and the source video through the host end 61, obtains the replaced target video via the server 63, and displays the video to the audience watching the broadcast, which can increase the audience's interest in the host.
When the user is a member of the audience, the user inputs the human body image to be migrated and the source video through the audience end 62 and obtains the replaced target video via the server 63, which can make the live broadcast more engaging and further improve user retention.
The application further provides a human body posture transfer device, refer to fig. 8, and fig. 8 is a schematic frame diagram of an embodiment of the human body posture transfer device of the application. As shown in fig. 8, the human body posture transfer apparatus 70 includes an acquisition module 71, a first calculation module 72, a judgment module 73, and a second calculation module 74.
The obtaining module 71 is configured to obtain the texture feature X' of the target object in the human body image to be migrated, the motion feature D' of the source object in the source video, a preset true value, and a human body posture conversion flow T based on the source video and the human body image to be migrated.
The first calculation module 72 is configured to obtain a vector feature X_T of the target object based on the texture feature X' and the human body posture conversion flow T.
The judging module 73 is configured to judge whether the deviation between the vector feature X_T and the preset true value is greater than a preset value. If yes, control returns to the obtaining module 71 to re-acquire the texture feature X' of the target object in the human body image to be migrated and the motion feature D' of the source object in the source video; if not, the vector feature X_T and the motion feature D' are transmitted to the second calculation module 74.
The second calculation module 74 is configured to obtain the target video based on the vector feature X_T and the motion feature D'.
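The interplay between the obtaining module 71 and the judging module 73 amounts to a re-acquisition loop: features are fetched again while the deviation from the preset true value exceeds the threshold. The Python sketch below illustrates only this control flow; the `acquire` and `transform` callbacks and the scalar deviation are hypothetical stand-ins for the network components, not the patent's implementation.

```python
def acquire_until_converged(acquire, transform, ground_truth,
                            threshold, max_iters=100):
    """Re-acquire features while the deviation between the converted vector
    feature and the preset true value exceeds `threshold`, mirroring the
    loop between modules 71 and 73. All callbacks are hypothetical."""
    for _ in range(max_iters):
        texture, motion = acquire()           # stand-in for module 71
        vector = transform(texture)           # stand-in for module 72
        if abs(vector - ground_truth) <= threshold:
            return vector, motion             # handed to module 74
    raise RuntimeError("deviation never fell below the threshold")

# Toy run: successive acquisitions get closer to the true value 5.0.
values = iter([(10.0, "m1"), (5.5, "m2"), (5.05, "m3")])
vector, motion = acquire_until_converged(
    acquire=lambda: next(values),
    transform=lambda t: t,                    # identity stand-in
    ground_truth=5.0,
    threshold=0.1)
```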
Further, the human body posture transfer device 70 further includes a third calculation module 75, an output judging module 76, and a pose estimation module 77.
The obtaining module 71 is further configured to obtain a first motion feature of the target object in the first human body image to be migrated and a second motion feature of the target object in the second human body image to be migrated.
The third calculation module 75 is configured to calculate a preset true value based on the first motion feature and the second motion feature, and transmit the preset true value to the determination module 73.
The output judging module 76 is configured to judge whether the confidence level of the target video is greater than a second preset value. If yes, outputting a target video; if not, the preset true value is updated based on the target video, and the acquisition module 71 is returned.
The pose estimation module 77 is configured to perform pose estimation on the human body image to be migrated, on the images of specific frames in the source video, and on the images of all training objects in the training set, obtaining the first motion data, the second motion data, and the standard human body pose points respectively. Based on the first motion data and the standard human body pose points, it obtains a new human body image to be migrated with the target object at the center of the image; based on the second motion data and the standard human body pose points, it obtains a new source video with the source object at the center of the image; and it transmits the new human body image to be migrated and the new source video to the obtaining module 71.
By providing the judging module 73 and the pose estimation module 77, the human body posture transfer device 70 can obtain a stable conversion-flow prediction result simply and efficiently, from reducing difficulty on the input side to combining results on the output side under supervised training, so that the finally generated target video is clearer and more stable.
The present application also provides an electronic device; refer to fig. 9, which is a schematic frame diagram of an embodiment of the electronic device of the present application. As shown in fig. 9, the electronic device 80 includes a memory 81 and a processor 82 coupled to each other, and the processor 82 is configured to execute program instructions stored in the memory 81 to implement the steps in any of the human body posture migration method embodiments described above. In a specific implementation scenario, the electronic device 80 may include, but is not limited to, a microcomputer and a server; the electronic device 80 may also be a mobile device such as a notebook computer or a tablet computer, which is not limited herein.
In particular, the processor 82 is configured to control itself and the memory 81 to implement the steps in any of the human body posture migration method embodiments described above. The processor 82 may also be referred to as a CPU (Central Processing Unit). The processor 82 may be an integrated circuit chip having signal processing capabilities. The processor 82 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or any conventional processor. In addition, the processor 82 may be implemented jointly by multiple integrated circuit chips.
The present application also provides a computer readable storage medium, please refer to fig. 10, fig. 10 is a schematic diagram of a frame of an embodiment of the computer readable storage medium of the present application. As shown in fig. 10, the computer-readable storage medium 90 stores program instructions 91 executable by the processor, the program instructions 91 for implementing the steps in any of the human gesture migration method embodiments described above.
In some embodiments, functions or modules included in an apparatus provided by the embodiments of the present disclosure may be used to perform a method described in the foregoing method embodiments, and specific implementations thereof may refer to descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
The foregoing description of various embodiments is intended to highlight differences between the various embodiments, which may be the same or similar to each other by reference, and is not repeated herein for the sake of brevity.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division of modules or units is merely a logical functional division, and in actual implementation there may be other division manners; for example, units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections via interfaces, devices or units, which may be electrical, mechanical, or in other forms.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The foregoing is only illustrative of the present application and is not to be construed as limiting the scope of the application, and all equivalent structures or equivalent flow modifications which may be made by the teachings of the present application and the accompanying drawings or which may be directly or indirectly employed in other related art are within the scope of the application.

Claims (8)

1. A human body posture transfer method, comprising:
acquiring texture characteristics of a target object in a human body image to be migrated and action characteristics of a source object in a source video; the texture features are obtained by extracting features of the human body image to be migrated;
extracting a human body posture conversion flow from the source video and the human body image to be migrated;
The human body posture conversion flow acts on the texture features to obtain the vector features of the converted target object;
judging whether an error value between the vector characteristic and a preset true value is larger than a first preset value or not;
if not, generating a target video based on the vector features and the action features;
the method further comprises the steps of:
Carrying out gesture estimation on the human body image to be migrated to obtain a first gesture estimation graph;
acquiring first motion data based on the first posture estimation graph, wherein the first motion data comprises position information of at least one human body key point of the target object, and the human body key point comprises seven posture key points such as a neck and main joint points of limbs;
obtaining a first transformation parameter based on the first motion data and the standard human body posture point;
Obtaining a new human body image to be migrated based on the first transformation parameters and the first posture estimation diagram; the target object in the new human body image to be migrated is positioned at the center of the new human body image to be migrated;
acquiring an image of a specific frame in the source video, and carrying out gesture estimation on the image of the specific frame to obtain a second gesture estimation graph; wherein the image of the particular frame comprises a still image of the source object making at least one action;
Acquiring second motion data based on the second posture estimation map, wherein the second motion data comprises position information of at least one human body key point of the source object;
Obtaining a second transformation parameter based on the second motion data and the standard human body posture point;
Obtaining a new image based on the second transformation parameters and the second posture estimation map; wherein the source object in the new image is located in a central position of the new image.
2. The method of claim 1, wherein the human body image to be migrated includes at least a first human body image to be migrated and a second human body image to be migrated, the motion of the target object in the first human body image to be migrated being different from the motion of the target object in the second human body image to be migrated, the method further comprising:
acquiring a first action characteristic of a target object in the first human body image to be migrated;
acquiring a second action characteristic of the target object in the second human body image to be migrated;
And calculating the preset true value based on the first action feature and the second action feature.
3. The method of claim 2, wherein after the step of generating a target video based on the vector features and the action features, the method further comprises:
judging whether the confidence coefficient of the target video is larger than a second preset value or not;
If yes, outputting the target video;
If not, updating the preset true value based on the target video, and returning to the step of acquiring the texture characteristics of the target object in the human body image to be migrated and the action characteristics of the source object in the source video.
4. The method according to claim 1, wherein the method further comprises:
acquiring images of all training objects in a training set, and carrying out gesture estimation on the images of all training objects to obtain at least one corresponding gesture estimation graph;
Collecting motion data of each attitude estimation graph based on the at least one corresponding attitude estimation graph; the motion data comprise position information of at least one human body key point of the training object corresponding to the gesture estimation graph;
And averaging the motion data of all the gesture estimation graphs to obtain the standard human gesture points.
5. A human body posture transfer apparatus, comprising:
The acquisition module is used for acquiring texture characteristics of a target object in a human body image to be migrated, action characteristics of a source object in a source video, a preset true value and extracting a human body posture conversion flow from the source video and the human body image to be migrated; the texture features are obtained by extracting features of the human body image to be migrated;
the first calculation module is used for acting the human body posture conversion flow on the texture characteristics to obtain the vector characteristics of the converted target object;
the judging module is used for judging whether the deviation value between the vector characteristic and the preset true value is larger than a preset value or not;
the second calculation module is used for obtaining a target video based on the vector features and the action features;
The acquisition module is further used for carrying out gesture estimation on the human body image to be migrated to obtain a first gesture estimation graph;
The acquisition module is further configured to acquire first motion data based on the first posture estimation graph, where the first motion data includes position information of at least one human body key point of the target object, and the human body key point includes seven posture key points including a neck and main joints of limbs;
The acquisition module is further used for acquiring a first transformation parameter based on the first motion data and the standard human body posture point;
The acquisition module is further used for acquiring a new human body image to be migrated based on the first transformation parameters and the first posture estimation diagram; the target object in the new human body image to be migrated is positioned at the center of the new human body image to be migrated;
the acquisition module is also used for acquiring an image of a specific frame in the source video, and carrying out gesture estimation on the image of the specific frame to obtain a second gesture estimation graph; wherein the image of the particular frame comprises a still image of the source object making at least one action;
The acquisition module is further used for acquiring second motion data based on the second gesture estimation graph, wherein the second motion data comprises position information of at least one human body key point of the source object;
the acquisition module is further used for acquiring second transformation parameters based on the second motion data and standard human body posture points;
the acquisition module is further used for acquiring a new image based on the second transformation parameters and the second posture estimation graph; wherein the source object in the new image is located in a central position of the new image.
6. A live broadcast system, characterized in that the live broadcast system comprises a main broadcasting end, a spectator end and a server, wherein the live broadcast system inputs a human body image to be migrated and a source video through the main broadcasting end or the spectator end, and the server obtains a target video according to the human body image to be migrated and the source video and through the human body gesture migration method as set forth in any one of claims 1-4.
7. An electronic device comprising a memory and a processor coupled to each other, the processor configured to execute program instructions stored in the memory to implement the human body pose migration method of any of claims 1-4.
8. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program, which when executed by a processor, implements the human body posture migration method of any one of claims 1-4.
CN202111547521.5A 2021-12-16 2021-12-16 Human body posture migration method, device and system, electronic equipment and storage medium Active CN114401446B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111547521.5A CN114401446B (en) 2021-12-16 2021-12-16 Human body posture migration method, device and system, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111547521.5A CN114401446B (en) 2021-12-16 2021-12-16 Human body posture migration method, device and system, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114401446A CN114401446A (en) 2022-04-26
CN114401446B true CN114401446B (en) 2024-09-24

Family

ID=81226277

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111547521.5A Active CN114401446B (en) 2021-12-16 2021-12-16 Human body posture migration method, device and system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114401446B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116659047A (en) * 2023-05-06 2023-08-29 西安建筑科技大学 Adjustment method of air supply parameters of air conditioner in office environment based on user behavior feature recognition
CN116957919B (en) * 2023-07-12 2024-07-16 珠海凌烟阁芯片科技有限公司 A 3D human body model generation method and system based on RGBD image
WO2025050368A1 (en) * 2023-09-08 2025-03-13 Huawei Technologies Co., Ltd. System and method for generative human motion style transfer

Citations (1)

Publication number Priority date Publication date Assignee Title
CN111027438A (en) * 2019-12-03 2020-04-17 Oppo广东移动通信有限公司 Human body posture migration method, mobile terminal and computer storage medium

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
JP5604249B2 (en) * 2010-09-29 2014-10-08 Kddi株式会社 Human body posture estimation device, human body posture estimation method, and computer program
CN111161200A (en) * 2019-12-22 2020-05-15 天津大学 Human Pose Transfer Method Based on Attention Mechanism
CN113762292B (en) * 2020-06-03 2024-02-02 杭州海康威视数字技术股份有限公司 A training data acquisition method and device and a model training method and device
CN112508776B (en) * 2020-12-11 2024-02-27 网易(杭州)网络有限公司 Motion transfer method, device and electronic device
CN113705295B (en) * 2021-03-10 2026-01-06 中国科学院计算技术研究所 Object pose transfer methods, apparatus, devices and storage media

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
CN111027438A (en) * 2019-12-03 2020-04-17 Oppo广东移动通信有限公司 Human body posture migration method, mobile terminal and computer storage medium

Non-Patent Citations (2)

Title
Research on Human Pose Synthesis of Character Images and Videos Based on Generative Adversarial Networks; Wang Hongyu; China Masters' Theses Full-text Database, No. 09, Chapters 3 and 4 *
Wang Hongyu. Research on Human Pose Synthesis of Character Images and Videos Based on Generative Adversarial Networks. China Masters' Theses Full-text Database. 2021, No. 09, Chapters 3 and 4. *

Also Published As

Publication number Publication date
CN114401446A (en) 2022-04-26

Similar Documents

Publication Publication Date Title
US20230290101A1 (en) Data processing method and apparatus, electronic device, and computer-readable storage medium
CN112085835B (en) Three-dimensional cartoon face generation method and device, electronic equipment and storage medium
CN114401446B (en) Human body posture migration method, device and system, electronic equipment and storage medium
US11417095B2 (en) Image recognition method and apparatus, electronic device, and readable storage medium using an update on body extraction parameter and alignment parameter
WO2021098616A1 (en) Motion posture recognition method, motion posture recognition apparatus, terminal device and medium
CN113688907B (en) Model training, video processing method, device, device and storage medium
US12493976B2 (en) Method for training depth estimation model, training apparatus, and electronic device applying the method
CN109685873B (en) Face reconstruction method, device, equipment and storage medium
CN112200041A (en) Video motion recognition method and device, storage medium and electronic equipment
CN116934848B (en) Data processing method, device, equipment and medium
CN112149602A (en) Action counting method and device, electronic equipment and storage medium
JP5837860B2 (en) Motion similarity calculation device, motion similarity calculation method, and computer program
US20250391158A1 (en) Generation method, non-transitory computer-readable recording medium, and information processing device
CN112258647A (en) Map reconstruction method and device, computer readable medium and electronic device
CN114332483B (en) Object key point detection method and device, training method and device and computing equipment
CN109978928A (en) A kind of binocular vision solid matching method and its system based on Nearest Neighbor with Weighted Voting
CN117853839B (en) Model training method and device
CN117876808A (en) Model training method and device
CN113724176A (en) Multi-camera motion capture seamless connection method, device, terminal and medium
CN108491081B (en) Data processing method and device based on neural network
CN114693845B (en) Stylized object driving method and device, medium and electronic device
CN111260692A (en) Face tracking method, device, device and storage medium
US20250308184A1 (en) Three dimensional aware video compositing
CN116110130B (en) Methods, devices and equipment for assessing human posture in badminton
US20250209732A1 (en) Three-dimentional scene reconstruction method and apparatus, device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant