
CN112215050A - Nonlinear 3DMM face reconstruction and pose normalization method, device, medium and equipment - Google Patents


Info

Publication number
CN112215050A
CN112215050A
Authority
CN
China
Prior art keywords
face
texture
shape
nonlinear
3dmm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910820065.3A
Other languages
Chinese (zh)
Inventor
周军
刘利朋
江武明
丁松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Eyes Intelligent Technology Co ltd
Beijing Eyecool Technology Co Ltd
Original Assignee
Beijing Eyes Intelligent Technology Co ltd
Beijing Eyecool Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Eyes Intelligent Technology Co ltd and Beijing Eyecool Technology Co Ltd
Publication of CN112215050A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/005General purpose rendering architectures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/64Three-dimensional objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Graphics (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract



The invention discloses a nonlinear 3DMM face reconstruction and pose normalization method, device, medium and equipment, belonging to the field of computer vision. The method includes: training a model, inputting a 2D face image into the model, and obtaining a 3D face. The model includes a CNN encoder, a shape decoder, a texture decoder and a rendering layer. The CNN encoder estimates camera projection parameters, shape parameters and texture parameters from the 2D face image samples, and the shape decoder and texture decoder decode the shape parameters and texture parameters into a 3D shape and a 3D texture, respectively. During training, the rendering layer produces a rendered image, and the model is trained through a loss function. During prediction, the rendering layer performs 3D rendering to obtain a 3D face. The invention has higher representation ability than linear 3DMM, training and prediction are carried out end to end, 2D images can be used for network training without 3D face scans, and the reconstructed 3D face achieves high recognition accuracy after normalization.


Description

Nonlinear 3DMM face reconstruction and pose normalization method, device, medium and equipment
Technical Field
The invention relates to the field of computer vision, and in particular to a nonlinear 3DMM face reconstruction method, device, computer-readable storage medium and device, and to a face pose normalization method, device, computer-readable storage medium and device based on the nonlinear 3DMM face reconstruction method.
Background
In face image recognition technology, the pose of a face is an important factor affecting the recognition rate. Prior-art face image recognition mainly handles frontal or small-pose (small-angle) face images, while recognition results on large-pose face images are not ideal. To improve recognition accuracy, pose normalization of face images (especially large-pose face images) is required.
The frontal face image, the small-pose face image and the large-pose face image are all 2D face images. The 3D-reconstruction-based face pose normalization method performs 3D reconstruction on the 2D face image to obtain a 3D face, corrects (normalizes) the pose of the 3D face, and then projects the 3D face back into a 2D face image to complete the pose normalization.
The core of the 3D-reconstruction-based face pose normalization method is the 3D reconstruction of the 2D face image to be normalized. 3DMM-type methods are the most widely applied in 3D reconstruction, and they mainly fall into the following categories.
(1) Linear 3DMM parameter estimation method
The 3D Morphable Model (3DMM) is a method for constructing a 3D face model based on statistical principles. The rough idea of the linear 3DMM parameter estimation method is to construct an average face deformation model (specifically, an eigenface representation: an average face plus a group of eigenvectors with corresponding coefficients; note that the coefficients are not eigenvalues and must ultimately be solved for) from a face database. Given a new 2D face image, the image is matched against the average face deformation model, the model's parameters are adjusted and the model is deformed until the difference between the model and the face image is minimized, at which point the texture is optimized to complete the 3D face modeling.
After the 3D modeling of the face is completed through the above steps, the pose of the 3D face is rotated by a 3D face rotation method, and finally the corrected 3D face is projected onto the two-dimensional image plane to complete the normalization of the 2D face pose.
Linear 3DMM face reconstruction obtains a training set from face scans and performs Principal Component Analysis (PCA) on that training set to construct the 3DMM. PCA is a statistical method: a group of possibly correlated variables is converted through an orthogonal transformation into a group of linearly uncorrelated variables, called principal components. To model highly variable 3D face shapes, a large number of high-quality 3D face scans is required, and this requirement is expensive to satisfy. The widely used Basel Face Model (BFM) was constructed from only 200 subjects in neutral expression, with the missing expression data compensated by the FaceWarehouse expression dataset. Almost all such models use fewer than 300 training scans; such a small training set is far from sufficient to describe the full range of facial variation.
Second, the texture model of a linear 3DMM is typically constructed from 2D face images captured together with a small number of 3D scans under well-controlled conditions. Such models can therefore only learn facial textures under similar conditions and do not perform well under other conditions (e.g., in-the-wild environments). This greatly limits the application scenarios of the 3DMM.
Finally, the representation capability of the linear 3DMM is limited not only by the size of the training set but also by its formulation. Facial variation is nonlinear in nature: for example, the variations across different facial expressions or poses are inherently nonlinear, which violates the linear assumption of PCA-based models. The linear 3DMM model therefore cannot account well for facial variation.
(2) Improved linear 3DMM based method
The standard linear 3DMM is based on PCA, and its statistical distribution is a unimodal Gaussian. Koppen et al. argued that a unimodal Gaussian does not represent real-world distributions well, and proposed a Gaussian-mixture 3DMM that models the global population as a mixture of Gaussian subpopulations, each with its own mean but sharing a covariance. Modeling the 3D face in this way significantly improves modeling precision, but the method is still based on statistical PCA. Duong et al. addressed the linearity problem in face modeling using a deep Boltzmann machine; however, they used only 2D faces and sparse ground truth, and therefore do not handle faces with large pose variations well.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a nonlinear 3DMM face reconstruction method, device, computer-readable storage medium and device, and a face pose normalization method, device, computer-readable storage medium and device based on the nonlinear 3DMM face reconstruction method. The nonlinear 3DMM face reconstruction method has higher representation capability than the traditional linear 3DMM, training and prediction are carried out end to end, and network training can be performed with unconstrained 2D images without collecting 3D face scans. The reconstructed 3D face achieves high recognition accuracy after normalization.
The technical scheme provided by the invention is as follows:
in a first aspect, the present invention provides a nonlinear 3DMM face reconstruction method, including:
training a nonlinear 3DMM model by using a training set;
the training set comprises a plurality of 2D face image samples, and the nonlinear 3DMM model comprises a CNN encoder, a multi-layer perceptron shape decoder, a CNN texture decoder and a rendering layer;
during training, the 2D face image samples input into the nonlinear 3DMM model are processed by the CNN encoder to estimate camera projection parameters, shape parameters and texture parameters; the multi-layer perceptron shape decoder decodes the shape parameters into a 3D shape, the CNN texture decoder decodes the texture parameters into a 3D texture, and the rendering layer obtains a rendered image from the camera projection parameters, the 3D shape and the 3D texture; the parameters of the CNN encoder, the multi-layer perceptron shape decoder and the CNN texture decoder are trained through a loss function;
inputting the acquired 2D face image into the trained nonlinear 3DMM model to obtain a 3D face;
the 2D face image is processed by the CNN encoder to estimate shape parameters and texture parameters, the multi-layer perceptron shape decoder decodes the shape parameters into a 3D shape, the CNN texture decoder decodes the texture parameters into a 3D texture, and the rendering layer performs 3D rendering according to the 3D shape, the 3D texture and a predefined 2D texture map to obtain the 3D face.
Further, the training of the nonlinear 3DMM model includes a pre-training stage and a fine-tuning stage, which are performed in sequence, wherein:
the loss function L0 of the pre-training phase is:

L0 = λ1·L1 + L2 + λ3·L3 + λ4·L4

where L1 is the keypoint loss, L2 is the 3D shape loss, L3 is the 3D texture loss, and L4 is the projection parameter loss;
the loss function L of the fine-tuning phase is:

L = L6 + λ5·L5 + λ1·L1

where L5 is the adversarial loss, in which the generator is the nonlinear 3DMM model and the discriminator is a PatchGAN discriminator, and L6 is the reconstruction loss:

L6 = (1 / (H·W)) · Σ_{i,j} |X(i,j) − Y(i,j)|

where X(i,j) is the value of the rendered image at coordinate (i,j), Y(i,j) is the value of the 2D face image at coordinate (i,j), and H and W are the height and width of the 2D face image, respectively;
λ1~λ5 are predefined coefficients.
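The two-stage loss computation above can be sketched numerically. This is a minimal illustration, not the patent's implementation: the component losses L1–L5 are treated as precomputed scalars, and L6 is read here as the mean absolute pixel difference between the rendered image X and the input image Y (one plausible reading of the formula).

```python
# Sketch of the two training losses (assumption: L6 is mean absolute error).

def reconstruction_loss(X, Y):
    """L6: mean absolute difference over an H x W image given as nested lists."""
    H, W = len(X), len(X[0])
    return sum(abs(X[i][j] - Y[i][j]) for i in range(H) for j in range(W)) / (H * W)

def pretrain_loss(L1, L2, L3, L4, lam1, lam3, lam4):
    """Pre-training phase: L0 = lam1*L1 + L2 + lam3*L3 + lam4*L4."""
    return lam1 * L1 + L2 + lam3 * L3 + lam4 * L4

def finetune_loss(L6, L5, L1, lam5, lam1):
    """Fine-tuning phase: L = L6 + lam5*L5 + lam1*L1."""
    return L6 + lam5 * L5 + lam1 * L1

X = [[0.5, 0.0], [1.0, 1.0]]   # toy 2x2 "rendered image"
Y = [[0.0, 0.0], [1.0, 0.5]]   # toy 2x2 "input image"
L6 = reconstruction_loss(X, Y)              # (0.5 + 0 + 0 + 0.5) / 4 = 0.25
L0 = pretrain_loss(0.1, 0.2, 0.3, 0.4, lam1=1.0, lam3=0.5, lam4=0.5)
L  = finetune_loss(L6, 0.5, 0.1, lam5=0.2, lam1=1.0)
```

The coefficients λ here are arbitrary illustration values; the patent only states that they are predefined.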
Further, the CNN encoder includes 14 convolutional layers, an AvgPool layer and a fully connected layer connected in sequence, and outputs the shape parameters and texture parameters at the AvgPool layer; the CNN texture decoder includes a fully connected layer and 14 convolutional layers connected in sequence, and outputs the 3D texture at the last convolutional layer; the multi-layer perceptron shape decoder includes two fully connected layers.
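The stated layer layout can be written down as a simple bookkeeping sketch. Only the layer counts come from the text; kernel sizes and channel widths are not given in the source, so none are assumed and no deep-learning framework is used.

```python
# Bookkeeping sketch of the stated network layout (counts only, hypothetical names).

cnn_encoder = [f"conv{i + 1}" for i in range(14)] + ["avgpool", "fc"]
# shape and texture parameters are read out at the AvgPool layer;
# the remaining outputs (e.g., camera projection parameters) at the FC layer.

cnn_texture_decoder = ["fc"] + [f"conv{i + 1}" for i in range(14)]
# the 3D texture (a 2D UV texture map) is emitted by the last conv layer.

mlp_shape_decoder = ["fc1", "fc2"]  # two fully connected layers
```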
Further, the rendering layer performs 3D rendering according to the 3D shape, the 3D texture, and a predefined 2D texture map to obtain a 3D face, including:
predefining a 2D texture map, each pixel point of the 2D texture map corresponding to a vertex of a 3D shape;
and determining the texture value of each vertex in the 3D shape according to the texture value of each pixel point in the 2D texture map.
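A minimal sketch of this lookup, with a made-up 3×3 UV map and vertex-to-pixel table (in the actual method the UV parameterization is predefined, and each pixel of the 2D texture map corresponds to a vertex of the 3D shape):

```python
# Assign each 3D vertex the texture value of its corresponding UV-map pixel.

uv_map = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]  # a tiny 3x3 "texture map" of scalar values

uv_of_vertex = {0: (0, 0), 1: (1, 2), 2: (2, 1)}  # vertex -> (row, col), made up

def vertex_textures(uv_map, uv_of_vertex):
    """Return the texture value of each vertex via its UV coordinate."""
    return {v: uv_map[r][c] for v, (r, c) in uv_of_vertex.items()}

tex = vertex_textures(uv_map, uv_of_vertex)  # {0: 10, 1: 60, 2: 80}
```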
In a second aspect, the present invention provides a nonlinear 3DMM face reconstruction apparatus corresponding to the nonlinear 3DMM face reconstruction method described in the first aspect, where the apparatus includes:
a training module, used for training the nonlinear 3DMM model with a training set;
the training set comprises a plurality of 2D face image samples, and the nonlinear 3DMM model comprises a CNN encoder, a multi-layer perceptron shape decoder, a CNN texture decoder and a rendering layer;
during training, the 2D face image samples input into the nonlinear 3DMM model are processed by the CNN encoder to estimate camera projection parameters, shape parameters and texture parameters; the multi-layer perceptron shape decoder decodes the shape parameters into a 3D shape, the CNN texture decoder decodes the texture parameters into a 3D texture, and the rendering layer obtains a rendered image from the camera projection parameters, the 3D shape and the 3D texture; the parameters of the CNN encoder, the multi-layer perceptron shape decoder and the CNN texture decoder are trained through a loss function;
a prediction module, used for inputting the acquired 2D face image into the trained nonlinear 3DMM model to obtain a 3D face;
the 2D face image is processed by the CNN encoder to estimate shape parameters and texture parameters, the multi-layer perceptron shape decoder decodes the shape parameters into a 3D shape, the CNN texture decoder decodes the texture parameters into a 3D texture, and the rendering layer performs 3D rendering according to the 3D shape, the 3D texture and a predefined 2D texture map to obtain the 3D face.
Further, the training module includes a pre-training unit and a fine-tuning unit, wherein:
the loss function L0 of the pre-training unit is:

L0 = λ1·L1 + L2 + λ3·L3 + λ4·L4

where L1 is the keypoint loss, L2 is the 3D shape loss, L3 is the 3D texture loss, and L4 is the projection parameter loss;
the loss function L of the fine-tuning unit is:

L = L6 + λ5·L5 + λ1·L1

where L5 is the adversarial loss, in which the generator is the nonlinear 3DMM model and the discriminator is a PatchGAN discriminator, and L6 is the reconstruction loss:

L6 = (1 / (H·W)) · Σ_{i,j} |X(i,j) − Y(i,j)|

where X(i,j) is the value of the rendered image at coordinate (i,j), Y(i,j) is the value of the 2D face image at coordinate (i,j), and H and W are the height and width of the 2D face image, respectively;
λ1~λ5 are predefined coefficients.
Further, the CNN encoder includes 14 convolutional layers, an AvgPool layer and a fully connected layer connected in sequence, and outputs the shape parameters and texture parameters at the AvgPool layer; the CNN texture decoder includes a fully connected layer and 14 convolutional layers connected in sequence, and outputs the 3D texture at the last convolutional layer; the multi-layer perceptron shape decoder includes two fully connected layers.
Further, in the prediction module, the rendering layer performs 3D rendering according to the 3D shape, the 3D texture, and a predefined 2D texture map to obtain a 3D face, including:
a pre-defining unit for pre-defining a 2D texture map, each pixel point of the 2D texture map corresponding to a vertex of a 3D shape;
and the rendering unit is used for determining the texture value of each vertex in the 3D shape through the texture value of each pixel point in the 2D texture map.
In a third aspect, the present invention provides a computer-readable storage medium for nonlinear 3DMM face reconstruction corresponding to the nonlinear 3DMM face reconstruction method described in the first aspect, comprising a memory for storing processor-executable instructions, which when executed by the processor, implement the steps comprising the nonlinear 3DMM face reconstruction method described in the first aspect.
In a fourth aspect, the present invention provides an apparatus for nonlinear 3DMM face reconstruction corresponding to the nonlinear 3DMM face reconstruction method described in the first aspect, which includes at least one processor and a memory storing computer-executable instructions, and when the processor executes the instructions, the steps of the nonlinear 3DMM face reconstruction method described in the first aspect are implemented.
In a fifth aspect, the present invention provides a method for normalizing a face pose based on nonlinear 3DMM reconstruction, the method comprising:
3D reconstruction is carried out on the 2D face image by using the nonlinear 3DMM face reconstruction method in the first aspect to obtain a 3D face;
carrying out pose normalization on the 3D face;
and projecting the pose-normalized 3D face onto a two-dimensional plane to obtain a pose-normalized 2D face image.
Further, the pose normalization of the 3D face includes:
predefining a standard 3D-pose face, wherein the standard 3D-pose face and the 3D face have the same number of point-cloud points;
storing the standard 3D-pose face and the 3D face as matrices and performing parameter fitting to obtain a transformation matrix;
multiplying the transformation matrix with the 3D face matrix to complete the pose normalization of the 3D face;
the projecting of the pose-normalized 3D face onto a two-dimensional plane includes:
dividing the 3D face into 3D meshes according to the vertices of the 3D face, and coloring the 3D meshes through bilinear interpolation;
and rendering with a Z-buffer renderer, projecting the 3D face onto the two-dimensional plane.
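The matrix-product step above can be sketched as follows. This is a hedged illustration: the parameter fitting that produces the transformation matrix is not detailed in the source, so a known yaw rotation stands in for the fitted matrix.

```python
import math

def matvec(M, p):
    """Multiply a 3x3 matrix (nested lists) with a 3-vector."""
    return [sum(M[r][k] * p[k] for k in range(3)) for r in range(3)]

def yaw_matrix(deg):
    """Rotation about the vertical (y) axis by `deg` degrees."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def normalize_pose(points, deg):
    """Apply the (here: assumed) transformation matrix to the 3D point cloud."""
    R = yaw_matrix(deg)
    return [matvec(R, p) for p in points]

# Toy "3D face" point cloud rotated by a 30-degree yaw:
face = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
frontal = normalize_pose(face, 30.0)
```

Points on the rotation axis (the second point) are unchanged, which is a quick sanity check on any fitted rigid transform.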
In a sixth aspect, the present invention provides a face pose normalization apparatus based on nonlinear 3DMM reconstruction corresponding to the face pose normalization method based on nonlinear 3DMM reconstruction in the fifth aspect, the apparatus includes:
a 3D reconstruction module, configured to perform 3D reconstruction on the 2D face image by using the nonlinear 3DMM face reconstruction apparatus according to the second aspect, so as to obtain a 3D face;
the 3D face normalization module is used for carrying out pose normalization on the 3D face;
and the projection module is used for projecting the pose-normalized 3D face onto a two-dimensional plane to obtain a pose-normalized 2D face image.
Further, the 3D face normalization module includes:
a predefining unit, used for predefining a standard 3D-pose face, wherein the standard 3D-pose face and the 3D face have the same number of point-cloud points;
a parameter fitting unit, used for storing the standard 3D-pose face and the 3D face as matrices and performing parameter fitting to obtain a transformation matrix;
a normalization unit, used for multiplying the transformation matrix with the 3D face matrix to complete the pose normalization of the 3D face;
the projection module includes:
a coloring unit, used for dividing the 3D face into 3D meshes according to the vertices of the 3D face and coloring the 3D meshes through bilinear interpolation;
and a rendering unit, used for rendering with a Z-buffer renderer and projecting the 3D face onto the two-dimensional plane.
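The bilinear-interpolation coloring mentioned above can be illustrated with the standard bilinear formula on a unit cell; the source does not specify the exact scheme, so this is a generic helper rather than the invention's implementation.

```python
# Standard bilinear interpolation of four corner values on a unit cell.

def bilerp(v00, v10, v01, v11, u, v):
    """Interpolate corner values with local coordinates u, v in [0, 1]."""
    return (v00 * (1 - u) * (1 - v) + v10 * u * (1 - v)
            + v01 * (1 - u) * v + v11 * u * v)

center = bilerp(0.0, 1.0, 1.0, 2.0, 0.5, 0.5)  # average of the corners: 1.0
```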
In a seventh aspect, the present invention provides a computer-readable storage medium for face pose normalization corresponding to the nonlinear 3DMM reconstruction based face pose normalization method of the fifth aspect, comprising a memory for storing processor-executable instructions, which when executed by the processor, implement the steps of the nonlinear 3DMM reconstruction based face pose normalization method of the fifth aspect.
In an eighth aspect, the present invention provides an apparatus for face pose normalization corresponding to the nonlinear 3DMM reconstruction based face pose normalization method of the fifth aspect, comprising at least one processor and a memory storing computer-executable instructions, wherein the processor implements the steps of the nonlinear 3DMM reconstruction based face pose normalization method of the fifth aspect when executing the instructions.
The invention has the following beneficial effects:
in view of the obstacles of the linear 3DMM in the prior art regarding its data, supervision and linear basis, the invention learns a nonlinear 3DMM model of facial shape and texture from a set of unconstrained 2D face images for 3D reconstruction by innovating the learning paradigm of the 3DMM, which has the following outstanding advantages:
1. The invention learns a nonlinear 3DMM model that has higher representation capability than a conventional linear 3DMM model.
2. The encoding-decoding network structure and the rendering layer provided by the invention enable the training and prediction of the 3D reconstruction task to be carried out end to end.
3. The invention can perform network training with unconstrained 2D images without collecting 3D face scans.
Drawings
FIG. 1 is a flow chart of a nonlinear 3DMM face reconstruction method of the present invention;
FIG. 2 is a flow chart of a nonlinear 3DMM model training phase;
FIG. 3 is a flow chart of a prediction phase of a nonlinear 3DMM model;
FIG. 4 is a parameter diagram of a non-linear 3DMM model of the present invention;
FIG. 5 is a schematic diagram of a nonlinear 3DMM face reconstruction apparatus according to the present invention;
FIG. 6 is a flowchart of a method for normalizing a face pose based on nonlinear 3DMM reconstruction according to the present invention;
FIG. 7 is a flow chart of a nonlinear 3DMM model prediction stage and a subsequent face pose normalization method;
fig. 8 is a schematic diagram of a human face pose normalization device based on nonlinear 3DMM reconstruction according to the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
Example 1:
the embodiment of the invention provides a nonlinear 3DMM face reconstruction method, as shown in FIG. 1, the method comprises the following steps:
step S100: the non-linear 3DMM model is trained using a training set.
The training set comprises a plurality of 2D face image samples, and the nonlinear 3DMM model comprises a CNN encoder, a multi-layer perceptron shape decoder, a CNN texture decoder and a rendering layer.
The nonlinear 3DMM model comprises an encoder and two decoders. The encoder is a Convolutional Neural Network (CNN), a deep learning method. Of the two decoders, one is a shape decoder and the other is a texture decoder: the shape decoder is a Multi-Layer Perceptron (MLP), a relatively simple artificial neural network that maps input vectors to output vectors, and the texture decoder is a deep Convolutional Neural Network (CNN).
The nonlinear 3DMM face reconstruction method of the invention is divided into a training stage and a prediction stage on the whole, wherein the step S100 is the training stage.
During training, a 2D face image sample input into the nonlinear 3DMM model is processed by the CNN encoder to estimate camera projection parameters, shape parameters and texture parameters; the multi-layer perceptron shape decoder decodes the shape parameters into a 3D shape, the CNN texture decoder decodes the texture parameters into a 3D texture, and the rendering layer obtains a rendered image from the camera projection parameters, the 3D shape and the 3D texture. The parameters of the CNN encoder, the multi-layer perceptron shape decoder and the CNN texture decoder are trained through a loss function.
FIG. 2 is a flow chart of the nonlinear 3DMM model training phase. During training, the errors between the outputs of the CNN encoder, the multi-layer perceptron shape decoder, the CNN texture decoder and the rendering layer and the input 2D face image sample (including the label information of the 2D face image sample) are calculated, and the loss function comprises one or more of these errors.
Step S200: and inputting the acquired 2D face image into the trained nonlinear 3DMM model to obtain the 3D face.
After the nonlinear 3DMM model is trained, 3D reconstruction can be performed on an acquired 2D face image to predict a 3D face; step S200 is the prediction stage. The 2D face image is processed by the CNN encoder to estimate shape parameters and texture parameters, the multi-layer perceptron shape decoder decodes the shape parameters into a 3D shape, the CNN texture decoder decodes the texture parameters into a 3D texture, and the rendering layer performs 3D rendering according to the 3D shape, the 3D texture and a predefined 2D texture map to obtain the 3D face. FIG. 3 is a flow chart of the prediction phase of the nonlinear 3DMM model.
The nonlinear 3DMM model of the invention first takes a 2D face image as input to the CNN encoder, which generates shape parameters, texture parameters and camera projection parameters, and estimates a 3D shape and a 3D texture from the shape and texture parameters through the two decoders. The rendering layer then generates a reconstructed face (i.e., the rendered image described above) by fusing the 3D shape, the 3D texture and the camera projection parameters.
The invention designs different decoding networks for the shape and texture parameters: a multi-layer perceptron (MLP) for the shape parameters and a deep convolutional neural network (CNN) for the texture parameters. The CNN decodes the texture parameters into a nonlinear 3D texture, and the MLP decodes the shape parameters into a nonlinear 3D shape; these two decoders are what make the 3DMM nonlinear. Once the 3DMM is fully learned by the fitting algorithm, the 3D shape and 3D texture can faithfully reconstruct the input face.
The 3DMM model of the invention is an encoding-decoding framework in which the shape parameters, texture parameters and camera projection parameters are estimated by the CNN encoder network, so the framework can be trained end to end; this is, in essence, the fitting algorithm of the 3DMM of the invention. In the end-to-end training scheme, the encoder and the two decoders are jointly learned to minimize the difference between the reconstructed face and the input face. By jointly learning the 3DMM model's encoding and decoding, network training can be performed with a large number of unconstrained 2D images without relying on 3D scans.
In summary, in view of the obstacles of the linear 3DMM in the prior art regarding its data, supervision and linear basis, the invention learns a nonlinear 3DMM model of face shape and texture from a set of unconstrained 2D face images for 3D reconstruction by innovating the learning paradigm of the 3DMM, which has the following outstanding advantages:
1. The invention learns a nonlinear 3DMM model that has higher representation capability than a conventional linear 3DMM model.
2. The encoding-decoding network structure and the rendering layer provided by the invention enable the training and prediction of the 3D reconstruction task to be carried out end to end.
3. The invention can perform network training with unconstrained 2D images without collecting 3D face scans.
In the invention, in order to enable the proposed encoding-decoding network to be trained end to end, the 3D shape and texture are represented in a 2D manner. For the 3D shape representation, the same representation as in the linear 3DMM is used, i.e.

S = [x1, y1, z1, ..., xQ, yQ, zQ]^T

where Q is the number of vertices of the 3D shape: the vertices are concatenated into a vector that serves as the 3D shape representation and as the output of the fully connected layer in the shape decoding network. For the 3D texture representation, the method employs a UV-parameterized texture. In the prior art, a one-dimensional vector is generally adopted for the 3D texture representation; since the per-vertex textures are flattened into a vector, this loses the spatial information of the vertices and makes it inconvenient to deploy a CNN-based network. The invention instead adopts an unwrapped 2D UV texture map as the 3D texture representation, which completely avoids the problems of the one-dimensional vector representation and has proven effective in the training and prediction of the encoding-decoding network.
As an improvement of the embodiment of the present invention, the whole training of the nonlinear 3DMM model may include a pre-training stage and a fine-tuning stage, performed sequentially, which are described below according to the loss functions used in the two stages.
Pre-training of the nonlinear 3DMM model is performed on the 300W dataset, which provides, for each 2D face image, a well-fitted 3DMM shape Ŝ (i.e., the aforementioned 3D shape) and camera projection parameters m̂. Since the input 2D image is known, a pseudo-texture map T̂ can be obtained by rendering the image into UV space with these two parameters. In addition, the dataset also provides the key points LL corresponding to the 2D face image. The preliminary pre-training of the nonlinear 3DMM model is supervised by Ŝ, m̂, T̂ and LL; that is, the label information of the 2D face image samples is the information provided by the training dataset. Specifically, the loss function L0 of the pre-training stage is:
L0 = λ1L1 + λ2L2 + λ3L3 + λ4L4
where L1 is the keypoint loss, L2 the 3D shape loss, L3 the 3D texture loss, and L4 the projection parameter loss, as shown in Fig. 2.
The keypoint loss L1 is the error between the key points LL provided by the dataset and the key points L predicted by the model; the 3D shape loss L2 is the error between the 3DMM shape Ŝ provided by the dataset and the 3D shape S predicted by the model; the 3D texture loss L3 is the error between the pseudo-texture map T̂ computed from the dataset and the 3D texture T predicted by the model; and the projection parameter loss L4 is the error between the camera projection parameters m̂ provided by the dataset and the camera projection parameters predicted by the model.
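The pre-training objective L0 = λ1·L1 + λ2·L2 + λ3·L3 + λ4·L4 can be sketched as follows, taking each term as a mean-squared error between a dataset-provided label and the model's prediction. The λ weights, array sizes and the use of plain MSE for every term are illustrative assumptions, not the patent's actual settings:

```python
import numpy as np

def mse(a, b):
    """Mean squared error between a prediction and a label."""
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))

def pretrain_loss(pred, label, lam=(1.0, 1.0, 1.0, 1.0)):
    """pred/label: dicts with keys 'keypoints', 'shape', 'texture', 'proj'.
    Returns L0 = lam1*L1 + lam2*L2 + lam3*L3 + lam4*L4."""
    L1 = mse(pred["keypoints"], label["keypoints"])  # keypoint loss
    L2 = mse(pred["shape"],     label["shape"])      # 3D shape loss
    L3 = mse(pred["texture"],   label["texture"])    # 3D texture loss
    L4 = mse(pred["proj"],      label["proj"])       # projection loss
    return lam[0] * L1 + lam[1] * L2 + lam[2] * L3 + lam[3] * L4

# A perfect prediction gives zero loss; any deviation is penalized.
label = {"keypoints": np.ones((68, 2)), "shape": np.ones(30),
         "texture": np.ones((8, 8, 3)), "proj": np.ones(8)}
assert pretrain_loss(label, label) == 0.0
```

In the actual network, each term would be backpropagated through the encoder and the two decoders jointly; here the dictionaries merely stand in for the model's outputs.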
After the nonlinear 3DMM model completes the preliminary pre-training, the pre-trained model is fine-tuned in a semi-supervised manner: fully unsupervised training is performed with an end-to-end 2D texture reconstruction loss and an adversarial network constraint, and the supervised keypoint loss is again introduced into this stage of training.
The loss function L at the fine tuning stage is:
L = L6 + λ5L5 + λ1L1
As shown by L5 in Fig. 2, L5 is the adversarial loss; the method applies an adversarial loss to the training of the network. In the adversarial loss, the generator is the nonlinear 3DMM model, and the discriminator is that of PatchGAN. The purpose of the adversarial loss is to bring the image generated by the generator closer to the true facial texture, thereby further improving the realism of the reconstructed texture.
L6 is the reconstruction loss, through which the parameters of the texture decoder are optimized. Specifically, the reconstruction loss of the present invention takes as its training target the mean squared error (MSE) between the input 2D face image and the generated rendered image. The reconstruction loss formula is:
L6 = (1/(H·W)) Σ_{i=1..H} Σ_{j=1..W} (X(i, j) − Y(i, j))²
where X(i, j) is the value of the rendered image at coordinate (i, j), Y(i, j) is the value of the 2D face image at coordinate (i, j), and H and W are respectively the height and width of the 2D face image, which are also the height and width of the generated rendered image.
The aforementioned λ1–λ5 are predefined weighting coefficients.
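The fine-tuning objective L = L6 + λ5·L5 + λ1·L1 can be sketched as below. The reconstruction loss L6 follows the MSE form described above; the adversarial loss L5 and keypoint loss L1 are passed in as already-computed scalars, since a full PatchGAN discriminator is beyond this illustration, and the λ defaults are placeholders:

```python
import numpy as np

def reconstruction_loss(rendered, target):
    """L6: mean squared error between the rendered image X and the
    input 2D face image Y, both of size H x W (x channels)."""
    return float(np.mean((np.asarray(rendered) - np.asarray(target)) ** 2))

def finetune_loss(rendered, target, L5, L1, lam5=1.0, lam1=1.0):
    """Total fine-tuning loss L = L6 + lam5*L5 + lam1*L1."""
    return reconstruction_loss(rendered, target) + lam5 * L5 + lam1 * L1
```

For example, a rendered image of all ones against an all-zero target yields L6 = 1, and with zero adversarial and keypoint terms the total loss equals that reconstruction error.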
As another improvement of the embodiment of the present invention, the CNN encoder of the present invention includes 14 convolutional layers, an AvgPool layer, and a fully-connected layer connected in sequence; the CNN texture decoder includes a fully-connected layer and 14 convolutional layers connected in sequence; and the multi-layer perceptual shape decoder includes two fully-connected layers.
The parameters of the nonlinear 3DMM model of the present invention are shown in Fig. 4, where E denotes the CNN encoder, DT denotes the CNN texture decoder, and DS (the multi-layer perceptual shape decoder) consists of two fully-connected layers; the entire model is trained end to end to perform 3D reconstruction of the input 2D face image.
Taking the first convolutional layer Conv11 of the CNN encoder as an example, this layer performs on the input 2D face image a convolution operation whose convolution kernel (Filter; a filter is the general name for the kernels of various operations, and here the filter is the convolution kernel of the convolution operation) is 3 × 3 with a stride (Stride) of 1, and the size (Output Size) of the obtained output image is 96 × 32; the remaining layers follow by analogy. The CNN encoder E outputs the shape parameter lS and the texture parameter lT at the AvgPool layer.
Taking the first convolutional layer FConv52 of the CNN texture decoder as an example, this layer performs on the output of the preceding fully-connected layer a convolution operation with a 3 × 3 kernel and a stride of 1, and the size of the obtained output image is 8 × 160; the remaining layers follow by analogy. The CNN texture decoder DT outputs the nonlinear 3D texture at the last convolutional layer FConv11, and the multi-layer perceptual shape decoder DS outputs the nonlinear 3D shape. The 3D face processed by the deep neural network has highly nonlinear characteristics and is more discriminative than the linear 3DMM.
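The spatial bookkeeping of such a convolutional stack can be checked with the standard output-size formula; this is a generic sketch (the exact layer table of Fig. 4 is not reproduced, and the padding value is an assumption):

```python
# Standard convolution output-size arithmetic, useful for verifying a
# layer table such as the one in Fig. 4. With a 3x3 kernel, stride 1
# and "same" padding of 1, the spatial size is preserved; a stride of
# 2 halves it, which is how encoder stacks typically shrink the map.

def conv_out(size, kernel=3, stride=1, pad=1):
    """Output spatial size of a conv layer along one dimension."""
    return (size + 2 * pad - kernel) // stride + 1

assert conv_out(96, kernel=3, stride=1, pad=1) == 96  # size-preserving layer
assert conv_out(96, kernel=3, stride=2, pad=1) == 48  # down-sampling layer
```

Applying the stride-2 case repeatedly shows how 14 such layers can reduce a face crop to the small map that the AvgPool layer then collapses into the shape and texture parameter vectors.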
In the present invention, the rendering layer in step S200 performs 3D rendering according to the 3D shape, the 3D texture, and the predefined 2D texture map to obtain a 3D face, which specifically includes:
step S210: the 2D texture map is predefined, and each pixel point of the 2D texture map corresponds to a vertex of the 3D shape.
Step S220: and determining the texture value of each vertex in the 3D shape according to the texture value of each pixel point in the 2D texture map.
After passing through the nonlinear 3DMM, the input 2D face image yields a face parameter model that can be represented in 3D and whose face pose is consistent with that of the input 2D face image. The face parameter model comprises shape parameters, texture parameters and projection parameters, and the method renders the texture information into 3D space through the 3D texture rendering layer. In the present invention, the 2D texture map is predefined; since each pixel point in the predefined 2D texture map has, and is associated with, a corresponding 3D vertex, pixel points at different positions in the 2D texture map represent different 3D vertices. The rendering process of the present invention therefore determines the texture value of each vertex in the 3D shape from the predefined 2D texture map.
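Steps S210 and S220 amount to a per-vertex lookup into the predefined 2D texture map, which can be sketched as follows (map size and pixel coordinates are illustrative assumptions):

```python
import numpy as np

# Sketch of steps S210-S220: each 3D vertex has a predefined pixel
# position in the 2D texture map, so assigning vertex textures is a
# per-vertex lookup into the map.

def vertex_textures(uv_map, uv_coords):
    """uv_map: (H, W, 3) texture image.
    uv_coords: (Q, 2) integer (column, row) pixel position per vertex.
    Returns the (Q, 3) RGB texture value of each vertex."""
    cols, rows = uv_coords[:, 0], uv_coords[:, 1]
    return uv_map[rows, cols]

uv_map = np.zeros((4, 4, 3))
uv_map[1, 2] = [0.2, 0.4, 0.6]   # texture stored at row 1, column 2
coords = np.array([[2, 1]])      # vertex 0 is predefined to map there
assert np.allclose(vertex_textures(uv_map, coords)[0], [0.2, 0.4, 0.6])
```

In practice the vertex-to-pixel correspondence is fixed once when the UV map is defined, so the same lookup serves every reconstructed face.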
In summary, the nonlinear 3DMM face reconstruction method of the present invention is divided into two stages, namely, a training stage and a prediction stage.
The training phase is shown in Fig. 2; the nonlinear 3DMM model is trained in a semi-supervised manner. The method uses two deep networks, a multi-layer perceptual shape decoder and a CNN texture decoder, to decode the shape and texture parameters into a 3D shape and a 3D texture respectively. To make the framework end-to-end trainable, the shape and texture parameters are estimated by a CNN encoder, which is essentially our 3DMM fitting algorithm. With the help of a geometry-based rendering layer, the three deep networks, namely the CNN encoder, the multi-layer perceptual shape decoder and the CNN texture decoder, are combined toward the final goal of reconstructing the input 2D face image. Formally, given a set of two-dimensional face images, the invention learns a CNN encoder E that estimates the projection parameters m, the shape parameters lS and the texture parameters lT; the multi-layer perceptual shape decoder decodes the shape parameters, mapping them to the 3D shape S, and the CNN texture decoder decodes the texture parameters into a realistic texture T. The goal is that the image rendered with m, S and T closely approximates the input 2D face image.
In the prediction stage, as shown in fig. 3, firstly, the input 2D face image passes through a trained coding-decoding network to obtain 3D shape and texture expression, and then passes through a rendering layer to obtain 3D face expression, that is, a 3D face.
The invention provides a nonlinear 3DMM model realized by a deep neural network, which has stronger 3D reconstruction expressive force than that of a linear 3DMM model. The two-stage network model training method based on semi-supervised learning enables the model to be optimized more easily. And the invention can utilize unconstrained 2D images for network training without collecting 3D face scans.
Example 2:
an embodiment of the present invention provides a nonlinear 3DMM face reconstruction device corresponding to the nonlinear 3DMM face reconstruction method described in embodiment 1, as shown in fig. 5, the device includes:
and the training module 10 is used for training the nonlinear 3DMM model by using a training set.
The training set comprises a plurality of 2D face image samples, and the nonlinear 3DMM model comprises a CNN encoder, a multilayer perceptual shape decoder, a CNN texture decoder and a rendering layer.
During training, a 2D face image sample input into a nonlinear 3DMM model is estimated by a CNN encoder to obtain a camera projection parameter, a shape parameter and a texture parameter, a multi-layer perception shape decoder decodes the shape parameter into a 3D shape, the CNN texture decoder decodes the texture parameter into a 3D texture, and a rendering layer obtains a rendering image according to the camera projection parameter, the 3D shape and the 3D texture; training parameters of a CNN encoder, a multi-layer perceptual shape decoder and a CNN texture decoder through a loss function;
and the prediction module 20 is configured to input the acquired 2D face image into the trained nonlinear 3DMM model to obtain a 3D face.
The 2D face image is estimated through a CNN encoder to obtain shape parameters and texture parameters, a multi-layer perception shape decoder decodes the shape parameters into a 3D shape, a CNN texture decoder decodes the texture parameters into a 3D texture, and a rendering layer performs 3D rendering according to the 3D shape, the 3D texture and a predefined 2D texture map to obtain a 3D face.
In view of the obstacles of the linear 3DMM in the prior art regarding its data, supervision and linear bases, the present invention learns a nonlinear 3DMM model of face shape and texture from a set of unconstrained 2D face images for 3D reconstruction by innovating the learning paradigm of the 3DMM, and has the following outstanding advantages:
1. The present invention learns a nonlinear 3DMM model, which has a higher representation capability than a conventional linear 3DMM model.
2. The encoding-decoding network structure and the rendering layer provided by the invention enable the training and prediction of the 3D reconstruction task to be carried out end to end.
3. The invention can use unconstrained 2D images for network training without collecting 3D face scans.
As an improvement of the present invention, the training module includes a pre-training unit and a fine-tuning unit, which are performed in sequence, wherein:
The loss function L0 of the pre-training unit is:
L0 = λ1L1 + λ2L2 + λ3L3 + λ4L4
where L1 is the keypoint loss, L2 the 3D shape loss, L3 the 3D texture loss, and L4 the projection parameter loss.
The loss function L of the fine-tuning unit is:
L = L6 + λ5L5 + λ1L1
where L5 is the adversarial loss, in which the generator is the nonlinear 3DMM model and the discriminator is that of PatchGAN, and L6 is the reconstruction loss:
L6 = (1/(H·W)) Σ_{i=1..H} Σ_{j=1..W} (X(i, j) − Y(i, j))²
where X(i, j) is the value of the rendered image at coordinate (i, j), Y(i, j) is the value of the 2D face image at coordinate (i, j), and H and W are respectively the height and width of the 2D face image.
λ1–λ5 are predefined weighting coefficients.
As another improvement of the present invention, the CNN encoder includes 14 convolutional layers, an AvgPool layer, and a fully-connected layer connected in sequence, and outputs the shape parameter and the texture parameter at the AvgPool layer; the CNN texture decoder includes a fully-connected layer and 14 convolutional layers connected in sequence, and outputs the 3D texture at the last convolutional layer; the multi-layer perceptual shape decoder includes two fully-connected layers.
In the prediction module of the present invention, the rendering layer performs 3D rendering according to a 3D shape, a 3D texture, and a predefined 2D texture map to obtain a 3D face, including:
a pre-defining unit for pre-defining a 2D texture map, each pixel point of the 2D texture map corresponding to a vertex of the 3D shape;
and the rendering unit is used for determining the texture value of each vertex in the 3D shape through the texture value of each pixel point in the 2D texture map.
The invention provides a nonlinear 3DMM model realized by a deep neural network, which has stronger 3D reconstruction expressive force than that of a linear 3DMM model. The two-stage network model training method based on semi-supervised learning enables the model to be optimized more easily. And the invention can utilize unconstrained 2D images for network training without collecting 3D face scans.
The device provided by the embodiment of the present invention has the same implementation principle and technical effect as the method embodiments, and for the sake of brief description, reference may be made to the corresponding contents in the method embodiments without reference to the device embodiments. It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the apparatus and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Example 3:
the method provided by the embodiment of the present specification can implement the service logic through a computer program and record the service logic on a storage medium, and the storage medium can be read and executed by a computer, so as to implement the effect of the solution described in embodiment 1 of the present specification. Accordingly, the present invention also provides a computer-readable storage medium for nonlinear 3DMM face reconstruction corresponding to the nonlinear 3DMM face reconstruction method of embodiment 1, comprising a memory for storing processor-executable instructions, which when executed by the processor, implement the steps comprising the nonlinear 3DMM face reconstruction method of embodiment 1.
In view of the obstacles of the linear 3DMM in the prior art regarding its data, supervision and linear bases, the present invention learns a nonlinear 3DMM model of face shape and texture from a set of unconstrained 2D face images for 3D reconstruction by innovating the learning paradigm of the 3DMM, and has the following outstanding advantages:
1. The present invention learns a nonlinear 3DMM model, which has a higher representation capability than a conventional linear 3DMM model.
2. The encoding-decoding network structure and the rendering layer provided by the invention enable the training and prediction of the 3D reconstruction task to be carried out end to end.
3. The invention can use unconstrained 2D images for network training without collecting 3D face scans.
The storage medium may include a physical device for storing information, and typically, the information is digitized and then stored using an electrical, magnetic, or optical media. The storage medium may include: devices that store information using electrical energy, such as various types of memory, e.g., RAM, ROM, etc.; devices that store information using magnetic energy, such as hard disks, floppy disks, tapes, core memories, bubble memories, and usb disks; devices that store information optically, such as CDs or DVDs. Of course, there are other ways of storing media that can be read, such as quantum memory, graphene memory, and so forth.
The above description of the apparatus according to the method embodiment may also include other embodiments. The specific implementation manner may refer to the description of the related method embodiment, and is not described in detail herein.
Example 4:
the invention also provides a device for nonlinear 3DMM face reconstruction, which can be a single computer, and can also comprise an actual operation device and the like using one or more methods or one or more embodiment devices of the specification. The apparatus for nonlinear 3d mm face reconstruction may include at least one processor and a memory storing computer-executable instructions, which when executed by the processor implement the steps of the nonlinear 3d mm face reconstruction method described in embodiment 1 above.
In view of the obstacles of the linear 3DMM in the prior art regarding its data, supervision and linear bases, the present invention learns a nonlinear 3DMM model of face shape and texture from a set of unconstrained 2D face images for 3D reconstruction by innovating the learning paradigm of the 3DMM, and has the following outstanding advantages:
1. The present invention learns a nonlinear 3DMM model, which has a higher representation capability than a conventional linear 3DMM model.
2. The encoding-decoding network structure and the rendering layer provided by the invention enable the training and prediction of the 3D reconstruction task to be carried out end to end.
3. The invention can use unconstrained 2D images for network training without collecting 3D face scans.
The above description of the device according to the method or apparatus embodiment may also include other embodiments, and specific implementation may refer to the description of the related method embodiment, which is not described herein in detail.
Example 5:
the embodiment of the invention provides a human face posture normalization method based on nonlinear 3DMM reconstruction, as shown in FIGS. 6 and 7, the method comprises the following steps:
step S100': and 3D reconstruction is carried out on the 2D face image by using the nonlinear 3DMM face reconstruction method in the embodiment 1 to obtain a 3D face.
This step is equivalent to steps S100 to S200 of embodiment 1, and the specific implementation method and beneficial effects thereof are described in embodiment 1, which is not described again in this embodiment.
Step S200': and carrying out posture normalization on the 3D face.
The 3D face pose normalization can accurately correct the face pose in a three-dimensional space.
Step S300': and projecting the 3D face after the posture normalization onto a two-dimensional plane to obtain a 2D face image after the posture normalization.
The purpose of the 3D face projection is to project the 3D face with normalized posture onto a two-dimensional plane, and further obtain a normalized face on the two-dimensional plane.
The human face posture normalization method based on nonlinear 3DMM reconstruction can accurately normalize the human face posture and effectively solve the problem of low human face recognition precision under the large-angle human face posture: on the basis of the nonlinear 3DMM face reconstruction in the embodiment 1, the invention further performs posture normalization on the 3D face in a three-dimensional space; and finally, projecting the normalized 3D human face onto a two-dimensional image plane in a projection mode, thereby completing the posture normalization of the human faces with different angles on the two-dimensional image. The invention can accurately normalize the face pose, can also accurately normalize the face under the condition of large face pose change, effectively solves the problem of reduction of face identification accuracy under the condition of large-pose face, and can accurately process the face under the large pose. Through experimental tests, the method improves the 3DMM reconstruction performance, and the accuracy of the normalized face based on the method on a face recognition test set reaches 99.93%.
As a modification of the present invention, step S200' includes:
step S210': a standard 3D pose face is predefined, and the standard 3D pose face and the 3D face have the same point cloud number.
Step S220': and performing matrixing storage on the standard 3D posture face and the 3D face and performing parameter fitting to obtain a conversion matrix.
Step S230': and performing matrix product on the conversion matrix and the 3D face matrix, and performing posture rotation on the 3D face to finish the posture normalization of the 3D face.
The 3D face pose normalization is an important step in 2D-image face pose normalization; its purpose is to solve the conversion matrix between the 3D face and a predefined standard 3D pose face, and to rotate the pose of the 3D face through this conversion matrix so as to complete the 3D pose normalization. Specifically, for convenience of calculation and 3D rotation, the number of point clouds of the predefined standard 3D pose face is the same as the number of point clouds of the 3D face estimated in the present invention. When the conversion matrix is calculated, the predefined standard 3D pose face and the estimated 3D face data are each stored as a matrix, and the affine matrix solved between the two matrices is the conversion matrix. The conversion matrix is then matrix-multiplied with the estimated 3D face matrix to complete the 3D pose normalization.
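Steps S210'–S230' can be sketched as below, under the stated assumptions: both point clouds are stored as (Q, 3) matrices with the same number of points in the same vertex order, the conversion matrix is fitted in the least-squares sense using homogeneous coordinates, and it is then applied as a plain matrix product. The function names are illustrative:

```python
import numpy as np

def fit_transform(face, standard_face):
    """Solve a (4 x 3) affine matrix A such that [face | 1] @ A
    approximates the predefined standard-pose face in the
    least-squares sense (the conversion matrix of step S220')."""
    Fh = np.hstack([face, np.ones((face.shape[0], 1))])  # homogeneous coords
    A, *_ = np.linalg.lstsq(Fh, standard_face, rcond=None)
    return A

def apply_transform(face, A):
    """Step S230': matrix product of the (homogeneous) 3D face with
    the conversion matrix, rotating it into the standard pose."""
    Fh = np.hstack([face, np.ones((face.shape[0], 1))])
    return Fh @ A

# Toy check: a rotated + translated point cloud is mapped back onto
# the standard pose (the system is exactly consistent here).
rng = np.random.default_rng(0)
standard = rng.normal(size=(10, 3))                 # standard 3D pose face
theta = 0.5
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])                     # yaw rotation
face = standard @ R.T + np.array([1.0, -2.0, 0.5])  # posed 3D face
A = fit_transform(face, standard)
assert np.allclose(apply_transform(face, A), standard, atol=1e-6)
```

A least-squares affine fit is one simple way to realize the "parameter fitting" of step S220'; a rigid (rotation-only) Procrustes solution would be an alternative if scale and shear must be excluded.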
After the 3D posture normalization processing is completed, the posture normalization of the final 2D face image can be completed only by projecting the 3D texture and the 3D shape of the 3D face to the 2D image plane. Specifically, step S300' includes:
step S310': and dividing the 3D face into 3D meshes according to the vertexes of the 3D face, and coloring the 3D meshes through bilinear interpolation.
When the method performs projection, the 3D face needs to be divided into meshes: according to its vertices, the 3D face is divided into triangular or quadrangular 3D meshes, triangular meshes being preferred in the method; the obtained triangular meshes are also called triangular patches.
Since the texture value of each vertex of the 3D face is determined during projection by its predefined position in the 2D texture map, the 3D mesh needs to be colored by bilinear interpolation once the vertices are subdivided into meshes, to ensure high projection accuracy.
Step S320': and rendering by using a Z cache renderer, and projecting the 3D face onto a two-dimensional plane.
The invention uses a Z-Buffer renderer for rendering. Z-Buffer rendering proceeds according to the distance (i.e., Z value) between each spatial triangular patch and the observer. If a patch's Z value is larger than the value stored in the Z-Buffer, the triangular patch is closer to the observer; it should be rendered at the corresponding position with the patch's color, and the Z value in the Z-Buffer is updated. Conversely, if the patch's Z value is smaller than the stored value, the current triangular patch is relatively far away and is covered by a nearer patch, so it need not be rendered and the Z value need not be updated. The Z-Buffer thus performs hidden-surface elimination: during rendering, structures behind other objects are blanked so that they are not displayed. The part visible to the observer is obtained through the Z-Buffer while the invisible part is blanked, so the final rendering result shows only the observer-visible surface, and the 3D face is projected onto the two-dimensional plane.
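The Z-Buffer rule above can be sketched with a toy renderer. Points stand in for triangular patches here (a real renderer rasterizes whole triangles), and the larger-Z-means-closer convention follows the description:

```python
import numpy as np

def zbuffer_render(points, colors, H, W):
    """Toy Z-Buffer: at each pixel, keep the color of the patch with
    the larger Z value (closer to the observer) and blank whatever it
    hides. points: (N, 3) rows of (x, y, z) with integer pixel
    coordinates x, y; colors: (N, 3) RGB per patch."""
    image = np.zeros((H, W, 3))
    zbuf = np.full((H, W), -np.inf)   # "infinitely far" everywhere
    for (x, y, z), color in zip(points, colors):
        xi, yi = int(x), int(y)
        if z > zbuf[yi, xi]:          # nearer patch wins the pixel
            zbuf[yi, xi] = z
            image[yi, xi] = color
    return image

# Two patches project to the same pixel; only the nearer one is drawn.
pts = np.array([[1, 1, 0.3], [1, 1, 0.9]])
cols = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
img = zbuffer_render(pts, cols, 3, 3)
assert np.allclose(img[1, 1], [0.0, 1.0, 0.0])  # z = 0.9 patch survives
```

The per-pixel depth comparison is exactly the hidden-surface elimination described: the red patch at z = 0.3 is "blanked" because the green patch at z = 0.9 lies in front of it.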
The method provided by the embodiment 1 is used for 3D reconstruction, then the 3D reconstructed face pose is regularized by solving an affine matrix between the 3D reconstructed face and a predefined pose regularization face (standard 3D pose face), and finally the pose regularized 3D face is projected to a 2D image plane to complete face pose normalization. The method can accurately normalize the face posture while improving the 3DMM reconstruction performance.
In the face pose normalization method based on nonlinear 3DMM reconstruction provided in the embodiment of the present invention, the 3D reconstruction method is the nonlinear 3DMM face reconstruction method described in embodiment 1, the implementation principle and the generated technical effect are the same as those of embodiment 1, and for brief description, corresponding contents in embodiment 1 may be referred to where this embodiment does not refer to.
Example 6:
the embodiment of the present invention provides a human face posture normalization device based on nonlinear 3DMM reconstruction, corresponding to the human face posture normalization method based on nonlinear 3DMM reconstruction of embodiment 5, as shown in fig. 8, the device includes:
and a 3D reconstruction module 10' configured to perform 3D reconstruction on the 2D face image by using the nonlinear 3D dm face reconstruction apparatus described in embodiment 2, so as to obtain a 3D face.
And the 3D face normalization module 20' is used for performing pose normalization on the 3D face.
And the projection module 30' is used for projecting the 3D face after the posture normalization onto a two-dimensional plane to obtain a 2D face image after the posture normalization.
The human face posture normalization device based on nonlinear 3DMM reconstruction can accurately normalize the human face posture and effectively solve the problem of low human face recognition precision under the large-angle human face posture: on the basis of the nonlinear 3DMM face reconstruction in the embodiment 2, the invention further performs posture normalization on the 3D face in a three-dimensional space; and finally, projecting the normalized 3D human face onto a two-dimensional image plane in a projection mode, thereby completing the posture normalization of the human faces with different angles on the two-dimensional image. The invention can accurately normalize the face pose, can also accurately normalize the face under the condition of large face pose change, effectively solves the problem of reduction of face identification accuracy under the condition of large-pose face, and can accurately process the face under the large pose. Through experimental tests, the method improves the 3DMM reconstruction performance, and the accuracy of the normalized face based on the method on a face recognition test set reaches 99.93%.
As an improvement of the present invention, the 3D face normalization module includes:
and the predefining unit is used for predefining a standard 3D pose face, and the standard 3D pose face and the 3D face have the same point cloud number.
And the parameter fitting unit is used for matrixing and storing the standard 3D posture face and the 3D face and performing parameter fitting to obtain a conversion matrix.
And the normalization unit is used for performing matrix product on the conversion matrix and the 3D face matrix to finish the posture normalization of the 3D face.
The projection module includes:
and the coloring unit is used for dividing the 3D face into 3D meshes according to the vertexes of the 3D face and coloring the 3D meshes through bilinear interpolation.
And the rendering unit is used for rendering by using the Z cache renderer and projecting the 3D face onto the two-dimensional plane.
The device provided by the embodiment of the present invention has the same implementation principle and technical effect as the method embodiments, and for the sake of brief description, reference may be made to the corresponding contents in the method embodiments without reference to the device embodiments. It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the apparatus and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Example 7:
the method provided by the present specification and described in the foregoing embodiment may implement the service logic through a computer program and record the service logic on a storage medium, where the storage medium may be read and executed by a computer, so as to implement the effect of the solution described in embodiment 5 of the present specification. Accordingly, the present invention also provides a computer-readable storage medium for face pose normalization corresponding to the non-linear 3DMM reconstruction based face pose normalization method of embodiment 5, comprising a memory for storing processor-executable instructions, which when executed by the processor, implement the steps comprising the non-linear 3DMM reconstruction based face pose normalization method of embodiment 5.
The human face posture normalization device based on nonlinear 3DMM reconstruction can accurately normalize the human face posture and effectively solve the problem of low human face recognition precision under the large-angle human face posture: on the basis of the nonlinear 3DMM face reconstruction in the embodiment 1, the invention further performs posture normalization on the 3D face in a three-dimensional space; and finally, projecting the normalized 3D human face onto a two-dimensional image plane in a projection mode, thereby completing the posture normalization of the human faces with different angles on the two-dimensional image. The invention can accurately normalize the face pose, can also accurately normalize the face under the condition of large face pose change, effectively solves the problem of reduction of face identification accuracy under the condition of large-pose face, and can accurately process the face under the large pose. Through experimental tests, the method improves the 3DMM reconstruction performance, and the accuracy of the normalized face based on the method on a face recognition test set reaches 99.93%.
The storage medium may include a physical device for storing information, and typically, the information is digitized and then stored using an electrical, magnetic, or optical media. The storage medium may include: devices that store information using electrical energy, such as various types of memory, e.g., RAM, ROM, etc.; devices that store information using magnetic energy, such as hard disks, floppy disks, tapes, core memories, bubble memories, and usb disks; devices that store information optically, such as CDs or DVDs. Of course, there are other ways of storing media that can be read, such as quantum memory, graphene memory, and so forth.
The above description of the apparatus according to the method embodiment may also include other embodiments. The specific implementation manner may refer to the description of the related method embodiment, and is not described in detail herein.
Example 8:
the invention also provides a device for normalizing the human face pose, which can be a single computer, and can also comprise an actual operation device and the like using one or more methods or one or more embodiment devices of the specification. The apparatus for face pose normalization may comprise at least one processor and a memory storing computer executable instructions, which when executed by the processor, implement the steps of the method for face pose normalization based on nonlinear 3d mm reconstruction described in embodiment 5 above.
The apparatus for face pose normalization based on nonlinear 3DMM reconstruction provides the same benefits as the corresponding method: the 3D face reconstructed as in embodiment 1 is pose-normalized in three-dimensional space and then projected onto a two-dimensional image plane, the normalization remains accurate under large pose variation, and faces normalized by this approach reach 99.93% accuracy on a face recognition test set.
The above description of the device according to the method or apparatus embodiment may also include other embodiments, and specific implementation may refer to the description of the related method embodiment, which is not described herein in detail.
It should be noted that, the above-mentioned apparatus or system in this specification may also include other implementation manners according to the description of the related method embodiment, and a specific implementation manner may refer to the description of the method embodiment, which is not described herein in detail. The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the hardware + program class, storage medium + program embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and for the relevant points, refer to the partial description of the method embodiment.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a vehicle-mounted human-computer interaction device, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various modules by functions, and are described separately. Of course, when implementing one or more of the present description, the functions of each module may be implemented in one or more software and/or hardware, or a module implementing the same function may be implemented by a combination of multiple sub-modules or sub-units, etc. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer readable program code, the same functionality can be implemented by logically programming method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be considered as a hardware component, and the means included therein for performing the various functions may also be considered as a structure within the hardware component. Or even means for performing the functions may be regarded as being both a software module for performing the method and a structure within a hardware component.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method or apparatus that comprises the element.
As will be appreciated by one skilled in the art, one or more embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
One or more embodiments of the present description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. One or more embodiments of the present specification can also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment. In the description of the specification, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the specification. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Finally, it should be noted that the above-mentioned embodiments are merely specific embodiments of the present invention, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the art may, within the technical scope of the present disclosure, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent substitutions for some of their technical features; such modifications, changes, or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention and are intended to be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (12)

1. A nonlinear 3DMM face reconstruction method, the method comprising:
training a nonlinear 3DMM model by using a training set;
the training set comprises a plurality of 2D face image samples, and the nonlinear 3DMM model comprises a CNN encoder, a multilayer perceptual shape decoder, a CNN texture decoder and a rendering layer;
during training, 2D face image samples input into the nonlinear 3DMM model are estimated by the CNN encoder to obtain camera projection parameters, shape parameters and texture parameters, the multilayer perceptual shape decoder decodes the shape parameters into 3D shapes, the CNN texture decoder decodes the texture parameters into 3D textures, and the rendering layer obtains rendered images from the camera projection parameters, the 3D shapes and the 3D textures; parameters of the CNN encoder, the multilayer perceptual shape decoder and the CNN texture decoder are trained through a loss function;
inputting the acquired 2D face image into a trained nonlinear 3DMM model to obtain a 3D face;
the 2D face image is estimated by the CNN encoder to obtain shape parameters and texture parameters, the multilayer perceptual shape decoder decodes the shape parameters into a 3D shape, the CNN texture decoder decodes the texture parameters into a 3D texture, and the rendering layer performs 3D rendering according to the 3D shape, the 3D texture and a predefined 2D texture map to obtain a 3D face.
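The encoder-decoder data flow described in this claim can be sketched as follows. This is a hedged illustration, not the patented network: the image size, latent dimensions, vertex count, and random stand-in weights are all assumptions, and only the shape decoder's two fully connected layers follow the structure stated in the claims.

```python
import numpy as np

# Illustrative sketch of the claim-1 data flow with random stand-in weights.
# All sizes below are assumptions; a trained model learns W_enc, W_s1, W_s2
# and uses convolutional layers rather than one linear map.
rng = np.random.default_rng(0)

IMG_H, IMG_W = 64, 64            # assumed input face-image size
DIM_SHAPE, DIM_TEX = 160, 160    # assumed shape/texture latent sizes
N_VERT = 500                     # assumed number of 3D face vertices

W_enc = rng.normal(0, 0.01, (IMG_H * IMG_W, 8 + DIM_SHAPE + DIM_TEX))
W_s1 = rng.normal(0, 0.01, (DIM_SHAPE, 256))    # shape decoder FC layer 1
W_s2 = rng.normal(0, 0.01, (256, N_VERT * 3))   # shape decoder FC layer 2

def encode(img):
    """CNN-encoder stand-in: image -> camera params m, shape code, texture code."""
    z = img.reshape(-1) @ W_enc
    return z[:8], z[8:8 + DIM_SHAPE], z[8 + DIM_SHAPE:]

def decode_shape(f_shape):
    """Two fully connected layers, as in the claimed multilayer shape decoder."""
    h = np.maximum(0.0, f_shape @ W_s1)        # ReLU hidden layer
    return (h @ W_s2).reshape(N_VERT, 3)       # per-vertex (x, y, z)

img = rng.random((IMG_H, IMG_W))
m, f_s, f_t = encode(img)
S = decode_shape(f_s)
print(m.shape, f_s.shape, S.shape)  # (8,) (160,) (500, 3)
```

A trained texture decoder would analogously map the texture code to a 3D texture, and the rendering layer would combine the camera parameters, 3D shape and 3D texture into a rendered image.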
2. The nonlinear 3DMM face reconstruction method according to claim 1, wherein the training of the nonlinear 3DMM model includes a pre-training phase and a fine-tuning phase performed in sequence, wherein:
the loss function L0 of the pre-training phase is:
L0 = λ1L1 + L2 + λ3L3 + λ4L4;
L1 is the keypoint loss, L2 is the 3D shape loss, L3 is the 3D texture loss, and L4 is the projection parameter loss;
the loss function L of the fine-tuning stage is:
L = L6 + λ5L5 + λ1L1;
L5 is the adversarial loss, in which the generator is the nonlinear 3DMM model and the discriminator is a PatchGAN discriminator; L6 is the reconstruction loss;
L6 = (1/(H·W)) · Σ(i=1..H) Σ(j=1..W) |X(i, j) − Y(i, j)|
X(i, j) is the value of the rendered image at coordinate (i, j), Y(i, j) is the value of the 2D face image at coordinate (i, j), and H and W are the height and width of the 2D face image, respectively;
λ1~λ5 are predefined coefficients.
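The two-stage loss weighting of this claim can be written as a small sketch. The loss terms are passed in as plain scalars, the coefficient values are illustrative assumptions (the claim leaves λ1~λ5 predefined), and the mean-absolute-difference reconstruction term is one plausible reading of the claim:

```python
import numpy as np

def pretrain_loss(L1, L2, L3, L4, lam1=1.0, lam3=1.0, lam4=1.0):
    """Pre-training loss: L0 = λ1·L1 + L2 + λ3·L3 + λ4·L4."""
    return lam1 * L1 + L2 + lam3 * L3 + lam4 * L4

def finetune_loss(L6, L5, L1, lam5=0.1, lam1=1.0):
    """Fine-tuning loss: L = L6 + λ5·L5 + λ1·L1."""
    return L6 + lam5 * L5 + lam1 * L1

def reconstruction_loss(X, Y):
    """L6: mean per-pixel absolute difference between rendered image X
    and input 2D face image Y (assumed absolute-difference form)."""
    H, W = Y.shape[:2]
    return float(np.abs(X - Y).sum() / (H * W))

print(pretrain_loss(1.0, 1.0, 1.0, 1.0))                       # 4.0
print(reconstruction_loss(np.ones((2, 2)), np.zeros((2, 2))))  # 1.0
```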
3. The nonlinear 3DMM face reconstruction method according to claim 1 or 2, wherein the CNN encoder includes 14 convolutional layers, an AvgPool layer and a fully connected layer which are connected in sequence, and the CNN encoder outputs shape parameters and texture parameters at the AvgPool layer; the CNN texture decoder comprises a full connection layer and 14 convolution layers which are connected in sequence, and the CNN texture decoder outputs a 3D texture at the last convolution layer; the multi-layer perceptual shape decoder includes two fully connected layers.
4. The nonlinear 3DMM face reconstruction method of claim 3, wherein the rendering layer performs 3D rendering according to the 3D shape, 3D texture and predefined 2D texture map to obtain 3D face, comprising:
predefining a 2D texture map, each pixel point of the 2D texture map corresponding to a vertex of a 3D shape;
and determining the texture value of each vertex in the 3D shape according to the texture value of each pixel point in the 2D texture map.
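Under the per-vertex correspondence this claim describes, assigning textures reduces to a lookup in the 2D texture map. A minimal sketch, with an assumed map size and an assumed (row, column) table mapping each vertex to its pixel:

```python
import numpy as np

TEX_H, TEX_W = 8, 8   # assumed texture-map size
# One predefined (row, col) pixel per 3D vertex (illustrative values).
uv = np.array([[0, 0], [3, 4], [7, 7]])

# A synthetic RGB texture map whose values encode their own flat position.
texture_map = np.arange(TEX_H * TEX_W * 3, dtype=float).reshape(TEX_H, TEX_W, 3)

# Claim 4's rule: the texture value of each vertex in the 3D shape is the
# texture value of its corresponding pixel in the predefined 2D texture map.
vertex_tex = texture_map[uv[:, 0], uv[:, 1]]

print(vertex_tex.shape)        # (3, 3): three vertices, RGB each
print(vertex_tex[1].tolist())  # [84.0, 85.0, 86.0]
```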
5. A non-linear 3DMM face reconstruction apparatus, comprising:
the training module is used for training the nonlinear 3DMM model by using a training set;
the training set comprises a plurality of 2D face image samples, and the nonlinear 3DMM model comprises a CNN encoder, a multilayer perceptual shape decoder, a CNN texture decoder and a rendering layer;
during training, 2D face image samples input into the nonlinear 3DMM model are estimated by the CNN encoder to obtain camera projection parameters, shape parameters and texture parameters, the multilayer perceptual shape decoder decodes the shape parameters into 3D shapes, the CNN texture decoder decodes the texture parameters into 3D textures, and the rendering layer obtains rendered images from the camera projection parameters, the 3D shapes and the 3D textures; parameters of the CNN encoder, the multilayer perceptual shape decoder and the CNN texture decoder are trained through a loss function;
the prediction module is used for inputting the acquired 2D face image into the trained nonlinear 3DMM model to obtain a 3D face;
the 2D face image is estimated by the CNN encoder to obtain shape parameters and texture parameters, the multilayer perceptual shape decoder decodes the shape parameters into a 3D shape, the CNN texture decoder decodes the texture parameters into a 3D texture, and the rendering layer performs 3D rendering according to the 3D shape, the 3D texture and a predefined 2D texture map to obtain a 3D face.
6. The nonlinear 3DMM face reconstruction apparatus of claim 5, wherein the training module comprises a pre-training unit and a fine-tuning unit that are performed sequentially, wherein:
the loss function L0 of the pre-training unit is:
L0 = λ1L1 + L2 + λ3L3 + λ4L4;
L1 is the keypoint loss, L2 is the 3D shape loss, L3 is the 3D texture loss, and L4 is the projection parameter loss;
the loss function L of the fine-tuning unit is:
L = L6 + λ5L5 + λ1L1;
L5 is the adversarial loss, in which the generator is the nonlinear 3DMM model and the discriminator is a PatchGAN discriminator; L6 is the reconstruction loss;
L6 = (1/(H·W)) · Σ(i=1..H) Σ(j=1..W) |X(i, j) − Y(i, j)|
X(i, j) is the value of the rendered image at coordinate (i, j), Y(i, j) is the value of the 2D face image at coordinate (i, j), and H and W are the height and width of the 2D face image, respectively;
λ1~λ5 are predefined coefficients.
7. The nonlinear 3DMM face reconstruction apparatus according to claim 5 or 6, wherein the CNN encoder includes 14 convolutional layers, an AvgPool layer and a fully connected layer which are connected in sequence, and the CNN encoder outputs shape parameters and texture parameters at the AvgPool layer; the CNN texture decoder comprises a full connection layer and 14 convolution layers which are connected in sequence, and the CNN texture decoder outputs a 3D texture at the last convolution layer; the multi-layer perceptual shape decoder includes two fully connected layers.
8. The apparatus of claim 7, wherein in the prediction module, the rendering layer performs 3D rendering according to the 3D shape, 3D texture and predefined 2D texture map to obtain a 3D face, and the apparatus comprises:
a pre-defining unit for pre-defining a 2D texture map, each pixel point of the 2D texture map corresponding to a vertex of a 3D shape;
and the rendering unit is used for determining the texture value of each vertex in the 3D shape through the texture value of each pixel point in the 2D texture map.
9. A computer readable storage medium for non-linear 3DMM face reconstruction, comprising a memory for storing processor executable instructions which, when executed by the processor, perform steps comprising the non-linear 3DMM face reconstruction method of any of claims 1-4.
10. An apparatus for nonlinear 3DMM face reconstruction, comprising at least one processor and a memory storing computer-executable instructions which, when executed by the processor, implement the steps of the nonlinear 3DMM face reconstruction method of any of claims 1-4.
11. A method for normalizing human face pose based on nonlinear 3DMM reconstruction is characterized by comprising the following steps:
3D reconstruction is carried out on the 2D face image by using the nonlinear 3DMM face reconstruction method of any one of claims 1 to 4 to obtain a 3D face;
carrying out posture normalization on the 3D face;
and projecting the 3D face after the posture normalization onto a two-dimensional plane to obtain a 2D face image after the posture normalization.
12. The method of claim 11, wherein the pose normalization of the 3D face comprises:
predefining a standard 3D pose face, wherein the standard 3D pose face and the 3D face have the same number of points in their point clouds;
storing the standard 3D pose face and the 3D face in matrix form and performing parameter fitting to obtain a conversion matrix;
multiplying the conversion matrix with the 3D face matrix to complete the pose normalization of the 3D face;
the projecting the 3D face after the posture normalization onto a two-dimensional plane comprises the following steps:
dividing the 3D face into 3D meshes according to the vertices of the 3D face, and coloring the 3D meshes by bilinear interpolation;
and rendering with a Z-buffer renderer to project the 3D face onto a two-dimensional plane.
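The matrix-based pose normalization of this claim can be sketched as follows. Least-squares fitting is one plausible reading of the claim's "parameter fitting" (the patent does not name a solver), and the standard-pose face here is synthesized by rotating the input so the fit can be checked exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
face = rng.random((100, 3))        # reconstructed 3D face (arbitrary pose)

# Synthetic standard-pose face: the same point cloud rotated by 30 degrees,
# so the true conversion matrix is known and the fit should recover it.
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
standard = face @ R.T

# "Parameter fitting": solve face @ T ≈ standard in the least-squares sense.
T, *_ = np.linalg.lstsq(face, standard, rcond=None)

# "Matrix product": applying T pose-normalizes the 3D face.
normalized = face @ T

print(T.shape)                                       # (3, 3)
print(np.allclose(normalized, standard, atol=1e-8))  # True
```

With real data, the standard-pose face would be the predefined template with the same number of points, and the coloring and Z-buffer rendering steps would then flatten the normalized face onto the image plane.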
CN201910820065.3A 2019-06-24 2019-08-31 Nonlinear 3DMM face reconstruction and pose normalization method, device, medium and equipment Pending CN112215050A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2019105512009 2019-06-24
CN201910551200 2019-06-24

Publications (1)

Publication Number Publication Date
CN112215050A true CN112215050A (en) 2021-01-12

Family

ID=74047951

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910820065.3A Pending CN112215050A (en) 2019-06-24 2019-08-31 Nonlinear 3DMM face reconstruction and pose normalization method, device, medium and equipment

Country Status (1)

Country Link
CN (1) CN112215050A (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112884889A (en) * 2021-04-06 2021-06-01 北京百度网讯科技有限公司 Model training method, model training device, human head reconstruction method, human head reconstruction device, human head reconstruction equipment and storage medium
CN112967373A (en) * 2021-02-03 2021-06-15 重庆邮电大学 Nonlinear 3 DMM-based face image feature coding method
CN113112596A (en) * 2021-05-12 2021-07-13 北京深尚科技有限公司 Face geometric model extraction and 3D face reconstruction method, device and storage medium
CN113221842A (en) * 2021-06-04 2021-08-06 第六镜科技(北京)有限公司 Model training method, image recognition method, device, equipment and medium
CN113223124A (en) * 2021-03-30 2021-08-06 华南理工大学 Posture migration method based on three-dimensional human body parameterized model
CN113343927A (en) * 2021-07-03 2021-09-03 郑州铁路职业技术学院 Intelligent face recognition method and system suitable for facial paralysis patient
CN113538682A (en) * 2021-07-19 2021-10-22 北京的卢深视科技有限公司 Model training method, head reconstruction method, electronic device, and storage medium
CN113822977A (en) * 2021-06-28 2021-12-21 腾讯科技(深圳)有限公司 Image rendering method, device, equipment and storage medium
CN113870399A (en) * 2021-09-23 2021-12-31 北京百度网讯科技有限公司 Expression driving method and device, electronic equipment and storage medium
CN113887293A (en) * 2021-08-31 2022-01-04 际络科技(上海)有限公司 Visual human face three-dimensional reconstruction method based on linear solution
CN114338959A (en) * 2021-04-15 2022-04-12 西安汉易汉网络科技股份有限公司 End-to-end text-to-video video synthesis method, system medium and application
CN114531561A (en) * 2022-01-25 2022-05-24 阿里巴巴(中国)有限公司 Face video coding method, decoding method and device
CN114926591A (en) * 2022-05-25 2022-08-19 广州图匠数据科技有限公司 Multi-branch deep learning 3D face reconstruction model training method, system and medium
CN115083000A (en) * 2022-07-14 2022-09-20 北京百度网讯科技有限公司 Face model training method, face changing device and electronic equipment
CN115147508A (en) * 2022-06-30 2022-10-04 北京百度网讯科技有限公司 Method and device for training clothing generation model and method and device for generating clothing image
CN116091871A (en) * 2023-03-07 2023-05-09 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Physical countermeasure sample generation method and device for target detection model
CN117036620A (en) * 2023-10-07 2023-11-10 中国科学技术大学 Three-dimensional face reconstruction method based on single image
CN117315211A (en) * 2023-11-29 2023-12-29 苏州元脑智能科技有限公司 Digital human synthesis and model training method, device, equipment and storage medium thereof
CN117894059A (en) * 2024-03-15 2024-04-16 国网江西省电力有限公司信息通信分公司 A 3D face recognition method

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060067573A1 (en) * 2000-03-08 2006-03-30 Parr Timothy C System, method, and apparatus for generating a three-dimensional representation from one or more two-dimensional images
US20090157649A1 (en) * 2007-12-17 2009-06-18 Panagiotis Papadakis Hybrid Method and System for Content-based 3D Model Search
CN101763636A (en) * 2009-09-23 2010-06-30 中国科学院自动化研究所 Method for tracing position and pose of 3D human face in video sequence
CN102436636A (en) * 2010-09-29 2012-05-02 中国科学院计算技术研究所 Method and system for automatically segmenting hair
CN102999942A (en) * 2012-12-13 2013-03-27 清华大学 Three-dimensional face reconstruction method
CN103400105A (en) * 2013-06-26 2013-11-20 东南大学 Method identifying non-front-side facial expression based on attitude normalization
CN104598879A (en) * 2015-01-07 2015-05-06 东南大学 Three-dimensional face recognition method based on face contour lines of semi-rigid areas
CN105144247A (en) * 2012-12-12 2015-12-09 微软技术许可有限责任公司 Generation of a three-dimensional representation of a user
CN107122725A (en) * 2017-04-18 2017-09-01 深圳大学 A kind of face identification method and its system based on joint sparse discriminant analysis
CN107122705A (en) * 2017-03-17 2017-09-01 中国科学院自动化研究所 Face critical point detection method based on three-dimensional face model
CN107506717A (en) * 2017-08-17 2017-12-22 南京东方网信网络科技有限公司 Without the face identification method based on depth conversion study in constraint scene
CN107680158A (en) * 2017-11-01 2018-02-09 长沙学院 A kind of three-dimensional facial reconstruction method based on convolutional neural networks model
CN108510573A (en) * 2018-04-03 2018-09-07 南京大学 A method of the multiple views human face three-dimensional model based on deep learning is rebuild
CN109255831A (en) * 2018-09-21 2019-01-22 南京大学 The method that single-view face three-dimensional reconstruction and texture based on multi-task learning generate
CN109299643A (en) * 2018-07-17 2019-02-01 深圳职业技术学院 A face recognition method and system based on large pose alignment
CN109697688A (en) * 2017-10-20 2019-04-30 虹软科技股份有限公司 A kind of method and apparatus for image procossing

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060067573A1 (en) * 2000-03-08 2006-03-30 Parr Timothy C System, method, and apparatus for generating a three-dimensional representation from one or more two-dimensional images
US20090157649A1 (en) * 2007-12-17 2009-06-18 Panagiotis Papadakis Hybrid Method and System for Content-based 3D Model Search
CN101763636A (en) * 2009-09-23 2010-06-30 中国科学院自动化研究所 Method for tracing position and pose of 3D human face in video sequence
CN102436636A (en) * 2010-09-29 2012-05-02 中国科学院计算技术研究所 Method and system for automatically segmenting hair
CN105144247A (en) * 2012-12-12 2015-12-09 微软技术许可有限责任公司 Generation of a three-dimensional representation of a user
CN102999942A (en) * 2012-12-13 2013-03-27 清华大学 Three-dimensional face reconstruction method
CN103400105A (en) * 2013-06-26 2013-11-20 东南大学 Method identifying non-front-side facial expression based on attitude normalization
CN104598879A (en) * 2015-01-07 2015-05-06 东南大学 Three-dimensional face recognition method based on face contour lines of semi-rigid areas
CN107122705A (en) * 2017-03-17 2017-09-01 中国科学院自动化研究所 Face critical point detection method based on three-dimensional face model
CN107122725A (en) * 2017-04-18 2017-09-01 深圳大学 A kind of face identification method and its system based on joint sparse discriminant analysis
CN107506717A (en) * 2017-08-17 2017-12-22 南京东方网信网络科技有限公司 Without the face identification method based on depth conversion study in constraint scene
CN109697688A (en) * 2017-10-20 2019-04-30 虹软科技股份有限公司 A kind of method and apparatus for image procossing
CN107680158A (en) * 2017-11-01 2018-02-09 长沙学院 A kind of three-dimensional facial reconstruction method based on convolutional neural networks model
CN108510573A (en) * 2018-04-03 2018-09-07 南京大学 A method of the multiple views human face three-dimensional model based on deep learning is rebuild
CN109299643A (en) * 2018-07-17 2019-02-01 深圳职业技术学院 A face recognition method and system based on large pose alignment
CN109255831A (en) * 2018-09-21 2019-01-22 南京大学 The method that single-view face three-dimensional reconstruction and texture based on multi-task learning generate

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LUAN TRAN et al.: "Nonlinear 3D Face Morphable Model", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3-4 *
WANG Qianqing et al.: "Face Pose and Expression Correction Based on a 3D Morphable Model" (基于三维形变模型的人脸姿势表情校正), Computer Science (《计算机科学》), vol. 46, no. 6, pages 1-4 *

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112967373A (en) * 2021-02-03 2021-06-15 重庆邮电大学 Nonlinear 3 DMM-based face image feature coding method
CN113223124B (en) * 2021-03-30 2022-06-10 华南理工大学 A Pose Transfer Method Based on 3D Human Parametric Model
CN113223124A (en) * 2021-03-30 2021-08-06 华南理工大学 Posture migration method based on three-dimensional human body parameterized model
CN112884889A (en) * 2021-04-06 2021-06-01 北京百度网讯科技有限公司 Model training method, model training device, human head reconstruction method, human head reconstruction device, human head reconstruction equipment and storage medium
CN112884889B (en) * 2021-04-06 2022-05-20 北京百度网讯科技有限公司 Model training method, model training device, human head reconstruction method, human head reconstruction device, human head reconstruction equipment and storage medium
CN114338959A (en) * 2021-04-15 2022-04-12 西安汉易汉网络科技股份有限公司 End-to-end text-to-video video synthesis method, system medium and application
CN113112596A (en) * 2021-05-12 2021-07-13 北京深尚科技有限公司 Face geometric model extraction and 3D face reconstruction method, device and storage medium
CN113112596B (en) * 2021-05-12 2023-10-24 北京深尚科技有限公司 Face geometric model extraction and 3D face reconstruction method, equipment and storage medium
CN113221842A (en) * 2021-06-04 2021-08-06 第六镜科技(北京)有限公司 Model training method, image recognition method, device, equipment and medium
CN113221842B (en) * 2021-06-04 2023-12-29 第六镜科技(北京)集团有限责任公司 Model training method, image recognition method, device, equipment and medium
CN113822977A (en) * 2021-06-28 2021-12-21 腾讯科技(深圳)有限公司 Image rendering method, device, equipment and storage medium
CN113343927B (en) * 2021-07-03 2023-06-23 郑州铁路职业技术学院 An intelligent face recognition method and system suitable for patients with facial paralysis
CN113343927A (en) * 2021-07-03 2021-09-03 郑州铁路职业技术学院 Intelligent face recognition method and system suitable for facial paralysis patient
CN113538682A (en) * 2021-07-19 2021-10-22 北京的卢深视科技有限公司 Model training method, head reconstruction method, electronic device, and storage medium
CN113887293A (en) * 2021-08-31 2022-01-04 际络科技(上海)有限公司 Visual human face three-dimensional reconstruction method based on linear solution
CN113870399A (en) * 2021-09-23 2021-12-31 北京百度网讯科技有限公司 Expression driving method and device, electronic equipment and storage medium
CN113870399B (en) * 2021-09-23 2022-12-02 北京百度网讯科技有限公司 Expression driving method and device, electronic equipment and storage medium
WO2023045317A1 (en) * 2021-09-23 2023-03-30 北京百度网讯科技有限公司 Expression driving method and apparatus, electronic device and storage medium
CN114531561A (en) * 2022-01-25 2022-05-24 阿里巴巴(中国)有限公司 Face video coding method, decoding method and device
CN114926591B (en) * 2022-05-25 2025-05-02 广州图匠数据科技有限公司 Multi-branch deep learning 3D face reconstruction model training method, system and medium
CN114926591A (en) * 2022-05-25 2022-08-19 广州图匠数据科技有限公司 Multi-branch deep learning 3D face reconstruction model training method, system and medium
CN115147508B (en) * 2022-06-30 2023-09-22 北京百度网讯科技有限公司 Training of clothing generation model and method and device for generating clothing image
CN115147508A (en) * 2022-06-30 2022-10-04 北京百度网讯科技有限公司 Method and device for training clothing generation model and method and device for generating clothing image
CN115083000A (en) * 2022-07-14 2022-09-20 北京百度网讯科技有限公司 Face model training method, face changing device and electronic equipment
CN115083000B (en) * 2022-07-14 2023-09-05 北京百度网讯科技有限公司 Face model training method, face changing method, face model training device and electronic equipment
CN116091871B (en) * 2023-03-07 2023-08-25 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Physical countermeasure sample generation method and device for target detection model
CN116091871A (en) * 2023-03-07 2023-05-09 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Physical countermeasure sample generation method and device for target detection model
CN117036620A (en) * 2023-10-07 2023-11-10 中国科学技术大学 Three-dimensional face reconstruction method based on single image
CN117036620B (en) * 2023-10-07 2024-03-01 中国科学技术大学 Three-dimensional face reconstruction method based on single image
CN117315211A (en) * 2023-11-29 2023-12-29 苏州元脑智能科技有限公司 Digital human synthesis and model training method, device, equipment and storage medium thereof
CN117315211B (en) * 2023-11-29 2024-02-23 苏州元脑智能科技有限公司 Digital human synthesis and model training method, device, equipment and storage medium thereof
CN117894059A (en) * 2024-03-15 2024-04-16 国网江西省电力有限公司信息通信分公司 A 3D face recognition method

Similar Documents

Publication Title
CN112215050A (en) Nonlinear 3DMM face reconstruction and pose normalization method, device, medium and equipment
Tran et al. On learning 3d face morphable model from in-the-wild images
CN111047548B (en) Attitude transformation data processing method and device, computer equipment and storage medium
Wang et al. Shape inpainting using 3d generative adversarial network and recurrent convolutional networks
Yang et al. Weakly-supervised disentangling with recurrent transformations for 3d view synthesis
Tu et al. Consistent 3d hand reconstruction in video via self-supervised learning
KR102602112B1 (en) Data processing method, device, and medium for generating facial images
CN112132739B (en) 3D reconstruction and face pose normalization method, device, storage medium and equipment
US12340440B2 (en) Adaptive convolutions in neural networks
CN110598601A (en) Face 3D key point detection method and system based on distributed thermodynamic diagram
CN110516643A (en) A 3D face key point detection method and system based on joint heat map
CN116385667A (en) Reconstruction method of three-dimensional model, training method and device of texture reconstruction model
CN113298931A (en) Reconstruction method and device of object model, terminal equipment and storage medium
CN112967373B (en) A Face Image Feature Coding Method Based on Nonlinear 3DMM
Huang et al. Object-occluded human shape and pose estimation with probabilistic latent consistency
CN118553001A (en) Texture-controllable three-dimensional fine face reconstruction method and device based on sketch input
CN114694081B (en) A video sample generation method based on multi-attribute synthesis
Kim et al. Deep transformer based video inpainting using fast Fourier tokenization
US20220172421A1 (en) Enhancement of Three-Dimensional Facial Scans
US20230104702A1 (en) Transformer-based shape models
CN112926543B (en) Image generation, three-dimensional model generation method, device, electronic device and medium
CN116883524A (en) Image generation model training, image generation method and device and computer equipment
CN116912148B (en) Image enhancement method, device, computer equipment and computer readable storage medium
Lee et al. Holistic 3D face and head reconstruction with geometric details from a single image
CN114882173B (en) A 3D monocular hair modeling method and device based on implicit expression

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210112