
CN104574432B - Three-dimensional face reconstruction method and three-dimensional face reconstruction system for automatic multi-view-angle face auto-shooting image - Google Patents


Info

Publication number
CN104574432B
CN104574432B (application CN201510080860.5A)
Authority
CN
China
Prior art keywords
face
dimensional
model
image
reconstruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510080860.5A
Other languages
Chinese (zh)
Other versions
CN104574432A (en)
Inventor
李靓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Chuanda Zhisheng Software Co Ltd
Original Assignee
Sichuan Chuanda Zhisheng Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Chuanda Zhisheng Software Co Ltd filed Critical Sichuan Chuanda Zhisheng Software Co Ltd
Priority to CN201510080860.5A priority Critical patent/CN104574432B/en
Publication of CN104574432A publication Critical patent/CN104574432A/en
Application granted granted Critical
Publication of CN104574432B publication Critical patent/CN104574432B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/143Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a three-dimensional face reconstruction method for automatic multi-view face self-portrait images. The method comprises the following steps: automatically locating landmark points on multi-view face images of the same person; establishing an objective function from the located landmark points and the corresponding landmark points on a reference face model to solve the camera parameters; designing a reconstruction objective function; and converting the three-dimensional face reconstruction problem into a multi-label image segmentation problem under a Markov random field, which is solved with a multi-label image segmentation algorithm. The method can reconstruct a dense and fine three-dimensional face model and does not depend on an external database, so fully automatic face reconstruction is achieved without manual interaction by the user.

Description

Three-dimensional face reconstruction method and system for automatic multi-view face self-portrait image
Technical Field
The invention relates to the field of computer vision, in particular to a three-dimensional face reconstruction method and a three-dimensional face reconstruction system for automatic multi-view face self-portrait images.
Background
Face reconstruction is one of the important research directions of three-dimensional reconstruction. It has wide application prospects in fields such as movies, games, and three-dimensional face recognition, and is valued by researchers in computer graphics, computer vision, machine vision, computer-aided design, and related fields. From the perspective of data acquisition, three-dimensional face reconstruction methods mainly divide into those based on active ranging equipment and those based on passive imaging equipment. Active ranging equipment such as laser scanners can obtain accurate three-dimensional information of a static object, but such equipment is expensive, scanning takes a long time, the scanning range is limited, and it is difficult to use in applications with strict real-time requirements; in contrast, a depth camera can capture a dynamic object in real time, but the depth maps it produces have low resolution, low precision, and high noise. The most common passive imaging device is the camera; because the device is simple and inexpensive and a large number of two-dimensional face images already exist, methods that recover three-dimensional face structure from multi-view two-dimensional face images have attracted wide attention. Because the texture of face images is sparse, the ambiguity arising in the feature-point matching process must be resolved.
The document [Y. Lin, G. Medioni, and J. Choi. Accurate 3D face reconstruction from weakly calibrated wide baseline images with profile contours. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pages 1490-1497. IEEE, 2010] proposes a wide-baseline multi-view face reconstruction method under weakly calibrated conditions. The method takes as input face images under five poses (0° frontal, ±45° and ±90° profile). The relative camera positions are estimated by searching for stable matching points across any three adjacent views; then, combining multi-view color consistency, smoothness, and profile-contour information, a voxel-based objective function is established and solved in the horizontal and vertical directions of a cylindrical coordinate system. However, in practical application the automatically extracted profile contour often cannot meet the accuracy requirement of reconstruction; on the other hand, the experimental results show that the face model reconstructed by this method exhibits large deformation at certain view angles. Because the texture features of face images are sparse, methods based on feature-point matching fail when corresponding points cannot be found.
A three-dimensional face reconstruction method based on two images (a 0° frontal face and a 90° profile face) is proposed in the document [H. Han and A. K. Jain. 3D face texture modeling from uncalibrated frontal and profile images. In Biometrics: Theory, Applications and Systems (BTAS), 2012 IEEE Fifth International Conference on, pages 223-230. IEEE, 2012]. The algorithm is mainly based on a three-dimensional morphable model (3DMM) combined with face landmark points: first, deformation and texture parameters are estimated from the frontal landmark points; then the profile landmark points are used to further correct the model. The algorithm relies on manually marked profile landmark points; meanwhile, the reconstructed face model shows a certain degree of deformation at some view angles (such as 45°). In addition, 3DMM-based face reconstruction must be combined with a well-aligned three-dimensional face database, and the reconstruction result is obtained by linearly superposing entries of that database, so the method depends on a well-aligned prior database and lacks the capability to describe three-dimensional face details.
Existing face reconstruction methods based on multi-view images mainly have the following defects: 1) owing to the sparse texture features peculiar to face images, methods based on traditional feature-point matching are ill-suited to practical application, and dense three-dimensional data are difficult to obtain with them; 2) cumbersome manual interaction is required; 3) they depend on an external three-dimensional face database, and the accuracy of the reconstruction result depends on the richness of that database.
Disclosure of Invention
The invention aims to overcome the above defects in the prior art by providing a three-dimensional face reconstruction method and a three-dimensional face reconstruction system for automatic multi-view face self-portrait images that can reconstruct a dense and fine three-dimensional face model; moreover, the reconstruction method does not depend on an external database, realizes fully automatic face reconstruction, and requires no manual interaction from the user.
To achieve this aim, the invention adopts the following technical scheme: a three-dimensional face reconstruction method for automatic multi-view face self-portrait images, comprising the following steps:
step one, automatically locating landmark points on multi-view face images of the same person;
step two, establishing an objective function according to the located landmark points and the landmark points on the reference face model to solve the camera parameters P_i; the objective function is E(P_i) = Σ_j ||x_j − P_i·X_j||² (j = 1, …, n), where m_i = {x_1, x_2, …, x_n} are the landmark points located on the i-th face image I_i, M = {X_1, X_2, …, X_n} are the landmark points on the reference face model, m_i and M correspond one-to-one, the parameter P_i is the projection transformation matrix from three-dimensional points to the corresponding image, and n is the number of landmark points;
step three, establishing and optimizing a reconstruction objective function, and solving the objective function with a multi-label image segmentation algorithm to obtain the three-dimensional face model.
Preferably, the optimized objective function is E = E_data + E_color + E_smooth. The data term is E_data = Σ_i (X_i − D_i)², where D denotes the reference face model converted into two-dimensional image space, X is the face model to be solved, i is each pixel of the two-dimensional image space, X_i is the depth corresponding to pixel i on the model to be estimated, and D_i is the depth corresponding to pixel i on the reference model. The multi-view color consistency term is E_color = Σ_{(k1,k2)} Σ_i || I_{k1}(P_{k1}·X̃_i) − I_{k2}(P_{k2}·X̃_i) ||, where (k1, k2) are pairs of different views, P_k is the projection matrix corresponding to view k, and X̃_i is the three-dimensional point in space corresponding to pixel i. The depth smoothing term is E_smooth = Σ_i Σ_{j∈N(i)} |X_i − X_j|, where N(i) is the neighborhood set of pixel i, and X_i and X_j are the depths corresponding to pixels i and j.
Preferably, in step one, a regression-based method or a local-optimization-based method is adopted to automatically locate the landmark points on the multi-view face images; the landmark points include the inner and outer eye corners, nose tip, mouth corners, facial contour, and the like.
Preferably, the optimization objective function E should satisfy the following condition: a) the output three-dimensional face model is similar to the reference face model; b) the color consistency between the output three-dimensional face model and the multi-view face image is satisfied; c) and the smoothness of depth change is satisfied in the local neighborhood of the output three-dimensional face model.
Preferably, the marking points on the reference face model are marked in advance.
The invention also provides a three-dimensional face reconstruction system for automatic multi-view face self-portrait images, comprising:
the marking point positioning module is used for automatically positioning marking points for the face images of multiple visual angles of the same person;
the camera parameter estimation module is used for establishing an objective function according to the located landmark points and the landmark points on the reference face model to solve the camera parameters P_i; the objective function is E(P_i) = Σ_j ||x_j − P_i·X_j||² (j = 1, …, n), where m_i = {x_1, x_2, …, x_n} are the landmark points located on the i-th face image I_i, M = {X_1, X_2, …, X_n} are the landmark points on the reference face model, m_i and M correspond one-to-one, the parameter P_i is the projection transformation matrix from three-dimensional points to the corresponding image, and n is the number of landmark points;
and the optimization solving module is used for optimizing the reconstruction objective function and solving the objective function by using a multi-label image segmentation algorithm to obtain a three-dimensional face model.
Preferably, the reconstruction objective function is E = E_data + E_color + E_smooth. The data term is E_data = Σ_i (X_i − D_i)², where D denotes the reference face model converted into two-dimensional image space, X is the face model to be solved, i is each pixel of the two-dimensional image space, X_i is the depth corresponding to pixel i on the model to be estimated, and D_i is the depth corresponding to pixel i on the reference model. The multi-view color consistency term is E_color = Σ_{(k1,k2)} Σ_i || I_{k1}(P_{k1}·X̃_i) − I_{k2}(P_{k2}·X̃_i) ||, where (k1, k2) are pairs of different views, P_k is the projection matrix corresponding to view k, and X̃_i is the three-dimensional point in space corresponding to pixel i. The depth smoothing term is E_smooth = Σ_i Σ_{j∈N(i)} |X_i − X_j|, where N(i) is the neighborhood set of pixel i, and X_i and X_j are the depths corresponding to pixels i and j.
Preferably, the landmark point positioning module automatically locates the landmark points on the multi-view face images with a regression-based method or a local-optimization-based method; the landmark points include the inner and outer eye corners, nose tip, mouth corners, facial contour, and the like.
Preferably, the optimization solving module optimizes the objective function E to satisfy the following condition: a) the output three-dimensional face model is similar to the reference face model; b) the color consistency between the output three-dimensional face model and the multi-view face image is satisfied; c) and the smoothness of depth change is satisfied in the local neighborhood of the output three-dimensional face model.
Preferably, the marking points on the reference face model are marked in advance.
Compared with the prior art, the invention has the beneficial effects that:
the method estimates the relative position relation between the face images with different visual angles and the solution space of the face to be reconstructed by introducing a single standard reference face model, and converts the three-dimensional face reconstruction problem into a multi-label image segmentation problem in a Markov Random Field (MRF) frame. By introducing the reference face model, ambiguity existing in the feature point matching process can be effectively eliminated, meanwhile, the solution space is reduced, the calculation performance of the algorithm is increased, stable reconstruction results are efficiently and quickly obtained, and the dense and fine three-dimensional face model can be reconstructed. Moreover, the method does not depend on an aligned external database, any reference face model can be used for operation, and the recovery of the face details is obtained through a pixel-level optimization algorithm. In addition, the method can realize full-automatic face reconstruction, does not require manual interaction of a user, and the reconstructed three-dimensional face model can be rotated to any other visual angle.
Description of the drawings:
FIG. 1 is a flowchart of a method in example 1 of the present invention.
Fig. 2 is a system block diagram in embodiment 2 of the present invention.
Detailed Description
The present invention will be described in further detail with reference to specific embodiments. It should be understood that the scope of the above-described subject matter is not limited to the following examples, and any techniques implemented based on the disclosure of the present invention are within the scope of the present invention.
The difficulty of three-dimensional reconstruction from multi-view face images is that dense three-dimensional data are hard to obtain with traditional feature-point-matching methods, a limitation determined by the inherent texture sparsity of two-dimensional face images. The invention aims to obtain a dense full-face three-dimensional face model from several uncalibrated face images. The inventor's research found that the relative positional relationship between face images of the same person from different views is fixed, and that differences between faces manifest as slight differences in local geometry; thus any face can be obtained by transforming a reference face. The invention estimates the relative positional relationship between face images of different views and the solution space of the face to be reconstructed by introducing a standard reference face model, and converts the three-dimensional face reconstruction problem into a multi-label image segmentation problem within a Markov random field (MRF) framework, the corresponding function comprising a data term, a multi-view color consistency term, and a depth smoothing term. The method can reconstruct an accurate three-dimensional face model and can further be used in a three-dimensional face recognition system. The detailed description follows with reference to the accompanying drawings.
The three-dimensional face reconstruction method for automatic multi-view face self-portrait images, whose flow chart is shown in fig. 1, comprises the following steps:
step one, automatically positioning mark points for face images of multiple visual angles of the same person.
Specifically, landmark points are automatically located on the multi-view face images with a regression-based method (such as the Gauss-Newton Deformable Part Model) or a local-optimization-based method (such as the Constrained Local Model); the landmark points include the inner and outer eye corners, nose tip, mouth corners, facial contour, and the like, and may also be other commonly used landmark points.
Step two: establishing an objective function according to the located landmark points and the landmark points on the reference face model to solve the camera parameters P_i; the objective function is E(P_i) = Σ_j ||x_j − P_i·X_j||² (j = 1, …, n), where m_i = {x_1, x_2, …, x_n} are the landmark points located on the i-th face image I_i, M = {X_1, X_2, …, X_n} are the landmark points on the reference face model, m_i and M correspond one-to-one, the parameter P_i is the projection transformation matrix from three-dimensional points to the corresponding image, and n is the number of landmark points.
Landmark points m_i = {x_1, x_2, …, x_n} are located on the i-th input face image I_i, where n is the number of landmark points. Assuming that the located landmark points m_i correspond one-to-one to the landmark points M = {X_1, X_2, …, X_n} on the reference face model, the corresponding camera parameters P_i are estimated by solving the energy function E(P_i) = Σ_j ||x_j − P_i·X_j||².
the parameter P to be estimated, Pi, represents the projection transformation of a three-dimensional point to a corresponding image, and can be obtained by least square-based Levenberg-Marquardt algorithm optimization. And marking the marking points on the reference face model in advance. The parameter P is estimated, namely, camera calibration is carried out, which is an important step of three-dimensional reconstruction, in the early three-dimensional reconstruction technology, a calibration object-based camera calibration method is mostly adopted for European reconstruction, and with the further research, the modern three-dimensional reconstruction technology including the method provided by the invention mostly carries out self-calibration by calculating internal parameters of a camera, and the self-calibration is the existing mature technology and is not detailed. If the three-dimensional reconstruction cannot be performed without self-calibration of the camera, the estimated camera parameters are used in the later energy function (i.e., the optimized function E). Solving function E ═ Edata+Ecolor+EsmoothThereby restoring the three-dimensional face model.
Step three: optimizing the reconstruction objective function, converting the three-dimensional face reconstruction problem into a multi-label image segmentation problem under a Markov random field, and solving the objective function with a multi-label image segmentation algorithm to obtain the three-dimensional face model. Multi-label image segmentation algorithms are well developed and include the α-expansion algorithm, the α-β swap algorithm, belief propagation (BP), and the like. It should be noted that each of the specific algorithms adopted by the invention is an existing mature technique.
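For intuition about what such a multi-label solver does, a deliberately simple stand-in is iterated conditional modes (ICM) on a pixel grid. The patent names α-expansion, α-β swap, and BP, so treat the following only as an illustrative sketch of multi-label optimization with a |l_i − l_j| smoothness cost; all names and the weighting are assumptions.

```python
import numpy as np

def icm_depth_labels(unary, lam=1.0, n_iters=5):
    """Iterated Conditional Modes on a 4-connected pixel grid.

    unary: (H, W, N) cost of assigning each of N depth labels to each pixel
           (data and colour terms folded together); the pairwise cost is
           lam * |l_i - l_j|, mirroring the depth-smoothness term.
    """
    H, W, N = unary.shape
    labels = unary.argmin(axis=2)          # start from the unary optimum
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    lab_range = np.arange(N)
    for _ in range(n_iters):
        for y in range(H):
            for x in range(W):
                cost = unary[y, x].copy()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < H and 0 <= nx < W:
                        # smoothness pull toward the neighbour's label
                        cost += lam * np.abs(lab_range - labels[ny, nx])
                labels[y, x] = cost.argmin()
    return labels
```

ICM only finds a local optimum, which is exactly why graph-cut methods such as α-expansion are preferred in practice; the sketch is meant to show the energy being minimized, not to replace them.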
The optimized objective function is E = E_data + E_color + E_smooth. The data term is E_data = Σ_i (X_i − D_i)², where D denotes the reference face model converted into two-dimensional image space, X is the face model to be solved, i is each pixel of the two-dimensional image space, X_i is the depth corresponding to pixel i on the model to be estimated, and D_i is the depth corresponding to pixel i on the reference model;
the multi-view color uniformity term isWherein (k)1,k2) Representing pairs of different views, PkI.e. the projection matrix corresponding to the viewing angle k, corresponding to the parameter Pi,representing a three-dimensional point in a three-dimensional space corresponding to the pixel i;
the depth smoothing term isWhere n (i) is a neighborhood set where the pixel i is located, and Xi and Xj represent depths corresponding to the pixels i and j.
The optimization objective function E in the invention should satisfy the following conditions: a) the output three-dimensional face model is similar to the reference model; b) the color consistency between the output three-dimensional face model and the multi-view face image is satisfied; c) and the smoothness of depth change is satisfied in the local neighborhood of the output three-dimensional face model.
Specifically, the relative positional relationship between the different face images is fixed, and the differences between faces manifest as differences in local geometry; the required face model can therefore be obtained by transforming any face. The invention converts this transformation between different faces into a multi-label image segmentation problem under a Markov random field. Accordingly, the corresponding optimization function E should satisfy the following conditions: 1. the output three-dimensional face model is similar to the reference face model; 2. color consistency between the output three-dimensional face model and the multi-view face images is satisfied; 3. smoothness of depth change is satisfied within local neighborhoods of the output three-dimensional face model. Finally, the objective function is solved with a multi-label image segmentation algorithm to obtain the three-dimensional face model.
Because the reference face model lives in three-dimensional space, it must be converted into two-dimensional image space so that the corresponding multi-label image segmentation algorithm can be applied. The spatial conversion of the reference face model can be realized in two ways: 1. project the three-dimensional face model onto a cylindrical coordinate system, express the projection value as depth, and unroll the cylinder to obtain a depth map in two-dimensional space; 2. project the three-dimensional model onto a defined coordinate system in which the vertical direction of the two-dimensional depth map coincides with the y axis of the spatial coordinate system, the horizontal direction is defined as the angle between the current three-dimensional point and the z axis, and the stored value is the depth of the current three-dimensional point.
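A minimal sketch of the first conversion (cylindrical unrolling) follows, under the assumption that the reference model is given as a vertex cloud; the resolution, axis conventions, and function name are illustrative only.

```python
import numpy as np

def cylindrical_depth_map(vertices, width=180, height=200):
    """Project 3D face vertices onto a cylinder around the y axis and
    unroll it into a 2D depth map.

    vertices: (n, 3) array of x, y, z coordinates, face looking along +z.
    Returns a (height, width) depth map where each cell stores the radial
    distance of the nearest projected vertex (NaN where nothing projects).
    """
    x, y, z = vertices[:, 0], vertices[:, 1], vertices[:, 2]
    theta = np.arctan2(x, z)                       # angle around the y axis
    radius = np.sqrt(x**2 + z**2)                  # depth = radial distance
    # map angle and height to pixel coordinates
    u = ((theta + np.pi) / (2 * np.pi) * (width - 1)).round().astype(int)
    v_min, v_max = y.min(), y.max()
    v = ((y - v_min) / (v_max - v_min + 1e-12) * (height - 1)).round().astype(int)
    depth = np.full((height, width), np.nan)
    for ui, vi, ri in zip(u, v, radius):
        if np.isnan(depth[vi, ui]) or ri < depth[vi, ui]:
            depth[vi, ui] = ri                     # keep the nearest surface point
    return depth
```

A real pipeline would also rasterize triangles rather than scatter vertices, so that the depth map has no holes; the sketch only shows the coordinate conversion itself.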
Suppose S is the three-dimensional solution space after conversion to the two-dimensional image: the x-y plane corresponds to the converted image space, and the z axis corresponds to the depth direction. That is, the solution space lies within a neighborhood centered on the reference face model. Equivalently, S can be viewed as a cuboid bounding box whose front face lies in the x-y plane and faces the z direction. According to the required modeling accuracy, the bounding box is cut into N equal slices by planes parallel to the x-y plane; a larger N means higher modeling accuracy. Each pixel therefore corresponds to N candidate depth values; an optimal depth value is solved for each pixel by establishing the objective function E, and the result is converted back to three-dimensional space to obtain the final three-dimensional face model.
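The slicing of the bounding box S into N depth labels can be sketched as follows. The band half-width and all names are assumptions; in the full method the colour-consistency and smoothness terms are added on top of this per-label data cost.

```python
import numpy as np

def build_depth_labels(ref_depth, N=64, band=20.0):
    """Discretise the solution space S into N candidate depth planes.

    ref_depth: (H, W) depth map of the reference face model in image space.
    band:      half-width of the bounding box around the reference depths.
    Returns (H, W, N) candidate depths and the data-term costs
    E_data(i, l) = (X_l - D_i)^2 for every pixel i and label l.
    """
    offsets = np.linspace(-band, band, N)                   # label -> depth offset
    candidates = ref_depth[:, :, None] + offsets[None, None, :]
    data_cost = (candidates - ref_depth[:, :, None]) ** 2   # equals offsets**2
    return candidates, data_cost
```

Note that the data cost alone is minimized by label N/2 (zero offset) everywhere, i.e. the reference model itself; it is the colour term that pushes pixels toward the depths consistent with the input images.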
The objective function is defined as E = E_data + E_color + E_smooth. The data term is E_data = Σ_i (X_i − D_i)², where D is the reference face model converted into two-dimensional image space, X is the face model to be solved, i is each pixel in the two-dimensional image space, X_i is the depth corresponding to pixel i on the model to be estimated, and D_i is the depth corresponding to pixel i on the reference model. This term requires the three-dimensional face model being optimized to be as similar as possible to the reference face model.
The multi-view color consistency term is defined as E_color = Σ_{(k1,k2)} Σ_i || I_{k1}(P_{k1}·X̃_i) − I_{k2}(P_{k2}·X̃_i) ||, where (k1, k2) are pairs of different views, P_k is the projection matrix corresponding to view k, and X̃_i is the three-dimensional point in space corresponding to pixel i. This term requires that the projected colors of the same three-dimensional point X̃_i on the images of different views remain consistent.
The third term is the depth smoothing term, defined as E_smooth = Σ_i Σ_{j∈N(i)} |X_i − X_j|, where N(i) is the neighborhood set of pixel i. This term requires that the depths of adjacent pixels in image space change smoothly.
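Putting the three terms together, a naive evaluation of E for one candidate depth map might look like the sketch below. The `project` callback stands in for applying P_k to the back-projected point X̃_i; everything here, including grayscale images and the 4-connected neighborhood, is an illustrative assumption rather than the patent's code.

```python
import numpy as np

def total_energy(X, D, images, project, lam_s=1.0):
    """Evaluate E = E_data + E_color + E_smooth for a candidate depth map X.

    X, D    : (H, W) candidate and reference depth maps
    images  : list of grayscale view images
    project : project(k, y, x, depth) -> integer pixel (row, col) in view k,
              standing in for P_k applied to the back-projected 3D point
    """
    e_data = float(np.sum((X - D) ** 2))

    # colour consistency over all pairs of views
    e_color = 0.0
    H, W = X.shape
    pairs = [(a, b) for a in range(len(images)) for b in range(a + 1, len(images))]
    for k1, k2 in pairs:
        for y in range(H):
            for x in range(W):
                r1, c1 = project(k1, y, x, X[y, x])
                r2, c2 = project(k2, y, x, X[y, x])
                e_color += abs(float(images[k1][r1, c1]) - float(images[k2][r2, c2]))

    # 4-connected smoothness |X_i - X_j|, each edge counted once
    e_smooth = lam_s * (np.abs(np.diff(X, axis=0)).sum()
                        + np.abs(np.diff(X, axis=1)).sum())
    return e_data + e_color + e_smooth
```

A multi-label solver then searches over the N candidate depths per pixel for the labeling minimizing this quantity; the brute-force loops above are only for exposition.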
The invention estimates the relative positional relationship between face images of different views and the solution space of the face to be reconstructed by introducing a standard reference face model, and converts the three-dimensional face reconstruction problem into a multi-label image segmentation problem within a Markov random field (MRF) framework, the corresponding function comprising a data term, a multi-view color consistency term, and a depth smoothing term. The method can reconstruct an accurate three-dimensional face model and can further be used in a three-dimensional face recognition system. Introducing the reference face model effectively eliminates the ambiguity in the feature-point matching process, reduces the solution space, and improves the computational efficiency of the algorithm, so that a stable reconstruction result is obtained efficiently and quickly and a dense and fine three-dimensional face model can be reconstructed. Moreover, the method does not depend on an external database: any reference face model can be used, and face details are recovered through a pixel-level optimization algorithm. In addition, the method realizes fully automatic face reconstruction without manual interaction by the user, and the reconstructed three-dimensional face model can be rotated to any other view angle.
Based on the same inventive concept and with reference to fig. 2, an embodiment of the invention also provides a three-dimensional face reconstruction system for automatic multi-view face self-portrait images, comprising a landmark point positioning module, a camera parameter estimation module, and an optimization solving module.
The landmark point positioning module is used for automatically locating landmark points on multi-view face images of the same person. The camera parameter estimation module is used for establishing an objective function according to the landmark points located by the landmark point positioning module and the landmark points on the reference face model to solve the camera parameters P_i; the objective function is E(P_i) = Σ_j ||x_j − P_i·X_j||² (j = 1, …, n), where m_i = {x_1, x_2, …, x_n} are the landmark points located on the i-th face image I_i, M = {X_1, X_2, …, X_n} are the landmark points on the reference face model, m_i and M correspond one-to-one, the parameter P_i is the projection transformation from three-dimensional points to the corresponding image, and n is the number of landmark points.
the optimization solving module is used for optimizing the reconstruction target function, converting the three-dimensional face reconstruction problem into a multi-label image segmentation problem under a Markov random field, and solving the target function by using a multi-label image segmentation algorithm to obtain a three-dimensional face model.
As described for the method embodiment, the invention estimates the relative positional relationship between face images of different views and the solution space of the face to be reconstructed by introducing a standard reference face model, and converts the three-dimensional face reconstruction problem into a multi-label image segmentation problem within a Markov random field (MRF) framework, the corresponding function comprising a data term, a multi-view color consistency term, and a depth smoothing term. The system can reconstruct an accurate three-dimensional face model and can further be used in a three-dimensional face recognition system. Introducing the reference face model effectively eliminates the ambiguity in the feature-point matching process, reduces the solution space, and improves the computational efficiency of the algorithm, so that a stable reconstruction result is obtained efficiently and quickly and a dense and fine three-dimensional face model can be reconstructed. Moreover, the system does not depend on an external database: any reference face model can be used, and face details are recovered through a pixel-level optimization algorithm. In addition, the system realizes fully automatic face reconstruction without manual interaction by the user, and the reconstructed three-dimensional face model can be rotated to any other view angle.
Specifically, the landmark points on the reference face model are marked in advance. The optimized objective function is E = E_data + E_color + E_smooth. The data term is E_data = Σ_i (X_i − D_i)², where D denotes the reference face model converted into two-dimensional image space, X is the face model to be solved, i is each pixel in the two-dimensional image space, X_i is the depth corresponding to pixel i on the model to be estimated, and D_i is the depth corresponding to pixel i on the reference face model. The multi-view color consistency term is E_color = Σ_{(k1,k2)} Σ_i || I_{k1}(P_{k1}·X̃_i) − I_{k2}(P_{k2}·X̃_i) ||, where (k1, k2) are pairs of different views, P_k is the projection matrix corresponding to view k, and X̃_i is the three-dimensional point in space corresponding to pixel i. The depth smoothing term is E_smooth = Σ_i Σ_{j∈N(i)} |X_i − X_j|, where N(i) is the neighborhood set of pixel i, and X_i and X_j are the depths corresponding to pixels i and j.
The marker point positioning module automatically locates marker points on the face images at multiple viewing angles using a regression-based method or a local-optimization-based method, the marker points including the inner and outer eye corners, nose tip, mouth corners, facial contour, and the like.
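Regression-based landmark locators of the kind referenced here typically iterate a learned update x ← x + R·φ(I, x) starting from a mean shape. The toy sketch below uses a fixed matrix as a stand-in for a trained cascade regressor, and its "features" are synthetic; every value is assumed for illustration only.

```python
import numpy as np

# Toy sketch of regression-based landmark localization: from a mean shape, a
# "trained" regressor maps features phi(I, x) to a shape update, applied in a
# cascade. The regressor and features below are synthetic stand-ins.

n_points = 5                                   # e.g. eye corners, nose tip, mouth corners
target = np.linspace(-1.0, 1.0, 2 * n_points)  # ground-truth (x, y) coords (toy)
mean_shape = np.zeros(2 * n_points)            # cascade initialization

def features(shape):
    """Stand-in for image-indexed features phi(I, x); here the true residual,
    so the cascade visibly converges on this toy problem."""
    return target - shape

R = 0.5 * np.eye(2 * n_points)                 # "trained" linear regressor (assumed)

shape = mean_shape.copy()
for _ in range(10):                            # cascade stages: x <- x + R phi(I, x)
    shape = shape + R @ features(shape)
```

Each stage halves the residual, so after ten stages the estimated shape is within about 0.1% of the target; a real cascade learns R per stage from annotated training images rather than using the residual directly.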
The optimization solving module optimizes the objective function E so that the following conditions are met: a) the output three-dimensional face model is similar to the reference model; b) color consistency between the output three-dimensional face model and the multi-view face images is satisfied; and c) smoothness of depth variation is satisfied in the local neighborhood of the output three-dimensional face model. This embodiment is based on the same concept as the method embodiment shown in fig. 1; for the common points, reference is made to the corresponding description of the method embodiment above, and details are not repeated here.
Aiming at the shortcomings of current multi-view face image reconstruction methods, the invention provides a face reconstruction method based on a single reference face model. The method takes as input a set of multi-view face images and a reference face model (whose marker points are known), and outputs a three-dimensional face model corresponding to the input face images. By introducing the reference face model, the method effectively eliminates the ambiguity inherent in feature-point matching, reduces the solution space, improves the computational performance of the algorithm, and obtains a stable reconstruction result efficiently; moreover, the method does not depend on an external database, any reference face can be used, and facial details are recovered through a pixel-level optimization algorithm; in addition, the method achieves fully automatic face reconstruction without manual user interaction. The method can reconstruct a dense and fine three-dimensional face model; the reconstructed three-dimensional model can be rotated to any other viewing angle and can be used in a face recognition system.
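One sub-problem of such a pipeline — recovering each view's projection matrix from correspondences between located 2D marker points and the reference model's 3D marker points — is classically solved with a direct linear transform (DLT) least-squares fit. The sketch below runs on synthetic data with an assumed ground-truth camera P_true of our own invention; the patent does not prescribe this particular solver.

```python
import numpy as np

# Sketch of DLT camera estimation: recover the projection matrix P that
# minimizes sum_i ||x_i - P X_i||^2 (up to scale) from 2D-3D marker
# correspondences. P_true and the synthetic markers are our own assumptions.

rng = np.random.default_rng(0)
P_true = np.array([[800.,   0., 320., 10.],
                   [  0., 800., 240., 20.],
                   [  0.,   0.,   1.,  5.]])

X3d = rng.uniform(-1.0, 1.0, (8, 3))          # 3D marker points on the model
Xh = np.hstack([X3d, np.ones((8, 1))])        # homogeneous coordinates
proj = (P_true @ Xh.T).T
x2d = proj[:, :2] / proj[:, 2:3]              # "located" 2D markers per image

rows = []
for (u, v), Xw in zip(x2d, Xh):               # two DLT rows per correspondence
    rows.append(np.concatenate([Xw, np.zeros(4), -u * Xw]))
    rows.append(np.concatenate([np.zeros(4), Xw, -v * Xw]))
_, _, Vt = np.linalg.svd(np.asarray(rows))
P_est = Vt[-1].reshape(3, 4)                  # null-space vector = P up to scale
P_est = P_est / (P_est[2, 3] / P_true[2, 3])  # fix overall scale for comparison

reproj = (P_est @ Xh.T).T
x_reproj = reproj[:, :2] / reproj[:, 2:3]     # should match observed markers
```

With noise-free correspondences the null-space solution reproduces the camera exactly; with real, noisy marker detections the same linear system is solved in the least-squares sense, which is why at least six well-spread correspondences are needed.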
While the present invention has been described in detail with reference to the embodiments shown in the drawings, the present invention is not limited to the above embodiments, and various modifications or alterations can be made by those skilled in the art without departing from the spirit and scope of the claims of the present application.

Claims (8)

1. A three-dimensional face reconstruction method for automatic multi-view face self-portrait images, characterized by comprising the following steps:
step one, automatically locating marker points on face images of multiple viewing angles of the same person;
step two, establishing an objective function according to the located marker points and the marker points on the reference face model to solve the camera parameters P; wherein the objective function is E(P_i) = Σ_{j=1}^{n} ||x_j − P_i X_j||², where x_j ∈ m_i = {x_1, x_2, …, x_n} are the marker points located on the i-th face image I_i, X_j ∈ M_i = {X_1, X_2, …, X_n} are the marker points on the reference face model, m_i and M_i correspond one-to-one, the parameter P_i represents the projection from three-dimensional points to the image I_i, and n is the number of marker points;
step three, establishing a reconstruction objective function E = E_data + E_color + E_smooth, optimizing the reconstruction objective function, and solving it with a multi-label image segmentation algorithm to obtain a three-dimensional face model;
the reconstruction objective function is defined as E = E_data + E_color + E_smooth;
wherein the data term is E_data = Σ_i (X_i − D_i)², where i is each pixel in the two-dimensional image space, X_i represents the depth corresponding to pixel i on the model to be estimated, and D_i represents the depth corresponding to pixel i on the reference model;
the multi-view color consistency term is E_color = Σ_{(k1,k2)} Σ_i ||I_{k1}(P_{k1} X̃_i) − I_{k2}(P_{k2} X̃_i)||²,
where (k_1, k_2) represents a pair of different views, P_k is the projection matrix corresponding to view k, and X̃_i represents the three-dimensional point in three-dimensional space corresponding to pixel i on the model to be estimated;
the depth smoothing term is E_smooth = Σ_i Σ_{j∈N(i)} |X_i − X_j|, where N(i) is the neighborhood set of pixel i on the model to be estimated, and X_i, X_j represent the depths corresponding to pixels i and j on the model to be estimated.
2. The three-dimensional face reconstruction method according to claim 1, wherein in step one, a regression-based method or a local-optimization-based method is adopted to automatically locate the marker points on the face image of each viewing angle, the marker points including the inner and outer eye corners, nose tip, mouth corners, and contour.
3. The method according to claim 2, wherein the optimized objective function E satisfies the following conditions: a) the output three-dimensional face model is similar to the reference face model; b) color consistency between the output three-dimensional face model and the multi-view face images is satisfied; and c) smoothness of depth variation is satisfied in the local neighborhood of the output three-dimensional face model.
4. The method according to claim 2, wherein the marker points on the reference face model are marked in advance.
5. A three-dimensional face reconstruction system for automatic multi-view face self-portrait images, characterized by comprising:
a marker point positioning module, configured to automatically locate marker points on face images of multiple viewing angles of the same person;
a camera parameter estimation module, configured to establish an objective function according to the located marker points and the marker points on the reference face model to solve the camera parameters P; wherein the objective function is E(P_i) = Σ_{j=1}^{n} ||x_j − P_i X_j||², where x_j ∈ m_i = {x_1, x_2, …, x_n} are the marker points located on the i-th face image I_i, X_j ∈ M_i = {X_1, X_2, …, X_n} are the marker points on the reference face model, m_i and M_i correspond one-to-one, the parameter P_i represents the projection from three-dimensional points to the image I_i, and n is the number of marker points;
an optimization solving module, configured to optimize the reconstruction objective function E = E_data + E_color + E_smooth and solve it with a multi-label image segmentation algorithm to obtain a three-dimensional face model;
the reconstruction objective function is E = E_data + E_color + E_smooth;
wherein the data term is E_data = Σ_i (X_i − D_i)², where i is each pixel in the two-dimensional image space, X_i represents the depth corresponding to pixel i on the model to be estimated, and D_i represents the depth corresponding to pixel i on the reference model;
the multi-view color consistency term is E_color = Σ_{(k1,k2)} Σ_i ||I_{k1}(P_{k1} X̃_i) − I_{k2}(P_{k2} X̃_i)||²,
where (k_1, k_2) represents a pair of different views, P_k is the projection matrix corresponding to view k, and X̃_i represents the three-dimensional point in three-dimensional space corresponding to pixel i on the model to be estimated;
the depth smoothing term is E_smooth = Σ_i Σ_{j∈N(i)} |X_i − X_j|, where N(i) is the neighborhood set of pixel i on the model to be estimated, and X_i, X_j represent the depths corresponding to pixels i and j on the model to be estimated.
6. The system according to claim 5, wherein the marker point positioning module automatically locates marker points, including the inner and outer eye corners, nose tip, mouth corners, and contour, on the face image at each viewing angle using a regression-based method or a local-optimization-based method.
7. The system according to claim 6, wherein the optimization solving module optimizes the objective function E so that the following conditions are satisfied: a) the output three-dimensional face model is similar to the reference face model; b) color consistency between the output three-dimensional face model and the multi-view face images is satisfied; and c) smoothness of depth variation is satisfied in the local neighborhood of the output three-dimensional face model.
8. The system according to claim 6, wherein the marker points on the reference face model are marked in advance.
CN201510080860.5A 2015-02-15 2015-02-15 Three-dimensional face reconstruction method and three-dimensional face reconstruction system for automatic multi-view-angle face auto-shooting image Active CN104574432B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510080860.5A CN104574432B (en) 2015-02-15 2015-02-15 Three-dimensional face reconstruction method and three-dimensional face reconstruction system for automatic multi-view-angle face auto-shooting image


Publications (2)

Publication Number Publication Date
CN104574432A CN104574432A (en) 2015-04-29
CN104574432B true CN104574432B (en) 2017-05-24

Family

ID=53090404

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510080860.5A Active CN104574432B (en) 2015-02-15 2015-02-15 Three-dimensional face reconstruction method and three-dimensional face reconstruction system for automatic multi-view-angle face auto-shooting image

Country Status (1)

Country Link
CN (1) CN104574432B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105761243A (en) * 2016-01-28 2016-07-13 四川川大智胜软件股份有限公司 Three-dimensional full face photographing system based on structured light projection and photographing method thereof
CN105809734B (en) * 2016-03-10 2018-12-25 杭州师范大学 A kind of mechanical model three-dimensional modeling method based on multi-angle of view interactive mode
CN107194964B (en) * 2017-05-24 2020-10-09 电子科技大学 VR social contact system based on real-time human body three-dimensional reconstruction and method thereof
CN107730519A (en) * 2017-09-11 2018-02-23 广东技术师范学院 A kind of method and system of face two dimensional image to face three-dimensional reconstruction
CN109961477A (en) * 2017-12-25 2019-07-02 深圳超多维科技有限公司 A kind of space-location method, device and equipment
CN108596965B (en) * 2018-03-16 2021-06-04 天津大学 Light field image depth estimation method
CN108510583B (en) * 2018-04-03 2019-10-11 北京华捷艾米科技有限公司 Face image generation method and face image generation device
CN110378994B (en) 2018-04-12 2021-05-28 Oppo广东移动通信有限公司 Face modeling method and related products
CN109523633B (en) * 2018-09-30 2023-06-02 先临三维科技股份有限公司 Model scanning method, device, equipment, storage medium and processor
CN109887076B (en) * 2019-02-25 2021-02-12 清华大学 Method and device for establishing three-dimensional model of human face according to visual angle change
CN109978989B (en) 2019-02-26 2023-08-01 腾讯科技(深圳)有限公司 Three-dimensional face model generation method, three-dimensional face model generation device, computer equipment and storage medium
CN117974867B (en) * 2024-04-01 2024-06-21 哈尔滨工业大学(威海) A monocular face avatar generation method based on Gaussian point rendering

Citations (5)

Publication number Priority date Publication date Assignee Title
CN101593365A (en) * 2009-06-19 2009-12-02 电子科技大学 A general adjustment method for 3D face model
CN102609977A (en) * 2012-01-12 2012-07-25 浙江大学 Depth integration and curved-surface evolution based multi-viewpoint three-dimensional reconstruction method
CN103065289A (en) * 2013-01-22 2013-04-24 清华大学 Four-ocular video camera front face reconstruction method based on binocular stereo vision
CN103198523A (en) * 2013-04-26 2013-07-10 清华大学 Three-dimensional non-rigid body reconstruction method and system based on multiple depth maps
CN103914874A (en) * 2014-04-08 2014-07-09 中山大学 Compact SFM three-dimensional reconstruction method without feature extraction

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US7184071B2 (en) * 2002-08-23 2007-02-27 University Of Maryland Method of three-dimensional object reconstruction from a video sequence using a generic model


Non-Patent Citations (3)

Title
Ravi Garg et al., "Dense Variational Reconstruction of Non-Rigid Surfaces from Monocular Video," 2013 IEEE Conference on Computer Vision and Pattern Recognition, Jun. 30, 2013, pp. 1272-1279 *
Ding Bin et al., "Three-Dimensional Face Reconstruction from Multiple Uncalibrated Images" (从多张非标定图像重建三维人脸), Journal of Computer-Aided Design & Computer Graphics, Feb. 28, 2010, vol. 22, no. 2, p. 211, left column lines 15-17 and right column lines 9-14 *
Lu Tao et al., "An Improved Dense Alignment Method for Three-Dimensional Faces" (改进的三维人脸稠密对齐方法), Electronic Science and Technology (电子科技), Jul. 31, 2012, vol. 25, no. 7, pp. 15-17 *

Also Published As

Publication number Publication date
CN104574432A (en) 2015-04-29

Similar Documents

Publication Publication Date Title
CN104574432B (en) Three-dimensional face reconstruction method and three-dimensional face reconstruction system for automatic multi-view-angle face auto-shooting image
CN110728671B (en) Vision-Based Dense Reconstruction Methods for Textureless Scenes
CN104036546B (en) Method for carrying out face three-dimensional reconstruction at any viewing angle on basis of self-adaptive deformable model
CN108053437B (en) Three-dimensional model obtaining method and device based on posture
CN103971408B (en) Three-dimensional facial model generating system and method
CN101916454B (en) Method for reconstructing high-resolution human face based on grid deformation and continuous optimization
CN112001926B (en) RGBD multi-camera calibration method, system and application based on multi-dimensional semantic mapping
US20170330375A1 (en) Data Processing Method and Apparatus
CN105184857B (en) Monocular vision based on structure light ranging rebuilds mesoscale factor determination method
CN109840940A (en) Dynamic three-dimensional reconstruction method, device, equipment, medium and system
CN109658444B (en) A Regular 3D Color Point Cloud Registration Method Based on Multimodal Features
CN106780619A (en) A kind of human body dimension measurement method based on Kinect depth cameras
CN107767456A (en) A kind of object dimensional method for reconstructing based on RGB D cameras
CN106780713A (en) A kind of three-dimensional face modeling method and system based on single width photo
KR20050059247A (en) Three dimensional face recognition
CN106934824B (en) A Global Non-rigid Registration and Reconstruction Method for Deformable Objects
CN110647782A (en) Three-dimensional face reconstruction and multi-pose face recognition method and device
CN101661623B (en) Three-dimensional tracking method of deformable body based on linear programming
CN116486015A (en) Automatic three-dimensional size detection and CAD digital-analog reconstruction method for check cabinet
CN114766039B (en) Object detection method, object detection device, terminal device, and medium
CN108171790A (en) A kind of Object reconstruction method based on dictionary learning
KR101673144B1 (en) Stereoscopic image registration method based on a partial linear method
Shen et al. A 3D Modeling Method of Indoor Objects Using Kinect Sensor
CN109741389B (en) Local stereo matching method based on region base matching
Zhang et al. Dense 3D facial reconstruction from a single depth image in unconstrained environment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant