CN104077804B - Method for constructing a three-dimensional face model from multi-frame video images - Google Patents
Method for constructing a three-dimensional face model from multi-frame video images
- Publication number
- CN104077804B (application CN201410253326.5A)
- Authority
- CN
- China
- Prior art keywords
- dimensional
- video
- dimensional face
- face
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Links
- 238000000034 method Methods 0.000 title claims abstract description 42
- 238000010276 construction Methods 0.000 title claims abstract description 16
- 230000001815 facial effect Effects 0.000 claims abstract description 17
- 230000033001 locomotion Effects 0.000 claims abstract description 12
- 230000004807 localization Effects 0.000 claims abstract description 10
- 230000001360 synchronised effect Effects 0.000 claims abstract description 9
- 238000004364 calculation method Methods 0.000 claims abstract description 5
- 238000004040 coloring Methods 0.000 claims abstract description 4
- 230000008569 process Effects 0.000 claims description 12
- 239000011159 matrix material Substances 0.000 claims description 9
- 230000015572 biosynthetic process Effects 0.000 claims description 6
- 238000002372 labelling Methods 0.000 claims description 6
- 238000003786 synthesis reaction Methods 0.000 claims description 6
- 230000000007 visual effect Effects 0.000 claims description 6
- 238000012800 visualization Methods 0.000 claims description 6
- 230000008878 coupling Effects 0.000 claims description 4
- 238000010168 coupling process Methods 0.000 claims description 4
- 238000005859 coupling reaction Methods 0.000 claims description 4
- 238000003780 insertion Methods 0.000 claims description 4
- 230000037431 insertion Effects 0.000 claims description 4
- 230000009466 transformation Effects 0.000 claims description 4
- 238000001514 detection method Methods 0.000 claims description 3
- 238000001914 filtration Methods 0.000 claims description 3
- 230000002123 temporal effect Effects 0.000 claims description 3
- 230000013011 mating Effects 0.000 claims description 2
- 238000012544 monitoring process Methods 0.000 claims description 2
- 239000002131 composite material Substances 0.000 abstract description 2
- 238000012545 processing Methods 0.000 abstract description 2
- 230000008859 change Effects 0.000 description 6
- 238000013507 mapping Methods 0.000 description 5
- 238000010586 diagram Methods 0.000 description 3
- 238000000605 extraction Methods 0.000 description 3
- 238000006073 displacement reaction Methods 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 230000009471 action Effects 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008921 facial expression Effects 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 239000013598 vector Substances 0.000 description 1
Landscapes
- Processing Or Creating Images (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a method for constructing a three-dimensional face model from multi-frame video images, comprising: performing three-dimensional reconstruction on the two-dimensional surveillance picture captured by a camera with a fixed position and angle, thereby obtaining a three-dimensional space model and three-dimensional spatial information of the camera's surveillance picture; extracting from the input video a multi-frame continuous video sequence containing the target's motion, shape, texture and colour information; performing facial feature localization, three-dimensional positioning, and synchronous tracking and recognition of facial features on the multi-frame video sequence, thereby obtaining the three-dimensional facial feature points of the sequence; and superimposing the three-dimensional facial feature points of the sequence according to the three-dimensional space model of the surveillance picture, thereby forming a three-dimensional face mesh and generating three-dimensional face model data. The invention is simple and convenient, offers good real-time performance and high accuracy, and can be widely applied in the field of video image processing.
Description
Technical field
The present invention relates to the field of video image processing, and in particular to a method for constructing a three-dimensional face model from multi-frame video images.
Background art
At present, identity authentication technologies based on biological characteristics (such as fingerprints, palm prints and footprints) are widely used in the security field, and various service applications based on portrait recognition are gradually spreading to different industries and fields. Traditional portrait recognition methods are based on two-dimensional face recognition, including the Fisherface and Eigenface methods. However, two-dimensional face recognition has a low recognition rate and a certain level of error, and cannot meet the urgent needs of service applications. Portrait recognition based on a three-dimensional face model carries richer information than two-dimensional recognition, supports multi-angle comparison through spatial rotation, and achieves higher recognition accuracy; it therefore tends to replace two-dimensional face recognition.
The construction of the three-dimensional face model is the core and key of portrait recognition based on three-dimensional face models. At present there are mainly two ways to build such a model: one photographs a fixed face with multi-angle three-dimensional cameras and then stitches the images into a three-dimensional model; the other builds the model by scanning the surface profile. Although both approaches can reconstruct a three-dimensional face model to some extent, their operation is complicated and inconvenient.
Building a three-dimensional face model involves feature extraction, standard model deformation, feature point positioning and texture mapping. Current feature extraction, model deformation and texture mapping are performed mainly on static face images; they can hardly reflect information such as face parameters with motion trajectories and attributes (for example, facial expression distortion is difficult to describe), cannot use similarity measures or comparison to restore the real face to the greatest extent, and suffer from low real-time performance and large data errors.
In summary, a convenient, real-time and accurate method for constructing three-dimensional face models is urgently needed in the industry.
Summary of the invention
In order to solve the above technical problems, the purpose of the present invention is to provide a convenient, real-time, accurate and widely applicable method for constructing a three-dimensional face model from multi-frame video images.
The technical solution adopted by the present invention to solve the technical problems is a method for constructing a three-dimensional face model from multi-frame video images, comprising:
A. performing three-dimensional reconstruction on the two-dimensional surveillance picture captured by a camera with a fixed position and angle, thereby obtaining a three-dimensional space model and three-dimensional spatial information of the camera's surveillance picture;
B. extracting from the input video a multi-frame continuous video sequence containing the target's motion, shape, texture and colour information;
C. performing facial feature localization, three-dimensional positioning, and synchronous tracking and recognition of facial features on the multi-frame video sequence, thereby obtaining the three-dimensional facial feature points of the sequence;
D. superimposing the three-dimensional facial feature points of the sequence according to the three-dimensional space model of the surveillance picture, thereby forming a three-dimensional face mesh and generating three-dimensional face model data.
Further, step A comprises:
A1. establishing a homography model from the camera's intrinsic parameter matrix, the homography model describing the homography between the actual ground plane and the ground plane in the camera image (an illustrative code sketch follows this list);
A2. calculating the camera's viewpoint origin from the known camera height and two given reference lines of known length perpendicular to the ground plane;
A3. rebuilding the three-dimensional model of the camera view from the homography model, the camera's viewpoint origin and a given visualization model, thereby obtaining the three-dimensional space model and three-dimensional spatial information of the camera's surveillance picture.
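In practice, the homography model of step A1 can be estimated from four or more correspondences between points on the actual ground plane and their pixel positions, as the embodiment below also notes. The sketch that follows is only an illustration under that assumption: it uses OpenCV's findHomography and made-up example coordinates, and is not code prescribed by the patent.

```python
import numpy as np
import cv2

# Correspondences between points on the actual ground plane (metres) and their
# pixel positions in the fixed camera's image; at least 4 pairs are required,
# and additional pairs refine H, as noted in the embodiment.
ground_pts = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]], dtype=np.float32)
image_pts = np.array([[112.0, 540.0], [530.0, 548.0], [505.0, 302.0], [140.0, 298.0]],
                     dtype=np.float32)

# H maps homogeneous ground-plane coordinates to image coordinates.
H, _ = cv2.findHomography(ground_pts, image_pts, method=0)

def ground_to_image(x_w, y_w):
    """Map a point of the actual ground plane into the image via H."""
    p = H @ np.array([x_w, y_w, 1.0])
    return p[:2] / p[2]

print(ground_to_image(1.5, 2.0))
```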
Further, step C comprises:
C1. selecting a single video frame from the multi-frame video sequence as the current frame;
C2. performing facial feature localization and facial feature extraction on the current frame, thereby obtaining the facial feature points of the current frame;
C3. performing three-dimensional positioning on the facial feature points of the current frame, and detecting the spatial information of the feature points, the motion trajectory of the facial features and the temporal information in the current frame;
C4. performing synchronous tracking and automatic recognition of the facial features in the current frame according to the detection results, so as to determine the spatial coordinates of each facial feature point of the current frame during the motion;
C5. selecting the next video frame from the sequence as the current frame and returning to step C2, so that the three-dimensional coordinate matrix of the facial features is generated from the continuous motion states of the sequence at different moments (an illustrative code sketch of this loop follows the list).
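As a rough illustration of the C1-C5 loop, the sketch below iterates over a frame sequence, runs a stock Haar-cascade face detector as a stand-in for the facial feature localization of step C2, and back-projects the bottom centre of each detection through the inverse of the ground-plane homography from step A as a crude stand-in for the three-dimensional positioning of step C3. Neither the detector nor this particular back-projection is specified by the patent.

```python
import numpy as np
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def track_sequence(frames, H_inv):
    """Loop over a multi-frame sequence (steps C1-C5) and collect, per frame,
    a rough ground-plane coordinate for each detected face region."""
    trajectory = []
    for t, frame in enumerate(frames):                  # C1/C5: next frame becomes current
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = cascade.detectMultiScale(gray, 1.1, 5)  # C2: localization (stand-in)
        for (x, y, w, h) in faces:
            foot = np.array([x + w / 2.0, y + h, 1.0])  # bottom centre of the face box
            g = H_inv @ foot                            # C3: back to ground-plane coords
            trajectory.append((t, g[0] / g[2], g[1] / g[2]))
    return trajectory                                   # C4: coordinates over time
```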
Further, step D is specifically: performing three-dimensional face key-frame superposition on the groups of three-dimensional facial feature points of the multi-frame video sequence, generating a three-dimensional face model data index list, thereby establishing a structured three-dimensional face model data list and storing the index list.
Further, step D comprises:
D1. mapping the facial feature points of the multi-frame video sequence into the three-dimensional space model of the camera's surveillance picture, thereby obtaining the spatial coordinates of the three-dimensional facial feature points of the sequence;
D2. generating a texture image with a three-dimensional image stitching algorithm according to the spatial coordinates of the three-dimensional facial feature points, and mapping the generated texture image, thereby obtaining realistic three-dimensional face model data.
Further, step D2 comprises:
D21. reconstructing a sparse set of face marker points from the multi-frame video sequence according to the spatial coordinates of the three-dimensional facial feature points, and fitting the sparse set point by point with a thin plate spline (TPS);
D22. applying a nonlinear transformation to a generic face model according to the TPS fitting result, thereby obtaining a matched three-dimensional face model;
D23. obtaining the facial texture feature information of the multi-frame video sequence with the three-dimensional image stitching algorithm, and mapping the obtained texture information onto the matched three-dimensional face model, thereby obtaining realistic three-dimensional face model data.
Further, a step E is provided after step D, step E being specifically: preserving the characteristic features of the generic face model with an SFM algorithm, and correcting the error between the generated three-dimensional face model data and the generic face model data by comparison with the generic face model; then building the final three-dimensional face model from the depth information of the points using a triangle subdivision method.
Further, the step of building the final three-dimensional face model from the depth information of the points using the triangle subdivision method in step E comprises:
E21. filtering out the triangles that need subdivision from the three-dimensional face mesh according to a preset threshold, and marking the filtered triangles;
E22. combining the marked triangles into n mesh blocks according to their adjacency, separating the n blocks and denoting them B = b1, b2, b3, …, bn, and denoting the unmarked part of the three-dimensional face mesh R;
E23. adjusting the weights of the four vertices of each mesh block Bi to 0, 1/2, 1/2 and 0 respectively, thereby performing mesh interpolation subdivision on Bi;
E24. performing interpolation subdivision on the boundary of the unsubdivided R, so that the inserted boundary points lie at the midpoints;
E25. merging B and R, and checking whether every triangle edge of the merged mesh model is shorter than the preset threshold; if so, taking the merged mesh model as the final three-dimensional face model, otherwise returning to step E21.
The beneficial effects of the invention are: a three-dimensional face model is built from the image information captured by a single camera with a fixed position and angle, which is simple and convenient to operate; facial feature localization, three-dimensional positioning, and synchronous tracking and recognition of facial features are performed on the continuous video sequence, the key frames containing the face are extracted, the displacement of the face is dynamically tracked, the three-dimensional relationship is established, and the spatial position of each facial feature point is determined, solving the problem that the prior art cannot synchronously track and recognize information such as the motion trajectories and attributes of facial feature parameters in dynamic face images, with good real-time performance and high accuracy. Furthermore, the characteristic features of the generic face model are preserved with the SFM algorithm, and the face image is smoothed with the triangle subdivision method, which further improves the accuracy and realism of the face model.
Brief description of the drawings
The invention is further described below with reference to the accompanying drawings and embodiments.
Fig. 1 is a flow chart of the steps of the method for constructing a three-dimensional face model from multi-frame video images according to the present invention;
Fig. 2 is a flow chart of step A of the present invention;
Fig. 3 is a flow chart of step C of the present invention;
Fig. 4 is a flow chart of step D of the present invention;
Fig. 5 is a flow chart of step D2 of the present invention;
Fig. 6 is a flow chart of the triangle subdivision method of step E of the present invention;
Fig. 7 is a schematic diagram of rebuilding the three-dimensional space model from the camera intrinsic parameters in embodiment one;
Fig. 8 is a schematic diagram of the flow of building a three-dimensional portrait from multiple image frames in embodiment one.
Specific embodiment
Referring to Fig. 1, a method for constructing a three-dimensional face model from multi-frame video images comprises:
A. performing three-dimensional reconstruction on the two-dimensional surveillance picture captured by a camera with a fixed position and angle, thereby obtaining a three-dimensional space model and three-dimensional spatial information of the camera's surveillance picture;
B. extracting from the input video a multi-frame continuous video sequence containing the target's motion, shape, texture and colour information;
C. performing facial feature localization, three-dimensional positioning, and synchronous tracking and recognition of facial features on the multi-frame video sequence, thereby obtaining the three-dimensional facial feature points of the sequence;
D. superimposing the three-dimensional facial feature points of the sequence according to the three-dimensional space model of the surveillance picture, thereby forming a three-dimensional face mesh and generating three-dimensional face model data.
Referring to Fig. 2, as a further preferred embodiment, step A comprises:
A1. establishing a homography model from the camera's intrinsic parameter matrix, the homography model describing the homography between the actual ground plane and the ground plane in the camera image;
A2. calculating the camera's viewpoint origin from the known camera height and two given reference lines of known length perpendicular to the ground plane;
A3. rebuilding the three-dimensional model of the camera view from the homography model, the camera's viewpoint origin and a given visualization model, thereby obtaining the three-dimensional space model and three-dimensional spatial information of the camera's surveillance picture.
Referring to Fig. 3, as a further preferred embodiment, step C comprises:
C1. selecting a single video frame from the multi-frame video sequence as the current frame;
C2. performing facial feature localization and facial feature extraction on the current frame, thereby obtaining the facial feature points of the current frame;
C3. performing three-dimensional positioning on the facial feature points of the current frame, and detecting the spatial information of the feature points, the motion trajectory of the facial features and the temporal information in the current frame;
C4. performing synchronous tracking and automatic recognition of the facial features in the current frame according to the detection results, so as to determine the spatial coordinates of each facial feature point of the current frame during the motion;
C5. selecting the next video frame from the sequence as the current frame and returning to step C2, so that the three-dimensional coordinate matrix of the facial features is generated from the continuous motion states of the sequence at different moments.
As a further preferred embodiment, step D is specifically: performing three-dimensional face key-frame superposition on the groups of three-dimensional facial feature points of the multi-frame video sequence, generating a three-dimensional face model data index list, thereby establishing a structured three-dimensional face model data list and storing the index list.
Referring to Fig. 4, as a further preferred embodiment, step D comprises:
D1. mapping the facial feature points of the multi-frame video sequence into the three-dimensional space model of the camera's surveillance picture, thereby obtaining the spatial coordinates of the three-dimensional facial feature points of the sequence;
D2. generating a texture image with a three-dimensional image stitching algorithm according to the spatial coordinates of the three-dimensional facial feature points, and mapping the generated texture image, thereby obtaining realistic three-dimensional face model data.
Referring to Fig. 5, as a further preferred embodiment, step D2 comprises:
D21. reconstructing a sparse set of face marker points from the multi-frame video sequence according to the spatial coordinates of the three-dimensional facial feature points, and fitting the sparse set point by point with a thin plate spline (TPS);
D22. applying a nonlinear transformation to the generic face model according to the TPS fitting result, thereby obtaining a matched three-dimensional face model;
D23. obtaining the facial texture feature information of the multi-frame video sequence with the three-dimensional image stitching algorithm, and mapping the obtained texture information onto the matched three-dimensional face model, thereby obtaining realistic three-dimensional face model data.
As a further preferred embodiment, a step E is provided after step D, step E being specifically: preserving the characteristic features of the generic face model with an SFM algorithm, and correcting the error between the generated three-dimensional face model data and the generic face model data by comparison with the generic face model; then building the final three-dimensional face model from the depth information of the points using a triangle subdivision method.
The generic face model refers to a standard face model known in the industry.
Referring to Fig. 6, as a further preferred embodiment, the step of building the final three-dimensional face model from the depth information of the points using the triangle subdivision method in step E comprises:
E21. filtering out the triangles that need subdivision from the three-dimensional face mesh according to a preset threshold, and marking the filtered triangles;
E22. combining the marked triangles into n mesh blocks according to their adjacency, separating the n blocks and denoting them B = b1, b2, b3, …, bn, and denoting the unmarked part of the three-dimensional face mesh R;
E23. adjusting the weights of the four vertices of each mesh block Bi to 0, 1/2, 1/2 and 0 respectively, thereby performing mesh interpolation subdivision on Bi;
E24. performing interpolation subdivision on the boundary of the unsubdivided R, so that the inserted boundary points lie at the midpoints;
E25. merging B and R, and checking whether every triangle edge of the merged mesh model is shorter than the preset threshold; if so, taking the merged mesh model as the final three-dimensional face model, otherwise returning to step E21.
The present invention is described in further detail below with reference to a specific embodiment.
Embodiment one
This embodiment illustrates the present invention in detail through the face modelling process of a video acquisition and high-speed download device.
The face modelling process of the video acquisition and high-speed download device is as follows:
(I) Three-dimensional scene reconstruction
In the video acquisition and high-speed download device the intrinsic parameters of the camera are known, and the picture captured by the camera is subjected to three-dimensional scene reconstruction. The reconstruction steps are: first establish the homography H between the actual ground plane and the ground plane in the image; then calibrate the camera using the actual mounting height h of the camera above the ground plane, a preset known length and a line perpendicular to the ground plane. The specific implementation is as follows:
(1) According to the pinhole model of the camera, define the projection matrix M = A[r1 r2 r3 t]. It then follows that the homography relation between the actual ground plane and the ground plane in the camera image can be represented as
s·(u, v, 1)^T = A[r1 r2 t]·(x_w, y_w, 1)^T = H·(x_w, y_w, 1)^T … (1)
where A is the intrinsic parameter matrix of the camera, r1, r2, r3 are the three column vectors of the rotation matrix R, and t is the translation parameter. If more than four pairs of corresponding points between the actual ground plane and the ground plane in the image are available, H can be further refined through formula (1).
(2) Define the camera optical centre, i.e. the viewpoint origin of the camera, as (x_c, y_c, h), and introduce the quantity K; the spatial relationship of the camera then gives the constraint relating them.
(3) Given a reference line l* perpendicular to the actual ground plane and its projection l onto the ground plane in the captured image, the spatial relationship of the camera shows that the straight line H^T·l lies on the actual ground plane and passes through the point (x_c, y_c, 0).
Therefore, according to steps (1)-(3), given the camera height h and two reference lines of known length perpendicular to the actual ground plane, x_c, y_c and K can be calculated.
(4) The three-dimensional model of the camera view is then rebuilt according to the preset visualization model.
As shown in Fig. 7, (x_c, y_c, h) is taken as the centre of the user coordinate system, and the visualization model is projected onto the actual ground plane. According to the spatial projection geometry, the projection (x_w', y_w', 0) of any point (x_w, y_w, z_w) of the user coordinate system onto the actual ground plane can be calculated through formula (2):
x_w' = x_c + h·(x_w − x_c)/(h − z_w),  y_w' = y_c + h·(y_w − y_c)/(h − z_w) … (2)
(5) Finally, using the homography H, the projection of the visualization model on the actual ground plane is mapped into the ground plane of the image, and the mapping relations thus established complete the reconstruction of the three-dimensional surveillance scene. After the reconstruction is finished, depth information can be calibrated for any point on the established three-dimensional portrait.
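A small numpy sketch of formula (2) and step (5): a point of the user coordinate system is projected through the camera centre (x_c, y_c, h) onto the actual ground plane and then mapped into the image ground plane with H. The camera height and the matrix H below are arbitrary example values, not data from the patent.

```python
import numpy as np

x_c, y_c, h = 2.0, 1.5, 3.2            # camera viewpoint origin (example values)
H = np.array([[250.0,  12.0, 110.0],
              [  8.0, 240.0,  95.0],
              [  0.1,   0.2,   1.0]])  # example ground-plane-to-image homography

def project_to_ground(x_w, y_w, z_w):
    """Formula (2): project (x_w, y_w, z_w) through the camera centre onto z = 0."""
    s = h / (h - z_w)                  # similar-triangles scale factor (requires z_w < h)
    return x_c + s * (x_w - x_c), y_c + s * (y_w - y_c)

def ground_to_image(x_g, y_g):
    """Step (5): map a ground-plane point into the image ground plane via H."""
    p = H @ np.array([x_g, y_g, 1.0])
    return p[:2] / p[2]

x_g, y_g = project_to_ground(1.2, 0.8, 1.7)
print(ground_to_image(x_g, y_g))
```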
(II) Obtaining the three-dimensional facial feature space data
After the three-dimensional scene reconstruction is completed, the pictures from this camera are processed. Facial feature extraction is performed on one frame f1 with the facial feature localization method, and the collected feature point sequence is substituted into the reconstructed three-dimensional space model, yielding the three-dimensional space data [x_f1, y_f1, z_f1] of each feature point. The next frame f2 of the video is then read and the three-dimensional space data [x_f2, y_f2, z_f2] of its facial features is obtained in the same way, until the three-dimensional facial feature space data [x_fn, y_fn, z_fn] of frame fn has been obtained.
(III) Three-dimensional stitching and mapping
After the three-dimensional facial feature space data has been obtained, the camera images from multiple angles still need to be modelled, and this process can be converted into a three-dimensional image stitching problem. The concrete practice is: reconstruct the sparse sets of face marker points from the video, fit these sparse sets one by one with a thin plate spline (TPS), then, on the basis of the TPS fit, apply a nonlinear transformation to the generic face model to obtain the matched three-dimensional face model, and finally map the facial texture information of the video onto the matched three-dimensional face model, thereby obtaining a realistic three-dimensional face model.
For example, a pair of stitchable three-dimensional images I1 and I2 lies within a given group of N images. I1 and I2 are first stitched to obtain a new three-dimensional image I11; I11 is then stitched with I3 to obtain image I12, and I12 is stitched with I4 to obtain image I13. The process is repeated until no further stitching is possible, so that a complete three-dimensional face model is obtained.
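One convenient way to realize the TPS fitting described here is scipy's RBFInterpolator with a thin-plate-spline kernel: it is fitted on sparse corresponding marker points and then applied to every vertex of the generic face model to obtain the matched model. The marker coordinates and the random placeholder mesh below are illustrative only; the patent does not name a particular TPS implementation.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Sparse corresponding marker points (example values): positions on the generic
# face model and the matching positions reconstructed from the video.
generic_markers = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.1],
                            [0.0, 1.0, 0.1], [1.0, 1.0, 0.0],
                            [0.5, 0.5, 0.4]])
video_markers = np.array([[0.02, -0.01, 0.00], [1.05, 0.03, 0.12],
                          [-0.03, 1.02, 0.08], [0.98, 1.04, 0.02],
                          [0.51, 0.49, 0.47]])

# Fit a thin-plate-spline deformation carrying the generic markers onto the
# video markers, then warp every vertex of the generic model with it.
tps = RBFInterpolator(generic_markers, video_markers, kernel="thin_plate_spline")

generic_vertices = np.random.rand(500, 3)     # placeholder generic face mesh vertices
warped_vertices = tps(generic_vertices)       # vertices of the matched 3D face model
```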
(IV) SFM algorithm
In order to guarantee the precision of the model, the present invention also uses the SFM (Structure From Motion) algorithm to preserve the characteristic features of the generic face model, and corrects the error between the two faces by comparison with the generic face model. The concrete steps are: first determine the tracking data on the basis of [x_f1, y_f1, z_f1] obtained from f1; then estimate the motion and structural change of the facial feature points; next refine the motion estimates; finally compare the estimated values with the face coordinates of the next frame f2 and judge whether they lie within the interval given by the estimation. If not, the frame is discarded and extraction continues with the next frame's data. The face result obtained by this loop thus preserves its basic form.
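The predict-compare-discard pattern of this section can be illustrated with a deliberately simplified stand-in: feature motion is predicted under a constant-velocity assumption and a frame whose measured points fall outside a tolerance band around the prediction is dropped. This is not the SFM algorithm itself, only a sketch of the gating logic.

```python
import numpy as np

def filter_feature_tracks(per_frame_points, tol=0.1):
    """per_frame_points: list of (n_points, 3) arrays, one per frame (at least two).
    A frame is kept only if its points lie within `tol` of a constant-velocity
    prediction made from the two previously accepted frames."""
    accepted = [per_frame_points[0], per_frame_points[1]]
    for pts in per_frame_points[2:]:
        prev, last = accepted[-2], accepted[-1]
        predicted = last + (last - prev)           # constant-velocity estimate
        if np.all(np.linalg.norm(pts - predicted, axis=1) < tol):
            accepted.append(pts)                   # measurement confirms the estimate
        # otherwise the frame is discarded and extraction moves on to the next frame
    return accepted
```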
(V) Triangle subdivision
The three-dimensional portrait is built from the depth information of the points using the triangle subdivision method; the concrete flow is as follows:
Step 1, screen the triangles that need subdivision: let k be the length of the longest edge of the i-th triangle; the triangle is marked when k > m, where m is a preset threshold, taken as 0.15 in the present invention. All triangles of the mesh model are traversed and the triangles needing division are marked.
Step 2, combine the marked triangles: the marked triangles are combined into n blocks according to their adjacency, the n blocks are separated and denoted B = b1, b2, b3, …, bn, and the part of the three-dimensional face mesh that is not marked is denoted R.
Step 3, subdivide the separated mesh blocks: grid subdivision is applied to each block Bi, with the weights of its four vertices adjusted to 0, 1/2, 1/2 and 0 respectively, so that every point inserted on the boundary is always a midpoint and the boundary shape stays consistent.
Step 4, adjust the boundary of R: the boundary of Bi inserts new points at the midpoints, and the same adjustment is made on the boundary of the unsubdivided R, so that the merged mesh coincides at the stitching boundary.
Step 5, merge R and B.
After steps 1-5 the subdivision of the R mesh is complete, R and B are consistent in their boundary division points, and combining the two achieves one subdivision of the whole original mesh. The above steps are repeated until every triangle edge is shorter than the threshold, so that the model finally reaches the required accuracy.
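The subdivision loop of steps 1-5 can be sketched as follows: every triangle whose longest edge exceeds the threshold m is split at its edge midpoints (a 1-to-4 split) and the pass is repeated until all edges are below m. The B/R block bookkeeping that keeps the stitching boundary consistent is omitted for brevity, so this is a simplified illustration rather than the patented procedure.

```python
import numpy as np

def longest_edge(verts, tri):
    """Length of the longest edge of triangle `tri` (three vertex indices)."""
    a, b, c = verts[tri[0]], verts[tri[1]], verts[tri[2]]
    return max(np.linalg.norm(a - b), np.linalg.norm(b - c), np.linalg.norm(c - a))

def subdivide(vertices, triangles, m=0.15):
    """Repeatedly 1-to-4 split every triangle whose longest edge exceeds m."""
    verts = [np.asarray(v, dtype=float) for v in vertices]
    tris = [tuple(t) for t in triangles]
    while True:
        marked = {t for t in tris if longest_edge(verts, t) > m}   # step 1
        if not marked:
            return np.array(verts), tris
        midpoint = {}
        def mid(i, j):                         # insert (and cache) an edge midpoint
            key = (min(i, j), max(i, j))
            if key not in midpoint:
                verts.append((verts[i] + verts[j]) / 2.0)
                midpoint[key] = len(verts) - 1
            return midpoint[key]
        new_tris = []
        for t in tris:
            if t in marked:                    # steps 2-3: split a marked triangle
                i, j, k = t
                ij, jk, ki = mid(i, j), mid(j, k), mid(k, i)
                new_tris += [(i, ij, ki), (ij, j, jk), (ki, jk, k), (ij, jk, ki)]
            else:
                new_tris.append(t)             # unmarked part kept as-is (the R part)
        tris = new_tris                        # step 5: merged mesh for the next pass
```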
Fig. 8 is a schematic diagram of building a three-dimensional portrait model by the above method.
Compared with the prior art, the present invention builds a three-dimensional face model from the image information captured by a single camera with a fixed position and angle, which is simple and convenient to operate; it performs facial feature localization, three-dimensional positioning, and synchronous tracking and recognition of facial features on the continuous video sequence, extracts the key frames containing the face, dynamically tracks the displacement of the face, establishes the three-dimensional relationship and determines the spatial position of each facial feature point, solving the problem that the prior art cannot synchronously track and recognize information such as the motion trajectories and attributes of facial feature parameters in dynamic face images, with good real-time performance and high accuracy; it preserves the characteristic features of the generic face model with the SFM algorithm and smooths the face image with the triangle subdivision method, further improving the accuracy and realism of the face model.
The above is a description of the preferred implementation of the present invention, but the invention is not limited to the described embodiments. A person of ordinary skill in the art can also make various equivalent variations or replacements without departing from the spirit of the present invention, and these equivalent variations or replacements are all included within the scope defined by the claims of this application.
Claims (7)
1. A method for constructing a three-dimensional face model from multi-frame video images, characterized in that it comprises:
A. performing three-dimensional reconstruction on the two-dimensional surveillance picture captured by a camera with a fixed position and angle, thereby obtaining a three-dimensional space model and three-dimensional spatial information of the camera's surveillance picture;
B. extracting from the input video a multi-frame continuous video sequence containing the target's motion, shape, texture and colour information;
C. performing facial feature localization, three-dimensional positioning, and synchronous tracking and recognition of facial features on the multi-frame video sequence, thereby obtaining the three-dimensional facial feature points of the sequence;
D. superimposing the three-dimensional facial feature points of the sequence according to the three-dimensional space model of the surveillance picture, thereby forming a three-dimensional face mesh and generating three-dimensional face model data;
wherein said step C comprises:
C1. selecting a single video frame from the multi-frame video sequence as the current frame;
C2. performing facial feature localization and facial feature extraction on the current frame, thereby obtaining the facial feature points of the current frame;
C3. performing three-dimensional positioning on the facial feature points of the current frame, and detecting the spatial information of the feature points, the motion trajectory of the facial features and the temporal information in the current frame;
C4. performing synchronous tracking and automatic recognition of the facial features in the current frame according to the detection results, so as to determine the spatial coordinates of each facial feature point of the current frame during the motion;
C5. selecting the next video frame from the sequence as the current frame and returning to step C2, so that the three-dimensional coordinate matrix of the facial features is generated from the continuous motion states of the sequence at different moments.
2. The method for constructing a three-dimensional face model from multi-frame video images according to claim 1, characterized in that step A comprises:
A1. establishing a homography model from the camera's intrinsic parameter matrix, the homography model describing the homography between the actual ground plane and the ground plane in the camera image;
A2. calculating the camera's viewpoint origin from the known camera height and two given reference lines of known length perpendicular to the ground plane;
A3. rebuilding the three-dimensional model of the camera view from the homography model, the camera's viewpoint origin and a given visualization model, thereby obtaining the three-dimensional space model and three-dimensional spatial information of the camera's surveillance picture.
3. The method for constructing a three-dimensional face model from multi-frame video images according to claim 1, characterized in that step D is specifically: performing three-dimensional face key-frame superposition on the groups of three-dimensional facial feature points of the multi-frame video sequence, generating a three-dimensional face model data index list, thereby establishing a structured three-dimensional face model data list and storing the index list.
4. The method for constructing a three-dimensional face model from multi-frame video images according to claim 1, characterized in that step D comprises:
D1. mapping the facial feature points of the multi-frame video sequence into the three-dimensional space model of the camera's surveillance picture, thereby obtaining the spatial coordinates of the three-dimensional facial feature points of the sequence;
D2. generating a texture image with a three-dimensional image stitching algorithm according to the spatial coordinates of the three-dimensional facial feature points, and mapping the generated texture image, thereby obtaining realistic three-dimensional face model data.
5. The method for constructing a three-dimensional face model from multi-frame video images according to claim 4, characterized in that step D2 comprises:
D21. reconstructing a sparse set of face marker points from the multi-frame video sequence according to the spatial coordinates of the three-dimensional facial feature points, and fitting the sparse set point by point with a thin plate spline (TPS);
D22. applying a nonlinear transformation to a generic face model according to the TPS fitting result, thereby obtaining a matched three-dimensional face model;
D23. obtaining the facial texture feature information of the multi-frame video sequence with the three-dimensional image stitching algorithm, and mapping the obtained texture information onto the matched three-dimensional face model, thereby obtaining realistic three-dimensional face model data.
6. The method for constructing a three-dimensional face model from multi-frame video images according to claim 1, characterized in that a step E is provided after step D, step E being specifically: preserving the characteristic features of the generic face model with an SFM algorithm, and correcting the error between the generated three-dimensional face model data and the generic face model data by comparison with the generic face model; then building the final three-dimensional face model from the depth information of the points using a triangle subdivision method.
7. The method for constructing a three-dimensional face model from multi-frame video images according to claim 6, characterized in that the step of building the final three-dimensional face model from the depth information of the points using the triangle subdivision method in step E comprises:
E21. filtering out the triangles that need subdivision from the three-dimensional face mesh according to a preset threshold, and marking the filtered triangles;
E22. combining the marked triangles into n mesh blocks according to their adjacency, separating the n blocks and denoting them B = b1, b2, b3, …, bn, and denoting the unmarked part of the three-dimensional face mesh R;
E23. adjusting the weights of the four vertices of each mesh block Bi to 0, 1/2, 1/2 and 0 respectively, thereby performing mesh interpolation subdivision on Bi;
E24. performing interpolation subdivision on the boundary of the unsubdivided R, so that the inserted boundary points lie at the midpoints;
E25. merging B and R, and checking whether every triangle edge of the merged mesh model is shorter than the preset threshold; if so, taking the merged mesh model as the final three-dimensional face model, otherwise returning to step E21.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410253326.5A CN104077804B (en) | 2014-06-09 | 2014-06-09 | Method for constructing a three-dimensional face model from multi-frame video images
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410253326.5A CN104077804B (en) | 2014-06-09 | 2014-06-09 | Method for constructing a three-dimensional face model from multi-frame video images
Publications (2)
Publication Number | Publication Date |
---|---|
CN104077804A CN104077804A (en) | 2014-10-01 |
CN104077804B true CN104077804B (en) | 2017-03-01 |
Family
ID=51599043
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410253326.5A Expired - Fee Related CN104077804B (en) | 2014-06-09 | 2014-06-09 | A kind of method based on multi-frame video picture construction three-dimensional face model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104077804B (en) |
Families Citing this family (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104268578B (en) * | 2014-10-15 | 2017-06-09 | 深圳市晓舟科技有限公司 | The target identification method that a kind of small image is compared and fuzzy diagnosis is combined |
CN104408416A (en) * | 2014-11-25 | 2015-03-11 | 苏州福丰科技有限公司 | Three-dimensional face recognition monitoring system for gold or jewelry shops |
CN104361131B (en) * | 2014-12-08 | 2018-06-15 | 黑龙江大学 | Method for establishing four-dimensional face model database |
CN104574309B (en) * | 2014-12-30 | 2018-03-16 | 北京像素软件科技股份有限公司 | The method and device of triangular mesh subdivision in moving game application |
CN105046725B (en) * | 2015-07-10 | 2017-03-29 | 清华大学 | Head shoulder images method for reconstructing in low-bit rate video call based on model and object |
CN104951773B (en) * | 2015-07-12 | 2018-10-02 | 上海微桥电子科技有限公司 | A kind of real-time face recognition monitoring system |
CN105138954B (en) * | 2015-07-12 | 2019-06-04 | 上海微桥电子科技有限公司 | A kind of image automatic screening inquiry identifying system |
CN105046219B (en) * | 2015-07-12 | 2018-12-18 | 上海微桥电子科技有限公司 | A kind of face identification system |
CN107924579A (en) * | 2015-08-14 | 2018-04-17 | 麦特尔有限公司 | The method for generating personalization 3D head models or 3D body models |
JP6927980B2 (en) * | 2015-09-23 | 2021-09-01 | コーニンクレッカ フィリップス エヌ ヴェKoninklijke Philips N.V. | Generating a triangular mesh for a 3D image |
CN105184860A (en) * | 2015-09-30 | 2015-12-23 | 南京邮电大学 | Method for reconstructing dense three-dimensional structure and motion field of dynamic face simultaneously |
CN106683163B (en) * | 2015-11-06 | 2020-10-27 | 杭州海康威视数字技术股份有限公司 | Imaging method and system for video monitoring |
CN106500628B (en) * | 2016-10-19 | 2019-02-19 | 杭州思看科技有限公司 | A kind of 3-D scanning method and scanner containing multiple and different long wavelength lasers |
CN106778474A (en) * | 2016-11-14 | 2017-05-31 | 深圳奥比中光科技有限公司 | 3D human body recognition methods and equipment |
KR101923420B1 (en) * | 2017-04-03 | 2019-02-27 | (주)아모레퍼시픽 | Manufacturing apparatus and method for customized mask pack |
CN107749084A (en) * | 2017-10-24 | 2018-03-02 | 广州增强信息科技有限公司 | A virtual try-on method and system based on image three-dimensional reconstruction technology |
CN109753857A (en) * | 2017-11-07 | 2019-05-14 | 北京虹图吉安科技有限公司 | A 3D face recognition device and system based on photometric stereo vision imaging |
CN110136270A (en) * | 2018-02-02 | 2019-08-16 | 北京京东尚科信息技术有限公司 | Method and device for creating makeup data |
TWI658815B (en) * | 2018-04-25 | 2019-05-11 | 國立交通大學 | Non-contact heartbeat rate measurement system, non-contact heartbeat rate measurement method and non-contact heartbeat rate measurement apparatus |
CN108629333A (en) * | 2018-05-25 | 2018-10-09 | 厦门市美亚柏科信息股份有限公司 | A kind of face image processing process of low-light (level), device, equipment and readable medium |
CN109101915B (en) * | 2018-08-01 | 2021-04-27 | 中国计量大学 | Network structure design method for face, pedestrian and attribute recognition based on deep learning |
CN109108968A (en) * | 2018-08-17 | 2019-01-01 | 深圳市三宝创新智能有限公司 | Exchange method, device, equipment and the storage medium of robot head movement adjustment |
CN109345621A (en) * | 2018-08-28 | 2019-02-15 | 广州智美科技有限公司 | Interactive face three-dimensional modeling method and device |
CN111144166A (en) * | 2018-11-02 | 2020-05-12 | 银河水滴科技(北京)有限公司 | Method, system and storage medium for establishing abnormal crowd information base |
CN111179408B (en) * | 2018-11-12 | 2024-04-12 | 北京物语科技有限公司 | Three-dimensional modeling method and equipment |
CN109685873B (en) * | 2018-12-14 | 2023-09-05 | 广州市百果园信息技术有限公司 | Face reconstruction method, device, equipment and storage medium |
CN111435268A (en) * | 2019-01-11 | 2020-07-21 | 合肥虹慧达科技有限公司 | Human-computer interaction method based on image recognition and reconstruction and system and device using same |
CN110135308A (en) * | 2019-04-30 | 2019-08-16 | 天津工业大学 | A Direct Free Kick Type Discrimination Method Based on Video Analysis |
CN110462693B (en) * | 2019-06-28 | 2022-04-22 | 深圳市汇顶科技股份有限公司 | Door lock and identification method |
CN112188182A (en) * | 2019-07-02 | 2021-01-05 | 西安诺瓦星云科技股份有限公司 | Stereoscopic display control system, stereoscopic display system and stereoscopic display controller |
CN110505403A (en) * | 2019-08-20 | 2019-11-26 | 维沃移动通信有限公司 | Video recording method and device |
CN110688905B (en) * | 2019-08-30 | 2023-04-18 | 中山大学 | Three-dimensional object detection and tracking method based on key frame |
CN111339958B (en) * | 2020-02-28 | 2023-08-29 | 南京鑫之派智能科技有限公司 | Face living body detection method and system based on monocular vision |
CN111369670B (en) * | 2020-03-13 | 2024-10-18 | 江西科骏实业有限公司 | Method for constructing practical training digital twin model in real time |
TWI808336B (en) * | 2020-07-30 | 2023-07-11 | 杰悉科技股份有限公司 | Image display method and image monitoring system |
CN112700523B (en) * | 2020-12-31 | 2022-06-07 | 魔珐(上海)信息科技有限公司 | Virtual object face animation generation method and device, storage medium and terminal |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006106465A2 (en) * | 2005-04-07 | 2006-10-12 | Nxp B.V. | Method and device for three-dimentional reconstruction and rendering |
CN101404091A (en) * | 2008-11-07 | 2009-04-08 | 重庆邮电大学 | Three-dimensional human face reconstruction method and system based on two-step shape modeling |
CN101499128A (en) * | 2008-01-30 | 2009-08-05 | 中国科学院自动化研究所 | Three-dimensional human face action detecting and tracing method based on video stream |
CN101593365A (en) * | 2009-06-19 | 2009-12-02 | 电子科技大学 | A general adjustment method for 3D face model |
CN101625768A (en) * | 2009-07-23 | 2010-01-13 | 东南大学 | Three-dimensional human face reconstruction method based on stereoscopic vision |
CN101866497A (en) * | 2010-06-18 | 2010-10-20 | 北京交通大学 | Intelligent 3D face reconstruction method and system based on binocular stereo vision |
CN102254154A (en) * | 2011-07-05 | 2011-11-23 | 南京大学 | Method for authenticating human-face identity based on three-dimensional model reconstruction |
CN102708385A (en) * | 2012-04-06 | 2012-10-03 | 张丛喆 | Method and system for comparison and recognition of three-dimensional vehicle types in video monitoring scenes |
-
2014
- 2014-06-09 CN CN201410253326.5A patent/CN104077804B/en not_active Expired - Fee Related
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006106465A2 (en) * | 2005-04-07 | 2006-10-12 | Nxp B.V. | Method and device for three-dimentional reconstruction and rendering |
CN101499128A (en) * | 2008-01-30 | 2009-08-05 | 中国科学院自动化研究所 | Three-dimensional human face action detecting and tracing method based on video stream |
CN101404091A (en) * | 2008-11-07 | 2009-04-08 | 重庆邮电大学 | Three-dimensional human face reconstruction method and system based on two-step shape modeling |
CN101593365A (en) * | 2009-06-19 | 2009-12-02 | 电子科技大学 | A general adjustment method for 3D face model |
CN101625768A (en) * | 2009-07-23 | 2010-01-13 | 东南大学 | Three-dimensional human face reconstruction method based on stereoscopic vision |
CN101866497A (en) * | 2010-06-18 | 2010-10-20 | 北京交通大学 | Intelligent 3D face reconstruction method and system based on binocular stereo vision |
CN102254154A (en) * | 2011-07-05 | 2011-11-23 | 南京大学 | Method for authenticating human-face identity based on three-dimensional model reconstruction |
CN102708385A (en) * | 2012-04-06 | 2012-10-03 | 张丛喆 | Method and system for comparison and recognition of three-dimensional vehicle types in video monitoring scenes |
Non-Patent Citations (1)
Title |
---|
"融合SFM和动态纹理映射的视频流三维表情重建";张剑;《计算机辅助设计与图形学学报》;20100630;第22卷(第6期);949-958 * |
Also Published As
Publication number | Publication date |
---|---|
CN104077804A (en) | 2014-10-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104077804B (en) | Method for constructing a three-dimensional face model from multi-frame video images | |
CN113012293B (en) | Stone carving model construction method, device, equipment and storage medium | |
Koch | Dynamic 3-D scene analysis through synthesis feedback control | |
CN119027742A (en) | Detecting objects in a crowd using geometric context | |
CN110349247A (en) | A kind of indoor scene CAD 3D method for reconstructing based on semantic understanding | |
Luo et al. | Multi-view hair capture using orientation fields | |
CN106997605B (en) | A method for obtaining three-dimensional foot shape by collecting foot shape video and sensor data through smart phones | |
CN105279789B (en) | A kind of three-dimensional rebuilding method based on image sequence | |
CN106023303A (en) | Method for improving three-dimensional reconstruction point-clout density on the basis of contour validity | |
CN106155299B (en) | A kind of pair of smart machine carries out the method and device of gesture control | |
CN101650834A (en) | Three dimensional reconstruction method of human body surface under complex scene | |
Ramirez et al. | Booster: a benchmark for depth from images of specular and transparent surfaces | |
CN102222357A (en) | Foot-shaped three-dimensional surface reconstruction method based on image segmentation and grid subdivision | |
JP4761670B2 (en) | Moving stereo model generation apparatus and method | |
CN103093460A (en) | Moving camera virtual array calibration method based on parallel parallax | |
CN108573231A (en) | Human action recognition method based on deep motion map generated from motion history point cloud | |
CN101383046B (en) | Three-dimensional reconstruction method on basis of image | |
CN117726810A (en) | Large-scale three-dimensional scene semantic and building segmentation method based on nerve radiation characterization | |
Khan et al. | An efficient encoder–decoder model for portrait depth estimation from single images trained on pixel-accurate synthetic data | |
Zhou et al. | Constant velocity constraints for self-supervised monocular depth estimation | |
CN119311123A (en) | Immersive space virtual-reality interaction method and system | |
CN108564043A (en) | A kind of Human bodys' response method based on time-space distribution graph | |
CN109360270B (en) | 3D face pose alignment method and device based on artificial intelligence | |
CN107103620B (en) | A Depth Extraction Method for Multi-Light Encoded Cameras Based on Spatial Sampling from Independent Camera Perspectives | |
Khan et al. | Towards monocular neural facial depth estimation: Past, present, and future |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20170301 |