CN1156248C - Method for detecting moving human face - Google Patents
- Publication number: CN1156248C (application number CN01120428A / CNB011204281A)
- Authority: CN (China)
- Prior art keywords: eyes, eye, images
- Legal status: Expired - Fee Related (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Landscapes
- Image Analysis (AREA)
Abstract
The present invention relates to a method for detecting human-face features in moving images. The method comprises: face images are shot to form a training set; the eyes in the training-set images are geometrically calibrated and principal component analysis is performed to obtain a characteristic eye subspace; candidate eyes obtained by Hough transform are calibrated so that their positions and sizes match the eyes in the training-set images and are projected into the characteristic eye subspace; the candidate eyes with the minimum error between the original eyes and their projection are taken as the detection result, and the exact positions of the mouth corners, nostrils and nose tip are then obtained by integral projection. Compared with existing methods, the detection speed is improved by a factor of 225 and the accuracy by 1.27 percentage points.
Description
Technical field:
the invention relates to a method for detecting the facial features of a human face in moving images, and belongs to the technical field of computer vision.
Background art:
the existing face-feature detection methods operate on still images. In the paper "Robust facial feature detection based on generalized symmetry" (Proceedings of the 11th International Conference on Pattern Recognition, 1992, pp. 117-120), D. Reisfeld and Y. Yeshurun proposed a typical still-image face-feature detection method. Its principle is as follows: based on the local and global symmetry of the human face, a complex measure of symmetry (called the symmetry magnitude) is defined; this measure is then computed for each edge point in the image by energy-function iteration, and the point with the maximum symmetry is taken as a feature point. The method can detect the pupils and mouth corners in a face with an accuracy of about 95%, and the detection time is about 3 minutes per image. Its main disadvantages are: (1) because prior knowledge of the human face is not fully exploited, the computation is heavy and detection is slow, so the method is unsuitable for real-time applications such as visual communication and contactless computer operation; (2) because only the information of a single still image is used, the search result cannot be verified or corrected; (3) only the feature points in a single image can be retrieved, so the method cannot be used for moving images.
Summary of the invention:
the invention aims to provide a face-feature detection method for moving images that can quickly and accurately locate the two pupils, the two mouth corners, the two nostrils and the nose tip of a face in a moving image, thereby overcoming the low speed and low accuracy of still-image face detection methods. The detection results can be used in application environments such as face recognition, visual communication, image coding and contactless computer operation.
The invention provides a moving image face feature detection method, which comprises the following steps:
1. shooting 300 face images with different sexes, ages, postures and illumination to form a training set, and geometrically calibrating the eyes of the images in the training set through homogeneous transformation to ensure that the sizes and the positions of the eyes in the images are completely consistent;
2. performing principal component analysis on the calibrated eyes in the images of the training set to obtain a group of characteristic vectors called characteristic eyes to form a characteristic eye subspace;
3. for a human face image of a tested person, firstly obtaining a plurality of candidate eyes through Hough transformation, carrying out geometric calibration on each pair of candidate eyes by using homogeneous transformation to ensure that the positions and the sizes of the candidate eyes are the same as those of the eyes in an image of a training set, then projecting the candidate eyes to the characteristic eye subspace, and finally taking the candidate eye with the minimum error between the original eye and the projection of the original eye as a detection result;
4. after the eye positions of the tested person are determined by the above steps, the mouth position is estimated from the structural characteristics of the human face and the accurate positions of the mouth corners are obtained by integral projection; the nose position is then estimated from the mouth and eye positions, and the nostrils and nose tip are accurately located by integral projection;
5. and if false detection or missing detection occurs, estimating the positions of eyes, nose and mouth in the current frame from the characteristic points in the previous frame of image according to the motion smoothness constraint and the plane motion constraint.
The face detection method of the invention was tested on 50 image sequences with different postures, illumination, image sizes, genders, ages and backgrounds; the correct detection rate is 96.27% and the average detection time is 40 seconds per sequence (each sequence contains 50 frames). Compared with the existing method, the detection speed is improved by a factor of 225 and the accuracy by 1.27 percentage points.
The invention can detect human-face features in moving images in real time with an accuracy of 96.27%, and can be used in the following application fields. (1) Face recognition. Face recognition methods fall into two broad categories, image-based and feature-based. For the former, the feature points obtained by the method can be used to calibrate the pose and guide image matching; for the latter, the facial features can be used directly as recognition criteria. (2) Visual communication. The biggest challenge in visual communication is the contradiction between channel bandwidth and the large amount of data to be transmitted. With the method of the invention, the sending end only needs to transmit a few key-frame images; for non-key frames it detects the feature points and transmits only those. The receiving end restores the non-key frames from the key frames and the feature points. In this way the required image-transmission bandwidth can be reduced by several orders of magnitude. (3) Moving-image coding. Content-based coding methods are becoming the new moving-picture compression standards (e.g. MPEG-4 and MPEG-7). Human-face features are important image content, and the method of the invention can serve as an effective implementation of and supplement to such coding methods. (4) Contactless computer operation. In many situations, such as a disabled person operating a computer or nuclear-reaction control, the user cannot operate the computer with a keyboard or mouse. In such cases the computer can be controlled by tracking the gaze point of the human eye.
The method of the present invention detects the facial feature points in real time; the position on the computer screen at which the pupils are directed is obtained from the three-dimensional geometric model and the calibrated camera model, and the computer then makes the corresponding response.
Description of the drawings:
Fig. 1 shows the definition of the mouth region.
Fig. 2 shows the definition of the nose region.
Fig. 3 shows the feature-point distances used in the motion smoothness constraint.
Specific embodiments:
1. Geometric calibration
100 to 300 face images with different genders, ages, postures and illumination are shot to form a training set. The eyes of the training-set images are geometrically calibrated by a homogeneous transformation, so that the sizes and positions of the eyes in all images are completely consistent. In the detection step, the same geometric calibration is applied to the eyes in the face image under test, so that the relative positions of the two pupils in the training-set images and the test image remain identical.
Assume that the original image is I(x, y), that the known positions of the two pupils are E_L(x_L, y_L) and E_R(x_R, y_R), and that the angle between the line joining the pupils and the horizontal axis is θ. The image I(x, y) is transformed into I′(x′, y′) by the homogeneous transformation of Equation (1), so that the two pupils move to the fixed positions E_L0(x_L0, y_L0) and E_R0(x_R0, y_R0), where y_L0 = y_R0, i.e. the pupil line is parallel to the horizontal axis.

[x′, y′, 1]^T = S·T·R·[x, y, 1]^T, (1)
Wherein: r, T and S are a rotation transformation, a translation transformation and a scale transformation, respectively.
2. Acquisition of a characteristic eye subspace
And performing principal component analysis on the calibrated eyes in the images of the training set to obtain a group of characteristic vectors called characteristic eyes to form a characteristic eye subspace.
Assume that after calibration the eye region has size w × h = n, so that each eye region can be represented by an n-dimensional vector i ∈ R^n. Let the training set be {i_1, i_2, ..., i_m}, i_k ∈ R^n, k = 1, 2, ..., m.
First, the average image (i.e. the average eye) of the training set is found:

μ = (1/m) · Σ_{k=1}^{m} i_k.

Then, the covariance matrix of the training-set samples is calculated:

R = (1/m) · A·A^T,

wherein

A = [i_1 − μ, i_2 − μ, ..., i_m − μ], A ∈ R^{n×m}. (7)
according to the singular value decomposition (SVD) theorem, the orthogonal eigenvector set (u_1, u_2, ..., u_r) of A·A^T ∈ R^{n×n} can be obtained from the orthogonal eigenvector set of the much smaller matrix A^T·A ∈ R^{m×m}. The set obtained after normalizing (u_1, u_2, ..., u_r) is still written (u_1, u_2, ..., u_r); these are exactly the eigenvectors of the covariance matrix R of the training set.

In actual use, only the leading eigenvectors (u_1, u_2, ..., u_l), l ≤ r, are retained, chosen so that their cumulative eigenvalue energy reaches a given fraction of the total.

In the algebraic sense, the covariance matrix R completely expresses all the information of the training set, and R can be fully represented by (u_1, u_2, ..., u_l). Therefore, if the selected training set covers human-eye images under all conditions, the subspace spanned by (u_1, u_2, ..., u_l) can fully describe the human eye; that is, any human eye can be expressed as a linear combination of (u_1, u_2, ..., u_l). We call (u_1, u_2, ..., u_l) the characteristic eyes, and the subspace they span the characteristic eye subspace.
Suppose that the input image, of size w × h, is p ∈ R^n, and let U = [u_1, u_2, ..., u_l]. Projecting p into the characteristic eye subspace, and using the fact that U is an orthogonal matrix, the projection coefficients are

(c_1, c_2, ..., c_l)^T = U^T·p. (10)

Thus we obtain the reconstruction of p in the characteristic eye subspace, p′ = U·(c_1, c_2, ..., c_l)^T. The difference between p and p′ is described by their correlation value δ(p, p′) (Equation 11).
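The construction of the characteristic eye subspace and the projection step can be sketched as follows. Random vectors stand in for the real calibrated training eyes, and the 95% energy threshold used to choose l is an assumption (the patent does not state the value):

```python
import numpy as np

rng = np.random.default_rng(0)
w, h, m = 16, 8, 40                   # eye region w x h = n pixels, m samples
n = w * h
train = rng.random((n, m))            # columns i_1..i_m: eyes as vectors

mu = train.mean(axis=1, keepdims=True)    # average eye
A = train - mu                            # A = [i_1-mu, ..., i_m-mu]

# Eigenvectors of A A^T via the SVD of A (equivalent to working with the
# small m x m matrix A^T A, as the SVD theorem guarantees).
U_full, sigma, _ = np.linalg.svd(A, full_matrices=False)

# Keep the leading l eigenvectors covering 95% of the eigenvalue energy.
energy = np.cumsum(sigma**2) / np.sum(sigma**2)
l = int(np.searchsorted(energy, 0.95)) + 1
U = U_full[:, :l]                         # the characteristic eyes

# Project a candidate eye p and reconstruct it: p' = U U^T (p - mu) + mu.
# The reconstruction error is small for a genuine (in-subspace) eye.
p = train[:, [0]]
p_rec = U @ (U.T @ (p - mu)) + mu
err = np.linalg.norm(p - p_rec)
```

In the detection step, a candidate region with a small reconstruction error is judged eye-like, which is the role played by δ(p, p′) in the text.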
3. Eye detection
Firstly, obtaining a plurality of candidate eyes from a face image of a tested person; performing geometric calibration on each pair of candidate eyes by using homogeneous transformation to ensure that the positions and the sizes of the candidate eyes are the same as those of the eyes in the training set image, and then projecting the candidate eyes to the characteristic eye subspace; and finally, taking the candidate eye with the minimum error between the original eye and the projection thereof as a detection result.
Firstly, k candidate pupils C_1, C_2, ..., C_k are obtained by the Hough transform, and a complete graph G is constructed with C_1, C_2, ..., C_k as its nodes. For the edge between C_i and C_j in the graph, a benefit function B(i, j) is defined as follows:

wherein k_1, k_2 ∈ [0, 1] and k_1 + k_2 = 1.0; p_ij is the eye region segmented from the image by taking C_i and C_j as the left and right pupils; p′_ij is the projection of p_ij into the characteristic eye space; γ(p_ij, p′_ij) is a similarity and symmetry measure; δ(p_ij, p′_ij) describes the authenticity of the eye (Equation 11); and D(i, j) and A(i, j) are constraints on the inter-pupil distance and angle.
The pupil pair (C_l, C_r) satisfying the following condition is taken as the correct pupil position:

wherein γ_0 is the human-eye similarity and symmetry threshold, and δ_0 is the human-eye authenticity threshold. If no B(l, r) satisfies Equation (13), the binarization threshold is increased and the detection is repeated adaptively.
4. Mouth and nose detection
(1) Mouth corner detection
First, the mouth region is estimated from the pupil positions on the basis of anthropometric data. As shown in Fig. 1, if the two pupils are C_l and C_r, the mouth region can be roughly estimated as the parallelogram ABCD. The horizontal and vertical integral projections H(y) and V(x) are computed inside ABCD as follows:

where y = AB(x) and y = DC(x) are the line equations of sides AB and DC, and x = BC(y) and x = AD(y) are the line equations of sides BC and AD. H(y) is calculated from the original image, and V(x) from the combination of the vertical gradient map and the original image.
The valley point of the histogram H(y) gives the vertical position of the mouth corners, and the two valley points of V(x) on either side of its midpoint give the horizontal positions of the mouth corners; the positions of the two mouth corners are thereby determined.
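The mouth-corner search can be sketched on a synthetic patch. A rectangle stands in for the parallelogram ABCD, and V(x) here is computed from the original image only, whereas the patent combines it with the vertical gradient map:

```python
import numpy as np

# Synthetic patch: a mouth stripe whose corners are the darkest points,
# as they tend to be in real images.
img = np.full((40, 60), 200.0)
img[24:27, 15:45] = 90.0            # mouth stripe
img[24:27, 15:17] = 30.0            # dark left corner
img[24:27, 43:45] = 30.0            # dark right corner

# Horizontal integral projection H(y): its valley gives the mouth row.
H = img.sum(axis=1)
y_mouth = int(np.argmin(H))

# Vertical integral projection V(x) inside a band around the mouth row;
# its two valleys on either side of the midpoint are the corner columns.
band = img[y_mouth - 2 : y_mouth + 3, :]
V = band.sum(axis=0)
mid = V.shape[0] // 2
x_left = int(np.argmin(V[:mid]))
x_right = mid + int(np.argmin(V[mid:]))
```

The same two-projection pattern is reused below for the nose: one projection fixes a baseline row, and extrema along that row fix the horizontal positions.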
(2) Nostril and nose-tip detection
The nostril detection steps are as follows:
1) the nose region is roughly estimated from the mouth region (Fig. 2);
2) the base line y = y_n of the nose is obtained by integral projection;
3) the two nostrils N_1(x_n1, y_n) and N_2(x_n2, y_n) are the points lying on the base line y = y_n that satisfy the following condition:
5. verification and correction of facial features in moving images
The features of each frame of the moving image are detected by the above method; if false detection or missed detection occurs, the positions of the feature points in the current frame are estimated from the feature points in the previous frame according to the motion smoothness constraint and the planar motion constraint. The specific method comprises the following steps:
1) Starting from frame 1, features are detected frame by frame in the above manner until the variation between the features of 3 consecutive frames is less than a given threshold. These 3 frames are taken as reference frames, and their features are considered correct.
2) Given a reference frame, the feature detection steps of its neighboring frames (target frames) are:
(1) The feature regions of the target frame are estimated from the features of the reference frame.
(2) The features of the target frame are detected within the estimated regions using the method described above.
(3) The detection result is verified with the motion smoothness constraint.
The principle of the smoothness constraint is as follows: between two adjacent frames (the reference frame and the target frame), the facial feature points should change only slightly, because the amplitude of head motion and the change in the distance between the camera and the face are small. As shown in Fig. 3, the variation of the five distances between feature points in two adjacent images should be less than a threshold; otherwise the detection is considered false.
(4) If the detected features do not satisfy the motion smoothness constraint, the facial features of the target frame are estimated with a planar motion model.
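The smoothness check of step (3) can be sketched as follows. Which five point pairs of Fig. 3 define the tracked distances is not recoverable from the text, so the pairing below is a hypothetical choice:

```python
import numpy as np

# Hypothetical assignment of the five Fig. 3 distances over the 6 features
# (indices: 0,1 pupils; 2,3 mouth corners; 4,5 nostrils).
PAIRS = [(0, 1), (2, 3), (4, 5), (0, 2), (1, 3)]

def smooth(ref_pts, tgt_pts, tol=3.0):
    """Return True if every tracked inter-feature distance changes by
    less than tol pixels between the reference and target frames."""
    ref_pts, tgt_pts = np.asarray(ref_pts), np.asarray(tgt_pts)
    for a, b in PAIRS:
        d_ref = np.linalg.norm(ref_pts[a] - ref_pts[b])
        d_tgt = np.linalg.norm(tgt_pts[a] - tgt_pts[b])
        if abs(d_tgt - d_ref) >= tol:
            return False
    return True

ref = np.array([[30, 40], [70, 40], [40, 80], [60, 80], [42, 65], [58, 65]])
ok = smooth(ref, ref + 1.0)    # small uniform shift: distances unchanged
bad = smooth(ref, ref * 1.5)   # large scale change: rejected as false
```

Note that a pure translation of the whole face passes the check, which is why step (4) falls back to the planar motion model only when the distances themselves change.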
The two pupils, two mouth corners and two nostrils of a human face can be considered to lie approximately on one plane, and this plane should obey the planar rigid-motion constraint between the two frames. Let x = (x_1, x_2) be a feature point in the reference frame; the corresponding feature point x′ = (x′_1, x′_2) in the target frame can be estimated by Equations (19) and (20),

where a_1, ..., a_8 are the planar motion parameters. If 4 pairs of corresponding feature points in the reference frame and the target frame are known, a_1, ..., a_8 can be solved from Equation (21), with

A = [a_1 a_2 a_3 a_4 a_5 a_6 a_7 a_8]^T. (22)

According to steps 1-5, 6 corresponding feature points (two pupils, two mouth corners and two nostrils) between two adjacent frames are available. Choosing 4 of the 6 points in every combination, C(6,4) = 15 sets of plane parameters A_1, ..., A_15 can be obtained with Equation (21). For each combination, the 6 feature points in the target frame are estimated with Equations (19) and (20), and the optimal plane parameter A_opt is the one with the minimum estimation error:

A_opt = {A_i | Min(Err(A_i))}, i = 1, ..., 15, (23)

where Err(A_i) is the estimation error.
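Equations (19)-(21) are not reproduced above. A model consistent with eight parameters a_1, ..., a_8 that four point correspondences determine is the standard planar projective motion model, which is assumed in the sketch below; the combination search of Equation (23) is then a minimum over the C(6,4) = 15 subsets:

```python
import numpy as np
from itertools import combinations

# Assumed planar model (the patent's Equations 19-20 are not reproduced):
#   x1' = (a1*x1 + a2*x2 + a3) / (a7*x1 + a8*x2 + 1)
#   x2' = (a4*x1 + a5*x2 + a6) / (a7*x1 + a8*x2 + 1)

def solve_A(src, dst):
    """Solve a1..a8 from 4 point correspondences (Equation 21 analogue)."""
    M, b = [], []
    for (x1, x2), (u, v) in zip(src, dst):
        M.append([x1, x2, 1, 0, 0, 0, -u * x1, -u * x2]); b.append(u)
        M.append([0, 0, 0, x1, x2, 1, -v * x1, -v * x2]); b.append(v)
    return np.linalg.solve(np.array(M, float), np.array(b, float))

def apply_A(A, pts):
    """Map points with the assumed planar model."""
    a1, a2, a3, a4, a5, a6, a7, a8 = A
    out = []
    for x1, x2 in pts:
        w = a7 * x1 + a8 * x2 + 1.0
        out.append(((a1 * x1 + a2 * x2 + a3) / w,
                    (a4 * x1 + a5 * x2 + a6) / w))
    return np.array(out)

# 6 reference features and their (here: simply translated) target positions.
ref = np.array([[30, 40], [70, 40], [40, 80], [60, 80], [42, 65], [58, 65]], float)
tgt = ref + [5.0, 2.0]

# Try all C(6,4) = 15 subsets; keep the parameters with minimum error (Eq. 23).
err_opt, subset_opt = min(
    (np.linalg.norm(apply_A(solve_A(ref[list(s)], tgt[list(s)]), ref) - tgt), s)
    for s in combinations(range(6), 4))
```

Since the example motion is a pure translation, every subset recovers it and the optimal error is near zero; with noisy detections the subset search discards combinations contaminated by a mis-detected point.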
Claims (1)
1. A method for detecting the face characteristics of a moving image is characterized by comprising the following steps:
(1) shooting 300 face images with different sexes, ages, postures and illumination to form a training set, and geometrically calibrating the eyes of the images in the training set through homogeneous transformation to ensure that the sizes and the positions of the eyes in the images are completely consistent;
(2) performing principal component analysis on the calibrated eyes in the images of the training set to obtain a group of characteristic vectors called characteristic eyes to form a characteristic eye subspace;
(3) for a human face image of a tested person, firstly obtaining a plurality of candidate eyes through Hough transformation, carrying out geometric calibration on each pair of candidate eyes by using homogeneous transformation to ensure that the positions and the sizes of the candidate eyes are the same as those of the eyes in an image of a training set, then projecting the candidate eyes to the characteristic eye subspace, and finally taking the candidate eye with the minimum error between the original eye and the projection of the original eye as a detection result;
(4) after the eye positions of the tested person are determined by the above steps, the mouth position is estimated from the structural characteristics of the human face and the accurate positions of the mouth corners are obtained by integral projection; the nose position is then estimated from the mouth and eye positions, and the nostrils and nose tip are accurately located by integral projection;
(5) and if false detection or missing detection occurs, estimating the positions of eyes, nose and mouth in the current frame from the characteristic points in the previous frame of image according to the motion smoothness constraint and the plane motion constraint.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CNB011204281A CN1156248C (en) | 2001-07-13 | 2001-07-13 | Method for detecting moving human face |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN1325662A CN1325662A (en) | 2001-12-12 |
| CN1156248C true CN1156248C (en) | 2004-07-07 |
Families Citing this family (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7936902B2 (en) * | 2004-11-12 | 2011-05-03 | Omron Corporation | Face feature point detection apparatus and feature point detection apparatus |
| JP2007094906A (en) * | 2005-09-29 | 2007-04-12 | Toshiba Corp | Feature point detection apparatus and method |
| CN100347721C (en) * | 2006-06-29 | 2007-11-07 | 南京大学 | Face setting method based on structured light |
| JP5228307B2 (en) * | 2006-10-16 | 2013-07-03 | ソニー株式会社 | Display device and display method |
| CN101169827B (en) * | 2007-12-03 | 2010-06-02 | 北京中星微电子有限公司 | Method and device for tracking characteristic point of image |
| JP4539729B2 (en) * | 2008-02-15 | 2010-09-08 | ソニー株式会社 | Image processing apparatus, camera apparatus, image processing method, and program |
| CN101339606B (en) * | 2008-08-14 | 2011-10-12 | 北京中星微电子有限公司 | Human face critical organ contour characteristic points positioning and tracking method and device |
| CN101360246B (en) * | 2008-09-09 | 2010-06-02 | 西南交通大学 | Video error concealment method combined with 3D face model |
| CN102043966B (en) * | 2010-12-07 | 2012-11-28 | 浙江大学 | Face recognition method based on combination of partial principal component analysis (PCA) and attitude estimation |
| CN102163240A (en) * | 2011-05-20 | 2011-08-24 | 苏州两江科技有限公司 | Method for constructing human face characteristic image index database based on MPEG-7 (Motion Picture Experts Group-7) standard |
| CN107506682A (en) * | 2016-06-14 | 2017-12-22 | 掌赢信息科技(上海)有限公司 | A kind of man face characteristic point positioning method and electronic equipment |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8294776B2 (en) | 2006-09-27 | 2012-10-23 | Sony Corporation | Imaging apparatus and imaging method |
| US9179057B2 (en) | 2006-09-27 | 2015-11-03 | Sony Corporation | Imaging apparatus and imaging method that acquire environment information and information of a scene being recorded |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C06 | Publication | |
| | PB01 | Publication | |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | C19 | Lapse of patent right due to non-payment of the annual fee | |
| | CF01 | Termination of patent right due to non-payment of annual fee | |