Summary of the Invention
Embodiments of the present application provide a method and a device for multi-size facial expression recognition based on a three-point positioning method. By performing three-point positioning on a face, the key feature points of the face can be captured accurately, the accuracy of locating small faces is improved, and more accurate face data and expression data are obtained.
An embodiment of the present application provides a method for multi-size facial expression recognition based on a three-point positioning method, including:
a server receives image information captured by a camera, where the image information includes first face data;
the server inputs the image information into a face detector;
the server detects candidate face data in the first face data according to the face detector;
the server performs three-point positioning on the candidate face data;
the server judges whether the three-point positioning of the candidate face data is successful;
if so, the server determines second face data;
the server inputs the second face data into an expression recognition model;
the server uses the expression recognition model to output the expression of the second face data.
Optionally, the three-point positioning means that the server positions the left eye center, the right eye center and the nose in the candidate face data.
Optionally, the face detector is a multi-size face detector.
Optionally, the expression recognition model is a FaceNet model whose output is 64-dimensional.
An embodiment of the present application provides a device for multi-size facial expression recognition based on a three-point positioning method, including:
a receiving unit, configured to receive image information captured by a camera, where the image information includes first face data;
a first input unit, configured to input the image information into a face detector;
a detection unit, configured to detect candidate face data in the first face data according to the face detector;
a positioning unit, configured to perform three-point positioning on the candidate face data;
a judging unit, configured to judge whether the three-point positioning of the candidate face data is successful, and if so, to determine second face data;
a second input unit, configured to input the second face data into an expression recognition model;
an output unit, configured to use the expression recognition model to output the expression of the second face data.
Optionally, the three-point positioning means positioning the left eye center, the right eye center and the nose in the candidate face data.
Optionally, the expression recognition model is a FaceNet model whose output is 64-dimensional.
An embodiment of the present application provides a device for multi-size facial expression recognition based on a three-point positioning method. The device has the function of implementing the device behavior in the above method for multi-size facial expression recognition based on the three-point positioning method. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above function.
An embodiment of the present application provides a computer storage medium, configured to store the computer software instructions used by the above device for multi-size facial expression recognition based on the three-point positioning method, including a program designed for the device for multi-size facial expression recognition based on the three-point positioning method.
An embodiment of the present application provides a computer program product, which includes computer software instructions that can be loaded by a processor to implement the flow of the above method for multi-size facial expression recognition based on the three-point positioning method.
As can be seen from the above technical solutions, the embodiments of the present application have the following advantages: after the server detects candidate face data according to the face detector, it performs three-point positioning on the candidate face data, that is, it positions the left eye, the right eye and the nose of the face; the face data for which the three-point positioning succeeds is then input into the expression recognition model, and the expression of the face data is output by the expression recognition model. Images captured in a public venue contain many faces, each occupying only a small proportion of the picture, so the facial organs are poorly resolved and the server may even be unable to recognize individual facial parts accurately, for example the exact positions of the left mouth corner and the nose. With the three-point positioning method, only three positions of the face need to be positioned, so even if some facial parts are difficult to recognize, the accuracy of the positioning is not affected; the key feature points of the face are thereby captured accurately, the accuracy of locating small faces is improved, and more accurate face data and expression data are obtained.
Detailed Description of the Embodiments
Face recognition is a biometric technology that performs identity recognition based on facial feature information. A video camera or a camera acquires images or video streams containing faces, the faces are automatically detected and tracked in the images, and a series of face-related techniques are then applied to the detected faces. A face recognition system (i.e., a face recognition device) mainly includes four components: face image acquisition and detection, face image preprocessing, face image feature extraction, and matching and recognition. The input of a face recognition system is usually one or a series of face images of undetermined identity, together with a number of face images of known identity, or their corresponding codes, from a face database; its output is a series of similarity scores indicating the identity of the face to be recognized.
Facial expression recognition refers to using a computer to extract features from the facial expression information of a face, classifying and interpreting them according to human cognition and ways of thinking, and then analyzing and understanding a person's emotion from the facial expression information, such as happiness, sadness, surprise, fear, anger and disgust. An expression recognition system (i.e., an expression recognition device) is generally divided into four processing stages: acquisition and preprocessing of the face image, face detection, facial feature extraction, and expression classification. Feature extraction is the core step of facial expression recognition and the key to the recognition technology; it determines the final recognition result and directly affects the recognition rate.
In recent years, data-driven feature learning algorithms with deep learning as the breakthrough point have emerged in the field of machine learning; deep learning is essentially the general term for a class of methods that train models with deep structures. A deep-structure model represents features hierarchically, layer by layer, and abandons explicit, hand-crafted feature extraction; by building a multi-layer deep neural network layer by layer (with dozens of hidden layers and tens of millions or even more than a hundred million network parameters), a machine can autonomously learn from sample data the features that more essentially characterize those samples, so that the learned features have stronger generalization and representation ability.
Embodiments of the present application provide a method and a device for multi-size facial expression recognition based on a three-point positioning method. By performing three-point positioning on a face, the key feature points of the face can be captured accurately, the accuracy of locating small faces is improved, and more accurate face data and expression data are obtained.
The method and device for multi-size facial expression recognition based on the three-point positioning method provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Referring to Fig. 1, one embodiment of the method for multi-size facial expression recognition based on the three-point positioning method in the embodiments of the present application includes:
101. The server receives image information captured by a camera, where the image information includes first face data.
The image information captured by the camera includes the first face data. The image information may be a picture or a segment of video, which is not specifically limited here; thresholds for the number of pictures received and for the length of the video may be preset on the server, which is likewise not specifically limited here.
102. The server inputs the image information into the face detector.
After receiving the image information, the server inputs it into the face detector. In this embodiment, the face detector may be trained by deep learning on a face data set of about one hundred thousand samples collected in a venue environment. The algorithm of the face detector may be an artificial neural network (ANN) model, a support vector machine (SVM) model, or an Adaboost model, which is not limited in this embodiment. An ANN model is a mathematical model that imitates neuron activity; it models and detects many features of the face (for example, the size and open/closed state of the eyes, the hairstyle, the skin color, and so on).
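As an illustration only, one of the classifier choices named above (an SVM) could be trained roughly as follows; the dataset handling, HOG features and hyper-parameters are assumptions for the sketch and are not part of the disclosed embodiment:

```python
# Minimal sketch: an SVM face/non-face classifier over HOG features,
# standing in for the ANN/SVM/Adaboost detector described above.
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def extract_features(patches):
    # patches: iterable of fixed-size grayscale crops, e.g. 64x64 arrays
    return np.array([hog(p, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
                     for p in patches])

def train_face_detector(face_patches, nonface_patches):
    X = np.vstack([extract_features(face_patches),
                   extract_features(nonface_patches)])
    y = np.array([1] * len(face_patches) + [0] * len(nonface_patches))
    clf = LinearSVC(C=1.0)          # face vs. non-face decision boundary
    clf.fit(X, y)
    return clf
```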
103. The server detects candidate face data in the first face data according to the face detector.
After the server inputs the image information containing face regions into the face detector, the ANN classifier or SVM classifier in the face detector analyzes the image information and finds candidate face information in the first face data. Since most of the faces in the image information are small faces, the face information coarsely detected by the face detector may include some non-face information; this face information, which may include some non-face information, is the candidate face data.
104. The server performs three-point positioning on the candidate face data.
After the server detects the candidate face data, because this embodiment detects small faces, it is very difficult to capture the key feature points of the face (for example, the five feature points of the face: left eye, right eye, nose, left mouth corner and right mouth corner). Therefore, in this embodiment the face is positioned using the three-point positioning method, where the three points correspond to three key feature points of the face (for example, only the left eye, the right mouth corner and the nose). During the three-point positioning, the server likewise uses the ANN classifier or SVM classifier in the face detector to perform three-point positioning on the candidate face in each piece of candidate face data, and obtains the three key point positions by linear regression. In this embodiment, the facial positions used for the three-point positioning may be selected in advance and set in the face detector; the selected facial positions are not specifically limited here.
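A minimal sketch of the linear-regression step is given below, under the assumption that each candidate face crop is resized to a fixed size and flattened, and that the regression target is the six coordinates (x, y) of the three chosen key points; the feature representation is an assumption:

```python
# Sketch: regress three key-point positions from a candidate face crop.
import numpy as np
from sklearn.linear_model import LinearRegression

def train_three_point_regressor(face_crops, landmarks):
    # face_crops: (N, H, W) grayscale candidate faces; landmarks: (N, 6)
    X = np.array([crop.ravel() for crop in face_crops], dtype=np.float32)
    reg = LinearRegression()
    reg.fit(X, landmarks)
    return reg

def locate_three_points(reg, face_crop):
    pred = reg.predict(face_crop.ravel()[None, :].astype(np.float32))[0]
    return pred.reshape(3, 2)       # one (x, y) row per key point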
105. The server judges whether the three-point positioning of the candidate face data is successful.
In this embodiment, when the server performs three-point positioning on the candidate face data, if the positioning is unsuccessful, that is, the three key feature points of the face (for example, the left eye, the right mouth corner and the nose) cannot be obtained by linear regression, this indicates that the candidate face may be a false detection rather than a real face. Only when the three key feature points of the face are definitely obtained by the three-point positioning is the candidate face determined to be a real face; the data determined to be a real face is the face data whose expression is to be recognized, and it is referred to as the second face data in this embodiment.
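A sketch of this success test, reusing `locate_three_points` from the sketch above, is shown below; the validity criterion (all three points regressed and lying inside the candidate region) is an assumption made for illustration:

```python
# Keep a candidate as "second face data" only if three-point positioning succeeds.
def is_three_point_success(points, box_w, box_h):
    if points is None or len(points) != 3:
        return False
    return all(0 <= x < box_w and 0 <= y < box_h for x, y in points)

def filter_candidates(candidates, regressor):
    confirmed = []
    for crop in candidates:
        pts = locate_three_points(regressor, crop)
        if is_three_point_success(pts, crop.shape[1], crop.shape[0]):
            confirmed.append((crop, pts))   # second face data + key points
    return confirmed
```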
106. The server inputs the second face data into the expression recognition model.
In this embodiment, after determining the second face data, the server inputs it into the expression recognition model. According to the nature of the image, the extraction of expression features by the expression recognition model can be divided into static image feature extraction and sequence image feature extraction. What is extracted from a static image is the deformation feature of the expression, i.e., the transient feature of the expression, whereas for a sequence of images not only the expression deformation features of each frame but also the motion features of the continuous sequence must be extracted. Expression recognition methods may also include template-based matching methods, neural-network-based methods, probabilistic-model-based methods and the like, which are not specifically limited here.
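To illustrate the static versus sequence distinction, one possible realization is sketched below; the use of HOG descriptors for the deformation features and dense optical flow for the motion features is an assumption, not the disclosed method:

```python
# Sketch: per-frame deformation features vs. sequence motion features.
import cv2
import numpy as np

HOG = cv2.HOGDescriptor((64, 64), (16, 16), (8, 8), (8, 8), 9)

def static_expression_features(gray_face):
    # transient deformation features of a single uint8 grayscale face crop
    return HOG.compute(cv2.resize(gray_face, (64, 64))).ravel()

def sequence_motion_features(gray_frames):
    # motion features of a continuous sequence via frame-to-frame optical flow
    flows = []
    for prev, nxt in zip(gray_frames, gray_frames[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(flow.mean(axis=(0, 1)))   # mean (dx, dy) per frame pair
    return np.concatenate(flows)
```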
107. The server uses the expression recognition model to output the expression of the second face data.
After the second face data is input into the expression recognition model, a trained classifier in the expression recognition model can match it against the various expression classes, and the server then uses the expression recognition model to output the matched expression type. The expressions to be recognized can be divided into six classes: normal, happy, uneasy, sad, disgusted and surprised, which is not specifically limited here. The trained classifier of the expression recognition model may be an extreme learning machine (ELM) classifier or a support vector machine (SVM) classifier, which is not specifically limited here.
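As a sketch of this classification stage (the SVM choice and its parameters are assumptions; an ELM could equally be substituted, as stated above), a six-class expression classifier could be trained and queried as follows:

```python
# Sketch: six-class expression classifier over features of confirmed face data.
from sklearn.svm import SVC

EXPRESSIONS = ["normal", "happy", "uneasy", "sad", "disgusted", "surprised"]

def train_expression_classifier(features, labels):
    clf = SVC(kernel="rbf", probability=True)   # labels index into EXPRESSIONS
    clf.fit(features, labels)
    return clf

def recognise_expression(clf, face_features):
    return EXPRESSIONS[int(clf.predict([face_features])[0])]
```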
In the embodiments of the present application, the three-point positioning method is applied to the small faces in the image information. Since only three positions of a small face need to be positioned, even if some facial parts are difficult to recognize, the accuracy of the positioning is not affected; the key feature points of the face are thereby captured accurately, the accuracy of locating small faces is improved, and more accurate face data and expression data can be obtained.
Another embodiment is described below. Referring to Fig. 2, Fig. 2 shows a specific implementation flow of the method for multi-size facial expression recognition based on the three-point positioning method provided by an embodiment of the present invention.
201. The server receives image information captured by a camera, where the image information includes original face data.
Step 201 is similar to step 101 in the embodiment of Fig. 1, and details are not repeated here.
202. The server inputs the image information into a multi-size face detector.
In this embodiment, after receiving the image information captured by the camera, the server inputs the image information into the multi-size face detector. The multi-size face detector may use a three-layer neural network (the details of the neural network are shown in Fig. 3), and the detector may be implemented in many ways: one method mainly learns scale-invariant feature representations, and another mainly learns multi-size features by using different networks to detect different sizes, which is not specifically limited here.
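One common way to approximate the scale-invariant option is to run a fixed-size detector over an image pyramid; this is an illustrative assumption and not the network of Fig. 3:

```python
# Sketch: multi-size detection by sliding a fixed-size detector over a pyramid.
import cv2

def image_pyramid(image, scale=1.25, min_size=(32, 32)):
    while image.shape[0] >= min_size[1] and image.shape[1] >= min_size[0]:
        yield image
        image = cv2.resize(image, (int(image.shape[1] / scale),
                                   int(image.shape[0] / scale)))

def detect_multi_size(image, detect_at_single_scale):
    detections = []
    for scaled in image_pyramid(image):
        factor = image.shape[1] / scaled.shape[1]   # map boxes back to full size
        for (x, y, w, h) in detect_at_single_scale(scaled):
            detections.append((int(x * factor), int(y * factor),
                               int(w * factor), int(h * factor)))
    return detections
```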
203. The server detects candidate face data in the original face data according to the multi-size face detector.
After the server inputs the image information containing face regions into the multi-size face detector, the multi-size face detector analyzes the image information and finds candidate face information in the original face data. The multi-size face detector differs from a fixed-size face detector: during model training, in order to achieve multi-size detection, six hidden layers can be set in the deep learning model, and the multi-size face detector obtains the final judgment according to the data analyzed by the six hidden layers, so that the results of the image data size changes produced in the hidden layers can be recorded and used in the final multi-size judgment. The definition of candidate face information has been described in detail in step 103 of the foregoing embodiment, and details are not repeated here.
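A rough sketch of such a model is given below; the layer widths, convolution settings and the way the per-scale scores are combined are all assumptions, the point being only that each of the six hidden layers produces a differently-scaled feature map whose result is recorded and used in the final judgment:

```python
# Sketch: six hidden layers, each contributing a per-scale face score.
import torch
import torch.nn as nn

class MultiSizeFaceNet(nn.Module):
    def __init__(self):
        super().__init__()
        channels = [(3, 16), (16, 32), (32, 64), (64, 64), (64, 128), (128, 128)]
        self.hidden = nn.ModuleList([
            nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=2, padding=1),
                          nn.ReLU())
            for c_in, c_out in channels
        ])
        self.heads = nn.ModuleList([nn.Conv2d(c_out, 1, 1)
                                    for _, c_out in channels])

    def forward(self, x):
        scores = []
        for layer, head in zip(self.hidden, self.heads):
            x = layer(x)                              # feature map at a new scale
            scores.append(head(x).mean(dim=(2, 3)))   # record per-scale score
        return torch.cat(scores, dim=1).max(dim=1).values  # final judgment
```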
204. The server positions the left eye center, the right eye center and the nose in the candidate face data.
In this embodiment, after the server detects the candidate face data, it positions the three key feature points of the face, i.e., the left eye center, the right eye center and the nose, and obtains the three positions by linear regression. The three key feature points of the face may be preselected and set in the face detector, which is not specifically limited here.
205. The server judges whether the left eye center, the right eye center and the nose in the candidate face data are positioned successfully.
In this embodiment, the server positions the left eye center, the right eye center and the nose. If the positioning is unsuccessful, i.e., the positions of the left eye center, the right eye center and the nose cannot be successfully obtained by linear regression, this indicates that the candidate face may be a false detection rather than a real face. Only when the left eye center, the right eye center and the nose are positioned successfully is the candidate face determined to be a real face; the candidate face data is then the face data whose expression is to be recognized, and it is referred to as the face data to be recognized in this embodiment.
206. The server inputs the face data to be recognized into the expression recognition model.
In this embodiment, the expression recognition model may be an expression recognition model obtained by deep learning, for example, a classical FaceNet model. It should be noted that, after the three-point positioning of the face (in this embodiment, positioning the left eye center, the right eye center and the nose), the expression recognition model can estimate the face location and obtain the pixel information of the face. Because the multi-size face detection in this embodiment yields many small faces, a super-resolution reconstruction technique built on a generative model can be used to reconstruct the facial features; the reconstructed faces are then used as input data to train a four-layer deep learning model whose output is the expression classifier.
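A sketch of this stage follows, with heavy assumptions: the generative super-resolution step is not shown (its output is taken as a fixed-size reconstructed face), and the four-layer classifier's exact architecture is assumed for illustration:

```python
# Sketch: four-layer network trained on reconstructed (super-resolved) faces.
import torch
import torch.nn as nn

class FourLayerExpressionNet(nn.Module):
    def __init__(self, num_expressions=6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # layer 1
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # layer 2
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 64), nn.ReLU(),                       # layer 3
            nn.Linear(64, num_expressions),                               # layer 4
        )

    def forward(self, reconstructed_face):   # (N, 1, 64, 64) reconstructed faces
        return self.net(reconstructed_face)
```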
The existing technical solution transforms face information into a 128-dimensional feature space through the FaceNet network. However, since the pixel data of the face parts in the image information is relatively sparse, it is not suitable to map the face data into 128 dimensions. Therefore, in the technical solution of this embodiment, the server changes the original 128-dimensional output of FaceNet to 64 dimensions and retrains it on the face data, thereby obtaining a new neural network whose output is 64-dimensional, abbreviated as FaceNet-64. The FaceNet-64 model outputs a 64-dimensional feature for each face; the server then uses the 64-dimensional feature of each face as training data, where each face has a corresponding expression label, including normal, happy, uneasy, sad, disgusted and surprised, and performs classification training on the data to obtain the expression recognition model. The classification training of the expression recognition model may use a multiple linear regression model, which is not specifically limited here.
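The sketch below illustrates the idea under stated assumptions: `backbone` stands for a pre-trained FaceNet-style network whose 128-dimensional embedding head (assumed to be exposed as an `embedding` attribute) is swapped for a 64-dimensional one before fine-tuning, and a multinomial logistic regression is used as a stand-in for the regression-based classifier mentioned above:

```python
# Sketch: build "FaceNet-64" and train an expression head on 64-d features.
import torch.nn as nn
from sklearn.linear_model import LogisticRegression  # stand-in classifier

def make_facenet_64(backbone, embedding_in_features):
    # replace the original 128-d embedding head with a 64-d one, then retrain
    backbone.embedding = nn.Linear(embedding_in_features, 64)
    return backbone

def train_expression_head(face_embeddings_64, expression_labels):
    # face_embeddings_64: (N, 64) FaceNet-64 features, one row per face
    head = LogisticRegression(max_iter=1000)
    head.fit(face_embeddings_64, expression_labels)
    return head
```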
207. The server uses the expression recognition model to output, for the face data to be recognized, the probability of each expression type classified in the expression recognition model.
In this embodiment, the various types of expressions are prestored in the trained classifier of the expression recognition model. The server uses the expression recognition model to compare the face data to be recognized with each classified expression type in the trained classifier, and obtains the probability of each expression type. The trained classifier of the expression recognition model may use a multiple linear regression model, which is not specifically limited here.
208. The server selects the expression type with the highest probability as the expression of the face data to be recognized.
In this embodiment, the server may select the expression type with the highest probability as the expression of the face data to be recognized, or may display in a list the expression probabilities of the face data to be recognized, which is not specifically limited here.
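A minimal sketch of steps 207-208, reusing the `EXPRESSIONS` list and the `head` classifier from the sketches above, could look like this:

```python
# Sketch: per-class expression probabilities and the most probable expression.
def expression_probabilities(head, embedding_64):
    return dict(zip(EXPRESSIONS, head.predict_proba([embedding_64])[0]))

def most_likely_expression(head, embedding_64):
    probs = expression_probabilities(head, embedding_64)
    return max(probs, key=probs.get)
```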
The beneficial effect of this embodiment is that, by using the three-point positioning method, the server only needs to position three points of a small face in the image information. Even if some facial parts are difficult to recognize, the accuracy of the positioning is not affected; the key feature points of the face are thereby captured accurately, the accuracy of locating small faces is improved, and more accurate face data and expression data are obtained.
The method for multi-size facial expression recognition based on the three-point positioning method in the embodiments of the present application has been described above. The device for multi-size facial expression recognition based on the three-point positioning method in the embodiments of the present application is described below. Referring to Fig. 4, one embodiment of the device for multi-size facial expression recognition based on the three-point positioning method in the embodiments of the present application includes:
a receiving unit 401, configured to receive image information captured by a camera, where the image information includes first face data;
a first input unit 402, configured to input the image information into a face detector;
a detection unit 403, configured to detect candidate face data in the first face data according to the face detector;
a positioning unit 404, configured to perform three-point positioning on the candidate face data;
a judging unit 405, configured to judge whether the three-point positioning of the candidate face data is successful, and if so, to determine second face data;
a second input unit 406, configured to input the second face data into an expression recognition model;
an output unit 407, configured to use the expression recognition model to output the expression of the second face data.
In this embodiment, the three-point positioning may position the left eye center, the right eye center and the nose in the candidate face data.
In this embodiment, the face detector may be a multi-size face detector.
In this embodiment, the expression recognition model may be a FaceNet model whose output is 64-dimensional.
In this embodiment, the flow performed by each unit of the device for multi-size facial expression recognition based on the three-point positioning method is similar to the method flow described in the embodiments shown in Fig. 1 and Fig. 2 above, and details are not repeated here.
In this embodiment, after the detection unit 403 detects the candidate face data according to the face detector, the positioning unit 404 performs three-point positioning on the candidate face data (for example, positioning the left eye, the right eye and the nose of the face); the second input unit 406 inputs the face data determined by the judging unit 405 to have been positioned successfully into the expression recognition model; finally, the output unit 407 outputs the expression of the face data recognized by the expression recognition model. With the three-point positioning method, since only three positions of a small face in the image information need to be positioned, even if some facial parts are difficult to recognize, the accuracy of the positioning is not affected; the key feature points of the face are thereby captured accurately, the accuracy of locating small faces is improved, and more accurate face data and expression data are obtained.
The above is one embodiment of the device for multi-size facial expression recognition based on the three-point positioning method in the embodiments of the present application. Referring to Fig. 5, another embodiment of the device for multi-size facial expression recognition based on the three-point positioning method in the embodiments of the present application includes:
a device 500 for multi-size facial expression recognition based on the three-point positioning method, which may vary considerably due to differences in configuration or performance, and which may include one or more central processing units (CPU) 501 (for example, one or more processors) and a memory 505 in which one or more application programs or data are stored.
The memory 505 may be volatile storage or persistent storage. The program stored in the memory 505 may include one or more modules, each of which may include a series of instruction operations on the server. Further, the central processing unit 501 may be configured to communicate with the memory 505 and to execute, on the device 500 for multi-size facial expression recognition based on the three-point positioning method, the series of instruction operations in the memory 505.
The device 500 for multi-size facial expression recognition based on the three-point positioning method may also include one or more power supplies 502, one or more wired or wireless network interfaces 503, one or more input/output interfaces 504, and/or one or more operating systems, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™ and the like.
In this embodiment, the flow performed by the central processing unit 501 in the device 500 for multi-size facial expression recognition based on the three-point positioning method is similar to the method flow described in the embodiments shown in Fig. 1 and Fig. 2 above, and details are not repeated here.
An embodiment of the present application also provides a computer storage medium, configured to store the computer software instructions used by the aforementioned device for multi-size facial expression recognition based on the three-point positioning method, including a program designed for the device for multi-size facial expression recognition based on the three-point positioning method.
An embodiment of the present application also provides a computer program product, which includes computer software instructions that can be loaded by a processor to implement the method flows in the embodiments shown in Fig. 1, Fig. 2 and Fig. 4 above.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may refer to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division of the units is only a logical functional division, and there may be other divisions in actual implementation, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the steps of the methods of the embodiments of the application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The above embodiments are only intended to illustrate the technical solutions of the application, not to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements for some of the technical features therein, and that these modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the application.