CN107085704A

CN107085704A - Fast Facial Expression Recognition Method Based on ELM Autoencoding Algorithm

Info

Publication number: CN107085704A
Application number: CN201710188162.6A
Authority: CN
Inventors: 陆晗; 曹九稳; 朱心怡
Original assignee: Hangzhou Dianzi University
Current assignee: Hangzhou Dianzi University
Priority date: 2017-03-27
Filing date: 2017-03-27
Publication date: 2017-08-22

Abstract

The invention discloses a fast facial expression recognition method based on the ELM autoencoding algorithm. The method comprises the following steps: Step 1, train an Adaboost-based face region detection classifier and perform face detection; Step 2, preprocess the detected face region, including cropping, size normalization, and histogram equalization; Step 3, extract features from the preprocessed facial expression image using the ELM-AE algorithm, which combines an autoencoder with the extreme learning machine; Step 4, build a facial expression classifier based on the extreme learning machine and feed the extracted feature vector into it, the output being the emotion of the face. The invention extracts the main information and reduces dimensionality quickly and efficiently. For expression classification, ELM needs to tune only one neuron parameter; recognition run time is short and accuracy is high, making it an efficient algorithm with a fast learning speed.

Description

Fast Facial Expression Recognition Method Based on ELM Autoencoding Algorithm

Technical Field

The invention belongs to the field of image processing. It describes the complete process of recognizing emotion from facial expressions and relates in particular to a fast facial expression recognition method based on the ELM autoencoding algorithm.

Background Art

Emotion recognition from facial expressions, that is, detecting faces in images or video and then analyzing the emotion they express, has been an active topic in intelligent biometric recognition over the past few decades. In essence, emotion recognition aims to give computers the ability to read a person's expression and mood, improving today's rather rigid and immature human-computer interaction.

Emotion analysis of facial expressions mainly involves face detection and localization, image preprocessing, expression feature extraction, and expression classification with emotion analysis. Face detection means determining whether an image contains a face region and, if it does, locating the face and determining its size. This invention uses an Adaboost-based face detection method. Adaboost is an iterative algorithm: for a given training set, it changes the distribution probability of each sample to obtain different training sets S_i, trains a weak classifier H_i on each S_i, and then combines these weak classifiers with different weights into a strong classifier. Detected expression images usually suffer from noise and insufficient contrast, typically caused by illumination conditions and device quality. Preprocessing is therefore an important stage of facial emotion recognition, and effective preprocessing helps improve the recognition rate. After preprocessing, a feature extraction algorithm is applied to the face image to extract the features of different expressions. This patent uses the ELM-AE algorithm, which combines an autoencoder with the extreme learning machine, as the feature extraction method. It is an efficient and practical autoencoding algorithm: the sample data are encoded and decoded, and if the reconstruction error is small enough to fall within a specified bound, the code is accepted as a valid representation of the input sample and can serve as the descriptor vector of the facial expression image. Finally, expression classification and further emotion analysis are performed: a training feature library for the recognition model is built from the feature vectors extracted from facial expression images, with each target assigned a class label from happiness, sadness, surprise, fear, anger, disgust, and neutral.

Expression feature extraction is an important part of any facial expression recognition system. Traditional approaches such as LBP, geometric-feature-based methods, and template-based methods all have shortcomings. For example, LBP feature extraction has difficulty handling high-dimensional data and runs slowly; geometric-feature-based methods are not sufficiently adaptable and also lose part of the information.

The essence of expression recognition is designing an efficient classifier that, from the feature vectors extracted in the previous stage, assigns the target expression to one of the six basic expression categories or to the neutral expression. The classifier design therefore directly affects the final quality of expression recognition and emotion analysis. Because the volume of data after feature extraction is large, traditional methods such as artificial neural networks, template matching, and support vector machines (SVM) do not run fast enough, and because they need many training samples and long training times, they cannot meet real-time requirements. This invention therefore uses a classifier based on the extreme learning machine (ELM) for fast expression classification. ELM is a simple, easy-to-use, and effective learning algorithm for single-hidden-layer feedforward neural networks (SLFNs). It has the following advantages for expression classification: (1) ELM uses random weights between the input layer and the hidden layer, so the same data set can be trained several times, yielding different output spaces with different classification accuracies. (2) ELM is a simpler learning algorithm for feedforward neural networks. Traditional neural network learning algorithms (such as backpropagation, BP) require many manually set training parameters and easily become trapped in local optima. ELM only requires setting the number of hidden-layer nodes; the input weights and hidden-node biases need no adjustment during execution, and a unique optimal solution is produced. ELM therefore learns faster than traditional artificial neural networks and generalizes better, enabling fast expression classification and emotion recognition.

The output of the ELM is

$$f(x) = \sum_{i=1}^{L} \beta_i\, G(a_i, b_i, x) = h(x)^T \beta,$$

where $\beta_i$ is the weight between the $i$-th hidden node and the output nodes, $G(a_i, b_i, x)$ is the hidden-layer output function, and $h(x) = [G(a_1, b_1, x), \dots, G(a_L, b_L, x)]^T$ is the hidden-layer output vector for input $x$. The key to ELM is to minimize both the training error and the norm of the output weights, i.e., to minimize $\|H\beta - T\|^2$ and $\|\beta\|$.

The ELM algorithm is summarized as follows. Given a training set $\{(x_i, t_i) \mid x_i \in \mathbb{R}^n,\ t_i \in \mathbb{R}^m,\ i = 1, 2, \dots, N\}$, a hidden node output function $g(w, b, x)$, and the number of hidden nodes $L$:

(1) Randomly assign the hidden node parameters $(w_i, b_i)$, $i = 1, 2, \dots, L$.

(2) Compute the hidden-layer output matrix $H$.

(3) Compute the weights $\beta$ between the hidden nodes and the output nodes: $\beta = H^{+} T$.

$H^{+}$ is the Moore-Penrose generalized inverse of the hidden-layer output matrix $H$. It can be computed by methods such as orthogonal projection, orthogonalization, or singular value decomposition.
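As an illustration of the last remark, the sketch below compares two of the named ways of obtaining $\beta = H^{+}T$: the SVD-based pseudoinverse and the orthogonal projection formula $H^{+} = (H^T H)^{-1} H^T$, which is valid when $H$ has full column rank. The matrix sizes and the random data are assumptions made only for this example; they do not come from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: N = 50 samples, L = 10 hidden nodes, m = 3 outputs.
H = rng.standard_normal((50, 10))   # stand-in for the hidden-layer output matrix
T = rng.standard_normal((50, 3))    # stand-in for the target matrix

# SVD-based Moore-Penrose pseudoinverse (what np.linalg.pinv computes).
beta_svd = np.linalg.pinv(H) @ T

# Orthogonal projection: solve (H^T H) beta = H^T T, assuming full column rank.
beta_proj = np.linalg.solve(H.T @ H, H.T @ T)

# Largest absolute difference between the two solutions.
max_diff = np.max(np.abs(beta_svd - beta_proj))
```

When $H$ is well conditioned the two results agree to numerical precision; the SVD route is the more robust choice if $H$ is close to rank deficient.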

Summary of the Invention

The purpose of the invention is to address the problems of existing expression recognition algorithms by providing a faster and more efficient facial expression recognition method based on the ELM autoencoding algorithm.

The technical solution of the invention mainly comprises the following steps:

Step 1. Train the face region detection classifier

1-1. Given a series of training samples $(x_1, y_1), (x_2, y_2), \dots, (x_i, y_i), \dots, (x_n, y_n)$, where $x_i$ denotes the $i$-th sample, $y_i = 0$ marks a negative sample (non-face), $y_i = 1$ marks a positive sample (face), and $n$ is the total number of training samples.

1-2. Initialize the sample weights and normalize them so that they sum to one; $D_t(i)$ is the error weight of the $i$-th sample in the $t$-th iteration, $t = 1, \dots, T$.

1-3. For each feature $f$, train a weak classifier $h(x, f, p, \theta)$ and compute the weighted error rate of the weak classifiers over all features:

$$\xi_f = \sum_{i=1}^{n} D_t(i)\, \lvert h(x_i, f, p, \theta) - y_i \rvert.$$

1-4. Update the weights of all samples:

$$D_{t+1}(i) = D_t(i)\, \beta_t^{\,1 - e_i},$$

where $\beta_t = \xi_t / (1 - \xi_t)$, $e_i = 0$ indicates that $x_i$ is classified correctly, and $e_i = 1$ indicates that $x_i$ is classified incorrectly.

1-5. The trained strong classifier can be used for face detection. If the image contains a face, it also locates the face and determines its size. The strong classifier $H(x)$ is

$$H(x) = \begin{cases} 1, & \sum_{t=1}^{T} \alpha_t h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0, & \text{otherwise} \end{cases} \qquad \alpha_t = \log\frac{1}{\beta_t} = \log\frac{1 - \xi_t}{\xi_t},$$

where $h_t$ is the weak classifier with the minimum error rate $\xi_t$ in training round $t$.
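The boosting loop of Step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes that the feature values $f(x_i)$ are already computed and stored in a matrix `F`, that a weak classifier is a threshold stump $h = 1$ if $p \cdot f(x) < p \cdot \theta$ and $0$ otherwise, and that the initial weights are uniform. The function names `train_adaboost` and `predict_adaboost` are hypothetical.

```python
import numpy as np

def train_adaboost(F, y, T=5):
    """F: (n, k) matrix of feature values f(x_i); y: (n,) labels in {0, 1}."""
    n, k = F.shape
    D = np.full(n, 1.0 / n)                  # assumed uniform initial weights
    stumps, alphas = [], []
    for _ in range(T):
        D = D / D.sum()                      # weight normalization
        best = None
        for f in range(k):                   # exhaustive search over stumps
            for theta in np.unique(F[:, f]):
                for p in (1, -1):
                    h = (p * F[:, f] < p * theta).astype(int)
                    err = np.sum(D * np.abs(h - y))      # weighted error rate
                    if best is None or err < best[0]:
                        best = (err, f, p, theta, h)
        err, f, p, theta, h = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against division by zero
        beta = err / (1.0 - err)
        e = (h != y).astype(int)             # e_i = 0 if correct, 1 if wrong
        D = D * beta ** (1 - e)              # shrink weights of correct samples
        stumps.append((f, p, theta))
        alphas.append(np.log(1.0 / beta))
    return stumps, np.array(alphas)

def predict_adaboost(F, stumps, alphas):
    """Strong classifier: weighted vote compared against half the total weight."""
    votes = sum(a * (p * F[:, f] < p * theta).astype(int)
                for a, (f, p, theta) in zip(alphas, stumps))
    return (votes >= 0.5 * alphas.sum()).astype(int)
```

For example, with `F = np.array([[0.1], [0.2], [0.8], [0.9]])` and `y = np.array([1, 1, 0, 0])`, `train_adaboost(F, y, T=3)` selects a threshold that separates the two groups and `predict_adaboost` reproduces `y`. A practical face detector would compute Haar-like features on image windows and scan the image at multiple scales, which this sketch omits.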

Step 2. Face region preprocessing

2-1. Crop the detected face region as the region of interest (ROI), then normalize the pixel size of the image by shrinking or enlarging it to a suitable fixed pixel size.

2-2. Apply histogram equalization to the size-normalized image to enhance contrast. For a discrete image, the gray-level probability is $P_r(r_k) = n_k / N$, $0 \le r_k < 1$, $k = 0, 1, \dots, L-1$, where $N$ is the total number of pixels, $L$ is the total number of gray levels ($L = 2^8 = 256$ for an 8-bit grayscale image), and $r_k$ is the $k$-th gray level value. The equalization transform function is

$$s_k = T(r_k) = \sum_{j=0}^{k} P_r(r_j) = \sum_{j=0}^{k} \frac{n_j}{N},$$

where $n_j$ is the number of pixels with gray value $j$.
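The equalization transform can be sketched in a few lines of NumPy. The function name `equalize_histogram` is hypothetical, and rescaling the cumulative distribution to the range 0 to 255 is a common convention assumed here rather than something stated in the text.

```python
import numpy as np

def equalize_histogram(img, levels=256):
    """img: 2-D uint8 grayscale array. Returns the equalized uint8 image."""
    n_j = np.bincount(img.ravel(), minlength=levels)     # pixels per gray level
    cdf = np.cumsum(n_j) / img.size                      # sum_{j<=k} n_j / N
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)  # rescale to 0..levels-1
    return lut[img]                                      # apply as lookup table
```

Because the cumulative sum is non-decreasing, the mapping preserves the ordering of gray levels while spreading them over the full output range; the brightest level present in the input always maps to 255.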

Step 3. Facial expression image feature extraction

3-1. Given the training samples $X = [x_1, x_2, \dots, x_N]$, which serve as both the input matrix and the output matrix of the ELM-AE.

3-2. Randomly generate the hidden-layer input weight matrix $a = [a_1, \dots, a_L]$ and the bias vector $b = [b_1, \dots, b_L]$, both orthogonalized, and map the input data to a space of the same or a different dimension:

$$h = g(a \cdot x + b), \qquad a^T a = I, \quad b^T b = 1,$$

where $g(\cdot)$ denotes the activation function.

3-3. Solve for the output weight matrix $\beta$ of the ELM-AE.

Let $d$ be the number of neurons in the input and output layers and $L$ the number of neurons in the hidden layer.

If $d < L$ or $d > L$, i.e., for sparse or compressed feature representations,

If $d = L$, i.e., for an equal-dimension feature mapping, $\beta = H^{-1} X$, $\beta^T \beta = I$,

where $H = [h_1, \dots, h_N]$ denotes the hidden-layer output matrix of the ELM-AE.

3-4. Input the preprocessed facial expression image into the trained ELM-AE; the resulting hidden-layer output vector $H$ is the texture feature vector of the whole face image.
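A minimal sketch of Step 3 follows, under several assumptions that are not taken from the text: samples are stored as rows of `X`, the activation $g$ is a sigmoid, the random weights are orthogonalized with a QR decomposition, and for $d \ne L$ the output weights are obtained by ridge-regularized least squares with a hypothetical regularization parameter `C`. The function names are also hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_elm_ae(X, L, C=1e3, seed=0):
    """X: (N, d) samples as rows. Returns (a, b, beta)."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    A = rng.standard_normal((d, L))
    if d >= L:
        a, _ = np.linalg.qr(A)       # (d, L) with orthonormal columns: a^T a = I
    else:
        q, _ = np.linalg.qr(A.T)     # (L, d) with orthonormal columns
        a = q.T                      # (d, L) with orthonormal rows: a a^T = I
    b = rng.standard_normal(L)
    b /= np.linalg.norm(b)           # unit-norm bias: b^T b = 1
    H = sigmoid(X @ a + b)           # hidden-layer output, one row per sample
    # Assumed regularized least squares so that H @ beta approximates X.
    beta = np.linalg.solve(np.eye(L) / C + H.T @ H, H.T @ X)
    return a, b, beta

def encode(X, a, b):
    """Hidden-layer output used as the feature vector, as in step 3-4."""
    return sigmoid(X @ a + b)
```

Here `encode` returns the hidden-layer output that step 3-4 uses as the feature vector, while `beta` maps it back to the input space, so the reconstruction error `np.linalg.norm(encode(X, a, b) @ beta - X)` can be used to check how well the code represents the input.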

Step 4. Build the facial expression classifier

4-1. Given the training samples $\{(x_i, t_i) \mid x_i \in \mathbb{R}^n,\ t_i \in \mathbb{R}^m,\ i = 1, 2, \dots, N\}$, the hidden-layer output function $g(w, b, x)$, the number of hidden nodes $L$, and a test sample $y$.

4-2. Randomly generate the hidden node parameters $(w_i, b_i)$, $i = 1, 2, \dots, L$.

4-3. Compute the hidden node output matrix

$$H(w_1, \dots, w_L, x_1, \dots, x_N, b_1, \dots, b_L) = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_L \cdot x_N + b_L) \end{bmatrix}$$

and ensure that $H$ has full column rank, where $w$ is the input weight connecting the hidden nodes and the input neurons, $x$ is the training sample input, $N$ is the number of training samples, $b_i$ is the bias of the $i$-th hidden node, and $g(\cdot)$ denotes the activation function.

4-4. Compute the optimal output weights $\beta$: $\beta = H^{+} T$.

4-5. Compute the output for the test sample $y$, with the hidden-layer output evaluated on $y$: $o = H(w_1, \dots, w_L, y, b_1, \dots, b_L)\, \beta$.

4-6. Classify the expression of the test sample: the class corresponding to the maximum value in the ELM output vector $o$ is the emotion of the face, i.e.,

$$\text{label}(y) = \arg\max_{1 \le i \le m} o_i.$$
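Steps 4-1 to 4-6 can be sketched as below. The sketch assumes a sigmoid activation, uniformly distributed random weights in $[-1, 1]$, and one-hot target vectors $t_i$; none of these choices is specified in the text, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_elm(X, labels, n_classes, L, seed=0):
    """X: (N, n) feature rows; labels: (N,) integer class indices."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, (X.shape[1], L))   # random input weights w_i
    b = rng.uniform(-1.0, 1.0, L)                 # random hidden biases b_i
    H = sigmoid(X @ W + b)                        # (N, L) hidden output matrix
    T = np.eye(n_classes)[labels]                 # assumed one-hot targets
    beta = np.linalg.pinv(H) @ T                  # beta = H^+ T
    return W, b, beta

def predict_elm(Y, W, b, beta):
    """Returns the index of the largest output for each test row of Y."""
    O = sigmoid(Y @ W + b) @ beta                 # ELM output vectors o
    return np.argmax(O, axis=1)
```

In the setting of this document, `X` would hold the ELM-AE feature vectors and `n_classes` would be 7 (six basic expressions plus neutral). Only `beta` is learned; `W` and `b` stay at their random values, which is why training reduces to a single pseudoinverse computation.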

The beneficial effects of the invention are as follows:

The invention uses the deep extreme learning machine autoencoder (ELM-AE) algorithm to extract facial expression features. ELM-AE is more efficient than an ordinary autoencoder (AE): it can quickly process high-dimensional input data, extract its principal information, and represent the original data in a higher-, equal-, or lower-dimensional feature space.

The invention achieves fast recognition. When extracting expression features, ELM-AE extracts the main information and reduces dimensionality more quickly and efficiently than slow-learning gradient descent algorithms. For expression classification, ELM needs to tune only one neuron parameter; recognition run time is short and accuracy is high, making it an efficient algorithm with a fast learning speed.

The extracted features reduce the dimensionality of the data while representing the principal components of the original information (the facial expression image). Compared with other feature extraction algorithms, ELM-AE can quickly extract the basic building blocks of an image and can handle very high-dimensional input data. Meanwhile, the ELM-based expression classification algorithm learns and recognizes quickly. Combining the two algorithms can greatly improve the speed and accuracy of facial expression recognition.

Brief Description of the Drawings

Fig. 1 is a flow chart of the invention;

Fig. 2 shows the Japanese JAFFE facial expression image database;

Fig. 3 shows facial expression images after preprocessing;

Fig. 4 is the ELM-AE network structure diagram;

Fig. 5 is a schematic diagram of a single-hidden-layer feedforward neural network.

Detailed Description

As shown in Fig. 1, the Adaboost algorithm is first used to train the face region detection classifier: the weak classifiers obtained in the training rounds are combined with certain weights into a strong classifier that can detect face regions. The image to be examined is then fed to the trained face detection classifier, and the detected face region is cropped, normalized in pixel size, and histogram-equalized. The processed facial expression image is fed into the trained ELM-AE feature extraction network, and the resulting hidden-layer output vector H is the texture feature vector of the whole face image. Finally, this feature vector is used as the input of the trained ELM expression classifier, which outputs the corresponding expression category.

The invention provides a fast facial expression recognition method based on the ELM autoencoding algorithm. The ELM-AE algorithm extracts the facial expression features, which are used as the input of the ELM expression classifier. Combining the two improves running speed while achieving high accuracy.

The specific implementation is as follows:

Step 1: Train the face region detection classifier. For a set of training samples, different training sets S_i are obtained by changing the distribution probability of each sample; a weak classifier H_i is trained on each S_i, and these weak classifiers are then combined with different weights into a strong classifier.

(1-1) As shown in Fig. 2, the Japanese JAFFE facial expression database is used as the training samples. Each sample is given an initial weight, and the weights are normalized so that they sum to one; $D_t(i)$ is the error weight of the $i$-th sample in the $t$-th iteration, $t = 1, \dots, T$.

(1-2) For each feature $f$, train a weak classifier $h(x, f, p, \theta)$ and compute the weighted error rate of the weak classifiers over all features:

$$\xi_f = \sum_{i=1}^{n} D_t(i)\, \lvert h(x_i, f, p, \theta) - y_i \rvert,$$

then update the weights of all samples:

$$D_{t+1}(i) = D_t(i)\, \beta_t^{\,1 - e_i},$$

where $\beta_t = \xi_t / (1 - \xi_t)$, $e_i = 0$ indicates that $x_i$ is classified correctly, and $e_i = 1$ indicates that $x_i$ is classified incorrectly.

(1-3) The strong classifier obtained after training can be used for face detection. If the image contains a face, the center position and size of the face can be determined.

The strong classifier $H(x)$ is

$$H(x) = \begin{cases} 1, & \sum_{t=1}^{T} \alpha_t h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0, & \text{otherwise} \end{cases} \qquad \alpha_t = \log\frac{1}{\beta_t} = \log\frac{1 - \xi_t}{\xi_t}.$$

Step 2: Face region preprocessing. As shown in Fig. 3, the detected face region is cropped as the region of interest (ROI), normalized in pixel size by shrinking or enlarging the image to a suitable fixed pixel size, and histogram-equalized.

(2-1) After ROI segmentation of the image in which a face region was detected, normalize the pixel size and output a facial expression image of fixed size.

(2-2) Apply histogram equalization to the processed image. The equalization transform function is

$$s_k = T(r_k) = \sum_{j=0}^{k} \frac{n_j}{N},$$

where $n_j$ is the number of pixels with gray value $j$ and $N$ is the total number of pixels.
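The cropping and size normalization of step (2-1) can be sketched as follows. The bounding-box layout `(x, y, w, h)`, the nearest-neighbour resampling, the 64 x 64 default output size, and the function name `crop_and_resize` are all assumptions made for illustration; the text only requires "a suitable pixel size".

```python
import numpy as np

def crop_and_resize(img, box, out_size=(64, 64)):
    """Crop the face ROI given box = (x, y, w, h) and resize it
    to out_size = (height, width) by nearest-neighbour sampling."""
    x, y, w, h = box
    roi = img[y:y + h, x:x + w]                # region of interest
    out_h, out_w = out_size
    rows = np.arange(out_h) * h // out_h       # source row for each output row
    cols = np.arange(out_w) * w // out_w       # source column for each output column
    return roi[rows[:, None], cols]            # fixed-size output image
```

Nearest-neighbour sampling keeps the sketch dependency-free; an image library's bilinear or area interpolation would normally give smoother results when shrinking.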

Step 3: Facial expression image feature extraction. The ELM-AE network (Fig. 4) is trained, and a different output weight matrix $\beta$ is computed depending on the dimension of the feature representation. The trained ELM-AE network can then be used to extract features from expression images.

(3-1) Given the training samples $X = [x_1, x_2, \dots, x_N]$, which serve as both the input matrix and the output matrix of the ELM-AE.

(3-2) Randomly generate the hidden-layer input weight matrix $a = [a_1, \dots, a_L]$ and the bias vector $b = [b_1, \dots, b_L]$, both orthogonalized.

(3-3) Map the input data to a space of the same or a different dimension:

$$h = g(a \cdot x + b), \qquad a^T a = I, \quad b^T b = 1,$$

where $g(\cdot)$ denotes the activation function.

(3-4) Solve for the output weight matrix $\beta$ of the ELM-AE.

(3-5) Input the preprocessed facial expression image into the trained ELM-AE; the resulting hidden-layer output vector $H$ is the texture feature vector of the whole face image.

Step 4: Build the facial expression classifier. As shown in Fig. 5, an expression classifier based on the extreme learning machine is built: the hidden node parameters are generated randomly, and training optimizes the only adjustable parameter, $\beta$.

(4-1) Given the training samples $\{(x_i, t_i) \mid x_i \in \mathbb{R}^n,\ t_i \in \mathbb{R}^m,\ i = 1, 2, \dots, N\}$, the hidden-layer output function $g(w, b, x)$, the number of hidden nodes $L$, and a test sample $y$.

(4-2) Randomly generate the hidden node parameters $(w_i, b_i)$, $i = 1, 2, \dots, L$.

(4-3) Compute the hidden node output matrix

$$H(w_1, \dots, w_L, x_1, \dots, x_N, b_1, \dots, b_L) = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_L \cdot x_N + b_L) \end{bmatrix}.$$

(4-4) Compute the optimal output weights $\beta$: $\beta = H^{+} T$.

(4-5) Compute the output for the test sample $y$, with the hidden-layer output evaluated on $y$:

$$o = H(w_1, \dots, w_L, y, b_1, \dots, b_L)\, \beta.$$

Classify the expression of the test sample: the class corresponding to the maximum value in the ELM output vector $o$ is the emotion of the face, i.e.,

$$\text{label}(y) = \arg\max_{1 \le i \le m} o_i.$$

Claims

1. A fast facial expression recognition method based on the ELM autoencoding algorithm, characterized by comprising the following steps:

Step 1. Train the face region detection classifier;

1-1. Given a series of training samples $(x_1, y_1), (x_2, y_2), \dots, (x_i, y_i), \dots, (x_n, y_n)$, where $x_i$ denotes the $i$-th sample, $y_i = 0$ marks a negative sample (non-face), $y_i = 1$ marks a positive sample (face), and $n$ is the total number of training samples;

1-2. Initialize the sample weights and normalize them so that they sum to one, where $D_t(i)$ is the error weight of the $i$-th sample in the $t$-th iteration, $t = 1, \dots, T$;

1-3. For each feature $f$, train a weak classifier $h(x, f, p, \theta)$ and compute the weighted error rate of the weak classifiers over all features:

$$\xi_f = \sum_{i=1}^{n} D_t(i)\, \lvert h(x_i, f, p, \theta) - y_i \rvert;$$

1-4. Update the weights of all samples:

$$D_{t+1}(i) = D_t(i)\, \beta_t^{\,1 - e_i},$$

where $\beta_t = \xi_t / (1 - \xi_t)$, $e_i = 0$ indicates that $x_i$ is classified correctly, and $e_i = 1$ indicates that $x_i$ is classified incorrectly;

1-5. The trained strong classifier can be used for face detection; if the image contains a face, it also locates the face and determines its size; the strong classifier $H(x)$ is

$$H(x) = \begin{cases} 1, & \sum_{t=1}^{T} \alpha_t h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \alpha_t \\ 0, & \text{otherwise} \end{cases} \qquad \alpha_t = \log\frac{1}{\beta_t} = \log\frac{1 - \xi_t}{\xi_t},$$

where $h_t$ is the weak classifier with the minimum error rate $\xi_t$ in training round $t$;

Step 2. Face region preprocessing;

2-1. Crop the detected face region as the region of interest (ROI), then normalize the pixel size of the image by shrinking or enlarging it to a suitable fixed pixel size;

2-2. Apply histogram equalization to the size-normalized image to enhance contrast; for a discrete image, the gray-level probability is $P_r(r_k) = n_k / N$, $0 \le r_k < 1$, $k = 0, 1, \dots, L-1$, where $N$ is the total number of pixels, $L$ is the total number of gray levels ($L = 2^8 = 256$ for an 8-bit grayscale image), and $r_k$ is the $k$-th gray level value; the equalization transform function is

$$s_k = T(r_k) = \sum_{j=0}^{k} P_r(r_j) = \sum_{j=0}^{k} \frac{n_j}{N},$$

where $n_j$ is the number of pixels with gray value $j$;

Step 3. Facial expression image feature extraction;

3-1. Given the training samples $X = [x_1, x_2, \dots, x_N]$, which serve as both the input matrix and the output matrix of the ELM-AE;

3-2. Randomly generate the hidden-layer input weight matrix $a = [a_1, \dots, a_L]$ and the bias vector $b = [b_1, \dots, b_L]$, both orthogonalized, and map the input data to a space of the same or a different dimension: $h = g(a \cdot x + b)$, $a^T a = I$, $b^T b = 1$, where $g(\cdot)$ denotes the activation function;

3-3. Solve for the output weight matrix $\beta$ of the ELM-AE;

Let $d$ be the number of neurons in the input and output layers and $L$ the number of neurons in the hidden layer;

If $d < L$ or $d > L$, i.e., for sparse or compressed feature representations,

If $d = L$, i.e., for an equal-dimension feature mapping, $\beta = H^{-1} X$, $\beta^T \beta = I$,

where $H = [h_1, \dots, h_N]$ denotes the hidden-layer output matrix of the ELM-AE;

3-4. Input the preprocessed facial expression image into the trained ELM-AE; the resulting hidden-layer output vector $H$ is the texture feature vector of the whole face image;

Step 4. Build the facial expression classifier;

4-1. Given the training samples $\{(x_i, t_i) \mid x_i \in \mathbb{R}^n,\ t_i \in \mathbb{R}^m,\ i = 1, 2, \dots, N\}$, the hidden-layer output function $g(w, b, x)$, the number of hidden nodes $L$, and a test sample $y$;

4-2. Randomly generate the hidden node parameters $(w_i, b_i)$, $i = 1, 2, \dots, L$;

4-3. Compute the hidden node output matrix

$$H(w_1, \dots, w_L, x_1, \dots, x_N, b_1, \dots, b_L) = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_L \cdot x_N + b_L) \end{bmatrix}$$

and ensure that $H$ has full column rank, where $w$ is the input weight connecting the hidden nodes and the input neurons, $x$ is the training sample input, $N$ is the number of training samples, $b_i$ is the bias of the $i$-th hidden node, and $g(\cdot)$ denotes the activation function;

4-4. Compute the optimal output weights $\beta$: $\beta = H^{+} T$;

4-5. Compute the output for the test sample $y$, with the hidden-layer output evaluated on $y$: $o = H(w_1, \dots, w_L, y, b_1, \dots, b_L)\, \beta$;

4-6. Classify the expression of the test sample: the class corresponding to the maximum value in the ELM output vector $o$ is the emotion of the face, i.e.,

$$\text{label}(y) = \arg\max_{1 \le i \le m} o_i.$$