CN107527018A

CN107527018A - Momentum Face Detection Method Based on BP Neural Network

Info

Publication number: CN107527018A
Application number: CN201710617052.7A
Authority: CN
Inventors: 蒋林华; 蒋云良; 曹书慧; 林晓; 胡文军; 龙伟
Original assignee: Huzhou University
Current assignee: Huzhou University
Priority date: 2017-07-26
Filing date: 2017-07-26
Publication date: 2017-12-29

Abstract

A momentum face detection method based on a BP neural network, which combines Gabor features with a momentum-factor back-propagation algorithm. The Gabor features of the training set are first extracted and fed into a momentum-factor back-propagation neural network for training. The trained system is then used to detect whether a face is present in an input image; if so, the face is marked with a rectangle. To improve the training of the conventional back-propagation algorithm, a momentum factor is added to it, which effectively damps the oscillation of the neural network during training and helps the algorithm avoid getting trapped in a local minimum. In addition, the added momentum factor can adaptively adjust the weights of each layer of the back-propagation neural network. Extensive experimental results indicate that, compared with classical or state-of-the-art face detection models, the proposed scheme is effective and competitive.

Description

Momentum Face Detection Method Based on BP Neural Network

Technical Field

The invention relates to the field of image data processing, and in particular to a momentum face detection method based on a BP neural network.

Background Art

Face detection is a worthwhile research topic in computer vision. Over many years, researchers have invested a great deal of effort in the field. The essence of face detection is to mark the faces present in an image with rectangles. As applications of face detection have multiplied, it has gradually developed into an independent research topic that attracts researchers' attention.

Generally, face detection falls into two categories. The first is face detection on static images (grayscale or color), where a single face or multiple faces may be detected depending on how many faces the image contains. The second is face detection on dynamic images, also known as target tracking. The present work addresses face detection in color images. The process of face detection is in effect a comprehensive judgment on facial pattern features. An input image may contain a large number of pattern features, which can be divided by color attribute into two classes: skin-color features and grayscale features.

Neural networks were applied to face detection as early as 1994. Later work used convolutional neural networks to train classifiers, retinally connected neural networks to improve frontal face detection, methods for detecting faces that appear at an offset angle in the image, and, most recently, cascaded convolutional neural networks. In face detection and recognition, these networks all suffer to some degree from long convergence times, and training tends to oscillate and can leave the algorithm stuck in a local minimum. The end result is an unstable neural network algorithm.

Applying the Gabor transform to neural networks for face recognition is already covered by the prior art. The Gabor transform is a windowed Fourier transform. Gabor functions can extract relevant features at different scales and orientations in the frequency domain, and a Gabor kernel extracts image features at a specific frequency.

Examples include Chinese patent No. CN201210057616, a neural network based on local-feature Gabor wavelets; the patent "Face Recognition Method Based on Gabor Wavelet Transform and Local Binary Pattern Optimization" filed by Shanghai University, publication No. CN102024141A, which fuses the Gabor wavelet transform with optimized local binary patterns; and the patent "Face Recognition Method and Device Fusing Face Part Features and Gabor Face Features" filed by Tsinghua University, publication No. CN101276421, which fuses the Gabor wavelet transform with face part features. These prior-art approaches not only require a relatively large amount of computation but also take a long time to converge.

Summary of the Invention

The technical problem to be solved by the invention is that neural networks used for face detection take a relatively long time to converge and are prone to oscillation during training.

To solve this technical problem, the invention proposes a momentum face detection method based on a BP neural network.

Step 1: Extract the Gabor features of the images in the training set;

Step 2: Input them into the momentum-factor back-propagation neural network for training;

Step 3: Use the trained system to detect whether a face is present in the input image and, if so, mark it with a rectangle.

Preferably, the Gabor feature extraction in step 1 uses Gabor kernels at five scales and eight orientations to extract the Gabor features of the image: the input image is convolved with the 5*8 Gabor kernels, producing 40 image features at different scales and frequencies.

Beneficial effects of the invention:

1. To improve the training of the conventional back-propagation algorithm, a momentum factor is added to it, which effectively damps the oscillation of the neural network during training and helps the algorithm avoid getting trapped in a local minimum. In addition, the added momentum factor can adaptively adjust the weights of each layer of the back-propagation neural network. Extensive experimental results indicate that the proposed scheme is effective compared with classical or state-of-the-art face detection models.

2. For Gabor feature extraction, convolution with 5*8 Gabor kernels extracts features effectively from complex images, especially images with color texture or images with low color contrast and indistinct feature values, which speeds up the subsequent training of the BP neural network.

Brief Description of the Drawings

Figure 1: The process of the BP back-propagation algorithm in the method of the invention.

Figure 2: Results of the method of the invention on single-face images.

Figure 3: Results of the method of the invention on multi-face images.

Figure 4: Comparison of the average time consumed by the method of the invention and by BP.

Figure 5: Schematic diagram of the 40 Gabor filters.

Figure 6: Demonstration of the image features extracted from a portrait by the 40 Gabor filters of the method of the invention.

Detailed Description

The invention first selects Gabor kernels at five scales and eight orientations to extract the Gabor features of the input image. The input image is convolved with the 5*8 Gabor kernels shown in Figure 5, producing 40 image features at different scales and frequencies; the kernels can also be called 40 Gabor filters. In the figure, the rows represent the eight orientations and the columns represent the five scales. The procedure is as follows:

Given an input image as the input signal f_in, it is first transformed into the frequency domain by the Fourier transform.

(x, y) denote coordinates in the spatial domain. The transform of the spatial signal is then multiplied by the Fourier transform of the Gabor kernel, which yields the image filtered by the Gabor filter.

By the convolution theorem, formula (2) is as follows,

where the Gabor kernel is convolved with the input signal to obtain the response of the input signal in a given neighborhood.
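The frequency-domain filtering described above can be illustrated with a minimal sketch. It is not the patented implementation: it uses a naive discrete Fourier transform written with the Python standard library in place of an FFT library, and the 4*4 image and 2*2 kernel values are hypothetical. It checks that multiplying the two spectra and transforming back gives the same result as a circular convolution computed directly in the spatial domain.

```python
import cmath

def dft2(a, inverse=False):
    """Naive 2-D discrete Fourier transform of a list of lists (demo only)."""
    rows, cols = len(a), len(a[0])
    sign = 1 if inverse else -1
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for x in range(rows):
                for y in range(cols):
                    angle = 2 * cmath.pi * (u * x / rows + v * y / cols)
                    total += a[x][y] * cmath.exp(sign * 1j * angle)
            # The inverse transform carries the 1/(rows*cols) normalisation.
            out[u][v] = total / (rows * cols) if inverse else total
    return out

def filter_via_dft(image, kernel):
    """Filter `image` with `kernel` by multiplying their spectra."""
    rows, cols = len(image), len(image[0])
    # Zero-pad the kernel to the image size so the spectra can be multiplied.
    padded = [[0j] * cols for _ in range(rows)]
    for a, row in enumerate(kernel):
        for b, value in enumerate(row):
            padded[a][b] = value
    F, G = dft2(image), dft2(padded)
    product = [[F[u][v] * G[u][v] for v in range(cols)] for u in range(rows)]
    return dft2(product, inverse=True)

def circular_convolution(image, kernel):
    """Reference: the same filtering computed directly in the spatial domain."""
    rows, cols = len(image), len(image[0])
    out = [[0j] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            for a, row in enumerate(kernel):
                for b, value in enumerate(row):
                    out[x][y] += value * image[(x - a) % rows][(y - b) % cols]
    return out

# Hypothetical data: a tiny "image" and a small complex kernel.
image = [[1, 2, 0, 1], [0, 1, 3, 2], [2, 0, 1, 1], [1, 1, 0, 2]]
kernel = [[0.5, 0.25j], [-0.25, 0.1]]
via_dft = filter_via_dft(image, kernel)
direct = circular_convolution(image, kernel)
max_diff = max(abs(via_dft[x][y] - direct[x][y]) for x in range(4) for y in range(4))
```

Here `max_diff` is at the level of floating-point rounding. For images of realistic size, an FFT routine would replace `dft2`, since the naive transform is far too slow.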

Figure 6 shows an application of formula (2). 6-(a) is the original input image. 6-(b) shows the responses of the original image 6-(a) to the 5*8 Gabor kernels of different frequencies and scales (that is, the 40 Gabor filters); these responses are extracted as the image features.

The two-dimensional complex wave of formula (3) is multiplied by the two-dimensional Gaussian function of formula (4) to obtain the Gabor kernel of formula (5).

s(x,y) = exp(i(2π(u₀x + v₀y) + p)) (3)

Here the initial phase p has little effect on the Gabor kernel and can be omitted.

δ_x and δ_y control the spread of the Gaussian function in the x and y directions, respectively.

(x₀, y₀) is the center of the Gaussian kernel, θ is its rotation angle, (δ_x, δ_y) are its scales in the x and y directions, (u₀, v₀) are the frequency-domain coordinates, and K is the amplitude scale of the Gaussian kernel.
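The construction of a Gabor kernel as a complex wave multiplied by a Gaussian can be sketched as follows. Formulas (4) and (5) are not reproduced in this text, so the Gaussian envelope K·exp(-(x_r²/(2δ_x²) + y_r²/(2δ_y²))) with rotated coordinates is an assumed common form, and the frequency and spread schedule used to build the 5*8 bank is purely hypothetical. The function names `gabor_kernel` and `gabor_bank` are illustrative.

```python
import cmath
import math

def gabor_kernel(size, u0, v0, delta_x, delta_y, theta, K=1.0, p=0.0):
    """Sample a complex Gabor kernel on a size x size grid (size odd),
    with the Gaussian centred at (x0, y0) = (0, 0)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by theta for the Gaussian envelope.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Assumed envelope form; delta_x and delta_y set its spread.
            envelope = K * math.exp(-(xr ** 2 / (2 * delta_x ** 2)
                                      + yr ** 2 / (2 * delta_y ** 2)))
            # Complex plane wave of formula (3) with initial phase p.
            wave = cmath.exp(1j * (2 * math.pi * (u0 * x + v0 * y) + p))
            row.append(envelope * wave)
        kernel.append(row)
    return kernel

def gabor_bank(size=9, n_scales=5, n_orientations=8):
    """Build 5 x 8 = 40 kernels. The schedule below is an assumption."""
    bank = []
    for s in range(n_scales):
        freq = 0.25 / (math.sqrt(2) ** s)   # hypothetical: lower frequency per scale
        spread = 1.0 / (2.0 * freq)         # hypothetical: wider envelope at lower frequency
        for o in range(n_orientations):
            theta = o * math.pi / n_orientations
            u0, v0 = freq * math.cos(theta), freq * math.sin(theta)
            bank.append(gabor_kernel(size, u0, v0, spread, spread, theta))
    return bank

bank = gabor_bank()
```

Each of the 40 kernels in `bank` would then be convolved with the input image, and the responses collected as the feature vector fed to the network.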

The widely used BP neural network is the basis of the invention. To overcome the shortcomings of the BP algorithm, a momentum term is used to speed up convergence and to keep the algorithm from getting trapped in a local minimum. Figure 1 outlines the overall steps of the neural network algorithm in the proposed face detection system and briefly shows the workflow of the network. The momentum-factor back-propagation neural network algorithm is described in detail next.

The extracted image Gabor features 1 are used as the input of the neural network, which has a fully connected structure. As shown in Figure 1, one layer is the input layer 2 and another is the hidden layer 3.

The transfer function of the hidden-layer neurons is given by formula (6).

In this function, net is the input data from the input layer. Each input is denoted x_i; given n inputs, with 1 ≤ i ≤ n, net is given by formula (7):

net = x₁w₁ + x₂w₂ + … + x_n w_n (7)

where w_i is the initial weight of the neural network. In addition, the numbers of nodes in the input layer and the output layer are known, and the number of nodes h in the hidden layer is determined by formula (8).

where m and n denote the numbers of nodes in the input layer and the output layer, respectively, and A is an adjustable constant between 1 and 10.
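The quantities just described can be sketched in a few lines. Formulas (6) and (8) are not reproduced in this text, so two assumptions are made: the transfer function is taken to be the logistic sigmoid (consistent with the y(1-y) factor in formula (11)), and the hidden-layer size is taken to follow the common empirical rule h = sqrt(m + n) + A. The numeric inputs and weights are hypothetical.

```python
import math

def net_input(x, w):
    """Formula (7): weighted sum of the inputs."""
    return sum(xi * wi for xi, wi in zip(x, w))

def sigmoid(net):
    """Assumed form of formula (6): the logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-net))

def hidden_nodes(m, n, A):
    """Assumed form of formula (8): h = sqrt(m + n) + A, rounded to an integer."""
    if not 1 <= A <= 10:
        raise ValueError("A is expected to lie between 1 and 10")
    return round(math.sqrt(m + n) + A)

# Hypothetical inputs and initial weights for a single neuron.
x = [0.5, -1.0, 2.0]
w = [0.4, 0.3, 0.1]
net = net_input(x, w)   # 0.5*0.4 + (-1.0)*0.3 + 2.0*0.1
y = sigmoid(net)        # neuron output, strictly between 0 and 1
h = hidden_nodes(m=16, n=9, A=3)
```

Because A is adjustable, `hidden_nodes` gives a starting range for the hidden-layer size rather than a single prescribed value.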

In Figure 1, a momentum factor is attached to the output value 4 of the BP neural network as a feedback parameter. Adding this feedback parameter when the error signal is propagated back improves the training performance of the conventional BP neural network.

Forward propagation is computed by formula (9).

where x_j = f(S_j) and net = S_j.

The core idea of BP is to propagate the output error backward, layer by layer, through the hidden layer to the input layer. The initial weights w, generated by random initialization, are used in the forward pass of the back-propagation network. Compared with the desired output, however, the actual output may show a large error. To adjust w iteratively, the error function of formula (10) is used, where y_j denotes the actual output.

During training, the input training set is processed in the following steps.

1. Working backward from the output layer, compute the error term of each unit in each layer, formula (11).

error_j = y_j(1-y_j)(d_j-y_j) (11)

2. Compute the error of each hidden-layer node h from the error terms of the nodes k that it feeds, formula (12).

error_h = y_h(1-y_h) Σ_k w_hk·error_k (12)

3. Update each weight, formula (13).

w_ik = w_ik + μ·error_k·x_ik (13)

where Δw_ik = μ·error_k·x_ik is the weight update rule, x_ik is the input value, and w_ik is the weight between node i and node k.
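The three training steps above can be sketched for a very small network. This is an illustration under assumptions, not the patented implementation: the transfer function is assumed to be the logistic sigmoid, bias terms are omitted, the hidden-layer error is computed in the standard back-propagation form (each hidden node sums the error terms of the output nodes it feeds, weighted by the connecting weights), and the initial weights, sample and learning rate are hypothetical.

```python
import math

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def forward(x, w_hidden, w_out):
    """Forward pass: returns hidden outputs y_h and output-layer outputs y_j."""
    y_h = [sigmoid(sum(xi * w for xi, w in zip(x, ws))) for ws in w_hidden]
    y_j = [sigmoid(sum(h * w for h, w in zip(y_h, ws))) for ws in w_out]
    return y_h, y_j

def train_step(x, d, w_hidden, w_out, mu=0.5):
    """One update following steps 1-3; modifies the weight lists in place."""
    y_h, y_j = forward(x, w_hidden, w_out)
    # Step 1, formula (11): error term of each output unit.
    err_out = [y * (1 - y) * (dj - y) for y, dj in zip(y_j, d)]
    # Step 2, formula (12): each hidden unit collects the output errors
    # through the weights that connect it to the output layer.
    err_hid = [
        y_h[h] * (1 - y_h[h])
        * sum(w_out[k][h] * err_out[k] for k in range(len(w_out)))
        for h in range(len(y_h))
    ]
    # Step 3, formula (13): w_ik = w_ik + mu * error_k * x_ik.
    for k, ws in enumerate(w_out):
        for h in range(len(ws)):
            ws[h] += mu * err_out[k] * y_h[h]
    for h, ws in enumerate(w_hidden):
        for i in range(len(ws)):
            ws[i] += mu * err_hid[h] * x[i]

def squared_error(x, d, w_hidden, w_out):
    """Half the summed squared difference between desired and actual output."""
    _, y_j = forward(x, w_hidden, w_out)
    return 0.5 * sum((dj - y) ** 2 for dj, y in zip(d, y_j))

# Hypothetical 2-input, 2-hidden, 1-output network and one training sample.
w_hidden = [[0.1, -0.2], [0.3, 0.4]]
w_out = [[0.2, -0.1]]
x, d = [1.0, 0.5], [1.0]
before = squared_error(x, d, w_hidden, w_out)
for _ in range(50):
    train_step(x, d, w_hidden, w_out)
after = squared_error(x, d, w_hidden, w_out)
```

After the 50 updates the error on the sample is lower than at the start, which is the behavior the update rule is meant to produce.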

These are the basic mathematical ideas of the neural network. It can be trained on positive and negative examples, but training takes a relatively long time. The improved formula (14) for the neural network is as follows.

Δw_ik(n) = μδ_k x_ik + αΔw_ik(n-1) (14)

where 0 ≤ α < 1 is the momentum term and δ_k denotes the error term error_k used in formula (13). α is a global parameter that can be determined by experiment and from the output values of the output layer. When the gradient keeps the same direction, the momentum term enlarges the effective step size and steers the iteration path toward the minimum. When α is large (close to 1), it is usually necessary to reduce the learning rate μ. If the gradient direction keeps changing, the momentum term smooths the changes in the iteration path.

This matters mainly when the network is hard to train, that is, when the error surface has different curvature in different directions and therefore forms long, narrow valleys of various sizes. At most points in such a valley the gradient does not point toward the minimum, so successive steps jump from one side of the valley to the other, the iteration path oscillates, and convergence to the minimum slows considerably. Adding the momentum term effectively damps the oscillation and greatly speeds up convergence.

In formula (14), the weight update of the n-th iteration depends on the weight update of the (n-1)-th iteration. Adding the momentum term enlarges the effective search step to some extent, which lets the algorithm converge faster. In addition, because the loss function of a multilayer network can lead the algorithm to converge to a local minimum, the momentum term can carry the search past some local minima and so help the algorithm avoid getting trapped in them.
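The effect of the momentum term in formula (14) can be illustrated on a toy problem. The sketch below is not the patented network: it minimizes a hypothetical two-parameter quadratic error surface E(w) = 0.5·(w₁² + 50·w₂²), which has the long, narrow valley shape described above, and compares the plain update (α = 0) with the momentum update (α = 0.9) at the same learning rate. The values of μ, α and the step count are assumptions chosen for the demonstration.

```python
def minimise(mu, alpha, steps=300):
    """Minimise E(w) = 0.5 * (w1^2 + 50 * w2^2) and return the final error."""
    w = [1.0, 1.0]
    delta_w = [0.0, 0.0]        # previous update, i.e. delta_w(n-1)
    curvature = [1.0, 50.0]     # very different curvature per direction
    for _ in range(steps):
        for i in range(2):
            gradient = curvature[i] * w[i]
            # Formula (14): the new update is the gradient step plus a
            # fraction alpha of the previous update.
            delta_w[i] = -mu * gradient + alpha * delta_w[i]
            w[i] += delta_w[i]
    return 0.5 * (w[0] ** 2 + 50.0 * w[1] ** 2)

plain_bp = minimise(mu=0.035, alpha=0.0)        # alpha = 0 gives the plain update
with_momentum = minimise(mu=0.035, alpha=0.9)   # momentum update of formula (14)
```

With these settings the momentum run ends with a smaller error than the plain run, because the plain update crawls along the flat direction of the valley while the accumulated updates in the momentum run build up speed along it.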

Experimental Verification:

The Face Detection Data Set and the CMU and Harvard face database, which contains about 5000 frontal faces, were selected as the face library of the training set. The faces in the CMU and Harvard images show some offset angles against simple backgrounds. In addition, about 2500 non-face images were obtained by bootstrapping from 250 scenery images that contain no faces, and these serve as the non-face library of the training set.

Figures 2 and 3 show some result images obtained by running momentum face detection on the Partheenpan test data set. In 2-(a) of Figure 2, frontal face images were chosen as input. In 3-(a) of Figure 3, each input image contains multiple faces. In 2-(b) of Figure 2 and 3-(b) of Figure 3, the detected faces are marked with rectangles in the output images.

The trained momentum face detection system was tested on several common face databases:

1. Partheenpan Data Set (Partheenpan),

2. The Annotated Faces in the Wild (AFW),

3. The Face Detection Data Set & Benchmark (FDD-B),

4. CMU Data Set (CMU).

Figure 4 compares the average detection time of the method of the invention with that of the classical BP neural network algorithm on the four data sets above.

Because each image in the AFW and CMU data sets tends to contain multiple faces and a more complex background, the performance improvement in detection is more pronounced on these two data sets.

Table 1 below gives the detection rates and false detections on the four face databases. Owing to limitations such as the variety of image sizes and image sharpness, the method of the invention has a certain false detection rate, but its results remain among the leading ones in the prior art.

Table 1: Detection rates and false detections on the test sets of the four face databases

For example, on the AFW test set, detection rate = (451-65)/451 = 85.6% and false detection rate = 39/(39+451) ≈ 7.96%.
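The arithmetic in the AFW example can be reproduced with a short sketch. The reading of the three numbers is inferred from the formulas themselves and is an assumption: 451 is taken as the number of faces in the test set, 65 as the number of faces missed, and 39 as the number of false detections. The function names are illustrative.

```python
def detection_rate(total_faces, missed_faces):
    """Share of the faces in the test set that were found."""
    return (total_faces - missed_faces) / total_faces

def false_detection_rate(false_detections, total_faces):
    """False detections relative to false detections plus faces."""
    return false_detections / (false_detections + total_faces)

# AFW figures quoted in the text; their interpretation is assumed.
afw_detection = detection_rate(total_faces=451, missed_faces=65)
afw_false = false_detection_rate(false_detections=39, total_faces=451)
print(f"detection rate: {afw_detection:.1%}")
print(f"false detection rate: {afw_false:.2%}")
```

The same two functions can be applied to the counts of the other three databases in Table 1 to check the reported rates.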

The basic principles, main features, and advantages of the invention have been shown and described above. Those skilled in the art should understand that the invention is not limited by the embodiments above. The embodiments and the description only illustrate the principles of the invention, and various changes and improvements may be made without departing from its spirit and scope; all such changes and improvements fall within the scope of the claimed invention. The scope of protection is defined by the appended claims and their equivalents.

Claims

1. A momentum face detection method based on a BP neural network, characterized by comprising:

Step 1: extracting the Gabor features of the images in the training set;

Step 2: inputting them into a momentum-factor back-propagation neural network for training;

Step 3: using the trained system to detect whether a face is present in an input image and, if so, marking it with a rectangle.

2. The method according to claim 1, characterized in that the Gabor feature extraction in said step 1 selects Gabor kernels at five scales and eight orientations to extract the Gabor features of the image, the input image being convolved with the 5*8 Gabor kernels to generate 40 image features at different scales and frequencies.

3. The method according to claim 1, characterized in that the momentum factor in said step 2 is given by Δw_ik(n) = μδ_k x_ik + αΔw_ik(n-1), where 0 ≤ α < 1 is the momentum term; in this formula, the weight of the n-th iteration depends on the weight of the (n-1)-th iteration.