CN104834922B - Gesture identification method based on hybrid neural networks - Google Patents
- Publication number
- CN104834922B CN104834922B CN201510280013.3A CN201510280013A CN104834922B CN 104834922 B CN104834922 B CN 104834922B CN 201510280013 A CN201510280013 A CN 201510280013A CN 104834922 B CN104834922 B CN 104834922B
- Authority
- CN
- China
- Prior art keywords
- gesture
- point
- neural network
- pixel
- Legal status: Expired - Fee Related (the legal status is an assumption and is not a legal conclusion)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/107—Static hand or arm
Abstract
The invention discloses a gesture recognition method based on a hybrid neural network. For a gesture image to be recognized and for the gesture-image training samples, a pulse-coupled neural network first detects the noise points, which are then processed by a composite denoising algorithm. A cellular neural network then extracts the edge points of the gesture image; connected regions are derived from these edge points, and curvature-based fingertip detection on each connected region yields candidate fingertip points. After interference from the face is excluded, the gesture region is obtained and segmented according to the shape of the hand. Fourier descriptors that retain phase information are computed from the contour points of the segmented gesture region, and the first several descriptors are selected as gesture features. A BP neural network is trained with the gesture features of the training samples, and the gesture features of the image to be recognized are input to the BP neural network for recognition. By combining several kinds of neural networks, the invention improves the accuracy of gesture recognition.
Description
Technical Field
The invention belongs to the technical field of gesture recognition, and more specifically relates to a gesture recognition method based on a hybrid neural network.
Background Art
With the rapid advance of computer technology, human-computer interaction has become more and more common in everyday life. Human-Computer Interaction (HCI) technology refers to the process by which a user and a computer interact through some mode of operation. Its development has roughly passed through a purely manual stage, a command-language stage and a user-interface stage; with the continuous progress of artificial intelligence and related technologies in recent years, the development of HCI has attracted growing attention.
As computer applications keep expanding, existing HCI methods can no longer satisfy the higher-level demands of daily use, and a more concise and friendly mode of human-computer interaction is urgently needed. The ultimate goal of HCI is natural communication between humans and machines. In daily life most information between people is conveyed through body language or facial expressions, and only a small part through natural language, which indicates that body language has a great advantage in expressing human emotion and intent. Since the hand plays an extremely important role in body language, interaction based on hand gestures, i.e. the gesture recognition system, has attracted wide attention.
In general, a gesture recognition system consists of the following parts: gesture preprocessing, gesture segmentation, gesture modeling, gesture feature extraction and gesture recognition. Gesture preprocessing is mainly denoising of the gesture image. Common denoising algorithms include mean filtering, median filtering, spatial low-pass filtering, frequency-domain low-pass filtering and pulse-coupled neural networks; however, when several kinds of noise are present at the same time, none of the current algorithms achieves a good denoising result, so designing a good denoising algorithm is essential for the later recognition stages. For gesture segmentation, the commonly used methods are based on skin-color information, on motion information or on edge information. Segmentation based on skin color is easily disturbed by background information, and segmentation based on edge information does not achieve good results either, so designing a good and effective segmentation algorithm is also crucial. For gesture feature extraction, the most widely used approach is based on Fourier descriptors, but because of their rotation invariance the features change little after the gesture is rotated, so designing a Fourier descriptor that is not rotation invariant is also crucial. For gesture recognition itself, common methods include template matching, support vector machines, neural networks and hidden Markov models, so choosing a good recognition method is equally important for a gesture recognition system.
The neural network approach uses simple processing units to imitate the neurons of the human brain and connects these units into a network to simulate the brain. Neural network methods typically offer parallel computation, distributed storage, robustness, nonlinear processing, and good adaptability and fault tolerance, so they can be applied in many scenarios, for example gesture recognition, image segmentation and noise processing.
At present, neural network methods are being applied more and more in the field of gesture recognition. However, their application has been limited to the recognition stage itself; they are rarely applied at the other stages of gesture recognition.
Summary of the Invention
The object of the present invention is to overcome the deficiencies of the prior art and provide a gesture recognition method based on a hybrid neural network: a pulse-coupled neural network improves the denoising of the gesture image, a cellular neural network performs gesture segmentation, rotation-variant Fourier descriptors serve as gesture features, and a BP neural network performs gesture recognition, thereby improving the accuracy of gesture recognition.
To achieve the above object, the gesture recognition method based on a hybrid neural network of the present invention comprises the following steps:
S1: Extract the features of the gesture image to be recognized and of the gesture-image training samples. The specific steps are:
S1.1: Build a pulse-coupled neural network model for the gesture grayscale image. Take the gray value of each pixel of the current gesture grayscale image as the input of the corresponding neuron in the pulse-coupled neural network, and use the firing behavior of the network to examine each pixel of the gesture image. If a pixel's output state is the firing state, set the element corresponding to that pixel in the detection-result matrix to 1, otherwise to 0. Traverse every element of the detection-result matrix; if an element equals 1, take it as the center of a denoising window, whose size is set according to the actual situation, and count the values of the other elements in the window. If the number of elements equal to 0 exceeds a preset threshold, the center point is a noise point; otherwise it is not.
Compute the two noise estimates H(i,j) and V(i,j) of each noise point as follows:

H(i,j) = |a(i,j) − b(i,j)|

V(i,j) = (|a(i,j) − m1(i,j)| + |a(i,j) − m2(i,j)|) / 2

where a(i,j) is the gray value at pixel (i,j) of the image, b(i,j) is the median output gray value of that pixel after median filtering, and m1(i,j) and m2(i,j) are the two points in the neighborhood of pixel (i,j) whose gray values are closest to a(i,j).

If H(i,j) ≥ T1 and V(i,j) ≥ T2, process the noise point with the median filter; otherwise process it with the mean filter.
S1.2: Perform histogram equalization on the gesture grayscale image denoised in step S1.1.
S1.3: Build a cellular neural network model for the gesture grayscale image. Take the gray value of each pixel (i,j) of the equalized image as the input u_ij of the corresponding cell, and iterate the state-transition equation until the whole network converges, obtaining the output y_ij(t) of every cell. Then traverse the output value of the cell corresponding to each pixel: when a pixel's output value lies in [0,1], the pixel is not an edge pixel if the sum of the pixel values of the other pixels in its neighborhood is greater than a preset threshold, and is an edge pixel otherwise; when the output value lies in [−1,0), the pixel is not an edge pixel.
S1.4: Obtain connected regions from the edge pixels found in step S1.3, extract the contour of each connected region, and perform fingertip detection on each connected region as follows:
Traverse every contour pixel of the connected region, taking each pixel as a reference point with coordinates p(p_x, p_y, 0). With a preset distance constant L, take the L-th point p1(p_1x, p_1y, 0) ahead of p along the contour and the L-th point p2(p_2x, p_2y, 0) behind p, and compute the cosine cosα of the angle between the vector from p to p1 and the vector from p to p2. If cosα is greater than a preset curvature threshold T, the point is judged to be a candidate fingertip point; otherwise it is not.
Determine the sign of the fingertip-position vector product according to the traversal direction: when the overall contour of the gesture region is traversed clockwise, the sign of the vector product should be negative, otherwise positive. For each candidate fingertip point, compute the vector product of the vector from p to p1 with the vector from p to p2; if the sign of this vector product equals the sign corresponding to a fingertip position, keep the point as a candidate fingertip point, otherwise discard it.
Among all candidate fingertip points detected in a connected region, judge whether the difference between the largest y coordinate and the smallest y coordinate exceeds half of the face height; if it does, the connected region is not a gesture region, otherwise it is treated as a pending gesture region. Then judge whether the number of candidate fingertip points in each pending gesture region exceeds a preset count threshold; if it does, the connected region is a gesture region, otherwise it is not.
Find the main direction of the gesture region and, following this direction, segment the gesture region using a gesture length-to-width ratio of 2, obtaining the segmented gesture region.
S1.5: For the gesture region obtained after the segmentation of step S1.4, express the coordinates of its contour points as complex numbers so that all contour points form a discrete sequence, and let n be the number of contour points. Apply the Fourier transform to this sequence to obtain n Fourier coefficients z(k), k = 0, 1, …, n−1, and compute the Fourier descriptors

S[k′] = (‖z(k′)‖ / ‖z(1)‖)·e^(jφ)

where k′ = 1, 2, …, n−1 and φ denotes the angle between the main direction of the gesture region and the x-axis. Select the first Q Fourier descriptors to form the feature vector.
S2: Input the feature vectors of the training-sample gesture images into a BP neural network as training samples, with the corresponding gesture-image categories as the network outputs, and train the BP neural network.
S3: Input the feature vector of the gesture image to be recognized into the BP neural network trained in step S2, and output the recognized gesture-image category.
In the gesture recognition method based on a hybrid neural network of the present invention, for the gesture image to be recognized and for the gesture-image training samples, a pulse-coupled neural network first distinguishes noise points from edge points, and a composite denoising algorithm then processes the noise points. A cellular neural network extracts the edge points of the gesture image; connected regions are derived from these edge points, and curvature-based fingertip detection on each connected region yields candidate fingertip points. Interference from the face is then excluded to obtain the gesture region, which is segmented according to the shape of the hand to yield the segmented gesture region. Fourier descriptors that retain phase information are computed from the contour points of the gesture region, and the first several descriptors are selected as gesture features. A BP neural network is trained with the gesture features of the training samples, and the gesture features of the image to be recognized are input to the BP neural network for recognition.
The present invention has the following beneficial effects:
(1) The pulse-coupled neural network distinguishes noise points from edge points, and the composite denoising algorithm then denoises the gesture image, which improves the denoising effect;
(2) Gesture segmentation combines the coarse segmentation of the cellular neural network with fine segmentation based on the shape of the hand, which improves segmentation accuracy;
(3) The gesture features use Fourier descriptors that retain phase information, which improves the recognition rate.
Brief Description of the Drawings
Figure 1 is a flowchart of the gesture recognition method based on a hybrid neural network of the present invention;
Figure 2 is a flowchart of gesture-image feature extraction in the present invention;
Figure 3 is a flowchart of fine gesture segmentation combined with gesture shape characteristics;
Figure 4 is a schematic diagram of fingertip detection in the present invention;
Figure 5 is an example of coarse gesture segmentation;
Figure 6 is an example of fine gesture segmentation.
Detailed Description
Specific embodiments of the present invention are described below with reference to the accompanying drawings, so that those skilled in the art can better understand the present invention. Note that in the following description, detailed accounts of known functions and designs are omitted where they would dilute the main content of the present invention.
Embodiment
Figure 1 is a flowchart of the gesture recognition method based on a hybrid neural network of the present invention. As shown in Figure 1, the method comprises the following steps:
S101: Extract the features of the samples to be recognized and of the training samples:
First, features must be extracted from the gesture image to be recognized and from the gesture-image training samples. Figure 2 is a flowchart of gesture-image feature extraction in the present invention. As shown in Figure 2, feature extraction comprises the following steps:
S201: Gesture-image denoising preprocessing:
The present invention denoises the gesture grayscale image with an algorithm that combines a pulse-coupled neural network (PCNN) with a composite denoising algorithm. The PCNN first distinguishes noise points from edge points in the gesture image, and the composite denoising algorithm then denoises each noise point according to its type, so that several kinds of noise are removed while edge information is preserved.
Each neuron of a pulse-coupled neural network consists of three parts: a receiving part, a modulation part and a pulse generator. The PCNN is a common method for image-denoising preprocessing, mainly used to remove salt-and-pepper noise. When applied to image denoising, it can be viewed as a two-dimensional single-layer locally connected network in which the neurons correspond one-to-one to the pixels of the grayscale image to be processed and neighboring neurons are connected to each other. During denoising, the gray value of each pixel serves as the feedback input of its neuron, the output of each neuron serves only as input to its neighboring neurons, and each neuron has only two output states, the firing state and the non-firing state, recorded as 1 and 0 respectively. Since a noise pixel differs strongly from its surrounding pixels, noise points can be identified by combining the firing behavior of the PCNN with the characteristics of the noise itself. The specific procedure is as follows:
Build the PCNN model of the gesture grayscale image and take the gray value of each pixel of the current image as the input of the corresponding neuron; then use the firing behavior of the PCNN to examine every pixel of the image. If a pixel's output state is the firing state, set the corresponding element of the detection-result matrix to 1, otherwise to 0; the detection-result matrix thus has the same size as the image to be processed. Set the size of the denoising window (3×3 in this embodiment) and traverse every element of the detection-result matrix. If an element equals 1, i.e. the firing state, take it as the center of the denoising window and count the values of the other elements in the window (the detection results of the pixels other than the center pixel). If the number of elements equal to 0 (the non-firing state) exceeds a preset threshold, the center point is a noise point; otherwise it is not. This distinguishes noise points from edge points. The count threshold is generally half the number of elements in the denoising window.
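As an illustration of this voting rule, the following is a minimal Python sketch. It assumes a binary firing map has already been produced by a PCNN pass (the PCNN itself is not implemented here); the function name and the default threshold of half the window are illustrative choices, not taken from the patent.

```python
import numpy as np

def detect_noise_points(fire_map, win=3, thresh=None):
    """Window vote on the PCNN firing map: a fired pixel whose window
    contains more than `thresh` non-fired neighbors is marked as noise.
    fire_map: binary matrix (1 = fired, 0 = not fired), image-sized."""
    r = win // 2
    if thresh is None:
        thresh = (win * win) // 2        # roughly half the window, as in the text
    noise = np.zeros(fire_map.shape, dtype=bool)
    H, W = fire_map.shape
    for i in range(r, H - r):            # borders are left unmarked
        for j in range(r, W - r):
            if fire_map[i, j] != 1:
                continue
            window = fire_map[i - r:i + r + 1, j - r:j + r + 1]
            zeros = int((window == 0).sum())   # center is 1, so it never counts
            if zeros > thresh:
                noise[i, j] = True
    return noise
```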
After the noise points have been identified, the composite denoising algorithm performs the corresponding denoising operation. The main procedure is as follows:
Let a(i,j) be the gray value at pixel (i,j) of the image, with 1 ≤ i ≤ M and 1 ≤ j ≤ N, where M is the number of pixels in each row of the gesture grayscale image (i.e. the number of columns) and N is the number of pixels in each column (i.e. the number of rows), and let b(i,j) be the median output gray value of that pixel after median filtering. The difference between the pixel value of the noise point and the median output gray value serves as a first noise estimate, as in formula (1):

H(i,j) = |a(i,j) − b(i,j)| (1)

Because the types of noise differ, this estimate alone cannot distinguish between them. A second noise estimate V(i,j) is therefore introduced: the average of the absolute differences between the pixel value a(i,j) at pixel (i,j) and the two closest points m1(i,j) and m2(i,j), as in formula (2):

V(i,j) = (|a(i,j) − m1(i,j)| + |a(i,j) − m2(i,j)|) / 2 (2)

where m1(i,j) and m2(i,j) are the two points in the neighborhood of pixel (i,j) whose gray values are closest to a(i,j).
Set thresholds T1 and T2. The relationship between the two noise estimates and these thresholds then determines how each kind of noise is handled, as follows:
If H(i,j) ≥ T1 and V(i,j) ≥ T2, the noise point is judged to be salt-and-pepper or impulse noise and is processed with the median filter, i.e. its gray value is replaced by the median-filter output. If H(i,j) < T1, or H(i,j) ≥ T1 and V(i,j) < T2, the noise is judged to be Gaussian and the point is processed with the mean filter, i.e. its gray value is replaced by the mean-filter output.
In this algorithm, the choice of the thresholds T1 and T2 is critical to the quality of the composite denoising result. A commonly used threshold-selection method is the mean absolute deviation (MAD) algorithm, which gives T1 = 3.5·δ_ij, where δ_ij denotes the mean absolute deviation of all pixels within the denoising window of pixel (i,j). The threshold T2 mainly accounts for textures that may appear in the gesture image; based on the MAD algorithm and experimental experience, T2 is usually chosen as an integer between 6 and 10.
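The decision rule and the two noise estimates can be sketched as follows, under stated assumptions: the boolean noise map comes from the PCNN stage above (and is False on image borders), and SciPy's median and uniform filters stand in for the median and mean filters.

```python
import numpy as np
from scipy.ndimage import median_filter, uniform_filter

def composite_denoise(img, noise, T1, T2, win=3):
    """H/V rule: salt-and-pepper or impulse noise -> median filter,
    Gaussian noise -> mean filter. `noise` is the boolean map from the
    PCNN stage (assumed False on image borders)."""
    med = median_filter(img.astype(float), size=win)
    mean = uniform_filter(img.astype(float), size=win)
    out = img.astype(float).copy()
    r = win // 2
    for i, j in zip(*np.nonzero(noise)):
        a = out[i, j]
        H = abs(a - med[i, j])                       # estimate (1)
        nb = np.delete(out[i - r:i + r + 1, j - r:j + r + 1].ravel(),
                       (win * win) // 2)             # neighbors without the center
        m1, m2 = sorted(nb, key=lambda v: abs(v - a))[:2]
        V = (abs(a - m1) + abs(a - m2)) / 2          # estimate (2)
        out[i, j] = med[i, j] if (H >= T1 and V >= T2) else mean[i, j]
    return out
```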
S202: Histogram equalization:
Histogram equalization adjusts the contrast of an image using its histogram, spreading the gray-level histogram of the original image from a concentrated gray range to a distribution over the whole range. The present invention applies histogram equalization to the gesture grayscale image denoised in step S201 in order to enlarge the difference between the gray values of the foreground and background of the gesture image. Histogram equalization is a standard contrast-enhancement method, so its detailed steps are not repeated here.
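For completeness, a minimal NumPy sketch of plain histogram equalization for an 8-bit grayscale image (any library routine, such as OpenCV's equalizeHist, would serve equally well):

```python
import numpy as np

def equalize_hist(img):
    """Histogram equalization of an 8-bit grayscale image (dtype uint8)."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    # map each gray level through the normalized cumulative distribution
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[img]
```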
S203: Coarse gesture segmentation based on the cellular neural network:
As with the pulse-coupled neural network, the neurons of the cellular neural network correspond one-to-one to the pixels of the gesture grayscale image. Denote the cell at row i, column j by C(i,j) (corresponding to pixel (i,j)). Each cell C(i,j) consists of four parts: an input variable u_ij, a state-transition variable x_ij, an output variable y_ij and a threshold I. The cells of the cellular neural network are locally interconnected: C(i,j) is connected only to the cells in its neighborhood N_r(i,j) and has no direct connection to any other cell. The neighborhood N_r(i,j) of cell C(i,j) can be defined as:

N_r(i,j) = { C(k,l) | max(|k−i|, |l−j|) ≤ r } (3)

where r is a positive integer, 1 ≤ i, k ≤ M and 1 ≤ j, l ≤ N, with M the number of pixels in each row and N the number of pixels in each column of the gesture grayscale image. That is, the neighborhood of cell C(i,j) is the square of side length 2r+1 centered on C(i,j).
The main equations of the cellular neural network are:

State-transition process:

x_ij(t+1) = Σ_{C(k,l)∈N_r(i,j)} A(k,l)·y_kl(t) + Σ_{C(k,l)∈N_r(i,j)} B(k,l)·u_kl + I (4)

Output equation:

y_ij(t) = (1/2)·(|x_ij(t) + 1| − |x_ij(t) − 1|) (5)

where 1 ≤ i, k ≤ M and 1 ≤ j, l ≤ N; t is the iteration index; A(k,l) is the feedback weight of cell C(k,l) in the neighborhood N_r(i,j) of cell C(i,j); and B(k,l) is the control weight of cell C(k,l) in that neighborhood, i.e. the elements of template B other than the center element. The values of (k,l) are determined by the definition of the neighborhood N_r(i,j).
Both the feedback template A and the control template B are (2r+1)×(2r+1) matrices, and I is the threshold template of the cellular neural network. The values of A, B and I jointly determine the relationship between the input u_ij, the output y_ij and the state-transition variable x_ij. Correctly designing the feedback template A, the control template B and the threshold I is therefore essential for the cellular neural network model.
The template design method adopted by the present invention combines algebraic analysis with previous template-design experience. The templates A, B and I are generally designed in the following format:

A = [ 0 0 0 ; 0 a 0 ; 0 0 0 ] (6)

B = [ −c −c −c ; −c b −c ; −c −c −c ] (7)

I = −d (8)

where a, b, c and d are all positive constants.
Build the cellular neural network model of the gesture grayscale image and take the gray value of each pixel (i,j) of the equalized image as the input u_ij of the corresponding cell; iterate the state-transition equation until the whole network converges, so that every cell has an output y_ij(t). By the output equation, the output value y_ij(t) lies between 1 and −1: y_ij(t) = 1 represents pure black, and y_ij(t) = −1 represents pure white.
The basic principle for judging whether a pixel is an edge point is as follows. When a pixel value is pure black, i.e. +1: if the sum of the pixel values in its neighborhood is greater than the set threshold parameter, the pixel is not an edge pixel, and its value tends to pure white; conversely, if the sum of the pixel values in its neighborhood is less than the set threshold parameter, the pixel is an edge pixel, and its value tends to pure black. When the pixel value is pure white, i.e. −1, the pixel value tends to pure white regardless of the values of the pixels in its neighborhood.
Following this principle, the method of the present invention for judging whether a pixel is an edge point is: traverse the output value of the cell corresponding to each pixel of the cellular neural network; when a pixel's output value lies in [0,1], the pixel is not an edge pixel if the sum of the pixel values of the other pixels in its neighborhood is greater than the preset threshold, and is an edge pixel otherwise; when the output value lies in [−1,0), the pixel is not an edge pixel. The threshold on the neighborhood pixel-value sum is set according to the actual situation.
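A sketch of the discretized iteration and the generic templates might look as follows; it assumes the input is scaled to [−1, 1], initializes the state to the input (a common choice the patent does not specify), and uses placeholder values for the positive constants a, b, c, d rather than the patent's tuned constants.

```python
import numpy as np
from scipy.ndimage import convolve

def cnn_edges(u, A, B, I, iters=50):
    """Discretized cellular-neural-network iteration (Chua-Yang form):
        x(t+1) = A * y(t) + B * u + I,   y = 0.5 * (|x + 1| - |x - 1|)
    where * is neighborhood convolution with the 3x3 templates.
    u: input image scaled to [-1, 1]; returns the final output map y."""
    x = u.copy()                                  # state initialized to the input
    Bu = convolve(u, B, mode="nearest") + I       # control term, constant over time
    for _ in range(iters):
        y = 0.5 * (np.abs(x + 1) - np.abs(x - 1))
        x = convolve(y, A, mode="nearest") + Bu
    return 0.5 * (np.abs(x + 1) - np.abs(x - 1))

# Generic 3x3 templates in the format of equations (6)-(8); the numbers are
# placeholders for the positive constants a, b, c, d:
a, b, c, d = 2.0, 8.0, 1.0, 0.5
A = np.array([[0, 0, 0], [0, a, 0], [0, 0, 0]], float)
B = np.array([[-c, -c, -c], [-c, b, -c], [-c, -c, -c]], float)
I = -d
```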
S204: Fine gesture segmentation combined with hand-shape characteristics:
Figure 3 is a flowchart of fine gesture segmentation combined with gesture shape characteristics. As shown in Figure 3, the fine segmentation of the present invention comprises the following steps:
S301: Extract connected regions and contours:
From the edge pixels obtained with the cellular neural network, find the connected regions, thereby removing the interference of other background information and keeping only the hand and face regions. In this embodiment the connected regions are found with the two-pass algorithm. The contours of the connected regions are then extracted; this embodiment uses a search-and-mark method: the image containing the extracted connected regions is scanned systematically, and whenever a point of a connected region is encountered, it is taken as the starting point, its edge is traced, and the pixels on the edge are marked. When the traced contour closes completely, scanning resumes from the previous position until new pixel information is found. Other methods for extracting connected regions and contours may be used as required.
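A minimal sketch of this stage, using SciPy's connected-component labeling in place of a hand-written two-pass implementation. The boundary pixels extracted here form an unordered set, so an ordered border-following pass (as in the search-and-mark method above) would still be needed before the curvature computation of step S302.

```python
import numpy as np
from scipy.ndimage import label

def connected_regions(edge_map):
    """Label the 8-connected regions of the binary edge map and return each
    region's boundary pixels (pixels with at least one 4-neighbor outside)."""
    labels, n = label(edge_map, structure=np.ones((3, 3), int))
    contours = []
    for k in range(1, n + 1):
        mask = labels == k
        inner = np.zeros_like(mask)
        inner[1:-1, 1:-1] = (mask[1:-1, 1:-1]
                             & mask[:-2, 1:-1] & mask[2:, 1:-1]
                             & mask[1:-1, :-2] & mask[1:-1, 2:])
        contours.append(np.argwhere(mask & ~inner))   # (row, col) = (y, x) pairs
    return labels, contours
```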
S302: Perform fingertip detection on each connected region:
Fingertip detection is performed on each of the obtained connected regions to judge whether it is a gesture region. In general the fingers are spread apart during gesture recognition, so fingertips can be detected through a curvature computation. Figure 4 is a schematic diagram of fingertip detection in the present invention. As shown in Figure 4, the fingertip detection method is:
Traverse every contour pixel of the connected region, taking each pixel as a reference point with coordinates p(p_x, p_y, 0), where (p_x, p_y) are the two-dimensional coordinates of the reference point in the gesture image. With a preset distance constant L, take the L-th point p1(p_1x, p_1y, 0) ahead of p along the contour, so that p and p1 define a straight line; then take the L-th point p2(p_2x, p_2y, 0) behind p along the contour, so that p and p2 define a second straight line. The two lines form an angle, denoted α. The cosine of the angle between the vector from p to p1 and the vector from p to p2 is taken as the curvature measure:

cosα = ((p1 − p) · (p2 − p)) / (‖p1 − p‖ · ‖p2 − p‖) (9)

If cosα is greater than the preset curvature threshold T, the point is judged to be a candidate fingertip point. The threshold T is set according to the distance constant L: the larger L is, the larger T should be. L itself should be neither too small nor too large; it is generally set to between one quarter and one half of the average finger length.
Interference from the valleys between fingers is removed via the sign of the vector product of the vector from p to p1 with the vector from p to p2. As Figure 4 shows, the sign of this vector product when p lies at a fingertip differs from its sign when p lies in a valley, so the sign can be used to determine the position of p; this is why the coordinates of p, p1 and p2 are written as three-dimensional Cartesian coordinates. The sign at a fingertip depends on the traversal direction: when the overall contour of the gesture region is traversed clockwise, by the right-hand rule the vector product at a fingertip points into the image, i.e. is negative; when the contour is traversed counterclockwise (the traversal direction shown in Figure 4), the vector product at a fingertip points out of the image, i.e. is positive. The sign of the vector product at each candidate fingertip point is therefore compared with the sign corresponding to a fingertip position: if they are the same, the point is kept as a candidate fingertip point; otherwise it is discarded.
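Both tests can be sketched together on an ordered contour; the `clockwise` flag and the small tolerance added to the norms are illustrative choices:

```python
import numpy as np

def fingertip_candidates(contour, L=80, T=0.5, clockwise=False):
    """Curvature test (9) plus vector-product sign test on an ordered contour.
    contour: (n, 2) array of (x, y) points traversed along the boundary."""
    n = len(contour)
    tips = []
    for i in range(n):
        p = contour[i].astype(float)
        v1 = contour[(i - L) % n] - p      # vector p -> p1 (L points back)
        v2 = contour[(i + L) % n] - p      # vector p -> p2 (L points ahead)
        cos_a = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
        if cos_a <= T:
            continue                       # not curved enough for a fingertip
        cross_z = v1[0] * v2[1] - v1[1] * v2[0]   # z component of v1 x v2
        # clockwise traversal -> fingertips give a negative z component
        if (cross_z < 0) == clockwise:
            tips.append(contour[i])        # convex peak, not a finger valley
    return np.asarray(tips)
```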
S303: Determine the gesture region:
After the fingertip points have been detected, they must be further examined to remove interference from parts of the face whose curvature exceeds the threshold because of viewing angle, and thereby determine the gesture region. The present invention uses a two-stage test:
First, judge whether the difference between the largest and the smallest y coordinate of the candidate fingertip points detected in a connected region exceeds half of the face height; if it does, the connected region is not a gesture region, otherwise it is treated as a pending gesture region. Setting this distance to half the face height was determined by experiment; it removes the interference of the face while fully preserving the correct fingertip points.
Then judge whether the number of candidate fingertip points in each pending gesture region exceeds a preset count threshold; if it does, the connected region is a gesture region, otherwise it is not. The number of fingertip points found in an actual gesture region depends on the curvature threshold T, so in practice the count threshold can be obtained by collecting statistics over the experimental results of a number of gesture training samples.
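The two-stage test reduces to a few lines; `face_height` and `min_tips` are assumed to be supplied by a face detector and by the training-sample statistics described above:

```python
import numpy as np

def is_gesture_region(tips, face_height, min_tips):
    """Two-stage test from step S303: reject regions whose candidate tips
    span more than half the face height in y, then require enough tips."""
    if len(tips) == 0:
        return False
    y_span = tips[:, 1].max() - tips[:, 1].min()
    if y_span > face_height / 2:
        return False                 # vertical spread too wide: likely the face
    return len(tips) > min_tips      # enough candidate fingertip points
```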
S304: Gesture-region segmentation:
The above operations remove the interference of the face and other connected regions and yield the gesture region. However, the gesture region may contain not only the palm but also the wrist and other parts. In general, the useful information of a gesture is concentrated in the palm, and the information of the wrist and similar parts can essentially be ignored. To make later feature extraction and tracking efficient and effective, the gesture region must therefore be segmented so that only the fingers and palm are retained.
Based on the shape of the human hand, the present invention segments the gesture using the fact that the ratio of the gesture's length to its width is approximately 2. Before segmentation, the main direction of the gesture region must be known. In this embodiment the main direction is found as follows: compute the centroid of the gesture region, compute the vectors from the centroid to each fingertip point, and average these vectors; the direction of the average vector is the main direction of the gesture region. The gesture is then segmented along this main direction. The segmentation method of this embodiment is: construct the bounding rectangle of the gesture region aligned with the main direction, with the sides parallel to the main direction as the length and the sides perpendicular to it as the width; select the wide side on which the fingertip points lie, and starting from that wide side, cut off along the long sides a rectangle whose length is twice the width. The gesture region contained in this rectangle is the segmented gesture region that retains only the fingers and palm.
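The main-direction computation described above can be sketched as follows (image coordinates, with y increasing downward; the 2:1 rectangle crop that follows is omitted here):

```python
import numpy as np

def main_direction(region_mask, tips):
    """Main direction of the gesture region: angle of the mean vector from
    the region centroid to the fingertip points (step S304)."""
    ys, xs = np.nonzero(region_mask)
    centroid = np.array([xs.mean(), ys.mean()])   # (x, y) centroid
    mean_vec = (np.asarray(tips, float) - centroid).mean(axis=0)
    return np.arctan2(mean_vec[1], mean_vec[0])   # angle w.r.t. the x-axis
```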
S205: Extract gesture features with Fourier descriptors that retain phase information:
For the gesture region segmented in step S204, the present invention designs a Fourier descriptor that retains phase information to extract the gesture features, removing the rotation invariance of the traditional Fourier descriptor so that rotated gestures can be distinguished.
The discrete Fourier coefficients z(k) can be expressed as:

z(k) = Σ_{i=0}^{n−1} p(i)·e^(−j2πik/n) (10)

where p(i) is the i-th element of the discrete sequence, n is the number of elements in the sequence, e is the natural constant, and j is the imaginary unit. In the present invention the transform is applied to the gesture contour, so the discrete sequence p(i) consists of the coordinates of the contour pixels of the gesture region segmented in step S204, written as complex numbers.
The inverse Fourier transform can be expressed as:

p(i) = (1/n)·Σ_{k=0}^{n−1} z(k)·e^(j2πik/n) (11)
According to the basic property of the Fourier transform z(k) = z*(n−k), the high-frequency part of the transform z, from K+1 to n−K−1, is removed, where z* denotes the complex conjugate of z and K ranges over [0, n/2]. Applying the inverse Fourier transform to z with the high-frequency part removed yields a curve that approximates the original contour but is smoother; this curve is called the K-th approximation of the original curve. The subset of Fourier coefficients that is kept, {z(k) : 0 ≤ k ≤ K or n−K ≤ k ≤ n−1}, is the Fourier descriptor used to extract the gesture features.
The Fourier descriptor is related to the scale and orientation of the shape and to the starting position of the curve. Therefore, to make the recognition algorithm invariant to rotation, translation and scale, the Fourier descriptor must be normalized. From the basic properties of the Fourier transform it can be shown that when the contour is represented by Fourier coefficients, the coefficient magnitude ‖z(k)‖ is invariant to rotation, translation and the choice of starting point, where 0 ≤ k ≤ n−1; since z(0) is not translation invariant, the range of k is restricted to [1, n−1]. To obtain scale invariance, the magnitude ‖z(k)‖ of every coefficient other than z(0) is divided by ‖z(1)‖. The normalized Fourier descriptor S[k′] can then be expressed as:

S[k′] = ‖z(k′)‖ / ‖z(1)‖ (12)

where 1 ≤ k′ ≤ n−1 and ‖·‖ denotes the modulus.
A detailed description of the normalized Fourier descriptor can be found in: Song Ruihua. Gesture Recognition Algorithm Based on Fourier Descriptors [D]. Xidian University, 2008.
To remove the rotation invariance of the traditional Fourier descriptor, the present invention retains the phase information after rotation. The normalized form of the improved Fourier descriptor can be expressed as:

S[k′] = (‖z(k′)‖ / ‖z(1)‖)·e^(jφ) (13)

where φ denotes the angle between the main direction of the gesture region and the x-axis, and j is the imaginary unit. This descriptor S[k′] retains the phase information of the gesture rotation, so it is no longer rotation invariant. The present invention therefore uses these coefficients as the features of the gesture region. The features are invariant to translation and scale and independent of the starting position of the contour curve, while remaining variable under rotation, so the feature vector can distinguish rotated gestures. Since the number of contour points differs between gesture regions, in practice only the first Q Fourier descriptors are selected to form the feature vector; the value of Q is determined according to the actual situation.
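A sketch of the descriptor computation, assuming an ordered contour and the main-direction angle φ from step S304; in practice the real and imaginary parts of the Q complex descriptors would be concatenated into the feature vector fed to the BP network:

```python
import numpy as np

def gesture_descriptor(contour, phi, Q):
    """Phase-preserving Fourier descriptor, formula (13).
    contour: ordered (n, 2) boundary points; phi: angle between the gesture's
    main direction and the x-axis; returns the first Q descriptors."""
    z = np.fft.fft(contour[:, 0] + 1j * contour[:, 1])  # complex contour -> z(k)
    mag = np.abs(z[1:]) / np.abs(z[1])   # drop z(0), normalize by |z(1)|
    s = mag * np.exp(1j * phi)           # re-attach the main-direction phase
    return s[:Q]
```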
S102: Train the BP neural network with the training samples:
The feature vectors of the training-sample gesture images are input to the BP neural network as training samples, with the corresponding gesture-image categories as the network outputs, and the BP neural network is trained. The BP neural network is a widely used network, so its structure, parameters and training method are not repeated here.
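As a stand-in for a BP network, the sketch below uses scikit-learn's MLPClassifier, a backpropagation-trained multilayer perceptron; the layer size, learning rate and iteration count are illustrative, not taken from the patent:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_bp(X, y):
    """Train a backpropagation classifier on gesture feature vectors.
    X: (num_samples, num_features) descriptor matrix (complex descriptors
    split into real/imaginary parts beforehand); y: class labels, e.g.
    0..3 for up/down/left/right."""
    net = MLPClassifier(hidden_layer_sizes=(32,), activation="logistic",
                        solver="sgd", learning_rate_init=0.1, max_iter=2000)
    net.fit(X, y)
    return net

# recognition of an unknown sample then reduces to: net.predict(x.reshape(1, -1))
```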
S103: Perform gesture recognition on the samples to be recognized:
The feature vector of the gesture image to be recognized is input into the BP neural network trained in step S102, and the network outputs the recognized gesture image category.
To demonstrate the technical effect of the invention, it was verified experimentally. The gesture training samples are divided into four classes, namely hand pointing up, down, left and right, with 80 training samples per class; test samples are likewise drawn from the four classes, 40 per class. For ease of presentation, only the hand-pointing-up samples are used here to illustrate the procedure. Each image in the sample set is 256×256 pixels with 256 gray levels.
First, the upward-pointing samples are denoised. Since each sample image is 256×256 and, when the pulse-coupled neural network (PCNN) is applied to image noise reduction, its neurons correspond one-to-one to the pixels of the image to be processed, the number of neurons is set to 65536. The parameters of the PCNN model used in this embodiment are: number of neuron iterations τ = 10, neuron connection strength β = 3, dynamic threshold parameter θij = 1, threshold-output amplification factor Vθ = 20, and threshold decay coefficient aθ = 0.2. The firing characteristics of the PCNN are then used for detection, noise points are identified from the detection results, and the composite denoising algorithm is applied according to the type of each noise point. The parameters of the composite denoising algorithm are T1 = 3.5·δij, where δij is a statistic computed over the noise window Sk, the noise window being the same size as the 3×3 PCNN detection window, and T2 = 8.
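As a rough sketch of the composite denoising step only: the PCNN firing-based noise detector is replaced here by a simple local-median test, which is an assumption and not the patent's detection rule; the 3×3 window and the threshold T2 = 8 follow the embodiment:

```python
import numpy as np
from scipy.ndimage import median_filter

def composite_denoise(img, t2=8.0):
    """Flag pixels that deviate strongly from their 3x3 local median and
    replace them; stands in for the PCNN detector plus composite filter."""
    med = median_filter(img.astype(float), size=3)   # 3x3 window as in the embodiment
    noisy = np.abs(img.astype(float) - med) > t2     # crude impulse-noise test
    out = img.astype(float).copy()
    out[noisy] = med[noisy]                          # median replacement for flagged pixels
    return out.astype(img.dtype)
```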
After histogram equalization of the denoised gesture image, a cellular neural network is used to detect the edges of the gesture in the image, giving a coarse segmentation of the gesture image. In this embodiment, the neighborhood of each cell in the cellular neural network is 3×3, and the template used is a 3×3 edge-detection template.
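A sketch of cellular-neural-network edge detection follows. The feedback and control templates below are the standard Chua-Yang edge-extraction templates, used here as an assumption because the patent's actual template matrices are not reproduced in this text:

```python
import numpy as np
from scipy.signal import convolve2d

A = np.zeros((3, 3))                          # feedback template (standard EDGE: all zeros)
B = np.array([[-1., -1., -1.],
              [-1.,  8., -1.],
              [-1., -1., -1.]])               # control template (standard EDGE)
I_BIAS = -1.0                                 # bias term

def cnn_edge(u, steps=50, dt=0.1):
    """Euler integration of the CNN state equation dx/dt = -x + A*y + B*u + I.
    `u` is the input image scaled to [-1, 1] (e.g. -1 background, +1 object)."""
    u = u.astype(float)
    x = u.copy()
    for _ in range(steps):
        y = np.clip(x, -1.0, 1.0)             # piecewise-linear cell output
        x += dt * (-x + convolve2d(y, A, mode="same")
                       + convolve2d(u, B, mode="same") + I_BIAS)
    return (np.clip(x, -1, 1) > 0).astype(np.uint8)   # binary edge map
```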
Fig. 5 shows an example of coarse gesture segmentation.
The gesture image is then finely segmented using gesture shape features, with the constant L set to 80 and the curvature threshold T set to 0.5. Fig. 6 shows an example of fine gesture segmentation. As can be seen, fine segmentation removes the influence of regions such as the face and yields a more accurate gesture region.
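Where the curvature threshold T = 0.5 is applied, a discrete curvature along the contour can be computed as below; the turning-angle formulation is an assumption, since the patent's exact curvature formula is not reproduced here:

```python
import numpy as np

def contour_curvature(contour, step=5):
    """Discrete curvature of a closed contour as the turning angle between
    the chord arriving at and the chord leaving each point."""
    pts = np.asarray(contour, dtype=float)
    fwd = np.roll(pts, -step, axis=0) - pts        # chord to the point `step` ahead
    bwd = pts - np.roll(pts, step, axis=0)         # chord from the point `step` behind
    ang_f = np.arctan2(fwd[:, 1], fwd[:, 0])
    ang_b = np.arctan2(bwd[:, 1], bwd[:, 0])
    return np.abs(np.angle(np.exp(1j * (ang_f - ang_b))))  # wrapped to (-pi, pi]

# Points whose curvature exceeds the threshold T = 0.5 can then be treated as
# candidate high-curvature features for the fine-segmentation step.
```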
The contour-point coordinates of the finely segmented gesture region are then arranged into a discrete sequence and Fourier transformed to obtain the Fourier coefficients, which are normalized according to Eq. (13); the first 200 normalized Fourier descriptors are selected to form the gesture feature vector.
The BP neural network is trained with the gesture feature vectors of the training samples. The size of the input layer is determined by the gesture feature vector and the size of the output layer by the number of gesture sample classes; in the present invention the input layer has 200 nodes, the hidden layer has 10 nodes, and the output layer has 4 nodes. The output is coded in binary as 0001, 0010, 0100 or 1000, denoting a gesture pointing up, down, left or right respectively, and the gesture type is determined from the network output.
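A sketch of decoding the 4-node network output; the mapping from node index to the binary codes 0001 through 1000 is an assumption:

```python
import numpy as np

LABELS = {0: "up", 1: "down", 2: "left", 3: "right"}  # 0001, 0010, 0100, 1000 (assumed order)

def decode_output(y):
    """y: length-4 network output vector; the most active node wins."""
    return LABELS[int(np.argmax(y))]

print(decode_output([0.9, 0.05, 0.02, 0.03]))  # -> "up"
```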
To assess the noise-reduction performance of the new denoising algorithm designed in the present invention, which combines the pulse-coupled neural network with the composite denoising algorithm, it was compared against the plain composite denoising algorithm and median filtering; the main comparison metric is the peak signal-to-noise ratio (PSNR). Table 1 compares the PSNR of the proposed denoising algorithm with the reference algorithms.
Table 1
Table 1 shows that, at the same noise density, the PSNR of the proposed denoising method is clearly higher than that of median filtering and of the plain composite denoising algorithm. The denoising algorithm combining the pulse-coupled neural network with the composite denoising algorithm therefore achieves a good denoising effect.
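For reference, the PSNR metric used in Table 1 can be computed as follows for 8-bit grayscale images:

```python
import numpy as np

def psnr(clean, denoised, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means better denoising."""
    mse = np.mean((np.asarray(clean, float) - np.asarray(denoised, float)) ** 2)
    if mse == 0:
        return float("inf")                  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```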
In addition, the recognition performance of the traditional Fourier descriptor was used for comparison, with the recognition rate on the gesture samples as the metric. Table 2 reports the gesture-sample recognition results of the traditional Fourier descriptor; Table 3 reports the results of the Fourier descriptor of the present invention.
Table 2
Table 3
Comparing the results of Tables 2 and 3 shows that the traditional Fourier descriptor cannot reliably distinguish gestures that differ by a large rotation: its recognition rate is only about 71%, so it performs poorly in scenarios where rotated gestures carry different meanings. The improved Fourier descriptor of the present invention tolerates a certain amount of gesture rotation (when the rotation angle is very large, the two orientations are treated as different images), and experiments show that it still achieves a recognition rate of about 91%, a good gesture-recognition result.
Although illustrative specific embodiments of the present invention have been described above so that those skilled in the art can understand the invention, it should be clear that the invention is not limited to the scope of these embodiments. To a person of ordinary skill in the art, various changes are evident so long as they remain within the spirit and scope of the invention as defined and determined by the appended claims, and all inventions and creations that make use of the inventive concept fall within the scope of protection.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510280013.3A CN104834922B (en) | 2015-05-27 | 2015-05-27 | Gesture identification method based on hybrid neural networks |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104834922A CN104834922A (en) | 2015-08-12 |
CN104834922B true CN104834922B (en) | 2017-11-21 |
Family
ID=53812800
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510280013.3A Expired - Fee Related CN104834922B (en) | 2015-05-27 | 2015-05-27 | Gesture identification method based on hybrid neural networks |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104834922B (en) |
Families Citing this family (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105487772A (en) * | 2015-11-26 | 2016-04-13 | 上海斐讯数据通信技术有限公司 | Information capturing method and apparatus |
CN105373785B (en) * | 2015-11-30 | 2019-08-02 | 北京地平线机器人技术研发有限公司 | Gesture identification detection method and device based on deep neural network |
CN106022343B (en) * | 2016-05-19 | 2019-08-16 | 东华大学 | A Garment Style Recognition Method Based on Fourier Descriptor and BP Neural Network |
CN106022297A (en) * | 2016-06-01 | 2016-10-12 | 苏州大学 | Gesture identification method and gesture identification device |
CN108073979A (en) * | 2016-11-14 | 2018-05-25 | 顾泽苍 | A kind of ultra-deep study of importing artificial intelligence knows method for distinguishing for image |
CN108229277B (en) | 2017-03-31 | 2020-05-01 | 北京市商汤科技开发有限公司 | Gesture recognition method, gesture control method, multilayer neural network training method, device and electronic equipment |
CN109101860B (en) * | 2017-06-21 | 2022-05-13 | 富泰华工业(深圳)有限公司 | Electronic device and gesture recognition method thereof |
CN107894834B (en) * | 2017-11-09 | 2021-04-02 | 上海交通大学 | Control gesture recognition method and system in augmented reality environment |
CN108230257A (en) * | 2017-11-15 | 2018-06-29 | 北京市商汤科技开发有限公司 | Image processing method, device, electronic equipment and storage medium |
CN108052884A (en) * | 2017-12-01 | 2018-05-18 | 华南理工大学 | A kind of gesture identification method based on improvement residual error neutral net |
CN108108024B (en) * | 2018-01-02 | 2021-01-22 | 京东方科技集团股份有限公司 | Dynamic gesture acquisition method and device, and display device |
CN108198567A (en) * | 2018-02-22 | 2018-06-22 | 成都启英泰伦科技有限公司 | A kind of novel voice is except system of making an uproar |
CN109344689A (en) * | 2018-08-07 | 2019-02-15 | 西安理工大学 | A Kinect-based Mute Language Gesture Recognition Method |
CN109359538B (en) * | 2018-09-14 | 2020-07-28 | 广州杰赛科技股份有限公司 | Training method of convolutional neural network, gesture recognition method, device and equipment |
CN109344793B (en) | 2018-10-19 | 2021-03-16 | 北京百度网讯科技有限公司 | Method, apparatus, device and computer readable storage medium for recognizing handwriting in the air |
CN109697407A (en) * | 2018-11-13 | 2019-04-30 | 北京物灵智能科技有限公司 | A kind of image processing method and device |
CN109612326B (en) * | 2018-12-19 | 2021-08-24 | 西安建筑科技大学 | An intelligent auxiliary teaching system for light weapons shooting based on the Internet of Things |
CN113033256B (en) * | 2019-12-24 | 2024-06-11 | 武汉Tcl集团工业研究院有限公司 | Training method and device for fingertip detection model |
CN111259902A (en) * | 2020-01-13 | 2020-06-09 | 上海眼控科技股份有限公司 | Arc-shaped vehicle identification number detection method and device, computer equipment and medium |
CN111216133B (en) * | 2020-02-05 | 2022-11-22 | 广州中国科学院先进技术研究所 | A Robot Demonstration Programming Method Based on Fingertip Recognition and Hand Movement Tracking |
CN112487981A (en) * | 2020-11-30 | 2021-03-12 | 哈尔滨工程大学 | MA-YOLO dynamic gesture rapid recognition method based on two-way segmentation |
WO2022126367A1 (en) * | 2020-12-15 | 2022-06-23 | Qualcomm Incorporated | Sequence processing for a dataset with frame dropping |
CN112800954A (en) * | 2021-01-27 | 2021-05-14 | 北京市商汤科技开发有限公司 | Text detection method and device, electronic equipment and storage medium |
CN113191361B (en) * | 2021-04-19 | 2023-08-01 | 苏州大学 | A Shape Recognition Method |
CN113449600B (en) * | 2021-05-28 | 2023-07-04 | 宁波春建电子科技有限公司 | Two-hand gesture segmentation algorithm based on 3D data |
CN113792624B (en) * | 2021-08-30 | 2024-10-15 | 河南林业职业学院 | Bank ATM early warning security monitoring method |
CN114625333B (en) * | 2022-03-08 | 2022-10-18 | 深圳康荣电子有限公司 | Method and system capable of recording gesture instructions to control liquid crystal splicing LCD |
CN115240270B (en) * | 2022-07-05 | 2025-02-07 | 海南软件职业技术学院 | A gesture recognition method based on Fourier descriptor |
CN117558068B (en) * | 2024-01-11 | 2024-03-19 | 深圳市阿龙电子有限公司 | Intelligent device gesture recognition method based on multi-source data fusion |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8620024B2 (en) * | 2010-09-17 | 2013-12-31 | Sony Corporation | System and method for dynamic gesture recognition using geometric classification |
CN103778407A (en) * | 2012-10-23 | 2014-05-07 | 南开大学 | Gesture recognition algorithm based on conditional random fields under transfer learning framework |
CN104298354A (en) * | 2014-10-11 | 2015-01-21 | 河海大学 | Man-machine interaction gesture recognition method |
CN104573621A (en) * | 2014-09-30 | 2015-04-29 | 李文生 | Dynamic Gesture Learning and Recognition Method Based on Chebyshev Neural Network |
Non-Patent Citations (1)
Title |
---|
Research on Gesture Recognition Technology Based on Neural Networks; Jiang Li et al.; Journal of Beijing Jiaotong University; 2006-10-31 (No. 5); pp. 32-36 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104834922B (en) | Gesture identification method based on hybrid neural networks | |
CN105069472B (en) | A kind of vehicle checking method adaptive based on convolutional neural networks | |
CN106897714B (en) | Video motion detection method based on convolutional neural network | |
CN105205475B (en) | A dynamic gesture recognition method | |
CN110837768B (en) | An online detection and identification method for rare animal protection | |
CN106446952B (en) | A kind of musical score image recognition methods and device | |
CN105654453B (en) | A kind of FCM image partition methods of robustness | |
CN107292311A (en) | A kind of recognition methods of the Characters Stuck identifying code based on neutral net | |
CN103020985B (en) | A kind of video image conspicuousness detection method based on field-quantity analysis | |
CN105574534A (en) | Significant object detection method based on sparse subspace clustering and low-order expression | |
CN103870818B (en) | Smog detection method and device | |
CN104331683B (en) | A kind of facial expression recognizing method with noise robustness | |
CN108961675A (en) | Fall detection method based on convolutional neural networks | |
CN107871099A (en) | Face detection method and apparatus | |
CN105590319A (en) | Method for detecting image saliency region for deep learning | |
CN106991686A (en) | A kind of level set contour tracing method based on super-pixel optical flow field | |
CN110728185B (en) | Detection method for judging existence of handheld mobile phone conversation behavior of driver | |
CN115205636A (en) | An image target detection method, system, device and storage medium | |
Chen et al. | Image splicing localization using residual image and residual-based fully convolutional network | |
Diwan et al. | Unveiling copy-move forgeries: Enhancing detection with SuperPoint keypoint architecture | |
CN108846356A (en) | A method of the palm of the hand tracing and positioning based on real-time gesture identification | |
CN111260655A (en) | Image generation method and device based on deep neural network model | |
CN106874917A (en) | A kind of conspicuousness object detection method based on Harris angle points | |
Bilal et al. | A hybrid method using haar-like and skin-color algorithm for hand posture detection, recognition and tracking | |
CN110135435A (en) | A method and device for saliency detection based on extensive learning system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
EXSB | Decision made by sipo to initiate substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20171121 Termination date: 20200527 |