
CN103455794B - A Dynamic Gesture Recognition Method Based on Frame Fusion Technology - Google Patents


Info

Publication number: CN103455794B
Application number: CN201310374176.9A
Authority: CN (China)
Prior art keywords: gesture, density distribution, image, dynamic, vector
Inventors: 冯志全 (Feng Zhiquan), 张廷芳 (Zhang Tingfang)
Original and current assignee: University of Jinan
Other versions: CN103455794A (Chinese)
Legal status: Expired - Fee Related (application granted)
Application filed by University of Jinan

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a dynamic gesture recognition method based on frame fusion technology, comprising the following steps: a given group of dynamic gestures is divided into gesture sets according to density distribution features; for each gesture set, the density distribution feature parameters of the fused image of each gesture's frames are computed, and the mean of these parameters within each set is taken as the template feature vector H of that gesture set. The dynamic gesture images to be recognized are mapped onto a static combined image Q; the density distribution feature of Q is computed, and the gesture set containing the dynamic gesture is identified from this feature. Finally, according to the range of the density distribution feature, either the Hausdorff distance method or the fingertip feature point method is selected to identify the specific gesture within the set.

Description

A Dynamic Gesture Recognition Method Based on Frame Fusion Technology

Technical Field

The invention relates to the field of dynamic gesture recognition, and in particular to a dynamic gesture recognition method based on frame fusion technology.

Background

With the rapid development of computer technology, human-computer interaction has become one of the most active research topics, and the importance of the human hand gives it great research value in this field. The hand is among the most natural, intuitive, and easy-to-learn means of human-computer interaction, and vision-based gesture recognition has accordingly become a widely studied subject.

According to their motion characteristics, gestures can be divided into static and dynamic gestures. Static gestures convey information through the shape and contour of the hand, whereas in dynamic gestures the position and shape of the hand change over time, allowing more accurate and detailed information to be conveyed. Current gesture recognition methods include template matching, hidden Markov models (HMM), and dynamic time warping (DTW).

Template matching is mostly used for static gesture recognition. In 2002, Zhang Liangguo, Wu Jiangqin, and others proposed gesture recognition based on the Hausdorff distance, using the idea of Hausdorff-distance template matching to achieve robust recognition. Huang Guofan and Li Ying proposed a letter gesture recognition method that first preprocesses all letter gesture images and then recognizes them by template matching.

In 2003, Ahmed Elgammal proposed a non-parametric HMM in which a dynamic gesture is represented by a sequence of learned hand postures and recognized within a probabilistic framework. Yan Yan et al. used an HMM gesture command model with k-means vector quantization of the gesture sequence, thereby improving recognition performance.

DTW is a pattern matching algorithm with a nonlinear time normalization effect. Trevor J. et al. proposed a DTW method in 1996; it is simple and effective, allowing sufficient elasticity between the test pattern and the reference model to achieve correct classification. Jing Lei, Ma Wenjun, and others used DTW to effectively improve the efficiency of acceleration-based dynamic gesture recognition.

Summary of the Invention

The technical problem to be solved by the present invention is to provide a dynamic gesture recognition method based on frame fusion technology that increases the accuracy of dynamic gesture recognition.

The present invention adopts the following technical scheme to achieve this object:

A dynamic gesture recognition method based on frame fusion technology, characterized by comprising the following steps:

(1) Divide a given group of dynamic gestures into gesture sets according to their density distribution features. For each gesture set, compute the density distribution feature parameters of the fused image of each gesture's frames, then take the mean of these parameters within the set as its template feature vector H;

(2) Map the dynamic gesture images to be recognized onto a static combined image Q;

(3) Compute the density distribution feature of the static combined image Q and identify the gesture set containing the dynamic gesture from this feature;

(4) According to the range of the density distribution feature, select either the Hausdorff distance method or the fingertip feature point method to identify the specific gesture within the set.

As a further limitation of the technical solution, step (2) comprises the following steps:

(2.1) Acquire N frames of a continuous dynamic gesture sequence {P_i}, i = 0, 1, 2, ..., N;

(2.2) The mapping between the pixels of the static combined image Q and the pixels of {P_i} is as follows:

$$\begin{pmatrix} x_i \\ y_i \end{pmatrix} =
\begin{pmatrix}
\dfrac{a}{2} + \left(X - \gamma\sin\beta - \dfrac{A}{2}\right)\cos\alpha + \left(Y + \gamma\cos\beta - \dfrac{B}{2}\right)\sin\alpha \\
\dfrac{b}{2} - \left(X - \gamma\sin\beta - \dfrac{A}{2}\right)\sin\alpha - \left(Y + \gamma\cos\beta - \dfrac{B}{2}\right)\cos\alpha
\end{pmatrix} \quad (1)$$

where x_i, y_i are the horizontal and vertical coordinates of a pixel in the small image P_i; X, Y are the coordinates of the corresponding pixel in the combined image Q; a, b are the width and height of P_i; A, B are the width and height of Q; γ is the radius at which the small images P_i are arranged in rotation within Q; α is the rotation angle of P_i itself; and β is the angle between the radius through the center of the rotated P_i and the vertical axis of the Cartesian coordinate system established in Q, with:

$$\alpha = \frac{3}{2}\pi + \frac{2\pi}{N}\, i \quad (2)$$

$$\beta = \frac{2\pi}{N}\, i \quad (3)$$

As a further limitation of the technical solution, step (3) comprises:

(3.1) Let f(x, y) be the plane image formed by the static combined image Q; compute the centroid (center of gravity) of f(x, y);

(3.2) In f(x, y), compute the maximum distance D_max and the minimum distance D_min from a target pixel to the centroid;

(3.3) In f(x, y), construct the maximum circumscribed circle of the target region, centered at the centroid with radius D_max, and the minimum circumscribed circle with radius D_min; within the region between the two circles, divide the image into M sub-image regions (M > 0) by equidistant region division;

(3.4) Count the target pixels in each sub-image region, obtaining totals S_i (i = 1, ..., M), and find the maximum of the S_i:

$$S_{\max} = \max_{i=1,\ldots,M} S_i \quad (4)$$

(3.5) Compute the density distribution feature d of the static combined image:

$$r_i = S_i / S_{\max} \quad (i = 1, \ldots, M) \quad (5)$$

$$dr_i = \begin{cases} |r_1 - r_2| & i = 1 \\ |2r_i - r_{i-1} - r_{i+1}| & 1 < i < M \\ |r_M - r_{M-1}| & i = M \end{cases} \quad (6)$$

$$d = (r_1, \ldots, r_M, dr_1, \ldots, dr_M) \quad (7)$$

(3.6) Obtain the modified feature vector DDF′ = (r_1, ..., r_10, a·r_11, ..., a·r_15, b·r_16, ..., b·r_20, dr_1, ..., dr_10, c·dr_11, ..., c·dr_M), where a, b, c are adjustable parameters;

(3.7) Compare the obtained feature vector with the template feature vector H of each gesture set by computing the Euclidean distance to each; the gesture set with the smallest Euclidean distance is the selected set.

As a further limitation of the technical solution, step (4) comprises the following steps:

(4.1) For gestures in the sets whose 35th density-distribution-feature parameter lies in the range 5 to 6, the Hausdorff distance is used to further identify the last frame of the dynamic gesture, achieving accurate gesture recognition;

(4.2) For gestures in the sets whose 35th density-distribution-feature parameter lies in the range 3 to 4, the fingertip feature point method is used to further identify the last frame of the dynamic gesture, achieving accurate gesture recognition.

As a further limitation of the technical solution, further identifying the last frame of the dynamic gesture with the Hausdorff distance method of step (4.1) comprises the following steps:

(4.1.1) First, train the boundary point sets L of the gestures in the gesture set;

(4.1.2) Then compute the boundary point set e of the last frame of the gesture to be recognized;

(4.1.3) Compute the Hausdorff distance between e and each boundary point set L;

(4.1.4) Output the template gesture sequence i with the smallest Hausdorff distance; this sequence is the recognized gesture sequence.

As a further limitation of the technical solution, the fingertip feature point method of step (4.2) comprises the following steps:

(4.2.1) First, train the gesture set: detect the fingertip points of each gesture in the set and record the feature information of the vectors from the gesture centroid to each fingertip point in a vector template G;

(4.2.2) Then detect the fingertip vectors of the gesture to be recognized, compare them with the fingertip vectors in template G, and output the template gesture sequence i with the greatest similarity; this sequence is the recognized gesture sequence.

Compared with the prior art, the advantages and positive effects of the present invention are: by adopting the combined-image method, the recognition of dynamic gestures is converted into the recognition of static combined images, which effectively improves the recognition rate of dynamic gestures; moreover, the time cost of the method is low and well within the capability of the computer.

Brief Description of the Drawings

Figure 1 is the static combined image corresponding to dynamic gesture grasping in the present invention.

Figure 2 shows the initial state and the various final-state gesture images of the 10 dynamic grasping gestures of the preferred embodiment of the present invention.

Figure 3 is a flowchart of the preferred embodiment of the present invention.

Figure 4 is a schematic diagram of classifying static combined images by the density distribution feature method in the preferred embodiment of the present invention.

Figure 5 is a schematic diagram of fingertip detection in the preferred embodiment of the present invention.

Detailed Description

The present invention is described in further detail below with reference to the accompanying drawings and preferred embodiments.

1. Combined Image

How to process the image stream of a dynamic gesture is an important problem in dynamic gesture recognition. Here, each frame of a dynamic gesture is mapped onto a single static combined image, and recognition of the dynamic gesture is completed by analyzing this combined image, in which the frames of the dynamic image stream are arranged in a circle. Recognizing the combined image accomplishes the first step of gesture recognition, rough gesture recognition; the combined image is shown in Figure 1.

The mapping between the dynamic gesture image stream and the combined image is as follows.

Assume N frames of a continuous dynamic gesture sequence {P_i}, i = 0, 1, 2, ..., N, are acquired. The pixels of the combined image and the pixels of {P_i} are related by

$$\begin{pmatrix} x_i \\ y_i \end{pmatrix} =
\begin{pmatrix}
\dfrac{a}{2} + \left(X - \gamma\sin\beta - \dfrac{A}{2}\right)\cos\alpha + \left(Y + \gamma\cos\beta - \dfrac{B}{2}\right)\sin\alpha \\
\dfrac{b}{2} - \left(X - \gamma\sin\beta - \dfrac{A}{2}\right)\sin\alpha - \left(Y + \gamma\cos\beta - \dfrac{B}{2}\right)\cos\alpha
\end{pmatrix} \quad (1)$$

where x_i, y_i are the horizontal and vertical coordinates of a pixel in the small image P_i; X, Y are the coordinates of the corresponding pixel in the combined image Q; a, b are the width and height of P_i; A, B are the width and height of Q; γ is the radius at which the small images P_i are arranged in rotation within Q; α is the rotation angle of P_i itself; and β is the angle between the radius through the center of the rotated P_i and the vertical axis of the Cartesian coordinate system established in Q, with:

$$\alpha = \frac{3}{2}\pi + \frac{2\pi}{N}\, i \quad (2)$$

$$\beta = \frac{2\pi}{N}\, i \quad (3)$$
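A minimal sketch of equations (1)-(3), with function and parameter names of our own choosing; it maps a pixel (X, Y) of the combined image Q back to coordinates inside the i-th small frame P_i:

```python
import math

def frame_mapping(i, N, a, b, A, B, gamma):
    """Equations (1)-(3): build the map from a combined-image pixel (X, Y)
    to the coordinates (x_i, y_i) in the i-th small frame P_i.
    a, b: frame width/height; A, B: canvas width/height; gamma: layout radius."""
    alpha = 1.5 * math.pi + (2.0 * math.pi / N) * i   # eq. (2): rotation of P_i itself
    beta = (2.0 * math.pi / N) * i                    # eq. (3): angular position on the circle

    def to_frame(X, Y):
        # eq. (1), with the common subexpressions factored out
        u = X - gamma * math.sin(beta) - A / 2.0
        v = Y + gamma * math.cos(beta) - B / 2.0
        x_i = a / 2.0 + u * math.cos(alpha) + v * math.sin(alpha)
        y_i = b / 2.0 - u * math.sin(alpha) - v * math.cos(alpha)
        return x_i, y_i

    return to_frame
```

For i = 0 (so β = 0), the canvas point (A/2, B/2 - γ) makes both factored terms vanish and maps to the center (a/2, b/2) of the frame, consistent with the frames being laid out on a circle of radius γ about the canvas center.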

2. Density Distribution Feature (DDF)

After frame fusion, the density distribution feature of the resulting static combined image is computed.

After binarization, the information an image conveys is determined by the arrangement of its target pixels. The fundamental purpose of the density distribution feature is to capture the pixel distribution of an image by counting the target pixels over different spatial regions, thereby characterizing the binary image. Different images can then be distinguished by classifying their density distribution features.

Here, the density distribution feature is modified according to how the gestures vary within the combined image.

The original DDF is expressed as follows:

$$\text{DDF} = (r_1, \ldots, r_M, dr_1, \ldots, dr_M) \quad (4)$$

In the static combined image, most gesture variation occurs in the finger region, while the palm region changes little. Increasing the DDF weight of the finger region therefore effectively reduces the DDF similarity between different combined images and improves recognition efficiency. After extensive experiments, the new DDF is as follows:

$$\text{DDF}' = (r_1, \ldots, r_{10}, a r_{11}, \ldots, a r_{15}, b r_{16}, \ldots, b r_{20}, dr_1, \ldots, dr_{10}, c\, dr_{11}, \ldots, c\, dr_M) \quad (5)$$

where a, b, c are adjustable parameters; the values used here are a = 3, b = 6, c = 3.
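The re-weighting of equation (5) can be sketched as follows, assuming M = 20 as the weighting indices suggest; the function name and the flat-list representation are ours:

```python
def weight_ddf(r, dr, a=3.0, b=6.0, c=3.0):
    """Apply the finger-region weighting of eq. (5) to the raw DDF
    components r = (r_1..r_20) and dr = (dr_1..dr_20), assuming M = 20."""
    # r_1..r_10 unweighted, r_11..r_15 scaled by a, r_16..r_20 scaled by b
    wr = r[:10] + [a * x for x in r[10:15]] + [b * x for x in r[15:20]]
    # dr_1..dr_10 unweighted, dr_11..dr_M scaled by c
    wdr = dr[:10] + [c * x for x in dr[10:]]
    return wr + wdr
```

With the paper's parameters (a = 3, b = 6, c = 3), the outer annular regions, where the fingers lie, contribute three to six times more to the feature distance than the palm-dominated inner regions.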

3. Gesture Recognition

The present invention is intended to recognize 10 dynamic grasping gestures. The initial state of the dynamic gestures and their various final-state gesture images are shown in Figure 2.

The recognition process has two stages. In the first stage, the combined image is classified by the density distribution feature method to identify the gesture set containing the dynamic gesture; in the second stage, within that gesture set, the Hausdorff distance or the fingertip feature point method is used to further identify the last frame of the dynamic gesture, achieving accurate recognition. The process is shown in Figure 3.

3.1 Rough Gesture Recognition

In this stage, static combined images are classified by the density distribution feature method (Figure 4). The 10 gestures to be recognized are divided into four sets A, B, C, and D according to their DDF features, and each gesture to be recognized is assigned to one of the sets. The steps are as follows:

(3.1) Let f(x, y) be the plane image formed by the static combined image Q; compute the centroid (center of gravity) of f(x, y);

(3.2) In f(x, y), compute the maximum distance D_max and the minimum distance D_min from a target pixel to the centroid;

(3.3) In f(x, y), construct the maximum circumscribed circle of the target region, centered at the centroid with radius D_max, and the minimum circumscribed circle with radius D_min; within the region between the two circles, divide the image into M sub-image regions (M > 0) by equidistant region division;

(3.4) Count the target pixels in each sub-image region, obtaining totals S_i (i = 1, ..., M), and find the maximum of the S_i:

$$S_{\max} = \max_{i=1,\ldots,M} S_i \quad (6)$$

(3.5) Compute the density distribution feature d of the static combined image:

$$r_i = S_i / S_{\max} \quad (i = 1, \ldots, M) \quad (7)$$

$$dr_i = \begin{cases} |r_1 - r_2| & i = 1 \\ |2r_i - r_{i-1} - r_{i+1}| & 1 < i < M \\ |r_M - r_{M-1}| & i = M \end{cases} \quad (8)$$

$$d = (r_1, \ldots, r_M, dr_1, \ldots, dr_M) \quad (9)$$

(3.6) Obtain the modified feature vector DDF′ = (r_1, ..., r_10, a·r_11, ..., a·r_15, b·r_16, ..., b·r_20, dr_1, ..., dr_10, c·dr_11, ..., c·dr_M), where a, b, c are adjustable parameters;

(3.7) Compare the obtained feature vector with the template feature vector H of each gesture set by computing the Euclidean distance to each; the gesture set with the smallest Euclidean distance is the selected set.
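Steps (3.1)-(3.7) can be sketched end to end as follows. This is a minimal illustration under our own naming; in particular, "equidistant region division" is read here as dividing the annulus between the two circumscribed circles at equal radial spacing, which is one plausible interpretation:

```python
import math

def ddf(pixels, M=20):
    """Raw density distribution feature d of eq. (9) for a binary image
    given as a list of (x, y) target-pixel coordinates."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n            # step (3.1): centroid
    cy = sum(y for _, y in pixels) / n
    dists = [math.hypot(x - cx, y - cy) for x, y in pixels]
    d_min, d_max = min(dists), max(dists)         # step (3.2)
    width = (d_max - d_min) / M or 1.0            # step (3.3): equal radial bins
    S = [0] * M                                   # step (3.4): counts per region
    for dist in dists:
        k = min(int((dist - d_min) / width), M - 1)
        S[k] += 1
    s_max = max(S)                                # eq. (6)
    r = [s / s_max for s in S]                    # eq. (7)
    dr = ([abs(r[0] - r[1])] +                    # eq. (8)
          [abs(2 * r[i] - r[i - 1] - r[i + 1]) for i in range(1, M - 1)] +
          [abs(r[M - 1] - r[M - 2])])
    return r + dr                                 # eq. (9)

def nearest_set(feature, templates):
    """Step (3.7): index of the template vector H nearest in Euclidean distance."""
    def dist(h):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(feature, h)))
    return min(range(len(templates)), key=lambda i: dist(templates[i]))
```

In practice the weighting of eq. (5) would be applied to the output of `ddf` before `nearest_set` is called against the four set templates.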

3.2 Accurate Gesture Recognition

In the second stage, the last frame of the dynamic gesture is identified according to its distinguishing features, completing precise gesture recognition. Analysis shows that the 35th parameter of the gesture density distribution feature lies between 5 and 6 for set C, so the Hausdorff distance formula is used there, while for set D it lies between 3 and 4, so the fingertip feature point method is used. The specific recognition methods are as follows.

For set C, the last frame is identified using the Hausdorff distance formula. First, the boundary point sets L of the gestures in the set are trained; then the boundary point set e of the last frame of the gesture to be recognized is computed; the Hausdorff distance between e and each boundary point set L is calculated; and the template gesture sequence i with the smallest Hausdorff distance is output as the recognized gesture sequence.
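The set-C matching just described can be sketched as follows. The patent does not fix the exact Hausdorff variant, so the classical symmetric form over Euclidean point distances is assumed; the function names are ours:

```python
import math

def directed_hausdorff(P, Q):
    """h(P, Q): the largest distance from a point of P to its nearest point of Q."""
    return max(min(math.dist(p, q) for q in Q) for p in P)

def hausdorff(P, Q):
    """Symmetric Hausdorff distance H(P, Q) = max(h(P, Q), h(Q, P))."""
    return max(directed_hausdorff(P, Q), directed_hausdorff(Q, P))

def match_boundary(e, templates):
    """Index i of the trained boundary set L_i closest to the query boundary e."""
    return min(range(len(templates)), key=lambda i: hausdorff(e, templates[i]))
```

This brute-force form is O(|P|·|Q|) per comparison, which is acceptable for the small boundary point sets of a single hand silhouette.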

For set D, the last frame is identified using fingertip detection (Figure 5).

First, the gesture set is trained: the fingertip points of each gesture in the set are detected, and the feature information of the vectors from the gesture centroid to each fingertip point is recorded in a vector template G.

Then the fingertip vectors of the gesture to be recognized are detected and compared with the fingertip vectors in template G; the template gesture sequence i with the greatest similarity is output as the recognized gesture sequence.
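The set-D matching can be sketched as follows. The patent leaves the similarity measure between fingertip vectors unspecified, so the mean cosine similarity between corresponding fingertip-to-centroid vectors is assumed here; names are ours:

```python
import math

def cosine(u, v):
    """Cosine similarity between two 2-D vectors (0.0 if either is zero)."""
    dot = u[0] * v[0] + u[1] * v[1]
    nu = math.hypot(*u)
    nv = math.hypot(*v)
    return dot / (nu * nv) if nu and nv else 0.0

def match_fingertips(query, template_G):
    """Index of the template gesture whose fingertip vectors are most similar
    to the query's, scored by mean cosine similarity over corresponding tips."""
    def score(tmpl):
        sims = [cosine(q, t) for q, t in zip(query, tmpl)]
        return sum(sims) / len(sims)
    return max(range(len(template_G)), key=lambda i: score(template_G[i]))
```

Using direction rather than raw position makes the comparison insensitive to the hand's location in the frame, which fits the centroid-relative vectors the text describes.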

4. Beneficial Effects

Using an ordinary camera under constant illumination, the present invention performs first-level recognition of 10 dynamic gestures by the combined-image method and second-level recognition by the Hausdorff distance and fingertip detection methods; this hierarchical recognition greatly improves recognition efficiency. Table 1 compares the recognition rate of the present method with that of single-frame DDF recognition on single-frame gesture images.

The single-frame recognition method records the DDF parameters of every frame of the continuous dynamic gesture and forms a template from the DDF features of each gesture; the DDF features of the gesture to be recognized are then compared with the template features to perform recognition.

Table 1: Comparison of recognition rates

As the table shows, for the given gestures the recognition rate of the present method is significantly higher than that of single-frame-image DDF recognition.

Of course, the above description does not limit the present invention, nor is the invention limited to the above examples; changes, modifications, additions, or substitutions made by those of ordinary skill in the art within the essential scope of the present invention also fall within its scope of protection.

Claims (5)

1.一种基于帧融合技术的动态手势识别方法,其特征在于,包括如下步骤:1. a dynamic gesture recognition method based on frame fusion technology, is characterized in that, comprises the steps: (1)将既定的一组动态手势根据密度分布特征划分为一组手势集合,对每个手势集合求其各个手势帧融合图像的密度分布特征参数,然后在各个集合内分别求其密度分布特征参数的平均值,将平均值作为该手势集合的模板特征向量H;(1) Divide a set of dynamic gestures into a set of gesture sets according to the density distribution characteristics, and calculate the density distribution characteristic parameters of each gesture frame fusion image for each gesture set, and then calculate the density distribution characteristics in each set The average value of the parameters, the average value is used as the template feature vector H of the gesture set; (2)对待识别的动态手势图像获取映射的静态组合图Q;(2) The dynamic gesture image to be recognized obtains the static combination graph Q of mapping; (3)计算静态组合图Q的密度分布特征,根据密度分布特征识别出动态手势所在的手势集合;(3) Calculate the density distribution feature of the static combination graph Q, and identify the gesture set where the dynamic gesture is located according to the density distribution feature; (4)根据密度分布特征的范围确定手势集合内的手势进一步选择采用Hausdor ff距离方法或者指尖特征点方法;(4) Determine the gestures in the gesture set according to the range of the density distribution feature and further choose to adopt the Hausdor ff distance method or the fingertip feature point method; 所述步骤(2)包括如下步骤:Described step (2) comprises the steps: (2.1)获取N帧连续动态手势序列图像{Pi},i=0,1,2,…,N;(2.1) Acquire N frames of continuous dynamic gesture sequence images {P i }, i=0, 1, 2,..., N; (2.2)静态组合图Q中的像素同{Pi}中的像素的映射关系如下(2.2) The mapping relationship between the pixels in the static combination graph Q and the pixels in {P i } is as follows xx ii ythe y ii == aa 22 ++ (( Xx -- &gamma;&gamma; sinsin &beta;&beta; -- AA 22 )) cc oo sthe s &alpha;&alpha; ++ (( YY ++ &gamma;&gamma; cc oo sthe s &beta;&beta; -- BB 22 )) sinsin &alpha;&alpha; bb 22 -- (( Xx -- &gamma;&gamma; sinsin &beta;&beta; -- AA 22 )) sthe s ii nno &alpha;&alpha; -- (( YY ++ &gamma;&gamma; cc oo sthe s &beta;&beta; -- BB 
22 )) cc oo sthe s &alpha;&alpha; -- -- -- (( 11 )) 其中,xi,yi分别代表小图Pi某一像素的横纵坐标,X,Y分别表示组合图Q中对应像素的横纵坐标,a,b分别表示Pi的宽度和高度,A,B分别表示Q的宽度和高度,r为各小图Pi在Q中旋转排列的半径,α表示图形Pi自身旋转的角度,β表示过旋转后的Pi的中心的半径跟Q中所建立的直角坐标系纵坐标的夹角,且:Among them, x i and y i represent the horizontal and vertical coordinates of a certain pixel in the small image P i respectively, X and Y represent the horizontal and vertical coordinates of the corresponding pixel in the combined image Q respectively, a and b represent the width and height of Pi respectively, and A , B respectively represent the width and height of Q, r is the radius of each small figure Pi rotated in Q, α represents the angle of rotation of the figure P i itself, and β represents the radius of the center of the rotated Pi and Q The included angle of the ordinate of the established Cartesian coordinate system, and: &alpha;&alpha; == 33 22 &pi;&pi; ++ 22 NN &pi;&pi; ii -- -- -- (( 22 )) &beta;&beta; == 22 NN &pi;&pi; ii -- -- -- (( 33 )) .. 2.根据权利要求1所述的基于帧融合技术的动态手势识别方法,其特征在于,所述步骤(3)包括:2. the dynamic gesture recognition method based on frame fusion technology according to claim 1, is characterized in that, described step (3) comprises: (3.1)静态组合图Q形成的平面图像为f(x,y),计算图像f(x,y)的形心,即重心 (3.1) The plane image formed by the static combination graph Q is f(x, y), and the centroid of the image f(x, y) is calculated, that is, the center of gravity (3.2)计算图像f(x,y)中,目标像素点到形心的最大距离Dmax,最小距离Dmin(3.2) Calculate the maximum distance D max and the minimum distance D min from the target pixel point to the centroid in the image f(x, y); (3.3)分别计算图像f(x,y)中,以形心为圆心,以Dmax为半径的目标区域最大外接圆和以Dmin为半径的目标区域最小外接圆,在最大外接圆和最小外接圆组成的区域内,使用等距离区域划分法将图像划分为M个子图像区域(M>0);(3.3) In the image f(x, y), calculate the maximum circumscribed circle of the target area with the centroid as the center and the radius of D max and the minimum circumscribed circle of the target area with D min as the radius. 
The maximum circumscribed circle and the minimum In the area formed by the circumscribed circle, use the equidistant area division method to divide the image into M sub-image areas (M>0); (3.4)对各子图像区域分别进行统计,计算每个子图像区域内目标像素的总数Si(i=1,…,M),并找出Si的最大值。(3.4) Make statistics on each sub-image area respectively, calculate the total number S i (i=1,...,M) of target pixels in each sub-image area, and find the maximum value of S i . SS mm aa xx == mm aa xx ii == 11 ,, ...... ,, Mm (( SS ii )) -- -- -- (( 44 )) (3.5)计算静态组合图的密度分布特征d:(3.5) Calculate the density distribution feature d of the static composite graph: ri=Si/Smax(i=1,…,M) (5)r i =S i /S max (i=1,...,M) (5) drdr ii == || rr 11 -- rr 22 || ii == 11 || 22 rr ii -- rr ii -- 11 -- rr ii ++ 11 || 11 << ii << Mm || rr Mm -- rr Mm -- 11 || ii == Mm -- -- -- (( 66 )) d=(r1,…,rM;dr1,…,drM) (7)d=(r 1 ,...,r M ; dr 1 ,...,dr M ) (7) (3.6)根据公式DDF′=(r1,…,r10,ar11,…,ar15,br16,…,br20;dr1,…,dr10,cdr11,…,)得到修改后的特征向量其中,a,b,c为可调节参数;(3.6) According to the formula DDF'=(r 1 , ..., r 10 , ar 11 , ..., ar 15 , br 16 , ..., br 20 ; dr 1 , ..., dr 10 , cdr 11 , ..., ) to get the modified eigenvector Among them, a, b, c are adjustable parameters; (3.7)将得到的特征向量分别与各个手势集合中的模板特征向量H相比较,分别计算欧氏距离,欧氏距离最小的手势集合即为所选择的手势集合。(3.7) will get the eigenvector Compared with the template feature vector H in each gesture set, the Euclidean distance is calculated respectively, and the gesture set with the smallest Euclidean distance is the selected gesture set. 3.根据权利要求1所述的基于帧融合技术的动态手势识别方法,其特征在于,所述步骤(4)包括如下步骤:3. 
The dynamic gesture recognition method based on frame fusion technology according to claim 1, characterized in that step (4) comprises the following steps:

(4.1) For gestures in a gesture set whose 35th parameter of the density distribution feature lies in the range 5 to 6, apply the Hausdorff distance to the last frame of the dynamic gesture for further recognition, so as to achieve an accurate gesture recognition result;

(4.2) For gestures in a gesture set whose 35th parameter of the density distribution feature lies in the range 3 to 4, apply the fingertip feature point method to the last frame of the dynamic gesture for further recognition, so as to achieve an accurate gesture recognition result.

4. The dynamic gesture recognition method based on frame fusion technology according to claim 3, characterized in that the further recognition of the last frame of the dynamic gesture by the Hausdorff distance method of step (4.1) comprises the following steps:

(4.1.1) First, train the boundary point sets L of the gestures in the gesture set;

(4.1.2) Then compute the boundary point set e of the last frame of the gesture to be recognized;

(4.1.3) Compute the Hausdorff distance between e and each boundary point set in L;

(4.1.4) Output the template gesture sequence i with the smallest Hausdorff distance; this sequence is the recognized correct gesture sequence.

5.
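Steps (4.1.1)–(4.1.4) can be sketched as below; the symmetric form of the Hausdorff distance is an assumption, since the claim only names "Hausdorff distance":

```python
import numpy as np

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two 2-D point sets of shape
    (N, 2) and (M, 2): the larger of the two directed distances
    max_a min_b ||a - b|| and max_b min_a ||a - b||."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # (N, M)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def recognize_by_hausdorff(e, templates):
    """(4.1.3)-(4.1.4): return the index i of the trained boundary point
    set in `templates` (the set L) closest to the boundary point set e
    of the gesture's last frame."""
    dists = [hausdorff(e, L_i) for L_i in templates]
    return int(np.argmin(dists))
```

For large point sets the pairwise-distance matrix grows as N×M; a production version would likely use a spatial index instead, but the brute-force form matches the claim directly.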
The dynamic gesture recognition method based on frame fusion technology according to claim 3, characterized in that the fingertip feature point method of step (4.2) comprises the following steps:

(4.2.1) First, train the gesture set: detect the fingertip point information of the gestures in the set, and record in a vector template G the feature information of the vector formed by each fingertip point and the gesture's centre of gravity;

(4.2.2) Then detect the fingertip vectors of the gesture to be recognized, compare them with the fingertip vectors in the vector template G, and output the template gesture sequence i with the greatest similarity; this sequence is the recognized correct gesture sequence.
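A minimal sketch of steps (4.2.1)–(4.2.2); the claims do not define the similarity measure, so mean cosine similarity between corresponding fingertip-to-centroid vectors is an assumption, as is the requirement that query and template have matching fingertip counts:

```python
import numpy as np

def fingertip_vectors(tips, centroid):
    """(4.2.1): vectors from the hand's centre of gravity to each
    detected fingertip point; tips is an (n, 2) array-like."""
    return np.asarray(tips, dtype=float) - np.asarray(centroid, dtype=float)

def similarity(v, w):
    """Mean cosine similarity between corresponding fingertip vectors
    (assumed measure; the claim only asks for 'greatest similarity')."""
    v, w = np.atleast_2d(v), np.atleast_2d(w)
    num = (v * w).sum(axis=1)
    den = np.linalg.norm(v, axis=1) * np.linalg.norm(w, axis=1) + 1e-12
    return float((num / den).mean())

def recognize_by_fingertips(vecs, template_G):
    """(4.2.2): output the template gesture index i whose stored
    fingertip vectors are most similar to the query's."""
    scores = [similarity(vecs, g) for g in template_G]
    return int(np.argmax(scores))
```

Cosine similarity makes the match invariant to hand size (vector length), which is one plausible reason to compare fingertip directions rather than raw positions.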
CN201310374176.9A 2013-08-23 2013-08-23 A Dynamic Gesture Recognition Method Based on Frame Fusion Technology Expired - Fee Related CN103455794B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310374176.9A CN103455794B (en) 2013-08-23 2013-08-23 A Dynamic Gesture Recognition Method Based on Frame Fusion Technology

Publications (2)

Publication Number Publication Date
CN103455794A CN103455794A (en) 2013-12-18
CN103455794B true CN103455794B (en) 2016-08-10

Family

ID=49738138

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310374176.9A Expired - Fee Related CN103455794B (en) 2013-08-23 2013-08-23 A Dynamic Gesture Recognition Method Based on Frame Fusion Technology

Country Status (1)

Country Link
CN (1) CN103455794B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2014101965A (en) * 2014-01-22 2015-07-27 ЭлЭсАй Корпорейшн IMAGE PROCESSOR CONTAINING A GESTURE RECOGNITION SYSTEM WITH RECOGNITION OF A STATIC POSITION OF HAND BRUSH, BASED ON DYNAMIC CHANGE OF TIME
CN104050454B (en) * 2014-06-24 2017-12-19 深圳先进技术研究院 A kind of motion gesture track acquisition methods and system
CN104102904B (en) * 2014-07-14 2016-03-23 济南大学 A static gesture recognition method
CN106295463B (en) * 2015-05-15 2019-05-07 济南大学 An Eigenvalue-based Gesture Recognition Method
CN106295464A (en) * 2015-05-15 2017-01-04 济南大学 Gesture identification method based on Shape context
CN106372564A (en) * 2015-07-23 2017-02-01 株式会社理光 Gesture identification method and apparatus
CN105893959B (en) * 2016-03-30 2019-04-12 北京奇艺世纪科技有限公司 A kind of gesture identification method and device
CN107403167B (en) * 2017-08-03 2020-07-03 华中师范大学 Gesture recognition method and device
CN107766842B (en) * 2017-11-10 2020-07-28 济南大学 A gesture recognition method and its application
CN108197596B (en) 2018-01-24 2021-04-06 京东方科技集团股份有限公司 A kind of gesture recognition method and device
CN108520205B (en) * 2018-03-21 2022-04-12 安徽大学 motion-KNN-based human body motion recognition method
CN109271840A (en) * 2018-07-25 2019-01-25 西安电子科技大学 Video gesture classification method
CN109634415B (en) * 2018-12-11 2019-10-18 哈尔滨拓博科技有限公司 It is a kind of for controlling the gesture identification control method of analog quantity
CN110717385A (en) * 2019-08-30 2020-01-21 西安文理学院 Dynamic gesture recognition method
CN114067425B (en) * 2020-07-30 2025-10-24 华为技术有限公司 Gesture recognition method and device
CN114229451A (en) * 2021-12-30 2022-03-25 宁波智能成型技术创新中心有限公司 Intelligent grabbing anti-falling detection and regulation method based on multi-axial force and moment

Citations (2)

Publication number Priority date Publication date Assignee Title
US6002808A (en) * 1996-07-26 1999-12-14 Mitsubishi Electric Information Technology Center America, Inc. Hand gesture control system
CN102830938A (en) * 2012-09-13 2012-12-19 济南大学 3D (three-dimensional) human-computer interaction method based on gesture and animation

Non-Patent Citations (2)

Title
A moving-hand tracking method based on behaviour analysis; Xu Ting; China Masters' Theses Full-text Database, Information Science and Technology; 2011-10-15 (No. 10); p. 16 para. [4] - p. 19 para. [5], p. 29 para. [1] *
An automatic fabric weave structure recognition system based on computer vision; Zhang Xiang; China Masters' Theses Full-text Database, Information Science and Technology; 2007-04-15 (No. 4); p. 10 para. [2] *

Also Published As

Publication number Publication date
CN103455794A (en) 2013-12-18

Similar Documents

Publication Publication Date Title
CN103455794B (en) A Dynamic Gesture Recognition Method Based on Frame Fusion Technology
CN107134144B (en) A kind of vehicle checking method for traffic monitoring
CN109522853B (en) Face detection and search method for surveillance video
CN102831404B (en) Gesture detecting method and system
CN103824089B (en) Cascade regression-based face 3D pose recognition method
CN101807256B (en) Object identification detection method based on multiresolution frame
CN103870811B (en) A kind of front face Quick method for video monitoring
CN106682598A (en) Multi-pose facial feature point detection method based on cascade regression
CN106407958B (en) Face feature detection method based on double-layer cascade
WO2020211447A1 (en) Gesture trajectory recognition method, based on hand speed and trajectory distribution, for following robot
CN103207898A (en) Method for rapidly retrieving similar faces based on locality sensitive hashing
CN105678231A (en) Pedestrian image detection method based on sparse coding and neural network
CN102938065A (en) Facial feature extraction method and face recognition method based on large-scale image data
CN103473571A (en) Human detection method
CN105760472A (en) Video retrieval method and system
CN106095104A (en) Continuous gesture path dividing method based on target model information and system
CN105956552A (en) Face black list monitoring method
CN107315998A (en) Vehicle class division method and system based on lane line
CN106886757B (en) A multi-class traffic light detection method and system based on prior probability map
CN105320764A (en) 3D model retrieval method and 3D model retrieval apparatus based on slow increment features
CN108038434A (en) Video human face expression pre-detection method based on multi-instance learning
CN103871081A (en) Method for tracking self-adaptive robust on-line target
CN110728185A (en) Detection method for judging existence of handheld mobile phone conversation behavior of driver
CN106127104A (en) Prognoses system based on face key point and method thereof under a kind of Android platform
CN107609464B (en) A kind of real-time face rapid detection method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160810