CN112347951B - Gesture recognition method, device, storage medium and data glove - Google Patents
- Publication number: CN112347951B (application CN202011253584.5A)
- Authority: CN (China)
- Prior art keywords: data, gesture, gesture recognition, input data, SVM classifier
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V40/28 — Recognition of hand or arm movements, e.g. recognition of deaf sign language
- G06F18/2135 — Feature extraction based on approximation criteria, e.g. principal component analysis
- G06F18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/2411 — Classification based on the proximity to a decision surface, e.g. support vector machines
- G06F3/014 — Hand-worn input/output arrangements, e.g. data gloves
- G06N3/045 — Combinations of networks
- G06V10/30 — Noise filtering
Abstract
The invention provides a gesture recognition method, device, storage medium and data glove. The method includes: acquiring the sensor data collected by each sensor of a data glove when the data glove completes the current action, all the sensor data forming one input data sample; extracting features from the input data by principal component analysis to obtain second feature data; inputting the second feature data into a trained multi-class SVM classifier to determine the gesture corresponding to the current action; when the trained multi-class SVM classifier cannot recognize the current action, preprocessing the input data to obtain preprocessed data; and inputting the preprocessed data into a trained gesture recognition model, which outputs the gesture corresponding to the current action, the gesture recognition model being built on a convolutional neural network and a long short-term memory (LSTM) recurrent network. The technical solution of the invention preserves recognition accuracy while increasing recognition speed.
Description
Technical field
The present invention relates to the technical field of gesture recognition, and in particular to a gesture recognition method, device, storage medium and data glove.
Background
Sign language expresses meaning through hand gestures: changing gestures mimic images or syllables to form meanings or words. It is a language of the hands by which people who are hearing-impaired or unable to speak communicate and exchange ideas. Recognizing gestures is therefore essential for communicating with sign language users, and two methods are currently in common use.
One uses a camera to capture the gesture and analyzes the captured images to recognize it. However, the camera is very sensitive to lighting while capturing gestures, and poor illumination degrades recognition accuracy.
The other acquires a surface electromyogram of the hand as it completes the gesture and recognizes the gesture by analyzing the electromyogram. Existing algorithms for recognizing gestures from electromyograms, however, are relatively complex and inefficient.
Summary of the invention
The problem solved by the present invention is how to balance the efficiency and accuracy of gesture recognition.
To solve this problem, the present invention provides a gesture recognition method, device, storage medium and data glove.
In a first aspect, the present invention provides a gesture recognition method, including:
acquiring the sensor data collected by each sensor of a data glove when the data glove completes the current action, all the sensor data forming one input data sample;
extracting features from the input data by principal component analysis to obtain second feature data;
inputting the second feature data into a trained multi-class SVM classifier to determine the gesture corresponding to the current action;
when the trained multi-class SVM classifier cannot recognize the current action, preprocessing the input data to obtain preprocessed data;
inputting the preprocessed data into a trained gesture recognition model, which outputs the gesture corresponding to the current action, the gesture recognition model being built on a convolutional neural network and a long short-term memory (LSTM) recurrent network.
Further, before inputting the second feature data into the trained multi-class SVM classifier, the method includes:
separately acquiring the input data produced when the data glove completes different calibration actions;
amplifying and filtering each input data sample to obtain filtered input data;
extracting features from all the filtered input data by principal component analysis to obtain first feature data;
training a multi-class SVM classifier with the first feature data to obtain the trained multi-class SVM classifier.
Further, extracting features from all the filtered input data by principal component analysis includes:
computing the average of all the filtered input data;
determining the difference between each filtered input data sample and the average, and determining a covariance matrix from all the differences;
computing eigenvalues and eigenvectors from the covariance matrix, and determining a principal component matrix from the eigenvectors;
determining the first feature data from the principal component matrix and the differences.
Further, the calibration actions correspond one-to-one with the gesture templates, and training the multi-class SVM classifier with the feature data includes:
for any gesture template, taking the first feature data corresponding to that template as the positive set and the first feature data outside the positive set as the negative set, the corresponding positive and negative sets forming one training set;
inputting the training set into the multi-class SVM classifier, which includes multiple classification functions, each of which processes the training set and outputs one first classification value;
determining the maximum of the first classification values and the classification function that produced it, and associating that classification function with the gesture template;
processing each of the first feature data in turn, until the gesture templates correspond one-to-one with the classification functions.
Further, inputting the second feature data into the trained multi-class SVM classifier and determining the gesture corresponding to the current action includes:
inputting the second feature data into the trained multi-class SVM classifier, each classification function processing the second feature data and outputting one second classification value;
determining the maximum and the second-largest of all the second classification values and comparing each against a preset threshold; when the maximum is greater than or equal to the preset threshold and the second-largest value is below it, determining that the gesture template associated with the classification function that output the maximum is the gesture corresponding to the current action.
Further, the trained multi-class SVM classifier being unable to recognize the current action includes the case in which both the maximum and the second-largest of the second classification values are greater than or equal to the preset threshold.
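This acceptance rule can be sketched as follows; the threshold value and the example scores are illustrative assumptions, not values fixed by the patent:

```python
def svm_decision(second_class_values, threshold):
    """Apply the acceptance rule to the per-template classification values.

    Accept only when the largest value reaches the threshold while the
    second-largest stays below it; any other pattern (including two values
    at or above the threshold) counts as "cannot recognize", and the
    sample falls through to the CNN-LSTM gesture recognition model.
    """
    ranked = sorted(range(len(second_class_values)),
                    key=lambda i: second_class_values[i], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if (second_class_values[best] >= threshold
            and second_class_values[runner_up] < threshold):
        return best       # index of the matching gesture template
    return None           # ambiguous or too weak: defer to the deep model

print(svm_decision([0.2, 1.4, 0.1], threshold=1.0))  # 1 (clear winner)
print(svm_decision([1.3, 1.4, 0.1], threshold=1.0))  # None (two high values)
```

Returning `None` rather than a forced label is what triggers the fallback path to the deep model in the two-stage scheme.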
Further, before inputting the preprocessed data into the trained gesture recognition model, the method includes:
separately acquiring the input data produced when the data glove completes different calibration actions, each input data sample including all the sensor data corresponding to one gesture template;
preprocessing all the input data to obtain the preprocessed data;
building a gesture recognition model based on a convolutional neural network and an LSTM recurrent network, and training the gesture recognition model with the preprocessed data to obtain the trained gesture recognition model.
Further, each input data sample includes all the sensor data corresponding to one calibration action, and preprocessing all the input data to obtain the preprocessed data includes:
for any input data sample, synchronizing all the sensor data corresponding to that sample with a time synchronization mechanism based on time-slotted channel hopping, obtaining synchronized sensor data;
filtering the synchronized sensor data with a Butterworth bandpass filter to obtain filtered sensor data;
intercepting the filtered sensor data with a sliding window to obtain multiple data segments, all the data segments together forming the preprocessed data.
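The final sliding-window step can be sketched as below; the window length, step size and channel count are illustrative assumptions, and the synchronization and Butterworth-filtering stages are assumed to have already produced the `filtered` array:

```python
import numpy as np

def sliding_windows(filtered, win_len, step):
    """Cut a filtered multi-channel recording (samples x channels) into
    fixed-length, possibly overlapping segments; together the segments
    form the preprocessed data fed to the gesture recognition model."""
    starts = range(0, filtered.shape[0] - win_len + 1, step)
    return np.stack([filtered[s:s + win_len] for s in starts])

# 100 synchronized, band-pass-filtered samples from 15 hypothetical channels
filtered = np.zeros((100, 15))
segments = sliding_windows(filtered, win_len=40, step=20)  # 50% overlap
print(segments.shape)  # (4, 40, 15)
```

Overlapping windows (step smaller than the window length) also multiply the number of training segments obtained from each calibration recording.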
Further, training the gesture recognition model with the preprocessed data includes a forward propagation step, which includes:
inputting the preprocessed data into the gesture recognition model, which outputs the probability that the calibration action is each gesture template;
determining the gesture template with the highest probability to be the predicted gesture.
Further, the gesture recognition model includes, connected in sequence, two one-dimensional convolutional layers, a max pooling layer, a flattening layer, an LSTM layer, two fully connected layers, a Softmax layer and an output layer; inputting the preprocessed data into the gesture recognition model and outputting the probability that the calibration action is each gesture template includes:
inputting the preprocessed data into the first one-dimensional convolutional layer, the two one-dimensional convolutional layers extracting features from the preprocessed data to obtain third feature data, all the third feature data forming a feature map;
inputting the feature map into the max pooling layer, which extracts features from each sub-region of the feature map to obtain fourth feature data;
inputting the fourth feature data into the flattening layer, which reshapes the fourth feature data into a one-dimensional vector; inputting the one-dimensional vector into the LSTM layer for processing and outputting the processed data;
inputting the processed data in sequence through the two fully connected layers and the Softmax layer, outputting the probability that the calibration action is each gesture template;
the output layer outputting the gesture template with the highest probability, that gesture template being the predicted gesture.
Further, the gesture recognition model also includes multiple dropout layers, two of which are arranged between the second one-dimensional convolutional layer and the max pooling layer, and two of which are arranged between the LSTM layer and the first fully connected layer.
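The tensor shapes through this stack can be traced with a small calculation; the kernel size, filter count, LSTM width, dense-layer widths and number of gesture templates below are illustrative assumptions, since the patent does not fix them:

```python
def conv1d_len(n, kernel, stride=1):
    # output length of a 'valid' one-dimensional convolution
    return (n - kernel) // stride + 1

def trace_shapes(win_len=40, kernel=3, n_filters=64, pool=2,
                 lstm_units=100, dense=(128, 64), n_gestures=10):
    """Shape of the tensor after each stage of the described stack:
    Conv1D -> Conv1D -> max pooling -> flatten -> LSTM -> two dense
    layers -> softmax (dropout layers do not change shapes)."""
    t = conv1d_len(win_len, kernel)          # first Conv1D
    t = conv1d_len(t, kernel)                # second Conv1D
    t //= pool                               # max pooling over time
    return {
        "pool": (t, n_filters),
        "flatten": (t * n_filters,),         # flattening layer
        "lstm": (lstm_units,),               # last hidden state
        "dense1": (dense[0],),
        "dense2": (dense[1],),
        "softmax": (n_gestures,),            # one probability per template
    }

print(trace_shapes()["flatten"])  # (1152,)
```

For a 40-sample window, two kernel-3 convolutions shrink the time axis to 36, pooling halves it to 18, and flattening yields an 18 × 64 = 1152-element vector for the LSTM stage.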
Further, training the gesture recognition model with the preprocessed hand data also includes:
a backpropagation step, including computing a cross-entropy loss from the calibration action and the predicted gesture, and optimizing the gesture recognition model according to the loss;
repeating the forward propagation step and the backpropagation step in a loop until the loss no longer decreases, obtaining a stable gesture recognition model.
In a second aspect, the present invention provides a gesture recognition device, including:
an acquisition module for acquiring the sensor data collected by each sensor of a data glove when the data glove completes the current action, all the sensor data forming one input data sample;
a feature extraction module for extracting features from the input data by principal component analysis to obtain second feature data;
a first processing module for inputting the second feature data into a trained multi-class SVM classifier to determine the gesture corresponding to the current action;
a preprocessing module for preprocessing the input data to obtain preprocessed data when the trained multi-class SVM classifier cannot recognize the current action;
a second processing module for inputting the preprocessed data into a trained gesture recognition model and outputting the gesture corresponding to the current action, the gesture recognition model being built on a convolutional neural network and an LSTM recurrent network.
In a third aspect, the present invention provides a gesture recognition device, including a memory and a processor;
the memory for storing a computer program;
the processor for implementing the gesture recognition method described above when executing the computer program.
In a fourth aspect, the present invention provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the gesture recognition method described above is realized.
In a fifth aspect, the present invention provides a data glove, including a glove, multiple sensors and the gesture recognition device described above; the sensors are each electrically connected to the gesture recognition device and are separately arranged on the glove, adapted to detect the motion data of each finger.
Further, the sensors include multiple fully flexible capacitive sensors and/or multiple piezoresistive sensors;
the fully flexible capacitive sensors are arranged on the back of each finger of the glove and at the web of the thumb on the back of the hand, to measure the motion data of finger flexion or extension and of the thumb's lateral movement, respectively;
the piezoresistive sensors are arranged at the joints on the inner side of each finger of the glove and between adjacent fingers, to measure the motion data of each finger joint and of finger abduction or adduction, respectively.
The beneficial effects of the gesture recognition method, device, storage medium and data glove of the present invention are as follows. The sensor data collected by each sensor while the data glove completes the current action are acquired, and all the sensor data corresponding to the current action form one input data sample. Features are extracted from the input data by principal component analysis and fed to a trained multi-class SVM classifier, which recognizes the gesture corresponding to the current action; the multi-class SVM classifier is fast, simple and efficient. When the multi-class SVM classifier cannot recognize the current action, the corresponding input data are preprocessed and fed into a trained gesture recognition model, which outputs the gesture; this model is built on a deep learning architecture and can recognize gestures accurately. The technical solution of the present invention thus first recognizes gestures with the fast multi-class SVM classifier and falls back to the gesture recognition model only when the classifier cannot decide, preserving recognition accuracy while increasing recognition speed.
Brief description of the drawings
Fig. 1 is a schematic diagram of the back of a data glove according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the front of a data glove according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the circuit connections of a data glove according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of sensor signals under different gestures according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart of a multi-class SVM classifier training method according to an embodiment of the present invention;
Fig. 6 is a schematic flowchart of a gesture recognition model training method according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a gesture recognition model according to an embodiment of the present invention;
Fig. 8 is a schematic flowchart of a gesture recognition method according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a gesture recognition model according to an embodiment of the present invention.
Explanation of reference signs:
10 - glove; 20 - fully flexible capacitive sensor; 30 - piezoresistive sensor; 40 - base.
Detailed description of embodiments
To make the above objects, features and advantages of the present invention easier to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
It should be noted that the terms "first", "second" and the like in the specification, the claims and the drawings are used to distinguish similar objects and do not necessarily describe a particular order or sequence. In the description of the present invention, "multiple" means at least two, for example two or three, unless specifically defined otherwise. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the invention described here can be practiced in sequences other than those illustrated or described.
As shown in Fig. 1 and Fig. 2, a data glove provided by the present invention includes a glove 10, multiple sensors and the gesture recognition device described below; the sensors are each electrically connected to the gesture recognition device and are separately arranged on the glove 10, adapted to detect the motion data of each finger.
Preferably, the sensors include multiple fully flexible capacitive sensors 20 and/or multiple piezoresistive sensors 30;
the fully flexible capacitive sensors 20 are arranged on the back of each finger of the glove 10 and at the web of the thumb on the back of the hand, to measure the motion data of finger flexion or extension and of the thumb's lateral movement, respectively;
the piezoresistive sensors 30 are arranged at the joints on the inner side of each finger of the glove 10 and between adjacent fingers, to measure the motion data of each finger joint and of finger abduction or adduction, respectively.
Specifically, as shown in Fig. 3, the output of the piezoresistive sensor 30 or fully flexible capacitive sensor 20 at each finger stall of the data glove is electrically connected to the input of an amplifier; the amplifier's output is connected to the input of a filter, the filter's output to the input of an A/D conversion processor, and the A/D conversion processor's output to a communication device. The amplifier amplifies the signal detected by the sensor, the filter filters the amplified signal, the A/D conversion processor converts the filtered signal into a digital signal, and the communication device transmits the digital signal to a host computer or similar device for processing and gesture recognition.
The amplifier, filter, A/D conversion processor and communication device can be integrated on one circuit board, which is mounted via a base 40 at any position convenient for wearing and movement, for example the back of the wrist, the back of the palm, the forearm or the upper arm; the base 40 can be 3D-printed.
When a piezoresistive sensor 30 detects pressure, or a fully flexible capacitive sensor 20 detects stretching, its voltage signal changes. As shown in Fig. 4, when the data glove uses fully flexible capacitive sensors 20 and several fingers move at the same time, the sensor 20 on each finger stall records that finger's motion data; the motion data of all fingers under one gesture combine into one input data sample, and the five fingers' acquisition channels are mutually independent. When a finger moves, the voltage signal detected by its sensor changes and produces a voltage peak. In Fig. 4, for example, the thumb does not move in gesture "A" and produces no voltage peak, while the other four fingers bend and each produces a peak; in gesture "3" the thumb, index finger and middle finger do not move and produce no peaks, while the ring finger and little finger bend and each produces a peak.
As shown in Fig. 5, a training method for a multi-class SVM (Support Vector Machine) classifier provided by an embodiment of the present invention includes:
Step 110, separately acquiring the input data produced when the data glove completes different calibration actions;
Step 120, amplifying and filtering each input data sample to obtain filtered input data;
Step 130, extracting features from all the filtered input data by principal component analysis to obtain the first feature data;
Step 140, training a multi-class SVM classifier with the first feature data to obtain the trained multi-class SVM classifier.
In this embodiment, amplifying and filtering the input data reduces noise. Principal component analysis extracts the feature data from the input data corresponding to each gesture template for training the multi-class SVM classifier, which speeds up training. The trained multi-class SVM classifier can then classify and recognize gestures.
Preferably, extracting features from all the filtered input data by principal component analysis includes:
computing the average of all the filtered input data.
Specifically, the filtered input data corresponding to the gesture templates form a set S = {s1, s2, s3, ..., sn}, where si is the filtered input data corresponding to the i-th gesture template. The average of all the filtered input data is computed with a first formula:
Savg = (1/n) ∑i=1..n si,
where n is the number of gesture templates and Savg is the average of all the filtered input data.
The difference between each filtered input data sample and the average is determined, and a covariance matrix is determined from all the differences.
Specifically, the difference between each filtered input data sample and the average is determined with a second formula:
δi = si − Savg,
where δi is the difference between the filtered input data corresponding to the i-th gesture template and the average.
The covariance matrix is computed with a third formula:
M = (1/n) ∑i=1..n δi δiT,
where M is the covariance matrix and δiT is the transpose of the difference vector δi.
Eigenvalues and eigenvectors are computed from the covariance matrix, and a principal component matrix is determined from the eigenvectors.
Specifically, the eigenvalues and eigenvectors are computed with a fourth formula:
M × vi = λi × vi (i = 1, 2, 3, ..., k),
where (λ1, λ2, λ3, ..., λk) are the eigenvalues, (v1, v2, v3, ..., vk) are the eigenvectors, and the principal component matrix is V = {v1, v2, v3, ..., vk}.
The first feature data are determined from the principal component matrix and the differences.
Specifically, the input data are projected onto the principal component matrix with a fifth formula:
yi = VT × δi,
where yi is the feature data corresponding to the difference vector δi.
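The five formulas above can be sketched in NumPy as follows; the sample count, channel count and number of retained components are illustrative assumptions, not values fixed by the patent:

```python
import numpy as np

def pca_features(S, k):
    """Feature extraction following the five formulas above.

    S: (n, d) array with one filtered input data sample per row.
    k: number of principal components to keep (an assumed choice).
    Returns the projections y_i = V^T (s_i - S_avg), one row per sample.
    """
    S_avg = S.mean(axis=0)                  # first formula: average
    delta = S - S_avg                       # second formula: differences
    M = (delta.T @ delta) / len(S)          # third formula: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(M)    # fourth formula: M v_i = lambda_i v_i
    order = np.argsort(eigvals)[::-1][:k]   # keep the k largest eigenvalues
    V = eigvecs[:, order]                   # principal component matrix
    return delta @ V                        # fifth formula: y_i = V^T delta_i

# toy example: 20 samples from 15 hypothetical glove channels
rng = np.random.default_rng(0)
Y = pca_features(rng.normal(size=(20, 15)), k=5)
print(Y.shape)  # (20, 5)
```

Because the projections are taken from mean-centred differences, each retained component of the output has zero mean across the samples.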
Preferably, the calibration actions correspond one-to-one with the gesture templates, and training the multi-class SVM classifier with the feature data includes:
for any gesture template, taking the first feature data corresponding to that template as the positive set and the first feature data outside the positive set as the negative set, the corresponding positive and negative sets forming one training set;
inputting the training set into the multi-class SVM classifier, which includes multiple classification functions, each of which processes the training set and outputs one first classification value;
determining the maximum of the first classification values and the classification function that produced it, and associating that classification function with the gesture template;
processing each of the first feature data in turn, until the gesture templates correspond one-to-one with the classification functions.
具体地,多类SVM分类器采用一对全部的支持向量机,获取手势模板对应的输入数据训练多类SVM分类器,多类SVM分类器包括多个分类函数,有多少个手势模板就构造多少个分类函数,多类SVM分类器的一个分类函数能够将对应的手势模板与其它手势模板区分开,分类函数与手势模板一一对应。采用训练好的多类SVM分类器识别手势时,确定手部动作对应的分类函数后,能够迅速确定手势模板,识别手部动作对应的手势,简单高效、速度快。Specifically, the multi-class SVM classifier uses a one-to-all support vector machine to obtain the input data corresponding to the gesture template to train the multi-class SVM classifier. The multi-class SVM classifier includes multiple classification functions, and how many gesture templates are constructed. A classification function, a classification function of the multi-class SVM classifier can distinguish the corresponding gesture template from other gesture templates, and the classification function corresponds to the gesture template one by one. When the trained multi-class SVM classifier is used to recognize gestures, after determining the classification function corresponding to the hand movement, the gesture template can be quickly determined, and the gesture corresponding to the hand movement can be recognized, which is simple, efficient and fast.
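A one-versus-all scheme of this kind can be sketched with scikit-learn (assumed available); the toy feature data, the class count, and the use of `decision_function` as the per-template classification functions are illustrative stand-ins, not the patent's trained classifier:

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# toy stand-in for the PCA feature data: 3 gesture templates, 20 samples each
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(20, 5)) for c in range(3)])
y = np.repeat(np.arange(3), 20)

# one binary classification function per gesture template
clf = OneVsRestClassifier(LinearSVC())
clf.fit(X, y)

scores = clf.decision_function(X[:1])  # one classification value per function
pred = int(np.argmax(scores))          # template of the highest-scoring function
```

The recognized gesture is simply the template whose classification function outputs the maximum value.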
As shown in FIG. 6, an embodiment of the present invention provides a method for training a gesture recognition model, including:

Step 210: separately acquiring the input data produced when the data glove completes each of the different calibration actions, each input data comprising all the sensor data corresponding to one gesture template;

Step 220: preprocessing all the input data to obtain preprocessed data;

Step 230: constructing a gesture recognition model based on a convolutional neural network and a long short-term memory (LSTM) recurrent network, and training the gesture recognition model with the preprocessed data to obtain a trained gesture recognition model.

In this embodiment, the input data acquired while the data glove performs the calibration actions corresponding to the different gesture templates is used to train the gesture recognition model. Because the model is built on a convolutional neural network and an LSTM recurrent network, little training data is required, and recognizing gestures with this deep-learning-based model achieves high accuracy.
Preferably, each input data includes all the sensor data corresponding to one calibration action, and preprocessing all the input data to obtain the preprocessed data includes:

for any input data, synchronizing all sensor data corresponding to that input data with a time-slotted channel hopping (TSCH) based time synchronization mechanism to obtain synchronized sensor data;

filtering the synchronized sensor data with a Butterworth band-pass filter to obtain filtered sensor data.

Specifically, since human hand movement frequencies are usually below 10 Hz, after the data glove acquires the sensor data, a Butterworth band-pass filter with cutoff frequencies of 0.5 Hz and 10 Hz is applied, which removes unrelated energy. Moreover, because of strong biometric noise and wrinkle errors from a loosely worn glove, sensor data collected for the same activity may show low similarity: the average Pearson correlation coefficient between sensor recordings of the same activity is roughly 0.2 to 0.4. The Butterworth band-pass filter increases this similarity by removing noise in other frequency bands, raising the Pearson correlation for the same activity to about 0.7 to 0.8.
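Such a band-pass stage can be sketched with SciPy (assumed available). The 0.5 to 10 Hz passband is from the text; the 4th filter order, the 128 Hz sampling rate, and the synthetic 2 Hz "motion" plus 50 Hz interference are illustrative assumptions:

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 128.0  # assumed sampling rate, matching the windowing embodiment
b, a = butter(N=4, Wn=[0.5, 10.0], btype="bandpass", fs=fs)  # 0.5-10 Hz passband

t = np.arange(0, 4.0, 1.0 / fs)
motion = np.sin(2 * np.pi * 2.0 * t)        # 2 Hz hand-motion-like component
noise = 0.5 * np.sin(2 * np.pi * 50.0 * t)  # 50 Hz interference, outside the passband
filtered = filtfilt(b, a, motion + noise)   # zero-phase filtering
```

After filtering, the 2 Hz component survives almost unchanged while the 50 Hz interference is suppressed, which is the mechanism behind the Pearson-correlation improvement described above.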
A sliding window is used to segment the filtered sensor data into multiple data segments; all the data segments form the preprocessed data.

Specifically, a sliding window with a fixed duration of 4 seconds may be used to segment the filtered sensor data, with 50% overlap between adjacent windows. If the sensor data is sampled at 128 Hz, the sliding-window design yields 512 samples per window (4 s × 128 Hz). The one-dimensional time-series data obtained from each window is reshaped into a two-dimensional matrix of 16 × 32 = 512 samples, and this two-dimensional matrix serves as the input to the gesture recognition model.
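The windowing just described (4 s windows, 50% overlap, each 512-sample window reshaped to 16 × 32) can be sketched as follows; the function name `segment` is illustrative:

```python
import numpy as np

def segment(signal, fs=128, win_s=4.0, overlap=0.5):
    """Cut a filtered 1-D sensor stream into overlapping windows and
    reshape each 512-sample window into a 16x32 matrix for the model."""
    win = int(fs * win_s)             # 512 samples per window
    step = int(win * (1 - overlap))   # 256-sample hop gives 50% overlap
    segments = [signal[i:i + win].reshape(16, 32)
                for i in range(0, len(signal) - win + 1, step)]
    return np.stack(segments)

x = np.arange(128 * 10, dtype=float)  # 10 s of samples at 128 Hz
batch = segment(x)                    # shape (n_windows, 16, 32)
```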
Preferably, training the gesture recognition model with the preprocessed hand data includes a forward propagation step, which includes:

inputting the preprocessed data into the gesture recognition model and outputting the probability that the calibration action corresponds to each gesture template;

determining the gesture template with the highest probability to be the predicted gesture.

Preferably, as shown in FIG. 7, the gesture recognition model includes, connected in sequence, two one-dimensional convolutional layers (Conv1D), a max pooling layer, a flatten layer, an LSTM (long short-term memory recurrent network) layer, two fully connected layers, a Softmax layer, and an output layer. Inputting the preprocessed data into the gesture recognition model and outputting the probability that the calibration action corresponds to each gesture template includes:

inputting the preprocessed data into the first one-dimensional convolutional layer; the two one-dimensional convolutional layers perform feature extraction on the preprocessed data to obtain third feature data, and all the third feature data form a feature map.

Specifically, the preprocessed data is reshaped from one-dimensional time-series data into a two-dimensional matrix to satisfy the input-shape requirement of a one-dimensional convolutional layer under the TensorFlow Keras (machine learning) framework, where one dimension is the time steps and the other dimension is the features at each time step. Using two one-dimensional convolutional layers improves the robustness of the extracted features. A one-dimensional convolutional layer is a variant of a CNN (convolutional neural network) specialized for sequence and time-series data: its convolution filters move only along the time direction of the data, so a one-dimensional convolutional layer can derive features from fixed-length segments of the data. Applied to gesture recognition, a CNN has two advantages over other models, local dependency and scale invariance: local dependency means that nearby signals are likely to be correlated, and scale invariance means the model is invariant to differences in pace or frequency.
The feature map is input into the max pooling layer, which performs feature extraction on each sub-region of the feature map to obtain fourth feature data.

Specifically, after the convolutions are completed, the max pooling layer extracts the most important feature from each region of the feature map output by the one-dimensional convolutional layers, which reduces the number of features and speeds up the training process.

The fourth feature data is input into the flatten layer and reshaped into a one-dimensional vector; the one-dimensional vector is input into the LSTM layer for processing, and the processed data is output.

Specifically, because the output after the one-dimensional convolutional and max pooling layers is a two-dimensional matrix while the LSTM layer requires a one-dimensional vector as input, the flatten layer flattens the feature data output by the max pooling layer into a one-dimensional vector, which is fed to the LSTM layer as one-dimensional time-series data. The flatten layer can be viewed as a bridge between the CNN layers and the LSTM layer, converting two-dimensional data into one-dimensional data for the LSTM without losing information.
The processed data is passed in sequence through the two fully connected layers and the Softmax layer, which outputs the probability that the calibration action corresponds to each gesture template.

Specifically, rectified linear unit (ReLU) activation is used in the first fully connected layer, and Softmax activation is used in the second fully connected layer to output the class label, the class label being the gesture. ReLU is used as an activation function in deep learning models because it meets the requirement of fast convergence and mitigates the vanishing-gradient problem, mainly because its gradient is non-saturating, which greatly accelerates the convergence of gradient descent, and its gradient of 0 or 1 avoids vanishing gradients.

The output layer outputs the gesture template with the highest probability, which is the predicted gesture.

Specifically, the Softmax layer is commonly used as the last layer of a model and is the activation function commonly used for classification problems: it outputs, for the action corresponding to the gesture template, a probability for each class label, with the probabilities of all class labels summing to 1. The class label with the highest probability is the model's predicted class label, i.e., the model's predicted gesture.
Preferably, the gesture recognition model further includes a plurality of dropout layers, wherein one dropout layer is arranged between the second one-dimensional convolutional layer and the max pooling layer, and another dropout layer is arranged between the LSTM layer and the first fully connected layer.

Specifically, training a neural network on a relatively small dataset leads to overfitting of the training data, because the model learns the statistical noise in the training data and then performs poorly when evaluated on test or new data. Therefore, to prevent overfitting and reduce generalization error, dropout layers are introduced into the deep learning framework so that the model learns robust features. The dropout rate may be set to 0.5, meaning each dropout layer randomly sets 50% of its input units to zero.
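Under the TensorFlow Keras framework named above, the layer sequence of FIG. 7 together with the dropout placement just described can be sketched as below. The filter counts, kernel sizes, LSTM and dense unit counts are assumptions for illustration; the patent does not fix them. The flatten-then-reshape pair mirrors the text's description of feeding the flattened vector to the LSTM as a one-dimensional time series:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(n_gestures, time_steps=32, features=16):
    """Sketch of the CNN-LSTM gesture model; layer widths are assumptions."""
    m = models.Sequential([
        layers.Input(shape=(time_steps, features)),
        layers.Conv1D(64, 3, activation="relu"),  # filters move along time only
        layers.Conv1D(64, 3, activation="relu"),
        layers.Dropout(0.5),                      # dropout after the second Conv1D
        layers.MaxPooling1D(2),                   # keep the strongest local features
        layers.Flatten(),                         # bridge from CNN output to LSTM input
        layers.Reshape((-1, 1)),                  # re-interpret as a 1-D time series
        layers.LSTM(32),
        layers.Dropout(0.5),                      # dropout between LSTM and first FC
        layers.Dense(64, activation="relu"),      # first fully connected layer, ReLU
        layers.Dense(n_gestures, activation="softmax"),  # second FC + Softmax
    ])
    # cross-entropy loss, as used in the backpropagation step
    m.compile(optimizer="adam", loss="categorical_crossentropy")
    return m

model = build_model(n_gestures=10)
```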
Preferably, training the gesture recognition model with the preprocessed hand data further includes:

a backpropagation step, which includes computing a cross-entropy loss between the calibration action and the predicted gesture, and optimizing the gesture recognition model according to the loss;

repeating the forward propagation step and the backpropagation step in a loop until the loss no longer decreases, obtaining a stable gesture recognition model.

Specifically, the loss drives the optimization of the gesture recognition model's parameters; after many iterative optimizations of the model, when the loss no longer decreases, the gesture recognition model reaches a stable state.
As shown in FIG. 8, an embodiment of the present invention provides a gesture recognition method, including:

Step 310: acquiring the sensor data collected by each sensor of a data glove while the data glove completes a current action, all the sensor data forming one input data;

Step 320: performing feature extraction on the input data by principal component analysis to obtain second feature data;

Step 330: inputting the second feature data into a trained multi-class SVM classifier to determine the gesture corresponding to the current action.

Specifically, the second feature data is input into the trained multi-class SVM classifier, which includes a plurality of classification functions; each classification function processes the feature data and outputs one second classification value, the classification functions being in one-to-one correspondence with the gesture templates.

The maximum and the second-largest of all the second classification values are determined, and each is compared with a preset threshold. When the maximum is greater than or equal to the preset threshold and the second-largest value is smaller than the preset threshold, the gesture template corresponding to the classification function that output the maximum is determined to be the gesture corresponding to the current action.

Step 340: when the trained multi-class SVM classifier cannot recognize the current action, preprocessing the input data to obtain preprocessed data.

Specifically, when both the maximum and the second-largest value are greater than or equal to the preset threshold, the input data is preprocessed to obtain the preprocessed data.

Step 350: inputting the preprocessed data into a trained gesture recognition model and outputting the gesture corresponding to the current action, wherein the gesture recognition model is built on a convolutional neural network and a long short-term memory recurrent network.

In this embodiment, the sensor data collected by each sensor while the data glove completes the current action is acquired, and all the sensor data corresponding to the current action form one input data. Principal component analysis extracts feature data from the input data, and the feature data is fed to the trained multi-class SVM classifier to recognize the gesture corresponding to the current action; the multi-class SVM classifier is fast, simple, and efficient. When the multi-class SVM classifier has difficulty recognizing the current action, the input data corresponding to the current action is preprocessed and fed into the trained gesture recognition model, which outputs the gesture corresponding to the current action; this model, built on a deep learning model, recognizes gestures accurately. The technical solution of the present invention therefore first uses the multi-class SVM classifier, with its fast recognition speed, and falls back on the gesture recognition model when the SVM classifier has difficulty, which improves recognition speed while guaranteeing recognition accuracy.
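The acceptance rule just described (maximum at or above the threshold, second-largest below it) can be sketched as a small helper; the function name and the toy score vectors are illustrative:

```python
import numpy as np

def svm_decision(scores, threshold):
    """Accept the SVM result only when exactly one classification
    function is confident; otherwise return None (unrecognized)."""
    order = np.argsort(scores)
    best, second = scores[order[-1]], scores[order[-2]]
    if best >= threshold and second < threshold:
        return int(order[-1])  # gesture template of the winning function
    return None                # ambiguous or weak: fall back to the deep model

svm_decision(np.array([0.2, 1.3, 0.1]), threshold=0.5)  # → 1
svm_decision(np.array([1.2, 1.3, 0.1]), threshold=0.5)  # → None (two confident functions)
```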
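The overall two-stage flow, SVM first with a deep-model fallback, can be sketched as below; all four callables are hypothetical stand-ins for the trained components described in this embodiment:

```python
def recognize(input_data, svm_scores_fn, preprocess_fn, deep_model_fn, threshold):
    """Two-stage recognition: fast SVM path first, CNN-LSTM fallback.

    svm_scores_fn maps input data to one classification value per template;
    preprocess_fn and deep_model_fn stand in for Steps 340 and 350.
    """
    scores = svm_scores_fn(input_data)
    order = sorted(range(len(scores)), key=scores.__getitem__)
    best, second = scores[order[-1]], scores[order[-2]]
    if best >= threshold and second < threshold:
        return order[-1]                             # SVM recognized the gesture
    return deep_model_fn(preprocess_fn(input_data))  # deep-model fallback
```

With a confident SVM the method returns immediately; only ambiguous actions pay the cost of preprocessing and a deep-model forward pass, which is the speed/accuracy trade-off claimed above.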
As shown in FIG. 9, an embodiment of the present invention provides a gesture recognition device, including:

an acquisition module, configured to acquire the sensor data collected by each sensor of a data glove while the data glove completes a current action, all the sensor data forming one input data;

a feature extraction module, configured to perform feature extraction on the input data by principal component analysis to obtain second feature data;

a first processing module, configured to input the second feature data into a trained multi-class SVM classifier to determine the gesture corresponding to the current action;

a preprocessing module, configured to preprocess the input data to obtain preprocessed data when the trained multi-class SVM classifier cannot recognize the current action;

a second processing module, configured to input the preprocessed data into a trained gesture recognition model and output the gesture corresponding to the current action, wherein the gesture recognition model is built on a convolutional neural network and a long short-term memory recurrent network.
Another embodiment of the present invention provides a gesture recognition device including a memory and a processor; the memory is configured to store a computer program, and the processor, when executing the computer program, implements the gesture recognition method described above. The device may be a computer, a server, or the like.

Yet another embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the gesture recognition method described above.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like. In this application, units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments of the present invention. In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, may each exist physically separately, or two or more units may be integrated into one unit. The integrated units may be implemented in the form of hardware or in the form of software functional units.

Although the present invention is disclosed as above, its scope of protection is not limited thereto. Those skilled in the art may make various changes and modifications without departing from the spirit and scope of the present disclosure, and such changes and modifications all fall within the scope of protection of the present invention.
Claims (13)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202011253584.5A CN112347951B (en) | 2020-11-11 | 2020-11-11 | Gesture recognition method, device, storage medium and data glove |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN112347951A CN112347951A (en) | 2021-02-09 |
| CN112347951B true CN112347951B (en) | 2023-07-11 |
Family
ID=74363342
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN112347951B (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN120406745B (en) * | 2025-06-26 | 2025-10-03 | 成都天软信息技术有限公司 | Remote control method for intelligent audio equipment and audio |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109214250A (en) * | 2017-07-05 | 2019-01-15 | 中南大学 | A kind of static gesture identification method based on multiple dimensioned convolutional neural networks |
| CN110262653A (en) * | 2018-03-12 | 2019-09-20 | 东南大学 | A kind of millimeter wave sensor gesture identification method based on convolutional neural networks |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3743901A4 (en) * | 2018-01-25 | 2021-03-31 | Facebook Technologies, Inc. | REAL-TIME PROCESSING OF HAND REPRESENTATION MODEL ESTIMATES |
| WO2019147996A1 (en) * | 2018-01-25 | 2019-08-01 | Ctrl-Labs Corporation | Calibration techniques for handstate representation modeling using neuromuscular signals |
Non-Patent Citations (1)
| Title |
|---|
| 基于骨架信息的人体动作识别与实时交互技术;张继凯;顾兰君;;内蒙古科技大学学报(03);第66-72页 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN112347951A (en) | 2021-02-09 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | | |
| SE01 | Entry into force of request for substantive examination | | |
| GR01 | Patent grant | | |


