
CN108388348B - An EMG gesture recognition method based on deep learning and attention mechanism - Google Patents

An EMG gesture recognition method based on deep learning and attention mechanism

Info

Publication number
CN108388348B
CN108388348B
Authority
CN
China
Prior art keywords
neural network
attention mechanism
deep learning
gesture
new
Prior art date
Legal status
Expired - Fee Related
Application number
CN201810224699.8A
Other languages
Chinese (zh)
Other versions
CN108388348A (en)
Inventor
耿卫东 (Geng Weidong)
胡钰 (Hu Yu)
卫文韬 (Wei Wentao)
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201810224699.8A
Publication of CN108388348A
Application granted
Publication of CN108388348B
Current legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/015Input arrangements based on nervous system activity detection, e.g. brain waves [EEG] detection, electromyograms [EMG] detection, electrodermal response detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01Indexing scheme relating to G06F3/01
    • G06F2203/011Emotion or mood input determined on the basis of sensed human body parameters such as pulse, heart rate or beat, temperature of skin, facial expressions, iris, voice pitch, brain activity patterns

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Neurosurgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Neurology (AREA)
  • Health & Medical Sciences (AREA)
  • Dermatology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an EMG (surface electromyography) gesture recognition method based on deep learning and an attention mechanism. The steps are as follows: denoise and filter the gesture EMG signal; use a sliding window to extract a classical feature set from each window of data and construct a new feature-based EMG image; design a deep learning framework based on a convolutional neural network, a recurrent neural network and an attention mechanism, and optimize its network structure parameters; train a classifier model with the designed deep learning framework and the training data; feed the test data into the trained deep learning network model and, according to the likelihoods output by the last layer, take the class with the maximum likelihood as the recognized class. The invention recognizes EMG gesture signals on the basis of the new feature image and the attention-based deep learning framework, and can accurately recognize many different gestures of the same subject.

Description

An EMG gesture recognition method based on deep learning and an attention mechanism

Technical Field

The invention belongs to the field at the intersection of computing and biological signals, and in particular relates to an EMG gesture recognition method based on deep learning and an attention mechanism.

Background Art

Surface electromyography (sEMG) is a biological signal that records muscle activity through non-invasive electrodes attached to the skin surface. Recording and analyzing sEMG signals provides useful information for assistive and rehabilitation technologies, and is of considerable academic and practical value for sports science, human-computer interaction, and clinical and basic research in rehabilitation medicine. In these applications, gesture recognition based on sEMG plays an important role. A classical sEMG gesture recognition pipeline consists of data preprocessing, feature space construction and classification. The preprocessing stage mainly rectifies and filters the signal to reduce noise; the feature space construction stage transforms the preprocessed signal into a feature space in which classes are more separable; finally, a machine learning method is used to train a model for classification.

Feature space construction and gesture classification are the two parts most important for improving recognition accuracy. Many researchers have therefore used their domain knowledge to propose new features, such as the Phinyomark feature set. On the other hand, many machine learning classifiers have been applied to sEMG gesture recognition, including artificial neural networks, k-nearest neighbors, linear discriminant analysis, support vector machines and hidden Markov models, with support vector machines and linear discriminant analysis being the two most commonly used.

In recent years, deep learning methods have achieved state-of-the-art performance in many fields. The best-known of these, the convolutional neural network, has already been successfully applied to sEMG gesture recognition and has obtained the best recognition results to date. The attention mechanism is a very effective way to enhance the modeling ability of recurrent neural networks and has achieved good results in machine translation and other fields. However, no existing method combines a recurrent neural network with a convolutional neural network for sEMG gesture recognition; the present invention furthermore adds an attention mechanism to the recurrent neural network to strengthen the model.

Summary of the Invention

The purpose of the present invention is to address the deficiencies of the prior art and provide an EMG gesture recognition method based on deep learning and an attention mechanism. By designing a model structure based on a convolutional neural network, a recurrent neural network and an attention mechanism, the accuracy of gesture recognition is improved.

The purpose of the present invention is achieved through the following technical solution: an EMG gesture recognition method based on deep learning and an attention mechanism, comprising the following steps:

(1) Acquire EMG data and preprocess it, including the following sub-steps:

(1.1) Obtain gesture EMG data from the public datasets NinaProDB1, NinaProDB2, a BioPatRec subset, a CapgMyo subset and csl-hdemg;

(1.2) Apply a different preprocessing method to each dataset for filtering and noise reduction;

(2) Divide the raw signals into a raw-signal training set and a raw-signal test set, including the following sub-steps:

(2.1) According to the acquired EMG labels, split the data in each EMG file into a number of EMG gesture segments, each containing one repetition of a movement;

(2.2) According to the chosen evaluation protocol, assign the repetitions of each gesture to the raw-signal training set and the raw-signal test set, completing the split;

(3) Data segmentation and feature extraction, including the following sub-steps:

(3.1) Use a sliding window to divide each gesture segment into multiple fixed-length signal segments;

(3.2) Extract multiple features from each channel of the fixed-length signal segment in each window;

(4) Construct the new EMG image, including the following sub-steps:

(4.1) Rearrange the feature vectors of the channels in the window so that every pair of channels can be adjacent;

(4.2) Construct the new EMG image, whose width is 1, whose height is the number of rearranged channels, and whose number of color channels is the dimension of the feature vector (an illustrative code sketch follows this list of steps);

(5) Multi-class EMG gesture recognition based on deep learning and the attention mechanism, including the following steps:

(5.1) Design the model structure combining deep learning and the attention mechanism. The model consists of a convolutional neural network, a recurrent neural network and a basic attention mechanism. The convolutional neural network extracts high-level features from the input new EMG image, the recurrent neural network models the relations between the frames of the new EMG image sequence, and the basic attention mechanism weights the importance of the outputs of the recurrent neural network. The attention weight α_t at time t is computed as:

$$M_t = \tanh(W_h h_t)$$

$$\alpha_t = \operatorname{softmax}(w^{T} M_t)$$

$$r = \sum_{t=1}^{T} \alpha_t h_t$$

where h_t is the output of the recurrent neural network, W_h and w^T are the weight matrices to be trained, T is the time length of a gesture segment, and r is the output of the basic attention mechanism; the softmax function is the normalized exponential function;

(5.2) Construct a new EMG image for each sample in the raw-signal training set to obtain a new-EMG-image training set as the input of the whole network, and optimize the network parameters of the convolutional neural network and the recurrent neural network one by one to obtain the optimal model parameters;

(5.3) Train a classification model with the optimal model parameters obtained in step (5.2) and the new-EMG-image training set;

(5.4) Construct a new EMG image for each sample in the test set to obtain a new-EMG-image test set, feed it into the classification model obtained in step (5.3), and output the classification results.
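For illustration only, the following Python sketch shows one way steps (4.1) and (4.2) could be realised; the function name, the channel ordering and the array sizes are assumptions for the example and are not taken from the patent.

```python
# Illustrative sketch of steps (4.1)-(4.2), not the patented implementation.
# `features` holds one feature vector per electrode channel for one window;
# `channel_order` is an assumed rearrangement in which channels that should
# be adjacent appear next to each other.
import numpy as np

def build_new_emg_image(features, channel_order=None):
    """features: (n_channels, feat_dim) -> image of shape (feat_dim, height, 1).

    Width is 1, height is the number of (rearranged) channels, and the
    "color" channels are the feature-vector dimensions, as in step (4.2).
    """
    if channel_order is not None:
        features = features[np.asarray(channel_order)]   # step (4.1): rearrange channels
    img = features[:, np.newaxis, :]                      # (height, width=1, feat_dim)
    return np.transpose(img, (2, 0, 1))                   # channel-first: (feat_dim, height, 1)

# Example with assumed sizes: 10 electrode channels, 30-dimensional feature vectors.
feats = np.random.randn(10, 30)
image = build_new_emg_image(feats, channel_order=range(10))
print(image.shape)   # (30, 10, 1)
```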

Further, in step (1.2), NinaProDB1 is low-pass Butterworth filtered; NinaProDB2 is low-pass Butterworth filtered and downsampled to 100 Hz; the BioPatRec subset and the CapgMyo subset are not filtered; and csl-hdemg is rectified and low-pass Butterworth filtered.
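A minimal sketch of this kind of preprocessing is given below, assuming SciPy; the cutoff frequency and filter order are assumed values, since the patent does not state them.

```python
# Sketch only: low-pass Butterworth filtering and (for NinaProDB2) naive
# downsampling to 100 Hz. Cutoff and order are assumptions, not patent values.
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass_butterworth(emg, fs, cutoff_hz=1.0, order=1):
    """Zero-phase low-pass Butterworth filter, applied per channel.

    emg: (n_samples, n_channels); fs: sampling rate in Hz.
    """
    b, a = butter(order, cutoff_hz / (fs / 2.0), btype="low")
    return filtfilt(b, a, emg, axis=0)

def downsample_to_100hz(emg, fs):
    """Keep every (fs // 100)-th sample; assumes fs is a multiple of 100."""
    return emg[:: int(fs // 100)]

# Example with assumed data: a NinaProDB2-like signal sampled at 2000 Hz.
emg = np.random.randn(4000, 12)
emg_filtered = lowpass_butterworth(emg, fs=2000)
emg_100hz = downsample_to_100hz(emg_filtered, fs=2000)
```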

Further, in step (2.1), the raw-signal training set and the raw-signal test set are split for within-subject evaluation, and each dataset is split differently: for NinaProDB1, repetitions 1, 3, 4, 6, 8, 9 and 10 of each subject are used as training data and repetitions 2, 5 and 7 as test data; for NinaProDB2, repetitions 1, 3, 4 and 6 are used as training data and repetitions 2 and 5 as test data; for the BioPatRec subset, the first repetition is used as training data and the other two repetitions as test data; for the CapgMyo subset, half of the repetitions (5 repetitions) are used as training data and the other 5 repetitions as test data; the csl-hdemg data of a single subject are divided into 10 folds and evaluated with 10-fold cross-validation.
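The repetition-based split described above can be sketched as follows; `reps` is an assumed per-sample array of repetition indices, and the default arguments reproduce the NinaProDB1 split.

```python
# Sketch of a within-subject split by repetition index (NinaProDB1 defaults).
import numpy as np

def split_by_repetition(X, y, reps,
                        train_reps=(1, 3, 4, 6, 8, 9, 10),
                        test_reps=(2, 5, 7)):
    """X: samples, y: gesture labels, reps: repetition index of each sample."""
    train_mask = np.isin(reps, train_reps)
    test_mask = np.isin(reps, test_reps)
    return (X[train_mask], y[train_mask]), (X[test_mask], y[test_mask])
```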

Further, in step (3.1), each dataset uses its own sliding window length and stride: NinaProDB1 uses window lengths of 150 ms and 200 ms with a stride of 10 ms; NinaProDB2 uses a window length of 200 ms with a stride of 100 ms; the BioPatRec subset uses window lengths of 50 ms and 150 ms with a stride of 50 ms; the CapgMyo subset uses window lengths of 40 ms and 150 ms with a stride of 1 ms; csl-hdemg uses window lengths of 150 ms and 170 ms with a stride of 0.5 ms.
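A sketch of this windowing is shown below; the sampling rate is an assumption of the example (100 Hz, as for NinaProDB1), while the window and stride values come from the list above.

```python
# Sketch: cut one gesture segment into fixed-length windows with a fixed stride.
import numpy as np

def sliding_windows(segment, fs, window_ms, stride_ms):
    """segment: (n_samples, n_channels) -> (n_windows, window_len, n_channels)."""
    win = int(round(window_ms * fs / 1000.0))
    step = max(1, int(round(stride_ms * fs / 1000.0)))
    starts = range(0, segment.shape[0] - win + 1, step)
    return np.stack([segment[s:s + win] for s in starts])

# Example: NinaProDB1 at 100 Hz, 200 ms window, 10 ms stride.
segment = np.random.randn(500, 10)
windows = sliding_windows(segment, fs=100, window_ms=200, stride_ms=10)
```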

Further, in step (3.2), feature vectors are extracted from the EMG signal in each window based on the classical Phinyomark feature set, comprising the mean absolute value (MAV), waveform length (WL), autoregressive coefficients (AR), mean absolute value slope (MAVSLP), mean frequency (MNF), power spectrum ratio (PSR, the ratio of the energy near the power spectrum maximum to the total energy) and Willison amplitude (WAMP). The CapgMyo subset and csl-hdemg are high-density EMG signals; no feature extraction is performed on them and the images are constructed directly from the raw signals.
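Three of the named features (MAV, WL and WAMP) are sketched below for illustration; the WAMP threshold is an assumed value, and AR, MAVSLP, MNF and PSR are omitted for brevity.

```python
# Sketch of a few Phinyomark-style features, computed per channel for one window.
import numpy as np

def mav(x):                      # mean absolute value
    return np.mean(np.abs(x), axis=0)

def wl(x):                       # waveform length
    return np.sum(np.abs(np.diff(x, axis=0)), axis=0)

def wamp(x, threshold=0.02):     # Willison amplitude (threshold is an assumption)
    return np.sum(np.abs(np.diff(x, axis=0)) > threshold, axis=0)

def window_feature_vectors(x):
    """x: (window_len, n_channels) -> (n_channels, n_features)."""
    return np.stack([mav(x), wl(x), wamp(x)], axis=1)
```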

Further, in step (5.1), the recurrent neural network uses long short-term memory (LSTM) units to alleviate the vanishing and exploding gradient problems.

Further, in step (5.2), the convolutional neural network of the optimal model contains 2 convolutional layers, followed by 2 locally-connected layers and then 3 fully-connected layers; the recurrent layer consists of LSTM units with an output size of 512; and the final recognition part consists of a G-way fully-connected layer and a softmax layer.

Further, in step (5.3), the training process of the classification model is as follows: the new-EMG-image training set, together with the gesture label of each of its samples, is used as the input of the model, and the model parameters obtained by training are stored.

Further, in step (5.4), the output of the classification model is a label, i.e. the label of the corresponding test sample. The recognition result is measured by the recognition accuracy, defined as the number of correctly recognized samples divided by the total number of test samples.

The beneficial effects of the present invention are as follows: the present invention proposes an EMG gesture recognition method based on a convolutional neural network and a recurrent neural network, which can simultaneously extract and model the spatial and temporal features of the EMG signal. Compared with existing methods based purely on convolutional neural networks, this method effectively improves the recognition rate. Adding the attention mechanism to the model structure based on convolutional and recurrent neural networks further enhances its performance. Extracting classical features from the EMG signal to construct new EMG images as the model input effectively improves the accuracy of EMG gesture recognition.

Brief Description of the Drawings

Fig. 1 is a flow chart of the method of the present invention;

Fig. 2 is a diagram of the network structure of the present invention.

Detailed Description

The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.

As shown in Fig. 1, the EMG gesture recognition method based on deep learning and an attention mechanism provided by the present invention is implemented in the following steps:

Step (1): Obtain gesture EMG data from the public datasets NinaProDB1, NinaProDB2, the BioPatRec subset, the CapgMyo subset and csl-hdemg. NinaProDB1 is low-pass Butterworth filtered; NinaProDB2 is low-pass Butterworth filtered and downsampled to 100 Hz; the BioPatRec and CapgMyo subsets are not filtered; csl-hdemg is rectified and low-pass Butterworth filtered.

Step (2): Split the raw signals into training and test sets. According to the acquired EMG labels, the data in each EMG file are split into EMG gesture segments, each containing one repetition of a movement. Our tests use within-subject evaluation, and each dataset is split differently: for NinaProDB1, repetitions 1, 3, 4, 6, 8, 9 and 10 of each subject are used as training data and repetitions 2, 5 and 7 as test data; for NinaProDB2, repetitions 1, 3, 4 and 6 are used as training data and repetitions 2 and 5 as test data; for the BioPatRec subset, the first repetition is used as training data and the other two repetitions as test data; for the CapgMyo subset, half of the repetitions (5 repetitions) are used as training data and the other 5 repetitions as test data; the csl-hdemg data of a single subject are divided into 10 folds and evaluated with 10-fold cross-validation.

Step (3): Segment the data and extract features. Each dataset uses its own sliding window length and stride: NinaProDB1 uses window lengths of 150 ms and 200 ms with a stride of 10 ms; NinaProDB2 uses a window length of 200 ms with a stride of 100 ms; the BioPatRec subset uses window lengths of 50 ms and 150 ms with a stride of 50 ms; the CapgMyo subset uses window lengths of 40 ms and 150 ms with a stride of 1 ms; csl-hdemg uses window lengths of 150 ms and 170 ms with a stride of 0.5 ms. Feature vectors are extracted from the EMG signal in each window based on the classical Phinyomark feature set, comprising the mean absolute value (MAV), waveform length (WL), autoregressive coefficients (AR), mean absolute value slope (MAVSLP), mean frequency (MNF), power spectrum ratio (PSR) and Willison amplitude (WAMP).

Step (5): Design the model structure combining deep learning and the attention mechanism. The model consists of a convolutional neural network, a recurrent neural network and a basic attention mechanism. The convolutional neural network extracts high-level features from the input new EMG image, the recurrent neural network models the relations between the frames of the new EMG image sequence, and the basic attention mechanism weights the importance of the outputs of the recurrent neural network; the resulting representation is used for EMG gesture recognition. The network parameters of the convolutional neural network and the recurrent neural network are optimized one by one; the optimal network structure is shown in the following table:

Layer   Name                                        Parameters
1       Convolutional layer 1                       64 kernels, kernel size 3*3
2       Convolutional layer 2                       64 kernels, kernel size 3*3
3       Locally-connected layer 1                   64 kernels
4       Locally-connected layer 2                   64 kernels
5       Fully-connected layer 1                     512-dimensional output
6       Fully-connected layer 2                     512-dimensional output
7       Fully-connected layer 3                     128-dimensional output
8       Recurrent layer                             LSTM, 512-dimensional hidden-state output
9       Fully-connected layer 4 and softmax layer
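For illustration, a minimal PyTorch sketch of this backbone is given below. It is not the patented implementation: the two locally-connected layers are approximated by 1x1 convolutions (PyTorch has no built-in locally-connected layer), the ReLU activations and input sizes are assumptions, and the attention readout is shown separately after the formulas that follow.

```python
# Sketch of the per-frame CNN + LSTM backbone from the table above (assumptions noted in the text).
import torch
import torch.nn as nn

class FrameCNN(nn.Module):
    def __init__(self, feat_dim, n_channels):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(feat_dim, 64, kernel_size=3, padding=1), nn.ReLU(),  # convolutional layer 1
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),        # convolutional layer 2
            nn.Conv2d(64, 64, kernel_size=1), nn.ReLU(),   # stand-in for locally-connected layer 1
            nn.Conv2d(64, 64, kernel_size=1), nn.ReLU(),   # stand-in for locally-connected layer 2
        )
        self.fc = nn.Sequential(
            nn.Linear(64 * n_channels, 512), nn.ReLU(),    # fully-connected layer 1
            nn.Linear(512, 512), nn.ReLU(),                # fully-connected layer 2
            nn.Linear(512, 128), nn.ReLU(),                # fully-connected layer 3
        )

    def forward(self, x):                      # x: (batch, feat_dim, n_channels, 1)
        return self.fc(self.conv(x).flatten(1))            # (batch, 128)

class EMGNet(nn.Module):
    def __init__(self, feat_dim, n_channels, n_gestures):
        super().__init__()
        self.frame_cnn = FrameCNN(feat_dim, n_channels)
        self.lstm = nn.LSTM(input_size=128, hidden_size=512, batch_first=True)  # recurrent layer
        self.head = nn.Linear(512, n_gestures)             # G-way layer; softmax applied in the loss

    def forward(self, frames):                 # frames: (batch, T, feat_dim, n_channels, 1)
        b, t = frames.shape[:2]
        f = self.frame_cnn(frames.flatten(0, 1)).view(b, t, -1)
        h, _ = self.lstm(f)                    # (batch, T, 512) hidden states
        return self.head(h[:, -1])             # last-step readout; attention variant shown below
```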

For the recurrent neural network part, we choose long short-term memory (LSTM) units to alleviate the vanishing and exploding gradient problems. The attention mechanism is added after the recurrent neural network, i.e. the output of the recurrent neural network is the input of the attention mechanism, computed as:

$$M_t = \tanh(W_h h_t)$$

$$\alpha_t = \operatorname{softmax}(w^{T} M_t)$$

$$r = \sum_{t=1}^{T} \alpha_t h_t$$

where h_t is the output of the recurrent neural network, W_h and w^T are the weight matrices to be trained, T is the time length of a gesture segment, and r is the output of the basic attention mechanism; the softmax function is the normalized exponential function. The training process is: construct a new EMG image for each sample in the raw-signal training set to obtain a new-EMG-image training set; feed this training set, together with the gesture label of each sample, into the model; train the model and store the resulting parameters. The test process is: construct a new EMG image for each sample in the test set to obtain a new-EMG-image test set; load the model trained on the new-EMG-image training set; feed in the new-EMG-image test set; the output is the gesture class label. The recognition result is measured by the recognition accuracy, defined as the number of correctly recognized samples divided by the total number of samples.
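For illustration, the attention step defined by these formulas can be written as the following PyTorch module; it replaces the last-time-step readout in the backbone sketch above, with the 512-dimensional hidden size taken from the table and everything else assumed.

```python
# Sketch of the basic attention readout: M_t = tanh(W_h h_t),
# alpha_t = softmax(w^T M_t), r = sum_t alpha_t * h_t.
import torch
import torch.nn as nn

class BasicAttention(nn.Module):
    def __init__(self, hidden_size=512):
        super().__init__()
        self.W_h = nn.Linear(hidden_size, hidden_size, bias=False)  # W_h
        self.w = nn.Linear(hidden_size, 1, bias=False)              # w^T

    def forward(self, h):                         # h: (batch, T, hidden) LSTM outputs
        M = torch.tanh(self.W_h(h))               # (batch, T, hidden)
        alpha = torch.softmax(self.w(M), dim=1)   # (batch, T, 1) weights over time
        r = (alpha * h).sum(dim=1)                # (batch, hidden) attention summary
        return r, alpha.squeeze(-1)

# Usage with the backbone sketch: r, alpha = BasicAttention(512)(h); logits = head(r).
```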

The full gesture sets of NinaProDB1, NinaProDB2, the BioPatRec subset, the CapgMyo subset and the csl-hdemg dataset are recognized. NinaProDB1 contains 52 gestures, NinaProDB2 contains 50 gestures, the BioPatRec subset contains 26 gestures, the CapgMyo subset contains 8 gestures, and csl-hdemg contains 27 gestures. The recognition rates achieved on these datasets with the proposed EMG gesture recognition method based on deep learning and the attention mechanism are listed in the accompanying figure of the patent.

Claims (2)

1. An EMG gesture recognition method based on deep learning and an attention mechanism, characterized by comprising the following steps:
(1) acquiring EMG data and preprocessing the data: obtaining gesture EMG data from the public datasets NinaProDB1, NinaProDB2, a BioPatRec subset, a CapgMyo subset and csl-hdemg, respectively; filtering and denoising the different datasets with different preprocessing methods;
(2) dividing the raw signals into a raw-signal training dataset and a raw-signal test dataset: according to the acquired EMG labels, splitting the data in each EMG file into a plurality of EMG gesture segments, each gesture segment containing one repetition of a movement; assigning the repetitions of each gesture to the raw-signal training dataset and the raw-signal test dataset according to the chosen evaluation protocol to complete the split; different datasets adopt different splits: NinaProDB1 uses repetitions 1, 3, 4, 6, 8, 9 and 10 of each subject as training data and repetitions 2, 5 and 7 as test data; NinaProDB2 uses repetitions 1, 3, 4 and 6 as training data and repetitions 2 and 5 as test data; the BioPatRec subset uses the first repetition as training data and the other two repetitions as test data; the CapgMyo subset uses half of the repetitions, i.e. 5 repetitions, as training data and the other 5 repetitions as test data; the csl-hdemg dataset divides the data of a single subject into 10 folds and performs 10-fold cross-validation;
(3) data segmentation and feature extraction, comprising the following sub-steps:
(3.1) dividing each gesture segment into a plurality of fixed-length signal segments with a sliding window, different datasets using different sliding window lengths and strides: NinaProDB1 uses sliding window lengths of 150 ms and 200 ms with a stride of 10 ms; NinaProDB2 uses a sliding window length of 200 ms with a stride of 100 ms; the BioPatRec subset uses sliding window lengths of 50 ms and 150 ms with a stride of 50 ms; the CapgMyo subset uses sliding window lengths of 40 ms and 150 ms with a stride of 1 ms; csl-hdemg uses sliding window lengths of 150 ms and 170 ms with a stride of 0.5 ms;
(3.2) extracting multiple features from each channel of the fixed-length signal segment in each window: extracting feature vectors from the EMG signal in the window based on the classical Phinyomark feature set, comprising the mean absolute value MAV, the waveform length WL, the autoregressive coefficients AR, the mean absolute value slope MAVSLP, the mean frequency MNF, the ratio of the energy near the power spectrum maximum to the total energy PSR, and the Willison amplitude WAMP; the CapgMyo subset and csl-hdemg are high-density EMG signals, for which no feature extraction is performed and the image is constructed directly from the raw signal;
(4) constructing a new EMG image, comprising the following sub-steps:
(4.1) rearranging the feature vectors of the channels in the window so that every pair of channels can be adjacent;
(4.2) constructing the new EMG image, wherein the width of the new EMG image is 1, the height is the number of rearranged channels, and the number of color channels is the dimension of the feature vector;
(5) multi-class EMG gesture recognition based on deep learning and the attention mechanism, comprising the following steps:
(5.1) designing a model structure combining deep learning and the attention mechanism, the model structure consisting of a convolutional neural network, a recurrent neural network and a basic attention mechanism; the convolutional neural network performs high-level feature extraction on the input new EMG image, the recurrent neural network models the relations between the frames of the new EMG image sequence, and the basic attention mechanism weights the importance of the outputs of the recurrent neural network; the convolutional part comprises two convolutional layers, two locally-connected layers and three fully-connected layers, wherein each convolutional layer has 64 kernels of size 3 x 3, each locally-connected layer has 64 kernels, and the three fully-connected layers have 512, 512 and 128 dimensions respectively; the recurrent neural network is a single layer whose memory unit is a long short-term memory (LSTM) unit with a hidden-state output dimension of 512; the attention weight α_t at time t is computed as:
$$M_t = \tanh(W_h h_t)$$
$$\alpha_t = \operatorname{softmax}(w^{T} M_t)$$
$$r = \sum_{t=1}^{T} \alpha_t h_t$$
wherein h_t is the output of the recurrent neural network, W_h and w^T are the weight matrices to be trained, T is the time length of a gesture segment, and r is the output of the basic attention mechanism; the softmax function is the normalized exponential function;
(5.2) constructing a new EMG image for each sample in the raw-signal training dataset to obtain a new-EMG-image training dataset as the input of the whole network, and optimizing the network parameters of the convolutional neural network and the recurrent neural network one by one to obtain optimal model parameters;
(5.3) training a classification model with the optimal model parameters obtained in step (5.2) and the new-EMG-image training dataset;
(5.4) constructing a new EMG image for each sample in the test dataset to obtain a new-EMG-image test dataset, inputting it into the classification model obtained in step (5.3), and outputting the classification result.
2. The EMG gesture recognition method based on deep learning and attention mechanism according to claim 1, wherein in step (1), low-pass filtering is applied to NinaProDB1; low-pass filtering and downsampling to 100 Hz are applied to NinaProDB2; the BioPatRec subset and the CapgMyo subset are not filtered; and csl-hdemg is rectified and low-pass filtered.
CN201810224699.8A 2018-03-19 2018-03-19 An EMG gesture recognition method based on deep learning and attention mechanism Expired - Fee Related CN108388348B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810224699.8A CN108388348B (en) 2018-03-19 2018-03-19 An EMG gesture recognition method based on deep learning and attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810224699.8A CN108388348B (en) 2018-03-19 2018-03-19 An EMG gesture recognition method based on deep learning and attention mechanism

Publications (2)

Publication Number Publication Date
CN108388348A CN108388348A (en) 2018-08-10
CN108388348B true CN108388348B (en) 2020-11-24

Family

ID=63066958

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810224699.8A Expired - Fee Related CN108388348B (en) 2018-03-19 2018-03-19 An EMG gesture recognition method based on deep learning and attention mechanism

Country Status (1)

Country Link
CN (1) CN108388348B (en)

Families Citing this family (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109124625B (en) * 2018-09-04 2021-07-20 大连理工大学 A classification method of driver fatigue state level
CN109498362A (en) * 2018-09-10 2019-03-22 南京航空航天大学 A kind of hemiplegic patient's hand movement function device for healing and training and model training method
CN109190578B (en) * 2018-09-13 2019-10-18 合肥工业大学 Sign language video translation method based on fusion of temporal convolutional network and recurrent neural network
CN109257622A (en) * 2018-11-01 2019-01-22 广州市百果园信息技术有限公司 A kind of audio/video processing method, device, equipment and medium
CN109567789B (en) * 2018-12-03 2021-08-31 东软集团股份有限公司 Electrocardiogram data segmentation processing method and device and readable storage medium
CN109886978B (en) * 2019-02-20 2020-03-13 贵州电网有限责任公司 End-to-end alarm information identification method based on deep learning
CN110033126A (en) * 2019-03-14 2019-07-19 贵州大学 Shot and long term memory network prediction technique based on attention mechanism and logistic regression
CN109924977A (en) * 2019-03-21 2019-06-25 西安交通大学 A kind of surface electromyogram signal classification method based on CNN and LSTM
CN111973388B (en) * 2019-05-22 2021-08-31 中国科学院沈阳自动化研究所 A control method of hand rehabilitation robot based on sEMG
CN110335261B (en) * 2019-06-28 2020-04-17 山东科技大学 CT lymph node detection system based on space-time circulation attention mechanism
CN110399846A (en) * 2019-07-03 2019-11-01 北京航空航天大学 A Gesture Recognition Method Based on Correlation of Multi-channel EMG Signals
CN110333783B (en) * 2019-07-10 2020-08-28 中国科学技术大学 An irrelevant gesture processing method and system for robust electromyography control
CN110262511B (en) * 2019-07-12 2022-08-09 同济人工智能研究院(苏州)有限公司 Biped robot adaptive walking control method based on deep reinforcement learning
CN110658915A (en) * 2019-07-24 2020-01-07 浙江工业大学 A method of EMG gesture recognition based on dual-stream network
CN110443309A (en) * 2019-08-07 2019-11-12 浙江大学 A kind of electromyography signal gesture identification method of combination cross-module state association relation model
CN110618754B (en) * 2019-08-30 2021-09-14 电子科技大学 Surface electromyogram signal-based gesture recognition method and gesture recognition armband
CN110537922B (en) * 2019-09-09 2020-09-04 北京航空航天大学 A method and system for lower limb motion recognition during human walking based on deep learning
CN110598628B (en) * 2019-09-11 2022-08-02 南京邮电大学 An EMG hand motion recognition method based on integrated deep learning
CN110610172B (en) * 2019-09-25 2022-08-12 南京邮电大学 An EMG gesture recognition method based on RNN-CNN architecture
CN110598676B (en) * 2019-09-25 2022-08-02 南京邮电大学 Deep Learning Gesture EMG Recognition Method Based on Confidence Score Model
CN111046731B (en) * 2019-11-11 2023-07-25 中国科学院计算技术研究所 Transfer learning method and recognition method for gesture recognition based on surface electromyographic signals
CN110929243B (en) * 2019-11-22 2022-07-22 武汉大学 Pedestrian identity recognition method based on mobile phone inertial sensor
CN111103976B (en) * 2019-12-05 2023-05-02 深圳职业技术学院 Gesture recognition method, device and electronic device
CN111053549A (en) * 2019-12-23 2020-04-24 威海北洋电气集团股份有限公司 Intelligent biological signal abnormality detection method and system
CN111144269B (en) * 2019-12-23 2023-11-24 威海北洋电气集团股份有限公司 Signal correlation behavior recognition method and system based on deep learning
CN111184512B (en) * 2019-12-30 2021-06-01 电子科技大学 Method for recognizing rehabilitation training actions of upper limbs and hands of stroke patient
CN111616706B (en) * 2020-05-20 2022-07-22 山东中科先进技术有限公司 Surface electromyogram signal classification method and system based on convolutional neural network
CN111985327A (en) * 2020-07-16 2020-11-24 浙江工业大学 Signal deep learning classification method based on sliding trainable operator
CN111920405A (en) * 2020-09-15 2020-11-13 齐鲁工业大学 Atrial fibrillation signal identification system and method
CN112861604B (en) * 2020-12-25 2022-09-06 中国科学技术大学 Myoelectric action recognition and control method irrelevant to user
CN112816122B (en) * 2020-12-31 2023-04-07 武汉地震工程研究院有限公司 Bolt tightness degree monitoring method based on deep learning and piezoelectric active sensing
CN112783327B (en) * 2021-01-29 2022-08-30 中国科学院计算技术研究所 Method and system for gesture recognition based on surface electromyogram signals
CN112906673A (en) * 2021-04-09 2021-06-04 河北工业大学 Lower limb movement intention prediction method based on attention mechanism
CN113143261B (en) * 2021-04-30 2023-05-09 中国科学院自动化研究所 Identity recognition system, method and device based on electromyographic signal
CN113312994A (en) * 2021-05-18 2021-08-27 中国科学院深圳先进技术研究院 Gesture classification recognition method and application thereof
CN113205074B (en) * 2021-05-29 2022-04-26 浙江大学 A gesture recognition method based on multimodal signals of EMG and micro-inertial measurement unit
CN113609923B (en) * 2021-07-13 2022-05-13 中国矿业大学 Attention-based continuous sign language sentence recognition method
CN113729738B (en) * 2021-09-13 2024-04-12 武汉科技大学 Construction method of multichannel myoelectricity characteristic image
CN113627401A (en) * 2021-10-12 2021-11-09 四川大学 EMG gesture recognition method with feature pyramid network fused with dual attention mechanism
CN113934302B (en) * 2021-10-21 2024-02-06 燕山大学 Myoelectric gesture recognition method based on SeNet and gating time sequence convolution network
CN113988135B (en) * 2021-10-29 2025-01-10 南京邮电大学 A gesture recognition method based on electromyographic signal based on dual-branch multi-stream network
CN114343679A (en) * 2021-12-24 2022-04-15 杭州电子科技大学 Surface electromyogram signal upper limb action recognition method and system based on transfer learning
CN114504333B (en) * 2022-01-30 2023-10-27 天津大学 Wearable vestibule monitoring system based on myoelectricity and application
CN114569143A (en) * 2022-03-03 2022-06-03 上海交通大学宁波人工智能研究院 Myoelectric gesture recognition method based on attention mechanism and multi-feature fusion
CN114931389B (en) * 2022-04-27 2024-07-05 福州大学 Myoelectric signal identification method based on residual error network and graph convolution network
CN116738295B (en) * 2023-08-10 2024-04-16 齐鲁工业大学(山东省科学院) sEMG signal classification method, system, electronic device and storage medium
CN117312985B (en) * 2023-09-27 2025-03-11 中国地质大学(武汉) Similarity gesture recognition method of surface electromyography signal based on interpretable deep learning
CN117281528A (en) * 2023-11-27 2023-12-26 山东锋士信息技术有限公司 Multi-lead pulse signal intelligent identification method and system based on deep learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105446484A (en) * 2015-11-19 2016-03-30 浙江大学 Electromyographic signal gesture recognition method based on hidden markov model
CN105654037A (en) * 2015-12-21 2016-06-08 浙江大学 Myoelectric signal gesture recognition method based on depth learning and feature images
CN106980367A (en) * 2017-02-27 2017-07-25 浙江工业大学 A kind of gesture identification method based on myoelectricity topographic map

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9830709B2 (en) * 2016-03-11 2017-11-28 Qualcomm Incorporated Video analysis with convolutional attention recurrent neural networks
US10296793B2 (en) * 2016-04-06 2019-05-21 Nec Corporation Deep 3D attention long short-term memory for video-based action recognition

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105446484A (en) * 2015-11-19 2016-03-30 浙江大学 Electromyographic signal gesture recognition method based on hidden markov model
CN105654037A (en) * 2015-12-21 2016-06-08 浙江大学 Myoelectric signal gesture recognition method based on depth learning and feature images
CN106980367A (en) * 2017-02-27 2017-07-25 浙江工业大学 A kind of gesture identification method based on myoelectricity topographic map

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Deep Convolutional and LSTM Recurrent Neural Networks for Multimodal Wearable Activity Recognition; Ordonez FJ et al.; Sensors; 2016-01-18; full text *
Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis; Baziotis C et al.; SemEval-2017; 2017-08-04; see pages 749-750 and Fig. 3 *
Sequentially Supervised Long Short-Term Memory for Gesture Recognition; Wang P et al.; Cognitive Computation; 2016-03-10; full text *
A Survey of Deep Reinforcement Learning (深度强化学习综述); Liu Quan et al. (刘全 等); Chinese Journal of Computers (计算机学报); 2017-01-19; see pages 3-7 *

Also Published As

Publication number Publication date
CN108388348A (en) 2018-08-10

Similar Documents

Publication Publication Date Title
CN108388348B (en) An EMG gesture recognition method based on deep learning and attention mechanism
CN105446484B (en) A kind of electromyography signal gesture identification method based on Hidden Markov Model
Zhao et al. Noise rejection for wearable ECGs using modified frequency slice wavelet transform and convolutional neural networks
CN108491077B (en) A multi-stream divide-and-conquer convolutional neural network based gesture recognition method for surface electromyography signals
CN109998525B (en) Arrhythmia automatic classification method based on discriminant deep belief network
CN110658915A (en) A method of EMG gesture recognition based on dual-stream network
CN111062250A (en) Multi-subject motor imagery EEG recognition method based on deep feature learning
CN113807299B (en) Sleep stage staging method and system based on parallel frequency domain electroencephalogram signals
CN108714026A (en) The fine granularity electrocardiosignal sorting technique merged based on depth convolutional neural networks and on-line decision
CN113627401A (en) EMG gesture recognition method with feature pyramid network fused with dual attention mechanism
CN113205074B (en) A gesture recognition method based on multimodal signals of EMG and micro-inertial measurement unit
CN113158964A (en) Sleep staging method based on residual learning and multi-granularity feature fusion
CN111202517A (en) Sleep automatic staging method, system, medium and electronic equipment
CN114159079B (en) Multi-type muscle fatigue detection method based on feature extraction and GRU deep learning model
CN115804602A (en) EEG emotion signal detection method, device and medium based on multi-channel feature fusion of attention mechanism
CN108567418A (en) A kind of pulse signal inferior health detection method and detecting system based on PCANet
CN110288028B (en) ECG detection method, system, device and computer-readable storage medium
Deepthi et al. An intelligent Alzheimer’s disease prediction using convolutional neural network (CNN)
CN110575141A (en) A method for epilepsy detection based on generative adversarial networks
Shen et al. A high-precision feature extraction network of fatigue speech from air traffic controller radiotelephony based on improved deep learning
Lan et al. Arrhythmias classification using short-time Fourier transform and GAN based data augmentation
CN113768514A (en) Arrhythmia classification method based on convolutional neural network and gated cyclic unit
CN116570284A (en) Depression recognition method and system based on voice characterization
CN101840506B (en) The Method of Extracting and Recognizing Characteristic Signals of Distance Education Students
CN113842151A (en) A cross-subject EEG cognitive state detection method based on efficient multi-source capsule network

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20201124