CN113902066A

CN113902066A - A sentence-level sign language recognition method, system, device and terminal

Info

Publication number: CN113902066A
Application number: CN202110990063.6A
Authority: CN
Inventors: 孟宪佳; 杨永; 冯琳; 单士玺
Original assignee: NORTHWEST UNIVERSITY
Current assignee: NORTHWEST UNIVERSITY
Priority date: 2021-08-26
Filing date: 2021-08-26
Publication date: 2022-01-07

Abstract

The invention belongs to the technical field of action recognition and discloses a sentence-level sign language recognition method, system, device and terminal. The method comprises the following steps: collecting raw sign language data signals; preprocessing the sign language data; segmenting the data that corresponds to valid sign language activity; extracting effective features; and training a random forest classifier on the data to perform sign language recognition. The method adopts an effective feature extraction scheme in which a new feature is added to the original ten features; the new feature reflects how gentle the data fluctuation is, so the method can effectively extract the features of sentence-level sign language signals. The invention adopts a random forest classifier with 50 decision trees; this classifier, selected through repeated experiments, also improves the accuracy of sign language recognition. The method is robust, so it is well suited to practical application scenarios and has a degree of universality.

Description

A sentence-level sign language recognition method, system, device and terminal

Technical Field

The invention belongs to the technical field of action recognition, and in particular relates to a sentence-level sign language recognition method, system, device and terminal.

Background

At present, although spoken language is the mainstream in today's world, sign language undeniably remains important to us. Especially in the era of artificial intelligence, it has a wide range of applications, such as smart homes and showrooms. Sign language not only helps deaf people integrate into intelligent life more quickly, but also makes it easier for us to communicate with them. For example, in a history museum, when deaf people need help, staff who do not understand sign language cannot communicate with them. If the museum had a robot (or other interactive device) able to recognize and respond to the sign language of deaf people, it could provide them with help and services.

In recent years, research on sign language recognition has fallen into three main categories: computer vision based, wearable sensor based, and wireless signal based. Most studies focus only on isolated word recognition, which is difficult to use in everyday communication; few can translate sentence-level sign language.

Existing computer vision-based sign language recognition systems use infrared light as the sensing mechanism to translate sign language at the sentence level. Although their accuracy is high, privacy is a major concern, especially in private settings. Wearable sensor-based systems are intrusive and require users to wear expensive equipment, which limits the user's freedom to some extent. Systems based on wireless signals (such as Wi-Fi) neither violate privacy nor require any device to be worn, but wireless networks are inconvenient to deploy and prone to interference, so they are not a good choice. By contrast, sign language recognition systems based on RFID technology are low in cost, and their tags are easy to deploy. RFID stands for Radio Frequency Identification.

Existing techniques have difficulty dividing continuous sign language and translating it into words. In addition, a word signed in isolation differs from the same word separated out of a sentence; for example, the isolated word "you" is completely different from the word "you" in two sentences in which "you" is the subject and the object, respectively. Automatic segmentation of sign language is also a hard problem: previous studies usually extracted the signs manually, which is clearly a huge workload.

To sum up, existing sign language recognition systems generally have the following shortcomings: 1) they cannot achieve sentence-level sign language recognition; 2) the equipment is not cheap. Therefore, a new sentence-level sign language recognition method and system is urgently needed to make up for the shortcomings of the prior art.

Through the above analysis, the problems and defects of the prior art are as follows:

(1) Most existing research on sign language recognition focuses only on isolated word recognition, which is difficult to use in everyday communication; few systems can translate sentence-level sign language.

(2) Existing computer vision-based sign language recognition systems offer insufficient privacy protection in private settings; meanwhile, wearable sensor-based systems are intrusive and require users to wear expensive equipment, which limits the user's freedom to some extent.

(3) Existing wireless networks based on wireless signals are inconvenient to deploy and prone to interference, so they are not a good choice; existing techniques have difficulty dividing continuous sign language and translating it into words, and automatic segmentation of sign language is also a hard problem. The workload is huge and the equipment is not cheap.

The difficulty of solving the above problems and defects is as follows:

(1) Because the signal is continuous and disturbed by noise, extracting the valid sign language signal is a significant challenge. It requires well-designed denoising and signal processing methods to segment the valid sign language data signal.

(2) Radio frequency identification technology protects privacy effectively and frees users from wearable devices, but determining the effective reception range of the sign language signal is a problem. It requires a large number of experiments over different ranges and in different scenarios to determine a reasonable reception range.

(3) Finally, the key lies in how to make the system distinguish different sign language signals. This requires extracting effective signal features: after extensive experiments and analysis, the key, effective signal features are determined so that sign language signals can be recognized.

The significance of solving the above problems and defects is as follows:

(1) The main contribution is the recognition of sentence-level sign language signals. Unlike earlier word-level recognition, this significantly improves the efficiency of sign language communication and makes it faster and more convenient.

(2) Sign language segmentation based on radio frequency identification technology effectively protects user privacy and requires no wearable device, which is a great convenience for users.

(3) A reasonable deployment range increases recognition accuracy. In addition, the radio frequency identification system and passive tags significantly reduce application cost, giving the approach high commercial viability.

Summary of the Invention

In view of the problems existing in the prior art, the present invention provides a sentence-level sign language recognition method, system, device and terminal, and in particular relates to a sentence-level sign language recognition method, system, device and terminal based on radio frequency identification (RFID).

The present invention is implemented as a sentence-level sign language recognition method comprising the following steps:

Step 1: a commercial RFID device based on COTS (commercial off-the-shelf) RFID technology acquires the phase sequence of a specific target over the radio link between the identification system and the passive tags, obtaining the raw sign language data signal. This step obtains the raw sign language data signal and lays the groundwork for further processing;

Step 2: phase offset correction is performed, and a threshold-based wavelet denoising method is used to remove the phase offset caused by the environment, the hardware and Gaussian noise, obtaining a clean phase sequence. This step removes the noise from the raw sign language data signal and improves the accuracy of sign language recognition;

Step 3: a signal processing method based on the standard deviation is used to segment the valid sign language data signal. This step isolates the valid sign language data signal and reduces the interference caused by the useless portions of the signal;

Step 4: each feature is analyzed within the same sign language signal and across different sign language signals, and the features that remain stable within the same signal but differ clearly between different signals are selected for effective feature extraction. This step extracts the features that are effective for sign language recognition, which greatly improves recognition accuracy;

Step 5: a random forest classifier with 50 decision trees is used to perform sign language recognition. This step finally achieves sentence-level sign language recognition.

Further, in step 2, performing phase offset correction and using the threshold-based wavelet denoising method to remove the phase offset caused by the environment, the hardware and Gaussian noise, so as to obtain a clean phase sequence, comprises:

(1) Use the threshold-based wavelet denoising method to remove the offset caused by the environment, the hardware and Gaussian noise, according to the following formula:

θ_true = θ − θ_n (θ_n = π or 2π);

where n is the sequence number, θ_true is the true phase value after the phase offset is removed, θ is the original phase value, and θ_n is the phase offset value.

(2) The phase sequence obtained from the API of the RFID reader, after noise has been removed from the sign language data signal, is:

θ = {θ₁, θ₂, ..., θ_i, ..., θ_n};

where n is the length of the phase sequence, θ is the denoised phase sequence, and θ_i is a phase value in the sequence.

(3) Normalize the data to reduce the slight differences between instances of the same gesture performed at different times. The calculation formula is as follows:

where S(i) is the intermediate result of the standardization, θ(i) is a phase value in the phase sequence, min(θ) is the minimum of the phase sequence, max(θ) is the maximum of the phase sequence, and a(i) is the i-th phase sample after normalization.
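
The offset correction in (1) and the normalization in (3) can be sketched in Python as follows. This is only an illustrative reading, not the patented implementation: the text does not say how the π or 2π offset is detected, and the normalization formula itself is not reproduced, so the jump-detection tolerance `tol` and the min-max form of the normalization are assumptions, as are the function names. The wavelet denoising stage is omitted.

```python
import math


def remove_phase_jumps(phases, tol=0.5):
    """Apply theta_true = theta - theta_n, with theta_n equal to pi or 2*pi.

    Assumption: an offset shows up as a jump of roughly pi or 2*pi between
    consecutive raw readings; `tol` (radians) is a hypothetical tolerance.
    """
    if not phases:
        return []
    corrected = [phases[0]]
    offset = 0.0  # accumulated offset to subtract from later readings
    for prev, cur in zip(phases, phases[1:]):
        jump = cur - prev
        for step in (math.pi, 2 * math.pi):
            if abs(abs(jump) - step) < tol:
                offset += math.copysign(step, jump)
                break
        corrected.append(cur - offset)
    return corrected


def min_max_normalize(phases):
    """Scale a phase sequence to [0, 1] using min(theta) and max(theta).

    Assumption: plain min-max scaling; the source only names the quantities
    S(i), min(theta), max(theta) and a(i).
    """
    lo, hi = min(phases), max(phases)
    if hi == lo:
        return [0.0 for _ in phases]
    return [(p - lo) / (hi - lo) for p in phases]
```

For example, `min_max_normalize(remove_phase_jumps(raw))` would give a sequence a(i) that later steps can consume, where `raw` is a list of phase readings in radians.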

Further, in step 3, using the signal processing method based on the standard deviation to segment the valid sign language data signal comprises:

(1) Subtract the median from each phase value in the sample and take the absolute value:

m(i) = |a(i) − median(a)|;

where median(a) is the median of the phase sequence, and m(i) is the absolute value of the difference between each value a(i) of the phase sequence and median(a).

(2) Divide the data into consecutive groups and compute the standard deviation of each group:

where d(i) is the new sequence of standard deviations, k is the length of each group (k = 50), and μ is the mean of the group.

(3) Using the values of d(i), a data group whose value exceeds the threshold r (r = 0.1) is treated as valid data.
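
The three segmentation sub-steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the groups are taken to be consecutive and non-overlapping, the population standard deviation is used because the formula for d(i) is not reproduced in the text, and a group counts as valid when its d(i) exceeds r. The function name is hypothetical.

```python
import statistics


def segment_active_groups(a, k=50, r=0.1):
    """Return (m, d, active) for a normalized phase sequence `a`.

    m      : |a(i) - median(a)| for every sample
    d      : standard deviation of each consecutive group of k samples
    active : indices of the groups whose standard deviation exceeds r
    """
    med = statistics.median(a)
    m = [abs(x - med) for x in a]
    # Assumption: consecutive, non-overlapping groups of length k.
    groups = [m[i:i + k] for i in range(0, len(m), k)]
    # Assumption: population standard deviation (divide by k).
    d = [statistics.pstdev(g) for g in groups]
    active = [i for i, sd in enumerate(d) if sd > r]
    return m, d, active
```

The samples of group `g` are `a[g * k:(g + 1) * k]`, so adjacent active groups can be merged into one sign language segment.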

Further, in step 4, analyzing each feature within the same sign language signal and across different sign language signals, and selecting the features that remain stable within the same signal but differ clearly between different signals for effective feature extraction, comprises:

(1) Extract ten feature values from the valid data group obtained, including skewness, expected value, third-order central moment, mean, variance, standard deviation, kurtosis, energy, dominant-frequency ratio and maximum frequency peak.

(2) Extract the SOS value of the valid data obtained:

SOS: let N be the length of the data array. Starting from the first value, each value together with the following (k−1) values forms a sub-array of length k. Compute the standard deviation of each sub-array to obtain (N−k+1) standard deviations, and sum these (N−k+1) standard deviations to obtain the SOS.

The SOS measures how gentle the data fluctuation is: the value is large when the data fluctuates strongly and small when the fluctuation is slight. The formula is as follows:

where N is the total number of data points in the data set, μ is the mean of the k data points, and k is the number of data points used for each standard deviation.
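
The SOS computation described above, together with a few of the statistical features, can be sketched in Python as follows. It is an illustration under assumptions rather than the patent's exact formulas: the population standard deviation (dividing by k) is used because the SOS formula is not reproduced in the text, skewness and kurtosis use the common moment-ratio definitions, energy is taken as the sum of squares, and the frequency-domain features are omitted. All function names are hypothetical.

```python
import math
import statistics


def sum_of_standard_deviations(data, k):
    """SOS: sum of the standard deviations of all N-k+1 sliding windows of length k."""
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(data)")
    # Assumption: population standard deviation within each window.
    return sum(statistics.pstdev(data[i:i + k]) for i in range(n - k + 1))


def basic_features(data):
    """A subset of the time-domain features, using assumed textbook definitions."""
    n = len(data)
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / n
    std = math.sqrt(variance)
    m3 = sum((x - mean) ** 3 for x in data) / n  # third-order central moment
    m4 = sum((x - mean) ** 4 for x in data) / n
    return {
        "mean": mean,
        "variance": variance,
        "std": std,
        "third_central_moment": m3,
        "skewness": m3 / std ** 3 if std > 0 else 0.0,
        "kurtosis": m4 / std ** 4 if std > 0 else 0.0,
        "energy": sum(x * x for x in data),
    }
```

A strongly oscillating segment yields a larger SOS than a slowly drifting one of the same length, which is the property the feature is meant to capture.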

Further, in step 5, using the random forest classifier with 50 decision trees to perform sign language recognition comprises:

(1) Shuffle the data and take 1/10 of the data as the test set, with the remaining data as the training set; feed the test set and the training set into the RF classifier for recognition.

(2) Take another 1/10 of the data as the test set, and repeat the procedure with the RF classifier until every group of data has been used for testing, finally obtaining the test result.
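
The evaluation procedure in (1) and (2) amounts to ten-fold cross-validation, which can be sketched in Python as follows. The sketch is deliberately classifier-agnostic and uses only the standard library: `train_fn` and `predict_fn` are hypothetical callbacks, and the majority-class model shown is a stand-in for demonstration, not the random forest of the invention. In practice a random forest with 50 trees could be plugged in, for example scikit-learn's `RandomForestClassifier(n_estimators=50)`.

```python
import random
from collections import Counter


def ten_fold_accuracy(samples, labels, train_fn, predict_fn, folds=10, seed=0):
    """Shuffle the data, hold out each tenth in turn, and return overall accuracy."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    correct = 0
    for f in range(folds):
        test_idx = indices[f::folds]  # every sample lands in exactly one fold
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct += sum(predict_fn(model, samples[i]) == labels[i]
                       for i in test_idx)
    return correct / len(samples)


# Stand-in classifier for demonstration only: always predicts the majority class.
def train_majority(xs, ys):
    return Counter(ys).most_common(1)[0][0]


def predict_majority(model, x):
    return model
```

Calling `ten_fold_accuracy(features, labels, train_majority, predict_majority)` gives a baseline accuracy against which a real classifier can be compared.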

Another object of the present invention is to provide a sentence-level sign language recognition system that applies the sentence-level sign language recognition method, the system comprising:

a sign language data signal acquisition module, configured to use a commercial RFID device based on COTS (commercial off-the-shelf) RFID technology to acquire the phase sequence of a specific target over the radio link between the identification system and the passive tags, obtaining the raw sign language data signal;

a signal denoising module, configured to perform phase offset correction and to use a threshold-based wavelet denoising method to remove the phase offset caused by the environment, the hardware and Gaussian noise, obtaining a clean phase sequence;

a sign language signal segmentation module, configured to use a signal processing method based on the standard deviation to segment the valid sign language data signal;

a feature extraction module, configured to analyze each feature within the same sign language signal and across different sign language signals, and to select the features that remain stable within the same signal but differ clearly between different signals for effective feature extraction;

a sign language recognition module, configured to perform sign language recognition using a random forest classifier with 50 decision trees.

Another object of the present invention is to provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the following steps:

using a commercial RFID device based on COTS (commercial off-the-shelf) RFID technology to acquire the phase sequence of a specific target over the radio link between the identification system and the passive tags, obtaining the raw sign language data signal; performing phase offset correction and using a threshold-based wavelet denoising method to remove the phase offset caused by the environment, the hardware and Gaussian noise, obtaining a clean phase sequence; using a signal processing method based on the standard deviation to segment the valid sign language data signal;

analyzing each feature within the same sign language signal and across different sign language signals, and selecting the features that remain stable within the same signal but differ clearly between different signals for effective feature extraction; and performing sign language recognition using a random forest classifier with 50 decision trees.

Another object of the present invention is to provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the sentence-level sign language recognition method.

Another object of the present invention is to provide an information data processing terminal for implementing the sentence-level sign language recognition system.

Another object of the present invention is to provide an application of the sentence-level sign language recognition system in the field of artificial intelligence technology.

Taking all the above technical solutions together, the advantages and positive effects of the present invention are as follows. The sentence-level sign language recognition method provided by the present invention obtains relatively clean phase features by collecting the signal phase sequence received by commercial RFID equipment, and provides a method of sign language segmentation that can be used for sign language recognition based on the waveform of the signing motion and for the exchange of sign language information between humans and computers. Effective feature extraction and classifier selection are the keys to sign language recognition, so after many comparisons the present invention settled on 11 effective features. By evaluating the system in a real language environment, the present invention fills the gap in low-cost sentence-level sign language recognition.

The RFID-based sign language recognition method provided by the present invention can achieve sentence-level sign language recognition while processing large amounts of data and complete sign language recognition tasks. It fills the gap in sentence-level recognition among RFID-based sign language recognition systems, adapts well to dynamic environments, and has a low economic cost. The method needs only a small number of passive RFID tags and an RFID reader to achieve sign language recognition, so the cost is low and expensive equipment is not required.

The sign language recognition method of the present invention adopts an effective feature extraction scheme in which a new feature, SOS (Sum of Standard Deviations), is added to the original ten features. SOS reflects how gentle the data fluctuation is, so the present invention can effectively extract the features of sentence-level sign language signals. The present invention adopts a random forest classifier with 50 decision trees; this classifier, selected through repeated experiments, also improves the accuracy of sign language recognition.

The present invention uses the signal phase sequence produced by the RFID system to perform sentence-level sign language recognition, going beyond current RFID-based word-level sign language recognition, and is therefore of good practical value. The method is highly robust, so it is well suited to practical application scenarios and has a degree of universality.

Description of the Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flowchart of the sentence-level sign language recognition method provided by an embodiment of the present invention.

FIG. 2 is a schematic diagram of the principle of the sentence-level sign language recognition method provided by an embodiment of the present invention.

FIG. 3 is a structural block diagram of the sentence-level sign language recognition system provided by an embodiment of the present invention.

In the figure: 1. sign language data signal acquisition module; 2. signal denoising module; 3. sign language signal segmentation module; 4. feature extraction module; 5. sign language recognition module.

FIG. 4 is a schematic diagram of a sign language signal after phase calibration, provided by an embodiment of the present invention.

FIG. 5 is a schematic diagram of a sign language signal after wavelet transform has removed the noise caused by the environment, provided by an embodiment of the present invention.

FIG. 6 is a sample diagram of the phase sequence after normalization and subtraction of the phase sequence median median(a), provided by an embodiment of the present invention.

FIG. 7 is a schematic diagram of the standard deviation computed for a sample phase sequence and of the final sign language signal segmentation result after thresholding, provided by an embodiment of the present invention.

FIG. 8 is a diagram of the experimental equipment and scene layout provided by an embodiment of the present invention.

FIG. 9 is a confusion matrix of sign language recognition provided by an embodiment of the present invention.

FIG. 10 is a diagram of the relationship between the number of decision trees in the random forest and the sign language recognition accuracy, provided by an embodiment of the present invention.

FIG. 11 is a diagram of the interference of dynamic multipath with the sign language recognition system, provided by an embodiment of the present invention.

FIG. 12 is a diagram of the relationship between the number of signs recognized and the recognition accuracy, provided by an embodiment of the present invention.

Detailed Description

In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention, not to limit it.

In view of the problems existing in the prior art, the present invention provides a sentence-level sign language recognition method, system, device and terminal. The present invention is described in detail below with reference to the accompanying drawings.

As shown in FIG. 1, the sentence-level sign language recognition method provided by the embodiment of the present invention includes the following steps:

S101: a commercial RFID device based on COTS (commercial off-the-shelf) RFID technology acquires the phase sequence of a specific target over the radio link between the identification system and the passive tags, obtaining the raw sign language data signal;

S102: phase offset correction is performed, and a threshold-based wavelet denoising method is used to remove the phase offset caused by the environment, the hardware and Gaussian noise, obtaining a clean phase sequence;

S103: a signal processing method based on the standard deviation is used to segment the valid sign language data signal;

S104: each feature is analyzed within the same sign language signal and across different sign language signals, and the features that remain stable within the same signal but differ clearly between different signals are selected for effective feature extraction;

S105: a random forest classifier with 50 decision trees is used to perform sign language recognition.

The principle of the sentence-level sign language recognition method provided by the embodiment of the present invention is shown in FIG. 2.

As shown in FIG. 3, the sentence-level sign language recognition system provided by the embodiment of the present invention includes:

a sign language data signal acquisition module 1, configured to use a commercial RFID device based on COTS (commercial off-the-shelf) RFID technology to acquire the phase sequence of a specific target over the radio link between the identification system and the passive tags, obtaining the raw sign language data signal;

a signal denoising module 2, configured to perform phase offset correction and to use a threshold-based wavelet denoising method to remove the phase offset caused by the environment, the hardware and Gaussian noise, obtaining a clean phase sequence;

a sign language signal segmentation module 3, configured to use a signal processing method based on the standard deviation to segment the valid sign language data signal;

a feature extraction module 4, configured to analyze each feature within the same sign language signal and across different sign language signals, and to select the features that remain stable within the same signal but differ clearly between different signals for effective feature extraction;

a sign language recognition module 5, configured to perform sign language recognition using a random forest classifier with 50 decision trees.

The technical solution of the present invention is further described below with reference to specific embodiments.

Example 1

In view of the defects and deficiencies of the prior art described above, the object of the present invention is to provide an RFID-based sign language recognition method that achieves sentence-level sign language recognition while handling large amounts of data and complete sign language recognition tasks. The method fills the gap in RFID-based sentence-level recognition among sign language recognition systems, adapts well to dynamic environments, and has a low economic cost.

To achieve the above object, the present invention adopts the following technical solution:

The RFID-based sign language recognition method provided by the embodiment of the present invention specifically includes the following steps:

Step one: a commercial radio frequency identification device based on COST radio frequency identification technology acquires the phase sequence of a specific target over the radio link between the identification system and a passive tag, obtaining the original sign language data signal;

Step two: perform phase offset correction and apply a threshold-based wavelet denoising method to remove the phase offset caused by the hardware and the Gaussian noise caused by the environment, obtaining a clean phase sequence;

Step three: segment the valid sign language data signal with a signal processing method based on standard deviation;

Step four: analyze each feature within the same sign language signal and across different sign language signals, select the features that remain stable within the same sign language signal yet clearly distinguish different sign language signals, and perform effective feature extraction;

Step five: recognize sign language with a random forest classifier that uses 50 decision trees.

Specifically, step two is carried out as follows:

Step 2.1: Remove the offset caused by the hardware and the Gaussian noise caused by the environment, using phase offset correction together with the threshold-based wavelet denoising method. The offset correction formula is:

θ_true = θ - θ_i (θ_i = π or 2π)

where i is the sequence index and θ is the phase.
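The offset correction of step 2.1 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the text gives the formula but not the rule for choosing θ_i, so the sketch takes the median of the raw sequence as the reference level and also allows the negative offsets -π and -2π. The function name `correct_phase_offset` is hypothetical.

```python
import math
from statistics import median


def correct_phase_offset(phases):
    """Apply theta_true = theta - theta_i, with theta_i chosen per sample."""
    # Assumption: the median of the raw sequence is the reference level.
    ref = median(phases)
    # Assumption: offsets of either sign are allowed, plus "no offset".
    candidates = (0.0, math.pi, -math.pi, 2 * math.pi, -2 * math.pi)
    corrected = []
    for theta in phases:
        # Pick the offset that brings this sample closest to the reference.
        theta_i = min(candidates, key=lambda c: abs(theta - c - ref))
        corrected.append(theta - theta_i)
    return corrected


# Toy sequence: two samples carry a +pi and a -2*pi jump.
raw = [1.00, 1.02, 1.01 + math.pi, 0.99, 1.03 - 2 * math.pi, 1.00]
clean = correct_phase_offset(raw)
```

A median reference is used here because a few jumped samples barely move it; a mean-based reference would be an equally simple variant.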

Step 2.2: Normalize the data to reduce the slight differences between repetitions of the same gesture at different times. The calculation formula is as follows:

where S(i) is the intermediate result of the standardization and a(i) is the i-th phase sequence sample after normalization.
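The normalization formula itself is not reproduced in the text. Because the claims define min(θ) and max(θ) alongside S(i) and a(i), min-max scaling is one plausible reading, and the sketch below assumes it. How a(i) is derived from S(i) is not stated, so the sketch simply returns the min-max result. The function name `normalize_phase` is hypothetical.

```python
def normalize_phase(theta):
    """Min-max scale a phase sequence to [0, 1] (assumed normalization)."""
    lo, hi = min(theta), max(theta)
    if hi == lo:
        # A constant sequence carries no gesture information; map it to zeros.
        return [0.0 for _ in theta]
    return [(t - lo) / (hi - lo) for t in theta]


a = normalize_phase([0.5, 1.0, 2.5, 1.5])
```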

Specifically, step three is carried out as follows:

Step 3.1: Subtract the median from each phase value in the sample and take the absolute value:

m(i) = |a(i) - median(a)|

where median(a) is the median of the sequence a;

Step 3.2: Divide the data into consecutive groups and compute the standard deviation of each group:

where d(i) is the resulting sequence of standard deviations, k is the length of each group (k = 50), and μ is the mean of the group;

Step 3.3: Using the values of d(i), treat every data group whose value exceeds the threshold r (r = 0.1) as valid data.
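Steps 3.1 to 3.3 can be sketched in Python as follows. This is an illustration under assumptions: the groups are taken as consecutive, non-overlapping blocks of k samples, a trailing block shorter than k is dropped, and the population standard deviation is used, since the text does not settle these points. The function name `segment_valid_groups` is hypothetical.

```python
from statistics import median, pstdev


def segment_valid_groups(a, k=50, r=0.1):
    """Return (d, valid): per-group standard deviations and valid group indices."""
    med = median(a)
    m = [abs(x - med) for x in a]  # step 3.1: absolute deviation from the median
    # Step 3.2: consecutive blocks of k samples (assumed non-overlapping).
    groups = [m[i:i + k] for i in range(0, len(m) - k + 1, k)]
    d = [pstdev(g) for g in groups]
    # Step 3.3: a group is valid when its standard deviation exceeds r.
    valid = [i for i, s in enumerate(d) if s > r]
    return d, valid


# Toy signal: idle, then one active block of 50 samples, then idle again.
signal = [0.5] * 100 + [0.5, 1.5] * 25 + [0.5] * 100
d, valid = segment_valid_groups(signal)
```

In this toy signal only the third block (index 2) fluctuates, so it is the only block marked as valid.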

Specifically, step four is carried out as follows:

4.1: Extract ten feature values from the valid data group: skewness, expected value, third-order central moment, mean, variance, standard deviation, kurtosis, energy, dominant-frequency ratio, and maximum frequency peak;

4.2: Extract the SOS value of the valid data.

SOS: Let N be the length of the array. Starting from the first element, each number together with the following (k-1) numbers forms a sub-array of length k. The standard deviation of each sub-array is computed, which gives (N-k+1) standard deviations; the sum of these (N-k+1) standard deviations is the SOS.

Function: SOS measures how smoothly or how strongly the data fluctuate. The value is large when the data fluctuate strongly and small when they fluctuate little.

where N is the total number of data points in the data set, μ is the mean of the k data points, and k is the number of data points per standard deviation; k is set to 5 in this experiment.
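The SOS description above translates directly into a sliding-window computation. The sketch below is a minimal illustration; it assumes the population standard deviation for each window, since the text does not say whether the divisor is k or k-1, and the function name `sum_of_standard_deviations` is hypothetical.

```python
from statistics import pstdev


def sum_of_standard_deviations(x, k=5):
    """Sum the standard deviations of all N-k+1 windows of length k."""
    n = len(x)
    if n < k:
        raise ValueError("need at least k samples")
    # Window i holds x[i] and the following k-1 values.
    return sum(pstdev(x[i:i + k]) for i in range(n - k + 1))


sos_flat = sum_of_standard_deviations([1.0] * 20)       # no fluctuation
sos_wavy = sum_of_standard_deviations([0.0, 1.0] * 10)  # strong fluctuation
```

A flat sequence yields an SOS of zero, while the alternating sequence yields a clearly larger value, which matches the behavior the text attributes to the feature.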

Specifically, step five is carried out as follows:

Step 5.1: Shuffle the data, take 1/10 of the data as the test set and the remaining data as the training set, and feed the test set and the training set into the RF classifier for recognition;

Step 5.2: Take another 1/10 of the data as the test set and repeat the procedure until every group of data has been tested, then obtain the final test result.
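Steps 5.1 and 5.2 describe what is commonly called 10-fold cross-validation. The sketch below shows only the index bookkeeping in plain Python; training the 50-tree random forest on each split is deliberately left out, and the function name `ten_fold_splits` and the fixed seed are assumptions made for the example.

```python
import random


def ten_fold_splits(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) so every sample is tested once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # step 5.1: shuffle the data
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test = folds[i]  # one tenth of the data is held out for testing
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


splits = list(ten_fold_splits(100))
```

Each of the ten iterations would train a fresh classifier on `train` and score it on `test`; averaging the ten scores gives the final test result.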

Example 2

An embodiment of the present invention provides a sign language recognition method that can be applied on a variety of system platforms and can be executed by a computer terminal or by the processor of a mobile device. The flowchart of the method is shown in FIG. 1, and the method specifically includes:

The RFID-based sign language recognition method of the present invention specifically includes the following steps:

Step one: collect the original sign language data signal

A commercial radio frequency identification device based on COST radio frequency identification technology acquires the phase sequence of a specific target over the radio link between the identification system and a passive tag.

The RFID system consists of RFID tags, an RFID reader and an RFID directional antenna. Commercial RFID devices provide the user with an application program interface (API), through which the user can obtain features such as the received signal strength indication and the phase of the passive tag.

Among these features, the phase sequence of the signal is highly stable and is rarely disturbed by the environment, so the present invention selects the phase sequence as the feature for sign language recognition.

Step two: denoise the sign language signal

Perform phase offset correction and apply a threshold-based wavelet denoising method to remove the phase offset caused by the hardware and the Gaussian noise caused by the environment, obtaining a clean phase sequence.

The phase sequence obtained from the API of the RFID reader contains a phase offset caused by the hardware and Gaussian noise caused by the environment.

Step 2.1: Two kinds of phase offset occur in the received signal; typically the phase sequence is shifted by π or 2π, as follows:

θ_true = θ - θ_i (θ_i = π or 2π)

where i is the sequence index and θ is the phase. To calibrate the phase offset, the mean of the obtained phase sequence is computed and the sequence is restored to the true phase sequence by adding or subtracting π or 2π; the resulting phase sequence is shown in FIG. 4.

To remove the Gaussian noise produced by the environment, a threshold-based discrete wavelet transform is then applied to the phase sequence obtained in the previous step, which yields the phase sequence shown in FIG. 5.
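Threshold-based wavelet denoising can be illustrated with the simplest possible case. The text names neither the wavelet, the decomposition depth nor the threshold rule, so the sketch below assumes a single-level Haar transform with soft thresholding of the detail coefficients; the function name `haar_soft_denoise` and the threshold values are hypothetical.

```python
import math


def haar_soft_denoise(x, threshold):
    """One-level Haar DWT, soft-threshold the details, then reconstruct."""
    n = len(x)
    # Assumption: an odd-length input is padded by repeating its last sample.
    padded = list(x) + ([x[-1]] if n % 2 else [])
    s = math.sqrt(2.0)
    out = []
    for i in range(0, len(padded), 2):
        approx = (padded[i] + padded[i + 1]) / s
        detail = (padded[i] - padded[i + 1]) / s
        # Soft thresholding: shrink the detail coefficient toward zero.
        shrunk = math.copysign(max(abs(detail) - threshold, 0.0), detail)
        out.append((approx + shrunk) / s)
        out.append((approx - shrunk) / s)
    return out[:n]  # drop the padding sample again


noisy = [1.0, 1.2, 0.9, 1.1, 1.0]
unchanged = haar_soft_denoise(noisy, threshold=0.0)
smoothed = haar_soft_denoise(noisy, threshold=10.0)
```

With a zero threshold the transform reconstructs the input; with a threshold larger than every detail coefficient each pair of samples collapses to its mean. A practical setting would lie between these extremes and would typically use several decomposition levels.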

Step 2.2: After step 2.1, the denoised phase sequence of the sign language data signal obtained from the API of the RFID reader is:

θ = {θ₁, θ₂, ..., θ_i, ..., θ_n}, where n is the number of phase values in the sequence and θ_i is the i-th phase value.

The data are also normalized to reduce the slight differences between repetitions of the same gesture at different times. The calculation formula is as follows:

Step three: segment the valid sign language data signal with a signal processing method based on standard deviation.

m(i) = |a(i) - median(a)|

where median(a) is the median and m(i) is the absolute value of each number in the phase sequence a(i) minus the median; this gives the phase sequence plot shown in FIG. 6.

Step 3.3: For the grouped standard deviations d(i), every data group whose value exceeds the threshold r (r = 0.1) is regarded as valid data and is extracted.

After the above steps, the valid sign language data signal has been extracted and the sign language data have been segmented, as shown in FIG. 7.

Step four: analyze each feature within the same sign language signal and across different sign language signals, select the features that remain stable within the same sign language signal yet clearly distinguish different sign language signals, and perform effective feature extraction.

4.2: The newly extracted feature SOS, the Sum of Standard Deviations, measures how strongly the data fluctuate.

SOS: Let N be the length of the array. Starting from the first element, each number together with the following (k-1) numbers forms a sub-array of length k. The standard deviation of each sub-array is computed, which gives (N-k+1) standard deviations; the sum of these (N-k+1) standard deviations is the SOS.

Function: SOS measures how smoothly or how strongly the data fluctuate. The value is large when the data fluctuate strongly and small when they fluctuate little.
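Besides SOS, step four relies on standard statistical descriptors of each valid segment. The sketch below computes the time-domain ones in plain Python as an illustration. It assumes population moments, non-excess kurtosis and energy defined as the sum of squares, none of which the text pins down; the frequency-domain features are omitted, and the function name `time_domain_features` is hypothetical.

```python
from statistics import fmean, pvariance


def time_domain_features(x):
    """Return a dict of basic statistical features of one segment."""
    n = len(x)
    mean = fmean(x)
    var = pvariance(x, mu=mean)
    std = var ** 0.5
    m3 = sum((v - mean) ** 3 for v in x) / n  # third-order central moment
    m4 = sum((v - mean) ** 4 for v in x) / n  # fourth-order central moment
    return {
        "mean": mean,
        "variance": var,
        "std": std,
        "third_central_moment": m3,
        "skewness": m3 / std ** 3 if std else 0.0,
        "kurtosis": m4 / var ** 2 if var else 0.0,  # non-excess kurtosis
        "energy": sum(v * v for v in x),             # assumed: sum of squares
    }


feats = time_domain_features([1.0, 2.0, 3.0, 4.0, 5.0])
```

For a feature vector, these values would be concatenated with the frequency-domain features and the SOS value before being passed to the classifier.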

Example 3

Experimental verification and evaluation were carried out in an indoor laboratory with a general-purpose RFID system. Three different experiments were performed to verify the recognition accuracy and robustness of the present invention and its advantages over other methods. As shown in FIG. 8, the main experimental equipment comprises a commercial RFID reader (an Impinj R420 reader), an ordinary directional antenna and off-the-shelf UHF passive tags (AZ-9662 RFID electronic tags). A network cable connects the RFID reader to the computer so that commands can be issued and data sent and received. Eight volunteers were recruited, 5 male and 3 female, aged 19 to 29, with heights from 1.56 m to 1.85 m and weights from 46 kg to 70 kg. The sign language data signals were processed in MATLAB 2018b.

This example was carried out in a laboratory of about 70 m² containing several tables and chairs. To determine a suitable distance between the antenna and the tag, distances from 0.8 m to 1.5 m were tested in steps of 0.1 m. When the distance exceeded 1.0 m, the received sign language signal was chaotic and valid phase sequence features were hard to extract; when the distance was below 1.0 m, the extracted phase sequence hardly reflected the true signal. The distance between the antenna and the tag was therefore set to 1.0 m. The equipment was placed 1.2 m above the ground, which suits most people; a height below 1.2 m greatly increases the interference of the ground with the signal. As shown in FIG. 8, the volunteer stands between the antenna and the tag.

The RFID-based sentence-level sign language recognition method provided by the embodiment of the present invention specifically includes the following steps:

Step one: a commercial radio frequency identification device based on COST RFID technology acquires the original sign language data signal over the radio link between the identification system and a passive tag; the phase sequence feature is extracted through the API of the RFID reader, giving the phase sequence of the original sign language data signal;

Step two: on the phase sequence obtained in step one, perform phase offset correction and apply a threshold-based wavelet denoising method to remove the Gaussian noise caused by the environment, obtaining the denoised phase sequence;

Step three: on the denoised phase sequence obtained in step two, segment the phase sequence of the valid sign language data signal with a signal processing method based on standard deviation, obtaining the segmented phase sequence of the valid sign language data;

Step four: on the segmented phase sequence obtained in step three, analyze each feature within the phase sequence of the same sign language data and across the phase sequences of different sign language data, select the features that remain stable within the same sign language signal phase sequence yet clearly distinguish the phase sequences of different sign language signals, and perform effective feature extraction, obtaining eleven effective features;

Step five: train a random forest classifier that integrates 50 decision trees on the eleven features obtained in step four. The input is the eleven features extracted from the phase sequences of the different sign language signals, and the label is the actual meaning of the corresponding sign language, for example "How old are you? (HOAY)" or "Nice to meet you. (NTMY)". Training yields the trained random forest classifier model.

Step six: use the trained random forest classifier model obtained in step five. For a new sign language data signal, the phase sequence feature is extracted through the API of the RFID reader described in step one; after the processing of steps one, two, three and four, the eleven features of the new signal are obtained and fed into the trained classifier described in step five, which outputs the label of the new signal, that is, the actual meaning of the corresponding sign language. Sentence-level sign language recognition is thereby achieved.

The RFID-based sign language recognition method of this embodiment is carried out according to the following steps:

The RFID system consists of RFID tags, an RFID reader and an RFID directional antenna. Commercial RFID devices provide the user with an application program interface (API), through which the user can obtain features such as the received signal strength indication and the phase of the passive tag.

Based on the phrases commonly used to communicate with deaf people in a history museum, the present invention designs 9 different sign languages, namely "Hi", "Da Tang Treasures Exhibition (DTTE)", "Digital Exhibition Hall (DEH)", "How old are you? (HOAY)", "Tang Dynasty Mural Exhibition Hall (TDMEH)", "Exhibition Exchange (EE)", "Nice to meet you. (NTMY)", "What is your name? (WIYN)", "Exhibition Hall (EH)", "Where is the toilet? (WITT)", "Map", "Where is the exit? (WITE)", "Closed time (CT)", "Cultural products (CP)" and "Parking location (PL)". Each volunteer learned the standard sign language before the experiment and performed each sign language gesture 100 times during the experiment; the signal sampling rate was set to 270 samples per second.

θ_true = θ - θ_i (θ_i = π or 2π)

θ = {θ₁, θ₂, ..., θ_i, ..., θ_n}

where n is the number of phase values in the sequence and θ_i is the i-th phase value.

m(i) = |a(i) - median(a)|, where median(a) is the median and m(i) is the absolute value of each number in the phase sequence a(i) minus the median; this gives the phase sequence plot shown in FIG. 6.

4.2: The newly extracted feature SOS, the Sum of Standard Deviations, measures how strongly the data fluctuate.

SOS: Let N be the length of the array. Starting from the first element, each number together with the following (k-1) numbers forms a sub-array of length k. The standard deviation of each sub-array is computed, which gives (N-k+1) standard deviations; the sum of these (N-k+1) standard deviations is the SOS.

Example 4: Verification of the sign language recognition effect

Experiment I:

Experiment I aims to verify the effectiveness and recognition accuracy of the present invention. The 9 sign languages commonly used in a history museum were designed and used, and the computation was iterated over the sign language data signals so that every sign language data signal enters the training set and is also evaluated in the test set.

Test results of Experiment I:

The accuracy of sign language recognition is shown with a confusion matrix. Each element of the confusion matrix represents the probability that one sign language is recognized as another. As shown in FIG. 9, the average recognition accuracy reaches 95.6%. "DEH", "WITT" and "CP" reach a recognition rate of 100%, while "TDMEH" and "NTMY" are recognized at comparatively lower rates; overall, the recognition rate is high.
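How a confusion matrix yields per-sign recognition rates and an average accuracy can be sketched as follows. The labels and predictions below are toy values invented for the illustration and are not the results reported in FIG. 9; the function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {lab: i for i, lab in enumerate(labels)}
    cm = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        cm[index[t]][index[p]] += 1
    return cm


def per_class_accuracy(cm):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(cm)]


# Toy data only: one "NTMY" sample is mistaken for "WITT".
labels = ["DEH", "NTMY", "WITT"]
y_true = ["DEH", "DEH", "NTMY", "NTMY", "WITT", "WITT"]
y_pred = ["DEH", "DEH", "NTMY", "WITT", "WITT", "WITT"]
cm = confusion_matrix(y_true, y_pred, labels)
rates = per_class_accuracy(cm)
average = sum(rates) / len(rates)
```

Dividing each row by its sum turns the counts into the per-sign probabilities that a confusion matrix such as the one in FIG. 9 displays.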

Experiment II:

Experiment II aims to verify the robustness of the RFID system of the present invention against interference from dynamic multipath in the surroundings. The experimental scene is the arrangement shown in FIG. 8.

Dynamic multipath usually interferes more strongly than static multipath, so the test was carried out in a dynamic multipath scenario. During the test, an additional volunteer walked around. The walking volunteer moved on circles centered on the gesturing volunteer, with radii from 0.8 m to 1.4 m (in steps of 0.1 m); throughout the test, the volunteer at the center kept performing the same gesture.

Test results of Experiment II:

As shown in FIG. 10, once the walking volunteer is more than 1.4 m away, the interference with the system almost disappears, and recognition is still possible at 1.2 m. Other existing methods do not reach this accuracy over such a range, so the method is robust to dynamic multipath interference from the surroundings.

The interference of dynamic multipath with the sign language recognition system is plotted in FIG. 11, and the relationship between the number of sign languages to be recognized and the recognition accuracy is plotted in FIG. 12.

Experiment III:

In Experiment III, the present invention is compared at sentence level with two methods from the sign language recognition literature, MyoSign and WiSign.

The test results of Experiment III are shown in Table 1.

Table 1. Comparison of sentence-level sign language recognition accuracy across the literature

WiSign processes the sign language data signal with support vector machines (SVM), and MyoSign processes the sign language signal with convolutional neural networks (CNN). Table 1 shows that WiSign reaches an accuracy of 93.8% and MyoSign reaches 93.1%, while the method of the present invention reaches 95.6%, a gain of 1.8 and 2.5 percentage points respectively. Moreover, WiSign uses three Wi-Fi devices to improve its performance, and MyoSign requires a wearable device and a large training data set. In summary, the present invention is low in cost and high in recognition accuracy.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in whole or in part as a computer program product, the computer program product includes one or more computer instructions. When the computer program instructions are loaded or executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wire (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a Solid State Disk (SSD)).

The above are only specific embodiments of the present invention, and the protection scope of the present invention is not limited to them. Any modification, equivalent replacement or improvement made by a person skilled in the art within the technical scope disclosed by the present invention, and within the spirit and principles of the present invention, shall fall within the protection scope of the present invention.

Claims

1. A sentence-level sign language recognition method is characterized by comprising the following steps:

step one, acquiring, with a commercial radio frequency identification device based on COST radio frequency identification technology, a phase sequence of a specific target over the radio link between an identification system and a passive tag, to obtain an original sign language data signal;

step two, performing phase offset correction and using a threshold-based wavelet denoising method to remove the phase offset caused by hardware and the Gaussian noise caused by the environment, to obtain a clean phase sequence;

step three, segmenting the valid sign language data signal by using a signal processing method based on standard deviation;

step four, analyzing each feature within the same sign language signal and across different sign language signals, selecting the features that remain stable within the same sign language signal and can be clearly distinguished between different sign language signals, and performing effective feature extraction;

step five, recognizing the sign language by using a random forest classifier with 50 decision trees.

2. The sentence-level sign language recognition method of claim 1, wherein in step two, performing phase offset correction and using the threshold-based wavelet denoising method to remove the phase offset caused by hardware and the Gaussian noise caused by the environment, to obtain the clean phase sequence, comprises:

(1) removing the offset caused by hardware and Gaussian noise by using the threshold-based wavelet denoising method, according to the following formula:

θ_true = θ - θ_n (θ_n = π or 2π);

wherein n is a serial number, θ_true is the true phase value after the phase offset is removed, θ is the original phase sequence value, and θ_n is the phase offset value;

(2) the phase sequence of the sign language data signal obtained from the API of the RFID reader, after noise removal, is as follows:

θ = {θ₁, θ₂, …, θ_i, …, θ_n};

where n is the number of phase values in the sequence, θ is the phase sequence after noise removal, and θ_i is a phase value in the phase sequence;

(3) normalizing the data to reduce the slight differences of the same gesture at different moments, with the following calculation formula:

where s(i) is the intermediate result of the standardization, θ(i) is a phase value in the phase sequence, min(θ) is the minimum value in the phase sequence, max(θ) is the maximum value in the phase sequence, and a(i) is the normalized i-th phase sequence sample.

3. The sentence-level sign language recognition method of claim 1, wherein in step three, segmenting the valid sign language data signal by using the signal processing method based on standard deviation comprises:

(1) subtracting the median from each phase value in the sample and taking the absolute value:

m(i) = |a(i) - median(a)|;

where median(a) is the median, and m(i) is the absolute value of each number in the phase sequence a(i) minus median(a);

(2) grouping the data in sequence and calculating the standard deviation of each group:

wherein d(i) is the resulting sequence of standard deviations, k is the length of each group, and μ is the mean of the group, with k = 50;

(3) using the values of d(i), a data group whose value is greater than the threshold r (r = 0.1) is valid data.

4. The sentence-level sign language recognition method of claim 1, wherein in step four, analyzing each feature within the same sign language signal and across different sign language signals, selecting the features that remain stable within the same sign language signal and can be clearly distinguished between different sign language signals, and performing effective feature extraction, comprises:

(1) extracting ten feature values of the obtained valid data group, the ten feature values comprising skewness, expected value, third-order central moment, mean, variance, standard deviation, kurtosis, energy, dominant-frequency ratio, and maximum frequency peak;

(2) extracting the SOS value of the obtained valid data:

SOS: letting N be the length of the array, starting from the first element, each number and the following (k-1) numbers form an array of length k; the standard deviation of each array is obtained, giving (N-k+1) standard deviations; the sum of the (N-k+1) standard deviations is the SOS;

the SOS compares how smoothly the data fluctuate: the value is large when the data fluctuate strongly and small when the fluctuation is small, and the formula is as follows:

where N is the total number of data in the data set, μ is the average of k data, and k is the number of data per standard deviation.

5. The sentence-level sign language recognition method of claim 1, wherein in step five, the sign language recognition is implemented by a random forest (RF) classifier using 50 decision trees, and comprises the following steps:

(1) shuffling the data, extracting 1/10 of the data groups as the test set and using the remaining data as the training set, and importing the test set and the training set into the RF classifier for recognition;

(2) extracting another 1/10 of the data groups as the test set and importing the data into the RF classifier again, repeating in a loop until every group of data has been tested, thereby obtaining the final test result.
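Steps (1) and (2) amount to a ten-fold rotation of the test set. The sketch below shows only that rotation, under stated assumptions: the names `ten_fold_accuracy`, `fit_predict` and `nearest_neighbour` are hypothetical, and the nearest-neighbour function is a stand-in classifier used so the example runs with the standard library alone. It is not a random forest.

```python
import random


def ten_fold_accuracy(samples, labels, fit_predict, n_folds=10, seed=0):
    """Shuffle the data, then let each of n_folds groups serve once as the test set."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    # Fold f takes every n_folds-th shuffled index, so fold sizes differ by at most one.
    folds = [order[f::n_folds] for f in range(n_folds)]
    correct = 0
    for f, test_idx in enumerate(folds):
        # All other folds together form the training set for this round.
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        predicted = fit_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in test_idx],
        )
        correct += sum(p == labels[i] for p, i in zip(predicted, test_idx))
    return correct / len(samples)


def nearest_neighbour(train_x, train_y, test_x):
    """Stand-in classifier (NOT a random forest): label of the closest training sample."""
    def closest(x):
        best = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
        return train_y[best]
    return [closest(x) for x in test_x]


# Two well-separated one-dimensional clusters as illustrative data.
samples = [i * 0.01 for i in range(20)] + [10 + i * 0.01 for i in range(20)]
labels = [0] * 20 + [1] * 20
accuracy = ten_fold_accuracy(samples, labels, nearest_neighbour)
```

To follow the claim more closely, `fit_predict` could wrap a 50-tree random forest, for example scikit-learn's `RandomForestClassifier(n_estimators=50)`. That choice of library is an assumption, as the patent does not name one.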

6. A sentence-level sign language recognition system for implementing the sentence-level sign language recognition method according to any one of claims 1 to 5, wherein the sentence-level sign language recognition system comprises:

the sign language data signal acquisition module, configured to use a commercial radio frequency identification device based on COTS (commercial off-the-shelf) radio frequency identification technology to acquire, by radio, a phase sequence of a specific target between the identification system and a passive tag, so as to obtain an original sign language data signal;

the signal denoising processing module, configured to perform phase offset correction and to remove, by a threshold-based wavelet denoising method, the phase offset caused by the environment and hardware together with Gaussian noise, so as to obtain a clean phase sequence;

the sign language signal segmentation module is used for realizing the segmentation of effective sign language data signals by utilizing a signal processing method based on standard deviation;

the feature extraction module, configured to analyze each feature within the same sign language signal and across different sign language signals, to select the features that remain stable within the same sign language signal and are clearly distinguishable between different sign language signals, and to perform effective feature extraction;

and the sign language recognition module, configured to implement sign language recognition by a random forest classifier using 50 decision trees.

7. A computer device, characterized in that the computer device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of:

using a commercial radio frequency identification device based on COTS (commercial off-the-shelf) radio frequency identification technology to acquire, by radio, a phase sequence of a specific target between the identification system and a passive tag, so as to obtain an original sign language data signal; performing phase offset correction and removing, by a threshold-based wavelet denoising method, the phase offset caused by the environment and hardware together with Gaussian noise, so as to obtain a clean phase sequence; segmenting the effective sign language data signals by a signal processing method based on standard deviation;

analyzing each feature within the same sign language signal and across different sign language signals, selecting the features that remain stable within the same sign language signal and are clearly distinguishable between different sign language signals, and performing effective feature extraction; and implementing sign language recognition by a random forest classifier using 50 decision trees.

8. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:

9. An information data processing terminal characterized by being configured to implement the sentence-level sign language recognition system of claim 6.

10. An application of the sentence-level sign language recognition system of claim 6 in the technical field of artificial intelligence.