
CN115343704A - Hand gesture recognition method for FMCW millimeter wave radar based on multi-task learning - Google Patents

Hand gesture recognition method for FMCW millimeter wave radar based on multi-task learning

Info

Publication number
CN115343704A
Authority
CN
China
Prior art keywords
gesture recognition
gesture
wave radar
feature vector
target feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210906729.XA
Other languages
Chinese (zh)
Other versions
CN115343704B (en)
Inventor
曾三友
周琳洁
杨秀晴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences
Original Assignee
China University of Geosciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences filed Critical China University of Geosciences
Priority to CN202210906729.XA priority Critical patent/CN115343704B/en
Publication of CN115343704A publication Critical patent/CN115343704A/en
Application granted granted Critical
Publication of CN115343704B publication Critical patent/CN115343704B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00 Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88 Radar or analogous systems specially adapted for specific applications
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00 Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/02 Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00
    • G01S7/41 Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
    • G01S7/417 Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section involving the use of neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Theoretical Computer Science (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Electromagnetism (AREA)
  • Radar Systems Or Details Thereof (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a gesture recognition method for FMCW millimeter-wave radar based on multi-task learning, comprising the following steps: acquiring a time series of target feature vectors of a gesture to be recognized; inputting the time series of target feature vectors into a multi-task learning gesture recognition model, and outputting a gesture classification result. The multi-task learning gesture recognition model is trained on sample target-feature-vector time series together with corresponding gesture category labels and gesture trajectory labels, and both the target-feature-vector time series and the sample target-feature-vector time series are obtained by processing raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm. The multi-dimensional features of user gestures are thus acquired effectively and conveniently, and multi-task learning improves the accuracy of gesture recognition and classification.

Description

Gesture Recognition Method for FMCW Millimeter-Wave Radar Based on Multi-Task Learning

Technical Field

The present invention relates to the technical field of gesture recognition, and in particular to a gesture recognition method for FMCW (Frequency-Modulated Continuous-Wave) millimeter-wave radar based on multi-task learning.

Background Art

In recent years, human-computer interaction technology has gradually entered daily life. Gesture recognition, one of the most intuitive means of human interaction, lets users interact with machines in a more natural way through palm or finger movements and has gradually become a research hotspot. According to the data source, existing gesture recognition schemes can be divided into methods based on visual sensors, inertial sensors, and radio-frequency sensors. Schemes based on optical sensors use a camera to capture video of the scene containing the gesture and then use computer algorithms to identify, extract, and classify the gesture features in the images. The hardware is relatively simple, and the captured images and videos provide high spatial resolution, but because two-dimensional or even three-dimensional image data must be processed, the data volume and computational cost are large, and the high power consumption makes such schemes difficult to apply in portable devices. In addition, optical gesture recognition is sensitive to ambient light and prone to visual blind spots and occlusion, which limits where users can apply it and carries a risk of leaking user privacy. Inertial sensors can measure the acceleration of an object, but they require the user to wear the device throughout the action, which makes for a poor user experience.

With the development of radio-frequency technology and integrated circuits, gesture recognition methods based on wireless signals such as radar and WiFi have gradually attracted the attention of researchers. Compared with optical and inertial sensors, wireless signals place lower demands on the environment, and millimeter-wave radar in particular has many unique advantages. First, the radar's transmission and reception of signals and its collection of gesture information are not affected by weather conditions or ambient illumination, which greatly broadens the application scenarios the system can cover. Radar can also propagate through occlusions, making gesture interaction possible under partial or complete occlusion. Second, receiving radar echoes consumes less energy than optical solutions, millimeter-wave radar can be integrated into embedded devices more easily, and it is more convenient than wearable solutions. At the same time, radar signals are more secure and confidential and essentially do not leak user privacy. In addition, compared with other lower-frequency radio-frequency sensors, millimeter-wave radar sensors have a narrower beam and higher spatial resolution. Because of these advantages, millimeter-wave radar has attracted researchers' attention in recent years. However, millimeter-wave-radar-based gesture recognition methods differ inherently in how data acquisition, feature extraction and fusion, and gesture classification algorithms are designed, so the algorithms are still not fully mature and some limitations remain; for example, the information the radar can provide is not fully exploited, and the recognition models are large and have high time complexity.

In terms of both feature extraction and gesture classification, radar-based gesture recognition has achieved certain results in research and application, but several challenges remain:

(1) Construction of labeled datasets. When deep learning is used for gesture recognition, sufficient gesture data is the basis of model training, but with existing hardware platforms, acquiring gesture data takes a great deal of time and is very expensive.

(2) Multi-dimensional gesture feature extraction. Existing algorithms that use micro-Doppler spectrograms and range-Doppler spectrograms as network input involve a large amount of data, lack correspondence between the features of individual target points, and do not exploit or fuse azimuth and elevation information. Failing to make full use of the information provided by the radar lowers gesture recognition accuracy.

(3) Classifier design. When deep learning is used to classify radar gestures, a single type of recognition model is usually adopted; the recognition algorithm is computationally expensive, the model is complex, and the effective features of the input data are not fully extracted, resulting in low gesture classification accuracy.

In summary, how to extract multi-dimensional gesture features effectively and conveniently, and how to design a classifier that improves gesture classification accuracy, remain urgent problems for those skilled in the art.

Summary of the Invention

The present invention provides a gesture recognition method for FMCW millimeter-wave radar based on multi-task learning, which is used to solve the problems that existing multi-dimensional gesture feature extraction is neither effective nor convenient and that gesture classification accuracy is low.

The present invention provides a gesture recognition method for FMCW millimeter-wave radar based on multi-task learning, comprising:

acquiring a time series of target feature vectors of a gesture to be recognized;

inputting the time series of target feature vectors into a multi-task learning gesture recognition model, and outputting a gesture category result;

wherein the multi-task learning gesture recognition model is trained on sample target-feature-vector time series, corresponding gesture category labels, and corresponding gesture trajectory labels, and both the target-feature-vector time series and the sample target-feature-vector time series are obtained by processing raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm.

According to the gesture recognition method for FMCW millimeter-wave radar based on multi-task learning provided by the present invention, the multi-task learning gesture recognition model includes two learning tasks during training: gesture recognition/classification and gesture trajectory reconstruction.

According to the gesture recognition method for FMCW millimeter-wave radar based on multi-task learning provided by the present invention, the overall loss function of the multi-task learning gesture recognition model during training is a weighted combination of the loss functions of the gesture classification part and the gesture trajectory reconstruction part.

According to the gesture recognition method for FMCW millimeter-wave radar based on multi-task learning provided by the present invention, the training network structure of the multi-task learning gesture recognition model includes:

a one-dimensional CNN (Convolutional Neural Network) layer, an LSTM (Long Short-Term Memory) layer, and a one-dimensional transposed convolution layer connected in sequence;

and a one-dimensional CNN layer, an LSTM layer, and a fully connected layer connected in sequence.

According to the gesture recognition method for FMCW millimeter-wave radar based on multi-task learning provided by the present invention, the one-dimensional CNN layer has 128 convolution kernels, the fully connected layer uses a softmax activation function, and the one-dimensional transposed convolution uses 128 deconvolution kernels of size 7×1.

According to the gesture recognition method for FMCW millimeter-wave radar based on multi-task learning provided by the present invention, both the target-feature-vector time series and the sample target-feature-vector time series are obtained by processing raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm, specifically including:

extracting the spatial and temporal features of the gesture signals acquired by the FMCW millimeter-wave radar with a specific multi-dimensional fusion feature extraction algorithm, and fusing the range, angle, and velocity information at each moment, in one-to-one correspondence, in the form of a vector, to obtain the target-feature-vector time series and the sample target-feature-vector time series.

The present invention also provides a gesture recognition method for FMCW millimeter-wave radar based on multi-task learning, wherein both the target-feature-vector time series and the sample target-feature-vector time series are obtained by processing raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm, specifically including:

for the range values in the raw gesture recognition features acquired by the FMCW millimeter-wave radar in each frame, first performing a fast-time FFT on the intermediate-frequency signal, removing static clutter, and then detecting possible target ranges by CFAR;

then computing the azimuth angle spectrum at the corresponding ranges with the Capon algorithm and detecting, by CFAR (Constant False Alarm Rate), the target azimuths that may exist at specific ranges;

computing the elevation angle spectrum at the corresponding ranges and azimuths with the Capon algorithm and detecting, by CFAR, the target elevation angles present at specific ranges and azimuths;

finally computing the target velocity of each target point by beamforming;

constructing the target range, target azimuth, target elevation, and target velocity into target gesture recognition features, performing Doppler FFT processing, and then, based on density clustering, extracting the cluster centers as the target-feature-vector time series and the sample target-feature-vector time series.

According to the present invention, there is provided a gesture recognition apparatus for FMCW millimeter-wave radar based on multi-task learning, comprising:

an acquisition unit, configured to acquire a time series of target feature vectors of a gesture to be recognized;

a recognition unit, configured to input the time series of target feature vectors into the multi-task learning gesture recognition model and output a gesture category result;

wherein the multi-task learning gesture recognition model is trained on sample target-feature-vector time series, corresponding gesture category labels, and corresponding gesture trajectory labels, and both the target-feature-vector time series and the sample target-feature-vector time series are obtained by processing raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm.

The present invention also provides an electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of any one of the above gesture recognition methods for FMCW millimeter-wave radar based on multi-task learning.

The present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of any one of the above gesture recognition methods for FMCW millimeter-wave radar based on multi-task learning.

In the gesture recognition method for FMCW millimeter-wave radar based on multi-task learning provided by the present invention, a time series of target feature vectors of the gesture to be recognized is acquired; the time series of target feature vectors is input into a multi-task learning gesture recognition model, and a gesture category result is output; the multi-task learning gesture recognition model is trained on sample target-feature-vector time series, corresponding gesture category labels, and corresponding gesture trajectory labels, and both the target-feature-vector time series and the sample target-feature-vector time series are obtained by processing raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm. The multi-dimensional features of user gestures are thus acquired effectively and conveniently, and multi-task learning improves the accuracy of gesture recognition and classification.

Brief Description of the Drawings

In order to illustrate the technical solutions of the present invention or the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of the gesture recognition method for FMCW millimeter-wave radar based on multi-task learning provided by the present invention;

Fig. 2 is a physical structure diagram of the FMCW millimeter-wave radar hardware platform provided by the present invention;

Fig. 3 is a schematic diagram of the interface for configuring the radar transmission signal parameters according to the present invention;

Fig. 4 is a schematic diagram of the multi-task learning network provided by the present invention;

Fig. 5 is a schematic diagram of the multi-task learning network framework provided by the present invention;

Fig. 6 is a flowchart of the multi-dimensional feature extraction provided by the present invention;

Fig. 7 shows the change of validation accuracy during the training process provided by the present invention;

Fig. 8 is the confusion matrix of the test set classification results provided by the present invention;

Fig. 9 is a schematic structural diagram of the gesture recognition apparatus for FMCW millimeter-wave radar based on multi-task learning provided by the present invention;

Fig. 10 is a schematic structural diagram of the electronic device provided by the present invention.

Detailed Description of the Embodiments

In order to make the purpose, technical solutions, and advantages of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

Existing gesture recognition and classification techniques generally suffer from multi-dimensional gesture feature extraction that is neither effective nor convenient and from low gesture classification accuracy. The gesture recognition method for FMCW millimeter-wave radar based on multi-task learning of the present invention is described below with reference to Fig. 1. Fig. 1 is a schematic flowchart of this method; as shown in Fig. 1, the method includes:

Step 110: acquire a time series of target feature vectors of the gesture to be recognized.

Specifically, the time series of target feature vectors of the gesture to be recognized is acquired: the feature vectors formed by the range, velocity, azimuth, and elevation of the gesture to be recognized are passed through multi-dimensional feature fusion to obtain target feature vectors, and the target feature vectors of multiple frames are then concatenated in time order to obtain the time series of target feature vectors, from which the corresponding gesture type is recognized.
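As a minimal illustration of how such a time series can be assembled, the sketch below stacks the per-frame fused features into a fixed-length array; the dictionary keys and the helper name are assumptions made for illustration, while the 100-frame length and the four features follow the description elsewhere in this text.

```python
import numpy as np

def build_feature_sequence(frames, num_frames=100):
    """Stack per-frame [range, azimuth, elevation, velocity] features into a
    fixed-length time series; `frames` is assumed to hold one fused feature
    (the cluster centre) per radar frame."""
    seq = np.zeros((4, num_frames), dtype=np.float32)
    for t, frame in enumerate(frames[:num_frames]):
        seq[:, t] = [frame["range"], frame["azimuth"],
                     frame["elevation"], frame["velocity"]]
    return seq  # shape (4, num_frames), the classifier input
```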

Step 120: input the time series of target feature vectors into the multi-task learning gesture recognition model, and output a gesture category result;

wherein the multi-task learning gesture recognition model is trained on sample target-feature-vector time series, corresponding gesture category labels, and corresponding gesture trajectory labels, and both the target-feature-vector time series and the sample target-feature-vector time series are obtained by processing raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm.

Specifically, the time series of target feature vectors is input into the multi-task learning gesture recognition model, which outputs a predicted label, i.e., the gesture category recognition result corresponding to the input time series. The multi-task learning gesture recognition model is obtained by training on a large number of sample target-feature-vector time series together with the gesture category labels and gesture trajectory labels that correspond one-to-one to the samples; the more labeled samples used, the higher the accuracy of the trained model in use. The training model used in the embodiments of the present invention is of the multi-task learning type: in addition to outputting gesture classification results, the model can also output a trajectory-reconstruction-like result, so the two learning tasks jointly adjust the trainable parameters of the network under the supervision of their respective labels, which makes the trained model more accurate when it is used to recognize gestures. It should be noted here that both the target-feature-vector time series and the sample target-feature-vector time series are obtained by processing the raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm.

It should be further noted that this method uses the hardware experiment platform and software of an FMCW millimeter-wave radar to collect data, then designs a multi-dimensional fusion feature extraction method for the collected radar gesture signals, carries out gesture recognition algorithm research on the basis of the established radar gesture dataset, and proposes a multi-task neural network structure that makes full use of the parameter information extracted by the radar so as to achieve higher classification accuracy and faster classification speed.

Fig. 2 is a physical structure diagram of the FMCW millimeter-wave radar hardware platform provided by the present invention. As shown in Fig. 2, the FMCW millimeter-wave radar platform used in the embodiment of the present invention mainly consists of two parts: a Texas Instruments IWR1443 radar module and a DCA1000 data acquisition board. The IWR1443 radar development board is mainly responsible for generating and transmitting chirp waves, receiving and demodulating target echoes, and obtaining intermediate-frequency signals carrying target information. The DCA1000 data acquisition board is responsible for transmitting the intermediate-frequency signals to a personal computer for subsequent data processing. On the software side, Fig. 3 is a schematic diagram of the interface for configuring the radar transmission signal parameters; as shown in Fig. 3, the embodiment of the present invention uses the mmWave Studio host software developed by TI to configure the radar and to collect data and store it on the PC.

Because the gain of a commercial radar's antenna pattern drops quickly and does not take the requirements of gesture recognition applications into account, for the gestures and gesture recognition scenarios adopted in the present invention an antenna optimization algorithm is designed to optimize the array antenna pattern, so as to increase the pattern gain in the target region of interest and reduce interference from other regions. It is also expected that, by analyzing the configuration parameters of the radar signal acquisition platform, the optimal configuration of the FMCW millimeter-wave radar system for gesture recognition can be found, so that the radar performance is exploited to the fullest; gesture data are collected on this basis to construct the gesture dataset.

The gesture recognition method for FMCW millimeter-wave radar based on multi-task learning provided by the embodiment of the present invention acquires a time series of target feature vectors of the gesture to be recognized, inputs the time series into a multi-task learning gesture recognition model, and outputs a gesture category result; the multi-task learning gesture recognition model is trained on sample target-feature-vector time series and the corresponding gesture category labels, and both the target-feature-vector time series and the sample target-feature-vector time series are obtained by processing raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm. This achieves effective and convenient acquisition of multi-dimensional gesture features and improves the accuracy of gesture recognition and classification through multi-task learning.

Based on the above embodiments, in this method the multi-task learning gesture recognition model includes two learning tasks during training: gesture recognition/classification and gesture trajectory reconstruction.

Specifically, during training the multi-task learning gesture recognition model includes the two learning tasks of gesture classification and gesture trajectory reconstruction; that is, the labels used during training include both the gesture category labels and the gesture trajectory labels corresponding to the sample target-feature-vector time series. These two labels are used to jointly adjust the trainable parameters of the model network while "correcting" the predicted labels and the reconstructed trajectories.

Based on the above embodiments, in this method the overall loss function of the multi-task learning gesture recognition model during training is a weighted combination of the loss functions of the gesture classification part and the gesture trajectory reconstruction part.

Specifically, the compressed representation is used as the input to both the gesture trajectory reconstruction part and the gesture classification part, and the weighted sum of the loss functions of the two parts is used as the overall loss function, so that the reconstruction task helps the feature extraction module in the network extract features from the input data better and thereby improves the performance of the gesture classification part.
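A minimal sketch of such a weighted multi-task loss is given below; the weighting factor, the tensor names, and the specific choice of cross-entropy and mean-squared-error terms are illustrative assumptions, since the patent text does not spell them out.

```python
import torch
import torch.nn.functional as F

def multitask_loss(class_logits, class_labels, recon_traj, true_traj, alpha=0.5):
    """Weighted sum of a gesture-classification loss and a trajectory-
    reconstruction loss; alpha is an assumed factor balancing the two tasks."""
    cls_loss = F.cross_entropy(class_logits, class_labels)  # classification part
    rec_loss = F.mse_loss(recon_traj, true_traj)            # reconstruction part
    return alpha * cls_loss + (1.0 - alpha) * rec_loss
```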

Based on the above embodiments, in this method the training network structure of the multi-task learning gesture recognition model includes:

a one-dimensional CNN layer, an LSTM layer, and a one-dimensional transposed convolution layer connected in sequence;

and a one-dimensional CNN layer, an LSTM layer, and a fully connected layer connected in sequence.

Specifically, Fig. 4 is a schematic diagram of the multi-task learning network provided by the present invention. As shown in Fig. 4, the training network structure of the multi-task learning gesture recognition model includes a one-dimensional CNN layer, an LSTM layer, and a one-dimensional transposed convolution layer (1D CNN Transpose) connected in sequence, as well as a one-dimensional CNN layer, an LSTM layer, and a fully connected layer connected in sequence. The neural network structure used in the present invention can be divided into four stages: feature extraction, encoding, classification, and reconstruction. In the feature extraction stage, one-dimensional convolution is used to extract appropriate and sufficient features from the input range, angle, and other time-series trajectories. In the encoding stage, the extracted features are mapped to a hidden layer by the LSTM layer; the LSTM in the model converts the input data into a learned compressed representation, which serves as the input of the classification and reconstruction stages. The classification stage consists of a fully connected layer that applies a softmax function to compute the conditional probability of the output from the learned input trajectory features. In the reconstruction (decoding) stage, the compressed representation of the input trajectory is fed to several one-dimensional transposed convolution layers, which attempt to reconstruct the input feature trajectory.

Based on the above embodiments, in this method the one-dimensional CNN layer has 128 convolution kernels, the fully connected layer uses a softmax activation function, and the one-dimensional transposed convolution layers use 128 deconvolution kernels of size 7×1.

Specifically, Fig. 5 is a schematic diagram of the multi-task learning network framework provided by the present invention. As shown in Fig. 5, the input of the one-dimensional CNN is 4×100 and this layer has 128 convolution kernels. The one-dimensional CNN layer is followed by an LSTM layer, and the output of the LSTM layer is fed to a fully connected layer with a softmax activation function. The output of the fully connected layer is then fed to seven one-dimensional transposed convolution layers: the first two layers use 32 deconvolution kernels of size 7×1, the next two layers use 64 deconvolution kernels of size 7×1, the two layers after that use 128 deconvolution kernels of size 7×1, and the final one-dimensional transposed convolution layer uses 128 deconvolution kernels of size 7×1. In addition, all convolution and transposed-convolution layers use rectified linear units (ReLU) as the activation function. To avoid overfitting, dropout is applied in the model's last one-dimensional transposed convolution layer.
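The following PyTorch sketch illustrates this two-branch structure under stated assumptions: the channel counts of the transposed-convolution stack follow the description above, while the kernel size of the first convolution, the LSTM hidden size, the dropout rate, the final projection back to four feature channels, and the exact wiring between the shared encoder and the two heads are assumptions made to keep the example runnable, not details fixed by the patent. The softmax mentioned in the text is folded into the cross-entropy loss during training in this sketch.

```python
import torch
import torch.nn as nn

class MultiTaskGestureNet(nn.Module):
    """Sketch: shared 1D-CNN + LSTM encoder, a classification head, and a
    transposed-convolution reconstruction head (assumptions noted above)."""

    def __init__(self, num_classes=8, hidden=128):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv1d(4, 128, kernel_size=7, padding=3),
                                  nn.ReLU())
        self.lstm = nn.LSTM(input_size=128, hidden_size=hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)  # softmax folded into the loss
        channels = [hidden, 32, 32, 64, 64, 128, 128, 128]  # seven transposed-conv layers
        layers = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            layers += [nn.ConvTranspose1d(c_in, c_out, kernel_size=7, padding=3),
                       nn.ReLU()]
        layers += [nn.Dropout(0.5),                   # dropout in the last decoder stage
                   nn.Conv1d(128, 4, kernel_size=1)]  # assumed projection to 4 features
        self.decoder = nn.Sequential(*layers)

    def forward(self, x):                        # x: (batch, 4, 100)
        f = self.conv(x)                         # (batch, 128, 100)
        h, _ = self.lstm(f.permute(0, 2, 1))     # (batch, 100, hidden)
        logits = self.classifier(h[:, -1, :])    # compressed representation -> classes
        recon = self.decoder(h.permute(0, 2, 1)) # (batch, 4, 100) reconstruction
        return logits, recon
```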

Based on the above embodiments, in this method both the target-feature-vector time series and the sample target-feature-vector time series are obtained by processing the raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm, specifically including:

extracting the spatial and temporal features of the gesture signals acquired by the FMCW millimeter-wave radar with a specific multi-dimensional fusion feature extraction algorithm, and fusing the range, angle, and velocity information at each moment, in one-to-one correspondence, in the form of a vector, to obtain the target-feature-vector time series and the sample target-feature-vector time series.

Specifically, a multi-dimensional fusion feature extraction method is designed that extracts the spatial and temporal features of the gesture signal and fuses the range, angle, and velocity information at each moment, in one-to-one correspondence, in the form of a vector. This solves the problems that using images as network input involves a large amount of data and that the features of individual target points lack correspondence, and it reflects more precisely how the hand changes over time during movement.

Based on the above embodiments, in this method both the target-feature-vector time series and the sample target-feature-vector time series are obtained by processing the raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm, specifically including:

for the range values in the raw gesture recognition features acquired by the FMCW millimeter-wave radar in each frame, first performing a fast-time FFT on the intermediate-frequency signal, removing static clutter, and then detecting possible target ranges by CFAR;

then computing the azimuth angle spectrum at the corresponding ranges with the Capon algorithm and detecting, by CFAR, the target azimuths that may exist at specific ranges;

computing the elevation angle spectrum at the corresponding ranges and azimuths with the Capon algorithm and detecting, by CFAR, the target elevation angles present at specific ranges and azimuths;

finally computing the target velocity of each target point by beamforming;

constructing the target range, target azimuth, target elevation, and target velocity into target gesture recognition features, performing Doppler FFT processing, and then, based on density clustering, extracting the cluster centers as the target-feature-vector time series and the sample target-feature-vector time series.

Specifically, in order to make full use of the effective information provided by the radar and to integrate the individual features, the embodiment of the present invention designs a multi-dimensional feature-fusion gesture feature extraction method that puts the range, velocity, azimuth, and elevation features of each target at each moment into correspondence, reflects more precisely how the hand changes over time during movement, reduces the input size of the classification model, and reduces the computation of the classification model's forward pass. Fig. 6 is the multi-dimensional feature extraction flowchart provided by the present invention, and the specific steps are shown in Fig. 6. In each frame, a fast-time FFT is first applied to the intermediate-frequency signal; after static clutter is removed, CFAR detects the ranges at which targets may exist; the Capon algorithm is then used at the corresponding ranges to compute the azimuth angle spectrum, and CFAR detects the azimuths at which targets may exist at those ranges; similarly, the elevation angles of the targets are detected at the corresponding ranges and azimuths. After the range, azimuth, and elevation of the targets are computed, beamforming is applied to the eight echo channels, a Doppler FFT is then performed along the slow-time dimension to compute the velocity of each target point, and the target points of each frame are clustered according to their positions in space; the cluster center is taken as the position of the target in the current frame, and the feature vector is constructed from it.
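Two building blocks of this per-frame pipeline, a cell-averaging CFAR detector and a Capon angular spectrum for a uniform linear array, are sketched below; the guard/training cell counts, detection threshold, element spacing, and diagonal loading are illustrative assumptions rather than values taken from the patent.

```python
import numpy as np

def ca_cfar(power, guard=2, train=8, scale=3.0):
    """Cell-averaging CFAR along a 1-D power profile (range, azimuth or
    elevation spectrum); returns the indices of detected targets."""
    hits = []
    for i in range(len(power)):
        left = power[max(0, i - guard - train):max(0, i - guard)]
        right = power[i + guard + 1:i + guard + 1 + train]
        noise = np.concatenate([left, right])
        if noise.size and power[i] > scale * noise.mean():
            hits.append(i)
    return np.asarray(hits, dtype=int)

def capon_spectrum(snapshots, angles_deg, d=0.5):
    """Capon angular spectrum for a uniform linear array; `snapshots` is
    (n_antennas, n_samples) and d is the element spacing in wavelengths."""
    n_ant = snapshots.shape[0]
    R = snapshots @ snapshots.conj().T / snapshots.shape[1]
    R_inv = np.linalg.pinv(R + 1e-6 * np.eye(n_ant))   # diagonal loading
    spectrum = np.empty(len(angles_deg))
    for k, theta in enumerate(np.deg2rad(angles_deg)):
        a = np.exp(-2j * np.pi * d * np.arange(n_ant) * np.sin(theta))
        spectrum[k] = 1.0 / np.real(a.conj() @ R_inv @ a)
    return spectrum
```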

After the raw radar echo signals are preprocessed, each gesture is represented by the time-series trajectories of range, angle, and velocity, and the range, azimuth, elevation, and velocity vectors of 100 consecutive frames are used as the classifier input to build the gesture dataset. To make full use of the extracted gesture feature vectors, a network framework based on multi-task learning is used for gesture recognition.

The key points of the embodiments of the present invention are as follows. 1. First, a multi-dimensional fusion feature extraction method is designed: the spatial and temporal features of the gesture signal are extracted, and the range, angle, and velocity information at each moment is fused, in one-to-one correspondence, in the form of a vector, which solves the problems that using images as network input involves a large amount of data and that the features of individual target points lack correspondence, and reflects more precisely how the hand changes over time during movement. While a Kalman filter is used to smooth the gesture trajectories, the results are fine-tuned by changing the filter parameters, which increases the number of samples while preserving the filtering effect and thus realizes sample augmentation. 2. Second, a multi-task learning network framework is designed, which divides the overall network into two task modules, trajectory reconstruction and gesture classification, so that the reconstruction task helps the shared part of the network extract features from the input data better. The features of the input information are first extracted by a one-dimensional convolution layer, the extracted information is then encoded by the LSTM layer into a compressed representation, and this compressed representation is used as the input of both the reconstruction part and the classification part; the weighted sum of the loss functions of the reconstruction part and the classification part is used as the overall loss function, so that the reconstruction task helps the feature extraction module in the network extract features from the input data better and thereby improves the performance of the classification part.
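The Kalman-filter-based smoothing and augmentation step can be sketched as follows; the constant-velocity state model and the particular noise settings being varied are assumptions made for illustration.

```python
import numpy as np

def kalman_smooth(track, q=1e-3, r=1e-2):
    """1-D constant-velocity Kalman filter over one feature trajectory
    (e.g. the range track); different (q, r) settings give slightly
    different smoothed trajectories, usable as augmented samples."""
    F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition: position, velocity
    H = np.array([[1.0, 0.0]])               # only the position is observed
    Q, R = q * np.eye(2), np.array([[r]])
    x, P = np.array([track[0], 0.0]), np.eye(2)
    smoothed = []
    for z in track:
        x, P = F @ x, F @ P @ F.T + Q                 # predict
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # Kalman gain
        x = x + K @ (np.array([z]) - H @ x)           # update
        P = (np.eye(2) - K @ H) @ P
        smoothed.append(x[0])
    return np.asarray(smoothed)

# sample augmentation: vary the assumed noise parameters
# augmented = [kalman_smooth(range_track, q, r) for q, r in [(1e-3, 1e-2), (1e-2, 1e-1)]]
```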

Compared with the prior art, the advantages of the embodiments of the present invention are as follows. First, a multi-dimensional fusion feature extraction method is designed: the spatial and temporal features of the gesture signal are extracted, and the range, angle, and velocity information at each moment is fused, in one-to-one correspondence, in the form of a vector, which solves the problems that using images as network input involves a large amount of data and that the features of individual target points lack correspondence, and reflects more precisely how the hand changes over time during movement. Experiments show that the trajectories of different gestures differ significantly and each exhibits distinct characteristics.

Experimental verification:

The designed multi-task learning model is used to train and classify the gesture dataset, with 80% of the total samples used for training and the remaining 20% used for testing. Given the network structure and loss function used, the present invention adopts the Adam optimizer with the learning rate set to 0.0005, the number of epochs set to 50, and the batch size set to 32; the network is then trained and tested and the experimental results are collected. The model's average recognition accuracy over the six gestures is 99%. Fig. 7 shows the change of validation accuracy during training, and Fig. 8 shows the confusion matrix of the test set classification results. As shown in Fig. 8, the overall recognition accuracy of the eight gesture classes finally converges at 99%. The misclassified samples are mainly from the third gesture class (swipe left) and the fourth gesture class (swipe right); according to the analysis, the main reason is probably that these two gestures differ only in the azimuth feature while the remaining feature trajectories are essentially the same, so they are more likely to be confused.
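A minimal training-loop sketch with the reported hyperparameters (Adam, learning rate 0.0005, 50 epochs, batch size 32, 80/20 split) is shown below; the dummy tensors stand in for the real gesture dataset, and the model and loss reuse the earlier sketches, all of which are assumptions rather than the patent's actual code.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# dummy data standing in for the real gesture dataset (4x100 feature sequences)
X = torch.randn(320, 4, 100)            # feature-vector time series
y = torch.randint(0, 8, (320,))         # gesture class labels
T = torch.randn(320, 4, 100)            # trajectory targets for reconstruction
dataset = TensorDataset(X, y, T)

n_train = int(0.8 * len(dataset))       # 80% training, 20% testing
train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

model = MultiTaskGestureNet(num_classes=8)                   # earlier sketch
optimizer = torch.optim.Adam(model.parameters(), lr=0.0005)

for epoch in range(50):
    for seq, label, traj in train_loader:
        logits, recon = model(seq)
        loss = multitask_loss(logits, label, recon, traj)    # earlier weighted loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```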

The multi-task network structure of the present invention is compared with a single-task network structure in which the reconstruction module is removed, as well as with the work of Infineon. To verify the recognition performance, the present invention uses the following metrics to compare the algorithms in terms of classification performance and time complexity:

(1) Accuracy: the ratio of the number of correctly classified samples in the test set to the total number of samples, which reflects the recognition algorithm's ability to judge the dataset.

(2) Floating-point operations (FLOPs): the amount of computation, which can be used to measure the complexity of the model.

(3) Number of parameters: the number of neural network parameters, which measures the size of the model (see the snippet after this list).
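As a small illustration of how these quantities can be measured for the sketch model above (not the patent's own evaluation code):

```python
# metric (3): count the trainable parameters of the sketch model
model = MultiTaskGestureNet(num_classes=8)
num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {num_params}")

# metric (2): the FLOPs of one forward pass can be estimated with a profiler
# such as thop or ptflops; metric (1) is plain test-set accuracy.
```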

Table 1 compares the performance of the different methods. The recognition schemes are verified with the samples collected in the present invention, and the comparison of classification performance and algorithm complexity is shown in Table 1.

Table 1. Performance comparison of different methods

[Table 1 is reproduced as an image (Figure BDA0003772712930000141) in the original publication.]

As can be seen from Table 1, in terms of classification performance, the multi-task network structure used in the present invention improves the average recognition accuracy by 3% on the same dataset compared with the single-task structure from which the reconstruction module is removed, indicating that, with the assistance of the reconstruction task, the multi-task structure extracts the spatial and temporal features in the gesture parameters better than the network without the added multi-task module. The method used by Infineon reaches only 87.5% recognition accuracy, mainly because it takes range-Doppler image sequences as the network input and uses only the range and velocity information of the gestures without exploiting the lateral angle information, so it has difficulty recognizing gestures whose angle changes while the radial velocity and radial range change little, which lowers the overall recognition accuracy. In terms of algorithm time complexity, the number of parameters represents the size of the classification model, and FLOPs evaluate the computation of the forward pass; it can be seen that, at the same time complexity, the present work achieves higher recognition accuracy than the single-task network, and compared with the 3D CNN it has fewer parameters and fewer FLOPs while achieving higher recognition accuracy, showing that the proposed algorithm outperforms the method used by Infineon. The algorithm is also compared with other algorithms, and the factors influencing classification performance are analyzed; the experimental results show that the average recognition accuracy of the present work reaches 99%, which is better than the compared algorithms.

The gesture recognition apparatus for FMCW millimeter-wave radar based on multi-task learning provided by the present invention is described below. The apparatus described below and the gesture recognition method for FMCW millimeter-wave radar based on multi-task learning described above may be referred to in correspondence with each other.

Fig. 9 is a schematic structural diagram of the gesture recognition apparatus for FMCW millimeter-wave radar based on multi-task learning provided by the present invention. As shown in Fig. 9, the apparatus includes an acquisition unit 910 and a recognition unit 920, wherein

the acquisition unit 910 is configured to acquire a time series of target feature vectors of a gesture to be recognized;

the recognition unit 920 is configured to input the time series of target feature vectors into the multi-task learning gesture recognition model and output a gesture category result;

wherein the multi-task learning gesture recognition model is trained on sample target-feature-vector time series, corresponding gesture category labels, and corresponding gesture trajectory labels, and both the target-feature-vector time series and the sample target-feature-vector time series are obtained by processing raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm.

In the gesture recognition apparatus for FMCW millimeter-wave radar based on multi-task learning provided by the present invention, a time series of target feature vectors of the gesture to be recognized is acquired; the time series of target feature vectors is input into a multi-task learning gesture recognition model, and a gesture category result is output; the multi-task learning gesture recognition model is trained on sample target-feature-vector time series, corresponding gesture category labels, and corresponding gesture trajectory labels, and both the target-feature-vector time series and the sample target-feature-vector time series are obtained by processing raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm. The multi-dimensional features of user gestures are thus acquired effectively and conveniently, and multi-task learning improves the accuracy of gesture recognition and classification.

On the basis of the above embodiments, in this device the multi-task learning gesture recognition model includes two learning tasks during training: gesture recognition and classification, and gesture trajectory reconstruction.

On the basis of the above embodiments, in this device the overall loss function of the multi-task learning gesture recognition model during training is a weighted sum of the loss functions of the gesture classification part and the gesture trajectory reconstruction part.
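A minimal sketch of such a weighted total loss is given below, assuming cross-entropy for the classification branch and mean-squared error for the trajectory reconstruction branch; the weights w_cls and w_rec are illustrative placeholders, not values taken from the patent.

```python
# Minimal sketch of the weighted multi-task loss (assumed loss choices and weights).
import torch.nn.functional as F

def multitask_loss(class_logits, class_labels, recon_traj, true_traj,
                   w_cls=1.0, w_rec=0.5):
    loss_cls = F.cross_entropy(class_logits, class_labels)  # gesture classification part
    loss_rec = F.mse_loss(recon_traj, true_traj)             # trajectory reconstruction part
    return w_cls * loss_cls + w_rec * loss_rec               # weighted sum of the two parts
```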

On the basis of the above embodiments, in this device the training network structure of the multi-task learning gesture recognition model during training includes:

a one-dimensional CNN layer, an LSTM layer and a one-dimensional transposed convolution layer connected in sequence;

and a one-dimensional CNN layer, an LSTM layer and a fully connected layer connected in sequence.

On the basis of the above embodiments, in this device the one-dimensional CNN layer has 128 convolution kernels, the fully connected layer uses a softmax activation function, and the one-dimensional transposed convolution layer uses 128 deconvolution kernels of size 7×1.
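A minimal PyTorch sketch of such a two-branch training network is shown below. The 128 convolution kernels, the softmax-activated fully connected layer and the 128 transposed-convolution kernels of size 7×1 follow the description above; the input feature dimension, LSTM hidden size, number of gesture classes and the 1×1 output projection are assumptions added only to make the sketch self-contained.

```python
# Minimal sketch of the two-branch network: a shared 1-D CNN + LSTM trunk,
# a classification head (FC + softmax) and a trajectory-reconstruction head
# (1-D transposed convolution). Sizes not stated in the text are assumptions.
import torch
import torch.nn as nn

class MultiTaskGestureNet(nn.Module):
    def __init__(self, in_features=4, hidden=64, num_classes=8):
        super().__init__()
        self.conv = nn.Conv1d(in_features, 128, kernel_size=3, padding=1)   # 128 kernels
        self.lstm = nn.LSTM(128, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)                            # softmax applied in forward
        self.deconv = nn.ConvTranspose1d(hidden, 128, kernel_size=7, padding=3)  # 128 kernels of size 7
        self.proj = nn.Conv1d(128, in_features, kernel_size=1)              # assumed projection back to input dim

    def forward(self, x):                                    # x: (batch, time, features)
        h = torch.relu(self.conv(x.transpose(1, 2)))         # (batch, 128, time)
        seq, _ = self.lstm(h.transpose(1, 2))                # (batch, time, hidden)
        probs = torch.softmax(self.fc(seq[:, -1]), dim=-1)   # gesture class probabilities
        recon = self.proj(self.deconv(seq.transpose(1, 2))).transpose(1, 2)  # reconstructed trajectory
        return probs, recon
```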

On the basis of the above embodiments, in this device, obtaining the target feature vector time sequence and the sample target feature vector time sequences by processing the raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm specifically includes:

extracting the spatial and temporal features of the gesture signal acquired by the FMCW millimeter-wave radar with a specific multi-dimensional fusion feature extraction algorithm, and fusing the range, angle and velocity information of each moment one-to-one in the form of a vector, so as to obtain the target feature vector time sequence and the sample target feature vector time sequences.
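As a concrete illustration of this per-frame fusion, the sketch below stacks the range, azimuth, elevation and velocity estimated for each frame into one vector and concatenates the vectors over time; the dictionary-based frame representation is an assumed input format used only for the example.

```python
# Minimal sketch of per-frame feature fusion into a time sequence of vectors.
# The frame dictionaries are an assumed input format for illustration.
import numpy as np

def fuse_frames(frames):
    """frames: iterable of dicts with keys 'range', 'azimuth', 'elevation', 'velocity'."""
    sequence = [np.array([f['range'], f['azimuth'], f['elevation'], f['velocity']])
                for f in frames]
    return np.stack(sequence)   # shape: (num_frames, 4), one fused feature vector per moment
```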

On the basis of the above embodiments, in this device, obtaining the target feature vector time sequence and the sample target feature vector time sequences by processing the raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm further specifically includes:

for the range values in the raw gesture recognition features acquired by the FMCW millimeter-wave radar in each frame, first performing a fast-time FFT on the intermediate-frequency signal, removing the static clutter, and then detecting possible target ranges with CFAR;

then computing the azimuth angle spectrum at the corresponding ranges with the Capon algorithm and detecting the possible target azimuths at those ranges with CFAR;

computing the elevation angle spectrum at the corresponding ranges and azimuths with the Capon algorithm and detecting the target elevation angles present at those ranges and azimuths with CFAR;

finally computing the target velocity of each target point by beamforming;

and constructing the target range, target azimuth, target elevation and target velocity into target gesture recognition features, performing Doppler FFT processing, and then, based on density clustering, extracting the cluster centers as the target feature vector time sequence and the sample target feature vector time sequences.
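CFAR detection is used at several steps of this chain (over range, azimuth and elevation). As an illustration of that step only, a minimal one-dimensional cell-averaging CFAR detector might look like the sketch below; the training-cell, guard-cell and threshold settings are assumptions, and the sketch is not the patent's detector.

```python
# Minimal 1-D cell-averaging CFAR sketch (assumed window sizes and threshold factor).
import numpy as np

def ca_cfar(power, num_train=8, num_guard=2, scale=3.0):
    """Return indices whose power exceeds the locally estimated noise level."""
    n = len(power)
    detections = []
    for i in range(num_train + num_guard, n - num_train - num_guard):
        # training cells on both sides of the cell under test, guard cells excluded
        left = power[i - num_guard - num_train : i - num_guard]
        right = power[i + num_guard + 1 : i + num_guard + 1 + num_train]
        noise = (left.sum() + right.sum()) / (2 * num_train)
        if power[i] > scale * noise:
            detections.append(i)
    return detections

# Example: detect one strong cell in a flat spectrum.
spectrum = np.ones(64)
spectrum[30] = 20.0
print(ca_cfar(spectrum))   # -> [30]
```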

FIG. 10 illustrates a schematic diagram of the physical structure of an electronic device. As shown in FIG. 10, the electronic device may include a processor 1010, a communications interface 1020, a memory 1030 and a communication bus 1040, where the processor 1010, the communications interface 1020 and the memory 1030 communicate with each other through the communication bus 1040. The processor 1010 may call logic instructions in the memory 1030 to execute the gesture recognition method for an FMCW millimeter-wave radar based on multi-task learning, the method comprising: acquiring a target feature vector time sequence of a gesture to be recognized; inputting the target feature vector time sequence into a multi-task learning gesture recognition model and outputting a gesture category result; wherein the multi-task learning gesture recognition model is trained on sample target feature vector time sequences, the corresponding gesture category labels and the corresponding gesture trajectory labels, and both the target feature vector time sequence and the sample target feature vector time sequences are obtained by processing the raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm.

In addition, the logic instructions in the memory 1030 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

In another aspect, the present invention further provides a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium, and the computer program includes program instructions. When the program instructions are executed by a computer, the computer can perform the gesture recognition method for an FMCW millimeter-wave radar based on multi-task learning provided by the above methods, the method comprising: acquiring a target feature vector time sequence of a gesture to be recognized; inputting the target feature vector time sequence into a multi-task learning gesture recognition model and outputting a gesture category result; wherein the multi-task learning gesture recognition model is trained on sample target feature vector time sequences, the corresponding gesture category labels and the corresponding gesture trajectory labels, and both the target feature vector time sequence and the sample target feature vector time sequences are obtained by processing the raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm.

In yet another aspect, the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, it implements the gesture recognition method for an FMCW millimeter-wave radar based on multi-task learning provided above, the method comprising: acquiring a target feature vector time sequence of a gesture to be recognized; inputting the target feature vector time sequence into a multi-task learning gesture recognition model and outputting a gesture category result; wherein the multi-task learning gesture recognition model is trained on sample target feature vector time sequences, the corresponding gesture category labels and the corresponding gesture trajectory labels, and both the target feature vector time sequence and the sample target feature vector time sequences are obtained by processing the raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm.

The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

Through the above description of the implementations, those skilled in the art can clearly understand that each implementation may be realized by means of software plus a necessary general hardware platform, or of course by hardware. Based on this understanding, the above technical solution in essence, or the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the various embodiments or in some parts of the embodiments.

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements for some of the technical features therein, and that these modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A gesture recognition method for an FMCW millimeter-wave radar based on multi-task learning, characterized by comprising:
acquiring a target feature vector time sequence of a gesture to be recognized;
inputting the target feature vector time sequence into a multi-task learning gesture recognition model and outputting a gesture category result;
wherein the multi-task learning gesture recognition model is trained on sample target feature vector time sequences, the corresponding gesture category labels and the corresponding gesture trajectory labels, and both the target feature vector time sequence and the sample target feature vector time sequences are obtained by processing the raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm.

2. The gesture recognition method for an FMCW millimeter-wave radar based on multi-task learning according to claim 1, characterized in that the multi-task learning gesture recognition model includes two learning tasks during training: gesture recognition and classification, and gesture trajectory reconstruction.

3. The gesture recognition method for an FMCW millimeter-wave radar based on multi-task learning according to claim 2, characterized in that the overall loss function of the multi-task learning gesture recognition model during training is a weighted sum of the loss functions of the gesture classification part and the gesture trajectory reconstruction part.

4. The gesture recognition method for an FMCW millimeter-wave radar based on multi-task learning according to claim 1, characterized in that the training network structure of the multi-task learning gesture recognition model during training includes:
a one-dimensional CNN layer, an LSTM layer and a one-dimensional transposed convolution layer connected in sequence;
and a one-dimensional CNN layer, an LSTM layer and a fully connected layer connected in sequence.

5. The gesture recognition method for an FMCW millimeter-wave radar based on multi-task learning according to claim 4, characterized in that the one-dimensional CNN layer has 128 convolution kernels, the fully connected layer uses a softmax activation function, and the one-dimensional transposed convolution layer uses 128 deconvolution kernels of size 7×1.

6. The gesture recognition method for an FMCW millimeter-wave radar based on multi-task learning according to claim 1, characterized in that obtaining the target feature vector time sequence and the sample target feature vector time sequences by processing the raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm specifically includes:
extracting the spatial and temporal features of the gesture signal acquired by the FMCW millimeter-wave radar with a specific multi-dimensional fusion feature extraction algorithm, and fusing the range, angle and velocity information of each moment one-to-one in the form of a vector to obtain the target feature vector time sequence and the sample target feature vector time sequences.

7. The gesture recognition method for an FMCW millimeter-wave radar based on multi-task learning according to claim 6, characterized in that obtaining the target feature vector time sequence and the sample target feature vector time sequences by processing the raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm specifically includes:
for the range values in the raw gesture recognition features acquired by the FMCW millimeter-wave radar in each frame, first performing a fast-time FFT on the intermediate-frequency signal, removing the static clutter, and then detecting possible target ranges with CFAR;
then computing the azimuth angle spectrum at the corresponding ranges with the Capon algorithm and detecting the possible target azimuths at those ranges with CFAR;
computing the elevation angle spectrum at the corresponding ranges and azimuths with the Capon algorithm and detecting the target elevation angles present at those ranges and azimuths with CFAR;
finally computing the target velocity of each target point by beamforming;
and constructing the target range, target azimuth, target elevation and target velocity into target gesture recognition features, performing Doppler FFT processing, and then, based on density clustering, extracting the cluster centers as the target feature vector time sequence and the sample target feature vector time sequences.

8. A gesture recognition device for an FMCW millimeter-wave radar based on multi-task learning, characterized by comprising:
an acquisition unit, configured to acquire a target feature vector time sequence of a gesture to be recognized;
a recognition unit, configured to input the target feature vector time sequence into a multi-task learning gesture recognition model and output a gesture category result;
wherein the multi-task learning gesture recognition model is trained on sample target feature vector time sequences, the corresponding gesture category labels and the corresponding gesture trajectory labels, and both the target feature vector time sequence and the sample target feature vector time sequences are obtained by processing the raw gesture recognition data acquired by the FMCW millimeter-wave radar with a preset algorithm.

9. An electronic device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the gesture recognition method for an FMCW millimeter-wave radar based on multi-task learning according to any one of claims 1 to 7.

10. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the gesture recognition method for an FMCW millimeter-wave radar based on multi-task learning according to any one of claims 1 to 7.
CN202210906729.XA 2022-07-29 2022-07-29 Gesture recognition method of FMCW millimeter wave radar based on multi-task learning Active CN115343704B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210906729.XA CN115343704B (en) 2022-07-29 2022-07-29 Gesture recognition method of FMCW millimeter wave radar based on multi-task learning

Publications (2)

Publication Number Publication Date
CN115343704A true CN115343704A (en) 2022-11-15
CN115343704B CN115343704B (en) 2025-02-11

Family

ID=83950009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210906729.XA Active CN115343704B (en) 2022-07-29 2022-07-29 Gesture recognition method of FMCW millimeter wave radar based on multi-task learning

Country Status (1)

Country Link
CN (1) CN115343704B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170364160A1 (en) * 2016-06-17 2017-12-21 Texas Instruments Incorporated Hidden markov model-based gesture recognition with fmcw radar
CN113837131A (en) * 2021-09-29 2021-12-24 Nanjing University of Posts and Telecommunications A multi-scale feature fusion gesture recognition method based on FMCW millimeter-wave radar

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Xia Zhaoyang; Zhou Chenglong; Jie Junyu; Zhou Tao; Wang Xiangfeng; Xu Feng: "Micro-motion Gesture Recognition Based on Multi-channel FMCW Millimeter-Wave Radar", Journal of Electronics & Information Technology, no. 01, 15 January 2020 (2020-01-15) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116400305A (en) * 2022-12-20 2023-07-07 Sun Yat-sen University Airborne radar target display method and device based on 3D convolutional neural network
CN115856881A (en) * 2023-01-12 2023-03-28 Nanjing University of Posts and Telecommunications Millimeter wave radar behavior sensing method based on dynamic lightweight network
CN117292404A (en) * 2023-10-13 2023-12-26 Harbin Institute of Technology High-precision gesture data identification method, electronic equipment and storage medium
CN117292404B (en) * 2023-10-13 2024-04-19 Harbin Institute of Technology A high-precision gesture data recognition method, electronic device and storage medium
CN117519474A (en) * 2023-11-06 2024-02-06 Army Engineering University of PLA A radar gesture feature acquisition method considering motion priors
CN117519474B (en) * 2023-11-06 2024-05-14 Army Engineering University of PLA A radar gesture feature acquisition method considering motion prior

Also Published As

Publication number Publication date
CN115343704B (en) 2025-02-11

Similar Documents

Publication Publication Date Title
Liu et al. M-gesture: Person-independent real-time in-air gesture recognition using commodity millimeter wave radar
Sun et al. Real-time radar-based gesture detection and recognition built in an edge-computing platform
Salami et al. Tesla-rapture: A lightweight gesture recognition system from mmwave radar sparse point clouds
CN115343704A (en) Hand gesture recognition method for FMCW millimeter wave radar based on multi-task learning
Jin et al. Interference-robust millimeter-wave radar-based dynamic hand gesture recognition using 2-d cnn-transformer networks
Li et al. Human behavior recognition using range-velocity-time points
CN113064483A (en) A gesture recognition method and related device
Arsalan et al. RadarSNN: A resource efficient gesture sensing system based on mm-wave radar
Gan et al. Gesture recognition system using 24 GHz FMCW radar sensor realized on real-time edge computing platform
Mauro et al. Few-shot user-definable radar-based hand gesture recognition at the edge
CN114708663A (en) Millimeter wave radar sensing gesture recognition method based on few-sample learning
CN117452398A (en) Human body action recognition method based on FMCW radar
Chen et al. MMHTSR: In-air handwriting trajectory sensing and reconstruction based on mmWave radar
CN115422962A (en) Gesture and gesture recognition method and device based on millimeter wave radar and deep learning algorithm
Liao et al. TFSemantic: A Time–Frequency Semantic GAN Framework for Imbalanced Classification Using Radio Signals
Savvidou et al. Passive radar sensing for human activity recognition: a survey
WO2022127819A1 (en) Sequence processing for a dataset with frame dropping
Alizadeh et al. Characterization and selection of wifi channel state information features for human activity detection in a smart public transportation system
Hao et al. Millimeter wave gesture recognition using multi-feature fusion models in complex scenes
Qiao et al. Simple and efficient gesture recognition based on frequency-modulated continuous wave radar
CN112380903B (en) Human body activity recognition method based on WiFi-CSI signal enhancement
CN119291630A (en) A human fall recognition method based on millimeter wave radar perception
CN118486038A (en) Recognition method based on millimeter wave radar and camera fusion
Zheng et al. Unsupervised human contour extraction from through-wall radar images using dual UNet
Fan et al. A meta-learning-based approach for hand gesture recognition using FMCW radar

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant