CN110942040B

CN110942040B - Gesture recognition system and method based on ambient light

Info

Publication number: CN110942040B
Application number: CN201911203896.2A
Authority: CN
Inventors: 黄苗; 段海涵; 杨彦兵; 陈良银; 陈彦如; 郭敏
Original assignee: Sichuan University
Current assignee: Sichuan University
Priority date: 2019-11-29
Filing date: 2019-11-29
Publication date: 2023-04-18
Anticipated expiration: 2039-11-29
Also published as: CN110942040A

Abstract

The present invention relates to the technical field of gesture recognition and aims to provide a low-cost, high-accuracy gesture recognition system and method based on ambient light that removes the light-source requirements of similar systems and supports a wider range of usage scenarios. The technical solution is as follows: the system comprises a data acquisition terminal, a gesture recognition server, and an application terminal. The data acquisition terminal comprises a plurality of photoelectric receivers, a signal amplification module, an analog-to-digital conversion module, and a signal processing module. The photoelectric receivers capture the changes in the light signal produced by gestures. The output terminals of the photoelectric receivers are connected to the input terminal of the signal amplification module, the output terminal of the signal amplification module is connected to the input terminal of the analog-to-digital conversion module, and the output terminal of the analog-to-digital conversion module is connected to the input terminal of the signal processing module. The data processed by the signal processing module is transmitted to the gesture recognition server for recognition; the gesture recognition server outputs the recognition information and sends it to the application terminal, which displays it in real time.

Description

A gesture recognition system and method based on ambient light

Technical Field

The present invention relates to the technical field of gesture recognition, and in particular to a gesture recognition system and method based on ambient light.

Background

The spread of smart devices and continuing innovation in human-computer interaction reinforce each other. Traditional smart homes and smart buildings typically use Internet of Things technology to network smart devices, and users interact with them by sending commands from a smart terminal. This traditional mode of interaction is gradually giving way to natural interaction with a lower learning cost. Smart speakers such as Tmall Genie and device voice assistants such as Siri are now commonplace; speech recognition, as one form of natural interaction, has matured and is widely used. Although products such as Kinect and Leap are on the market, gesture recognition is still rarely used in daily life.

Common approaches to gesture recognition use imaging, acoustic, radio-frequency, or visible-light technologies. Imaging and acoustic approaches require capturing images and audio, which raises security and privacy concerns. Radio-frequency approaches require relatively complex techniques at both the transmitter and the receiver and are susceptible to interference from electromagnetic fields. Ambient light, by contrast, has a wide spectrum, is almost ubiquitous, is safe, raises no privacy concerns, and is easy to collect.

In "Reconstructing Hand Poses Using Visible Light", Tianxing Li et al. proposed and built Aili, a gesture reconstruction method and system based on LED light. The system uses a modulated LED array as the light source and photodiodes as visible-light sensors to achieve 3D gesture reconstruction. However, this system, like many similar systems in the field of visible light communication, has the following drawbacks:

1) Additional modulation equipment must be installed at the light source, which is inconvenient and increases system cost;

2) Because modulation is required, the light source can only be an LED, while in practice the penetration rate of LEDs is currently not high;

3) The system focuses on 3D reconstruction; applying it to gesture recognition scenarios would require additional work.

Summary of the Invention

The purpose of the present invention is to provide a low-cost, high-accuracy gesture recognition system and method based on ambient light that removes the light-source requirements of similar systems and supports a wider range of usage scenarios.

To achieve the above purpose, the technical solution adopted by the present invention is a gesture recognition system based on ambient light, comprising a data acquisition terminal, a gesture recognition server, and an application terminal. The data acquisition terminal comprises a plurality of photoelectric receivers, a signal amplification module, an analog-to-digital conversion module, and a signal processing module. The photoelectric receivers capture the changes in the light signal produced by gestures, and the plurality of photoelectric receivers form a combined arrangement. The output terminals of the photoelectric receivers are all connected to the input terminal of the signal amplification module, the output terminal of the signal amplification module is connected to the input terminal of the analog-to-digital conversion module, and the output terminal of the analog-to-digital conversion module is connected to the input terminal of the signal processing module. The data processed by the signal processing module is transmitted to the gesture recognition server for recognition; the gesture recognition server outputs the recognition information and sends it to the application terminal, which displays it in real time.

Further, the gesture recognition server comprises a data preprocessing unit and a deep learning network model unit. The data preprocessing unit decodes the data packets output by the signal processing module and restores them into multi-channel data, and the deep learning network model unit recognizes and classifies the restored multi-channel data. The multi-channel data and the classification result of the deep learning network model unit together form the recognition information output by the gesture recognition server.

Further, the combined arrangement is a rectangular array arrangement, a trapezoidal arrangement, or a dispersed arrangement.

Further, the deep learning network model unit is a gated recurrent unit.

Further, the application terminal is a web front end.

A gesture recognition method based on ambient light, comprising the following recognition steps:

S1: A plurality of photoelectric receivers capture, in real time, the changes in the light signal produced by a gesture at different hand positions, and convert the collected light signals into current signals respectively;

S2: The signal amplification module converts the current signals into voltage signals and amplifies them;

S3: The amplified voltage signals are converted into digital signals by the analog-to-digital conversion module;

S4: The multi-channel digital signals are merged and encoded by the signal processing module, and then transmitted to the gesture recognition server;

S5: The data preprocessing unit decodes the received raw data and restores it into multi-channel data;

S6: The restored multi-channel data is input into the deep learning network model unit, which recognizes and classifies the gesture;

S7: The restored multi-channel data and the classification result are both provided as input to the application terminal, which displays the multi-channel data and the recognized gesture in real time.
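Steps S2 and S3 can be illustrated with a minimal Python sketch. The source gives no component values, so the feedback resistance, reference voltage, and ADC resolution below are assumptions chosen only for illustration, and the function names are hypothetical.

```python
def current_to_voltage(current_a, feedback_ohms=1_000_000.0, gain=1.0):
    """S2: convert a photocurrent (amperes) to an amplified voltage (volts).

    feedback_ohms and gain are assumed values, not taken from the source.
    """
    return current_a * feedback_ohms * gain


def quantize(voltage_v, v_ref=3.3, bits=10):
    """S3: map a voltage to an integer ADC code, clamped to [0, v_ref].

    v_ref and bits describe a hypothetical 10-bit converter.
    """
    max_code = (1 << bits) - 1
    clamped = min(max(voltage_v, 0.0), v_ref)
    return round(clamped / v_ref * max_code)


def acquire_frame(currents_a):
    """Apply S2 and S3 to one photocurrent reading per photoelectric receiver."""
    return [quantize(current_to_voltage(i)) for i in currents_a]


# One frame from eight receivers, each producing about 1 microampere.
frame = acquire_frame([1e-6] * 8)
```

Each call to `acquire_frame` yields one multi-channel sample, which is the unit that step S4 would then merge and encode.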

Further, the gesture recognition method also includes a training step before recognition: the gesture recognition server is trained on different gestures using the backpropagation algorithm to build a deep network model.

The beneficial effects of the present invention are mainly as follows:

1. The photoelectric receivers can perform gesture recognition using ambient light alone, with no restriction on the light source, so the application scenarios are broader;

2. No additional light-source modulation equipment needs to be installed, which reduces system cost and makes the system easier to install and use;

3. The photoelectric receivers are highly sensitive and can detect small changes in light intensity, so subtly different movements can be distinguished;

4. The plurality of photoelectric receivers form a combined arrangement that produces multi-channel light-sensing data, which improves gesture recognition accuracy while reducing recognition latency.

Brief Description of the Drawings

Fig. 1 is a block diagram of the system structure of the present invention;

Fig. 2 is a block diagram of the recurrent neural network model of the present invention;

Fig. 3 is a schematic diagram of the photoelectric receiver layouts of the present invention;

Fig. 4 is a schematic diagram of the deep learning gesture training framework of the present invention;

Fig. 5 is a schematic diagram of the gestures used for training.

Detailed Description

To enable those skilled in the art to better understand the technical solution of the present invention, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments.

As shown in Fig. 1, a gesture recognition system based on ambient light comprises a data acquisition terminal, a gesture recognition server, and an application terminal. The data acquisition terminal comprises a plurality of photoelectric receivers, a signal amplification module, an analog-to-digital conversion module, and a signal processing module.

The photoelectric receivers receive ambient light in any free space and capture the changes in the light signal produced by gestures; the plurality of photoelectric receivers form a combined arrangement. Each photoelectric receiver converts the captured light signal into a current signal, and the output terminals of the photoelectric receivers are all connected to the input terminal of the signal amplification module.

The signal amplification module converts the weak current signal into a voltage signal and then amplifies it; its output terminal is connected to the input terminal of the analog-to-digital conversion module.

The analog-to-digital conversion module converts the analog voltage signal output by the signal amplification module into a digital signal; its output terminal is connected to the input terminal of the signal processing module. The signal processing module merges and encodes the digital signals output by the analog-to-digital conversion module. When the gesture recognition server sends a request to the signal processing module, the signal processing module sends a data packet to the gesture recognition server.
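The source does not specify how the merged channels are encoded. As a minimal sketch, assuming a hypothetical fixed-size frame made of a 16-bit sequence number followed by eight 16-bit ADC codes, the merge-and-encode step and the matching server-side decode could look like this in Python (the layout and all names are illustrative assumptions):

```python
import struct

NUM_CHANNELS = 8
# Hypothetical layout: little-endian, one uint16 sequence number + eight uint16 samples.
FRAME_FORMAT = "<H8H"
FRAME_SIZE = struct.calcsize(FRAME_FORMAT)  # 18 bytes


def encode_frame(seq, samples):
    """Merge one sample per channel into a single binary packet."""
    if len(samples) != NUM_CHANNELS:
        raise ValueError("expected one sample per channel")
    return struct.pack(FRAME_FORMAT, seq & 0xFFFF, *samples)


def decode_frame(packet):
    """Restore the sequence number and the per-channel samples from a packet."""
    seq, *samples = struct.unpack(FRAME_FORMAT, packet)
    return seq, samples


packet = encode_frame(7, [0, 1, 2, 3, 4, 5, 6, 1023])
seq, channels = decode_frame(packet)
```

A fixed-size binary frame keeps the decode step trivial, which matters when the server polls the signal processing module repeatedly; a text format would work equally well for a prototype.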

The gesture recognition server comprises a data preprocessing unit and a deep learning network model unit. In this embodiment, the deep learning network model unit is a gated recurrent unit, shown in Fig. 2, where h denotes the iteration (hidden state) vector of the neural network, t is the time step, v denotes the input vector, U and W are the parameter matrices of the gated recurrent unit, σ and tanh in the boxes denote the sigmoid and tanh activation functions respectively, g is the output vector of each gate, and the multiplication symbol in the figure denotes element-wise multiplication of vectors. The gated recurrent unit contains two gates: the suffix u denotes the update gate and the suffix r denotes the reset gate. Finally, the state update operation with the suffix c computes the iteration vector of the neural network at the next time step.
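Fig. 2 is not reproduced here, so the exact equations of the embodiment cannot be confirmed. For reference, the widely used formulation of a gated recurrent unit, written with the symbols defined above and with bias terms omitted, is given below. The assignment of U to the input and W to the previous state, and the form of the final interpolation, are assumptions; conventions differ between references.

```latex
\begin{aligned}
g_u &= \sigma\left(U_u v_t + W_u h_{t-1}\right) \\
g_r &= \sigma\left(U_r v_t + W_r h_{t-1}\right) \\
c_t &= \tanh\left(U_c v_t + W_c \left(g_r \odot h_{t-1}\right)\right) \\
h_t &= \left(1 - g_u\right) \odot h_{t-1} + g_u \odot c_t
\end{aligned}
```

Here \odot is element-wise multiplication, g_u and g_r are the update and reset gate outputs, and c_t is the candidate state produced by the state update operation.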

The data preprocessing unit decodes the data packets output by the signal processing module and restores them into multi-channel data. The restored multi-channel data serves as the input to the deep learning network model unit, which recognizes and classifies it. The gesture classes are shown in Fig. 5 and comprise seven gestures: each of the five fingers drooping naturally in turn, the palm naturally open, and a clenched fist.
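To make the recurrent computation concrete, the following is a minimal pure-Python sketch of a gated recurrent unit consuming a sequence of 8-channel frames. It uses the common GRU formulation without bias terms, which may differ in detail from the model of the embodiment; the hidden size, the random initialization, and every function name are illustrative assumptions.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_in, n_hidden, seed=0):
    """Small random U (input) and W (recurrent) matrices for gates u, r and state c."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for gate in "urc":
        params["U" + gate] = mat(n_hidden, n_in)
        params["W" + gate] = mat(n_hidden, n_hidden)
    return params


def gru_step(params, h_prev, v):
    """Compute h_t from h_{t-1} (h_prev) and the current input frame v."""
    def affine(gate, hidden):
        a = matvec(params["U" + gate], v)
        b = matvec(params["W" + gate], hidden)
        return [x + y for x, y in zip(a, b)]

    g_u = [sigmoid(z) for z in affine("u", h_prev)]      # update gate
    g_r = [sigmoid(z) for z in affine("r", h_prev)]      # reset gate
    reset_h = [r * h for r, h in zip(g_r, h_prev)]       # element-wise product
    c = [math.tanh(z) for z in affine("c", reset_h)]     # candidate state
    return [(1.0 - u) * h + u * ci for u, h, ci in zip(g_u, h_prev, c)]


def run_sequence(params, frames, n_hidden):
    """Feed frames in time order and return the final hidden state."""
    h = [0.0] * n_hidden
    for v in frames:
        h = gru_step(params, h, v)
    return h


params = init_params(n_in=8, n_hidden=16)
frames = [[0.5] * 8 for _ in range(20)]  # placeholder data, not real measurements
h_final = run_sequence(params, frames, 16)
```

In a classifier, the final hidden state would typically be passed through a fully connected layer and a softmax to score the seven gesture classes.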

The multi-channel data and the classification result of the deep learning network model unit together form the recognition information output by the gesture recognition server, which sends this information to the application terminal. In this embodiment the application terminal is a web front end, which can show the user the voltage value of each channel and can also display the final recognized gesture and related interactive applications.
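The source does not describe the message format or transport between the server and the web front end. One plausible sketch, assuming the recognition information is serialized as JSON with hypothetical field names `channels` and `gesture`, is:

```python
import json

NUM_CHANNELS = 8


def build_recognition_message(channel_volts, gesture):
    """Bundle per-channel voltages and the recognized gesture label as JSON.

    The field names and the rounding to millivolts are illustrative choices.
    """
    if len(channel_volts) != NUM_CHANNELS:
        raise ValueError("expected one voltage per channel")
    payload = {
        "channels": [round(v, 3) for v in channel_volts],
        "gesture": gesture,
    }
    return json.dumps(payload)


message = build_recognition_message([0.1] * 8, "fist")
```

Sending both the raw channel values and the label in one message lets the front end plot the signals and show the recognized gesture from the same update.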

In this embodiment, the photoelectric receivers are photodiodes or phototransistors, and there are preferably eight of them. The photoelectric receivers can be arranged relative to the user's hand in three ways, as shown in Fig. 3. For hand a in Fig. 3, the photoelectric receivers are arranged in a 2 × 4 rectangular array, with a distance of 5 cm between every two adjacent receivers. The advantages of this arrangement are its left-right symmetry and simple deployment.

For hand b in Fig. 3, the photoelectric receivers are arranged in a trapezoid. The upper row has three photoelectric receivers, positioned at the index finger, middle finger, and ring finger of the hand; the lower row has five photoelectric receivers, evenly distributed laterally; the distance between the upper and lower rows is 5 cm. The advantage of this arrangement is that its left-right symmetry makes it easy to switch between the left and right hand without rearranging the receivers, and it adapts to different hand shapes. The aim is to capture, with a limited number of photoelectric receivers, as many as possible of the small movement changes in the space corresponding to the gestures of this embodiment.

For hand c in Fig. 3, the photoelectric receivers are arranged in a dispersed pattern, mounted near the fingertip of each finger and on the palm. This arrangement captures the changes in the light signal of the gestures recognized in this embodiment to the greatest extent, but because it is asymmetric it can only be used to recognize the gestures of one particular hand.
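The two symmetric layouts can be written down as planar coordinates. The sketch below is in Python and uses centimetres; the 5 cm spacing comes from the text, while the orientation of the 2 × 4 array, the lateral pitch of the trapezoid, and the centring of its upper row are assumptions, since the source does not give them.

```python
def rectangular_layout(rows=2, cols=4, pitch_cm=5.0):
    """2 x 4 grid with 5 cm between adjacent receivers (row/column orientation assumed)."""
    return [(c * pitch_cm, r * pitch_cm) for r in range(rows) for c in range(cols)]


def trapezoid_layout(row_gap_cm=5.0, lateral_pitch_cm=2.5):
    """Three receivers in the upper row, five in the lower row, rows 5 cm apart.

    lateral_pitch_cm and the centring of the upper row are assumed values.
    """
    lower = [(i * lateral_pitch_cm, 0.0) for i in range(5)]
    upper = [((i + 1) * lateral_pitch_cm, row_gap_cm) for i in range(3)]
    return upper + lower


def is_left_right_symmetric(points):
    """Check that mirroring about the vertical centre line maps the layout onto itself."""
    xs = [x for x, _ in points]
    centre = (min(xs) + max(xs)) / 2.0
    original = {(round(x, 6), round(y, 6)) for x, y in points}
    mirrored = {(round(2 * centre - x, 6), round(y, 6)) for x, y in points}
    return mirrored == original


rect = rectangular_layout()
trap = trapezoid_layout()
```

The symmetry check mirrors the property the text relies on: a layout that maps onto itself under a left-right flip can serve either hand without rearrangement.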

The recognition method of this embodiment proceeds as follows, with steps S2 and S3 as described above:

S1: The photoelectric receivers capture, in real time, the changes in the light signal produced by a gesture at different hand positions, and convert the eight collected light signals into current signals respectively;

S4: The eight-channel digital signals are merged and encoded by the signal processing module, and then transmitted to the gesture recognition server;

S5: The data preprocessing unit decodes the received raw data and restores it into eight-channel data;

S6: The restored eight-channel data is input into the deep learning network model unit, which recognizes and classifies the gesture;

S7: The restored eight-channel data and the classification result are both provided as input to the application terminal, which displays the eight-channel data and the recognized gesture in real time.

The gesture recognition method further includes a training step before recognition: the gesture recognition server is trained on different gestures using the backpropagation algorithm to build the deep network model. The training principle is as follows: the parameters of the parameter matrices are updated iteratively by backpropagation, with multi-class cross-entropy as the loss function. As shown in Fig. 4, each V denotes the photoelectric receiver data fed into the neural network at one time step. The neural network computes its output (the RNN Output), which is passed through a fully connected layer and a softmax function to obtain the gesture corresponding to the input photoelectric receiver data. The gesture output by the algorithm is compared with the correct gesture: if it is correct, the parameters of the neural network weight matrices are not adjusted; if it is wrong, the weight matrices are updated by backpropagation.
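The loss described above can be sketched in Python. This is a minimal illustration of softmax followed by multi-class cross-entropy for the seven gesture classes; the function names are hypothetical, and the gradient shown is the standard one with respect to the outputs of the fully connected layer (the logits), not a detail taken from the source.

```python
import math

NUM_GESTURES = 7  # seven gesture classes in this embodiment


def softmax(logits):
    """Convert fully connected layer outputs into class probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """Multi-class cross-entropy for one sample with integer label `target`."""
    return -math.log(probs[target])


def logits_gradient(probs, target):
    """Gradient of the loss with respect to the logits: probs minus one-hot target."""
    return [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]


probs = softmax([0.0] * NUM_GESTURES)   # untrained model: uniform scores
loss = cross_entropy(probs, target=3)   # equals log(7) for a uniform prediction
grad = logits_gradient(probs, target=3)
```

With this loss, the gradient on the logits is the predicted probability vector minus the one-hot target, so updates become small as the probability assigned to the correct gesture approaches 1.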

It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in a different order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and units involved are not necessarily required by the present application.

Claims

1. A non-contact gesture recognition device based on ambient light, comprising:

a data acquisition terminal, comprising 8 photoelectric receivers, a controller and a display, wherein the 8 photoelectric receivers are configured to receive an ambient light signal and generate a current signal, the ambient light signal being produced by one or more non-contact gestures in an ambient light scene; the 8 photoelectric receivers are arranged in a trapezoid and located below the hand, with 3 photoelectric receivers on the upper side, positioned to correspond to the index finger, the middle finger and the ring finger of the hand, and 5 photoelectric receivers on the lower side, uniformly distributed laterally;

a signal amplification module, configured to convert the current signal into a voltage signal and amplify the voltage signal, the input end of the signal amplification module being connected to the output end of the photoelectric receivers;

an analog-to-digital conversion module, configured to convert the voltage signal into a digital signal, the input end of the analog-to-digital conversion module being connected to the output end of the signal amplification module;

a signal processing module, configured to receive the 8 photoelectric signals generated by the photoelectric receivers;

a gesture recognition server, comprising a data preprocessing unit configured to decode the data output by the signal processing module and restore the data into a plurality of groups of data;

a deep learning network model unit, configured to identify and classify the restored groups of data in real time;

an application terminal, configured to display in real time the one or more non-contact gestures output by the gesture recognition server.

2. The device according to claim 1, wherein the deep learning network model unit is a gated recurrent unit, the deep learning network model unit is trained in advance on a data set of non-contact gestures recorded in historical ambient light scenes, and the training algorithm is the backpropagation algorithm.

3. The device according to claim 1, wherein the application terminal is a web front end.

4. The ambient light-based non-contact gesture recognition device according to claim 1, wherein the 8 photoelectric receivers are arranged in a 2 × 4 rectangular array and located below the hand, and the distance between every two adjacent photoelectric receivers is 5 cm.

5. The device according to claim 1, wherein the 8 photoelectric receivers are arranged in a dispersed pattern and located below the hand, the photoelectric receivers being positioned near the fingertip of each finger and at the palm respectively.

6. A non-contact gesture recognition method based on ambient light, characterized by comprising the following recognition steps:

S1: 8 photoelectric receivers capture, in real time, the changes in the light signal produced by a gesture at different hand positions, and convert the collected light signals into current signals respectively; the 8 photoelectric receivers are arranged in a trapezoid and located below the hand, with 3 photoelectric receivers on the upper side, positioned to correspond to the index finger, the middle finger and the ring finger of the hand, and 5 photoelectric receivers on the lower side, uniformly distributed laterally;

S2: the signal amplification module converts the current signal into a voltage signal and amplifies the voltage signal;

S3: the amplified voltage signal is converted into a digital signal by the analog-to-digital conversion module;

S4: the multi-channel digital signals are merged and encoded by the signal processing module and then transmitted to the gesture recognition server;

S5: the data preprocessing unit decodes the received raw data and restores it into multi-channel data;

S6: the restored multi-channel data is input into the deep learning network model unit, which recognizes and classifies the gesture;

S7: the restored multi-channel data and the classification result are both provided as input to the application terminal, and the application terminal displays the multi-channel data and the recognized gesture in real time.

7. The ambient light-based non-contact gesture recognition method of claim 6, wherein the deep learning network model unit is a gated recurrent unit, and the method further comprises a training step before recognition, the training step being: the gesture recognition server is trained on different gestures using the backpropagation algorithm to build a deep network model.