
CN103226692A - A system and method for identifying video stream image frames - Google Patents

A system and method for identifying video stream image frames

Info

Publication number
CN103226692A
CN103226692A (application CN201210480917.7A)
Authority
CN
China
Prior art keywords
frame
level
buffer
image frame
stored
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012104809177A
Other languages
Chinese (zh)
Other versions
CN103226692B (en)
Inventor
王磊
郑伟龙
张文山
姚以鹏
陈曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GUANGDONG SCIENCE CENTER
Original Assignee
GUANGDONG SCIENCE CENTER
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GUANGDONG SCIENCE CENTER filed Critical GUANGDONG SCIENCE CENTER
Priority to CN201210480917.7A priority Critical patent/CN103226692B/en
Publication of CN103226692A publication Critical patent/CN103226692A/en
Application granted granted Critical
Publication of CN103226692B publication Critical patent/CN103226692B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a system and a method for identifying video stream image frames. A primary buffer receives a data packet formed by an associative gesture and splits it into continuous image frames for storage; a matching unit takes the image frames out of the primary buffer in sequence, stores them in a secondary buffer, and matches the image frames stored in the secondary buffer against a conventional frame matching library; once the match succeeds, the image frames stored in the secondary buffer are stored in a sentence buffer and the associative matching unit is started to perform double-line matching recognition. The invention provides a new visual gesture recognition scheme that matches frames against a fixed frame library in frame-by-frame increments: no feature values need to be extracted; instead, frames are matched one by one against the frame library, and once a word is recognized the associative recognition function directly intercepts the required number of frames for matching. This greatly reduces the computer's workload and markedly improves operation speed and matching accuracy.

Description

A system and method for identifying video stream image frames

Technical Field

The invention relates to the technical field of gesture video stream recognition, and in particular to a system and method for recognizing video stream image frames.

Background Art

Gestures, as one of the most natural forms of human expression, are widely used in daily life. Sign language is a language that expresses meaning through gestures, but it is very difficult for ordinary people unfamiliar with sign language to understand it, so a technology capable of translating sign language would greatly facilitate communication between deaf people and hearing people. In the recognition of gestures and sign language, a key link is gesture tracking.

In the prior art, gesture recognition can be divided, according to how peripheral devices capture gesture images, into data-glove-based gesture recognition and computer-vision-based gesture recognition. Data-glove-based gesture recognition measures the trajectory and timing information of gesture movements through data gloves and position tracking. Its advantage is a high recognition rate; its disadvantages are that the input device is expensive and the person gesturing must wear cumbersome data gloves.

Computer-vision-based gesture recognition generally relies on target tracking under a single ordinary camera. A problem that is hard to solve in this process is occlusion: when a target is partially or completely occluded by another object, the tracked features become incomplete or disappear, interrupting the tracking process; the target must then be re-detected and tracking re-initialized, which is very inconvenient. Multiple cameras can be used to mitigate this problem, but the tracking algorithm then becomes more complex, adding technical difficulty and instability. As a result, computer-vision-based gesture recognition greatly limits the types of gestures that can be recognized.

To address these problems, Chinese patent No. 200810068423.1, "A method and device for segmenting and recognizing data stream image frames", accumulates image data over a certain period, performs pattern recognition by judging whether a regional image satisfies the boundary conditions of the recognition region, and then performs pattern comparison through feature value extraction to obtain the desired result. Although that patent provides a technical solution, in practical application it still has many defects, such as few recognizable gesture types, slow recognition speed and insufficient matching accuracy.

Summary of the Invention

The purpose of the present invention is to overcome the deficiencies of the prior art and provide a system and method for recognizing video stream image frames. The image stream is first split into frames, and the frames are then matched against a conventional frame matching library in frame-by-frame increments, avoiding the step of extracting key frames from the incoming image data; association is then performed through an associative function, effectively shortening the recognition and matching time.

The present invention is realized through the following technical solutions:

A system for recognizing video stream image frames comprises a matching unit, an associative matching unit, a conventional frame matching library, a sentence buffer, a primary buffer and a secondary buffer. The primary buffer is connected to the secondary buffer; the matching unit is connected to the associative matching unit, the conventional frame matching library, the sentence buffer, the primary buffer and the secondary buffer; and the associative matching unit is connected to the conventional frame matching library, the sentence buffer, the primary buffer and the secondary buffer.

Specifically, the primary buffer receives the data packet formed by an associative gesture and splits it into continuous image frames for storage. The matching unit takes image frames out of the primary buffer in sequence, stores them in the secondary buffer, and matches the image frames stored in the secondary buffer against the conventional frame matching library. Once the match succeeds, the image frames stored in the secondary buffer are stored in the sentence buffer, and the associative matching unit is started to perform double-line matching recognition.
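As a rough illustration only, the following Python sketch mirrors this data flow; the fixed frame size, the byte-string packet format and the deque-based buffers are assumptions made for the example and are not specified by the patent.

```python
# Minimal sketch of the primary-buffer stage, assuming an incoming data
# packet is a byte string holding a whole number of fixed-size frames.
from collections import deque

FRAME_SIZE = 64 * 64           # assumed bytes per image frame (illustrative)
primary_buffer = deque()       # split frames not yet matched
secondary_buffer = []          # frames of the word currently being matched

def receive_packet(packet: bytes) -> None:
    """Split a gesture data packet into continuous frames and store them."""
    for offset in range(0, len(packet), FRAME_SIZE):
        primary_buffer.append(packet[offset:offset + FRAME_SIZE])
```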

The conventional frame matching library classifies and stores all sign-language words composed of the same number of frames, forming in sequence a one-frame matching library, a two-frame matching library, ..., an (N-1)-frame matching library and an N-frame matching library.

For example, if "I", "you" and "he" can each be composed of 2 frames, the gestures composed of 2 frames are all grouped into the "two-frame matching library", and so on for the others.
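This frame-count classification can be pictured with the following hypothetical sketch, under the assumption that frames can be compared for equality by some signature; the placeholder frame signatures and the lookup() helper are illustrative and are not part of the patent.

```python
# Sketch of the conventional frame matching library, keyed by frames per sign.
frame_library = {
    1: {("frame_hello",): "hello"},
    2: {("frame_i_1", "frame_i_2"): "I",
        ("frame_you_1", "frame_you_2"): "you",
        ("frame_he_1", "frame_he_2"): "he"},
    # ... up to the N-frame matching library
}

def lookup(frames) -> str | None:
    """Return the sign-language word matching a sequence of frames, if any."""
    sub_library = frame_library.get(len(frames), {})
    return sub_library.get(tuple(frames))
```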

The matching unit stores the image frames in the primary buffer into the secondary buffer one by one, starting from the first image frame. Each time a new image frame is stored in the secondary buffer, the matching unit matches the image frames stored in the secondary buffer against the conventional frame matching library; once the match succeeds, the image frames stored in the secondary buffer are stored in the sentence buffer.
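A minimal sketch of this incremental matching loop, reusing the hypothetical buffers and lookup() helper from the sketches above, might look as follows; how a recognized word is handed to the sentence buffer is simplified.

```python
sentence_buffer = []           # successfully matched words awaiting output

def match_one_word() -> str | None:
    """Grow the secondary buffer frame by frame until a word matches."""
    secondary_buffer.clear()
    while primary_buffer:
        secondary_buffer.append(primary_buffer.popleft())  # one more frame
        word = lookup(secondary_buffer)                     # try the k-frame library
        if word is not None:
            sentence_buffer.append(word)                    # send to the sentence buffer
            secondary_buffer.clear()
            return word
    return None                                             # nothing more can be matched
```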

After the matching unit matches successfully, three word-frame values (denoted M1, M2 and M3 in sequence) are derived by association from the number of frames required by the associative gesture and the image frames stored in the secondary buffer, and the secondary buffer is cleared. The matching process then begins: the unmatched image frames in the primary buffer are read in sequence according to the three word-frame values, the frames read are stored in the secondary buffer and matched against the conventional frame matching library, and if the match succeeds the image frames stored in the secondary buffer are sent to the sentence buffer to await processing. The above matching process is repeated until none of the three word-frame values can be matched, whereupon the associative matching unit ends the double-line matching recognition.
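Under the assumption that the three word-frame values are simply candidate frame counts, the double-line associative matching could be sketched as below; guess_frame_counts() is a placeholder, since the patent does not specify how M1, M2 and M3 are derived from the recognized word.

```python
def guess_frame_counts(last_word: str) -> tuple[int, int, int]:
    return (2, 3, 4)            # purely illustrative values for M1, M2, M3

def associative_match(last_word: str) -> None:
    while True:
        for m in guess_frame_counts(last_word):             # try M1, M2, M3 in turn
            if len(primary_buffer) < m:
                continue
            candidate = [primary_buffer[i] for i in range(m)]
            word = lookup(candidate)
            if word is not None:
                for _ in range(m):                          # consume the matched frames
                    primary_buffer.popleft()
                sentence_buffer.append(word)                # send to the sentence buffer
                last_word = word
                break                                       # restart from M1
        else:
            return                                          # none matched: end double-line matching
```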

The matching unit checks the image frames stored in the sentence buffer in real time; when the image frames stored in the sentence buffer can form a complete sentence, the matching unit optimizes them and outputs the result.

A method for recognizing video stream image frames implemented by the above system comprises the following steps (a condensed driver combining the earlier sketches is shown after the list):

1) The matching unit runs;

2) The currently received data packet is split into continuous image frames, and the split continuous image frames are stored in the primary buffer;

3) The continuous image frames in the primary buffer are stored into the secondary buffer one by one, starting from the first frame that has not yet been matched, and matched against the conventional frame matching library; if the match succeeds, go to step 5; if the match fails, go to step 4;

4) The unsuccessfully matched image frames, together with the following image frame, are matched against the conventional frame matching library again; if the match succeeds, go to step 5; if the match fails, perform step 4 again;

5) The successfully matched image frames are sent to the sentence buffer to await processing; when the image frames stored in the sentence buffer can form a complete sentence, go to step 8;

6) The associative matching unit performs double-line matching recognition;

7) Three word-frame values (denoted M1, M2 and M3 in sequence) are derived by association from the number of frames required by the associative gesture and the image frames stored in the secondary buffer, and the secondary buffer is cleared; the matching process then begins: the unmatched image frames in the primary buffer are read in sequence according to the three word-frame values, the frames read are stored in the secondary buffer and matched against the conventional frame matching library; if the match succeeds, the image frames stored in the secondary buffer are sent to the sentence buffer to await processing, and when the image frames stored in the sentence buffer can form a complete sentence, go to step 8; the above matching process is repeated until none of the three word-frame values can be matched, whereupon the associative matching unit ends the double-line matching recognition, clears the secondary buffer and returns to step 3;

8) The image frames stored in the sentence buffer are optimized and arranged;

9) The result is output.
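As a condensed, illustrative driver, the steps above could be wired together from the earlier hypothetical sketches roughly as follows; this is a simplification of steps 1 to 9, not a faithful implementation of the patent.

```python
def run(packet: bytes) -> str:
    receive_packet(packet)                       # step 2: split the packet into frames
    while primary_buffer:
        word = match_one_word()                  # steps 3-5: per-word incremental matching
        if word is None:
            break                                # no further word can be matched
        associative_match(word)                  # steps 6-7: double-line matching
    return optimize(sentence_buffer)             # steps 8-9: reorder and output
```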

In step 8, because sign-language grammar differs from the grammar of normal speech, optimization is required. For example, "a cup of cola" is signed as "cola one (cup)", so optimization is needed to translate "cola one" into "a cup of cola".
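As a toy example of this optimization, assuming the recognized words arrive as English glosses and that the only reordering rule needed is "noun number measure-word" becoming "number measure-word of noun" (both assumptions are illustrative only):

```python
MEASURE_WORDS = {"cup", "cups", "pack", "packs"}   # assumed measure-word list

def optimize(words: list[str]) -> str:
    """Rearrange sign-language word order into spoken-language order."""
    out, i = [], 0
    while i < len(words):
        # pattern: <noun> <number> <measure word> -> "<number> <measure word> of <noun>"
        if i + 2 < len(words) + 0 or (i + 2 < len(words)):
            pass
        if i + 2 < len(words) and words[i + 2] in MEASURE_WORDS:
            out.append(f"{words[i + 1]} {words[i + 2]} of {words[i]}")
            i += 3
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)

# optimize(["cola", "one", "cup"]) -> "one cup of cola"
```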

Step 7 includes the following matching process:

7.1) The word-frame value M1 is sent to the primary buffer, the unmatched image frames in the primary buffer are read, and the frames read are stored in the secondary buffer and matched against the conventional frame matching library; if the match fails, go to step 7.2; if the match succeeds, the image frames stored in the secondary buffer are sent to the sentence buffer to await processing; when the image frames stored in the sentence buffer can form a complete sentence, perform step 8, otherwise continue with step 7.1;

7.2) If the match cannot succeed, the word-frame value M2 is sent to the primary buffer, the unmatched image frames in the primary buffer are read, and the frames read are stored in the secondary buffer and matched against the conventional frame matching library; if the match fails, go to step 7.3; if the match succeeds, the image frames stored in the secondary buffer are sent to the sentence buffer to await processing; when the image frames stored in the sentence buffer can form a complete sentence, perform step 8, otherwise go to step 7.1;

7.3) If the match cannot succeed, the word-frame value M3 is sent to the primary buffer, the unmatched image frames in the primary buffer are read, and the frames read are stored in the secondary buffer and matched against the conventional frame matching library; if the match fails, go to step 7.4; if the match succeeds, the image frames stored in the secondary buffer are sent to the sentence buffer to await processing; when the image frames stored in the sentence buffer can form a complete sentence, perform step 8, otherwise go to step 7.1;

7.4) The associative matching unit ends the double-line matching recognition, clears the secondary buffer, and returns to step 3.

Compared with the prior art, the present invention has the following beneficial effects:

The present invention provides a new visual gesture recognition scheme. Matching is performed in frame-by-frame increments against a fixed frame library: no feature values need to be extracted; instead, frames are matched one by one against the frame library, and once a word is recognized the associative recognition function directly intercepts the required number of frames for matching. This greatly reduces the computer's workload, markedly improves operation speed and matching accuracy, and effectively solves the problems of few recognizable gesture types and slow recognition speed in computer-vision-based gesture recognition. It can be applied in many fields, has broad application prospects and offers outstanding efficiency.

Brief Description of the Drawings

The present invention is described in further detail below in conjunction with embodiments and the accompanying drawings:

Fig. 1 is a schematic diagram of the system structure of a specific embodiment of the present invention;

Fig. 2 is a schematic flow chart of the method of a specific embodiment of the present invention;

Fig. 3 is a detailed flow chart of the method of a specific embodiment of the present invention;

Fig. 4 is a schematic flow chart of the associative matching unit of a specific embodiment of the present invention;

Fig. 5 is a schematic structural diagram of the conventional frame matching library of a specific embodiment of the present invention.

Detailed Description of Embodiments

The present invention is further described in detail below in conjunction with the embodiments and the accompanying drawings, but it should be understood that the embodiments of the present invention are not limited thereto.

As shown in Fig. 1, a system for recognizing video stream image frames according to the present invention comprises a matching unit, an associative matching unit, a conventional frame matching library, a sentence buffer, a primary buffer and a secondary buffer. The primary buffer is connected to the secondary buffer; the matching unit is connected to the associative matching unit, the conventional frame matching library, the sentence buffer, the primary buffer and the secondary buffer; and the associative matching unit is connected to the conventional frame matching library, the sentence buffer, the primary buffer and the secondary buffer.

Specifically, as shown in Fig. 2, the primary buffer receives the data packet formed by an associative gesture and splits it into continuous image frames for storage. The matching unit takes image frames out of the primary buffer in sequence, stores them in the secondary buffer, and matches the image frames stored in the secondary buffer against the conventional frame matching library. Once the match succeeds, the image frames stored in the secondary buffer are stored in the sentence buffer, and the associative matching unit is started to perform double-line matching recognition.

As shown in Fig. 5, the conventional frame matching library classifies and stores all sign-language words composed of the same number of frames, forming in sequence a one-frame matching library, a two-frame matching library, ..., an (N-1)-frame matching library and an N-frame matching library.

For example, if "I", "you" and "he" can each be composed of 2 frames, the gestures composed of 2 frames are all grouped into the "two-frame matching library"; other frame counts are handled by analogy according to the prior art and industry knowledge.

The matching unit stores the image frames in the primary buffer into the secondary buffer one by one, starting from the first image frame. Each time a new image frame is stored in the secondary buffer, the matching unit matches the image frames stored in the secondary buffer against the conventional frame matching library; once the match succeeds, the image frames stored in the secondary buffer are stored in the sentence buffer.

Specifically, as shown in Fig. 4, after the matching unit matches successfully, three word-frame values (which may be denoted M1, M2 and M3 in sequence) are derived by association from the number of frames required by the associative gesture and the image frames stored in the secondary buffer, and the secondary buffer is cleared. The matching process then begins: the unmatched image frames in the primary buffer are read in sequence according to the three word-frame values, the frames read are stored in the secondary buffer and matched against the conventional frame matching library, and if the match succeeds the image frames stored in the secondary buffer are sent to the sentence buffer to await processing. The above matching process is repeated until none of the three word-frame values can be matched, whereupon the associative matching unit ends the double-line matching recognition.

The matching unit checks the image frames stored in the sentence buffer in real time; when the image frames stored in the sentence buffer can form a complete sentence, the matching unit optimizes them and outputs the result.

As shown in Figs. 2 and 3, a method for recognizing video stream image frames implemented by the above system comprises the following steps:

1) Run the matching unit;

2) Split the currently received data packet into continuous image frames and store the split continuous image frames in the primary buffer;

3) Store the continuous image frames in the primary buffer into the secondary buffer one by one, starting from the first frame that has not yet been matched, and match them against the conventional frame matching library; if the match succeeds, go to step 5; if the match fails, go to step 4;

4) Match the unsuccessfully matched image frames, together with the following image frame, against the conventional frame matching library again; if the match succeeds, go to step 5; if the match fails, perform step 4 again;

5) Send the successfully matched image frames to the sentence buffer to await processing; when the image frames stored in the sentence buffer can form a complete sentence, go to step 8;

6) The associative matching unit performs double-line matching recognition;

7) Derive three word-frame values (denoted M1, M2 and M3 in sequence) by association from the number of frames required by the associative gesture and the image frames stored in the secondary buffer, and clear the secondary buffer; then begin the matching process: read the unmatched image frames in the primary buffer in sequence according to the three word-frame values, store the frames read in the secondary buffer and match them against the conventional frame matching library; if the match succeeds, send the image frames stored in the secondary buffer to the sentence buffer to await processing, and when the image frames stored in the sentence buffer can form a complete sentence, go to step 8; repeat the above matching process until none of the three word-frame values can be matched, whereupon the associative matching unit ends the double-line matching recognition, clears the secondary buffer and returns to step 3;

8) Optimize and arrange the image frames stored in the sentence buffer;

9) Output the result.

In step 8, because sign-language grammar differs from the grammar of normal speech, optimization is required. For example, "a cup of cola" is signed as "cola one (cup)" and "two packs of food" is signed as "food two (packs)"; optimization is then needed to translate "cola one" into "a cup of cola".

Step 7 includes the following matching process:

7.1) Send the word-frame value M1 to the primary buffer, read the unmatched image frames in the primary buffer, store the frames read in the secondary buffer and match them against the conventional frame matching library; if the match fails, go to step 7.2; if the match succeeds, send the image frames stored in the secondary buffer to the sentence buffer to await processing; when the image frames stored in the sentence buffer can form a complete sentence, perform step 8, otherwise continue with step 7.1;

7.2) If the match cannot succeed, send the word-frame value M2 to the primary buffer, read the unmatched image frames in the primary buffer, store the frames read in the secondary buffer and match them against the conventional frame matching library; if the match fails, go to step 7.3; if the match succeeds, send the image frames stored in the secondary buffer to the sentence buffer to await processing; when the image frames stored in the sentence buffer can form a complete sentence, perform step 8, otherwise go to step 7.1;

7.3) If the match cannot succeed, send the word-frame value M3 to the primary buffer, read the unmatched image frames in the primary buffer, store the frames read in the secondary buffer and match them against the conventional frame matching library; if the match fails, go to step 7.4; if the match succeeds, send the image frames stored in the secondary buffer to the sentence buffer to await processing; when the image frames stored in the sentence buffer can form a complete sentence, perform step 8, otherwise go to step 7.1;

7.4) The associative matching unit ends the double-line matching recognition, clears the secondary buffer, and returns to step 3.

Claims (7)

1. A system for identifying video stream image frames, characterized in that it comprises a matching unit, an associative matching unit, a conventional frame matching library, a sentence buffer, a primary buffer and a secondary buffer; wherein the primary buffer is connected to the secondary buffer; the matching unit is connected to the associative matching unit, the conventional frame matching library, the sentence buffer, the primary buffer and the secondary buffer; and the associative matching unit is connected to the conventional frame matching library, the sentence buffer, the primary buffer and the secondary buffer; specifically, the primary buffer receives the data packet formed by an associative gesture and splits it into continuous image frames for storage, the matching unit takes image frames out of the primary buffer in sequence and stores them in the secondary buffer, matches the image frames stored in the secondary buffer against the conventional frame matching library, stores the image frames from the secondary buffer into the sentence buffer after the match succeeds, and starts the associative matching unit to perform double-line matching recognition.
2. The system for identifying video stream image frames according to claim 1, characterized in that the conventional frame matching library classifies and stores the sign-language words composed of each same number of frames, forming in sequence a one-frame matching library, a two-frame matching library, ..., an (N-1)-frame matching library and an N-frame matching library.
3. The system for identifying video stream image frames according to claim 1, characterized in that the matching unit stores the image frames in the primary buffer into the secondary buffer one by one, starting from the first image frame; each time a new image frame is stored in the secondary buffer, the matching unit matches the image frames stored in the secondary buffer against the conventional frame matching library, and after the match succeeds stores the image frames from the secondary buffer into the sentence buffer.
4. The system for identifying video stream image frames according to claim 1, characterized in that after the matching unit matches successfully, three word-frame values (denoted M1, M2 and M3 in sequence) are derived by association from the number of frames required by the associative gesture and the image frames stored in the secondary buffer, and the secondary buffer is cleared; the matching process then begins: the unmatched image frames in the primary buffer are read in sequence according to the three word-frame values, the frames read are stored in the secondary buffer and matched against the conventional frame matching library, and if the match succeeds the image frames stored in the secondary buffer are sent to the sentence buffer to await processing; the above matching process is repeated until none of the three word-frame values can be matched, whereupon the associative matching unit ends the double-line matching recognition.
5. The system for identifying video stream image frames according to claim 1, characterized in that the matching unit checks the image frames stored in the sentence buffer in real time, and when the image frames stored in the sentence buffer can form a complete sentence, the matching unit optimizes them and outputs the result.
6. A method for identifying video stream image frames implemented by the system according to claim 1, characterized in that it comprises the following steps:
1) running the matching unit;
2) splitting the currently received data packet into continuous image frames and storing the split continuous image frames in the primary buffer;
3) storing the continuous image frames in the primary buffer into the secondary buffer one by one, starting from the first frame not yet matched, and matching them against the conventional frame matching library; if the match succeeds, proceeding to step 5; if the match fails, proceeding to step 4;
4) matching the unsuccessfully matched image frames together with the following image frame against the conventional frame matching library again; if the match succeeds, proceeding to step 5; if the match fails, performing step 4 again;
5) sending the successfully matched image frames to the sentence buffer to await processing, and proceeding to step 8 when the image frames stored in the sentence buffer can form a complete sentence;
6) the associative matching unit performing double-line matching recognition;
7) deriving three word-frame values (denoted M1, M2 and M3 in sequence) by association from the number of frames required by the associative gesture and the image frames stored in the secondary buffer, and clearing the secondary buffer; then beginning the matching process: reading the unmatched image frames in the primary buffer in sequence according to the three word-frame values, storing the frames read in the secondary buffer and matching them against the conventional frame matching library; if the match succeeds, sending the image frames stored in the secondary buffer to the sentence buffer to await processing, and proceeding to step 8 when the image frames stored in the sentence buffer can form a complete sentence; repeating the above matching process until none of the three word-frame values can be matched, whereupon the associative matching unit ends the double-line matching recognition, clears the secondary buffer and returns to step 3;
8) optimizing and arranging the image frames stored in the sentence buffer;
9) outputting the result.
7. The method for identifying video stream image frames according to claim 6, characterized in that step 7 comprises the following matching process:
7.1) sending the word-frame value M1 to the primary buffer, reading the unmatched image frames in the primary buffer, storing the frames read in the secondary buffer and matching them against the conventional frame matching library; if the match fails, proceeding to step 7.2; if the match succeeds, sending the image frames stored in the secondary buffer to the sentence buffer to await processing, performing step 8 when the image frames stored in the sentence buffer can form a complete sentence, and otherwise continuing with step 7.1;
7.2) if the match cannot succeed, sending the word-frame value M2 to the primary buffer, reading the unmatched image frames in the primary buffer, storing the frames read in the secondary buffer and matching them against the conventional frame matching library; if the match fails, proceeding to step 7.3; if the match succeeds, sending the image frames stored in the secondary buffer to the sentence buffer to await processing, performing step 8 when the image frames stored in the sentence buffer can form a complete sentence, and otherwise proceeding to step 7.1;
7.3) if the match cannot succeed, sending the word-frame value M3 to the primary buffer, reading the unmatched image frames in the primary buffer, storing the frames read in the secondary buffer and matching them against the conventional frame matching library; if the match fails, proceeding to step 7.4; if the match succeeds, sending the image frames stored in the secondary buffer to the sentence buffer to await processing, performing step 8 when the image frames stored in the sentence buffer can form a complete sentence, and otherwise proceeding to step 7.1;
7.4) the associative matching unit ending the double-line matching recognition.
CN201210480917.7A 2012-11-22 2012-11-22 System and method for identifying video stream image frame Expired - Fee Related CN103226692B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210480917.7A CN103226692B (en) 2012-11-22 2012-11-22 System and method for identifying video stream image frame

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210480917.7A CN103226692B (en) 2012-11-22 2012-11-22 System and method for identifying video stream image frame

Publications (2)

Publication Number Publication Date
CN103226692A true CN103226692A (en) 2013-07-31
CN103226692B CN103226692B (en) 2016-01-20

Family

ID=48837133

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210480917.7A Expired - Fee Related CN103226692B (en) 2012-11-22 2012-11-22 System and method for identifying video stream image frame

Country Status (1)

Country Link
CN (1) CN103226692B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875538A (en) * 2017-01-20 2017-06-20 深圳怡化电脑股份有限公司 The method and apparatus for obtaining crown word number information
CN111679272A (en) * 2020-06-01 2020-09-18 南京欧曼智能科技有限公司 Target track tracking processing method based on buffering technology

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0905644A2 (en) * 1997-09-26 1999-03-31 Matsushita Electric Industrial Co., Ltd. Hand gesture recognizing device
US5982853A (en) * 1995-03-01 1999-11-09 Liebermann; Raanan Telephone for the deaf and method of using same
CN101452705A (en) * 2007-12-07 2009-06-10 希姆通信息技术(上海)有限公司 Voice character conversion and cued speech character conversion method and device
CN102592112A (en) * 2011-12-20 2012-07-18 四川长虹电器股份有限公司 Method for determining gesture moving direction based on hidden Markov model

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5982853A (en) * 1995-03-01 1999-11-09 Liebermann; Raanan Telephone for the deaf and method of using same
EP0905644A2 (en) * 1997-09-26 1999-03-31 Matsushita Electric Industrial Co., Ltd. Hand gesture recognizing device
CN101452705A (en) * 2007-12-07 2009-06-10 希姆通信息技术(上海)有限公司 Voice character conversion and cued speech character conversion method and device
CN102592112A (en) * 2011-12-20 2012-07-18 四川长虹电器股份有限公司 Method for determining gesture moving direction based on hidden Markov model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Song Guixia: "Sign language data analysis and generation technology", Excellent Dissertations Database, Information Science and Technology Series (2009) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875538A (en) * 2017-01-20 2017-06-20 深圳怡化电脑股份有限公司 The method and apparatus for obtaining crown word number information
CN111679272A (en) * 2020-06-01 2020-09-18 南京欧曼智能科技有限公司 Target track tracking processing method based on buffering technology

Also Published As

Publication number Publication date
CN103226692B (en) 2016-01-20

Similar Documents

Publication Publication Date Title
CN107122751B (en) A face tracking and face image capture method based on face alignment
CN103455794B (en) A Dynamic Gesture Recognition Method Based on Frame Fusion Technology
CN114882351B (en) Multi-target detection and tracking method based on improved YOLO-V5s
CN114973408B (en) Dynamic gesture recognition method and device
CN112001347A (en) Motion recognition method based on human skeleton shape and detection target
WO2004066191A3 (en) A method and or system to perform automated facial recognition and comparison using multiple 2d facial images parsed from a captured 3d facial image
CN102982099B (en) A kind of personalized Parallel Word Segmentation disposal system and disposal route thereof
CN107909042B (en) A Continuous Gesture Segmentation Recognition Method
CN116758451A (en) Audio-visual emotion recognition method and system based on multi-scale and global cross-attention
CN103226692B (en) System and method for identifying video stream image frame
CN109299650B (en) Video-based nonlinear online expression pre-detection method and device
CN107330387A (en) Pedestrian detection method based on view data
CN107220634B (en) Gesture recognition method based on improved D-P algorithm and multi-template matching
WO2023019927A1 (en) Facial recognition method and apparatus, storage medium, and electronic device
CN107346207B (en) Dynamic gesture segmentation recognition method based on hidden Markov model
CN111914724A (en) Continuous Chinese sign language identification method and system based on sliding window segmentation
CN108537109A (en) Monocular camera sign Language Recognition Method based on OpenPose
CN110633663A (en) A method for automatically cropping multimodal data from sign language videos
CN110674291A (en) A classification method of Chinese patent text effect categories based on multi-neural network fusion
WO2025140159A1 (en) Text processing method and apparatus, electronic device, and storage medium
CN110288732A (en) An integrated device of a dual-chip smart lock fingerprint recognition functional unit
CN104392237B (en) Fuzzy sign language identification method for data gloves
CN114973397B (en) Real-time process detection system, method and storage medium
CN115565252A (en) A dynamic gesture recognition method and device
Gao et al. A discriminative multi-modal adaptation neural network model for video action recognition

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Wang Lei

Inventor after: Zheng Weilong

Inventor after: Zhang Wenshan

Inventor after: Yao Yipeng

Inventor after: Chen Xi

Inventor after: Wu Huanbin

Inventor after: Wen Yingying

Inventor after: Lian Junwei

Inventor before: Wang Lei

Inventor before: Zheng Weilong

Inventor before: Zhang Wenshan

Inventor before: Yao Yipeng

Inventor before: Chen Xi

COR Change of bibliographic data
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160120

Termination date: 20171122

CF01 Termination of patent right due to non-payment of annual fee