CN103226692A - A system and method for identifying video stream image frames - Google Patents
- Publication number
- CN103226692A
- Authority
- CN
- China
- Prior art keywords
- frame
- level
- district
- picture frame
- stored
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Image Analysis (AREA)
Abstract
Description
Technical Field
The invention relates to the field of gesture video stream recognition, and in particular to a system and method for recognizing video stream image frames.
Background
Gestures are among the most natural forms of human expression and are widely used in daily life. Sign language is a language that conveys meaning through gestures, but it is very difficult for people unfamiliar with it to understand. A technology capable of translating sign language would therefore greatly ease communication between deaf-mute people and hearing people. Gesture tracking is a key step in recognizing gestures and sign language.
In the prior art, gesture recognition falls into two categories according to how the peripheral device captures the gesture: recognition based on data gloves and recognition based on computer vision. Data-glove recognition measures the trajectory and timing of gesture movements using data gloves and position tracking. Its advantage is a high recognition rate; its drawbacks are that the input devices are expensive and that the signer must wear complex data gloves.
Computer-vision gesture recognition generally relies on target tracking with an ordinary monocular camera. One of the harder problems in this approach is occlusion: when the target is partially or fully occluded by another object, the tracked features become incomplete or disappear and tracking is interrupted. The target must then be detected again to reinitialize tracking, which is very inconvenient. Multiple cameras can be used to address this, but the tracking algorithm becomes more complex, which adds technical difficulty and instability. As a result, computer-vision gesture recognition is severely limited in the kinds of gestures it can recognize.
To address these problems, Chinese patent No. 200810068423.1, "A Method and Device for Segmentation and Recognition of Data Stream Image Frames", accumulates image data over a certain period, performs pattern recognition by judging whether a region image satisfies the boundary conditions of the recognition region, and then compares patterns through feature value extraction to obtain the desired result. Although that patent offers a technical solution, it still has many defects in practical use, such as few recognizable types, slow recognition, and insufficient matching accuracy.
Summary of the Invention
The purpose of the invention is to overcome the deficiencies of the prior art by providing a system and method for recognizing video stream image frames. The image data is split into frames, and the frames are matched against a conventional frame matching library in a frame-by-frame incremental manner, which avoids the step of extracting key frames from the incoming image data. Association is then performed through an associative function, which effectively shortens recognition and matching time.
The invention is realized through the following technical solutions:
A system for recognizing video stream image frames comprises a matching unit, an associative matching unit, a conventional frame matching library, a sentence buffer, a primary buffer, and a secondary buffer. The primary buffer and the secondary buffer are connected; the matching unit is connected to the associative matching unit, the conventional frame matching library, the sentence buffer, the primary buffer, and the secondary buffer; and the associative matching unit is likewise connected to the conventional frame matching library, the sentence buffer, the primary buffer, and the secondary buffer.
Specifically, the primary buffer receives the data packets formed from the associated gestures, splits them, and stores them as consecutive image frames. The matching unit takes image frames from the primary buffer in order, stores them in the secondary buffer, and matches the frames held in the secondary buffer against the conventional frame matching library. After a successful match, the frames held in the secondary buffer are stored in the sentence buffer, and the associative matching unit is started to perform dual-line matching recognition.
The conventional frame matching library classifies and stores signs by the number of frames that compose them, forming in turn a one-frame matching library, a two-frame matching library, ..., an (N-1)-frame matching library, and an N-frame matching library.
For example, if the signs for "I", "you", and "he" can each be composed of 2 frames, these 2-frame signs are all grouped in the two-frame matching library, and so on for other frame counts.
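The grouping by frame count can be sketched as a dictionary of sub-libraries. This is only an illustration: the frame labels such as `f_i_1`, the function name `build_frame_library`, and the idea of representing a frame as a hashable label compared by equality are assumptions, since the patent does not specify how frames are stored or compared.

```python
from collections import defaultdict


def build_frame_library(signs):
    """Group sign templates by frame count.

    `signs` maps a word to its sequence of frame labels (hypothetical
    stand-ins for image frames). Returns {frame_count: {frame_tuple: word}},
    i.e. one sub-library per frame count.
    """
    library = defaultdict(dict)
    for word, frames in signs.items():
        key = tuple(frames)
        library[len(key)][key] = word
    return dict(library)


# Hypothetical templates: three 2-frame signs and one 3-frame sign.
signs = {
    "I": ["f_i_1", "f_i_2"],
    "you": ["f_you_1", "f_you_2"],
    "he": ["f_he_1", "f_he_2"],
    "cola": ["f_cola_1", "f_cola_2", "f_cola_3"],
}
library = build_frame_library(signs)
```

With this layout, a lookup for a candidate of k frames only touches the k-frame sub-library.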
The matching unit stores the image frames from the primary buffer into the secondary buffer incrementally, starting from the first image frame. Each time a new image frame is stored in the secondary buffer, the matching unit matches the frames held in the secondary buffer against the conventional frame matching library; after a successful match, those frames are stored in the sentence buffer.
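The frame-by-frame incremental matching described above can be sketched as follows. It is a minimal illustration under stated assumptions: frames are hashable labels compared by equality, the library is a dictionary keyed by frame count, and the name `match_incrementally` is hypothetical rather than taken from the patent.

```python
def match_incrementally(primary, library, start=0):
    """Grow a secondary buffer one frame at a time and look it up.

    Frames are copied from primary[start:] into the secondary buffer.
    After each addition, the buffer is looked up in the sub-library for
    its current length. Returns (word, next_index) on the first match,
    or (None, start) if the remaining frames never match.
    """
    secondary = []
    for i in range(start, len(primary)):
        secondary.append(primary[i])
        word = library.get(len(secondary), {}).get(tuple(secondary))
        if word is not None:
            return word, i + 1
    return None, start


# Hypothetical library and frame stream.
example_library = {1: {("a",): "A"}, 2: {("b", "c"): "BC"}}
example_primary = ["b", "c", "a"]
first = match_incrementally(example_primary, example_library)
second = match_incrementally(example_primary, example_library, start=first[1])
```

Here the single frame `b` does not match, so the buffer grows to `b, c` before the two-frame library yields a hit; the next call resumes from the first unmatched frame.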
After the matching unit matches successfully, three word frame values (denoted M1, M2, and M3 in order) are associated according to the number of frames required by the associated gestures and the image frames held in the secondary buffer, and the secondary buffer is cleared. The matching process then begins: the unmatched image frames in the primary buffer are read using the three word frame values in turn, the frames read are stored in the secondary buffer and matched against the conventional frame matching library, and on a successful match the frames held in the secondary buffer are sent to the sentence buffer to await processing. This matching process is repeated until none of the three word frame values yields a match, at which point the associative matching unit ends dual-line matching recognition.
The matching unit checks the image frames held in the sentence buffer in real time; when they can form a complete sentence pattern, the matching unit optimizes them and outputs the result.
A method for recognizing video stream image frames, implemented with the above system, comprises the following steps:
1) The matching unit runs;
2) The currently received data packet is split into consecutive image frames, which are stored in the primary buffer;
3) The consecutive image frames in the primary buffer are stored incrementally into the secondary buffer, starting from the first frame not yet matched, and are matched against the conventional frame matching library; if the match succeeds, go to step 5, otherwise go to step 4;
4) The unmatched image frames, together with the next image frame, are matched again against the conventional frame matching library; if the match succeeds, go to step 5, otherwise repeat step 4;
5) The successfully matched image frames are sent to the sentence buffer to await processing; when the image frames held in the sentence buffer can form a complete sentence pattern, go to step 8;
6) The associative matching unit performs dual-line matching recognition;
7) Three word frame values (denoted M1, M2, and M3 in order) are associated according to the number of frames required by the associated gestures and the image frames held in the secondary buffer, and the secondary buffer is cleared. The matching process then begins: the unmatched image frames in the primary buffer are read using the three word frame values in turn, and the frames read are stored in the secondary buffer and matched against the conventional frame matching library. On a successful match, the frames held in the secondary buffer are sent to the sentence buffer to await processing, and when the image frames held in the sentence buffer can form a complete sentence pattern, go to step 8. The matching process is repeated until none of the three word frame values yields a match, at which point the associative matching unit ends dual-line matching recognition, clears the secondary buffer, and returns to step 3;
8) The image frames held in the sentence buffer are optimized and reordered;
9) The result is output.
In step 8, optimization is needed because sign language grammar differs from ordinary spoken grammar. For example, "a cup of Coke" is signed as "Coke one (cup)", so optimization is needed to translate "Coke one" into "a cup of Coke".
Step 7 comprises the following matching process:
7.1) The word frame value M1 is sent to the primary buffer, the unmatched image frames in the primary buffer are read, and the frames read are stored in the secondary buffer and matched against the conventional frame matching library. If the match fails, go to step 7.2. If the match succeeds, the image frames held in the secondary buffer are sent to the sentence buffer to await processing; when the image frames held in the sentence buffer can form a complete sentence pattern, go to step 8, otherwise continue with step 7.1;
7.2) If the match fails, the word frame value M2 is sent to the primary buffer, the unmatched image frames in the primary buffer are read, and the frames read are stored in the secondary buffer and matched against the conventional frame matching library. If the match fails, go to step 7.3. If the match succeeds, the image frames held in the secondary buffer are sent to the sentence buffer to await processing; when the image frames held in the sentence buffer can form a complete sentence pattern, go to step 8, otherwise go to step 7.1;
7.3) If the match fails, the word frame value M3 is sent to the primary buffer, the unmatched image frames in the primary buffer are read, and the frames read are stored in the secondary buffer and matched against the conventional frame matching library. If the match fails, go to step 7.4. If the match succeeds, the image frames held in the secondary buffer are sent to the sentence buffer to await processing; when the image frames held in the sentence buffer can form a complete sentence pattern, go to step 8, otherwise go to step 7.1;
7.4) The associative matching unit ends dual-line matching recognition, clears the secondary buffer, and returns to step 3.
Compared with the prior art, the invention has the following beneficial effects:
The invention provides a new vision-based gesture recognition scheme that matches frames against the conventional frame matching library in a frame-by-frame incremental manner. It does not extract feature values; instead it matches against the frame library one entry at a time, and once a sign has been recognized, the associative recognition function directly takes the corresponding number of frames for matching. This greatly reduces the computer's workload, significantly improves computation speed and matching accuracy, and effectively addresses the few recognizable types and slow recognition speed of computer-vision gesture recognition. The scheme is applicable in many fields, has broad application prospects, and offers outstanding efficiency.
Description of the Drawings
The invention is described in further detail below with reference to the embodiments and the accompanying drawings:
Fig. 1 is a schematic diagram of the system structure of a specific embodiment of the invention;
Fig. 2 is a schematic flowchart of the method of a specific embodiment of the invention;
Fig. 3 is a detailed schematic flowchart of the method of a specific embodiment of the invention;
Fig. 4 is a schematic flowchart of the associative matching unit of a specific embodiment of the invention;
Fig. 5 is a schematic structural diagram of the conventional frame matching library of a specific embodiment of the invention.
Detailed Description
The invention is described in further detail below with reference to the embodiments and the accompanying drawings, but it should be understood that the embodiments of the invention are not limited thereto.
Fig. 1 shows a system for recognizing video stream image frames according to the invention, comprising a matching unit, an associative matching unit, a conventional frame matching library, a sentence buffer, a primary buffer, and a secondary buffer. The primary buffer and the secondary buffer are connected; the matching unit is connected to the associative matching unit, the conventional frame matching library, the sentence buffer, the primary buffer, and the secondary buffer; and the associative matching unit is likewise connected to the conventional frame matching library, the sentence buffer, the primary buffer, and the secondary buffer.
Specifically, as shown in Fig. 2, the primary buffer receives the data packets formed from the associated gestures, splits them, and stores them as consecutive image frames. The matching unit takes image frames from the primary buffer in order, stores them in the secondary buffer, and matches the frames held in the secondary buffer against the conventional frame matching library. After a successful match, the frames held in the secondary buffer are stored in the sentence buffer, and the associative matching unit is started to perform dual-line matching recognition.
As shown in Fig. 5, the conventional frame matching library classifies and stores signs by the number of frames that compose them, forming in turn a one-frame matching library, a two-frame matching library, ..., an (N-1)-frame matching library, and an N-frame matching library.
For example, if the signs for "I", "you", and "he" can each be composed of 2 frames, these 2-frame signs are all grouped in the two-frame matching library; frames of other types are handled in the same way, according to the prior art and industry knowledge.
The matching unit stores the image frames from the primary buffer into the secondary buffer incrementally, starting from the first image frame. Each time a new image frame is stored in the secondary buffer, the matching unit matches the frames held in the secondary buffer against the conventional frame matching library; after a successful match, those frames are stored in the sentence buffer.
Specifically, as shown in Fig. 4, after the matching unit matches successfully, three word frame values (which may be denoted M1, M2, and M3 in order) are associated according to the number of frames required by the associated gestures and the image frames held in the secondary buffer, and the secondary buffer is cleared. The matching process then begins: the unmatched image frames in the primary buffer are read using the three word frame values in turn, the frames read are stored in the secondary buffer and matched against the conventional frame matching library, and on a successful match the frames held in the secondary buffer are sent to the sentence buffer to await processing. This matching process is repeated until none of the three word frame values yields a match, at which point the associative matching unit ends dual-line matching recognition.
The matching unit checks the image frames held in the sentence buffer in real time; when they can form a complete sentence pattern, the matching unit optimizes them and outputs the result.
As shown in Figs. 2 and 3, a method for recognizing video stream image frames, implemented with the above system, comprises the following steps:
1) The matching unit is run;
2) The currently received data packet is split into consecutive image frames, which are stored in the primary buffer;
3) The consecutive image frames in the primary buffer are stored incrementally into the secondary buffer, starting from the first frame not yet matched, and are matched against the conventional frame matching library; if the match succeeds, go to step 5, otherwise go to step 4;
4) The unmatched image frames, together with the next image frame, are matched again against the conventional frame matching library; if the match succeeds, go to step 5, otherwise repeat step 4;
5) The successfully matched image frames are sent to the sentence buffer to await processing; when the image frames held in the sentence buffer can form a complete sentence pattern, go to step 8;
6) The associative matching unit performs dual-line matching recognition;
7) Three word frame values (denoted M1, M2, and M3 in order) are associated according to the number of frames required by the associated gestures and the image frames held in the secondary buffer, and the secondary buffer is cleared. The matching process then begins: the unmatched image frames in the primary buffer are read using the three word frame values in turn, and the frames read are stored in the secondary buffer and matched against the conventional frame matching library. On a successful match, the frames held in the secondary buffer are sent to the sentence buffer to await processing, and when the image frames held in the sentence buffer can form a complete sentence pattern, go to step 8. The matching process is repeated until none of the three word frame values yields a match, at which point the associative matching unit ends dual-line matching recognition, clears the secondary buffer, and returns to step 3;
8) The image frames held in the sentence buffer are optimized and reordered;
9) The result is output.
In step 8, optimization is needed because sign language grammar differs from ordinary spoken grammar. For example, "a cup of Coke" is signed as "Coke one (cup)", and "two packs of food" is signed as "food two (pack)"; optimization is then needed to translate "Coke one" into "a cup of Coke".
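The reordering in step 8 can be illustrated with a small sketch. The patent gives only the two examples above and no rule set, so everything here is an assumption: recognized signs are represented by English glosses, the tables `NUMERALS` and `MEASURE_WORDS` are hypothetical, and the single rule applied is "noun followed by numeral becomes numeral, measure word, noun".

```python
# Hypothetical lookup tables; the patent does not define them.
NUMERALS = {"one", "two", "three"}
MEASURE_WORDS = {"cola": "cup", "food": "pack"}


def reorder_noun_numeral(tokens):
    """Rewrite 'noun numeral' pairs as 'numeral measure-word noun'.

    Tokens that are not part of such a pair are copied through unchanged.
    """
    out = []
    i = 0
    while i < len(tokens):
        is_pair = (
            i + 1 < len(tokens)
            and tokens[i] in MEASURE_WORDS
            and tokens[i + 1] in NUMERALS
        )
        if is_pair:
            noun, numeral = tokens[i], tokens[i + 1]
            out.extend([numeral, MEASURE_WORDS[noun], noun])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


reordered = reorder_noun_numeral(["cola", "one"])
```

A real system would need a much richer grammar; the sketch only shows where such a pass would sit, after the sentence buffer is complete and before output.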
Step 7 comprises the following matching process:
7.1) The word frame value M1 is sent to the primary buffer, the unmatched image frames in the primary buffer are read, and the frames read are stored in the secondary buffer and matched against the conventional frame matching library. If the match fails, go to step 7.2. If the match succeeds, the image frames held in the secondary buffer are sent to the sentence buffer to await processing; when the image frames held in the sentence buffer can form a complete sentence pattern, go to step 8, otherwise continue with step 7.1;
7.2) If the match fails, the word frame value M2 is sent to the primary buffer, the unmatched image frames in the primary buffer are read, and the frames read are stored in the secondary buffer and matched against the conventional frame matching library. If the match fails, go to step 7.3. If the match succeeds, the image frames held in the secondary buffer are sent to the sentence buffer to await processing; when the image frames held in the sentence buffer can form a complete sentence pattern, go to step 8, otherwise go to step 7.1;
7.3) If the match fails, the word frame value M3 is sent to the primary buffer, the unmatched image frames in the primary buffer are read, and the frames read are stored in the secondary buffer and matched against the conventional frame matching library. If the match fails, go to step 7.4. If the match succeeds, the image frames held in the secondary buffer are sent to the sentence buffer to await processing; when the image frames held in the sentence buffer can form a complete sentence pattern, go to step 8, otherwise go to step 7.1;
7.4) The associative matching unit ends dual-line matching recognition, clears the secondary buffer, and returns to step 3.
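Steps 7.1 to 7.4 can be sketched as a cascade over the candidate frame counts. This is an illustration under several assumptions: frames are hashable labels compared by equality, the library is a dictionary keyed by frame count, the candidates (M1, M2, M3) are passed in as positive integers because the patent does not say how they are derived or whether they are recomputed after each match, and the check for a complete sentence pattern (the jump to step 8) is left out. The name `associative_match` is hypothetical.

```python
def associative_match(primary, start, candidates, library):
    """Try candidate frame counts in order on the unmatched frames.

    For each candidate m, the next m frames of primary (from the current
    position) are treated as the secondary buffer and looked up in the
    m-frame sub-library. A hit is recorded and the cascade restarts from
    the first candidate (step 7.1). When no candidate matches, the
    function returns (step 7.4) with the matched words and the index of
    the first frame still unmatched.
    """
    words = []
    pos = start
    while True:
        for m in candidates:                      # steps 7.1, 7.2, 7.3
            secondary = tuple(primary[pos:pos + m])
            if len(secondary) < m:
                continue                          # too few frames left
            word = library.get(m, {}).get(secondary)
            if word is not None:
                words.append(word)                # to the sentence buffer
                pos += m
                break                             # back to step 7.1
        else:
            return words, pos                     # step 7.4


# Hypothetical library, frame stream, and candidate counts.
demo_library = {1: {("a",): "A"}, 2: {("b", "c"): "BC"}}
demo_primary = ["x", "b", "c", "a", "z"]
demo_result = associative_match(demo_primary, 1, (2, 1, 3), demo_library)
```

The caller would then fall back to frame-by-frame incremental matching (step 3) from the returned index.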
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210480917.7A CN103226692B (en) | 2012-11-22 | 2012-11-22 | System and method for identifying video stream image frame |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210480917.7A CN103226692B (en) | 2012-11-22 | 2012-11-22 | System and method for identifying video stream image frame |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103226692A true CN103226692A (en) | 2013-07-31 |
CN103226692B CN103226692B (en) | 2016-01-20 |
Family
ID=48837133
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210480917.7A Expired - Fee Related CN103226692B (en) | 2012-11-22 | 2012-11-22 | System and method for identifying video stream image frame |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103226692B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106875538A (en) * | 2017-01-20 | 2017-06-20 | 深圳怡化电脑股份有限公司 | The method and apparatus for obtaining crown word number information |
CN111679272A (en) * | 2020-06-01 | 2020-09-18 | 南京欧曼智能科技有限公司 | Target track tracking processing method based on buffering technology |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0905644A2 (en) * | 1997-09-26 | 1999-03-31 | Matsushita Electric Industrial Co., Ltd. | Hand gesture recognizing device |
US5982853A (en) * | 1995-03-01 | 1999-11-09 | Liebermann; Raanan | Telephone for the deaf and method of using same |
CN101452705A (en) * | 2007-12-07 | 2009-06-10 | 希姆通信息技术(上海)有限公司 | Voice character conversion nd cued speech character conversion method and device |
CN102592112A (en) * | 2011-12-20 | 2012-07-18 | 四川长虹电器股份有限公司 | Method for determining gesture moving direction based on hidden Markov model |
-
2012
- 2012-11-22 CN CN201210480917.7A patent/CN103226692B/en not_active Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
宋桂霞: "手语数据分析及生成技术", 《优秀论文数据库信息科技辑(2009年)》 * |
Also Published As
Publication number | Publication date |
---|---|
CN103226692B (en) | 2016-01-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107122751B (en) | A face tracking and face image capture method based on face alignment | |
CN103455794B (en) | A Dynamic Gesture Recognition Method Based on Frame Fusion Technology | |
CN114882351B (en) | Multi-target detection and tracking method based on improved YOLO-V5s | |
CN114973408B (en) | Dynamic gesture recognition method and device | |
CN112001347A (en) | Motion recognition method based on human skeleton shape and detection target | |
WO2004066191A3 (en) | A method and or system to perform automated facial recognition and comparison using multiple 2d facial images parsed from a captured 3d facial image | |
CN102982099B (en) | A kind of personalized Parallel Word Segmentation disposal system and disposal route thereof | |
CN107909042B (en) | A Continuous Gesture Segmentation Recognition Method | |
CN116758451A (en) | Audio-visual emotion recognition method and system based on multi-scale and global cross-attention | |
CN103226692B (en) | System and method for identifying video stream image frame | |
CN109299650B (en) | Video-based nonlinear online expression pre-detection method and device | |
CN107330387A (en) | Pedestrian detection method based on view data | |
CN107220634B (en) | Gesture recognition method based on improved D-P algorithm and multi-template matching | |
WO2023019927A1 (en) | Facial recognition method and apparatus, storage medium, and electronic device | |
CN107346207B (en) | Dynamic gesture segmentation recognition method based on hidden Markov model | |
CN111914724A (en) | Continuous Chinese sign language identification method and system based on sliding window segmentation | |
CN108537109A (en) | Monocular camera sign Language Recognition Method based on OpenPose | |
CN110633663A (en) | A method for automatically cropping multimodal data from sign language videos | |
CN110674291A (en) | A classification method of Chinese patent text effect categories based on multi-neural network fusion | |
WO2025140159A1 (en) | Text processing method and apparatus, electronic device, and storage medium | |
CN110288732A (en) | An integrated device of a dual-chip smart lock fingerprint recognition functional unit | |
CN104392237B (en) | Fuzzy sign language identification method for data gloves | |
CN114973397B (en) | Real-time process detection system, method and storage medium | |
CN115565252A (en) | A dynamic gesture recognition method and device | |
Gao et al. | A discriminative multi-modal adaptation neural network model for video action recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CB03 | Change of inventor or designer information |
Inventor after: Wang Lei Inventor after: Zheng Weilong Inventor after: Zhang Wenshan Inventor after: Yao Yipeng Inventor after: Chen Xi Inventor after: Wu Huanbin Inventor after: Wen Yingying Inventor after: Lian Junwei Inventor before: Wang Lei Inventor before: Zheng Weilong Inventor before: Zhang Wenshan Inventor before: Yao Yipeng Inventor before: Chen Xi |
|
COR | Change of bibliographic data | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20160120 Termination date: 20171122 |
|