CN103080878B - Method and apparatus for segmenting strokes of overlapped handwriting into one or more groups
- Publication number
- CN103080878B (application number CN201080068735.8A)
- Authority
- CN
- China
- Prior art keywords
- stroke
- strokes
- series
- feature
- group
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/32—Digital ink
- G06V30/333—Preprocessing; Feature extraction
- G06V30/347—Sampling; Contour coding; Stroke extraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/10—Image acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/22—Character recognition characterised by the type of writing
- G06V30/226—Character recognition characterised by the type of writing of cursive writing
- G06V30/2268—Character recognition characterised by the type of writing of cursive writing using stroke segmentation
Description
Technical Field
Exemplary embodiments of the present invention relate to the processing of overlapping handwriting and, more particularly, to segmenting a series of strokes that includes overlapping handwriting into one or more groups of strokes.
Background
Various devices capture or receive handwritten input through touch screens or other input devices. For example, many computers, tablets, mobile phones, personal digital assistants (PDAs), and other types of electronic devices include touch screens that allow users to provide tactile input, such as handwriting. The handwritten input may be captured by the electronic device and processed in an attempt to recognize the handwritten characters, so as to inform future actions of the electronic device, including, for example, storage or transmission of a representation of the handwritten input.
To make character recognition easier, handwritten input is usually segmented into distinct characters, such as letters, digits, or other characters. Several different techniques exist for segmenting handwritten input. One technique exploits the pause between character inputs. With this technique, a user interface such as a touch screen only needs to include one writing area. The user writes a character in the writing area and then pauses, so that the character is received and the writing area is cleared in preparation for the next character. The pause between character inputs is thus used to segment the handwriting.
Another technique employs a user interface with two or more writing areas, such as two or more touch screens. The user writes a character in one of the writing areas and then switches to another writing area to write the next character. While the user is writing a character in one writing area, the character previously written in the other writing area can be received and processed, and that other writing area can then be cleared in preparation for the next character. Switching between writing areas is thus used to segment the handwriting.
Yet another technique employs a single, fairly large writing area. Within the writing area, the user can write several characters in succession, one after another, much as characters are written on a sheet of paper. This technique relies on the spatial gaps between the written characters, together with underlying intelligence, to segment the handwritten input appropriately.
In a further technique, the user interface provides a single writing area in which the user can write several characters in succession so that the characters overlap one another with no spatial gap between them. Using underlying intelligence, the overlapping characters can be segmented to separate one character from another. However, this procedure is more complex than the techniques above and usually relies on a recognition process. The recognition process segments a series of strokes into characters under maximum likelihood estimation, and it may do so with errors. It is also computationally intensive, which makes real-time operation challenging, particularly on small devices.
Techniques that use overlapping continuous handwriting allow electronic devices with relatively small user interfaces to receive continuous handwritten input from a user's finger or a stylus, so that handwritten input can be provided relatively quickly compared with entering one character at a time. However, reliance on overlapping continuous handwriting can pose challenges for both segmentation and user interaction. Overlapping continuous handwriting makes it harder to segment the handwriting efficiently so as to separate each character from the others, particularly compared with the more common continuous handwriting in which written characters are spatially separated. In addition, the overlap of multiple characters can create difficulties for the user, who may be unable to see clearly what is currently being written against the noisy and complex background formed by the other overlapping characters.
Summary of the Invention
According to exemplary embodiments, a method, apparatus, and computer program product are provided for segmenting a series of strokes that includes overlapping handwriting into one or more groups of strokes. This segmentation may be performed before any segmentation into one or more characters. Each group of strokes may be a character or part of a character, but in no case does a group include strokes from more than one character. Segmenting the series of strokes into one or more groups first makes the subsequent segmentation of the series into one or more characters more efficient. In addition, the previous group or groups of strokes may be displayed in a different, less conspicuous manner while overlapping handwriting continues to be received, allowing the user to see the most recent strokes more clearly.
In one embodiment, a method is provided that includes receiving a series of strokes that includes overlapping handwriting and, for each of the plurality of strokes, determining, by a processor, a plurality of features associated with the current stroke based on geometric properties of the series of strokes. The method of this embodiment also segments the series of strokes into one or more groups of strokes based on the features associated with the strokes. As noted above, each group of strokes is a character or part of a character, but no group includes strokes from more than one character.
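The grouping step can be illustrated with a minimal Python sketch. It assumes a stroke is a list of `(x, y)` points and that some decision function says whether a group ends after a given stroke. The names `segment_into_groups` and `starts_far_left`, and the threshold rule itself, are hypothetical illustrations and are not taken from the patent text, which does not specify the decision rule here.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def segment_into_groups(
    strokes: List[Stroke],
    is_boundary: Callable[[List[Stroke], int], bool],
) -> List[List[Stroke]]:
    """Split a stroke series into consecutive groups.

    is_boundary(strokes, i) returns True when a group should end after
    stroke i. Because no group may mix strokes from two characters, a
    real rule should prefer splitting too often over splitting too rarely.
    """
    groups: List[List[Stroke]] = []
    current: List[Stroke] = []
    for i, stroke in enumerate(strokes):
        current.append(stroke)
        last = i == len(strokes) - 1
        if last or is_boundary(strokes, i):
            groups.append(current)
            current = []
    return groups


def starts_far_left(strokes: List[Stroke], i: int, jump: float = 5.0) -> bool:
    """Hypothetical toy rule: the next stroke starts well to the left of
    where the current stroke ended, as if the writer moved back to begin
    a new character on top of the previous one."""
    return strokes[i + 1][0][0] < strokes[i][-1][0] - jump


demo = [
    [(0, 0), (10, 0)],    # horizontal stroke
    [(10, 0), (10, 10)],  # continues from where the first one ended
    [(0, 0), (0, 10)],    # pen jumps back to the left edge
]
groups = segment_into_groups(demo, starts_far_left)
print([len(g) for g in groups])
```

A group produced this way may be only part of a character, which is acceptable under the description above; a later character-segmentation stage can merge adjacent groups.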
The method of one embodiment may also segment the series of strokes into one or more characters based on the one or more groups into which the series has been segmented, thereby increasing the efficiency of segmenting the series of strokes into characters. According to an exemplary embodiment, the method may determine the plurality of features associated with the current stroke based solely on geometric properties of the series of strokes. In one embodiment, the method may also cause at least some of the groups to be displayed such that at least one group is displayed in a manner distinctly different from at least one other group. The displayed image of the overlapping handwriting can thus be simplified so that the user can more easily see, for example, the most recent strokes.
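One way to display groups differently is to fade older groups, sketched below in Python. The geometric fade, the `fade` and `floor` parameters, and the function name `group_opacities` are assumptions made for illustration; the text only requires that at least one group be displayed distinctly from another.

```python
from typing import List


def group_opacities(num_groups: int, fade: float = 0.5, floor: float = 0.1) -> List[float]:
    """Return one opacity per group, oldest first.

    The newest group is fully opaque; each older group is multiplied by
    `fade` once more, but never drops below `floor` so it stays visible.
    """
    return [
        max(floor, fade ** (num_groups - 1 - i))
        for i in range(num_groups)
    ]


print(group_opacities(3))
```

A renderer could apply these values as alpha when redrawing the writing area after each segmentation pass.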
In one embodiment, the method also normalizes the plurality of features associated with the current stroke. In this regard, the features may be normalized based on the overall size of the series of strokes. The plurality of features determined for the current stroke may be selected from the group consisting of the end point of the current stroke, the geometric center of the current stroke, the start point of the next stroke, the geometric center of the next stroke, the smallest rectangle containing the current stroke, the smallest rectangle containing the next stroke, and the smallest rectangle containing the previous stroke.
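The listed features and their normalization can be sketched in Python as follows. Several choices here are assumptions rather than details from the patent: a stroke is a list of `(x, y)` samples, the geometric center is taken as the mean of those samples, rectangles are axis-aligned, and normalization shifts by the top-left corner of the whole series and divides by its larger dimension. The function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bounding_box(points: List[Point]) -> Box:
    """Smallest axis-aligned rectangle containing all points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def geometric_center(stroke: Stroke) -> Point:
    """Assumption: the geometric center is the mean of the sampled points."""
    n = len(stroke)
    return (sum(p[0] for p in stroke) / n, sum(p[1] for p in stroke) / n)


def stroke_features(strokes: List[Stroke], i: int) -> List[float]:
    """Normalized feature vector for stroke i (needs a previous and next stroke)."""
    if not 0 < i < len(strokes) - 1:
        raise ValueError("stroke i needs both a previous and a next stroke")
    prev, cur, nxt = strokes[i - 1], strokes[i], strokes[i + 1]

    # Overall extent of the whole series is the normalization reference.
    x0, y0, x1, y1 = bounding_box([p for s in strokes for p in s])
    scale = max(x1 - x0, y1 - y0) or 1.0  # guard against a zero-size series

    def norm(p: Point) -> List[float]:
        return [(p[0] - x0) / scale, (p[1] - y0) / scale]

    def norm_box(b: Box) -> List[float]:
        return norm((b[0], b[1])) + norm((b[2], b[3]))

    features: List[float] = []
    features.extend(norm(cur[-1]))                 # end point of current stroke
    features.extend(norm(geometric_center(cur)))   # center of current stroke
    features.extend(norm(nxt[0]))                  # start point of next stroke
    features.extend(norm(geometric_center(nxt)))   # center of next stroke
    features.extend(norm_box(bounding_box(cur)))   # rectangle of current stroke
    features.extend(norm_box(bounding_box(nxt)))   # rectangle of next stroke
    features.extend(norm_box(bounding_box(prev)))  # rectangle of previous stroke
    return features


demo = [[(0, 0), (0, 10)], [(0, 0), (10, 0)], [(10, 0), (10, 10)]]
vector = stroke_features(demo, 1)
print(len(vector), vector[:4])
```

Dividing by the size of the whole series keeps the features comparable across writing areas of different resolution, which is the stated purpose of normalizing by overall size.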
The method may be performed in an incremental mode or a batch mode. In the incremental mode, the steps of determining the plurality of features and segmenting the series of strokes are repeated after each successive stroke is received. In the batch mode, these steps are repeated after a plurality of strokes has been received.
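The difference between the two modes can be sketched with a small Python class. This is an illustrative arrangement under stated assumptions, not the patented implementation: the class name `StrokeSegmenter`, the `batch_size` parameter, and the `flush` method are hypothetical, and `segment` stands for any function that maps a list of strokes to a list of groups.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]
Groups = List[List[Stroke]]


class StrokeSegmenter:
    """Re-runs segmentation after every stroke (batch_size=1, incremental
    mode) or after every `batch_size` strokes (batch mode)."""

    def __init__(self, segment: Callable[[List[Stroke]], Groups], batch_size: int = 1):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.segment = segment
        self.batch_size = batch_size
        self.strokes: List[Stroke] = []
        self.groups: Groups = []
        self.runs = 0        # how many times segmentation has been executed
        self._pending = 0    # strokes received since the last run

    def add_stroke(self, stroke: Stroke) -> Groups:
        self.strokes.append(stroke)
        self._pending += 1
        if self._pending >= self.batch_size:
            self._run()
        return self.groups

    def flush(self) -> Groups:
        """Segment any strokes still waiting for a full batch."""
        if self._pending:
            self._run()
        return self.groups

    def _run(self) -> None:
        self.groups = self.segment(self.strokes)
        self.runs += 1
        self._pending = 0


def one_group(strokes: List[Stroke]) -> Groups:
    """Placeholder segmentation that puts every stroke in a single group."""
    return [list(strokes)]


incremental = StrokeSegmenter(one_group, batch_size=1)
batch = StrokeSegmenter(one_group, batch_size=4)
for k in range(5):
    incremental.add_stroke([(k, k)])
    batch.add_stroke([(k, k)])
print(incremental.runs, batch.runs)
```

Incremental mode keeps the displayed grouping current after each stroke at the cost of more segmentation passes, while batch mode trades freshness for fewer passes.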
In another embodiment, an apparatus is provided that includes at least one processor and at least one memory including computer program code. The at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to receive a series of strokes that includes overlapping handwriting and, for each of the plurality of strokes, to determine a plurality of features associated with the current stroke based on geometric properties of the series of strokes. The at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus to segment the series of strokes into one or more groups of strokes based on the features associated with the strokes. As noted above, each group of strokes is a character or part of a character, but no group includes strokes from more than one character.
The at least one memory and the computer program code of the apparatus of one embodiment may be further configured to, with the at least one processor, cause the apparatus to segment the series of strokes into one or more characters based on the one or more groups into which the series has been segmented, thereby increasing the efficiency of segmenting the series of strokes into characters. In an exemplary embodiment, the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus to determine the plurality of features associated with the current stroke based solely on geometric properties of the series of strokes. In one embodiment, the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus to cause at least some of the groups to be displayed such that at least one group is displayed in a manner distinctly different from at least one other group. The displayed image of the overlapping handwriting can thus be simplified so that the user can more easily see, for example, the most recent strokes.
In one embodiment, the at least one memory and the computer program code of the apparatus may be further configured to, with the at least one processor, cause the apparatus to normalize the plurality of features associated with the current stroke. In this regard, the features may be normalized based on the overall size of the series of strokes. The plurality of features determined for the current stroke may be selected from the group consisting of the end point of the current stroke, the geometric center of the current stroke, the start point of the next stroke, the geometric center of the next stroke, the smallest rectangle containing the current stroke, the smallest rectangle containing the next stroke, and the smallest rectangle containing the previous stroke.
The analysis of the strokes may be performed in an incremental mode or a batch mode. In the incremental mode, the steps of determining the plurality of features and segmenting the series of strokes are repeated after each successive stroke is received. In the batch mode, these steps are repeated after a plurality of strokes has been received.
In a further embodiment, an apparatus is provided that includes means for receiving a series of strokes that includes overlapping handwriting, and means for determining, for each of the plurality of strokes, a plurality of features associated with the current stroke based on geometric properties of the series of strokes. The apparatus of this embodiment may also include means for segmenting the series of strokes into one or more groups of strokes based on the features associated with the strokes. As noted above, each group of strokes is a character or part of a character, but no group includes strokes from more than one character.
The apparatus of one embodiment may also include means for segmenting the series of strokes into one or more characters based on the one or more groups into which the series has been segmented, thereby increasing the efficiency of segmenting the series of strokes into characters. In an exemplary embodiment, the means for determining the plurality of features associated with the current stroke includes means for determining those features based solely on geometric properties of the series of strokes. In one embodiment, the apparatus may also include means for causing at least some of the groups to be displayed such that at least one group is displayed in a manner distinctly different from at least one other group. The displayed image of the overlapping handwriting can thus be simplified so that the user can more easily see, for example, the most recent strokes.
In one embodiment, the apparatus also includes means for normalizing the plurality of features associated with the current stroke. In this regard, the features may be normalized based on the overall size of the series of strokes. The plurality of features determined for the current stroke may be selected from the group consisting of the end point of the current stroke, the geometric center of the current stroke, the start point of the next stroke, the geometric center of the next stroke, the smallest rectangle containing the current stroke, the smallest rectangle containing the next stroke, and the smallest rectangle containing the previous stroke.
The apparatus may analyze the strokes in an incremental mode or a batch mode. In the incremental mode, the steps of determining the plurality of features and segmenting the series of strokes are repeated after each successive stroke is received. In the batch mode, these steps are repeated after a plurality of strokes has been received.
In yet another embodiment, a computer program product is provided that includes at least one computer-readable memory having computer-executable code portions stored therein. The computer-executable code portions include program code instructions for receiving a series of strokes that includes overlapping handwriting, and program code instructions for determining, for each of the plurality of strokes, a plurality of features associated with the current stroke based on geometric properties of the series of strokes. The computer-executable code portions of this embodiment also include program code instructions for segmenting the series of strokes into one or more groups of strokes based on the features associated with the strokes. As noted above, each group of strokes is a character or part of a character, but no group includes strokes from more than one character.
The computer-executable code portions of one embodiment may also include program code instructions for segmenting the series of strokes into one or more characters based on the one or more groups into which the series has been segmented, thereby increasing the efficiency of segmenting the series of strokes into characters. The computer-executable code portions of an exemplary embodiment may also include program code instructions for determining the plurality of features associated with the current stroke based solely on geometric properties of the series of strokes. In one embodiment, the computer-executable code portions may also include program code instructions for causing at least some of the groups to be displayed such that at least one group is displayed in a manner distinctly different from at least one other group. The displayed image of the overlapping handwriting can thus be simplified so that the user can more easily see, for example, the most recent strokes.
In one embodiment, the computer-executable code portions also include program code instructions for normalizing the plurality of features associated with the current stroke. In this regard, the features may be normalized based on the overall size of the series of strokes. The plurality of features determined for the current stroke may be selected from the group consisting of the end point of the current stroke, the geometric center of the current stroke, the start point of the next stroke, the geometric center of the next stroke, the smallest rectangle containing the current stroke, the smallest rectangle containing the next stroke, and the smallest rectangle containing the previous stroke.
The computer program product may analyze the strokes in an incremental mode or a batch mode. In the incremental mode, the steps of determining the plurality of features and segmenting the series of strokes are repeated after each successive stroke is received. In the batch mode, these steps are repeated after a plurality of strokes has been received.
Brief Description of the Drawings
Having described certain exemplary embodiments of the present disclosure in general terms, reference will now be made to the accompanying drawings, which are not necessarily drawn to scale, and in which:
Figure 1 is an illustration of overlapping handwriting in which four characters overlap;
Figure 2 is a block diagram of an apparatus according to an exemplary embodiment of the present invention;
Figure 3 is a functional block diagram of operations performed according to an exemplary embodiment of the present invention;
Figure 4 is a flowchart illustrating operations performed according to an exemplary embodiment of the present invention;
Figure 5 is an illustration of overlapping handwriting in which features of three consecutive strokes are identified, according to an exemplary embodiment of the present invention;
Figure 6 is an illustration of the manner in which the overlapping handwriting of Figure 1 may be processed to recognize the four characters, according to an exemplary embodiment of the present invention;
Figure 7 is a flowchart illustrating operations performed in an incremental mode according to an exemplary embodiment of the present invention; and
Figure 8 is a flowchart illustrating operations performed in a batch mode according to another exemplary embodiment of the present invention.
Detailed Description
Some embodiments of the present invention will now be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all, embodiments of the invention are shown. Indeed, various embodiments of the invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like reference numerals refer to like elements throughout. As used herein, the terms "data," "content," "information," and similar terms may be used interchangeably to refer to data capable of being transmitted, received, and/or stored in accordance with embodiments of the present invention. Thus, use of any such terms should not be taken to limit the spirit and scope of embodiments of the present invention.
Additionally, as used herein, the term "circuitry" refers to: (a) hardware-only circuit implementations (e.g., implementations in analog circuitry and/or digital circuitry); (b) combinations of circuits and computer program product(s) comprising software and/or firmware instructions stored on one or more computer-readable memories, which work together to cause an apparatus to perform one or more functions described herein; and (c) circuits, such as a microprocessor or a portion of a microprocessor, that require software or firmware for operation even if the software or firmware is not physically present. This definition of "circuitry" applies to all uses of this term herein, including in any claims. As a further example, as used herein, the term "circuitry" also includes an implementation comprising one or more processors and/or portions thereof and accompanying software and/or firmware. As another example, the term "circuitry" as used herein also includes, for example, a baseband integrated circuit or application processor integrated circuit for a mobile phone, or a similar integrated circuit in a server, a cellular network device, another network device, and/or another computing device.
As defined herein, a "computer-readable storage medium," which refers to a non-transitory, physical storage medium (e.g., a volatile or non-volatile memory device), can be differentiated from a "computer-readable transmission medium," which refers to an electromagnetic signal.
As shown in Figure 1, handwriting may be input via a user interface, such as a touch screen, in an overlapping manner in which characters are written on top of one another in succession, with no spatial or temporal gap between them. In this regard, the example of Figure 1 shows the four characters, listed separately in the lower part of the figure and designated 100, overlapping in sequence. Display 102 shows the input of the first character, display 104 shows the overlapping input of the first and second characters, display 106 shows the overlapping input of the first, second, and third characters, and display 108 shows the overlapping input of all four characters.
Overlapping handwriting allows the user to make full use of the user interface, which is particularly useful for users who provide handwritten input with their fingers on, for example, a relatively small, low-resolution touch screen. In addition, overlapping handwriting provides a relatively natural and fast technique for receiving handwritten input. As the progressively more crowded displays of Figure 1 show, relying on overlapping handwriting poses a challenge for segmenting the different characters, because each additional character overlaps the previous ones. Moreover, because many other overlapping characters are displayed at the same time, overlapping handwriting can make it quite difficult for the user to see the character currently being written or the previous character. Accordingly, exemplary embodiments of the present invention provide a technique for pre-segmenting overlapping handwriting.
Overlapped handwriting that may be pre-segmented can be received via any of a wide variety of input devices, such as a user interface, for example a touch screen. Not only may overlapped handwriting be received via a variety of different input devices, but these input devices may be embodied in, and may form part of, a variety of different types of electronic devices. For example, FIG. 2 shows a block diagram of a mobile terminal 10 that may embody an exemplary embodiment of the present invention. It should be understood, however, that the mobile terminal 10 shown and hereinafter described is merely an example of one type of device that may benefit from exemplary embodiments of the present invention. Various types of mobile terminals, such as portable digital assistants (PDAs), mobile phones, pagers, mobile televisions, gaming devices, laptop computers, cameras, video recorders, audio/video players, radios, positioning devices such as Global Positioning System (GPS) devices, or any combination of the above, as well as other types of voice and text communication systems, can readily employ exemplary embodiments of the present invention.
Mobile terminal 10 may include an antenna 12, or multiple antennas, in operable communication with a transmitter 14 and a receiver 16. Mobile terminal 10 may further include means, such as a processor 20, for providing signals to transmitter 14 and receiving signals from receiver 16. These signals include signaling information in accordance with the air interface standard of the applicable cellular system, and also include user speech, received data and/or user-generated data. In this regard, mobile terminal 10 is capable of operating in accordance with any of a number of first, second, third and/or fourth-generation communication protocols or the like. For example, mobile terminal 10 may operate in accordance with second-generation (2G) wireless communication protocols such as IS-136 (time division multiple access, TDMA), Global System for Mobile Communications (GSM) and IS-95 (code division multiple access, CDMA); third-generation (3G) wireless communication protocols such as Universal Mobile Telecommunications System (UMTS), CDMA2000, wideband CDMA (WCDMA) and time division-synchronous CDMA (TD-SCDMA); 3.9G wireless communication protocols such as Evolved UMTS Terrestrial Radio Access Network (E-UTRAN); fourth-generation (4G) wireless communication protocols; or the like. Alternatively or additionally, mobile terminal 10 is capable of operating in accordance with non-cellular communication mechanisms. For example, mobile terminal 10 is capable of communicating in a wireless local area network (WLAN) or other communication networks.
In some embodiments, processor 20 may include the circuitry needed to implement the audio and logic functions of mobile terminal 10. For example, processor 20 may include one or more digital signal processors and/or one or more microprocessors. The processor may further include one or more analog-to-digital converters, one or more digital-to-analog converters and/or other support circuits. The control and signal processing functions of mobile terminal 10 are allocated among these devices according to their respective capabilities. Processor 20 may thus also include the functionality to convolutionally encode and interleave messages and data prior to modulation and transmission. Processor 20 may additionally include an internal voice coder and may include an internal data modem. Further, processor 20 may include functionality to operate one or more software programs, which may be stored in memory. For example, processor 20 may be capable of operating a connectivity program, such as a conventional Web browser. The connectivity program may then allow mobile terminal 10 to transmit and receive Web content, such as location-based content and/or other web page content, according to, for example, the Wireless Application Protocol (WAP), the Hypertext Transfer Protocol (HTTP) and/or the like.
Mobile terminal 10 may also include a user interface comprising output devices, such as a conventional earphone or speaker 24, a ringer 22, a microphone 26 and a display 28, and input devices, such as a user input interface, all of which are coupled to processor 20. The user input interface, which allows mobile terminal 10 to receive data, may include any of a number of devices that allow the mobile terminal to receive data, such as a keypad 30, a touch screen display (represented by display 28) or another input device. In embodiments that include keypad 30, the keypad may include the conventional numeric keys (0-9) and related keys (#, *), as well as other hard and soft keys used for operating mobile terminal 10. Alternatively, keypad 30 may include a conventional QWERTY keypad arrangement. Keypad 30 may also include various soft keys with associated functions. In addition, or alternatively, mobile terminal 10 may include an interface device such as a joystick or other user input interface. Mobile terminal 10 may further include a battery 34, such as a vibrating battery pack, for powering the various circuits required to operate mobile terminal 10 and, optionally, for providing mechanical vibration as a detectable output.
As noted above, the user input interface may include touch screen display 28, which may be embodied as any known touch screen display. Thus, for example, touch screen display 28 may be configured to enable touch recognition by any suitable technique, such as resistive, capacitive, infrared, strain gauge, surface wave, optical imaging, dispersive signal technology or acoustic pulse recognition. Touch screen display 28 may be configured to receive indications of user input and to provide a representation of that input to processor 20.
Mobile terminal 10 may further include a user identity module (UIM) 38. UIM 38 is typically a memory device having a built-in processor. UIM 38 may include, for example, a subscriber identity module (SIM), a universal integrated circuit card (UICC), a universal subscriber identity module (USIM), a removable user identity module (R-UIM) or the like. UIM 38 typically stores information elements related to a mobile subscriber. In addition to UIM 38, mobile terminal 10 may be equipped with memory. For example, mobile terminal 10 may include volatile memory 40, such as volatile random access memory (RAM) including a cache area for the temporary storage of data. Mobile terminal 10 may also include other non-volatile memory 42, which may be embedded and/or removable. The memories may store any of a number of pieces of information and data used by mobile terminal 10 to implement its functions. For example, the memories may include an identifier, such as an International Mobile Equipment Identity (IMEI) code, capable of uniquely identifying mobile terminal 10. Furthermore, the memories may store instructions for determining cell id information. Specifically, the memories may store an application program for execution by processor 20 that determines the identity of the current cell with which mobile terminal 10 is in communication, such as a cell id identity or cell id information.
Regardless of the type of input device and the type of electronic device that includes it, overlapped handwriting input may be analyzed according to the exemplary embodiment shown in FIG. 3. In this embodiment, overlapped handwriting may be received via a touch screen 110, such as touch screen display 28 of FIG. 2. Each stroke of the handwriting input may then undergo feature extraction 112, in which one or more features associated with the stroke are determined as described below. A determination is then made, for example by a classifier 114, as to whether the current stroke is part of the same stroke group as the previous stroke or is part of another stroke group. Although various classifiers 114 may be used to analyze the features associated with the current stroke and determine the group to which it belongs, commonly used statistical classifiers, such as a support vector machine or an artificial neural network, may be employed.
As shown in FIG. 3, classifier 114 analyzes the current stroke using segmentation rules 116. Segmentation rules 116 may be determined in various ways. In the illustrated embodiment, however, a handwriting database 118 is provided that includes a plurality of different series of handwritten strokes. Each series of strokes in handwriting database 118 may undergo feature extraction 120, in which the series is analyzed to identify features associated with the series and with the individual strokes that make it up. In one embodiment, the features that have been extracted and associated with the strokes of handwriting database 118 undergo training 122 in order to refine the segmentation rules. In this regard, handwriting database 118 may contain many samples of different words and/or sentences written by different people. These words and/or sentences may be manually separated into characters. Based on this separation into characters, each stroke may be classified either as the last stroke of a character or as a stroke other than the last stroke of a character. Training 122 may therefore learn from the strokes in handwriting database 118 to establish segmentation rules that define the parameters to be considered when determining whether or not a stroke is the last stroke of a character. As described below, a stroke may be represented by a feature vector based on various parameters of the stroke, and the value of the feature vector may then be determined by classifier 114 based on segmentation rules 116. The classifier may then compare the value of the feature vector representing the stroke with a predefined threshold to determine whether or not the stroke is the last stroke of a character. Training 122 may thus be used to determine the set of parameters that provides the greatest success rate in correctly classifying all of the various strokes in handwriting database 118. Based on training 122, segmentation rules 116 are defined, which in turn define the set of parameters that classifier 114 may use to classify strokes on the basis of the different series of strokes stored in handwriting database 118, for example according to the features associated with each series of strokes and/or with the individual strokes making up each series. By comparing the value of the feature vector, which is based on the features extracted from the current stroke, with the predefined threshold, classifier 114 can determine whether the current stroke is part of the same stroke group as the previous stroke or starts a different stroke group.
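The labeling step that feeds training 122 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the manually separated handwriting samples are available as a list of characters, each character being a list of strokes; the function name `label_last_strokes` and this data layout are hypothetical and do not come from the patent text.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def label_last_strokes(characters: Sequence[Sequence[Stroke]]) -> List[int]:
    """Return one label per stroke, in writing order.

    1 means the stroke is the last stroke of its character,
    0 means it is any other stroke of the character.
    """
    labels: List[int] = []
    for strokes in characters:
        for index in range(len(strokes)):
            labels.append(1 if index == len(strokes) - 1 else 0)
    return labels


# Two manually separated characters: three strokes, then two strokes.
sample = [
    [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 1.0), (1.0, 1.0)], [(0.0, 2.0), (1.0, 2.0)]],
    [[(0.0, 0.0), (0.0, 2.0)], [(1.0, 0.0), (1.0, 2.0)]],
]
labels = label_last_strokes(sample)
```

Labels of this kind, paired with the feature vector of each stroke, are the sort of input a statistical classifier such as a support vector machine could be trained on.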
Because the goal is to segment the strokes into complete characters, an error may occur if a segmented group includes only part of a character or if a segmented group includes strokes from more than one character. The method and apparatus of one exemplary embodiment can more readily accommodate and correct errors associated with a segmented group that includes only part of a character, because segmented groups may be combined following the pre-segmentation. In one embodiment, therefore, the predefined threshold may be adjusted, for example by being increased, so as to reduce the likelihood that a segmented group will include strokes from more than one character.
In the exemplary embodiment of FIG. 3, the operations above the dashed line constitute a training phase 126. These operations may therefore be performed in advance and need not be repeated when overlapped handwriting input is received by touch screen 110. The operations below the dashed line, by contrast, constitute an execution phase 124 and are performed upon or after receipt of overlapped handwriting input, such as after one or more strokes have been entered on touch screen 110.
By way of further explanation, reference is now made to FIG. 4, which shows the operations performed by an apparatus according to an exemplary embodiment of the present invention. The apparatus may be employed, for example, on mobile terminal 10. Alternatively, however, the apparatus may be embodied on a variety of other devices, both mobile and fixed, such as any of the devices listed above. The apparatus may include means, such as processor 20 or a user input interface (for example touch screen display 28), for receiving a series of strokes. See operation 130 of FIG. 4. As described above, the received strokes constitute overlapped handwriting, in which a plurality of characters are written successively one after another. For each stroke, a plurality of features may be determined. See operation 132. In this regard, the apparatus may include means, such as processor 20, for determining a plurality of features associated with each stroke. Various features may be determined according to exemplary embodiments of the present invention. In one embodiment, however, the features determined for the current stroke include features associated with the current stroke as well as features associated with the previous and next strokes. In this regard, FIG. 5 shows a plurality of overlapping handwritten characters including a current stroke 200, a previous stroke 210 and a next stroke 220.
By way of example, the features determined for the current stroke may include the end point of the current stroke, the geometric center of the current stroke, the start point of the next stroke, the geometric center of the next stroke, the smallest rectangle containing the current stroke, the smallest rectangle containing the next stroke and the smallest rectangle containing the previous stroke. Features that define a particular point or position are typically defined by a pair of coordinates, such as x, y coordinates. Similarly, for features defined by a rectangle or other two-dimensional shape, each shape, such as each rectangle, may be defined by four features, such as the coordinates of the left and right sides of the shape and the coordinates of its top and bottom. Referring to FIG. 5, according to an exemplary embodiment, the features extracted from the current stroke may include the x, y coordinates (endX, endY) of the end point of the current stroke indicated at 204, the geometric center (currentGCX, currentGCY) of the current stroke indicated at 206, the start point (startX, startY) of the next stroke indicated at 222, the geometric center (nextGCX, nextGCY) of the next stroke indicated at 226, and the smallest rectangles containing the current stroke, the next stroke and the previous stroke. Each rectangle may be defined by four features, namely the coordinates of its left side, right side, top and bottom. In FIG. 5, the left, right, top and bottom coordinates of the smallest rectangles associated with the current stroke, the next stroke and the previous stroke are labeled with the prefixes current, next and previous, respectively. For example, the features associated with the smallest rectangle of the current stroke are labeled currentLeft, currentRight, currentTop and currentBottom in FIG. 5. Other points labeled in FIG. 5 for orientation include the start point of the current stroke indicated at 202, the end point of the next stroke indicated at 224, and the start and end points of the previous stroke indicated at 212 and 214, respectively, although these other points are not extracted as features in this exemplary embodiment.
Although the geometric center of a stroke may be defined in various ways, according to one exemplary embodiment the geometric center of a stroke is defined as the average of all points in the stroke. For example, a stroke comprising the points (x_i, y_i), i = 0 … n-1, has a geometric center (GCX, GCY) defined as follows:
GCX = sum(x_i)/n, GCY = sum(y_i)/n, for i = 0 … n-1
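The features listed for FIG. 5, together with the geometric center defined as the mean point, can be sketched as follows. This is only an illustrative reading: the representation of a stroke as a list of (x, y) points, the helper names, and the assumption of screen coordinates (y growing downward, so "top" is the smallest y) are choices made for the example, not details stated in the text.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]  # points in the order they were drawn


def geometric_center(stroke: Stroke) -> Point:
    """Mean of all points in the stroke: (GCX, GCY)."""
    n = len(stroke)
    return sum(x for x, _ in stroke) / n, sum(y for _, y in stroke) / n


def bounding_box(stroke: Stroke) -> Tuple[float, float, float, float]:
    """Smallest axis-aligned rectangle containing the stroke.

    Returned as (left, right, top, bottom). Assumes screen coordinates,
    where y grows downward, so top is the minimum y.
    """
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    return min(xs), max(xs), min(ys), max(ys)


def raw_features(previous: Stroke, current: Stroke, following: Stroke) -> Dict[str, float]:
    """Collect the un-normalized features named in FIG. 5 for the current stroke."""
    features: Dict[str, float] = {}
    features["endX"], features["endY"] = current[-1]
    features["currentGCX"], features["currentGCY"] = geometric_center(current)
    features["startX"], features["startY"] = following[0]
    features["nextGCX"], features["nextGCY"] = geometric_center(following)
    for prefix, stroke in (("current", current), ("next", following), ("previous", previous)):
        left, right, top, bottom = bounding_box(stroke)
        features[prefix + "Left"] = left
        features[prefix + "Right"] = right
        features[prefix + "Top"] = top
        features[prefix + "Bottom"] = bottom
    return features


features = raw_features(
    previous=[(0.0, 0.0), (2.0, 2.0)],
    current=[(1.0, 1.0), (3.0, 1.0)],
    following=[(2.0, 0.0), (2.0, 4.0)],
)
```

The result holds twenty values: four point features of two coordinates each, plus three rectangles of four coordinates each.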
The overlapping characters, and the strokes that make them up, may differ in size, since a character may be written smaller or larger while still conveying the same meaning. The apparatus may therefore include means, such as processor 20, for normalizing the features extracted from the current stroke so as to account for, for example by removing, any effect of the same stroke being written at different sizes. See operation 134 of FIG. 4. In one embodiment, the features are normalized based on the overall size of the series of strokes. In this regard, the smallest rectangle containing all of the handwritten strokes may be defined as shown in FIG. 5. This rectangle may be defined by the coordinates of the top, bottom, left and right of the overlapping handwritten characters. In an exemplary embodiment in which the smallest rectangle containing the whole series of strokes is positioned with one corner at the origin of the coordinate system and with its sides extending along the coordinate axes, the rectangle may be expressed in simplified form by its width, such as totalWidth, and its height, such as totalHeight.
In one example, the features described above may be normalized as follows, where the prefixes Current, Next and Pre denote features associated with the current stroke, the next stroke and the previous stroke, respectively.
Thus, CurrentStrokeEndX and CurrentStrokeEndY are the normalized x and y coordinates of the end point of the current stroke. NextStrokeStartX and NextStrokeStartY are the normalized x and y coordinates of the start point of the next stroke. CurrentGCX and CurrentGCY are the normalized x and y coordinates of the geometric center of the current stroke. NextGCX and NextGCY are the normalized x and y coordinates of the geometric center of the next stroke. CurrentLeft, CurrentRight, CurrentTop and CurrentBottom are the left, right, top and bottom coordinates of the current stroke, respectively. NextLeft, NextRight, NextTop and NextBottom are the left, right, top and bottom coordinates of the next stroke, respectively. Finally, PreLeft, PreRight, PreTop and PreBottom are the left, right, top and bottom coordinates of the previous stroke, respectively.
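The normalization formulas themselves are not reproduced in this text, so the following is only one plausible reading of "normalized based on the overall size of the series of strokes". It assumes that each x coordinate is shifted to the left edge of the rectangle enclosing all strokes and divided by totalWidth, and each y coordinate is shifted to the top edge and divided by totalHeight. The function name and the handling of a zero-width or zero-height area are assumptions made for the sketch.

```python
from typing import Tuple


def normalize_point(
    x: float,
    y: float,
    total_left: float,
    total_top: float,
    total_width: float,
    total_height: float,
) -> Tuple[float, float]:
    """Map a point into the unit square spanned by the overall writing area.

    Assumed scheme: subtract the area's left/top edge, then divide by its
    width/height. A degenerate (zero-sized) dimension maps to 0.0.
    """
    nx = (x - total_left) / total_width if total_width > 0 else 0.0
    ny = (y - total_top) / total_height if total_height > 0 else 0.0
    return nx, ny


# A point in the middle of a 100 x 50 writing area whose corner is at (10, 20).
center = normalize_point(60.0, 45.0, 10.0, 20.0, 100.0, 50.0)

# The same relative position in an area drawn twice as large gives the same result.
center_scaled = normalize_point(120.0, 90.0, 20.0, 40.0, 200.0, 100.0)
```

Under this assumed scheme, the same character written at two different sizes yields the same normalized feature values, which is the stated purpose of operation 134.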
In one embodiment, the features extracted from the current stroke may then be combined, for example by processor 20, into a feature vector. Where the current stroke being analyzed is the initial stroke, there is no previous stroke. In that case, the features associated with the previous stroke may be set to a predefined value, for example -1.
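Assembling the feature vector, including the placeholder for a missing previous stroke, might look like the sketch below. The feature names are those used in the description; their ordering within the vector, the constant name `MISSING` and the function name are assumptions made for illustration.

```python
from typing import Dict, List

MISSING = -1.0  # predefined value used when there is no previous stroke

# Assumed ordering; the text names these features but does not fix an order.
FEATURE_ORDER = [
    "CurrentStrokeEndX", "CurrentStrokeEndY",
    "NextStrokeStartX", "NextStrokeStartY",
    "CurrentGCX", "CurrentGCY",
    "NextGCX", "NextGCY",
    "CurrentLeft", "CurrentRight", "CurrentTop", "CurrentBottom",
    "NextLeft", "NextRight", "NextTop", "NextBottom",
    "PreLeft", "PreRight", "PreTop", "PreBottom",
]


def to_feature_vector(features: Dict[str, float], has_previous: bool) -> List[float]:
    """Flatten named features into a fixed-length vector.

    When the current stroke is the initial stroke, the four Pre* entries
    are filled with MISSING instead of being read from `features`.
    """
    vector: List[float] = []
    for name in FEATURE_ORDER:
        if name.startswith("Pre") and not has_previous:
            vector.append(MISSING)
        else:
            vector.append(features[name])
    return vector


# Initial stroke: only Current* and Next* features are available.
initial = {name: 0.5 for name in FEATURE_ORDER if not name.startswith("Pre")}
vector = to_feature_vector(initial, has_previous=False)
```

Keeping the vector length fixed means the same classifier can be applied to the initial stroke and to every later stroke.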
Once the features have been determined for the current stroke and, in one example, normalized, the series of strokes may be segmented into one or more groups of strokes based on the features associated with the strokes. See operation 136 of FIG. 4. The apparatus may therefore include means, such as processor 20, for segmenting the series of strokes into one or more groups of strokes based on these features. As described above, the strokes may be segmented into one or more groups by classifier 114 according to the technique described in more detail below. The apparatus may also include means, such as processor 20, for segmenting the series of strokes into one or more characters based on the one or more groups into which the series has been segmented. See operation 138. The segmentation into characters based on the groups may be performed in various ways, including by use of segmentation rules 116, which define the parameters used by classifier 114 to evaluate the feature vector of a stroke. Processor 20 of an exemplary embodiment may thus include or implement feature extraction 112, classifier 114 and segmentation rules 116, as described above in connection with FIG. 3. By first defining groups of strokes and then defining characters based on those groups, the segmentation of strokes into characters can be performed in a more computationally efficient manner than when multiple overlapping characters are segmented without any intermediate grouping of strokes.
Referring to FIG. 6, a plurality of overlapping characters is shown at 230. By segmenting the series of strokes into stroke groups, a plurality of groups may be defined as shown at 232. The series of strokes may then be segmented into a plurality of characters based on the groups, as shown at 234. The resulting characters may then be recognized, for example by pattern recognition, as shown at 236, so that the overlapping handwritten characters can be efficiently segmented and recognized as a series of characters. In the exemplary embodiment of FIG. 6, in order to segment the strokes into characters based on the groups, a plurality of possible combinations of the groups, for example all possible combinations, may be identified, as shown at 234 and 236. The characters represented by the different combinations of groups may be recognized by a handwriting recognition engine, which may be embodied by processor 20 or by another computing device in communication with the processor. In one embodiment, the handwriting recognition engine may also determine the similarity between each combination of groups and the current character set. Each character set represented by a combination of groups may then be analyzed, for example by a language model embodied by processor 20 or by another computing device in communication with the processor, to determine whether the character set is meaningful. The apparatus (such as processor 20), the handwriting recognition engine and/or the language model may then assign a score to each possible character set based on a combination of the measure of similarity determined by the handwriting recognition engine, the measure of whether the character set is meaningful as determined by the language model and, in some embodiments, a measure of predefined geometric properties of the groups. In this regard, each group may be analyzed, for example by processor 20, to determine whether it satisfies one or more predefined geometric properties. For example, one geometric property may relate to the size of a group, with a group smaller than a predefined threshold being considered too small to be a complete character. As another example of a geometric property, a group located along an edge of the writing area, such as along the leftmost or rightmost portion of touch screen display 28, may not be considered a complete character. The character set determined, for example by processor 20, to have the highest score based on one or more of similarity, meaningfulness and geometric properties may be identified as the character set that best represents the overlapping characters, as shown at 238.
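The scoring of candidate character sets can be sketched as below. The text says only that the score is based on a combination of the similarity, meaningfulness and geometric measures; the weighted sum used here, the weights, the `Candidate` structure and the numeric values are all assumptions made for illustration, and the three measures are taken as given rather than computed by a real recognition engine or language model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    text: str              # character set represented by one combination of groups
    similarity: float      # measure from a handwriting recognition engine (given)
    meaningfulness: float  # measure from a language model (given)
    geometry: float        # measure of the predefined geometric properties (given)


def score(candidate: Candidate, weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Combine the three measures; a weighted sum is assumed here."""
    w_sim, w_lang, w_geo = weights
    return (
        w_sim * candidate.similarity
        + w_lang * candidate.meaningfulness
        + w_geo * candidate.geometry
    )


def best_candidate(candidates: List[Candidate]) -> Candidate:
    """Return the candidate character set with the highest combined score."""
    return max(candidates, key=score)


candidates = [
    Candidate(text="candidate A", similarity=0.5, meaningfulness=0.25, geometry=1.0),
    Candidate(text="candidate B", similarity=0.75, meaningfulness=0.75, geometry=1.0),
]
winner = best_candidate(candidates)
```

Other combinations, such as a product of the measures, would fit the description equally well; the weighted sum is chosen only because it is simple to read.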
In one embodiment, one or more constraints may be imposed on the process of identifying the possible combinations of groups in order to make identifying, and subsequently processing, those combinations more efficient. For example, the possible combinations may be limited by restricting any potential combination, or character, to a maximum of four groups.
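The effect of the four-group limit on the number of combinations can be seen in a short enumeration sketch. It assumes that a character is always formed from groups that are consecutive in writing order, which the text does not state explicitly; the function name and the half-open (start, end) representation are likewise illustrative.

```python
from typing import Iterator, List, Tuple


def group_combinations(
    num_groups: int, max_groups_per_char: int = 4
) -> Iterator[List[Tuple[int, int]]]:
    """Yield every split of groups 0..num_groups-1 into candidate characters.

    Each candidate character is a run of consecutive groups, written as a
    half-open index range (start, end), and contains at most
    max_groups_per_char groups.
    """

    def extend(start: int, prefix: List[Tuple[int, int]]) -> Iterator[List[Tuple[int, int]]]:
        if start == num_groups:
            yield list(prefix)
            return
        largest = min(max_groups_per_char, num_groups - start)
        for size in range(1, largest + 1):
            prefix.append((start, start + size))
            yield from extend(start + size, prefix)
            prefix.pop()

    yield from extend(0, [])


three = list(group_combinations(3))
five = list(group_combinations(5))
```

With five groups there are sixteen ways to split them into consecutive runs, and the limit removes only the single split that merges all five into one character; the savings grow quickly as the number of groups increases.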
In addition to increasing the efficiency with which multiple overlapping characters can be segmented, segmenting a series of strokes into one or more groups of strokes may also facilitate the display of handwritten input. In this regard, the apparatus may also include means, such as the processor 20, the display 28 or the like, for causing at least some of the groups to be displayed such that at least one group is displayed in a manner that is distinct from at least one other group. See operation 140 of FIG. 4. For example, only a subset of the groups may be displayed, so that at least one group is not displayed. In this regard, a relatively small number of the most recent groups may be displayed while all earlier groups are removed from the display 28. The display 28 is thus less cluttered, and the user can more easily view the stroke currently being entered and the strokes immediately preceding it. In other embodiments, the groups of strokes may be displayed in a manner that clearly distinguishes one group from another. For example, the groups may be displayed in different colors and/or brightness levels, such as colors or brightness levels that vary with the order in which the strokes were received. In one exemplary embodiment, the most recent stroke group may be displayed in the darkest color (and/or brightness), the immediately preceding stroke group in a slightly lighter color (and/or brightness), and so on, up to the initial stroke group, which is displayed in the lightest color (and/or brightness). Alternatively, different stroke groups may be represented by different line types, for example a solid line for the most recent stroke group, a dot-dash line for the immediately preceding stroke group, and so on. In each embodiment, the stroke groups are displayed such that different groups are distinct from one another and, in some embodiments, such that more recent stroke groups are more prominent.
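The two display strategies described above, showing only the most recent groups and shading groups by recency, might be sketched as follows. The helper names `recent_groups` and `gray_levels`, the 0 to 200 gray range and the linear interpolation are all assumptions made for illustration; the patent does not specify how the colors or brightness levels are computed.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def recent_groups(groups: Sequence[T], keep: int) -> List[T]:
    """Keep only the `keep` most recent groups (the last entries) for display."""
    return list(groups[max(0, len(groups) - keep):])


def gray_levels(num_groups: int, darkest: int = 0, lightest: int = 200) -> List[int]:
    """Assign one gray level per group, oldest first.

    The newest group (last index) gets `darkest`; the initial group
    (index 0) gets `lightest`; groups in between are spaced evenly.
    """
    if num_groups <= 0:
        return []
    if num_groups == 1:
        return [darkest]
    step = (lightest - darkest) / (num_groups - 1)
    return [round(lightest - index * step) for index in range(num_groups)]


print(gray_levels(3))                       # [200, 100, 0]
print(recent_groups(["a", "b", "c"], 2))    # ['b', 'c']
```

A renderer could combine the two, first trimming the history with `recent_groups` and then shading what remains, so that the stroke being written always stands out.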
In one embodiment, the features associated with a stroke may be determined, and the stroke segmented into the current group, after each stroke is input, for example as shown in FIG. 7. In the following description of FIG. 7, the current stroke and the next stroke (as discussed above) are denoted stroke k-1 and stroke k, respectively. Referring to operations 250, 252 and 254 of FIG. 7, after stroke k is input, it is determined whether the stroke is the initial stroke, that is, whether k=0. In an instance in which the stroke that has been input is the initial stroke, the actual writing area may be initialized. See operation 256. In this regard, the actual writing area may be initialized to the smallest rectangle enclosing the initial stroke, and may be defined by the total width and total height of a rectangle oriented and positioned relative to the coordinate system as shown in FIG. 5, or by the top, bottom, left and right coordinates of a rectangle having another orientation and position. Thereafter, the counter k may be incremented before awaiting input of the next stroke. See operation 258. After the next stroke is input, the actual writing area may be recalculated so that it represents the smallest rectangle enclosing every stroke. See operation 260. Thereafter, the features of the current stroke k-1 may be determined. See operation 262. The classifier 114 may then provide a value S_pre based on the feature vector of stroke k-1 and on the segmentation rules 116, which define the parameters that the classifier uses to evaluate the feature vector of a stroke. See operation 264 of FIG. 7. The value S_pre provided by the classifier 114 may then be compared with a predefined classification threshold T_pre. See operation 266. If the value S_pre exceeds the threshold T_pre, stroke k-1 and stroke k are considered to belong to different groups. See operation 268. Conversely, if the value S_pre is not greater than the threshold T_pre, stroke k-1 and stroke k are considered to belong to the same group. See operation 270. The process described above and shown in FIG. 7 may be repeated incrementally for each stroke that is input, so that the series of strokes is appropriately segmented into groups.
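The incremental flow of FIG. 7 can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: the classifier 114 and the segmentation rules 116 are replaced by an arbitrary scoring callable, and the toy scorer `leftward_jump` (how far the pen jumps back to the left between strokes, relative to the writing-area width) is a hypothetical stand-in for the feature vector, which this passage does not define. The class and function names are likewise invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Stroke = Sequence[Point]
Box = Tuple[float, float, float, float]  # left, top, right, bottom


def bounding_box(stroke: Stroke) -> Box:
    """Smallest axis-aligned rectangle enclosing one stroke."""
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    return (min(xs), min(ys), max(xs), max(ys))


def union(a: Box, b: Box) -> Box:
    """Smallest rectangle enclosing both boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def leftward_jump(previous: Stroke, current: Stroke, area: Box) -> float:
    """Toy score (assumption): leftward pen jump normalized by area width."""
    width = (area[2] - area[0]) or 1.0
    return (previous[-1][0] - current[0][0]) / width


@dataclass
class IncrementalSegmenter:
    score: Callable[[Stroke, Stroke, Box], float]  # stands in for classifier 114
    threshold: float                               # plays the role of T_pre
    writing_area: Optional[Box] = None
    groups: List[List[Stroke]] = field(default_factory=list)
    previous: Optional[Stroke] = None

    def add_stroke(self, stroke: Stroke) -> None:
        box = bounding_box(stroke)
        if self.previous is None:
            # Initial stroke: initialize the writing area and the first group.
            self.writing_area = box
            self.groups.append([stroke])
        else:
            # Later strokes: grow the writing area, then score the boundary.
            self.writing_area = union(self.writing_area, box)
            s_pre = self.score(self.previous, stroke, self.writing_area)
            if s_pre > self.threshold:
                self.groups.append([stroke])    # different groups
            else:
                self.groups[-1].append(stroke)  # same group
        self.previous = stroke


strokes = [
    [(0.0, 0.0), (10.0, 0.0)],
    [(9.0, 1.0), (9.0, 10.0)],
    [(0.0, 0.0), (0.0, 10.0)],  # pen returns to the left edge
]
segmenter = IncrementalSegmenter(score=leftward_jump, threshold=0.5)
for stroke in strokes:
    segmenter.add_stroke(stroke)
print([len(group) for group in segmenter.groups])  # [2, 1]
```

Because the writing area is updated before each score is computed, any feature normalized by that area reflects only the strokes seen so far, which is the main practical difference from the batch variant.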
As an alternative to analyzing each stroke individually after it is input in order to group the strokes appropriately, a plurality of strokes may be analyzed in a batch process, for example as shown in FIG. 8. In this regard, the actual writing area may be determined, such as the smallest rectangle enclosing the plurality of strokes 0, 1, ..., M-1. See operation 280. A counter k may then be initialized, such as to k=1, and it may be determined whether all strokes of the batch have been considered, for example by comparing the counter with the number of strokes M. See operations 282 and 284. In an instance in which not all strokes have yet been considered, the features of stroke k-1 are determined, and a value S_pre for stroke k-1 may then be determined based on those features, for example by the classifier 114. See operations 286 and 288. As before, the value S_pre may be compared with the threshold T_pre: stroke k-1 and stroke k belong to different groups if S_pre is greater than the classification threshold T_pre, and to the same group otherwise. See operations 290, 292 and 294. The process may be repeated for each stroke of the batch, as indicated by the incrementing of counter k in operation 296, until every stroke has been considered and appropriately assigned to a group.
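A batch version of the same loop might look like the following sketch. As before, the scoring callable is only a placeholder for the classifier 114, and `leftward_jump` is a hypothetical toy feature rather than anything specified in the patent. The point of the example is the control flow: the writing area is computed once over all M strokes, and k then runs from 1 to M-1.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Stroke = Sequence[Point]
Box = Tuple[float, float, float, float]  # left, top, right, bottom


def leftward_jump(previous: Stroke, current: Stroke, area: Box) -> float:
    """Toy score (assumption): leftward pen jump normalized by area width."""
    width = (area[2] - area[0]) or 1.0
    return (previous[-1][0] - current[0][0]) / width


def segment_batch(
    strokes: Sequence[Stroke],
    score: Callable[[Stroke, Stroke, Box], float],
    threshold: float,
) -> List[List[int]]:
    """Split strokes 0..M-1 into groups of stroke indices in one pass."""
    if not strokes:
        return []
    # The writing area is the smallest rectangle enclosing every stroke.
    xs = [x for stroke in strokes for x, _ in stroke]
    ys = [y for stroke in strokes for _, y in stroke]
    area = (min(xs), min(ys), max(xs), max(ys))

    groups = [[0]]
    for k in range(1, len(strokes)):  # counter k starts at 1 and stops at M
        s_pre = score(strokes[k - 1], strokes[k], area)
        if s_pre > threshold:
            groups.append([k])        # strokes k-1 and k: different groups
        else:
            groups[-1].append(k)      # strokes k-1 and k: same group
    return groups


strokes = [
    [(0.0, 0.0), (10.0, 0.0)],
    [(9.0, 1.0), (9.0, 10.0)],
    [(0.0, 0.0), (0.0, 10.0)],
]
print(segment_batch(strokes, leftward_jump, threshold=0.5))  # [[0, 1], [2]]
```

Returning indices rather than stroke objects keeps the result easy to compare against the incremental variant on the same input.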
FIGS. 4, 7 and 8 are flowcharts of methods and program products according to exemplary embodiments of the invention. It will be understood that each block of the flowcharts, and combinations of blocks in the flowcharts, may be implemented by various means, such as hardware, firmware, a processor, circuitry and/or other devices associated with the execution of software including one or more computer program instructions. For example, one or more of the procedures described above may be embodied by computer program instructions. In this regard, the computer program instructions that embody the procedures described above may be stored by a memory device of the mobile terminal 10 and executed by the processor 20 in the mobile terminal. It will be understood that any such computer program instructions may be loaded onto a computer or other programmable apparatus (e.g., hardware) to produce a machine, such that the instructions which execute on the computer or other programmable apparatus create means for implementing the functions specified in the flowchart block(s). These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the functions specified in the flowchart block(s). The computer program instructions may also be loaded onto a computer or other programmable apparatus to cause a series of operations to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus implement the functions specified in the flowchart block(s).
Accordingly, blocks of the flowcharts support combinations of means for performing the specified functions, combinations of operations for performing the specified functions, and program instructions for performing the specified functions. It will also be understood that one or more blocks of the flowcharts, and combinations of blocks in the flowcharts, may be implemented by special-purpose hardware-based computer systems that perform the specified functions, or by combinations of special-purpose hardware and computer instructions.
Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the particular embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Moreover, although the foregoing descriptions and the associated drawings describe exemplary embodiments in the context of certain exemplary combinations of elements and/or functions, it should be appreciated that different combinations of elements and/or functions may be provided by alternative embodiments without departing from the scope of the appended claims. In this regard, for example, combinations of elements and/or functions other than those explicitly described above are also contemplated and may be set forth in some of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.
Claims (27)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2010/076285 WO2012024829A1 (en) | 2010-08-24 | 2010-08-24 | Method and apparatus for segmenting strokes of overlapped handwriting into one or more groups |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103080878A | 2013-05-01 |
CN103080878B | 2017-03-29 |
Family
ID=45722811
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201080068735.8A (Active; granted as CN103080878B) | 2010-08-24 | 2010-08-24 | Method and apparatus for segmenting strokes of overlapped handwriting into one or more groups |
Country Status (4)
Country | Link |
---|---|
JP (1) | JP5581448B2 (en) |
KR (1) | KR101486174B1 (en) |
CN (1) | CN103080878B (en) |
WO (1) | WO2012024829A1 (en) |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103376998B (en) * | 2012-04-19 | 2016-06-15 | 中兴通讯股份有限公司 | Handwriting equipment Chinese character type-setting method and device |
JP2013246732A (en) * | 2012-05-28 | 2013-12-09 | Toshiba Corp | Handwritten character retrieval apparatus, method and program |
JP5717691B2 (en) * | 2012-05-28 | 2015-05-13 | 株式会社東芝 | Handwritten character search device, method and program |
WO2013178867A1 (en) * | 2012-05-31 | 2013-12-05 | Multitouch Oy | User interface for drawing with electronic devices |
CN105283882B (en) * | 2013-04-12 | 2019-12-27 | 诺基亚技术有限公司 | Apparatus for text input and associated method |
KR102121487B1 (en) * | 2013-06-09 | 2020-06-11 | 애플 인크. | Managing real-time handwriting recognition |
US9898187B2 (en) | 2013-06-09 | 2018-02-20 | Apple Inc. | Managing real-time handwriting recognition |
CN103345365B (en) * | 2013-07-12 | 2016-04-13 | 北京蒙恬科技有限公司 | The display packing of continuous handwriting input and the hand input device of employing the method |
KR102125212B1 (en) * | 2013-08-29 | 2020-07-08 | 삼성전자 주식회사 | Operating Method for Electronic Handwriting and Electronic Device supporting the same |
JP2015099566A (en) * | 2013-11-20 | 2015-05-28 | 株式会社東芝 | Feature calculation device, method and program |
US9224038B2 (en) | 2013-12-16 | 2015-12-29 | Google Inc. | Partial overlap and delayed stroke input recognition |
US9881224B2 (en) | 2013-12-17 | 2018-01-30 | Microsoft Technology Licensing, Llc | User interface for overlapping handwritten text input |
US9524440B2 (en) * | 2014-04-04 | 2016-12-20 | Myscript | System and method for superimposed handwriting recognition technology |
CN105095924A (en) * | 2014-04-25 | 2015-11-25 | 夏普株式会社 | Handwriting recognition method and device |
EP3295292B1 (en) * | 2015-05-15 | 2020-09-02 | MyScript | System and method for superimposed handwriting recognition technology |
DK179374B1 (en) | 2016-06-12 | 2018-05-28 | Apple Inc | Handwriting keyboard for monitors |
JP7071840B2 (en) * | 2017-02-28 | 2022-05-19 | コニカ ミノルタ ラボラトリー ユー.エス.エー.,インコーポレイテッド | Estimating character stroke information in the image |
CN108492349B (en) * | 2018-03-19 | 2023-04-11 | 广州视源电子科技股份有限公司 | Processing method, device and equipment for writing strokes and storage medium |
CN110503101A (en) * | 2019-08-23 | 2019-11-26 | 北大方正集团有限公司 | Font evaluation method, apparatus, device and computer-readable storage medium |
Family Cites Families (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS60254384A (en) * | 1984-05-31 | 1985-12-16 | Fujitsu Ltd | Stroke correspondence method |
JPH05233599A (en) * | 1992-02-18 | 1993-09-10 | Seiko Epson Corp | Online character recognizing device |
JPH06162266A (en) * | 1992-11-20 | 1994-06-10 | Seiko Epson Corp | Method and apparatus for on-line handwritten character recognition |
JPH0792817B2 (en) * | 1994-08-19 | 1995-10-09 | 沖電気工業株式会社 | Online character recognizer |
JPH08161426A (en) * | 1994-12-09 | 1996-06-21 | Sharp Corp | Handwritten character stroke segmenting device |
JP2939147B2 (en) * | 1994-12-29 | 1999-08-25 | シャープ株式会社 | Handwritten character input display device and method |
JPH09161011A (en) * | 1995-12-13 | 1997-06-20 | Matsushita Electric Ind Co Ltd | Handwritten character input device |
US5970498A (en) * | 1996-12-06 | 1999-10-19 | International Business Machines Corporation | Object oriented framework mechanism for metering objects |
JP3216800B2 (en) * | 1997-08-22 | 2001-10-09 | 日立ソフトウエアエンジニアリング株式会社 | Handwritten character recognition method |
JP3024680B2 (en) * | 1998-01-13 | 2000-03-21 | 日本電気株式会社 | Handwritten pattern storage and retrieval device |
JP3456931B2 (en) * | 1999-12-10 | 2003-10-14 | シャープ株式会社 | Handwritten character recognition device, computer readable recording medium storing a handwritten character recognition program, and method for correcting characters recognized by handwritten characters |
JP3974359B2 (en) * | 2000-10-31 | 2007-09-12 | 株式会社東芝 | Online character recognition apparatus and method, computer-readable storage medium, and online character recognition program |
US7324691B2 (en) * | 2003-09-24 | 2008-01-29 | Microsoft Corporation | System and method for shape recognition of hand-drawn objects |
JP2005141329A (en) * | 2003-11-04 | 2005-06-02 | Toshiba Corp | Device and method for recognizing handwritten character |
KR100677426B1 (en) * | 2005-01-14 | 2007-02-02 | 엘지전자 주식회사 | How to Display Text Messages on Mobile Terminals |
JP2009289188A (en) * | 2008-05-30 | 2009-12-10 | Nec Corp | Character input device, character input method and character input program |
CN101299236B (en) * | 2008-06-25 | 2010-06-09 | 华南理工大学 | A Chinese Handwritten Phrase Recognition Method |
2010
- 2010-08-24 JP JP2013525107A patent/JP5581448B2/en not_active Expired - Fee Related
- 2010-08-24 KR KR1020137007240A patent/KR101486174B1/en not_active Expired - Fee Related
- 2010-08-24 WO PCT/CN2010/076285 patent/WO2012024829A1/en active Application Filing
- 2010-08-24 CN CN201080068735.8A patent/CN103080878B/en active Active
Also Published As
Publication number | Publication date |
---|---|
JP2013543158A (en) | 2013-11-28 |
KR101486174B1 (en) | 2015-01-23 |
KR20130058053A (en) | 2013-06-03 |
WO2012024829A1 (en) | 2012-03-01 |
JP5581448B2 (en) | 2014-08-27 |
CN103080878A (en) | 2013-05-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C41 | Transfer of patent application or patent right or utility model | ||
TA01 | Transfer of patent application right | Effective date of registration: 2016-01-15. Address after: Espoo, Finland. Applicant after: Technology Co., Ltd. of Nokia. Address before: Espoo, Finland. Applicant before: Nokia Oyj |
GR01 | Patent grant | ||