
CN1186744C - Chinese character recognizing method based on structure model - Google Patents


Info

Publication number
CN1186744C
Authority
CN
China
Prior art keywords
stroke
model
strokes
standard
recognized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB021259496A
Other languages
Chinese (zh)
Other versions
CN1474351A (en)
Inventor
贾云得
刘峡壁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CNB021259496A
Publication of CN1474351A
Application granted
Publication of CN1186744C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Character Discrimination (AREA)

Abstract

The present invention relates to a Chinese character recognition method based on a structural model and belongs to the fields of pattern recognition, artificial intelligence, and Chinese information processing. Using two primitives, the stroke segment and the stroke, the invention establishes two mathematical models for describing the structure of a Chinese character: a stroke-segment center-point model and a stroke relation matrix model. Two corresponding recognition methods are built on these models: the stroke-segment center-point recognition method and the stroke relation matrix recognition method. The two are combined into one complete recognition method in which the center-point method performs coarse classification and the relation matrix method performs fine classification. Printed and handwritten Chinese characters are handled by a single unified mechanism, and the method can be used for both off-line and online recognition. The invention offers high recognition accuracy and stable performance.

Description

A Chinese character recognition method based on a structural model
Technical field
The present invention relates to a Chinese character recognition method based on a structural model. The claimed technical solution belongs to the fields of pattern recognition, artificial intelligence, and Chinese information processing.
Background
After decades of development, Chinese character recognition technology has made great progress. However, unconstrained handwritten Chinese character recognition, and off-line handwritten recognition in particular, still falls short of expectations. To address off-line handwritten recognition, statistical methods and neural network methods are currently the most widely used; they adapt to the distortion of characters by learning from large numbers of handwritten samples. These methods require collecting massive sample sets and long training times, yet their results are not very satisfactory. Structural methods adapt well to distortion and carry no burden of sample collection or training. However, although existing structural methods have been quite successful in online handwritten Chinese character recognition, they are difficult to apply to off-line recognition.
Summary of the invention
The technical problem to be solved by the present invention is to provide an effective structural method for recognizing Chinese characters. The method has high recognition accuracy and good stability, and it can be used for handwritten as well as printed characters, and for off-line as well as online recognition.
The first problem in recognizing Chinese characters with a structural method is to establish a structural model of the character image. The invention provides two mathematical models for describing the structure of a Chinese character: the stroke-segment center-point model and the stroke relation matrix model.
The stroke-segment center-point model takes the stroke segment as the primitive of a Chinese character and describes a character by the types and positions of its stroke segments. Here, a stroke segment is a set of foreground pixels in the character image that corresponds to what people recognize as one of the four basic strokes: horizontal (heng), vertical (shu), left-falling (pie), and right-falling (na). Other strokes can be composed from these four. The model is expressed as follows:
1) Segment type
According to the direction vector of the stroke segment, segments are divided into four types: horizontal, vertical, left-falling, and right-falling.
2) Segment position
The position of a stroke segment is represented by the Euclidean coordinates of its midpoint, called the center-point coordinates. These coordinates are measured on the normalized character image.
3) Model composition
$$H = \{(X_i, Y_i, T_i)\}, \quad i = 1, 2, \ldots, N \qquad (1)$$
where $H$ denotes a Chinese character, $X_i$ is the abscissa of the center point of the i-th stroke segment, $Y_i$ is the ordinate of the center point of the i-th stroke segment, $T_i$ is the type of the i-th stroke segment (one of horizontal, vertical, left-falling, or right-falling), and $N$ is the number of stroke segments that make up the character.
Formula (1) states that if a normalized character image has, at every specified position (given by $X_i$ and $Y_i$), a stroke segment of the specified type (given by $T_i$), then the image is the character $H$; otherwise it is not.
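The set in formula (1) maps naturally onto a small data structure. The following is a minimal sketch, not the inventors' implementation: the class and function names, the 64x64 image size, and the coordinates chosen for the character "十" are all illustrative assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class SegmentType(Enum):
    HORIZONTAL = "heng"
    VERTICAL = "shu"
    LEFT_FALLING = "pie"
    RIGHT_FALLING = "na"


@dataclass(frozen=True)
class Segment:
    x: float            # X_i: abscissa of the segment's center point
    y: float            # Y_i: ordinate of the segment's center point
    kind: SegmentType   # T_i: segment type


def center_of(x0, y0, x1, y1):
    """Midpoint of a segment given its two endpoints."""
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


# Hypothetical model H of "十" on a 64x64 normalized image:
# one horizontal and one vertical segment that share a center point.
shi = frozenset({
    Segment(*center_of(8, 32, 56, 32), SegmentType.HORIZONTAL),
    Segment(*center_of(32, 8, 32, 56), SegmentType.VERTICAL),
})
```

A frozen dataclass keeps each triple hashable, so a character can be stored as a set, which matches the unordered set notation of formula (1).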
Based on the stroke-segment center-point model, the invention provides the following recognition method, called the stroke-segment center-point recognition method.
First, a standard stroke-segment center-point model is determined for each character class. During recognition, the distance between the center-point model of the character to be recognized and every standard model is computed, and the class with the smallest distance, or the classes with the N smallest distances, are taken as the recognition result. The distance is computed as follows:
[Formula (2): reproduced only as image C0212594900081 in the original publication]
where $D(SP, RP)$ is the distance between the standard center-point set and the center-point set to be recognized; $Q$ is the maximum number of stroke segments that can be matched between the two sets; $I$ is the number of stroke segments in the standard set; $J$ is the number of stroke segments in the set to be recognized; and $J'$ is the number of stroke segments remaining after the segments judged during matching to be connecting strokes are removed from the input segment set. $(G_{iX}, G_{iY})$ are the center-point coordinates in the standard set, and $(H_{jX}, H_{jY})$ are the center-point coordinates in the set to be recognized. $MS_i$ is the subset of segments in the set to be recognized that have already been matched with the first i-1 segments of the standard set. $Simi(ST_i, PT_j)$ is the similarity between the type of the i-th segment in the standard set and the type of the j-th segment in the set to be recognized. $V$ is the threshold on the permitted difference in the number of segments, $T$ is the threshold giving the maximum distance assigned to segments that cannot be matched, and $W$ is the threshold on the minimum distance between segments that are allowed to match.
The steps of the stroke-segment center-point recognition method are as follows:
(1) Establish the standard stroke-segment center-point set of each Chinese character;
(2) Normalize the character to be recognized to the standard size, then extract all of its stroke segments to form the center-point set to be recognized;
(3) Compute, by formula (2), the distance between each standard center-point set and the center-point set to be recognized, and take it as the distance between the corresponding standard character and the character to be recognized;
(4) Among all standard characters, take the one with the smallest distance to the character to be recognized, or the N with the smallest distances, as the recognition result.
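Steps (3) and (4) can be sketched as a nearest-model ranking. Because formula (2) is not reproduced in this text, the set-level distance below is a stand-in and not the patented formula: it follows the per-segment rule of claim 4 (Euclidean distance divided by type similarity) and the greedy pairing of claim 3, then simply averages. The value of `MAX_DIST`, the 0.5 similarity for loosely related types, the penalty for leftover input segments, and the omission of connecting-stroke removal are all assumptions.

```python
import math

# Segments are (x, y, kind) tuples; kind is "H", "V", "P" or "N"
# (horizontal, vertical, left-falling, right-falling).
MAX_DIST = 100.0                                  # assumed value for threshold T
OPPOSITE = {frozenset("HV"), frozenset("PN")}     # pairs with similarity 0 (claim 4)


def type_similarity(a, b):
    if a == b:
        return 1.0
    if frozenset((a, b)) in OPPOSITE:
        return 0.0
    return 0.5   # placeholder; the patent derives this from the segment angle


def segment_distance(std, inp):
    """Euclidean center distance divided by type similarity, capped at MAX_DIST."""
    sim = type_similarity(std[2], inp[2])
    if sim == 0.0:
        return MAX_DIST
    return min(MAX_DIST, math.hypot(std[0] - inp[0], std[1] - inp[1]) / sim)


def set_distance(standard, unknown):
    """Greedy one-to-one pairing; an assumed stand-in for formula (2)."""
    remaining = list(unknown)
    total = 0.0
    for std in standard:
        best = min(remaining, key=lambda s: segment_distance(std, s), default=None)
        d = MAX_DIST if best is None else segment_distance(std, best)
        total += d
        if d < MAX_DIST:
            remaining.remove(best)        # each input segment is paired at most once
    total += MAX_DIST * len(remaining)    # assumed penalty for unmatched input segments
    return total / max(len(standard), 1)


def coarse_candidates(unknown, models, n=10):
    """Return the n character classes whose standard sets are closest."""
    return sorted(models, key=lambda ch: set_distance(models[ch], unknown))[:n]


models = {"十": [(32, 32, "H"), (32, 32, "V")], "一": [(32, 32, "H")]}
unknown = [(31, 33, "H"), (33, 31, "V")]
candidates = coarse_candidates(unknown, models, n=2)
```

In this toy library the two-segment input ranks "十" ahead of "一", because the single-segment model leaves one input segment unmatched and pays the assumed penalty.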
The stroke relation matrix model takes the stroke as the primitive of a Chinese character and describes a character by the types of its strokes and their positions relative to one another. Here, a stroke means a Chinese character stroke as commonly understood. The model takes the following form:
(1) Stroke types
See Fig. 1.
(2) Relative positions between strokes
To capture as much as possible of what the various written forms of a character have in common, and to ignore factors that may vary sharply, the relative position between each pair of strokes is fuzzified into six relations: above, below, left, right, crossing, and connected.
(3) Combined model
Because a character image is two-dimensional, expressing the strokes and their relative positions in two-dimensional form reflects the structure more accurately. A matrix is therefore used:
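$$
\begin{array}{c|ccccc}
 & S_1 & S_2 & \cdots & S_{N-1} & S_N \\
\hline
S_1 & R_{11} & R_{12} & \cdots & R_{1(N-1)} & R_{1N} \\
S_2 & R_{21} & R_{22} & \cdots & R_{2(N-1)} & R_{2N} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
S_{N-1} & R_{(N-1)1} & R_{(N-1)2} & \cdots & R_{(N-1)(N-1)} & R_{(N-1)N} \\
S_N & R_{N1} & R_{N2} & \cdots & R_{N(N-1)} & R_{NN}
\end{array}
$$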
where $S$ denotes a stroke, $R$ a relation, and $N$ the number of strokes. $S_1$ to $S_N$ label the rows and columns, i.e. the stroke types, and $R_{11}$ to $R_{NN}$ are the matrix elements, each giving the relative position between the stroke of its row and the stroke of its column.
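The matrix above can be held as a list of stroke types plus an N x N table of relations. The sketch below is illustrative only: the names are hypothetical, the diagonal is left as `None` because the text does not say what a stroke's relation to itself should be, and the example for "二" (two horizontal strokes, the first above the second) is an assumed encoding.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    ABOVE = "above"
    BELOW = "below"
    LEFT = "left"
    RIGHT = "right"
    CROSS = "cross"
    CONNECTED = "connected"


@dataclass
class StrokeRelationMatrix:
    stroke_types: list   # S_1 .. S_N, labels of the rows and columns
    relations: list      # N x N nested list holding R_ij (None on the diagonal)

    def relation(self, i, j):
        """R_ij: position of stroke i relative to stroke j (0-based indices)."""
        return self.relations[i][j]


def build_matrix(stroke_types, relate):
    """Fill the matrix by calling relate(i, j) for every ordered pair i != j."""
    n = len(stroke_types)
    table = [[None if i == j else relate(i, j) for j in range(n)] for i in range(n)]
    return StrokeRelationMatrix(list(stroke_types), table)


# Hypothetical model of "二": stroke 0 lies above stroke 1.
er = build_matrix(
    ["heng", "heng"],
    lambda i, j: Relation.ABOVE if i < j else Relation.BELOW,
)
```

Passing the relation as a function keeps the sketch independent of how relative positions are actually measured on the image, which this text does not specify.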
Based on the stroke relation matrix model, the invention provides the following recognition method, called the stroke relation matrix recognition method.
First, a standard stroke relation matrix model is determined for each character class. During recognition, the similarity between the stroke-segment set of the character to be recognized and every standard stroke relation matrix model is computed, and the class with the largest similarity value is taken as the recognition result. The similarity is computed as follows:
[Formula (3): reproduced only as image C0212594900101 in the original publication]
where $S(SP, RP)$ is the similarity between the standard matrix and the matrix to be recognized; $BN(SP)$ is the number of stroke segments corresponding to the standard matrix; $BN(RP)$ is the number of stroke segments corresponding to the matrix to be recognized; $BN(RP')$ is the number of segments remaining after the segments judged during matching to be connecting strokes are removed from those corresponding to the matrix to be recognized; $SS(S_k, T_k)$ is the type similarity between the k-th stroke of the standard matrix and the k-th stroke of the matrix to be recognized (k is i or j); $RS(R_{ij}, G_{ij})$ is the similarity between the element in row i, column j of the standard matrix and the element in row i, column j of the matrix to be recognized; and $V$ is the threshold on the permitted difference in the number of stroke segments.
The steps of the stroke relation matrix recognition method are as follows:
(1) Establish the standard stroke relation matrix model of each Chinese character.
(2) Normalize the character to be recognized to the standard size, then extract all of its stroke segments to form the input stroke-segment set.
(3) Compute, by formula (3), the similarity between each standard matrix and the input stroke-segment set, and take it as the similarity between the corresponding standard character and the character to be recognized.
(4) Among all standard characters, take the one with the largest similarity to the character to be recognized as the recognition result.
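Steps (3) and (4) amount to scoring every standard matrix and keeping the best. Formula (3) is not reproduced in this text, so the score below is an assumed stand-in rather than the patented formula: it treats the type similarity SS and the relation similarity RS as exact-match indicators (1 or 0) and averages their product over all ordered stroke pairs. It also assumes the unknown matrix has already been derived from the input segments, and it ignores the segment-count terms BN and the threshold V.

```python
def matrix_similarity(std_types, std_rel, unk_types, unk_rel):
    """Assumed stand-in score in [0, 1]; not formula (3) of the patent."""
    n = len(std_types)
    if n == 0 or n != len(unk_types):
        return 0.0
    # SS(S_k, T_k) reduced to an exact-match indicator per stroke.
    type_ok = [1.0 if a == b else 0.0 for a, b in zip(std_types, unk_types)]
    if n == 1:
        return type_ok[0]
    score = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                # RS(R_ij, G_ij) reduced to an exact-match indicator.
                same = 1.0 if std_rel[i][j] == unk_rel[i][j] else 0.0
                score += type_ok[i] * type_ok[j] * same
    return score / (n * (n - 1))


def fine_classify(unk_types, unk_rel, models):
    """Return the class whose standard matrix scores highest."""
    return max(
        models,
        key=lambda ch: matrix_similarity(*models[ch], unk_types, unk_rel),
    )


# Hypothetical two-character library: (stroke types, relation table).
models = {
    "十": (["heng", "shu"], [[None, "cross"], ["cross", None]]),
    "二": (["heng", "heng"], [[None, "above"], ["below", None]]),
}
result = fine_classify(["heng", "shu"], [[None, "cross"], ["cross", None]], models)
```

A graded similarity in place of the 0/1 indicators would tolerate more distortion; the indicators are used here only to keep the sketch short.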
The two methods have different strengths: the stroke relation matrix recognition method is more accurate, while the stroke-segment center-point recognition method is faster. The recognition method provided by the invention therefore uses the center-point method for coarse classification and the relation matrix method for fine classification. The center-point method is also satisfactorily accurate on characters whose shapes are fairly regular, so when the invention is applied to such characters, the center-point method can be used on its own for fine classification.
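The coarse-to-fine combination can be sketched as a two-stage selection: a cheap distance shortlists N candidates, and a more expensive similarity picks among them. In this minimal sketch the two scoring functions are passed in as parameters, and the toy models and scorers are purely illustrative assumptions, not the patent's formulas.

```python
def recognize(unknown, models, coarse_distance, fine_similarity, n=10):
    """Shortlist the n nearest classes, then return the most similar one."""
    shortlist = sorted(
        models, key=lambda ch: coarse_distance(models[ch], unknown)
    )[:n]
    return max(shortlist, key=lambda ch: fine_similarity(models[ch], unknown))


# Toy stand-ins: each model is (position, label).
toy_models = {"A": (1.0, "x"), "B": (1.5, "y"), "C": (9.0, "y")}
toy_unknown = (1.4, "y")


def toy_distance(model, unk):
    return abs(model[0] - unk[0])


def toy_similarity(model, unk):
    return 1.0 if model[1] == unk[1] else 0.0


best = recognize(toy_unknown, toy_models, toy_distance, toy_similarity, n=2)
```

With `n=2` the shortlist contains only "A" and "B", so "C" is never scored by the fine stage even though its label matches; this is the speed-for-coverage trade that the coarse stage makes.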
The present invention has the following advantages:
1. The recognition method uses a single unified mechanism. It can be used for both off-line and online recognition, and for both handwritten and printed characters.
2. The recognition method has high recognition accuracy, adapts well to distortion, and is stable.
Description of drawings
Fig. 1 shows the stroke types used in the stroke relation matrix model;
Fig. 2 is a schematic diagram of the stroke-segment center-point model;
Fig. 3 is a schematic diagram of the stroke relation matrix model;
Fig. 4 is the overall block diagram of the Chinese character recognition method;
Fig. 5 is the flow chart of recognition by the stroke-segment center-point recognition method;
Fig. 6 is the flow chart of recognition by the stroke relation matrix recognition method.
Embodiment
The invention can be implemented wherever Chinese character recognition is needed. Preferred forms are online handwritten Chinese character recognition systems and devices, off-line printed Chinese character recognition systems and devices, and off-line handwritten Chinese character recognition systems and devices. In one embodiment, unconstrained, freely handwritten characters within the 6763 characters specified by GB2312-80 were recognized. The stroke-segment center-point classifier achieved an accuracy above 99% for the top ten candidates at an average speed of 1 second per character; the stroke relation matrix classifier achieved a recognition accuracy above 91.2% at an average speed of 0.2 second per character.

Claims (7)

1. A Chinese character recognition method based on a structural model, characterized in that: a stroke-segment center-point recognition method, based on a stroke-segment center-point model, is used for coarse classification; and a stroke relation matrix recognition method, based on a stroke relation matrix model, is used for fine classification of the coarse classification results;
the stroke-segment center-point model has the following form: a Chinese character image is first normalized to a standard size and then decomposed into a set of stroke segments, each of which is determined to be one of four types: horizontal, vertical, left-falling, or right-falling; finally, the center-point coordinates and the types of these stroke segments constitute a model representing a Chinese character, which can be summarized by the following formula:
$$H = \{(X_i, Y_i, T_i)\}, \quad i = 1, 2, \ldots, N$$
where $H$ denotes a Chinese character, $X_i$ is the abscissa of the center point of the i-th stroke segment, $Y_i$ is the ordinate of the center point of the i-th stroke segment, $T_i$ is the type of the i-th stroke segment, taking one of the four values horizontal, vertical, left-falling, or right-falling, and $N$ is the number of stroke segments making up the character;
the stroke-segment center-point recognition method performs recognition according to the distance between the standard stroke-segment center-point model and the stroke-segment center-point model to be recognized, the distance being calculated by the following formula:
where $D(SP, RP)$ is the distance between the standard center-point set and the center-point set to be recognized, $Q$ is the maximum number of stroke segments that can be matched between the standard center-point set and the center-point set to be recognized, $I$ is the number of stroke segments in the standard center-point set, $J$ is the number of stroke segments in the center-point set to be recognized, $J'$ is the number of all stroke segments in the matching set and the non-matching set, $(G_{iX}, G_{iY})$ are the center-point coordinates of the standard center-point set, $(H_{jX}, H_{jY})$ are the center-point coordinates of the center-point set to be recognized, $MS_i$ is the subset of stroke segments in the center-point set to be recognized that have already been matched with the first i-1 stroke segments of the standard center-point set, $Simi(ST_i, PT_j)$ is the similarity between the type of the i-th stroke segment in the standard center-point set and the type of the j-th stroke segment in the center-point set to be recognized, $V$ is the threshold on the permitted difference in the number of stroke segments, $T$ is the threshold giving the maximum distance assigned to stroke segments that cannot be matched, and $W$ is the threshold on the minimum distance between stroke segments that are allowed to match;
the stroke relation matrix model has the following form: a Chinese character image is first normalized to a standard size and then decomposed into a set of strokes of different predefined types, and the relative positions between these strokes are determined; finally, these strokes and their relative positions are arranged in a matrix to constitute a model representing a Chinese character, which can be summarized by the following matrix formula:
$$
\begin{array}{c|ccccc}
 & S_1 & S_2 & \cdots & S_{N-1} & S_N \\
\hline
S_1 & R_{11} & R_{12} & \cdots & R_{1(N-1)} & R_{1N} \\
S_2 & R_{21} & R_{22} & \cdots & R_{2(N-1)} & R_{2N} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
S_{N-1} & R_{(N-1)1} & R_{(N-1)2} & \cdots & R_{(N-1)(N-1)} & R_{(N-1)N} \\
S_N & R_{N1} & R_{N2} & \cdots & R_{N(N-1)} & R_{NN}
\end{array}
$$
where $S$ denotes a stroke, $R$ a relation, and $N$ the number of strokes; $S_1$ to $S_N$ label the rows and columns, i.e. the stroke types; $R_{11}$ to $R_{NN}$ are the matrix elements, each giving the relative position between the stroke of its row and the stroke of its column, taking one of six relations: above, below, left, right, crossing, or connected;
the stroke relation matrix recognition method performs recognition according to the similarity between the standard stroke relation matrix model and the stroke relation matrix model to be recognized, the similarity being calculated by the following formula:
[Formula: reproduced only as image C021259490004C1 in the original publication]
where $S(SP, RP)$ is the similarity between the standard model and the model to be recognized, $BN(SP)$ is the number of stroke segments corresponding to the standard model, $BN(RP)$ is the number of stroke segments corresponding to the model to be recognized, $BN(RP')$ is the number of stroke segments contained in the stroke-segment set corresponding to the model to be recognized, $SS(S_k, T_k)$ is the type similarity between the k-th stroke of the standard model and the k-th stroke of the model to be recognized (k is i or j), $RS(R_{ij}, G_{ij})$ is the similarity between the element in row i, column j of the standard model and the element in row i, column j of the model to be recognized, and $V$ is the threshold on the permitted difference in the number of stroke segments.
2. The Chinese character recognition method based on a structural model according to claim 1, characterized in that the stroke-segment center-point recognition method comprises the following steps: (1) establishing a standard model library: according to the stroke-segment center-point model, a standard stroke-segment center-point model is established for each Chinese character and stored in the model library; (2) determining, from each standard stroke-segment center-point model and the input stroke-segment set, the corresponding stroke-segment center-point model to be recognized; (3) calculating the distance between the standard stroke-segment center-point model and the stroke-segment center-point model to be recognized; (4) taking the Chinese characters corresponding to the N standard stroke-segment center-point models with the smallest and next-smallest distance values as the recognition result.
3. The Chinese character recognition method based on a structural model according to claim 2, characterized in that the method of determining the stroke-segment center-point model to be recognized from the standard stroke-segment center-point model and the input stroke-segment set comprises the following steps: (1) for each stroke segment in the standard stroke-segment center-point model, finding the stroke segment in the input stroke-segment set at the smallest distance from it; (2) if this minimum distance is greater than the prescribed maximum threshold, the standard stroke segment is considered to have no matching stroke segment in the input set; otherwise the two stroke segments are paired and removed from their respective sets; (3) repeating the above process until every stroke segment in the standard stroke-segment center-point model has been processed; (4) the stroke segments obtained in the above process that correspond to the standard stroke-segment center-point model form the matching set; (5) among the input stroke segments not included in the matching set, those that connect two stroke segments of the matching set are removed, and the remaining stroke segments form the non-matching set; (6) determining the types and center-point coordinates of all stroke segments in the matching set and the non-matching set to form the stroke-segment center-point model to be recognized.
4. The Chinese character recognition method based on a structural model according to claim 3, characterized in that the method of calculating the distance between a standard stroke segment and an input stroke segment comprises the following steps: (1) calculating the Euclidean distance between the center point of the standard stroke segment and the center point of the input stroke segment; (2) determining the type similarity from the types of the standard stroke segment and the input stroke segment: the similarity between horizontal and vertical, and between left-falling and right-falling, is 0; the similarity between identical types is 1; in other cases the similarity is determined by how far the angle of the stroke segment to be recognized deviates from the range of angles permitted for the type of the standard stroke segment; (3) dividing the distance obtained in step (1) by the type similarity obtained in step (2) to obtain the final distance; if the type similarity is 0, the final distance is the assigned maximum value.
5. The Chinese character recognition method based on a structural model according to claim 1, characterized in that the stroke relation matrix recognition method comprises the following steps: (1) establishing a standard model library: according to the stroke relation matrix model, a standard stroke relation matrix model is established for each Chinese character and stored in the model library; (2) determining, according to each standard stroke relation matrix model, the strokes and their relative positions from the input stroke-segment set, to constitute a stroke relation matrix model to be recognized; (3) calculating the similarity value between the standard stroke relation matrix model and the stroke relation matrix model to be recognized; (4) repeating steps (2) and (3) until all stroke relation matrix models to be recognized that can be derived from the standard stroke relation matrix model have been calculated, and taking the smallest of these similarity values as the final similarity value for that standard stroke relation matrix model; (5) taking the Chinese character corresponding to the standard stroke relation matrix model with the smallest final similarity value as the recognition result.
6. The Chinese character recognition method based on a structural model according to claim 5, characterized in that the method of determining the stroke relation matrix model to be recognized from the input stroke-segment set according to the standard stroke relation matrix model comprises the following steps: (1) for each stroke in the standard stroke relation matrix model, searching the input stroke-segment set for strokes identical or similar to it, forming a corresponding stroke-segment subset; (2) taking one stroke from each of the sets of candidate strokes corresponding to the strokes of the standard stroke relation matrix model, to make up all the strokes of the stroke relation matrix model to be recognized; the strokes taken must not conflict with one another, i.e. they must not share stroke segments; (3) temporarily deleting from the original stroke-segment set those stroke segments that are not included in the stroke relation matrix model to be recognized but connect two of its strokes; the remaining stroke segments form the stroke-segment set corresponding to the stroke relation matrix model to be recognized; (4) determining the types of all strokes in the resulting model and their relative positions, to form the stroke relation matrix model to be recognized.
7. The Chinese character recognition method based on a structural model according to claim 6, characterized in that the method of searching the input stroke-segment set for identical or similar strokes comprises the following steps: (1) establishing a template describing each stroke; (2) establishing the type similarity values between strokes; (3) determining the stroke types to be searched for according to a given similarity threshold; (4) searching the input stroke-segment set according to the template of the stroke type sought, and determining the corresponding stroke-segment subset.
CNB021259496A 2002-08-06 2002-08-06 Chinese character recognizing method based on structure model Expired - Fee Related CN1186744C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB021259496A CN1186744C (en) 2002-08-06 2002-08-06 Chinese character recognizing method based on structure model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB021259496A CN1186744C (en) 2002-08-06 2002-08-06 Chinese character recognizing method based on structure model

Publications (2)

Publication Number Publication Date
CN1474351A CN1474351A (en) 2004-02-11
CN1186744C true CN1186744C (en) 2005-01-26

Family

ID=34143156

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB021259496A Expired - Fee Related CN1186744C (en) 2002-08-06 2002-08-06 Chinese character recognizing method based on structure model

Country Status (1)

Country Link
CN (1) CN1186744C (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1315090C (en) * 2005-02-08 2007-05-09 华南理工大学 Method for identifying hand-writing characters
CN102375994B (en) * 2010-08-10 2013-05-29 广东因豪信息科技有限公司 Method and device for detecting and reducing correctness of order of strokes of written Chinese character
CN107844740A (en) * 2017-09-05 2018-03-27 中国地质调查局西安地质调查中心 A kind of offline handwriting, printing Chinese character recognition methods and system
CN110909563B (en) * 2018-09-14 2023-07-28 新方正控股发展有限责任公司 Method, apparatus, device and computer readable storage medium for extracting text skeleton
CN109740415B (en) * 2018-11-19 2021-02-09 深圳市华尊科技股份有限公司 Vehicle attribute identification method and related product

Also Published As

Publication number Publication date
CN1474351A (en) 2004-02-11


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee