
JPS58223884A - Sorting processor of character pattern - Google Patents

Sorting processor of character pattern

Info

Publication number
JPS58223884A
JPS58223884A JP57107346A JP10734682A JPS58223884A JP S58223884 A JPS58223884 A JP S58223884A JP 57107346 A JP57107346 A JP 57107346A JP 10734682 A JP10734682 A JP 10734682A JP S58223884 A JPS58223884 A JP S58223884A
Authority
JP
Japan
Prior art keywords
character
pattern
character pattern
coordinate axis
directional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP57107346A
Other languages
Japanese (ja)
Other versions
JPH0159627B2 (en)
Inventor
Norihiro Hagita
紀博 萩田
Seiichiro Naito
内藤 誠一郎
Isao Masuda
功 増田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp
Priority to JP57107346A
Publication of JPS58223884A
Publication of JPH0159627B2

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)

Abstract

PURPOSE: To enable efficient classification of character patterns that include deformed handwritten characters, by observing the character pattern from a plurality of coordinate-axis directions and computing, at each position on each coordinate axis, the directional contribution of the character lines in the direction orthogonal to that axis as well as the similarity between two directional contributions. CONSTITUTION: A binarized handwritten character pattern is stored in a memory circuit 1, and a normalizing processor 2 normalizes the position and size of the pattern using the centroid and second-order moments. To cope with extreme protrusion of character lines near the contour of a handwritten character, a device 3 frames the character with a square character frame of W meshes on a side centered on the centroid and removes whatever falls outside the frame. A smoothing processor 4 fills in or removes single-mesh concavities and convexities of black points along the contours of the character lines. A feature extractor 5 scans the pattern in the direction orthogonal to each of K coordinate axes and, where a scan crosses a character part, extends tentacles in M directions from the crossed black pixel to extract features. A classifier 6 matches the result against a dictionary table.

Description

DETAILED DESCRIPTION OF THE INVENTION

(1) Field of the invention

The present invention relates to a classification processing device for character patterns. In particular, it relates to a character pattern classification device that takes a binarized character pattern obtained by photoelectric conversion, extracts features of the character-line structure from the vicinity of the outline of the pattern, and classifies the input pattern, so that character sets with many categories and diverse handwriting deformations, such as handwritten kanji, can be classified efficiently.

(2) Description of the prior art

In one known type of recognition or classification device for character patterns including kanji, a handwritten kanji pattern is first thinned to its core lines. Geometric features such as end points, intersections, and bending points are extracted from the thinned pattern and used to describe the positional and connection relationships among the character lines, and this description is matched against a previously stored dictionary table for each character. With this approach, however, thinning introduces noise components such as whiskers (spurs) and similar artifacts, which make the geometric features hard to extract. Moreover, handwriting deformation causes the positional and connection relationships of the character lines to vary considerably, and absorbing this variation requires an enormous amount of data or recognition logic, so an efficient recognition system cannot be realized.

In other known methods, also aimed at handwritten kanji, a character pattern that has been binarized, normalized in position and size, and smoothed is observed from two coordinate axes, the horizontal and the vertical. At each position on an axis, the number of character lines crossed in the direction orthogonal to that axis is counted; a feature vector pattern is built from these counts and matched against a previously stored feature dictionary table for each character. In a variant, the preprocessed pattern is first divided into coarse-mesh rectangular regions, and the same crossing counts are taken within each region from the horizontal and vertical axes before the feature vector pattern is built and matched.

The crossing counts used by these methods can distinguish broad differences in the complexity of the character-line structure between character categories. They carry no information on finer structural differences, however, such as the direction components of the character lines and how each component is distributed. As a result, these methods cannot efficiently recognize character sets that contain many similar characters and many handwriting deformations such as displaced character lines.

Accordingly, none of the above methods can be expected to recognize adequately a character set that, like handwritten kanji, has many categories and large deformations.

(3) Object of the invention

The object of the present invention is to provide a device that operates on a character pattern that has been binarized, normalized in position and size, framed, and smoothed by conventional techniques. The device derives character-line structure information, such as the direction, connection relationships, and parallelism of the character lines, from the directional contribution of black points, and computes this information by scanning the character from a plurality of directions. Handwritten character patterns including handwritten kanji, with their many categories and many deformations, can thereby be classified efficiently.

(4) Description of the structure and operation of the invention

A detailed description follows with reference to the drawings.

FIG. 1 is a block diagram of one embodiment of the character pattern classification device according to the present invention. In FIG. 1, circuit 1 is a memory circuit that initially stores the binarized handwritten character pattern. Device 2 is a position and size normalization device: it takes the binarized pattern as input and normalizes its position and size, for example by the well-known method based on the centroid and second-order moments. Device 3 takes the normalized N×N-mesh pattern produced by device 2 and, to cope with the extreme protrusion of character lines that occurs near the outline of handwritten characters, frames the character with a square character frame of W meshes on a side centered on the centroid, removing any character part that falls outside the frame. Device 4 is a smoothing device: it takes the framed pattern from device 3 and fills in single-mesh concavities, or removes single-mesh convexities, of black points along the contours of the character lines.
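As an illustration only, the framing performed by device 3 can be sketched as follows. The sketch assumes that the pattern is a list of rows of 0/1 values, that the centroid is rounded to the nearest mesh, and that frame cells lying outside the input are treated as white. The function names `centroid` and `frame` are hypothetical, and none of these details are specified in the description.

```python
def centroid(pattern):
    """Return the (row, col) centroid of the black (1) cells of a binary pattern."""
    total = rows = cols = 0
    for r, line in enumerate(pattern):
        for c, v in enumerate(line):
            if v:
                total += 1
                rows += r
                cols += c
    if total == 0:
        raise ValueError("pattern has no black cells")
    return rows / total, cols / total


def frame(pattern, w):
    """Cut a w x w square centered on the centroid; anything outside is dropped."""
    cr, cc = centroid(pattern)
    top = round(cr) - w // 2
    left = round(cc) - w // 2
    n_rows, n_cols = len(pattern), len(pattern[0])
    out = []
    for r in range(top, top + w):
        row = []
        for c in range(left, left + w):
            # Assumption: cells of the frame outside the input count as white.
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            row.append(pattern[r][c] if inside else 0)
        out.append(row)
    return out
```

A frame smaller than the pattern discards protruding strokes, which is the stated purpose of device 3; a frame larger than the pattern simply pads it with white.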

Device 5 is the feature extraction device that forms the main part of the present invention. It takes the smoothed character pattern as input and observes it from K predetermined coordinate-axis directions (for four directions, for example: the horizontal axis, the +45° axis, the vertical axis, and the -45° axis, all measured from the horizontal). It performs two processes. (1) The character is scanned in the direction orthogonal to each coordinate axis. Where the scan crosses a character part, tentacles are extended from the black pixel at the crossing in M predetermined directions (for eight directions, for example: 0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315°), the number of connected black points is counted in each direction, and the directional contribution of the black point (Japanese Patent Application No. Sho 56-46659) is computed. (2) Where a scan in the direction orthogonal to a coordinate axis crosses character parts M times (M ≥ 2), the similarity between the directional contribution at the m-th crossing (2 ≤ m ≤ M) and that at the (m-1)-th crossing is computed.
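The tentacle step of process (1) can be sketched as below. This is a minimal illustration under stated assumptions: rows grow downward, 0° points to the right, and the starting black cell itself is not counted, so that a direction blocked immediately yields 0. The name `run_lengths` and the offset table are hypothetical; the description does not fix these conventions.

```python
# Offsets (d_row, d_col) for directions 1..8 = 0, 45, ..., 315 degrees,
# assuming rows grow downward and 0 degrees points to the right.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
              (0, -1), (1, -1), (1, 0), (1, 1)]


def run_lengths(pattern, row, col):
    """Count connected black cells in each of 8 directions from (row, col).

    The starting cell is not counted (an assumption), so a direction that
    immediately meets white or the border yields 0.
    """
    if not pattern[row][col]:
        raise ValueError("tentacles start from a black cell")
    n_rows, n_cols = len(pattern), len(pattern[0])
    lengths = []
    for d_row, d_col in DIRECTIONS:
        r, c, count = row + d_row, col + d_col, 0
        while 0 <= r < n_rows and 0 <= c < n_cols and pattern[r][c]:
            count += 1
            r += d_row
            c += d_col
        lengths.append(count)
    return lengths
```

For a black cell at the upper-left corner of a horizontal stroke, the longest run lies in the 0° direction and the upward directions are all 0.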

Device 6 is the classification device. From the directional contribution values and similarities produced by device 5, it builds a feature table for classifying the character pattern, matches that feature table against the previously stored feature dictionary table for each character, and thereby classifies the character pattern.

As a concrete example of device 5, the following explanation takes the case of four coordinate axes (the horizontal axis, the +45° axis, the vertical axis, and the -45° axis, numbered 1, 2, 3, and 4 respectively). The character is scanned in the direction orthogonal to each axis. Where the scan crosses a character part, tentacles are extended in eight directions (0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315°, numbered 1 through 8 respectively), and the directional contributions and the similarity between two directional contributions are computed to classify the character pattern.

The first method is as follows. The W×W-mesh character pattern obtained by device 4 is observed from the four coordinate axes, with the horizontal axis as the reference. At position $j$ on coordinate axis $k$ ($k = 1, 2, 3, 4$; $j = 1, 2, \ldots, W$ for $k = 1, 3$ and $j = 1, 2, \ldots, W'$ for $k = 2, 4$, where $W' = 2W$), the pattern is scanned in the direction orthogonal to the axis. When the scan crosses a character part for the $m$-th time, the directional contribution $\mathbf{a}_m$ of the black point at which the scan changes from white to black (the pixel preceding the start of the scan is assumed to be white) is the eight-dimensional vector

$$\mathbf{a}_m = (a_1, a_2, \ldots, a_8)_m \qquad (1)$$

Here $a_1, a_2, \ldots, a_8$ are the directional contribution components for the eight directions. Each is expressed, as one example, by equation (2) (not reproduced in this text) in terms of the connected black-point lengths $l_i$ ($i = 1, 2, \ldots, 8$) obtained by extending tentacles from the black point in the eight directions. Distances other than the Euclidean distance used there may also be applied to $a_i$.
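Equation (2) does not survive in this text. One reading that fits both the reference to Euclidean distance and the worked example later in the description (where every listed eight-component vector has unit Euclidean length to three decimals) is $a_i = l_i / \sqrt{\sum_{j=1}^{8} l_j^2}$. The sketch below implements that reading; the formula, the function name `directional_contribution`, and the handling of an isolated black cell are assumptions, not the patent's stated definition.

```python
import math


def directional_contribution(lengths):
    """Normalize run lengths by their Euclidean norm (assumed form of eq. (2))."""
    norm = math.sqrt(sum(l * l for l in lengths))
    if norm == 0:
        # Assumption: an isolated black cell contributes to no direction.
        return [0.0] * len(lengths)
    return [l / norm for l in lengths]
```

Under this reading the components depend only on the ratios between the run lengths, so the vector describes the orientation of the stroke around the black point rather than its absolute size.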

Likewise, when the scan crosses a character part for the $m$-th time, the directional contribution $\mathbf{a}'_m$ of the black point at which the scan changes from black to white is the eight-dimensional vector

$$\mathbf{a}'_m = (a'_1, a'_2, \ldots, a'_8)_m \qquad (3)$$

Here $a'_i$ ($i = 1, 2, \ldots, 8$) is expressed, as in equation (2), in terms of the connected black-point lengths $l'_i$ ($i = 1, 2, \ldots, 8$) for the eight directions from that black point (the equation is not reproduced in this text).

Furthermore, when the scan crosses character parts $M$ times ($M \geq 2$), the similarities $r_{m-1,m}$ and $r'_{m-1,m}$ between the directional contributions $\mathbf{a}_m$ and $\mathbf{a}'_m$ at the $m$-th crossing ($2 \leq m \leq M$) and the directional contributions $\mathbf{a}_{m-1}$ and $\mathbf{a}'_{m-1}$ at the $(m-1)$-th crossing are each given, as one example, by an equation (not reproduced in this text).
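The similarity equations are likewise missing from this text. Taking the similarity to be the inner product of the two directional contribution vectors reproduces, to three decimals, the values 0.913 and 0.886 quoted in the worked example later in the description. The sketch below uses that assumption; the function name `similarity` is hypothetical, and the vectors are the ones listed in that example.

```python
def similarity(u, v):
    """Inner product of two directional contribution vectors (assumed form)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(u, v))


# White-to-black contributions at the first and second crossings,
# as listed in the description's worked example.
a1 = [0.902, 0, 0, 0, 0, 0.265, 0.212, 0.265]
a2 = [0.653, 0, 0, 0, 0, 0.490, 0.408, 0.408]
r12 = similarity(a1, a2)
```

Because the vectors have unit length under the assumed normalization, this value is the cosine of the angle between them: it approaches 1 when two crossed strokes run in the same direction, which is how the feature captures parallelism.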

Of the quantities $\mathbf{a}_m$, $\mathbf{a}'_m$, $r_{m-1,m}$, and $r'_{m-1,m}$ obtained in this way, values in the range $m = 1$ to $m = m_0$ ($1 \leq m_0 \leq M$) are selected, for example, for $\mathbf{a}_m$ and $r_{m-1,m}$, and values in the range $m = M$ to $m = M - m_0 + 1$ are selected for $\mathbf{a}'_m$ and $r'_{m-1,m}$. The feature pattern $f_{kj}$ obtained by the scan at position $j$ on coordinate axis $k$ is then

$$f_{kj} = (\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_{m_0},\ r_{1,2}, r_{2,3}, \ldots, r_{m_0-1,m_0},\ \mathbf{a}'_M, \mathbf{a}'_{M-1}, \ldots, \mathbf{a}'_{M-m_0+1},\ r'_{M-1,M}, \ldots, r'_{M-m_0+1,M-m_0+2})_{kj} \qquad (7)$$
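The layout of equation (7) can be sketched as below. The ordering is the one read from the garbled original and from the worked example later in the description (entry vectors from the top, their similarities, then exit vectors from the bottom, then their similarities). The function names `dot` and `assemble` are hypothetical, and the use of the inner product as the similarity is an assumption.

```python
def dot(u, v):
    """Inner product, used here as an assumed similarity measure."""
    return sum(x * y for x, y in zip(u, v))


def assemble(entry_vecs, exit_vecs, m0, sim=dot):
    """Lay out one scan's features in the order read from equation (7).

    entry_vecs[m-1] is the white-to-black contribution at the m-th crossing,
    exit_vecs[m-1] the black-to-white contribution at the m-th crossing.
    """
    big_m = len(entry_vecs)
    if len(exit_vecs) != big_m:
        raise ValueError("need one exit vector per entry vector")
    if not 1 <= m0 <= big_m:
        raise ValueError("m0 must satisfy 1 <= m0 <= M")
    head = entry_vecs[:m0]        # a_1 ... a_m0, seen from the scan start
    tail = exit_vecs[::-1][:m0]   # a'_M ... a'_(M-m0+1), seen from the far side
    out = []
    for vec in head:
        out.extend(vec)
    out.extend(sim(p, q) for p, q in zip(head, head[1:]))
    for vec in tail:
        out.extend(vec)
    out.extend(sim(p, q) for p, q in zip(tail, tail[1:]))
    return out
```

With two crossings and $m_0 = 2$, eight-component vectors give 8 + 8 + 1 + 8 + 8 + 1 = 34 values per scan position, which matches the length of the first-method vector listed in the worked example.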

Accordingly, the feature vector $G$ of the character pattern is

$$G = (f_{11}, f_{12}, \ldots, f_{1W},\ f_{21}, f_{22}, \ldots, f_{2W'},\ f_{31}, f_{32}, \ldots, f_{3W},\ f_{41}, f_{42}, \ldots, f_{4W'}) \qquad (8)$$

A feature table is built by grouping several elements of the feature vector $G$ at a time and taking the averaged values as the features of the character pattern. A known discriminant function $D(G)$ is then evaluated to classify the character pattern.
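The description leaves open both which elements are grouped for averaging and which discriminant function is used. Purely as an illustration, the sketch below averages consecutive groups of a fixed size and picks the dictionary entry at the smallest squared Euclidean distance. Both choices, and the names `block_average` and `classify`, are assumptions rather than the patent's method.

```python
def block_average(values, group):
    """Average consecutive groups of `group` values (last group may be shorter)."""
    if group < 1:
        raise ValueError("group must be at least 1")
    out = []
    for start in range(0, len(values), group):
        chunk = values[start:start + group]
        out.append(sum(chunk) / len(chunk))
    return out


def classify(features, dictionary):
    """Return the label whose stored feature list is closest to `features`.

    Squared Euclidean distance stands in for the unspecified discriminant D(G).
    """
    def distance(label):
        ref = dictionary[label]
        return sum((x - y) ** 2 for x, y in zip(features, ref))
    return min(dictionary, key=distance)
```

Averaging shortens the feature table and smooths small positional shifts between neighbouring scan positions, at the cost of some resolution.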

The second method is as follows. The W×W-mesh character pattern described above is observed from the four coordinate axes, with the horizontal axis as the reference. At position $j$ on coordinate axis $k$ ($k = 1, 2, 3, 4$; $j = 1, 2, \ldots, W$ for $k = 1, 3$ and $j = 1, 2, \ldots, W'$ for $k = 2, 4$, where $W' = 2W$), the pattern is scanned in the direction orthogonal to the axis. When the scan crosses a character part for the $m$-th time, the directional contribution $\mathbf{b}_m$ of the black point at which the scan changes from white to black (the pixel preceding the start of the scan is assumed to be white) is the four-dimensional vector

$$\mathbf{b}_m = (b_1, b_2, b_3, b_4)_m \qquad (9)$$

Here $b_1, b_2, b_3, b_4$ are the directional contribution components for four directions. Each is expressed, as one example, by equation (10) (not reproduced in this text) in terms of the connected black-point lengths $l_i$ ($i = 1, 2, \ldots, 8$) obtained by extending tentacles from the black point in the eight directions. Distances other than the Euclidean distance used there may also be applied to $b_i$.
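Equation (10) is not available in this text. In the worked example later in the description, each four-component vector equals the corresponding eight-component vector with opposite directions added together (components 1 and 5, 2 and 6, 3 and 7, 4 and 8). The sketch below folds the run lengths in that way and then normalizes by the Euclidean norm. The folding is inferred from the example, the normalization after folding is an assumption, and the name `four_direction_contribution` is hypothetical.

```python
import math


def four_direction_contribution(lengths):
    """Fold 8 run lengths into 4 orientations and normalize (assumed form)."""
    if len(lengths) != 8:
        raise ValueError("expected 8 run lengths")
    # Opposite directions (i and i + 4) describe the same stroke orientation.
    folded = [lengths[i] + lengths[i + 4] for i in range(4)]
    norm = math.sqrt(sum(x * x for x in folded))
    if norm == 0:
        return [0.0] * 4
    return [x / norm for x in folded]
```

Folding halves the size of the feature pattern but no longer tells on which side of the black point a stroke extends, which is consistent with the remark in the description that only the first method also captures end points and similar detail.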

Likewise, when the scan crosses a character part for the $m$-th time, the directional contribution $\mathbf{b}'_m$ of the black point at which the scan changes from black to white is the four-dimensional vector

$$\mathbf{b}'_m = (b'_1, b'_2, b'_3, b'_4)_m \qquad (11)$$

Here $b'_1, b'_2, b'_3, b'_4$ are expressed, as in equation (10), in terms of the connected black-point lengths $l'_i$ ($i = 1, 2, \ldots, 8$) for the eight directions from that black point (the equation is not reproduced in this text).

Furthermore, when the scan crosses character parts $M$ times ($M \geq 2$), the similarities $r_{m-1,m}$ and $r'_{m-1,m}$ between the directional contributions $\mathbf{b}_m$ and $\mathbf{b}'_m$ at the $m$-th crossing ($2 \leq m \leq M$) and the directional contributions $\mathbf{b}_{m-1}$ and $\mathbf{b}'_{m-1}$ at the $(m-1)$-th crossing are each given, as one example, by an equation (not reproduced in this text).

Of the quantities $\mathbf{b}_m$, $\mathbf{b}'_m$, $r_{m-1,m}$, and $r'_{m-1,m}$ obtained in this way, values in the range $m = 1$ to $m = m_0$ ($1 \leq m_0 \leq M$) are selected, for example, for $\mathbf{b}_m$ and $r_{m-1,m}$, and values in the range $m = M$ to $m = M - m_0 + 1$ are selected for $\mathbf{b}'_m$ and $r'_{m-1,m}$. The feature pattern $h_{kj}$ obtained by the scan at position $j$ on coordinate axis $k$ is then

$$h_{kj} = (\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_{m_0},\ r_{1,2}, r_{2,3}, \ldots, r_{m_0-1,m_0},\ \mathbf{b}'_M, \mathbf{b}'_{M-1}, \ldots, \mathbf{b}'_{M-m_0+1},\ r'_{M-1,M}, \ldots, r'_{M-m_0+1,M-m_0+2})_{kj}$$

Accordingly, the feature vector $H$ of the character pattern is

$$H = (h_{11}, h_{12}, \ldots, h_{1W},\ h_{21}, h_{22}, \ldots, h_{2W'},\ h_{31}, h_{32}, \ldots, h_{3W},\ h_{41}, h_{42}, \ldots, h_{4W'})$$

A feature table is built by grouping several elements of the feature vector $H$ at a time and taking the averaged values as the features of the character pattern. A known discriminant function $D(H)$ is then evaluated to classify the character pattern.

FIG. 2 shows examples of the operation of circuit 1, device 2, device 3, and device 4. FIG. 2(A) is an example of the binarized character pattern initially stored in circuit 1. FIG. 2(B) is an example of the pattern after device 2 has normalized the position and size of the pattern of FIG. 2(A). FIG. 2(C) is an example of the pattern after device 3 has framed the pattern of FIG. 2(B). FIG. 2(D) is an example of the pattern after device 4 has smoothed the pattern of FIG. 2(C).

FIG. 3 illustrates the operation of device 5. It shows the four coordinate axes from which the pattern is observed to obtain the directional contributions and the similarity between two directional contributions, together with the range of position $j$ along each axis. In FIG. 3, axis 8-1 is the horizontal coordinate axis, axis 8-2 the +45° coordinate axis, axis 8-3 the vertical coordinate axis, and axis 8-4 the -45° coordinate axis.

FIG. 4, like FIG. 3, illustrates the operation of device 5.

FIG. 4(A) shows a case in which a scan at position $j = j_0$ on the horizontal coordinate axis, in the direction orthogonal to that axis, crosses character parts twice. Black points 9-1 and 9-3 in FIG. 4(A) are the black points at which the scan changes from white to black at the first and second crossings respectively. Black points 9-2 and 9-4 are the black points at which the scan changes from black to white at the first and second crossings respectively. FIG. 4(B) shows by arrows the directions in which tentacles are extended to obtain the connected black-point lengths.

Tables 1, 2, and 3 show examples of the directional contributions $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}'_1, \mathbf{a}'_2$ and the similarities $r_{1,2}, r'_{1,2}$ between two directional contributions obtained by the first method at black points 9-1, 9-2, 9-3, and 9-4 of FIG. 4(A), over the range of $m = 1$ to $m = 2$ crossings. Tables 4, 5, and 6 show the directional contributions $\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}'_1, \mathbf{b}'_2$ and the similarities $r_{1,2}, r'_{1,2}$ obtained by the second method over the same range.
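Locating the four black points of FIG. 4(A) along one scan line can be sketched as follows. The sketch assumes the scan line is given as a sequence of 0/1 values in scan order. Treating the cell before the start as white follows the description; treating the cell after the end as white is an assumption, and the function name `crossings` is hypothetical.

```python
def crossings(scan_line):
    """Return (entry, exit) index pairs, one per run of black cells.

    `entry` is the cell where the scan turns white -> black and `exit` is the
    last black cell before it turns black -> white.
    """
    pairs = []
    entry = None
    for i, v in enumerate(scan_line):
        if v and entry is None:
            entry = i                      # white -> black
        elif not v and entry is not None:
            pairs.append((entry, i - 1))   # black -> white
            entry = None
    if entry is not None:
        # Assumption: the cell after the end of the scan counts as white.
        pairs.append((entry, len(scan_line) - 1))
    return pairs
```

For a scan that crosses two strokes, the first pair corresponds to black points 9-1 and 9-2 and the second pair to black points 9-3 and 9-4.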

[Tables 1 to 6]

Accordingly, the feature pattern obtained by the scan at position $j = j_0$ on the horizontal coordinate axis of FIG. 4(A) is, for the first method,

$f_{1j_0}$ = (0.902, 0, 0, 0, 0, 0.265, 0.212, 0.265,
0.653, 0, 0, 0, 0, 0.490, 0.408, 0.408,
0.913,
0, 0.960, 0.185, 0.148, 0.148, 0, 0, 0,
0, 0.727, 0.364, 0.455, 0.364, 0, 0, 0,
0.886)

and, for the second method,

$h_{1j_0}$ = (0.902, 0.265, 0.212, 0.265,
0.653, 0.490, 0.408, 0.408,
0.913,
0.148, 0.960, 0.184, 0.148,
0.364, 0.727, 0.364, 0.455,
0.886)

The values of the vectors $f_{1j_0}$ and $h_{1j_0}$ express the direction, connection relationships, and relative positions of the character lines that lie in the direction orthogonal to the horizontal coordinate axis at position $j = j_0$, as well as the degree of parallelism of two character lines.

Take $h_{1j_0}$ obtained by the second method as an example. In $h_{1j_0}$, the component $b_1$ is large in the vector $\mathbf{b}_1$ and the component $b'_2$ is large in the vector $\mathbf{b}'_2$. This shows that in FIG. 4(A) a scan from above at this position first meets a horizontal character line, while a scan from below first meets a character line in the +45° direction. Because the first and second methods are not affected by displacement of these character lines along the scanning direction, they have the advantage of yielding feature patterns that are stable against this kind of deformation, which occurs readily in handwritten kanji and the like.

The first method also has the advantage that detailed geometric features such as end points, intersections, and bending points can be extracted. In addition, because both methods obtain the similarity between the directional contributions of two character lines crossed by a scan, characters can be classified without being affected by variations in the slope of handwritten character lines such as those shown in FIG. 5(A) through (C): the parallelism of the character lines is preserved in each slope-variation pattern.

Consequently, the feature extraction performed by device 5 is an effective means for recognizing characters such as handwritten kanji on the basis of their geometric features.

(5) Description of effects

As explained above, according to the present invention the character pattern is observed from a plurality of coordinate-axis directions, and at each position on each coordinate axis the directional contribution of the character lines in the direction orthogonal to the axis and the similarity between two directional contributions are computed. Character-line structure information such as the direction, connection relationships, and parallelism of the character lines can therefore be extracted by a simple technique. The approach is also robust to handwriting deformations such as slope variation and displacement of character lines, and has the advantage that character sets including handwritten kanji, with their many categories and many handwriting deformations, can be classified efficiently.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of one embodiment of the character pattern classification processing device according to the present invention. FIG. 2 illustrates the processing performed by circuit 1, device 2, device 3, and device 4 of FIG. 1. FIGS. 3 and 4 illustrate the feature extraction means of device 5, which is the main part of the present invention. FIG. 5 illustrates handwriting deformation of handwritten characters.

1: memory circuit; 2: character pattern normalization device; 3: character pattern framing device; 4: smoothing device; 5: character pattern feature extraction device; 6: character pattern classification device; 7: character frame; 8-1, 8-2, 8-3, 8-4: coordinate axes for observing the directional contributions and the similarity between two directional contributions; 9-1, 9-2, 9-3, 9-4: black points at which the scan changes from black to white or from white to black.

Patent applicant: Nippon Telegraph and Telephone Public Corporation. Agent: patent attorney 森 1) 寛.

Claims (1)

[Claims]
A character pattern classification processing device, characterized by comprising: means for normalizing a binarized character pattern with respect to the position and size of its character portion; means for framing the character pattern obtained by said normalizing means with a character frame; means for applying smoothing processing to the character pattern obtained by said framing means; first means for, with respect to the character pattern obtained by said smoothing means and for each of coordinate axes in a plurality of predetermined directions, scanning the character in the direction orthogonal to the coordinate axis and, when the scan crosses a character portion, extending tentacles from the black pixel of the crossed character portion in a plurality of predetermined directions to determine the black pixel connected length for each direction, and determining the directional contribution of the crossed black pixel from the black pixel connected lengths; second means for, when the scan in the direction orthogonal to the coordinate axis crosses the character portion a plurality of times, determining the similarity between the directional contribution of the character portion at a given crossing and the directional contribution of a character portion crossed earlier; and means for classifying the character pattern using at least the output information from the first means and the second means.
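The first and second means of the claim can be illustrated with a minimal Python sketch. The claim text itself does not give formulas, so several choices here are assumptions made only for illustration: four orientations (horizontal, vertical, two diagonals), a directional contribution defined as the run-length vector normalized to unit length, an inner-product similarity, and a scan that records only white-to-black transitions. All function names are hypothetical.

```python
import math

# Assumed orientations: each is a pair of opposite unit steps (d_row, d_col).
DIRECTIONS = [
    ((0, 1), (0, -1)),    # horizontal
    ((1, 0), (-1, 0)),    # vertical
    ((1, 1), (-1, -1)),   # diagonal, top-left to bottom-right
    ((1, -1), (-1, 1)),   # diagonal, top-right to bottom-left
]

def run_length(img, r, c, step):
    """Count consecutive black pixels starting next to (r, c) along step."""
    dr, dc = step
    n = 0
    r, c = r + dr, c + dc
    while 0 <= r < len(img) and 0 <= c < len(img[0]) and img[r][c] == 1:
        n += 1
        r, c = r + dr, c + dc
    return n

def directional_contribution(img, r, c):
    """Unit-length vector of black pixel connected lengths at black pixel (r, c)."""
    lengths = [
        1 + run_length(img, r, c, fwd) + run_length(img, r, c, back)
        for fwd, back in DIRECTIONS
    ]
    norm = math.sqrt(sum(l * l for l in lengths))  # >= 2, since every length >= 1
    return [l / norm for l in lengths]

def crossings_in_row(img, r):
    """Columns where a left-to-right scan of row r goes from white to black."""
    cols, prev = [], 0
    for c, v in enumerate(img[r]):
        if v == 1 and prev == 0:
            cols.append(c)
        prev = v
    return cols

def similarity(u, v):
    """Inner product of two unit-length contribution vectors."""
    return sum(a * b for a, b in zip(u, v))

# Example: a vertical stroke and a short horizontal stroke (1 = black).
pattern = [
    [0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
]
row = 2
contribs = [directional_contribution(pattern, row, c)
            for c in crossings_in_row(pattern, row)]
# Compare each crossing with the one crossed just before it on the same scan line.
for earlier, later in zip(contribs, contribs[1:]):
    print(similarity(earlier, later))
```

In this sketch a stroke crossed by the scan line yields a vector dominated by the stroke's own orientation, so two crossings of similarly oriented strokes give a similarity near 1, while a vertical stroke followed by a horizontal one gives a noticeably lower value. A classifier could then use both the contribution vectors and these similarities as features, as the final element of the claim describes.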
JP57107346A 1982-06-21 1982-06-21 Sorting processor of character pattern Granted JPS58223884A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP57107346A JPS58223884A (en) 1982-06-21 1982-06-21 Sorting processor of character pattern

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP57107346A JPS58223884A (en) 1982-06-21 1982-06-21 Sorting processor of character pattern

Publications (2)

Publication Number Publication Date
JPS58223884A true JPS58223884A (en) 1983-12-26
JPH0159627B2 JPH0159627B2 (en) 1989-12-19

Family

ID=14456718

Family Applications (1)

Application Number Title Priority Date Filing Date
JP57107346A Granted JPS58223884A (en) 1982-06-21 1982-06-21 Sorting processor of character pattern

Country Status (1)

Country Link
JP (1) JPS58223884A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0834826A3 (en) * 1996-10-01 1999-10-06 Canon Kabushiki Kaisha Positioning templates in optical character recognition systems
US6081621A (en) * 1996-10-01 2000-06-27 Canon Kabushiki Kaisha Positioning templates in optical character recognition systems

Also Published As

Publication number Publication date
JPH0159627B2 (en) 1989-12-19

Similar Documents

Publication Publication Date Title
US5841905A (en) Business form image identification using projected profiles of graphical lines and text string lines
US5930393A (en) Method and apparatus for enhancing degraded document images
Goto et al. Extracting curved text lines using local linearity of the text line
Chi et al. Separation of single-and double-touching handwritten numeral strings
JPH05342408A (en) Document image filing device
JPS58223884A (en) Sorting processor of character pattern
Ramel et al. A structural representation adapted to handwritten symbol recognition
JP2613959B2 (en) Fingerprint pattern classification device
Mitchell et al. Document page segmentation based on pattern spread analysis
JPS6238752B2 (en)
JP3104355B2 (en) Feature extraction device
Li An implementation of ocr system based on skeleton matching
JPH026112B2 (en)
JPH0425588B2 (en)
JPS634231B2 (en)
JPS634229B2 (en)
JP2789622B2 (en) Character / graphic area determination device
JP3009237B2 (en) Feature extraction method
JP2575402B2 (en) Character recognition method
JPS5855553B2 (en) Character pattern classification device
JP2918363B2 (en) Character classification method and character recognition device
GB2329738A (en) Determining relationship between line segments in pattern recognition
JPH0145669B2 (en)
JPS58165178A (en) Character reader
JPH022189B2 (en)