JP2797721B2

JP2797721B2 - Character recognition device

Info

Publication number: JP2797721B2
Application number: JP3000526A
Authority: JP
Inventors: 大輔西脇
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1991-01-08
Filing date: 1991-01-08
Publication date: 1998-09-17
Anticipated expiration: 2013-09-17
Also published as: JPH052660A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a character recognition device, and in particular to a character recognition device that performs character recognition by using at least two types of character classification means in combination.

[0002]

[Prior Art] In a conventional character recognition method based on pattern matching, the similarity between an unknown character to be recognized and each standard character stored in a recognition dictionary is calculated. If the maximum similarity is equal to or greater than a threshold set in advance for that standard character, and the difference between the maximum similarity and the next-highest similarity is equal to or greater than a second preset threshold, the method outputs the recognition result that the unknown character is of the same character type as the standard character showing the maximum similarity. It is known that setting these two thresholds to appropriate values for each character type suppresses the number of misread unknown characters while also reducing the number of unknown characters that cannot be recognized (see Japanese Examined Patent Publication No. 58-178484).
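The two-threshold rule described above can be sketched as follows. This is a minimal illustration, not the cited publication's implementation: the function name, the dictionary-based inputs, and the sample similarity values are all assumptions made for the example.

```python
def classify_by_similarity(similarities, min_similarity, min_margin):
    """Conventional two-threshold decision (illustrative sketch).

    similarities:   dict mapping character type -> similarity to the unknown character
    min_similarity: dict mapping character type -> threshold on the maximum similarity
    min_margin:     dict mapping character type -> threshold on (best - runner-up)
    Returns the accepted character type, or None when the character is rejected.
    """
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    (best, s_best), (_, s_next) = ranked[0], ranked[1]
    # Both conditions must hold: high enough, and clearly ahead of the runner-up.
    if s_best >= min_similarity[best] and (s_best - s_next) >= min_margin[best]:
        return best
    return None
```

With a threshold of 0.8 on the maximum similarity and 0.1 on the margin, a character scoring 0.92 against "A" and 0.70 against "B" is accepted as "A", while one scoring 0.92 and 0.90 is rejected because the margin is too small.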

[0003] To improve the recognition rate, combining two types of character recognition methods has also been considered. One known method of this kind that uses the same type of threshold is "A Method of Decision Theory in a Free Handwritten Character Recognition System" (1987 General National Convention of the Institute of Electronics and Communication Engineers, 1490). FIG. 6 outlines the decision algorithm of this method. Pattern matching of characters is assigned to steps 201 and 202, and structural analysis of characters to steps 204 and 205. A first threshold, which determines whether the pattern matching result alone is taken as the recognition result, is set in step 203. The similarity between the unknown character and each prestored standard character is calculated, and if the maximum similarity is equal to or greater than the threshold set in advance for that standard character, steps 203 and 210 are executed to output the recognition result that the unknown character is of the same character type as the standard character showing the maximum similarity.

[0004] If the maximum similarity is at or below the threshold, pattern matching alone is considered insufficient, so in step 204 features are extracted from the unknown character by structural analysis and compared with the standard. In step 207, a second threshold is used to determine whether the evaluation result, based on the evaluation value calculated for the first-candidate character code obtained at that point, is taken as the recognition result. In step 209, a third threshold is used to determine whether a character once rejected by the structural analysis is rescued as a recognition result. In this way, a total of three thresholds are set for each character type to be recognized, the recognition results considered stable at each decision are judged in turn against these thresholds, and candidates that do not pass the threshold processing are rejected in step 211.

[0005]

[Problems to Be Solved by the Invention] The conventional character recognition described above performs threshold processing multiple times, which reduces misreading but increases the number of rejected unknown characters and thus lowers the overall recognition rate. In addition, the above-described method that combines pattern matching and structural analysis to prevent a drop in the recognition rate requires three thresholds to be set for each character type. As the number of character types grows, setting these thresholds becomes complicated, and the larger number of decision steps increases the amount of processing.

[0006] An object of the present invention is to solve the above problems and to provide a character recognition device that calculates an evaluation value for each first-candidate character code in the classification results output by two, or three or more, types of classification means; makes the rejection decision by comparing those evaluation values with one another and by performing threshold processing once per candidate character code; and, even when the first candidates of the classification means do not agree, has an evaluation method by which a correct candidate, if one is present among them, is adopted as the final recognition result. The device thereby reduces the number of misread unknown characters and the number of rejected unknown characters at the same time.

[0007]

[Means for Solving the Problems] The character recognition device of the present invention comprises first and second classification means and determination means. The first and second classification means perform two types of character classification for character recognition by calculating the distance between an input unknown character pattern and two types of standard patterns for a character type K created from learning patterns. The determination means receives the first-candidate character code Ci, which is the candidate most similar to a standard pattern in the classification result of the first classification means, together with its distance value d1(i), and the first-candidate character code Cj in the classification result of the second classification means together with its distance value d1(j). When the character codes Ci and Cj match, that character code is taken as the final recognition result. When Ci and Cj differ, a principal-axis direction vector is obtained that represents the slope of the distribution of the two-dimensional vectors Ck(d1(k), d2(k)) of a recognition-candidate character code Ck on a plane whose X and Y axes are the distance values d1(k) and d2(k) of the first and second classification means. This vector is obtained by principal component analysis as the eigenvector Vk(p1(k), p2(k)) corresponding to the largest eigenvalue of the two-dimensional variance-covariance matrix formed with d1(k) and d2(k) as variables. An evaluation value Ek, obtained from the principal-axis direction vector and the two-dimensional vector Ck(d1(k), d2(k)) and serving as a measure for evaluating the character code Ck as a recognition result, is determined by selecting either the evaluation value Ei for the character code Ci or the evaluation value Ej for the character code Cj. In making this selection, the determination means decides, based on a threshold and on a comparison of the two values with each other, whether to select the character code corresponding to Ei or Ej as the final recognition result or to reject both.

[0008] The character recognition device of the present invention also comprises first and second classification means and determination means as follows. The first and second classification means perform two types of character classification for character recognition by calculating the distance between an input unknown character pattern and two types of standard patterns for a character type K created from learning patterns. The determination means receives the first-candidate character code Ci, which is the candidate most similar to a standard pattern in the classification result of the first classification means, together with its distance value d1(i), and the first-candidate character code Cj in the classification result of the second classification means together with its distance value d1(j). When the character codes Ci and Cj match, that character code is taken as the final recognition result. When Ci and Cj differ, the principal-axis direction vector of the distribution of the two-dimensional vectors Ck(d1(k), d2(k)) of a recognition-candidate character code Ck, on a plane whose X and Y axes are the distance values d1(k) and d2(k) of the first and second classification means, is obtained by principal component analysis as the eigenvector Vk(p1(k), p2(k)) corresponding to the largest eigenvalue of the two-dimensional correlation matrix formed with d1(k) and d2(k) as variables. An evaluation value Ek, obtained from the principal-axis direction vector and the two-dimensional vector Ck(d1(k), d2(k)) and serving as a measure for evaluating the character code Ck as a recognition result, is determined by selecting either the evaluation value Ei for the character code Ci or the evaluation value Ej for the character code Cj. In making this selection, the determination means decides, based on a threshold and on a comparison of the two values with each other, whether to select the character code corresponding to Ei or Ej as the final recognition result or to reject both.

[0009] The device of the present invention also comprises first to n-th classification means and determination means as follows. The first to n-th classification means perform three or more, n in total, types of character classification for character recognition by calculating the distance between an input unknown character pattern and two types of standard patterns for a character type K created from learning patterns. For the first-candidate character code Ch in the classification result of the m-th (m < n) classification means, the determination means refers to its distance value dm(h) and to the distance values for the character code Ch in the n-1 classification means other than the m-th, and generates an n-dimensional vector Ch(d1(h), d2(h), ..., dm(h), ..., dn(h)) in which these distance values are arranged in the order of the classification means. This is done for the first candidates of all classification means. When calculating, from these n-dimensional vectors, an evaluation value as a measure for evaluating each of the first-candidate character codes of all n classification means as a recognition result, if the first-candidate character codes of all classification means are identical, that character code is taken as the final recognition result. If they are not identical, the principal-axis direction vector of the distribution of the n-dimensional vectors Ck(d1(k), d2(k), ..., dm(k), ..., dn(k)), in the n-dimensional space whose coordinate axes are the distance values d1(k), d2(k), ..., dm(k), ..., dn(k) of the n classification means, is obtained by principal component analysis as the eigenvector Vk(p1(k), p2(k), ..., pm(k), ..., pn(k)) corresponding to the largest eigenvalue of the n-dimensional variance-covariance matrix formed with the elements d1(k), d2(k), ..., dm(k), ..., dn(k) of the n-dimensional vector as variables. An evaluation value Ek, obtained from the principal-axis direction vector and the n-dimensional vector Ck(d1(k), d2(k), ..., dm(k), ..., dn(k)) and serving as a measure for evaluating the character code Ck as a recognition result, is determined by selecting one of the n evaluation values for the n first-candidate character codes from the n classification means. In making this selection, the determination means decides, based on a threshold and on a comparison of the values with one another, whether to select the character code corresponding to one of the n evaluation values as the final recognition result or to reject all of them.

[0010] The character recognition device of the present invention also comprises first to n-th classification means and determination means as follows. The first to n-th classification means perform three or more, n in total, types of character classification for character recognition by calculating the distance between an input unknown character pattern and two types of standard patterns for a character type K created from learning patterns. For the first-candidate character code Ch in the classification result of the m-th (m < n) classification means, the determination means refers to its distance value dm(h) and to the distance values for the character code Ch in the n-1 classification means other than the m-th, and generates an n-dimensional vector Ch(d1(h), d2(h), ..., dm(h), ..., dn(h)) in which these distance values are arranged in the order of the classification means. This is done for the first candidates of all classification means. When calculating, from these n-dimensional vectors, an evaluation value as a measure for evaluating each of the first-candidate character codes of all n classification means as a recognition result, if the first-candidate character codes of all classification means are identical, that character code is taken as the final recognition result. If they are not identical, the principal-axis direction vector of the distribution of the n-dimensional vectors Ck(d1(k), d2(k), ..., dm(k), ..., dn(k)), in the n-dimensional space whose coordinate axes are the distance values d1(k), d2(k), ..., dm(k), ..., dn(k) of the n classification means, is obtained by principal component analysis as the eigenvector Vk(p1(k), p2(k), ..., pm(k), ..., pn(k)) corresponding to the largest eigenvalue of the n-dimensional correlation matrix formed with the elements d1(k), d2(k), ..., dm(k), ..., dn(k) of the n-dimensional vector as variables. An evaluation value Ek, obtained from the principal-axis direction vector and the n-dimensional vector Ck(d1(k), d2(k), ..., dm(k), ..., dn(k)) and serving as a measure for evaluating the character code Ck as a recognition result, is determined by selecting one of the n evaluation values for the n first-candidate character codes from the n classification means. In making this selection, the determination means decides, based on a threshold and on a comparison of the values with one another, whether to select the character code corresponding to one of the n evaluation values as the final recognition result or to reject all of them.

[0011]

[Operation] The operation of the present invention will be described with reference to FIG. 5, which takes as an example the case where two types of classification means are used in combination.

[0012] Consider calculating the evaluation value Ei for the first-candidate character code Ci in the classification result of the first classification means and the evaluation value Ej for the first-candidate character code Cj in the classification result of the second classification means. For each character type k to be classified, take a two-dimensional plane whose axes are the distance values of the first and second classification means, and plot the distance values d1(k) and d2(k) that the first and second classification means produce for the learning character patterns Ck belonging to character type k. The points are distributed as shown inside the closed curve S in FIG. 5. The principal axis Vk(p1(k), p2(k)) representing the slope of this distribution is as shown in FIG. 5 and is obtained by the well-known principal component analysis method.
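As a rough illustration of how the principal axis Vk might be obtained for one character type, the sketch below computes the eigenvector for the largest eigenvalue of the 2x2 variance-covariance matrix in closed form. The function name, the sample points, and the choice to orient the axis toward positive distances are assumptions for the example; the patent only states that principal component analysis is used.

```python
import math

def principal_axis(points):
    """Unit eigenvector (p1, p2) for the largest eigenvalue of the 2x2
    variance-covariance matrix of (d1, d2) training points."""
    n = len(points)
    m1 = sum(p[0] for p in points) / n
    m2 = sum(p[1] for p in points) / n
    sxx = sum((p[0] - m1) ** 2 for p in points) / n
    syy = sum((p[1] - m2) ** 2 for p in points) / n
    sxy = sum((p[0] - m1) * (p[1] - m2) for p in points) / n
    # Largest eigenvalue of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    lam = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    if sxy != 0:
        vx, vy = sxy, lam - sxx          # satisfies (A - lam*I) v = 0
    elif sxx >= syy:
        vx, vy = 1.0, 0.0                # axes already uncorrelated
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    vx, vy = vx / norm, vy / norm
    # Assumed convention: point the axis away from the origin.
    if vx + vy < 0:
        vx, vy = -vx, -vy
    return vx, vy

# Hypothetical distance pairs for learning patterns of one character type.
axis = principal_axis([(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)])
```

For these perfectly correlated sample points the axis lies along the diagonal, so both components equal 1/sqrt(2).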

[0013] For this principal axis Vk, a straight line L is set that is orthogonal to Vk and tangent to the distribution indicated by the curve S, as shown in FIG. 5. When the two-dimensional vector Ck(d1(k), d2(k)) of a candidate character code Ck lies on the origin side of this line, the character code Ck is regarded as belonging to character type k. As a result, a candidate for which only one of the distance values is small, such as the one marked with a square in FIG. 5, is also judged to belong to character type k. In contrast, the two-dimensional vector C'k(d1'(k), d2'(k)) of the candidate character code C'k marked with a triangle in FIG. 5 lies outside the hatched region, so it is not judged to belong to character type k and is rejected.

[0014] In this example, Ekmax is the value on the principal axis Vk at the intersection of the straight line L described above and the principal axis Vk obtained earlier. It serves as the decision threshold for whether the candidate character code Ck may be regarded as belonging to character type k.
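One way to read the threshold Ekmax is as the largest projection of the learning points onto the principal axis, since the tangent line L orthogonal to Vk touches the distribution at its farthest point along that axis. The sketch below follows that reading; it is an interpretation, not a procedure spelled out in the text, and the function names, axis, and sample points are hypothetical.

```python
def evaluation_value(d, axis):
    """Projection of a distance vector (d1, d2) onto the principal axis."""
    return d[0] * axis[0] + d[1] * axis[1]

def rejection_threshold(training_points, axis):
    """Assumed reading of Ekmax: the largest projection among learning points,
    i.e. where a line orthogonal to the axis just touches the distribution."""
    return max(evaluation_value(p, axis) for p in training_points)

def belongs_to_type(d, axis, e_max):
    """A candidate on the origin side of line L is accepted."""
    return evaluation_value(d, axis) <= e_max

# Hypothetical unit axis and learning points for one character type.
axis = (0.6, 0.8)
training = [(1.0, 1.0), (2.0, 3.0), (3.0, 2.0)]
e_max = rejection_threshold(training, axis)
```

Under these sample values the threshold is about 3.6, so a candidate at (0.5, 4.0), whose projection is about 3.5, is accepted even though its second distance is large, while (3.0, 3.0), with projection about 4.2, is rejected.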

[0015] By contrast, consider the conventional case in which thresholds are set independently for the first and second classification means, and let t1 and t2 be the respective thresholds. If t1 and t2 are the maximum distance values for the learning patterns, the region in which neither is exceeded is the rectangle in FIG. 5 bounded by the vertical axis, the horizontal axis, the dash-dot line, and the dash-double-dot line. This sets a larger rejection region than necessary and lowers the recognition rate.
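The difference between the two acceptance regions can be made concrete with a small numerical sketch. All values here are invented for illustration, the axis is assumed rather than computed, and the threshold for the projected rule is taken as the largest projection of the learning points, which is one reading of the tangent-line construction.

```python
import math

def accept_rectangular(d, t1, t2):
    """Conventional rule: each classifier's distance must pass its own threshold."""
    return d[0] <= t1 and d[1] <= t2

def accept_projected(d, axis, e_max):
    """Rule of FIG. 5: the projection onto the principal axis must not exceed Ekmax."""
    return d[0] * axis[0] + d[1] * axis[1] <= e_max

# Hypothetical learning points and an assumed diagonal principal axis.
training = [(1.0, 1.0), (2.0, 3.0), (3.0, 2.0)]
axis = (math.sqrt(0.5), math.sqrt(0.5))

t1 = max(p[0] for p in training)   # per-classifier maxima, as in the text
t2 = max(p[1] for p in training)
e_max = max(p[0] * axis[0] + p[1] * axis[1] for p in training)

# A candidate that one classifier matches very closely and the other poorly.
candidate = (0.5, 4.0)
rect_ok = accept_rectangular(candidate, t1, t2)
proj_ok = accept_projected(candidate, axis, e_max)
```

Here the candidate falls outside the rectangle because its second distance exceeds t2, yet it lies on the origin side of the tangent line and is accepted by the projected rule, which corresponds to the square-marked case in FIG. 5.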

[0016] As this comparison makes clear, the present invention allows a wider non-rejected region. A candidate for which one classification method outputs a very small distance value for its first-candidate character code can be judged to be of that character type as long as its evaluation value is at or below the threshold.

[0017] The means that decides which of the first-candidate character code Ci of the first classification means and the first-candidate character code Cj of the second classification means to output, or whether to reject them, operates as follows. If the evaluation values Ei and Ej calculated for the first-candidate character codes Ci and Cj both exceed their thresholds Eimax and Ejmax, the first-candidate character codes Ci and Cj output by the first and second classification means are rejected. Otherwise, because a smaller evaluation value means the candidate is closer to the origin, the smaller of the two evaluation values is examined. If it does not exceed the decision threshold for its character type, the first-candidate character code for that evaluation value is taken as the final recognition result; if it does, the first-candidate character codes Ci and Cj are rejected. The same operation applies when three types of classification means are used in combination.
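The selection rule in this paragraph can be written as a short function. This is a sketch under stated assumptions: the function and parameter names are invented, rejection is represented by None, and a tie between the two evaluation values is resolved in favor of the first classifier, which the text does not specify.

```python
def decide(ci, ei, ei_max, cj, ej, ej_max):
    """Choose between two disagreeing first candidates, or reject (None).

    ci, cj:         first-candidate character codes of classifiers 1 and 2
    ei, ej:         their evaluation values
    ei_max, ej_max: decision thresholds for the character types of ci and cj
    """
    # Both candidates lie beyond their thresholds: reject.
    if ei > ei_max and ej > ej_max:
        return None
    # Otherwise examine the candidate with the smaller evaluation value
    # (ties go to the first classifier; this tie rule is an assumption).
    if ei <= ej:
        code, e, e_max = ci, ei, ei_max
    else:
        code, e, e_max = cj, ej, ej_max
    # Accept it only if it stays within the threshold for its own type.
    return code if e <= e_max else None
```

Note that a single threshold comparison is made per candidate, in contrast to the three per-type thresholds of the conventional method.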

[0018]

[Embodiments] Next, the present invention will be described with reference to the drawings.

[0019] FIG. 1 is a block diagram showing the configuration of a first embodiment of the present invention, FIG. 2 is a flowchart showing the operation of the first embodiment of FIG. 1, FIG. 3 is a block diagram showing the configuration of a second embodiment of the present invention, and FIG. 4 is a flowchart showing the operation of the second embodiment of FIG. 3.

[0020] The first embodiment shown in FIG. 1 comprises a character pattern input unit 1 that inputs a character pattern, a first classification unit 14 constituting the first classification means, a second classification unit 15 constituting the second classification means, a determination unit 16 that judges the character pattern based on the classification results of the first classification unit 14 and the second classification unit 15, a final recognition result output unit 13 that outputs the final recognition result, and a control unit 17 that controls the operation of each unit.

[0021] The character pattern input unit 1 is an ordinary image scanner that captures characters on a form into the recognition system. The captured character image is supplied to the first classification unit 14 and the second classification unit 15 and is classified by the two types of means. When the first-candidate character codes in the classification results of the first classification unit 14 and the second classification unit 15 are identical, the determination unit 16 transfers that code to the final recognition result output unit 13 as the final recognition result.

[0022] When the first-candidate character codes of the two classification results differ, the determination unit 16 decides which first-candidate character code, that of the first classification unit 14 or that of the second classification unit 15, to adopt as the final recognition result, or whether to reject the first-candidate character codes of both classification results. The control unit 17 controls the operation of the character pattern input unit 1, the first classification unit 14, the second classification unit 15, the determination unit 16, and the final recognition result output unit 13.

[0023] The first classification unit 14 comprises a first character recognition unit 2, a first classification dictionary memory 3 referenced by the first character recognition unit 2, and a first classification result storage memory 4 that stores the classification results output by the first character recognition unit 2. The second classification unit 15 comprises a second character recognition unit 5, a second classification dictionary memory 6 referenced by the second character recognition unit 5, and a second classification result storage memory 7 that stores the classification results output by the second character recognition unit 5.

[0024] The first classification dictionary memory 3 and the second classification dictionary memory 6 store feature values, created in advance from learning patterns, that form the different standard patterns handled by the first character recognition unit 2 and the second character recognition unit 5, respectively. The first classification unit 14 and the second classification unit 15 each calculate the distance between the unknown character pattern and the standard pattern for each classification target character type k stored in its classification dictionary memory, and store the distances in ascending order, together with the corresponding character codes, in the first classification result storage memory 4 and the second classification result storage memory 7.
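A classification unit of this kind could be sketched as below. The text does not specify the distance measure or the feature representation, so Euclidean distance between fixed-length feature tuples is assumed here purely for illustration; the function name and sample dictionary are likewise hypothetical.

```python
import math

def classify(features, dictionary):
    """Return (character code, distance) pairs sorted by ascending distance.

    features:   feature vector of the unknown character pattern
    dictionary: dict mapping character code -> standard-pattern feature vector
    Euclidean distance is an assumption; math.dist requires Python 3.8+.
    """
    results = [(code, math.dist(features, std)) for code, std in dictionary.items()]
    results.sort(key=lambda pair: pair[1])   # smallest distance = first candidate
    return results

# Hypothetical two-entry dictionary and unknown pattern.
ranked = classify((0.0, 1.0), {"A": (0.0, 0.0), "B": (3.0, 4.0)})
```

The first element of the returned list plays the role of the first candidate, and the full list corresponds to what is written to the classification result storage memory.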

[0025] When the processing of the first classification unit 14 and the second classification unit 15 is finished, the determination unit 16 is started by the control unit 17. It extracts the first-candidate character code of the first classification unit 14 and its distance value from the first classification result storage memory 4, appends the distance value that the second classification unit 15 obtained for that same character code, read from the second classification result storage memory 7, and stores the result as a two-dimensional vector in the integration result storage register 10. At the same time, it extracts the first-candidate character code of the second classification unit 15 and its distance value from the second classification result storage memory 7, appends the distance value that the first classification unit 14 obtained for that same character code, read from the first classification result storage memory 4, and stores the result as a two-dimensional vector in the integration result storage register 10. If the first-candidate character codes of the first classification unit 14 and the second classification unit 15 are identical, that character code is output to the final recognition result output unit 13 as the final recognition result.
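The integration step can be illustrated with the following sketch. It assumes that each classifier's stored result list contains a distance for every character code, so the other classifier's distance for a given first candidate can always be looked up; the function name and return shape are choices made for the example.

```python
def integrate(results1, results2):
    """Combine two classifiers' ranked results into 2-D distance vectors.

    results1, results2: lists of (character code, distance), ascending by distance.
    Returns (code, None) when the first candidates agree, otherwise
    (None, {code: (d1, d2)}) with one vector per disagreeing first candidate.
    """
    ci, d1_i = results1[0]
    cj, d2_j = results2[0]
    if ci == cj:
        return ci, None                  # agreement: final recognition result
    lookup1, lookup2 = dict(results1), dict(results2)
    vectors = {
        ci: (d1_i, lookup2[ci]),         # classifier 2's distance for Ci appended
        cj: (lookup1[cj], d2_j),         # classifier 1's distance for Cj appended
    }
    return None, vectors
```

For example, if the first classifier ranks "A" at 1.0 and "B" at 2.0 while the second ranks "B" at 0.5 and "A" at 3.0, the function yields the vectors (1.0, 3.0) for "A" and (2.0, 0.5) for "B".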

[0026] After the processing of the classification result integration unit 8 is finished, and when the classification results of the first classification unit 14 and the second classification unit 15 differ, the evaluation value calculation unit 9 is started by the control unit 17. Referring to the integration result storage register 10 and the evaluation value calculation dictionary memory 11, it calculates the evaluation values Ei and Ej for the candidate character codes output by the first classification unit 14 and the second classification unit 15, for example by the following equations (1) and (2).

Ei = d1(i) p1(i) + d2(i) p2(i) ... (1)

Ej = d1(j) p1(j) + d2(j) p2(j) ... (2)

The evaluation value calculation dictionary memory 11 stores, for each character type k, the principal axis Vk(p1(k), p2(k)) obtained from the variance-covariance matrix and the decision threshold Ekmax. The principal axis Vk for each character type k is calculated using the two-dimensional variance-covariance matrix with the distances (d1(k), d2(k)) described above as variables, but when the scales of the distance values output by the first classification unit 14 and the second classification unit 15 differ greatly, the correlation matrix is used instead. This removes the effect whereby the evaluation value is weighted toward the classification means whose distance values have the larger variance, so that the threshold determination unit 12 can evaluate the distance values output by the first classification unit 14 and the second classification unit 15 on an equal footing.
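The effect of switching from the variance-covariance matrix to the correlation matrix can be seen in a small numerical sketch. The sample points, in which the second classifier's distances are ten times larger in scale, and all function names are assumptions made for illustration; the evaluation value is computed as the inner product in equations (1) and (2).

```python
import math

def second_moments(points):
    """Variances and covariance of 2-D points: (sxx, syy, sxy)."""
    n = len(points)
    m1 = sum(p[0] for p in points) / n
    m2 = sum(p[1] for p in points) / n
    sxx = sum((p[0] - m1) ** 2 for p in points) / n
    syy = sum((p[1] - m2) ** 2 for p in points) / n
    sxy = sum((p[0] - m1) * (p[1] - m2) for p in points) / n
    return sxx, syy, sxy

def top_eigenvector(a, b, c):
    """Unit eigenvector for the largest eigenvalue of [[a, b], [b, c]]."""
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b * b)
    v = (b, lam - a) if b != 0 else ((1.0, 0.0) if a >= c else (0.0, 1.0))
    norm = math.hypot(v[0], v[1])
    return v[0] / norm, v[1] / norm

def evaluation_value(d, axis):
    """E = d1 * p1 + d2 * p2, as in equations (1) and (2)."""
    return d[0] * axis[0] + d[1] * axis[1]

# Hypothetical learning points: classifier 2 reports much larger distances.
points = [(1.0, 10.0), (2.0, 30.0), (3.0, 20.0), (4.0, 40.0)]
sxx, syy, sxy = second_moments(points)
cov_axis = top_eigenvector(sxx, sxy, syy)        # variance-covariance matrix
r = sxy / math.sqrt(sxx * syy)                   # correlation coefficient
cor_axis = top_eigenvector(1.0, r, 1.0)          # correlation matrix
```

With these values the covariance-based axis points almost entirely along the second classifier's distance, so that classifier dominates the evaluation value, whereas the correlation-based axis gives both classifiers equal weight.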

【００２７】しきい値判定部１２は、評価値計算部９の
処理が終了すると制御部１７によって起動され、評価値
計算部９によって計算された第一分類部１４，第二分類
部１５が出力した候補の文字コードに対する評価値の比
較を評価値計算用辞書メモリ１１に記憶したしきい値と
の比較により行い、どちらの分類結果を採用するか、も
しくはどちらの分類結果も棄却するかを判定する。その
アルゴリズムの一例を図２を用いて説明する。The threshold value determination unit 12 is started by the control unit 17 when the evaluation value calculation unit 9 finishes processing. It compares the evaluation values that the evaluation value calculation unit 9 calculated for the candidate character codes output by the first classification unit 14 and the second classification unit 15 against the threshold values stored in the evaluation value calculation dictionary memory 11, and determines which classification result to adopt or whether to reject both. An example of the algorithm is described with reference to FIG. 2.

【００２８】ステップ１０１，１０３によって抽出され
た文字候補とステップ１０２，１０４によって抽出され
た文字コードが同一である場合はステップ１０５でその
候補の文字コードを最終認識結果として採用するステッ
プ１１０を実行する。この処理は分類結果統合部８が行
う。If the candidate character code extracted in steps 101 and 103 is the same as the one extracted in steps 102 and 104, this is detected in step 105 and step 110 is executed, which adopts that candidate character code as the final recognition result. This processing is performed by the classification result integration unit 8.

【００２９】しきい値判定部１２は、ステップ１０５か
ら先の処理を実行する。評価値計算部９で得た２つの分
類部の出力する候補の文字コードに対する評価値Ｅｉ，
Ｅｊがどちらも評価値計算用辞書メモリ１１のしきい値
を超えている場合には、どちらの候補の文字コードも棄
却するステップ１１１を実行する。そうでない場合に
は、しきい値を超えていない方の候補の文字コードを選
択し、ステップ１０９によりその評価値がもう一方の評
価値以下であれば、該当する候補の文字コードを最終認
識結果として採用し、その文字コードを最終認識結果出
力部１３に送り、そうでなければ該当する候補の文字コ
ードを棄却し、棄却に対応する予め設定してある棄却コ
ードを最終認識結果出力部１３に送る。The threshold value determination unit 12 executes the processing from step 105 onward. If the evaluation values Ei and Ej, obtained by the evaluation value calculation unit 9 for the candidate character codes output by the two classification units, both exceed the threshold values in the evaluation value calculation dictionary memory 11, step 111 is executed and both candidate character codes are rejected. Otherwise, the candidate character code whose evaluation value does not exceed its threshold is selected. In step 109, if its evaluation value is equal to or less than the other evaluation value, that candidate character code is adopted as the final recognition result and sent to the final recognition result output unit 13. If not, the candidate character code is rejected and a preset rejection code corresponding to the rejection is sent to the final recognition result output unit 13.
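The decision flow of the first embodiment can be sketched as below. This is an illustrative reading of the steps in the text, not the flowchart of FIG. 2 itself: the function name `decide`, the use of `None` as the rejection code, and the handling of the case where both candidates are within their thresholds (the smaller evaluation value wins, with ties going to the first classification unit) are assumptions, since the text does not spell that case out.

```python
REJECT = None  # hypothetical stand-in for the preset rejection code

def decide(code_i, e_i, e_i_max, code_j, e_j, e_j_max):
    """Pick a final recognition result from two first candidates.

    code_i, e_i, e_i_max: candidate of the first classification unit, its
        evaluation value Ei, and the stored threshold for that character type.
    code_j, e_j, e_j_max: the same for the second classification unit.
    """
    # Steps 105/110: identical candidates are adopted without further checks.
    if code_i == code_j:
        return code_i
    within_i = e_i <= e_i_max
    within_j = e_j <= e_j_max
    # Step 111: both evaluation values exceed their thresholds.
    if not within_i and not within_j:
        return REJECT
    # Step 109: a candidate within its threshold is adopted only if its
    # evaluation value is no larger than the other candidate's.
    if within_i and e_i <= e_j:
        return code_i
    if within_j and e_j <= e_i:
        return code_j
    return REJECT
```

For example, `decide("A", 1.0, 2.0, "B", 3.0, 2.0)` returns `"A"`, while `decide("A", 1.5, 2.0, "B", 1.0, 0.5)` returns the rejection value, because the only candidate within its threshold has the larger evaluation value.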

【００３０】最終認識結果出力部１３は、制御部１７の
命令に応じて、判定部１６の出力する最終認識結果を文
字コードとして出力する。The final recognition result output unit 13 outputs the final recognition result output from the determination unit 16 as a character code in response to an instruction from the control unit 17.

【００３１】次に、図３を参照して本発明の第二の実施
例について説明する。図３の第二の実施例の図１の第一
の実施例との相違点は、第一分類部１４，第二分類部１
５が同一の候補の文字コードを出力する場合でも、ただ
ちにそれを最終認識結果として最終認識結果出力部１３
に出力するのではなく、第一の実施例において、第一分
類部１４，第二分類部１５が異なった候補の文字コード
を出力する場合と同様に、判定部１６でしきい値判定処
理を行なう点であり、従って、図１において設定した対
判定処理回避ルートＲ１が削除されている。Next, a second embodiment of the present invention will be described with reference to FIG. 3. The second embodiment of FIG. 3 differs from the first embodiment of FIG. 1 in that, even when the first classification unit 14 and the second classification unit 15 output the same candidate character code, that code is not immediately sent to the final recognition result output unit 13 as the final recognition result. Instead, the determination unit 16 performs threshold determination processing, just as it does in the first embodiment when the first classification unit 14 and the second classification unit 15 output different candidate character codes. Accordingly, the determination processing bypass route R1 provided in FIG. 1 is removed.

【００３２】本第二の実施例における対判定部１６での
判定アルゴリズムを図４に示す。判定部１６では、ステ
ップ１０５の判断処理をおこなうところまでは同一であ
るが、第一分類部１４，第二分類部１５の候補の文字コ
ードが同一だった場合に対しても評価値計算を行ない、
ステップ１１２の結果がステップ１１３で予め設定され
ているしきい値以下であれば最終認識結果として採用
し、その文字コードを最終認識結果出力部１３に送り、
そうでなければ該当する候補の文字コードを棄却するス
テップ１１１を実行し、棄却に対応する予め設定してあ
る棄却コードを最終認識結果出力部１３に送る。FIG. 4 shows the determination algorithm used by the determination unit 16 in the second embodiment. Processing is the same as in the first embodiment up to the decision in step 105, but the evaluation value is also calculated when the candidate character codes of the first classification unit 14 and the second classification unit 15 are the same. If the result of step 112 is equal to or less than the preset threshold value in step 113, the character code is adopted as the final recognition result and sent to the final recognition result output unit 13. Otherwise, step 111 is executed to reject the candidate character code, and a preset rejection code corresponding to the rejection is sent to the final recognition result output unit 13.
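The extra check that the second embodiment applies when both classification units agree can be sketched as follows. The function name `decide_when_codes_agree`, the `reject_code` parameter, and the argument layout are hypothetical. The sketch assumes the evaluation value is the same projection onto the stored main axis that equations (1) and (2) use.

```python
def decide_when_codes_agree(code, d1, d2, axis, e_max, reject_code=None):
    """Second-embodiment handling of identical first candidates.

    code:   the character code both classification units output
    d1, d2: distance values from the first and second classification units
    axis:   stored main axis (p1, p2) for this character type
    e_max:  stored threshold for this character type
    """
    p1, p2 = axis
    e = d1 * p1 + d2 * p2                    # step 112: evaluation value
    return code if e <= e_max else reject_code  # step 113, else step 111

# Both units say "A", but the distances are far larger than on learning data.
result = decide_when_codes_agree("A", 9.0, 9.0, (0.6, 0.8), e_max=5.0)
```

Here `result` is the rejection value, which is how a simultaneous misread by both classification units can still be caught.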

【００３３】このようにして、第一分類部１４，第二分
類部１５の出力する候補の文字コードが同じ場合でも、
その距離値がどちらも学習用のパターンで計算したもの
に比べて大きいものであればそれを棄却することによ
り、第一分類部１４，第二分類部１５が同時に誤読して
いるような場合にも対応でき、認識率の低下を第一の実
施例以下に抑圧することが可能となる。In this way, even when the candidate character codes output by the first classification unit 14 and the second classification unit 15 are the same, the candidate is rejected if both distance values are large compared with those calculated from the learning patterns. This handles the case where the first classification unit 14 and the second classification unit 15 misread at the same time, and the drop in recognition rate can be kept at or below that of the first embodiment.

【００３４】以上、本発明に係わる２種の分類手段を併
用した場合について２つの実施例を説明してきたが、第
一分類部１４，第二分類部１５で計算する距離の尺度は
文字認識において通常使用されるものに限らず、いずれ
の距離尺度、または類似性の尺度でも対応が可能であ
り、またそれらの混用も可能である。また、３種類以上
の分類手段を併用した場合においても容易に前述した２
つの実施例に準じて構成可能である。なお、上述した２
つの実施例では、ハードウェアを基本として構成した例
を示したが、これらの各機能はハードウェアまたはソフ
トウェアのいずれの手段によって実現してもよい。Two embodiments using two types of classification means according to the present invention have been described above. The distance measures calculated by the first classification unit 14 and the second classification unit 15 are not limited to those normally used in character recognition: any distance measure or similarity measure can be used, and they can also be mixed. A configuration using three or more types of classification means can likewise be readily built along the lines of the two embodiments described above. Although the two embodiments above were shown as hardware-based configurations, each of these functions may be implemented in either hardware or software.

【００３５】[0035]

【発明の効果】以上説明したように本発明によれば、少
なくとも２種の文字分類手段を併用することにより誤読
される文字数を減らすと同時に、各文字分類手段が出力
する候補の文字コードに対して計算された評価値の大小
比較と１回のしきい値判定処理で棄却判定を行なうこと
により、２種類の文字分類手段に対して個別に棄却判定
用のしきい値を設定する場合に比べ、棄却される文字数
を大幅に減らすことができ、認識精度を著しく向上する
ことができる効果がある。As described above, according to the present invention, using at least two types of character classification means together reduces the number of misread characters. At the same time, the rejection decision is made by comparing the magnitudes of the evaluation values calculated for the candidate character codes output by each character classification means and by a single threshold determination. Compared with setting rejection thresholds individually for the two types of character classification means, this greatly reduces the number of rejected characters and significantly improves recognition accuracy.

[Brief description of the drawings]

【図１】本発明の第一の実施例の構成を示すブロック図
である。FIG. 1 is a block diagram showing a configuration of a first embodiment of the present invention.

【図２】図１の第一の実施例の動作を示すフローチャー
トである。FIG. 2 is a flowchart showing the operation of the first embodiment of FIG.

【図３】本発明の第二の実施例の構成を示すブロック図
である。FIG. 3 is a block diagram showing a configuration of a second exemplary embodiment of the present invention.

【図４】図３の第二の実施例の動作を示すフローチャー
トである。FIG. 4 is a flowchart showing the operation of the second embodiment of FIG. 3.

【図５】本発明の作用の説明図である。FIG. 5 is an explanatory diagram of the operation of the present invention.

【図６】従来の文字認識装置の動作を示すフローチャー
トである。FIG. 6 is a flowchart showing the operation of a conventional character recognition device.

[Explanation of symbols]

１ 文字パターン入力部 Character pattern input unit
２ 第一文字認識部 First character recognition unit
３ 第一分類辞書メモリ First classification dictionary memory
４ 第一分類結果格納メモリ First classification result storage memory
５ 第二文字認識部 Second character recognition unit
６ 第二分類辞書メモリ Second classification dictionary memory
７ 第二分類結果格納メモリ Second classification result storage memory
８ 分類結果統合部 Classification result integration unit
９ 評価値計算部 Evaluation value calculation unit
１０ 統合結果格納レジスタ Integrated result storage register
１１ 評価値計算用辞書メモリ Evaluation value calculation dictionary memory
１２ しきい値判定部 Threshold value judgment unit
１３ 最終認識結果出力部 Final recognition result output unit
１４ 第一分類部 First classification unit
１５ 第二分類部 Second classification unit
１６ 判定部 Judgment unit
１７ 制御部 Control unit

フロントページの続き (56)参考文献特開昭63−263588（ＪＰ，Ａ) 特開昭57−25082（ＪＰ，Ａ) 特開昭63−79191（ＪＰ，Ａ) 特開昭62−74188（ＪＰ，Ａ) 電子情報通信学会全国大会講演論文集、ＶＯＬ．1991，ＮＯ．ＳＰＲＩＮＧＰＴ７ＰＡＧＥ．７−244 電子情報通信学会技術研究報告、ＶＯＬ．89，ＮＯ．436（ＰＲＵ89 118− 125）ＰＡＧＥ．15−22 (58)調査した分野(Int.Cl.⁶，ＤＢ名) G06K 9/62 G06K 9/03
Continuation of the front page
(56) References
JP-A-63-263588 (JP, A)
JP-A-57-25082 (JP, A)
JP-A-63-79191 (JP, A)
JP-A-62-74188 (JP, A)
Proceedings of the IEICE National Convention, Vol. 1991, No. Spring, Pt. 7, p. 7-244
IEICE Technical Report, Vol. 89, No. 436 (PRU89 118-125), pp. 15-22
(58) Fields surveyed (Int. Cl.⁶, DB name): G06K 9/62, G06K 9/03

Claims

(57) [Claims]

1. A character recognition device comprising: first and second classification means for performing two types of character classification for character recognition by calculating distances between two types of standard patterns relating to a character type k, created from learning patterns, and an input unknown character pattern; and determination means which receives the first candidate character code Ci most similar to the standard pattern and its distance value d1(i) in the classification result of the first classification means, and the first candidate character code Cj and its distance value d2(j) in the classification result of the second classification means, which takes the character code as the final recognition result if the character codes Ci and Cj match, and which, if the character codes Ci and Cj differ, obtains by principal component analysis a principal axis direction vector representing the inclination of the distribution of the two-dimensional vectors Ck(d1(k), d2(k)) of each recognition candidate character code Ck on a plane whose X axis and Y axis are the distance values d1(k) and d2(k) of the first classification means and the second classification means, respectively, as the eigenvector Vk(p1(k), p2(k)) corresponding to the maximum eigenvalue of the two-dimensional variance-covariance matrix formed with the distance values d1(k) and d2(k) as variables, further obtains, on the basis of the principal axis direction vector and the two-dimensional vector Ck(d1(k), d2(k)), an evaluation value Ek serving as a measure for evaluating the character code Ck as a recognition result, namely the evaluation value Ei for the character code Ci and the evaluation value Ej for the character code Cj, and determines, on the basis of a threshold value and a comparison of their magnitudes, whether to select one of the character codes corresponding to the evaluation values Ei and Ej as the final recognition result or to reject both.

2. A character recognition device comprising: first and second classification means for performing two types of character classification for character recognition by calculating distances between two types of standard patterns relating to a character type k, created from learning patterns, and an input unknown character pattern; and determination means which receives the first candidate character code Ci most similar to the standard pattern and its distance value d1(i) in the classification result of the first classification means, and the first candidate character code Cj and its distance value d2(j) in the classification result of the second classification means, which takes the character code as the final recognition result if the character codes Ci and Cj match, and which, if the character codes Ci and Cj differ, obtains by principal component analysis a principal axis direction vector representing the inclination of the distribution of the two-dimensional vectors Ck(d1(k), d2(k)) of each recognition candidate character code Ck on a plane whose X axis and Y axis are the distance values d1(k) and d2(k) of the first classification means and the second classification means, respectively, as the eigenvector Vk(p1(k), p2(k)) corresponding to the maximum eigenvalue of the two-dimensional correlation matrix formed with the distance values d1(k) and d2(k) as variables, further obtains, on the basis of the principal axis direction vector and the two-dimensional vector Ck(d1(k), d2(k)), an evaluation value Ek serving as a measure for evaluating the character code Ck as a recognition result, namely the evaluation value Ei for the character code Ci and the evaluation value Ej for the character code Cj, and determines, on the basis of a threshold value and a comparison of their magnitudes, whether to select one of the character codes corresponding to the evaluation values Ei and Ej as the final recognition result or to reject both.

3. A character recognition device comprising: first to n-th classification means for performing all of three or more types of character classification for character recognition by calculating distances between two types of standard patterns relating to a character type k, created from learning patterns, and an input unknown character pattern; and determination means which generates, for the first candidates of all the classification means, the n-dimensional vector Ch(d1(h), d2(h), ..., dm(h), ..., dn(h)) from the distance values of the first candidate character code Ch in the classification result of the m-th (m<n) classification means, namely the distance value dm(h) and those of the n−1 classification means other than the m-th, which, when calculating from these n-dimensional vectors the evaluation values serving as measures for evaluating the first candidate character codes in the classification results of all n classification means as recognition results, takes the character code as the final recognition result if the first candidate character codes in the classification results of all the classification means are the same, and which otherwise obtains by principal component analysis a principal axis direction vector representing the inclination of the distribution of the n-dimensional vectors Ck(d1(k), d2(k), ..., dm(k), ..., dn(k)) in the n-dimensional space whose coordinate axes are the distance values d1(k), d2(k), ..., dm(k), ..., dn(k), as the eigenvector Vk(p1(k), p2(k), ..., pm(k), ..., pn(k)) obtained with the elements d1(k), d2(k), ..., dm(k), ..., dn(k) as variables, further obtains, on the basis of the principal axis direction vector and the n-dimensional vector Ck(d1(k), d2(k), ..., dm(k), ..., dn(k)), an evaluation value Ek serving as a measure for evaluating the character code Ck as a recognition result for each of the n first candidate character codes from the n classification means, and determines, on the basis of a threshold value and a comparison of their magnitudes, whether to select the character code corresponding to one of the n evaluation values as the final recognition result or to reject all of them.

4. A character recognition device comprising: first to n-th classification means for performing all of three or more types of character classification for character recognition by calculating distances between two types of standard patterns relating to a character type k, created from learning patterns, and an input unknown character pattern; and determination means which generates, for the first candidates of all the classification means, the n-dimensional vector Ch(d1(h), d2(h), ..., dm(h), ..., dn(h)) from the distance values of the first candidate character code Ch in the classification result of the m-th (m<n) classification means, namely the distance value dm(h) and those of the n−1 classification means other than the m-th, which, when calculating from these n-dimensional vectors the evaluation values serving as measures for evaluating the first candidate character codes in the classification results of all n classification means as recognition results, takes the character code as the final recognition result if the first candidate character codes in the classification results of all the classification means are the same, and which otherwise obtains by principal component analysis a principal axis direction vector representing the inclination of the distribution of the n-dimensional vectors Ck(d1(k), d2(k), ..., dm(k), ..., dn(k)) in the n-dimensional space whose coordinate axes are the distance values d1(k), d2(k), ..., dm(k), ..., dn(k), as the eigenvector Vk(p1(k), p2(k), ..., pm(k), ..., pn(k)) corresponding to the maximum eigenvalue of the n-dimensional correlation matrix formed with the elements d1(k), d2(k), ..., dm(k), ..., dn(k) as variables, further obtains, on the basis of the principal axis direction vector and the n-dimensional vector Ck(d1(k), d2(k), ..., dm(k), ..., dn(k)), an evaluation value Ek serving as a measure for evaluating the character code Ck as a recognition result for each of the n first candidate character codes from the n classification means, and determines, on the basis of a threshold value and a comparison of their magnitudes, whether to select the character code corresponding to one of the n evaluation values as the final recognition result or to reject all of them.