JPS59158482A - Character recognizing device - Google Patents
Character recognizing device
- Publication number
- JPS59158482A (application JP58032426A)
- Authority
- JP
- Japan
- Prior art keywords
- recognition
- result
- answer
- characters
- character
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Character Discrimination (AREA)
Description
[Detailed description of the invention]
[Technical field of the invention]
The present invention relates to an optical character reading device.
In recent years, optical character readers (hereinafter, OCR) have come to include kanji OCR, which can read text containing kanji (including hiragana) in addition to alphanumeric characters, kana characters, and special symbols (collectively called ANK characters). In such a kanji OCR, the characters to be read comprise ANK characters (about 128 character types) and kanji (2000 or more character types), so the number of character types to be recognized is very large. Consequently, unlike a recognition scheme for ANK characters only, the scheme becomes complex and large-scale, requiring, for example, a large-capacity dictionary memory.
However, although a conventional kanji OCR can read a very large number of character types, it has the drawback that, when recognizing ANK characters contained in text, its recognition accuracy is lower than that of a recognition scheme dedicated to ANK characters. This is because the character types to be recognized span a wide range and are very numerous, so the recognition accuracy per character falls.
This invention was made in view of the above circumstances. Its object is to provide a character recognition device that can reliably recognize ANK characters even when reading text consisting of ANK characters and kanji, and that thereby improves character reading accuracy.
This invention provides both an ANK recognition unit, which recognizes ANK characters drawn from a limited range of character types, and a kanji recognition unit, which recognizes characters drawn from a wide range of character types. The recognition results of both units are supplied to a recognition collation determination unit, which determines and outputs the final answer. The determination unit stores in advance, as a determination table, recognition statistical data based on differences in similarity values, and outputs the final answer based on this statistical data.
This makes it possible to recognize ANK characters reliably when reading text consisting of ANK characters and kanji.
An embodiment of the invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of a character recognition system according to the invention. In the figure, 1 is a preprocessing circuit. It receives a character pattern P obtained by scanning a form and performs preprocessing such as noise removal on that pattern. The character pattern P normally consists of a photoelectrically converted digital signal. The character pattern P output from the preprocessing circuit 1 is supplied to both the ANK recognition unit 2 and the kanji recognition unit 3. The ANK recognition unit 2 recognizes characters from a limited range of character types (ANK characters), such as alphanumeric characters, kana characters, and special symbols. The kanji recognition unit 3 recognizes the characters of text consisting of ANK characters and kanji (including hiragana). The recognition collation determination circuit (hereinafter, determination circuit) 4 consists of a microprocessor or the like. For the recognition results R1 and R2 output from the recognition units 2 and 3, it makes a recognition decision based on recognition statistical data stored in memory in advance as a determination table, and outputs the final answer.
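The data flow of FIG. 1 can be sketched roughly as follows. This is only an illustrative sketch, not the patented circuit: the names `RecognitionResult`, `recognize_character`, and the callables passed in are hypothetical, and an empty candidate list is assumed to stand for a reject.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RecognitionResult:
    """One recognizer's output for a single character pattern (hypothetical layout)."""
    candidates: List[str]   # ranked candidate characters, best first; empty = reject (assumption)
    top_similarity: float   # similarity value of the first-ranked candidate
    top_gap: float          # similarity difference between the first and second candidates

    @property
    def answer(self) -> Optional[str]:
        # The "answer" is the first-ranked candidate, or None for a reject.
        return self.candidates[0] if self.candidates else None


def recognize_character(
    pattern: bytes,
    preprocess: Callable[[bytes], bytes],
    ank_unit: Callable[[bytes], RecognitionResult],
    kanji_unit: Callable[[bytes], RecognitionResult],
    judge: Callable[[RecognitionResult, RecognitionResult], Optional[str]],
) -> Optional[str]:
    """Mirror FIG. 1: preprocessing circuit 1 feeds units 2 and 3, whose results go to circuit 4."""
    cleaned = preprocess(pattern)   # preprocessing circuit 1 (e.g. noise removal)
    r1 = ank_unit(cleaned)          # ANK recognition unit 2 -> R1
    r2 = kanji_unit(cleaned)        # kanji recognition unit 3 -> R2
    return judge(r1, r2)            # determination circuit 4 -> final answer
```

Passing the stages in as callables keeps the sketch independent of any particular recognizer; both units always see the same preprocessed pattern, as in the description.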
The operation of this configuration is described with reference to FIGS. 2 and 3. Suppose that the characters recorded on a form 10 such as the one shown in FIG. 2 are to be recognized. First, the read field 11a of the form 10, which consists only of ANK characters, is scanned, and a character pattern P is supplied, normally one character at a time, to the ANK recognition unit 2 and the kanji recognition unit 3. As described above, the character pattern P is a photoelectrically converted digital signal and reaches both recognition units 2 and 3 via the preprocessing circuit 1. The recognition results R1 and R2 of the recognition units 2 and 3 are then supplied to the determination circuit 4. In this case, based on format control data prepared in advance, the determination circuit 4 determines and outputs the recognition result R1 of the ANK recognition unit 2 as the answer for the character patterns P in the read field 11a of the form 10. This format control data is normally stored in the memory of the OCR's control device (not shown) and includes, for example, data indicating the content of the row data for each read field of the form 10 to be read.
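As a rough illustration of how format control data might steer the determination circuit, the sketch below maps each read field to a field kind and returns the ANK answer directly for ANK-only fields. The layout of the format control data is not given in the source, so `FieldKind`, `FORMAT_CONTROL`, and `answer_for_field` are all assumptions made for illustration.

```python
from enum import Enum
from typing import Callable, Dict, Optional


class FieldKind(Enum):
    ANK_ONLY = "ank_only"   # field holds only ANK characters (read field 11a)
    MIXED = "mixed"         # field holds ANK characters and kanji (read field 11b)


# Hypothetical format control data: read-field id -> kind of content expected there.
FORMAT_CONTROL: Dict[str, FieldKind] = {
    "11a": FieldKind.ANK_ONLY,
    "11b": FieldKind.MIXED,
}


def answer_for_field(
    field_id: str,
    ank_answer: Optional[str],
    kanji_answer: Optional[str],
    mixed_judge: Callable[[Optional[str], Optional[str]], Optional[str]],
) -> Optional[str]:
    """Pick the final answer for one character according to the field it came from."""
    if FORMAT_CONTROL[field_id] is FieldKind.ANK_ONLY:
        # ANK-only field: the ANK unit's result R1 is output as the answer.
        return ank_answer
    # Mixed field: defer to the collation logic that compares both results.
    return mixed_judge(ank_answer, kanji_answer)
```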
Next, suppose that text consisting of ANK characters, kanji, and the like is read in the read field 11b of the form 10. As before, the character pattern P of each character is supplied to both the ANK recognition unit 2 and the kanji recognition unit 3. The determination circuit 4 obtains the recognition results R1 and R2 from the recognition units 2 and 3. Each recognition result R1, R2 is not merely an answer; it consists of data such as the first- through N-th-ranked candidate characters, the similarity value of the first-ranked candidate, and the similarity difference between the first- and second-ranked candidates. The determination circuit 4 outputs an answer without using the determination table when any of the following conditions is met. If the answers A1 and A2 of the recognition results R1 and R2 are identical, the answer A2 from the kanji recognition unit 3 is output as the final answer. If both recognition results R1 and R2 are rejects, the determination circuit 4 decides reject. If the answers A1 and A2 are not identical and the recognition result R1 is a reject, the answer A2 is taken as the final answer. Likewise, if the answer A1 is not among the first- through N-th-ranked candidate characters of the recognition result R2, the answer A2 is taken as the final answer.
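The table-free rules above can be sketched as a small function. This is a minimal sketch under stated assumptions: the `Result` type and `decide_without_table` are hypothetical names, a reject is modeled as an empty candidate list, and the function returns a `(resolved, answer)` pair where `answer` is `None` for a reject.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Result:
    """Recognition result for one character (hypothetical layout)."""
    candidates: List[str] = field(default_factory=list)  # ranks 1..N; empty = reject (assumption)

    @property
    def answer(self) -> Optional[str]:
        return self.candidates[0] if self.candidates else None


def decide_without_table(r1: Result, r2: Result) -> Tuple[bool, Optional[str]]:
    """Apply the rules that need no determination table.

    Returns (True, answer) when a rule settles the case (answer None = reject),
    or (False, None) when the determination table has to be consulted.
    """
    a1, a2 = r1.answer, r2.answer
    if a1 is None and a2 is None:
        return True, None    # both results are rejects -> reject
    if a1 == a2:
        return True, a2      # identical answers -> output A2
    if a1 is None:
        return True, a2      # R1 rejected -> output A2
    if a1 not in r2.candidates:
        return True, a2      # A1 not among R2's top-N candidates -> output A2
    return False, None       # A1 differs from A2 but is an R2 candidate -> use the table
```

Note that when only the kanji result is a reject, the last rule is applied literally: A1 cannot be among R2's (empty) candidates, so the reject A2 is returned. The source does not discuss this case separately.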
In contrast to these cases, the determination circuit 4 refers to the determination table (recognition statistical data) described above when the recognition units 2 and 3 output recognition results R1 and R2 with different answers A1 and A2 and, moreover, the answer A1 of the recognition result R1 is among the first- through N-th-ranked candidate characters of the recognition result R2 (the result of the kanji recognition unit 3). The determination table is configured, for example, as shown in FIG. 3, and its address is specified by the answer A1 of the ANK recognition unit 2. The reference data at that address consists of a threshold SA, which is compared with the similarity value of the first-ranked candidate obtained by the ANK recognition unit 2, and a threshold SB, which is compared with the similarity difference between the first- and second-ranked candidates obtained by the kanji recognition unit 3. As a concrete example, suppose that the input pattern P is the katakana 「ア」, that the recognition result R1 of the ANK recognition unit 2 is 「ア」, and that the recognition result R2 of the kanji recognition unit 3 is the kanji 「了」.
If 「ア」 is among the first- through N-th-ranked candidate characters of the recognition result R2, the determination circuit 4 refers to the determination table. That is, it reads from the table of FIG. 3 the reference data SA, SB at the address specified by 「ア」, the answer of the recognition result R1. It then compares SA and SB with the similarity value of the recognition result R1 and the similarity difference of the recognition result R2, respectively. If, for example, the similarity value of R1 is 90 and the similarity difference of R2 is 20, these values exceed the thresholds SA, SB (80, 15), so the determination circuit 4 determines and outputs 「ア」, the answer of the recognition result R1 of the ANK recognition unit 2, as the final answer. If the respective values of the recognition results R1 and R2 are smaller than the reference data SA, SB, the determination circuit 4 determines and outputs 「了」, the answer of the recognition result R2 of the kanji recognition unit 3, as the final answer.
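The table lookup in this example can be sketched as follows. The thresholds (80, 15) for 「ア」 come from the example above; everything else is an assumption for illustration, including the names `Thresholds`, `JUDGMENT_TABLE`, and `decide_with_table`, the use of strict inequalities, and the choice to fall back to A2 when only one of the two thresholds is exceeded (the source describes only the both-exceed and both-smaller cases).

```python
from typing import Dict, NamedTuple


class Thresholds(NamedTuple):
    sa: float  # threshold for the similarity value of R1's first-ranked candidate
    sb: float  # threshold for R2's similarity difference between ranks 1 and 2


# Determination table addressed by the ANK answer A1. Only the entry for the
# worked example is filled in; a real table would hold pre-computed statistics.
JUDGMENT_TABLE: Dict[str, Thresholds] = {
    "ア": Thresholds(sa=80, sb=15),
}


def decide_with_table(
    a1: str,
    r1_similarity: float,
    a2: str,
    r2_gap: float,
    table: Dict[str, Thresholds] = JUDGMENT_TABLE,
) -> str:
    """Choose between the ANK answer A1 and the kanji answer A2 using the table."""
    thresholds = table[a1]  # address specified by the answer A1
    # Both values exceed their thresholds -> output the ANK answer A1.
    if r1_similarity > thresholds.sa and r2_gap > thresholds.sb:
        return a1
    # Otherwise output the kanji answer A2 (assumed for the mixed case as well).
    return a2


print(decide_with_table("ア", 90, "了", 20))  # values from the example: 90 > 80 and 20 > 15
```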
In this way, when recognizing the characters of text consisting of ANK characters, kanji, and the like, ANK characters can be recognized with high accuracy by referring not only to the recognition result R2 of the kanji recognition unit 3 but also to the recognition result R1 of the ANK recognition unit 2, which inherently recognizes ANK characters with high accuracy. In this case, based on the recognition result R1, the determination circuit 4 determines and outputs the final answer using the reference data in the determination table. The determination table consists of recognition statistical data set in advance so that the optimal answer can be selected based on the similarity values of the recognition results R1 and R2 of the ANK recognition unit 2 and the kanji recognition unit 3.
As detailed above, according to this invention, even when a kanji OCR recognizes the characters of text consisting of ANK characters, kanji, and the like, the recognition accuracy for ANK characters is raised so that reading is reliable. As a result, the reading accuracy for the entire text recorded on a form or the like can be improved.
[Brief description of the drawings]
FIG. 1 is a block diagram showing the configuration of a character recognition system according to an embodiment of the invention. FIG. 2 shows an example of a form used to explain the operation of FIG. 1. FIG. 3 shows an example of a determination table used to explain the operation of the determination circuit of FIG. 1.
2: ANK recognition unit; 3: kanji recognition unit; 4: determination circuit.
Applicant's agent: Patent attorney Takehiko Suzue
Claims (1)
A character recognition device characterized by comprising: a first recognition unit that recognizes characters from a limited range of character types, such as alphanumeric characters and symbols, to be recognized; a second recognition unit that recognizes characters from a wide range of character types, including the limited character types and kanji; and a recognition collation determination unit that determines and outputs a final answer based on recognition statistical data set in advance for selecting the optimal answer from the recognition results output by the first and second recognition units according to their respective similarity values.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP58032426A JPS59158482A (en) | 1983-02-28 | 1983-02-28 | Character recognizing device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP58032426A JPS59158482A (en) | 1983-02-28 | 1983-02-28 | Character recognizing device |
Publications (1)
Publication Number | Publication Date |
---|---|
JPS59158482A (en) | 1984-09-07 |
Family
ID=12358622
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP58032426A (JPS59158482A, pending) | Character recognizing device | 1983-02-28 | 1983-02-28 |
Country Status (1)
Country | Link |
---|---|
JP (1) | JPS59158482A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS63223890A (en) * | 1987-03-12 | 1988-09-19 | Toshiba Corp | Drawing reader |
JPH01259475A (en) * | 1988-04-11 | 1989-10-17 | Canon Inc | Character recognizing device |
JPH0221383A (en) * | 1988-01-04 | 1990-01-24 | Sumitomo Electric Ind Ltd | Optical character reader |
JPH0981730A (en) * | 1995-09-18 | 1997-03-28 | Canon Inc | Method and device for pattern recognition and computer controller |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US4989258A (en) | Character recognition apparatus | |
JPS63216189A (en) | Character recognition system | |
JPS59158482A (en) | Character recognizing device | |
EP0144006A2 (en) | An improved method of character recognitionand apparatus therefor | |
JPH0743755B2 (en) | Character recognition device | |
JPS6139175A (en) | Optical character reading device | |
JPS6336389A (en) | Character reader | |
JPS6146573A (en) | Character recognizing device | |
JPH0223490A (en) | Character reading system | |
JPS60110089A (en) | Character recognizer | |
JP2746345B2 (en) | Post-processing method for character recognition | |
JPH01311390A (en) | Character substitution control system | |
JPS6095689A (en) | Optical character reader | |
JP2972443B2 (en) | Character recognition device | |
JPS62177686A (en) | Optical character reader | |
JPS6160184A (en) | Optical character reader | |
JPS60160481A (en) | Reader of character | |
JPS6073796A (en) | optical character reader | |
JPH05189604A (en) | Optical character reader | |
JPH0576674B2 (en) | ||
JPS59205681A (en) | Character reader | |
JPS63143685A (en) | Method for displaying recognized result in character recognizing device | |
JPS61109183A (en) | Character recognizer | |
JPS59188783A (en) | Character discriminating and processing system | |
JPH04337892A (en) | System for reading pattern |