JPH03263282A

JPH03263282A - Character segmenting method for character reader

Info

Publication number: JPH03263282A
Application number: JP2063547A
Authority: JP
Inventors: Atsushi Shimoyama; 霜山　篤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1990-03-14
Filing date: 1990-03-14
Publication date: 1991-11-22

Abstract

PURPOSE: To enable character segmentation on ordinary media by reading a slip, setting the positions of a reference character area and a target character area, storing these positions in a memory, and correcting the positions of character areas read later using the stored positions as references. CONSTITUTION: An MPU 4 reads the slip to be used, extracts the character areas, sets the positions of the reference character area and the target character areas, and stores them in a memory 5 as format information. At reading time, the MPU 4 reads the format information out of the memory 5. Next, the slip is read and a character area extraction program 7a is executed to detect the position of the reference character area. A correction value extraction program 7c is then executed to calculate the error from the format reference position and to extract a correction value. The MPU 4 executes a character position correction program 7d and corrects each target character area position of the format according to this correction value. Using the corrected target area positions, the character areas are segmented from the picture data in a picture memory 3.

Description

[Detailed description of the invention]

[Table of contents]
Summary
Industrial application field
Conventional technology (FIG. 9)
Problems to be solved by the invention
Means to solve the problem (FIG. 1)
Operation
Embodiments
(a) Description of the character reading device (FIG. 2)
(b) Description of the format creation process (FIGS. 3 to 6)
(c) Description of the reading process (FIGS. 7 and 8)
(d) Description of other embodiments
Effects of the invention

[Summary]

The invention relates to a character segmentation method for cutting out target character areas on a form, in a character reading device that reads a form and recognizes the target characters on it.

The object is to make character segmentation possible on ordinary media, without using expensive OCR-dedicated paper.

The character reading device has a reading unit that reads a form, an image memory that stores the read form information, and a processing unit that cuts out target character areas from the form information and recognizes the characters in them. When a format is created, the form is read, character areas are extracted, and the position of a reference character area and the positions of the target character areas relative to that reference position are set and stored in a memory. At reading time, the character areas of the read form are extracted, the position of the reference character area is detected, the error between the reference position in the memory and the detected position is found, and the target character area positions in the memory are corrected with a correction value based on that error before the target character areas are cut out.

[Industrial application field]

The present invention relates to a character segmentation method for cutting out target character areas on a form, in a character reading device that reads a form and recognizes the target characters on it.

Optical character reading devices, which read a form and recognize the target characters on it, are widely used as input means for computers and the like. Handwritten character reading devices in particular are convenient because they can input handwritten characters written on forms.

For such a character reading device to recognize the target characters on a form correctly, it must detect the target character areas on the form accurately and cut them out with high precision.

[Conventional technology]

FIG. 9 is an explanatory diagram of the prior art.

In one conventional character segmentation method, shown in FIG. 9(A), the position x, y of the target character area is determined in advance relative to the edge of the form 1. The edge of the form is then detected from the read signal of the form to be read, and the area at that position relative to the detected edge is cut out as the target character area and subjected to character recognition.

In another conventional method, shown in FIG. 9(B), a mark Mk or an L-shaped line mark Lmk is printed on the form 1, and the position of the target character area on the form is determined in advance relative to the mark Mk or the line mark Lmk. The mark Mk or line mark Lmk is then detected from the read signal of the form to be read, and the area at that position relative to the detected mark is cut out as the target character area.

[Problems to be solved by the invention]

However, the prior art has the following problems.

■ The former conventional method of FIG. 9(A) uses the edge of the form 1 as the reference, so the form 1 must be cut with high precision, and an expensive medium costing roughly ten times as much as an ordinary one has to be used.

■ The latter conventional method of FIG. 9(B) requires OCR-dedicated paper on which the line mark Lmk or the mark Mk is printed, which is expensive.

Accordingly, an object of the present invention is to provide a character segmentation method for a character reading device that can cut out characters on ordinary media, without using expensive OCR-dedicated paper.

[Means to solve the problem]

FIG. 1 is a diagram showing the principle of the present invention.

As shown in FIG. 1(A), the present invention applies to a character reading device having a reading unit 2 that reads a form 1, an image memory 3 that stores the read form information, and a processing unit 4 that cuts out target character areas from the form information and recognizes the characters in them.

As shown in FIG. 1(B), when a format is created, the form 1 is read, character areas are extracted, and the position of a reference character area and the positions of the target character areas relative to that reference position are set and stored in a memory 5.

As shown in FIG. 1(C), at reading time, the character areas of the read form are extracted, the position of the reference character area is detected, the error between the reference position in the memory 5 and the detected position is found, and the target character area positions in the memory 5 are corrected with a correction value based on that error before the target character areas are cut out.

[Operation]

The present invention corrects the positions of the target character areas using, as a reference, a character area such as characters already printed on the form 1.

To this end, when a format is created, the form 1 is read, character areas are extracted, and the position of the reference character area and the positions of the target character areas are set and stored in the memory 5 as format information.

At reading time, the character areas of the read form are extracted, the position of the reference character area is detected from the extraction result, the error between the detected position and the format reference position in the memory 5 is found, and the positions of the target character areas are corrected using this error as the correction amount.

On a form 1 such as a slip, the positional relationship between the character areas is fixed by the printing accuracy. The position of printed kanji characters that are already on the form, such as its title, is therefore used as the reference for the correction.

Slips in general circulation that carry no line marks, and ordinary slips that are not cut with high precision, can therefore also be read.

Consequently, not only OCR-dedicated media but also slips that are not OCR-dedicated, such as general slips, can be handled, and low-cost media can be used.

Moreover, because slips already in use can be read as they are, the character reading device is very easy to introduce.

[Embodiments]

(a) Description of the character reading device

FIG. 2 is a block diagram of an optical character reading device to which the present invention is applied.

In the figure, components identical to those shown in FIG. 1 are denoted by the same symbols.

The scanner 2 consists of a CCD line sensor. It reads the form 1 while the form is being fed and converts it into an image signal.

The processing unit 4 consists of a microprocessor (MPU) and executes each program.

5 is a data memory, which stores the reference position (position reference), the positions of the target character areas, the correction values and the like. 6 is a control program memory, which stores the control program of the MPU 4.

7 is a program memory, which stores the programs and other data needed for the character reading process of the MPU 4.

Specifically, it stores a character area extraction program and its work memory 7a, which extract character areas from the image information of the form 1; a position reference setting program 7b (FIG. 3), which sets the position reference on the form; a correction value extraction program 7c (FIGS. 7 and 8), which extracts the correction value at reading time; a character position correction program 7d, which corrects the target character (area) positions using the correction value; and a character recognition program 7e, which cuts out character areas and recognizes the characters.

8 is a display input unit, which has a display section 8a that displays images and characters, a keyboard 8b for entering data and the like, and a mouse 8c for indicating an input position on the display section 8a. 9 is a bus, which connects the MPU 4 with the scanner 2, the memories 3, 5, 6 and 7, and the display input unit 8.

The data memory 5 and the program memory 7 are implemented on a single magnetic disk device.

(b) Description of the format creation process

FIG. 3 is a flowchart of the format creation process in one embodiment of the present invention, FIG. 4 is an explanatory diagram of its skew amount extraction process, FIG. 5 is an explanatory diagram of its character area extraction process, and FIG. 6 is an explanatory diagram of its position setting process.

■ The form to be read is fed into the scanner 2, and the MPU 4, executing the control program, stores the read image data in the image memory 3.

■ Next, the MPU 4 executes the character area extraction program 7a and first determines the skew amount θ of the form 1. This is explained with reference to FIG. 4.

As shown in FIG. 4(A), the skew amount θ is the angle by which the form 1 is tilted as read by the scanner 2.

As shown in FIG. 4(B), the MPU 4 counts the number of black dots on each line of the image data in the image memory 3 and creates a histogram of the counts.

As shown in FIG. 4(C), this scan is repeated at different scanning angles to obtain histograms h1 to h3 for the respective angles.

Next, the MPU 4 calculates the amount of change from white to black on each histogram, compares the results to find the histogram with the steepest change, and takes the scanning angle used for that histogram as the skew amount θ.
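The skew search described above can be sketched roughly as follows. This is an illustration only, not the patented implementation: the image is assumed to be a list of rows of 0/1 pixels (1 = black), the names `line_histogram`, `steepest_rise` and `estimate_skew` are hypothetical, and "steepest white-to-black change" is interpreted here as the largest increase between neighbouring scan lines, which the source does not specify.

```python
import math


def line_histogram(image, angle_deg):
    """Count black pixels along scan lines tilted by angle_deg degrees."""
    slope = math.tan(math.radians(angle_deg))
    counts = {}
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if pixel:
                # Index of the tilted scan line passing through (x, y).
                line = round(y - x * slope)
                counts[line] = counts.get(line, 0) + 1
    if not counts:
        return []
    first, last = min(counts), max(counts)
    return [counts.get(i, 0) for i in range(first, last + 1)]


def steepest_rise(histogram):
    """Largest white-to-black jump between neighbouring scan lines (assumed metric)."""
    padded = [0] + histogram
    return max((b - a for a, b in zip(padded, padded[1:])), default=0)


def estimate_skew(image, candidate_angles):
    """Return the candidate angle whose histogram shows the sharpest rise."""
    return max(candidate_angles,
               key=lambda angle: steepest_rise(line_histogram(image, angle)))
```

When the scan lines are parallel to the printed text, black pixels pile up on a few lines and the histogram edges become sharp; at other angles the counts smear across many lines.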

■ Next, the MPU 4 continues the character area extraction program 7a and extracts the character areas. This is explained with reference to FIG. 5.

The MPU 4 scans the image memory 3 at the scanning angle given by the skew amount and projects the black dots horizontally to obtain a projection image Ph.

As shown in FIG. 5, character areas project as black and everything else as white.

The MPU 4 therefore scans the projection image Ph from the top, treats any white portion at least as long as a specified length as a gap between characters, and extracts character areas such as 10, 11 and 12.

Next, each of the horizontally extracted character areas 10, 11 and 12 is scanned using the skew amount to obtain the black projection images Pv1, Pv2 and Pv3 shown in FIG. 5.

These black projection images Pv1, Pv2 and Pv3 are scanned from left to right in the figure. White portions at least as long as a specified length are likewise treated as lying outside the character area, and the character areas in the vertical direction are determined.

In this way the character areas 10, 11 and 12 are extracted, as shown hatched in FIG. 5.
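As a rough sketch of this two-pass projection, the following assumes an already deskewed image given as a list of rows of 0/1 pixels (the patent instead scans along the skew angle). The function names and the gap parameters `row_gap` and `col_gap` are hypothetical stand-ins for the "specified length".

```python
def split_runs(profile, min_gap):
    """Split a projection profile into (start, end) runs, end exclusive.

    White stretches shorter than min_gap stay inside a run; stretches of
    at least min_gap separate two runs.
    """
    runs = []
    start = None
    gap = 0
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                runs.append((start, i - gap + 1))
                start = None
                gap = 0
    if start is not None:
        runs.append((start, len(profile) - gap))
    return runs


def row_profile(image):
    """Black-pixel count of every row (horizontal projection)."""
    return [sum(row) for row in image]


def column_profile(image, top, bottom):
    """Black-pixel count of every column within rows top..bottom-1."""
    width = len(image[0]) if image else 0
    return [sum(image[y][x] for y in range(top, bottom)) for x in range(width)]


def extract_areas(image, row_gap, col_gap):
    """Return character areas as (left, top, right, bottom), ends exclusive."""
    areas = []
    for top, bottom in split_runs(row_profile(image), row_gap):
        for left, right in split_runs(column_profile(image, top, bottom), col_gap):
            areas.append((left, top, right, bottom))
    return areas
```

Calling the same routine again inside one area with smaller gap values would correspond to separating character lines and then individual characters.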

■ Next, using the same method as in FIG. 5, the MPU 4 projects each of the character areas 10, 11 and 12 horizontally using the skew amount and, with a smaller specified length, treats white portions at least that long as gaps between character lines, thereby extracting the character lines.

Further, for each character line, the MPU 4 projects the line vertically by the same method as shown in FIG. 5 and cuts out the characters within the line.

The coordinates of the four corners of the cut-out rectangle of each character area extracted in this way are stored in the work memory 7a.

■ Next, the MPU 4 executes the position reference setting program 7b and sets the reference position and related values. This is explained with reference to FIG. 6.

On the display section 8a, the MPU 4 overlays the cut-out rectangles of the lines or in-line character areas held in the work memory 7a, drawn as rectangular frames, on the input form image from the image memory 3.

For example, if the input form 1 has data fields to be read below a "title", as in FIG. 6, the displayed image consists of the title, the contents of the data fields to be read, and the rectangular frames surrounding them.

These rectangular frames are created from the character area coordinates extracted from the contents of the image memory 3 in the preceding extraction steps.

If the character frames of the data fields to be read on the input form are printed in a dropout color, the frames must be drawn on the form beforehand in a color that does not drop out, as in FIG. 6, so that the character frame positions are captured in the image.

This displayed image shows the relationship between the contents of the input form and the areas cut out by the extraction process.

Next, the operator uses the mouse 8c to indicate the reference character rectangular frame on the screen of the display section 8a.

For this reference character rectangular frame, the operator must choose characters, such as kanji title characters, whose position does not change relative to the character entry frames in which the data to be read is written.

On slips and similar forms, title characters such as "Delivery Note" are printed. Because this character area is printed, its position is constant, and it can be extracted at a fixed position relative to the entry frames of the data to be read.

Once this reference frame has been entered, the MPU 4 recognizes the characters inside it using the character recognition program 7e and displays the recognition result as the answer below the reference frame on the display section 8a.

The operator visually compares this answer with the characters inside the reference frame on the display section 8a. If the answer is correct, the operator confirms it with the mouse 8c; if it is wrong, the operator enters the correct answer from the keyboard 8b.

The MPU 4 then stores the upper-left and lower-right coordinates of the selected reference character rectangular frame as the reference position, together with the character codes inside the reference frame, in the work memory of the data memory 5.

■ Next, on the same displayed image on the display section 8a, the operator uses the mouse 8c to indicate the upper-left and lower-right coordinates of each target character area as the target character area position.

The MPU 4 converts these coordinates into positions relative to the reference position and uses them as the target character area position.

The operator also enters the character type and number of characters of each target character area from the keyboard 8b.

After these inputs, the MPU 4 converts the reference position data and the target character area position data into the values they would have with the skew amount θ equal to 0, and creates the format information. The reference position and its character codes are stored in the position reference memory of the data memory 5, and each target character area position with its character type and number of characters is stored in the target character area position memory of the data memory 5.
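One possible in-memory shape for this format information is sketched below. The class and field names (`Rect`, `TargetField`, `FormatInfo`, `build_format`) are hypothetical, and the conversion to skew 0 described in the text is assumed to have been applied to the coordinates already; the patent does not specify a storage layout.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    right: int
    bottom: int


@dataclass
class TargetField:
    offset: Rect      # position relative to the reference frame's upper-left corner
    char_type: str    # e.g. "numeric"; the set of types is an assumption
    char_count: int


@dataclass
class FormatInfo:
    reference: Rect       # reference character frame in absolute coordinates
    reference_text: str   # recognized (and operator-confirmed) reference characters
    targets: list = field(default_factory=list)


def to_relative(area, reference):
    """Express an absolute rectangle relative to the reference upper-left corner."""
    return Rect(area.left - reference.left, area.top - reference.top,
                area.right - reference.left, area.bottom - reference.top)


def build_format(reference, reference_text, fields):
    """fields: iterable of (absolute Rect, char_type, char_count) tuples."""
    info = FormatInfo(reference, reference_text)
    for area, char_type, char_count in fields:
        info.targets.append(
            TargetField(to_relative(area, reference), char_type, char_count))
    return info
```

Storing offsets rather than absolute positions means that, at reading time, only the reference frame has to be located again.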

In this way, the form to be used is fed in, one printed character area of the form is taken as the reference position, and the position of each target character area relative to that reference position is recorded as the format information.

The operator is involved in creating the format information so that the reference position can be chosen freely to suit the form, and so that items other than target character areas, such as the company name on a slip, can be excluded from the target character areas.

(c) Description of the reading process

FIG. 7 is a flowchart of the reading process in one embodiment of the present invention. FIG. 8 is a flowchart of the reference position correction value extraction process of FIG. 7.

■ Before reading forms, the MPU 4 reads the format information from the data memory 5.

■ As explained with FIG. 8, the form is read, the character area extraction program 7a is executed to detect the position of the reference character area, and the correction value extraction program 7c is executed to find the error from the format reference position and extract the correction value.

The MPU 4 then executes the character position correction program 7d and corrects each target character area position of the format with this correction value.
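In its simplest form, the correction amounts to a translation, as in the following sketch. It assumes positions are plain tuples in absolute image coordinates and ignores the skew term that the patent adds to the correction value; `correction_value` and `correct_area` are hypothetical names.

```python
def correction_value(format_reference, detected_reference):
    """Error between the stored and the detected reference position (x, y)."""
    return (detected_reference[0] - format_reference[0],
            detected_reference[1] - format_reference[1])


def correct_area(area, correction):
    """Shift a (left, top, right, bottom) area by the correction value."""
    dx, dy = correction
    left, top, right, bottom = area
    return (left + dx, top + dy, right + dx, bottom + dy)
```

For example, if the reference was stored at (100, 50) and is detected at (104, 47), every target area is shifted 4 pixels right and 3 pixels up.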

■ The MPU 4 then uses the corrected target character area positions to determine where the characters to be read are located, and cuts out the character areas from the image information (image data) in the image memory 3.
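The cut-out itself can be pictured as a rectangular crop, sketched here under the assumption that the image is a list of pixel rows and the area is (left, top, right, bottom) with exclusive right and bottom edges. Clamping to the image bounds is an added safeguard, not something the source describes, and `crop` is a hypothetical name.

```python
def crop(image, area):
    """Return the sub-image covered by area, clamped to the image bounds."""
    left, top, right, bottom = area
    height = len(image)
    width = len(image[0]) if image else 0
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    return [row[left:right] for row in image[top:bottom]]
```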

The MPU 4 executes the character recognition program 7e and recognizes the characters in the cut-out character area according to the character type and number of characters given in the format.

The MPU 4 checks whether all characters have been recognized and, if not, returns to the start of the cut-out step.

When the MPU 4 determines that all characters have been processed, it checks whether there is a next form page. If there is, it returns to the form reading step; if not, processing ends.

Next, the reference position correction value extraction process is explained with reference to FIG. 8.

■ The scanner 2 reads the form 1, and the image data is stored in the image memory 3.

Next, the MPU 4 extracts the skew amount θ by the same method as the skew extraction step of FIG. 3.

The MPU 4 then extracts the character areas by the same method as the character area extraction step of FIG. 3, and extracts the character lines within the character areas using the skew amount by the same method as the character line extraction step of FIG. 3.

Next, the MPU 4 finds the difference between the upper-left coordinates of each extracted character line and the upper-left coordinates of the format reference position, and selects the character line with the smallest difference as the line closest to the reference position.

■ The MPU 4 cuts out the characters in the selected character line by the same method as the in-line character cut-out of FIG. 3.

The MPU 4 then recognizes the cut-out in-line characters by the same method as the character recognition step of FIG. 7.

■ The MPU 4 compares the character recognition result (the answer) with the character codes of the format reference position.

■ If the comparison shows a match, the MPU 4 calculates the deviation between the detected coordinates obtained by the in-line character cut-out and the format reference position coordinates.

The MPU 4 then adds the skew amount correction to the deviation and outputs the result as the reference position correction value.

■ If the comparison shows a mismatch, the MPU 4 selects the character line that is next closest after the one previously selected.

The MPU 4 then finds the deviation between the detected coordinate (Y) of that character line and the format reference position coordinate (Y), and compares this deviation with a predetermined limit value.

If the deviation is smaller than the limit value, processing returns to the in-line character cut-out step; if it is larger, the form is marked as rejected and processing ends.
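The search loop of FIG. 8 might be sketched as follows. Several details are assumptions made for illustration: character lines are given by their upper-left (x, y) coordinates, "closest" is measured as the sum of absolute x and y differences, `recognize` is a caller-supplied function standing in for the character recognition program, a deviation equal to the limit is treated as a rejection, and the skew term of the correction value is left out.

```python
def find_reference_correction(lines, format_position, format_text, recognize, limit):
    """Return the (dx, dy) correction, or None if the form is rejected.

    lines: upper-left (x, y) of each extracted character line.
    recognize: callable taking a line index and returning its recognized text.
    """
    fx, fy = format_position
    # Try candidate lines from nearest to farthest (assumed distance metric).
    order = sorted(range(len(lines)),
                   key=lambda i: abs(lines[i][0] - fx) + abs(lines[i][1] - fy))
    for rank, index in enumerate(order):
        x, y = lines[index]
        # From the second candidate on, check the Y deviation against the limit.
        if rank > 0 and abs(y - fy) >= limit:
            return None
        if recognize(index) == format_text:
            return (x - fx, y - fy)
    return None  # no candidate matched the stored reference characters
```

Comparing the recognized text with the stored character codes keeps a nearby but unrelated line from being mistaken for the reference.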

In this way, the reference character area of the form to be read is detected, its deviation from the format reference position is found, a correction value for the displacement of the form is calculated, the target character area positions of the format are corrected, and the characters are cut out accordingly.

As a result, ordinary slips can be used, and the target character areas can be cut out regardless of any displacement of the slip, with the position of printed characters such as the slip title as the reference.

Also, in this embodiment the same algorithm used for position extraction at format creation is used to detect the position of the reference character area on the form, so position detection is performed with the same precision.

Furthermore, the characters inside the detected reference character area of the form are recognized and compared with the character codes of the format, which makes the detection more reliable.

(d) Description of other embodiments

The present invention also allows the following modifications.

■ At format creation, the reference character areas are extracted, overlaid on the image and selected. Alternatively, only the form image may be displayed and the operator may indicate the reference character area position with a mouse or similar device.

Conversely, the position obtained by extracting the reference character area may be adopted directly as the reference character area position without any instruction from the operator.

■ In FIG. 3 the operator enters the target character area positions, but the coordinates obtained by the cut-out may be used as the target character area positions instead.

■ In FIG. 8 the characters in the area taken as the reference area are recognized and compared with the character codes of the format, but this may be omitted.

The present invention has been described above by way of embodiments, but various modifications are possible within the gist of the invention, and these are not excluded from the present invention.

[Effects of the invention]

As explained above, the present invention provides the following effects.

■ Because characters are cut out after the position of each target character area is corrected with reference to the position of one character area on the form, generally circulated slips that are not OCR-dedicated can be used as reading media. Low-cost media can therefore be used, and the character reading device is easy to introduce.

■ A format can be created simply by having the device read the slip or other form to be read, so users can easily create formats for media of various layouts.

[Brief explanation of drawings]

FIG. 1 is a diagram showing the principle of the present invention.
FIG. 2 is a block diagram of an optical character reading device to which the present invention is applied.
FIG. 3 is a flowchart of the format creation process in one embodiment of the present invention.
FIG. 4 is an explanatory diagram of the skew amount extraction process in FIG. 3.
FIG. 5 is an explanatory diagram of the character area extraction process in FIG. 3.
FIG. 6 is an explanatory diagram of the position setting process in FIG. 3.
FIG. 7 is a flowchart of the reading process in one embodiment of the present invention.
FIG. 8 is a flowchart of the reference position correction value extraction process of FIG. 7.
FIG. 9 is an explanatory diagram of the prior art.

In the figures: 1, form; 2, reading unit (scanner); 3, image memory; 4, processing unit; 5, data memory.

Claims

[Scope of Claims] A character segmentation method for a character reading device having a reading unit (2) that reads a form (1), an image memory (3) that stores the read form information, and a processing unit (4) that cuts out a target character area from the form information and recognizes the characters in the cut-out target character area, characterized in that: when a format is created, the form (1) is read, character areas are extracted, and the position of a reference character area and the position of the target character area relative to that reference position are set and stored in a memory (5); and at reading time, the character areas of the read form (1) are extracted, the position of the reference character area is detected, the error between the reference position in the memory (5) and the detected position is found, and the target character area position in the memory (5) is corrected with a correction value based on the error to cut out the target character area.