JPH01191279A

JPH01191279A - Kanji recognition method

Info

Publication number: JPH01191279A
Application number: JP63014421A
Authority: JP
Inventors: Kiyoya Shima; 島　清哉; Hidetoshi Saito; 斉藤　秀俊; Kiyoshi Ishikawa; 澄石川; Yoshihiro Mizuniwa; 水庭　佳弘; Takeo Maeda; 前田　武男; Masateru Sakata; 坂田　正輝
Original assignee: Hitachi Engineering Co Ltd; Hitachi Ltd
Current assignee: Hitachi Ltd; Hitachi Industry and Control Solutions Co Ltd
Priority date: 1988-01-27
Filing date: 1988-01-27
Publication date: 1989-08-01

Abstract

PURPOSE: To recognize kanji codes at high speed with a short program by encoding each kanji and recognizing the code directly. CONSTITUTION: The four corner points of the code are denoted c, d, e, and f. Pixel rows 1, 2, 23, 24 and columns 1, 2, 23, 24 form the frame and are black. Rows 3 to 7 carry the 4 bits of information (0100) representing the numeral 2: columns 3 to 7 are white because the least significant bit is 0, columns 8 to 12 are black because the bit is 1, and columns 13 to 17 and 18 to 22 are white because the bits are 0. Likewise, rows 8 to 12 represent the numeral 0, rows 13 to 17 represent 3, and rows 18 to 22 represent 3, so that the JIS code 2033 of the character 漢 (kan) is represented. The code is then input to a computer by an image scanner; the coordinates of the four corners of the frame are detected, the position of each information bit is calculated from those values, each bit is judged to be 1 or 0 according to whether black or white points predominate at that position, and the number representing one kanji is read.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a kanji recognition method for recognizing encoded kanji.

[Conventional technology]

In the use of computers, input accounts for a very large share of the total time, and various techniques have long been developed to shorten input time.

One such technique is to input a printed code into a computer with a reading device. Barcodes and Karura codes (カルラコード) are examples.

A barcode is a code made up of thin and thick lines, and is suited to reading a small number of alphanumeric characters with a scanner that scans along a line.

The Karura code, as described in the evening edition of the Asahi Shimbun of January 16, 1987 (Showa 62), represents a 4-bit signal by making each of the four squares inside the 田-shaped figure (a square divided into four) shown in FIG. 8 either white or black.

To represent a kanji with Karura codes, it suffices to use four of the Karura code's minimum constituent units (referred to as information elements) to represent a four-digit number. A so-called composite double Karura code, in which the Karura code information elements are arranged in two rows and two columns, has also been proposed previously.

[Problem to be solved by the invention]

In a system that reads a code printed by a printer with an image scanner and recognizes it, one kanji character is represented by 24 × 24 dots of pixels. If the kanji code, that is, the four-digit number corresponding to one kanji, is likewise represented in 24 × 24 dots, developing the printer software for printing becomes easy. In the composite double Karura code described above, however, the frames and the blank spaces between information elements are wasted, which makes it difficult to represent one kanji in 24 × 24 dots.

The procedure for recognizing a Karura code is as follows.

1) Cutting out the information element
2) Determining the positions of the information bits
3) Black/white determination

Of these steps, cutting out the information element requires that the information element be enclosed by a frame. The positions of the information bits are then determined from the positions of the four corners of that frame. A composite Karura code must hold 4 × 4 bits of information inside the frame, but there is no need to place further frames or blank spaces inside it.

An object of the present invention is to provide a new kanji recognition method that encodes kanji for recognition without requiring standard patterns or pattern matching, that allows high-speed recognition with a short program even with a relatively inexpensive image scanner having a resolution of about 8 dots/mm, and that simplifies software development for printing with a printer.

[Means to solve the problem]

To achieve the above object, the kanji recognition method according to the present invention is characterized in that one kanji character is represented by a four-digit number, each digit is represented by 4 bits for a total of 16 bits, this is encoded and printed as 24 dot × 24 dot pixels, and the printed code is read by an image scanner having the same resolution as the printer.

[Operation]

The operation of the present invention will now be explained with reference to the drawings.

Codes are normally printed by a printer attached to the same apparatus as the image scanner, and a printer and image scanner attached to the same apparatus generally have the same resolution. This is so that an image read by the image scanner can be printed by the printer as is, at the same size.

Under these conditions, consider the minimum number of pixels that must be allotted to an information bit and to the width of a line.

First, suppose a line is one dot wide, as shown in FIG. 8(a). An image sensor that detects an area the same size as a pixel makes the black/white decision, but the pixel positions and the sensor positions almost never coincide exactly.

In the worst case, the intersection of the sensor boundaries can fall at the center of a pixel. Since the sensor is adjusted to judge white when half or more of its area is white, the detected line width then varies from 0 to 2 dots depending on the characteristics of the sensor, and the line may be broken.

By contrast, if the line width is 2 dots, as shown in FIG. 8(b), the detected line width is 1 to 3 dots and the line is never broken.
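The 0-to-2 dot and 1-to-3 dot ranges can be reproduced with a small one-dimensional model. The sketch below is only an illustration under stated assumptions: the function name `detected_width`, the `black_threshold` parameter, and the idea of modelling sensor variation as a slightly lower or higher threshold are hypothetical and do not come from the patent.

```python
def detected_width(line_width, offset, black_threshold=0.5):
    """Count unit-wide sensor cells that read black for a printed line.

    The line occupies [offset, offset + line_width) on an axis where sensor
    cell k covers [k, k + 1). A cell reads black when the covered fraction
    exceeds black_threshold (hypothetical model of the sensor adjustment).
    """
    start, end = offset, offset + line_width
    count = 0
    for cell in range(-1, line_width + 2):
        overlap = max(0.0, min(end, cell + 1) - max(start, cell))
        if overlap > black_threshold:
            count += 1
    return count


# Worst case: the line is shifted by half a dot relative to the sensor grid.
for width in (1, 2):
    strict = detected_width(width, 0.5, black_threshold=0.5)
    lenient = detected_width(width, 0.5, black_threshold=0.49)
    print(width, strict, lenient)
```

With a half-dot shift, a 1-dot line is seen as 0 or 2 dots depending on the threshold, while a 2-dot line is seen as 1 or 3 dots, so it never disappears.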

With a frame line width of 2 dots, the remaining pixels form a 20 × 20 dot area (24 − 2 × 2 = 20 in each direction). Dividing this among 4 × 4 information bits gives 5 × 5 dots per information bit.

As shown in FIG. 9, even under the worst-case condition in which the intersection of the sensor boundaries falls at the center of a pixel, 16 pixels of a 5 × 5 dot information bit are judged black. This is a majority of the 25 pixels, so the bit can be determined.
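The figure of 16 can be checked by counting the sensor cells that lie entirely inside a 5 × 5 dot information bit when the sensor grid is shifted by half a dot in both directions. This is a minimal sketch; the function name `fully_covered_samples` and its parameters are hypothetical.

```python
def fully_covered_samples(cell_size=5, shift=0.5):
    """Count unit sensor cells lying entirely inside a cell_size x cell_size bit.

    The information bit covers [0, cell_size] on both axes; sensor cell
    (sx, sy) covers [sx + shift, sx + shift + 1] on x and likewise on y.
    """
    count = 0
    for sy in range(-1, cell_size + 1):
        for sx in range(-1, cell_size + 1):
            inside_x = 0 <= sx + shift and sx + shift + 1 <= cell_size
            inside_y = 0 <= sy + shift and sy + shift + 1 <= cell_size
            if inside_x and inside_y:
                count += 1
    return count


worst = fully_covered_samples(5, 0.5)    # half-dot shift in x and y
aligned = fully_covered_samples(5, 0.0)  # sensor grid aligned with pixels
print(worst, aligned, worst > 25 / 2)
```

Only the fully covered cells are guaranteed to read black; the partially covered ones along the edges are the unstable points.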

In a system consisting of a printer and an image scanner with a resolution of about 8 dots/mm, a kanji is composed of 24 × 24 dots.

A kanji code can be expressed as a four-digit number, and the information bits form 4 rows and 4 columns, so in the present method one digit is assigned to each row of information bits. For example, the kanji 漢 (kan) has the JIS code 2033. The information bits are read from the upper left toward the right, then downward one row at a time, so to match the reading order the least significant bit is assigned to the left end, and the code is written 0100, 0000, 1100, 1100. A black dot represents 1 and a white dot represents 0.
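The mapping from a four-digit code to four rows of bits, least significant bit first, can be sketched as follows. The function name `code_to_bit_rows` is hypothetical, and the code is treated as a plain four-digit decimal number as in the text.

```python
def code_to_bit_rows(code):
    """Return four rows of four bits for a 4-digit decimal kanji code.

    Row 0 holds the leading (thousands) digit; within a row the least
    significant bit comes first, matching left-to-right reading order.
    """
    if not 0 <= code <= 9999:
        raise ValueError("code must have at most four decimal digits")
    digits = [(code // 10 ** p) % 10 for p in (3, 2, 1, 0)]
    return [[(d >> k) & 1 for k in range(4)] for d in digits]


for row in code_to_bit_rows(2033):
    print("".join(str(bit) for bit in row))
```

For 2033 the rows come out as 0100, 0000, 1100, 1100, in agreement with the example in the text.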

FIG. 1 shows an example in which the kanji 漢 is represented as a code by this method.

In FIG. 1, the four corner points of the code are denoted c, d, e, and f. Pixel rows 1, 2, 23, 24 and columns 1, 2, 23, 24 form the frame and are black.

Rows 3 to 7 carry the 4 bits of information (0100) representing the digit 2. Within them, columns 3 to 7 are white because the least significant bit is 0, columns 8 to 12 are black because the bit is 1, and columns 13 to 17 and 18 to 22 are white because the bits are 0.

Likewise, rows 8 to 12 represent the digit 0, rows 13 to 17 represent 3, and rows 18 to 22 represent 3, so that the JIS code 2033 of the character 漢 is represented.

The code is then input to a computer with an image scanner. The coordinates of the four corners of the frame are detected, the position of each information bit is calculated from those values, each bit is judged to be 1 or 0 according to whether black or white points predominate at the corresponding position, and the number representing one kanji is read.

[Example]

The present invention is described below on the basis of the embodiment shown in FIGS. 1 to 6. FIG. 2 is a block diagram showing the overall configuration of a kanji recognition device used in the method of the present invention.

In FIG. 2, to print a kanji code, the control unit 102 executes the program stored in the program section 103-1 of the memory 103, reads out the kanji code stored in the code section 103-2, generates the pattern corresponding to that kanji code in the image memory 101, and sends the pattern through the communication unit 104 to the printer 105, which prints it.

To recognize a kanji code, the code recorded on paper is detected by the image scanner 100, transferred through the communication unit 104 to the image memory 101, and stored there. The control unit 102 then recognizes the code by executing the program stored in the program section 103-1, and stores the result in the code section 103-2.

The procedure for printing a code with the kanji recognition device of FIG. 2 is described in more detail below. In the conventional method of printing a kanji, the kanji code is read from the code section of the memory, the address in a kanji ROM holding the kanji pattern is derived from the kanji code, and the pattern stored in the kanji ROM is read out and transferred to the printer 105 for printing. When a code is printed with the kanji recognition device of FIG. 2, by contrast, the pattern is determined by the kanji code itself, so nothing equivalent to a kanji ROM is needed.

FIG. 3 shows a flowchart for generating the pattern corresponding to a kanji code directly from the kanji code.

In FIG. 3, to generate the pattern for a kanji code, a 24 × 24 dot area is first set in the image memory 101 and all of its values are set to 0.

Next, rows 1, 2, 23, 24 and columns 1, 2, 23, 24 are all set to 1 to form the frame, and the kanji code to be printed, a four-digit number, is read from the code section 103-2 of the memory.

Then the value of the fourth (thousands) digit of the kanji code is obtained (2 for the code 2033 of the character 漢; it can be obtained by a well-known routine such as dividing 2033 by 1000 and discarding the fractional part), and this value is converted into a four-digit binary number (0010 for 2).

Next, the first (lowest) digit of the four-digit binary number is checked for 0 or 1; if it is 1, the pixels in rows 3 to 7 and columns 3 to 7 are set to 1.

Then the second to fourth binary digits (call each the (m+1)-th digit) are checked for 0 or 1; if a digit is 1, columns 5m+3 to 5m+7 of rows 3 to 7 are set to 1, which places the second digit in columns 8 to 12, the third in columns 13 to 17, and the fourth in columns 18 to 22.

After that, the same operation is performed for the third through first digits of the kanji code, moving down by 5 rows each time.

By the above operations a 24 × 24 bit pattern can be generated without anything equivalent to a kanji ROM. Printing it thereafter in the conventional manner, with 0 as a white dot and 1 as a black dot, produces the required code.
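The pattern generation steps can be sketched as follows, assuming a 2-dot frame and 5 × 5 dot information bits as described in the operation section. The function name `make_pattern` is hypothetical, and the grid uses 0-based indices, so the patent's row 3 is index 2.

```python
def make_pattern(code):
    """Build the 24 x 24 pattern (1 = black, 0 = white) for a 4-digit code."""
    grid = [[0] * 24 for _ in range(24)]

    # Frame: rows/columns 1, 2, 23, 24 in the patent's 1-based numbering.
    for i in range(24):
        for j in (0, 1, 22, 23):
            grid[j][i] = 1
            grid[i][j] = 1

    # Leading digit first, obtained by integer division as in the text.
    digits = [(code // 10 ** p) % 10 for p in (3, 2, 1, 0)]
    for h, digit in enumerate(digits):      # h: digit row, top to bottom
        for m in range(4):                  # m: bit index, LSB at the left
            if (digit >> m) & 1:
                for y in range(2 + 5 * h, 7 + 5 * h):
                    for x in range(2 + 5 * m, 7 + 5 * m):
                        grid[y][x] = 1
    return grid


for row in make_pattern(2033):
    print("".join("#" if v else "." for v in row))
```

No lookup table of glyph shapes is involved; the pattern follows from the code value alone, which is the point made about the kanji ROM.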

Next, the recognition of kanji codes is described in detail.

To recognize kanji codes, the information elements must first be detected in order before the code within each information element is recognized, and for this the way the information elements are arranged is important. When a kanji code is read with an image scanner, it is normally read at a slight tilt, so the permissible tilt must be decided in advance, and before recognition of the kanji codes begins it must be checked that the tilt is within the permitted value. Furthermore, moving to the next row is easy if the number of information elements per row is known.

To satisfy all of these points and to make the first information element easy to detect, it is effective to draw a straight line above the array of information elements. This arrangement is shown in FIG. 4.

In FIG. 4, the straight line serves as the start signal of the code sequence and a space serves as its end signal. Also in FIG. 4, the position of a detected pixel is expressed with the upper-left corner of the image scanner's reading range as the origin, the X coordinate increasing to the right and the Y coordinate increasing downward.

In recognizing kanji codes, the upper-left end point a and the upper-right end point b of the straight line must be found first. If the line is about 6 to 12 dots wide, the detected width is 5 dots or more. The image is scanned from the upper-left end to the right end, then from the right end to the left end of the next line, and so on; when the first black point is found, it is near point a if the original is raised on the left and near point b if it is raised on the right. If there are three or more black points in the 5 dots below that black point, the position is moved one column to the left; if there are again three or more black points among the 5 dots there, the 5 dots below the uppermost black point are moved one more column to the left, and the same operation is repeated. The column just before the position at which the count of black points falls below three gives the X coordinate Xa of point a. The uppermost row in which the 5 dots to the right of that X coordinate contain three or more black points gives the Y coordinate Ya of point a.

Performing the same processing to the right from the first black point yields the coordinates Xb, Yb of point b.

The tilt is obtained from the coordinates of points a and b, and whether it is within the permitted value Dt is checked by the following expression.

A Dt of about 0.1 is sufficient to cope with placing an original on the image scanner. With Dt set to 0.1, the deviation of an adjacent information element, 24 dots plus the space away, from its normal position is within 3 dots.
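The tilt check can be sketched as follows. The expression referred to in the text is not reproduced in this copy, so the sketch assumes that the tilt is the ratio of the vertical to the horizontal offset between points a and b; the function names `tilt` and `tilt_within_limit` are hypothetical.

```python
def tilt(a, b):
    """Tilt of the start line as |Yb - Ya| / |Xb - Xa| (assumed definition)."""
    (xa, ya), (xb, yb) = a, b
    if xb == xa:
        raise ValueError("points a and b must differ in X")
    return abs(yb - ya) / abs(xb - xa)


def tilt_within_limit(a, b, dt=0.1):
    """True when the tilt does not exceed the permitted value Dt."""
    return tilt(a, b) <= dt


a, b = (10, 20), (250, 32)          # hypothetical detected end points
print(tilt(a, b), tilt_within_limit(a, b))
```

Under this definition a tilt of 0.1 displaces a point 24 dots away by 2.4 dots, which is consistent with the 3-dot bound stated above once the space between elements is included.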

Since the pitch of one information element is known, the number of information elements per row can be found from the length of the straight line.

Recognition then proceeds in order from the first information element, whose position is calculated from the coordinates of point a.

Next the coordinates of the four corners of the information element are found, but it must be taken into account that detection becomes unstable when the sensor position is offset. FIG. 5 shows the result of examining how each point is detected when the code of FIG. 1 is read under the worst-case condition in which the sensor is offset by 0.5 dot toward the lower right. The points specially marked in FIG. 5 are unstable points, which may be judged either white or black. Because unstable points appear in the outer frame in this way, it is important how the coordinates of the upper-left corner of the information code are determined under this condition. Of the four points in columns 2 to 5 of row 1, three or more will be black once the pixel lies above the center of the sensor. Thus the row in which three or more of the next 5 points to the right are black is taken as the Y coordinate, and the column in which three or more of the next 5 points downward are black is taken as the X coordinate.

As a concrete example of the method, the following describes how the upper-left corner of the black area (point c) is found from the detection data shown in FIG. 6. The example in FIG. 6 assumes that the sensor position is offset and that the image is also tilted slightly upward to the left.

In FIG. 6, the approximate position of the first information element is known, because its distance from point a of the straight line shown in FIG. 4 is known. Allowing for error, a point judged to lie a few dots to the left of and above point c is chosen as the starting point for detecting point c. From this point, first the coordinates are found at which at least one of the 5 points to the right and at least one of the 5 points downward are black; then the point is found at which three or more of the 5 points to the right and three or more of the 5 points downward are black.

Suppose detection of point c starts from the point P(1, 1) in row 1, column 1 of FIG. 6. P is 0 if the point is detected as white and 1 if it is detected as black.

First, summing from P(1, 1) rightward to P(5, 1) gives 0, so 1 is added to the Y coordinate, moving to P(1, 2).

Next, summing from P(1, 2) downward to P(1, 6) gives 0, so 1 is added to the X coordinate, moving to P(2, 2). The X and Y checks are repeated alternately in the same way until P(3, 3) is reached.

Summing rightward to P(7, 3) gives 3. Staying at P(3, 3), summing downward to P(3, 7) gives 0, so the position moves to P(4, 3), where summing downward to P(4, 7) gives 2. This establishes the condition in which there is at least one black point both to the right and downward.

Next, summing rightward from P(4, 3) to P(7, 3) gives 3 or more, so the position stays at P(4, 3). Summing downward to P(4, 7) gives 2, which is less than 3, so the position moves to P(5, 3). Summing downward to P(5, 7) gives 5, which is 3 or more, so the coordinates of point c are determined to be (5, 3).

Once the position of point c is determined, the positions of the lower-left corner point d, the upper-right corner point e, and the lower-right corner point f of this information element can be found by the same technique.

Let the coordinates of these points be Xc, Yc, ..., Xf, Yf. The position of each information bit must be found from these coordinates; the coordinates Xgh, Ygh of the upper-left corner of the information bit in column g, row h can be obtained by the following expressions.

In expressions (2) and (3), the third term adds the width of the information bits and the fourth term corresponds to the tilt correction. Fractions are rounded to the nearest integer.
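Expressions (2) and (3) are not reproduced in this copy, so the sketch below does not claim to be them. It shows one plausible way to compute the upper-left corner of the information bit in column g, row h from the detected corners, assuming a 2-dot frame, 5-dot bits, and corners c, e, d taken as the outermost frame pixels lying nominally 23 dots apart. The function name `cell_origin` and all parameters are hypothetical.

```python
import math


def cell_origin(c, e, d, g, h, frame=2, cell=5, span=23):
    """Upper-left pixel of the bit in column g, row h (both 1-based).

    c, e, d: (x, y) of the upper-left, upper-right and lower-left corners.
    The per-dot steps along the top and left edges carry the tilt correction.
    """
    off_col = frame + cell * (g - 1)   # dots to the right of c
    off_row = frame + cell * (h - 1)   # dots below c
    ux, uy = (e[0] - c[0]) / span, (e[1] - c[1]) / span
    vx, vy = (d[0] - c[0]) / span, (d[1] - c[1]) / span
    x = c[0] + off_col * ux + off_row * vx
    y = c[1] + off_col * uy + off_row * vy
    # Round half up, as the text specifies rounding to the nearest integer.
    return math.floor(x + 0.5), math.floor(y + 0.5)


# Untilted element whose frame corner is at (1, 1) in 1-based coordinates.
print(cell_origin((1, 1), (24, 1), (1, 24), g=2, h=1))
```

For an untilted element this gives columns 3, 8, 13, 18 and rows 3, 8, 13, 18 as the bit origins, matching the layout of FIG. 1 as described in the text.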

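With the coordinates of the upper-left corner of each information bit obtained in this way, consider the 25 dots in the square extending 5 dots to the right and 5 dots downward from that point: the bit is 1 if the majority of them are black and 0 if the majority are white. For the reason given earlier, even if the bit is judged 1 when 16 or more are black, 0 when 9 or fewer are black, and indeterminate in between, the probability of an indeterminate result is small. The code of the first information element is thus obtained as a 16-bit binary number, which is divided into groups of 4 bits and converted to decimal digits to give the JIS kanji code.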
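The black/white decision and the conversion back to a four-digit code can be sketched as follows, using the stricter thresholds of 16 and 9 mentioned above. The names `classify_cell` and `decode_element` are hypothetical, indices are 0-based, and the element is assumed to be untilted with its first bit starting at (2, 2).

```python
def classify_cell(img, x0, y0):
    """Return 1, 0, or None (indeterminate) for the 5 x 5 bit at (x0, y0)."""
    black = sum(img[y][x] for y in range(y0, y0 + 5) for x in range(x0, x0 + 5))
    if black >= 16:
        return 1
    if black <= 9:
        return 0
    return None


def decode_element(img, left=2, top=2):
    """Read the 4 x 4 bits of one information element as a decimal code.

    The result is an int, so a code with a leading zero digit loses that zero.
    """
    code = 0
    for h in range(4):                 # one decimal digit per row
        digit = 0
        for g in range(4):             # least significant bit at the left
            bit = classify_cell(img, left + 5 * g, top + 5 * h)
            if bit is None:
                raise ValueError(f"indeterminate bit at column {g}, row {h}")
            digit |= bit << g
        if digit > 9:
            raise ValueError(f"row {h} does not hold a decimal digit")
        code = code * 10 + digit
    return code


# Hand-built data area for the bit rows 0100, 0000, 1100, 1100.
bits = [[0, 1, 0, 0], [0, 0, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
img = [[0] * 24 for _ in range(24)]
for h, row in enumerate(bits):
    for g, bit in enumerate(row):
        for y in range(2 + 5 * h, 7 + 5 * h):
            for x in range(2 + 5 * g, 7 + 5 * g):
                img[y][x] = bit
print(decode_element(img))
```

Because only counts are compared, no stored reference pattern is consulted, which is what allows the recognition program to stay short.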
If more than half of the 25 dots in the square surrounded by the dot sides are black, it is set as 1, and if less than half are black, it is set as 0. For the reason mentioned above, if 16 or more is black, if it is 1.9 or less, it is OL.Even if it is impossible to determine the value in between, the probability that it will be impossible to determine is small, and in this way, the code of the first information bit is 16. Since it is obtained as a binary number of bits, it is possible to divide it into 4-digit units and convert it into a decimal number to obtain the JIS Kanji code.

Once the code of the first information element has been recognized, the information element to its right is sought next; its position is estimated from the position of the preceding information element. When the count of information elements per row shows that a row has been completed, processing moves to the start of the next row, whose position is estimated from the position of the first information element in the preceding row. Information elements are sought one after another in this way, and when no information element is found at the estimated position, the code sequence is judged to have ended.

[Effect of the invention]

As described above, the present invention encodes kanji and recognizes the code directly, so no standard patterns are needed and no pattern matching has to be performed; kanji codes can therefore be recognized at high speed with a short program.

In addition, a code printed by a printer that represents one kanji in 24 dots × 24 dots can be input and recognized with an image scanner of the same resolution, so kanji codes can be recognized even with a relatively inexpensive image scanner having a resolution of about 8 dots/mm.

Furthermore, since one kanji character and one information element have the same number of pixels, software development for printing with a printer becomes easy.

[Brief explanation of the drawing]

FIGS. 1 to 6 show an embodiment of the present invention. FIG. 1 shows the information element code used in the kanji recognition method of the present invention. FIG. 2 is a block diagram showing the overall configuration of the kanji recognition device used in the method. FIG. 3 is a flowchart for printing an information element code. FIG. 4 shows the arrangement of information element codes. FIG. 5 shows the state when an information element code is input by the method. FIG. 6 illustrates the method of detecting the coordinates of the outer frame of an information element code. FIG. 7 shows a conventional Karura code. FIGS. 8 and 9 show the position of the image scanner sensor and the resulting decisions.

c to f: four corner points of the information element; 100: image scanner; 101: image memory; 102: control unit; 103: memory; 103-1: program section; 103-2: code section; 104: communication unit; 105: printer.

Claims

[Claims]

1. A kanji recognition method characterized in that one kanji character is represented by a four-digit number, each digit is represented by 4 bits for a total of 16 bits, this is encoded and printed as 24 dot × 24 dot pixels, and the printed code is read by an image scanner having the same resolution as the printer.

2. The kanji recognition method according to claim 1, wherein the outer frame of the code is 2 dots wide and 5 dot × 5 dot pixels are allotted to each information bit.