JPS6084684A

JPS6084684A - Character recognizing system

Info

Publication number: JPS6084684A
Application number: JP58193140A
Authority: JP
Inventors: Mitsumasa Sugiyama; 杉山　光正
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1983-10-14
Filing date: 1983-10-14
Publication date: 1985-05-14

Abstract

PURPOSE: To keep the classification error rate for basic strokes low with a small amount of processing, and to obtain a high recognition rate, by limiting the group of basic strokes to be collated according to the number of strokes of the input character. CONSTITUTION: A stroke number counting part 5 segments characters using a character-separation designation made by touching a certain area on a tablet 1 with the pen point of an input pen 2, a pen-touch off state that lasts longer than a prescribed time, and the like. It derives the number of strokes of the input character and the order of each stroke from the on and off states of the pen touch of the input pen 2, and sends them to a stroke recognizing part 6 and a character recognizing part 9. A pre-processing part 4 sends the stroke information for each stroke to the stroke recognizing part 6. The stroke recognizing part 6 collates the stroke information obtained from the pre-processing part 4 with the stroke patterns of the basic strokes selected according to the stroke count obtained from the stroke number counting part 5.

Description

DETAILED DESCRIPTION OF THE INVENTION

(Technical Field) The present invention relates to a character recognition method that performs character recognition based on information about the strokes that constitute a character.

(Prior Art) Conventionally, character recognition has used a method in which input strokes are classified into basic strokes prepared in advance, based on information about the strokes that constitute a character, and the input character is recognized from the resulting set of basic strokes. In Japanese, however, many kinds of characters are used, including kanji, hiragana, katakana, alphabetic letters, and numerals, so the number of basic strokes needed to classify their strokes is also large. Moreover, many of the strokes that make up hiragana are not used in kanji, katakana, alphabetic letters, or numerals, and some of the strokes that make up alphabetic letters and numerals are not used in other characters. When there are many kinds of characters to be recognized, the number of basic strokes that must be provided inevitably grows, and the classification errors that occur when input strokes are classified into basic strokes have lowered the recognition rate.

(Object) An object of the present invention is to provide a character recognition method that limits the group of basic strokes to be matched according to the number of strokes of the input character, thereby keeping the basic stroke classification error rate low with a small amount of processing and achieving a high recognition rate.

(Embodiment) An embodiment of the present invention is described in detail below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of a character recognition device according to an embodiment of the present invention. In the figure, 3 is a character information input device for inputting the character information to be recognized. It consists of a tablet 1 and an input pen 2, and input is performed by drawing the character information on the tablet 1 with the input pen 2. 4 is a preprocessing unit that applies noise removal, normalization, and the like to the character information input from the character information input device 3. 5 is a stroke number counting unit that determines the number of strokes of the input character from the coordinate information of the input pen 2 on the tablet 1 and the on/off information indicating whether the input pen 2 is touching the tablet 1. 6 is a stroke recognition unit that recognizes the input strokes of the input character using information from the preprocessing unit 4 and from the stroke number counting unit 5. 7 is a basic stroke dictionary unit in which the representative stroke patterns of the basic strokes used for stroke recognition are registered. 8 is a character information processing unit that processes the length, positional relationships, and the like of each input stroke. 9 is a character recognition unit that recognizes the input character from the results obtained from the stroke recognition unit 6 and the character information obtained from the character information processing unit 8. 10 is a character dictionary unit in which a plurality of character patterns are stored, and 11 is an output unit that outputs the result recognized by the character recognition unit 9.

FIG. 2 shows examples of the basic strokes stored in the basic stroke dictionary unit 7 of FIG. 1. For each basic stroke it lists the stroke ID number, the representative stroke, and whether the basic stroke is used only in characters of 1 to 6 strokes. The arrow on each representative stroke indicates the direction of movement of the input pen 2. Strokes marked "0" in the third column are basic strokes used only in hiragana, katakana, alphabetic letters, and numerals, and only in characters that have between 1 and 6 strokes.

Next, the operation of the present embodiment is described with reference to FIGS. 1 and 2.

When the operator writes a character on the tablet 1 with the input pen 2, the coordinate information of the input pen 2 on the tablet 1 and the information on whether the tip of the input pen 2 is touching the tablet 1 are sent to the preprocessing unit 4 and the stroke number counting unit 5 at fixed time intervals. The stroke number counting unit 5 segments characters using a character-separation designation made by touching a certain area on the tablet 1 with the tip of the input pen 2, a pen-touch off state that lasts longer than a certain time, and the like. From the on/off state of the pen touch of the input pen 2, it determines the number of strokes of the input character and the ordinal position of each stroke, and sends them to the stroke recognition unit 6 and the character recognition unit 9. The preprocessing unit 4 applies processing such as noise removal and smoothing to the input character information, then segments the strokes using the pen-touch on/off information of the input pen 2, and sends the stroke information for each stroke to the stroke recognition unit 6. It also sends the length of each stroke of the input character, the start and end points of each stroke, the coordinates of the points where the direction of movement of the input pen 2 changes, whether strokes intersect, and the like to the character information processing unit 8.
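The stroke counting described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the `Sample` record, the `segment_characters` function, and the 0.8 second `char_gap` value are all assumptions introduced here, and the separation by touching a designated tablet area is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

@dataclass
class Sample:
    """One periodic tablet reading (hypothetical record layout)."""
    t: float     # sample time in seconds
    x: float     # pen x coordinate on the tablet
    y: float     # pen y coordinate on the tablet
    down: bool   # True while the pen tip touches the tablet

def segment_characters(samples: List[Sample],
                       char_gap: float = 0.8) -> List[List[List[Point]]]:
    """Group samples into characters, each a list of strokes (lists of points).

    A stroke ends when the pen lifts. A character ends when the pen stays
    up for longer than char_gap seconds (an assumed threshold).
    """
    characters: List[List[List[Point]]] = []
    strokes: List[List[Point]] = []   # strokes of the character in progress
    current: List[Point] = []         # points of the stroke in progress
    last_up_time = None               # time at which the last stroke ended

    for s in samples:
        if s.down:
            starting_stroke = not current
            if (starting_stroke and strokes and last_up_time is not None
                    and s.t - last_up_time > char_gap):
                # Long pen-up pause: close the previous character.
                characters.append(strokes)
                strokes = []
            current.append((s.x, s.y))
        elif current:
            # First pen-up sample after a stroke: close the stroke.
            strokes.append(current)
            current = []
            last_up_time = s.t

    if current:
        strokes.append(current)
    if strokes:
        characters.append(strokes)
    return characters
```

For each returned character, `len(character)` is the stroke count and the index of a stroke plus one is its ordinal position, which are the two values the stroke number counting unit 5 passes on.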

The stroke recognition unit 6 matches the stroke information obtained from the preprocessing unit 4 against the stroke patterns of the basic strokes selected according to the stroke count obtained from the stroke number counting unit 5, and classifies each input stroke as one of the basic strokes.

Suppose that the kanji 「辞」 is input from the tablet 1 as a 13-stroke character, one stroke at a time, beginning with 「／」. While the first through sixth strokes are being input, it is not yet known whether the input character has 6 or fewer strokes or 7 or more, so each stroke is matched against all of the basic stroke patterns according to a predetermined algorithm and classified as a basic stroke. When the seventh stroke 「−」 is input, however, it becomes known that the character has 7 or more strokes, so the subsequent strokes are matched, according to the predetermined algorithm, against the basic stroke patterns excluding those basic strokes that appear only in characters of 1 to 6 strokes, and are classified as basic strokes. If any of the first through sixth strokes has been classified as a basic stroke that appears only in characters of 1 to 6 strokes, that stroke is either matched again against the basic stroke patterns excluding those that appear only in characters of 1 to 6 strokes and reclassified as a basic stroke, or rejected as an unrecognizable stroke and subjected to predetermined processing.

When the stroke recognition unit 6 has finished processing all the strokes of one character in this way, the character recognition unit 9 obtains the ID number of each input stroke from the stroke recognition unit 6; the stroke position information, stroke intersection information, and stroke length information from the character information processing unit 8; and the number of strokes of the input character from the stroke number counting unit 5. It matches these against the character patterns registered in the character dictionary unit 10 and outputs the recognition result from the output unit 11.
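The restriction of the candidate set by stroke count can be illustrated with a small sketch. Everything named here is hypothetical: `BasicStroke`, `candidates`, `classify_character`, and the toy matcher `nearest` are illustrative stand-ins, since the source only refers to "a predetermined algorithm" for the matching itself. Of the two options the text allows for an earlier stroke that was classified as a short-character-only basic stroke, this sketch reclassifies it instead of rejecting it.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

SHORT_CHAR_MAX = 6  # threshold used in the embodiment (1 to 6 strokes)

@dataclass(frozen=True)
class BasicStroke:
    stroke_id: int
    short_only: bool  # True if used only in characters of 1 to 6 strokes

Matcher = Callable[[float, Sequence[BasicStroke]], Optional[BasicStroke]]

def candidates(dictionary: Sequence[BasicStroke],
               strokes_so_far: int) -> List[BasicStroke]:
    """Basic strokes still possible once strokes_so_far strokes are known."""
    if strokes_so_far <= SHORT_CHAR_MAX:
        return list(dictionary)
    return [b for b in dictionary if not b.short_only]

def classify_character(strokes: Sequence[float],
                       dictionary: Sequence[BasicStroke],
                       match: Matcher) -> List[Optional[BasicStroke]]:
    """Classify strokes in input order, narrowing candidates as the count grows."""
    results = [match(s, candidates(dictionary, n))
               for n, s in enumerate(strokes, start=1)]
    if len(strokes) > SHORT_CHAR_MAX:
        # The character turned out to be long: revisit early strokes that
        # were assigned a short-character-only basic stroke.
        restricted = candidates(dictionary, len(strokes))
        for i in range(SHORT_CHAR_MAX):
            r = results[i]
            if r is not None and r.short_only:
                results[i] = match(strokes[i], restricted)
    return results

# Toy data: a "stroke" is reduced to a direction angle in degrees, and the
# matcher picks the template with the nearest angle (assumption for the demo).
TEMPLATE_ANGLE = {1: 0.0, 2: 90.0, 3: 45.0}
DICTIONARY = [BasicStroke(1, False), BasicStroke(2, False), BasicStroke(3, True)]

def nearest(stroke: float, cands: Sequence[BasicStroke]) -> Optional[BasicStroke]:
    if not cands:
        return None
    return min(cands, key=lambda b: abs(TEMPLATE_ANGLE[b.stroke_id] - stroke))

short_ids = [b.stroke_id for b in classify_character([50.0, 0.0, 90.0], DICTIONARY, nearest)]
long_ids = [b.stroke_id for b in classify_character([50.0] + [0.0] * 6, DICTIONARY, nearest)]
```

In the three-stroke case the first stroke may keep the short-only template 3, while in the seven-stroke case the same first stroke is rematched against the restricted set once the seventh stroke arrives.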

In the present embodiment, the group of basic strokes to be matched is changed depending on whether the character has 6 or fewer strokes or 7 or more. However, some strokes appear only at a particular stroke position. In FIG. 2, for example, the strokes with ID numbers 7, 10, 18, 34, 35, and 36 appear only as the first stroke, and the strokes with ID numbers 16 and 17 appear only as the second stroke. The group of basic strokes to be matched may therefore be limited not only by the number of strokes of the character but also by the ordinal position of the input stroke within the character.
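The position-based variant suggested above could look like the following sketch. The `PositionedStroke` record and its `only_at` field are assumptions made for illustration. IDs 7 and 16 follow the examples given in the text, while ID 1 is a made-up unrestricted entry.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

@dataclass(frozen=True)
class PositionedStroke:
    stroke_id: int
    # Stroke position (1-based) at which this basic stroke can appear;
    # None means it can appear at any position. Hypothetical field.
    only_at: Optional[int] = None

def candidates_for_position(dictionary: Sequence[PositionedStroke],
                            position: int) -> List[PositionedStroke]:
    """Keep basic strokes that are allowed at the given stroke position."""
    return [b for b in dictionary if b.only_at is None or b.only_at == position]

POSITION_DICTIONARY = [
    PositionedStroke(1),              # hypothetical stroke with no restriction
    PositionedStroke(7, only_at=1),   # appears only as the first stroke
    PositionedStroke(16, only_at=2),  # appears only as the second stroke
]

first_ids = [b.stroke_id for b in candidates_for_position(POSITION_DICTIONARY, 1)]
second_ids = [b.stroke_id for b in candidates_for_position(POSITION_DICTIONARY, 2)]
third_ids = [b.stroke_id for b in candidates_for_position(POSITION_DICTIONARY, 3)]
```

A filter of this kind can be combined with a stroke-count filter by applying both predicates to the same dictionary before matching.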

(Effects) As is clear from the above description, according to the present invention the group of basic strokes to be matched is limited by the number of strokes of the input character, so that a high stroke recognition rate is obtained with a small amount of processing and the character recognition rate can be increased.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing the configuration of a character recognition device according to an embodiment of the present invention, and FIG. 2 is a diagram showing the basic strokes stored in the basic stroke dictionary unit shown in FIG. 1. 3 is a character information input device, 5 is a stroke number counting unit, 6 is a stroke recognition unit, 7 is a basic stroke dictionary unit, 9 is a character recognition unit, and 10 is a character dictionary unit.

Applicant: Canon Inc.

Claims

[Claims]

A character recognition method, characterized in that, in a character recognition device that performs character recognition based on information regarding strokes constituting a character, a group of basic strokes to be compared is changed according to the number of strokes of an input character.