JPH0721817B2

JPH0721817B2 - Document image processing method

Info

Publication number: JPH0721817B2
Application number: JP61065640A
Authority: JP
Inventors: 浩至福田; 匡利樋野; 邦晃田畑
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1986-03-26
Filing date: 1986-03-26
Publication date: 1995-03-08
Anticipated expiration: 2010-03-08
Also published as: JPS62224870A

Description

[Detailed description of the invention]

[Field of industrial application]

The present invention relates to a document image processing method, and more specifically to a document image processing method suited to recognizing characters in document information captured as image data, or to extracting a specific character region.

[Conventional technology]

In a document image processing system that uses a large-capacity memory such as an optical disk as its file device, there is a demand for automatic input of the index information needed to retrieve stored documents. A document image is broadly divided into photographic regions, which inherently take gray-scale values, and regions represented by binary information, such as characters or graphics. To input index information automatically, a specific character string or character must be extracted from a character region of the input document image, and the index term must be converted into character codes by character recognition.

As a conventional technique for recognizing characters in a document image, IEEE Proceeding 6th ICPR, 1982, pp. 1023 to 1026, for example, reports a method that targets English (Roman-alphabet) text and separates and recognizes groups of characters (words) that touch one another within a character string. As a technique for separating individual characters from a character string for character recognition, the Transactions of the Institute of Electronics and Communication Engineers of Japan, Vol. J67-D, No. 10 (October 1984), for example, reports character segmentation for Japanese documents. However, all of these reports assume that the type of characters making up the document (English or Japanese) is known. None addresses document images in which the character type is unknown or several character types are mixed.

Meanwhile, the Transactions of the Information Processing Society of Japan, Vol. 25, No. 2 (March 1985), pp. 257 to 264, reports a method for identifying the type of characters making up a document from the visual features of the character patterns. In this report, characters are scanned with a small unit mesh of, for example, 3 × 3 picture elements. When the mesh is constrained so that at most one white-pixel region and one black-pixel region appear within it, the unit-mesh patterns are limited to 66 of the 2⁹ = 512 possible patterns (the legal patterns). From the types and frequency distribution of the legal patterns obtained during scanning, features such as the slant direction of the character strokes, their curvature, and the aspect ratio of the characters are extracted, and these features are used to discriminate Roman, Greek, Russian, Japanese, Hangul and other characters.

[Problems to be solved by the invention]

However, the character-type identification scheme based on legal patterns relies on the statistical features of the legal patterns appearing in a document. It therefore cannot be applied when two or more character types are mixed in the same document, as in academic papers whose important parts, such as the title, authors and abstract, are given in English as well as in Japanese. In addition, because this scheme must determine the stroke direction, shape and so on for every black pixel, the processing is complex and the data processing for the determination requires a large memory capacity.

An object of the present invention is to provide a document image processing method suited to cases in which two types of character region are mixed in the same document image, and in particular to distinguishing Japanese-text regions from English-text regions.

[Means for solving problems]

To achieve the above object, the document image processing method according to the present invention comprises the steps of: capturing document information including character strings into a memory as image data; obtaining, from the image data, a circumscribed rectangle for each connected black-pixel region; obtaining, from the image data, a circumscribed rectangle for each character string; obtaining, for each character string, a frequency distribution of the relative vertical positions of the upper and lower sides of the circumscribed rectangles of the connected black-pixel regions contained in the circumscribed rectangle of that character string; and determining, from the peak positions of the frequency distribution, whether the characters making up that character string are English or Japanese.

[Action]

In the character regions of a document, the circumscribed rectangles of the connected black-pixel regions, which represent the strokes of each character, are arranged differently in Japanese text, made up of kanji, katakana and hiragana, and in English text, made up of uppercase and lowercase letters. For example, English text has spaces between words, whereas Japanese text does not. Also, in English text the circumscribed rectangles of the connected black-pixel regions clearly differ in size between uppercase and lowercase letters, whereas in Japanese text the size difference between kanji and hiragana or katakana is not as large as in the alphabet.
The adjacency relationships of these circumscribed rectangles, and the way they vary in the vertical direction, are therefore characteristic. For example, if the degree to which the circumscribed rectangle of each connected black-pixel region is separated from the upper and lower sides of the character-string circumscribed rectangle, or the spacing between those circumscribed rectangles along the character-string direction, is obtained as a frequency distribution, the pattern of the distribution indicates whether each character string is Japanese or English. Japanese text can be distinguished from text in Greek, Russian, German and other languages in the same way.

According to the present invention, the character type of each character string is determined from the frequency distribution of the positions of the circumscribed rectangles of its connected black-pixel regions. This makes it possible, for example, to extract the English character regions from conference papers and articles written in Japanese and to cut them out to create abstracts automatically. Furthermore, because character recognition can be run on a given region with its character type already known, misrecognition is reduced and conversion to character codes is fast, and the resulting character codes can be registered automatically as an index.

[Example]

Embodiments of the present invention are described below.

FIG. 1 is a hardware configuration diagram of a document image processing system embodying the present invention. Reference numeral 1 denotes a data processor; 2, a memory storing the programs required by the processor; and 3, a memory storing the dictionaries required for character recognition, which in this example comprise a Japanese dictionary and an English dictionary. Numeral 4 denotes a memory for storing data such as processing results and intermediate results; 5, a keyboard for entering commands, numerical data and the like to the processor 1; and 6, a document image input device. The document image input device 6 is typically a scanner that reads image information from a document, but it may also be a magnetic tape unit, magnetic disk unit or similar device that reads data from a recording medium on which the data has already been recorded. Numeral 7 denotes a frame memory that temporarily stores the input document image data; 8, a display device for displaying the document image data; and 9, a printer for producing hard copies. These elements are connected by a bus 10.

In the present invention, the processor 1 processes the document image data input from the image input device 6 as shown in FIG. 2.

That is, for the document image 11 input to the frame memory 7, the circumscribed rectangles 12 of the connected black-pixel components and the circumscribed rectangle 13 of each character string are obtained. The circumscribed rectangles of the connected black-pixel components can be extracted by a known method such as that disclosed in JP-A-60-129379. For the character-string circumscribed rectangles 13, a known technique can be used, such as the one reported in the IECE technical report (Shingaku Giho) PRL-83-7 under the title "A method for extracting individual characters from vertically and horizontally written documents".
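The extraction of black-pixel circumscribed rectangles can be illustrated with a short sketch. It does not reproduce the method of JP-A-60-129379; it substitutes a plain breadth-first flood fill over 8-connected pixels, and the function name `black_component_boxes`, the 0/1 pixel encoding and the `(xmin, ymin, xmax, ymax)` tuple layout are all assumptions made for illustration.

```python
from collections import deque


def black_component_boxes(image):
    """Return one (xmin, ymin, xmax, ymax) box per 8-connected black component.

    `image` is a list of rows; 1 means black and 0 means white (assumed encoding).
    Here y is simply the row index of the list.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if image[y0][x0] != 1 or seen[y0][x0]:
                continue
            # Flood-fill from an unvisited black pixel, tracking the extent.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            xmin = xmax = x0
            ymin = ymax = y0
            while queue:
                x, y = queue.popleft()
                xmin, xmax = min(xmin, x), max(xmax, x)
                ymin, ymax = min(ymin, y), max(ymax, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((xmin, ymin, xmax, ymax))
    return boxes
```

Each returned tuple plays the role of one rectangle 12; a production system would more likely use a run-length or labeling pass over the frame memory rather than a per-pixel queue.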

In the present invention, the character type 14 of each character string is determined from the two kinds of circumscribed rectangles 12 and 13 extracted from the input image in this way, and character segmentation 15 or character recognition 16 is then performed on the basis of that determination.

FIG. 3(A) shows, as an example of an input document, part of an input image 11 containing a Japanese-text region 20 and an English-text region 21. FIG. 3(B) shows the character-string rectangles 31 to 38 extracted from that input image, together with the circumscribed rectangles of the connected black-pixel components (the small rectangles inside each character-string circumscribed rectangle). Examining the relationship between the circumscribed rectangle of a character string (hereinafter, the character-string rectangle) and the circumscribed rectangles of the connected black-pixel regions it contains (hereinafter, the black-pixel circumscribed rectangles) in this example reveals the following differences between the Japanese-text region 20 and the English-text region 21.

(1) In Japanese text, most black-pixel circumscribed rectangles are aligned with the bottom side of the character-string rectangle. In English text, some letters and symbols, such as "g", "j", "p", "q", "y" and ",", have strokes that protrude downward, and these determine the bottom side of the character-string rectangle, so most black-pixel circumscribed rectangles lie away from that bottom side.

(2) In English text, almost every letter consists of a connected stroke, and the upper sides of the black-pixel circumscribed rectangles fall on two prominent levels: a first level that coincides with the upper side of the character-string rectangle, and a second level that corresponds to the top of the lowercase letters. In Japanese text, the upper sides of the black-pixel circumscribed rectangles are concentrated at the upper side of the character-string rectangle, and there is no distinct second level.

(3) As for the adjacency of the black-pixel circumscribed rectangles within each character string, English text has large gaps between words, whereas Japanese text has no such gaps and the spacing between its rectangles is irregular.

The present invention exploits these differences between the two types of character region to identify the type of the constituent characters for each character string.

FIG. 4 shows the notation used for each rectangle, taking as an example a character-string rectangle L that contains a plurality of black-pixel circumscribed rectangles R1 to R7. The size and position of the character-string rectangle L are given by the coordinate addresses (LXmin, LYmin) and (LXmax, LYmax) of its upper-left and lower-right pixels. Likewise, the size and position of each black-pixel circumscribed rectangle are given by the coordinate addresses (RXmin, RYmin) and (RXmax, RYmax) of its upper-left and lower-right corners. Every black-pixel circumscribed rectangle R contained in a character-string rectangle L satisfies the inclusion condition

LXmin ≦ RXmin, RXmax ≦ LXmax, LYmin ≦ RYmin, RYmax ≦ LYmax   (1)

The coordinate values of the rectangles extracted from the input image are stored, for each rectangle ID, in a black-pixel circumscribed rectangle table 51 and a character-string rectangle table 52 prepared in the data memory 4, as shown for example in FIG. 5.
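A minimal sketch of the two rectangle tables and of the inclusion test may help here. It assumes that the inclusion condition is plain rectangle containment on the (Xmin, Ymin, Xmax, Ymax) addresses; the `Rect` class, the dictionary-based tables and the function names are hypothetical and are not taken from FIG. 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """One table row: the two diagonal corner addresses of a rectangle."""
    xmin: int
    ymin: int
    xmax: int
    ymax: int


def contains(line: Rect, box: Rect) -> bool:
    """Assumed inclusion condition: `box` lies entirely inside `line`."""
    return (line.xmin <= box.xmin and box.xmax <= line.xmax
            and line.ymin <= box.ymin and box.ymax <= line.ymax)


def find_line(line_table: dict, box: Rect):
    """Return the ID of the first character-string rectangle containing `box`.

    `line_table` maps rectangle IDs to Rect (a stand-in for table 52).
    Returns None when no character-string rectangle contains the box.
    """
    for line_id, line in line_table.items():
        if contains(line, box):
            return line_id
    return None
```

Storing the matching line ID alongside each black-pixel rectangle avoids repeating this search in later passes.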

To determine the character type of each character string from properties (1) and (2) above, each character-string rectangle Lj is divided in the height direction into n regions (eight regions a to h in this example), as shown for example in FIG. 6. For each black-pixel circumscribed rectangle R contained in the character string, the divided region in which its upper side RYmax lies and the divided region in which its lower side RYmin lies are found, and the frequency distributions are obtained. As shown in FIG. 7, the frequency distributions of RYmax and RYmin have different characteristics for Japanese and English text. For example, a character string with high frequencies in the outermost divided regions a and h can be judged to be Japanese, and a character string with high frequencies in the inner divided regions b and g can be judged to be English.

To use property (3), on the other hand, the distribution of the distance D between the black-pixel circumscribed rectangles contained in the same character string is obtained, as shown in FIG. 8. If the distribution has two prominent peaks D₁ and D₂, as in FIG. 9, the string is judged to be English; otherwise it is judged to be Japanese.

The document image processing procedure of the present invention, which uses a first determination based on properties (1) and (2) and a second determination based on property (3), is described below with reference to program flowcharts.

FIG. 10 shows the overall document image processing procedure, which consists of the following processing steps.

Process 61: A document image is input from the image input device 6.

Process 62: The circumscribed rectangles R of the connected black-pixel components are extracted, and the coordinate addresses of each extracted rectangle are stored, one rectangle per row, in the rectangle table 51 of FIG. 5. The rectangles are stored here as the addresses of two diagonal corners, but any table format that defines a rectangle uniquely may be used, for example the coordinate address of one corner together with the width and height of the rectangle.

Process 63: The character-string rectangles L are extracted and their address data are stored in the character-string rectangle table 52.

Process 64: A first determination process is performed, which refers to the black-pixel circumscribed rectangle table 51 and the character-string rectangle table 52 to determine whether each character string is Japanese or English.

Process 65: A second Japanese-or-English determination process is performed on the character strings that could not be determined in process 64.

Process 66: With reference to these determination results, processing such as character recognition and character segmentation is performed using the dictionary memory 3.

FIG. 11 is a flowchart showing the first determination process 64 in detail. It consists of the following steps.

Process 701: For each character string, the counters UC, which count the distribution of the upper sides of the black-pixel circumscribed rectangles R, and the counters DC, which count the distribution of their bottom sides, are cleared.

Process 702: The coordinate data of a rectangle R are read from the black-pixel circumscribed rectangle table 51, in rectangle ID order.

Process 703: The table 52 is searched for the character string Lj that contains the rectangle R. The inclusion condition is given by equation (1) above.

Process 704: The character string Lj is divided into n regions in the height direction. The upper-limit address Amax(i) and lower-limit address Amin(i) of each divided region are defined by

Amax(i) = LYmin + (LYmax − LYmin)・i/n
Amin(i) = LYmin + (LYmax − LYmin)・(i − 1)/n

Process 705: The divided region containing the upper-side address RYmax of the rectangle R is found, and the corresponding counter UC(i) is incremented. The index i of the divided region is obtained from RYmax by the relation

Amin(i) ≦ RYmax ≦ Amax(i)

Process 706: The divided region containing the bottom-side address RYmin of the rectangle R is found, and the corresponding counter DC(i) is incremented. The index i of the divided region is obtained from RYmin by the relation

Amin(i) ≦ RYmin ≦ Amax(i)

Through these processes, the divided regions a to h of FIG. 6 in which the upper and lower sides of each rectangle R lie are recorded as frequencies in the counters UC and DC of each character string.
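Processes 704 to 706 can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the function names `band_index` and `edge_histograms`, the `(xmin, ymin, xmax, ymax)` tuple layout and the tie rule at band boundaries are hypothetical. The sketch follows the formulas for Amin(i) and Amax(i), under which band a starts at LYmin and band h ends at LYmax, so Y addresses are taken to increase toward the top of the line.

```python
def band_index(y, line_ymin, line_ymax, n=8):
    """Return the 1-based band i with Amin(i) <= y <= Amax(i).

    A value exactly on an interior boundary is assigned to the upper
    band (assumed tie rule); y == line_ymax is clamped into band n.
    """
    height = line_ymax - line_ymin
    if height <= 0:
        return 1  # degenerate line: treat everything as a single band
    i = (y - line_ymin) * n // height + 1
    return max(1, min(n, int(i)))


def edge_histograms(line, boxes, n=8):
    """Count upper sides (UC) and bottom sides (DC) per band.

    `line` and each entry of `boxes` are (xmin, ymin, xmax, ymax) tuples.
    Index 0 of each returned list corresponds to band a, index n-1 to band h.
    """
    _, line_ymin, _, line_ymax = line
    uc = [0] * n
    dc = [0] * n
    for _, ymin, _, ymax in boxes:
        uc[band_index(ymax, line_ymin, line_ymax, n) - 1] += 1
        dc[band_index(ymin, line_ymin, line_ymax, n) - 1] += 1
    return uc, dc
```

For a line of height 80 split into eight bands, a rectangle spanning the full height contributes to UC in band h and to DC in band a, which is the pattern property (1) associates with Japanese text.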

Process 707: It is checked whether the above processing has been completed for every rectangle in the table 51. If not, the procedure returns to process 702; if so, it proceeds to process 708.

Process 708: The counters UC and DC of each character string are examined to determine whether their maximum counts occur in the counters for the divided regions at the upper end and the lower end of the character-string rectangle, respectively. That is, it is checked whether

max[UC(i)] = UC(h), (i = a to h)
max[DC(i)] = DC(a), (i = a to h)

are both satisfied. If they are, the character string can be judged to be Japanese.

Process 709: To indicate that the character string is Japanese, a Japanese-text code is recorded, for example, in the column F of the table 52 that holds the determination result.

Process 710: If the string could not be judged to be Japanese in process 708, it is judged to be English when the maximum count of the counters UC occurs in a divided region other than the uppermost region h and the maximum count of the counters DC occurs in a divided region other than the lowermost region a.

That is, if

max[UC(i)] ≠ UC(h), (i = a to h)
max[DC(i)] ≠ DC(a), (i = a to h)

are both satisfied, the string is judged to be English and the procedure proceeds to process 711; otherwise it proceeds to process 712.

Process 711: A code indicating English text is recorded in column F of the character-string rectangle table 52.

Process 712: A symbol indicating that the type could not be determined is recorded in column F of the character-string rectangle table 52.

Process 713: It is determined whether the character type has been determined for every character string. If unprocessed character strings remain, the procedure returns to process 708; otherwise this routine ends.
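The decision made in processes 708 to 712 can be sketched as a small function over the two counter arrays. The function name `classify_by_edges`, the string labels and the handling of ties (the first maximum found is used) are assumptions for illustration; the source does not say how equal counts are resolved.

```python
def classify_by_edges(uc, dc):
    """Classify one character string from its UC and DC counters.

    uc[k] and dc[k] hold the counts for band k, with index 0 as band a
    (bottom) and the last index as band h (top).
    Returns "japanese", "english" or "undetermined".
    """
    top = len(uc) - 1
    bottom = 0
    uc_peak = uc.index(max(uc))  # first band holding the maximum (assumed tie rule)
    dc_peak = dc.index(max(dc))
    if uc_peak == top and dc_peak == bottom:
        return "japanese"       # process 708 -> 709
    if uc_peak != top and dc_peak != bottom:
        return "english"        # process 710 -> 711
    return "undetermined"       # process 712


if __name__ == "__main__":
    print(classify_by_edges([0, 0, 0, 0, 0, 1, 0, 2], [2, 0, 1, 0, 0, 0, 0, 0]))
```

Strings that come back as "undetermined" are the ones handed to the second determination.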

FIG. 12 is a flowchart showing the second determination process 65 in detail.

Process 800: Column F of the character-string table 52 is checked to find a character string that was marked as undeterminable by the preceding first determination. If there is none, this routine ends; if there is one, the procedure proceeds to process 801.

Process 801: The data of the character-string rectangle Lj are read.

Process 802: All black-pixel circumscribed rectangles contained in the character-string rectangle Lj are retrieved. The search condition is equation (1). If the correspondence between each rectangle and Lj was saved during the first determination process 64 (for example, by storing the rectangle ID of the corresponding Lj in the table 51), that record may be used instead.

Process 803: The counters LC, which count the distribution of the distance D between adjacent black-pixel circumscribed rectangles, are initialized.

Process 804: For one of the black-pixel circumscribed rectangles found in process 802, R_i, the separation D in the X direction from the adjacent rectangle R_{i+1} is computed from their X coordinates.

Process 805: The counter LC(n) corresponding to the distance D is incremented. The counter index n is obtained by quantizing the distance D in steps of a predetermined value.
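Processes 803 to 805 amount to building a histogram of horizontal gaps. The sketch below is one possible reading: the gap is taken as RXmin of the right-hand rectangle minus RXmax of the left-hand one, overlapping rectangles are counted as gap 0, and the bin width and bin count are arbitrary. All of these choices, and the name `gap_histogram`, are assumptions.

```python
def gap_histogram(boxes, bin_width=2, bins=16):
    """Histogram of X-direction gaps between neighbouring rectangles.

    `boxes` are (xmin, ymin, xmax, ymax) tuples belonging to one
    character string. Returns the counter array LC; gaps beyond the
    last bin are accumulated in the last bin.
    """
    ordered = sorted(boxes, key=lambda b: b[0])  # left-to-right order
    lc = [0] * bins
    for left, right in zip(ordered, ordered[1:]):
        d = max(0, right[0] - left[2])           # overlap counts as gap 0
        lc[min(bins - 1, d // bin_width)] += 1   # quantize D into a bin index
    return lc
```

With English text, narrow inter-letter gaps and wide inter-word gaps would be expected to fall into two well-separated groups of bins.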

Process 806: It is checked whether the distance computation has been completed for every black-pixel circumscribed rectangle contained in the character-string rectangle Lj. If unprocessed rectangles remain, the procedure returns to process 804; otherwise it proceeds to process 807.

Process 807: The values of the counters LC are examined to see whether there are peaks at two mutually separated positions, as in FIG. 9. If there are, the string is judged to be English and the procedure proceeds to process 808; otherwise it proceeds to process 809. The peaks can be detected by checking, for a predetermined slope K that characterizes a peak, whether two counters LC(i) (i = I₁, I₂) exist that satisfy

LC(i) − LC(i − 1) > K
LC(i) − LC(i + 1) > K

and whether

|I₁ − I₂| > p

where p is a value representing a predetermined distance.
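The peak test of process 807 can be sketched directly from the two inequalities. Two points are assumptions of this sketch and not statements of the source: counters outside the array are treated as 0 when testing the first and last bins, and "two peaks" is read as exactly two. The names `find_peaks` and `has_two_separated_peaks` are hypothetical.

```python
def find_peaks(lc, k):
    """Return indices i where LC(i) exceeds both neighbours by more than k."""
    peaks = []
    for i, value in enumerate(lc):
        left = lc[i - 1] if i > 0 else 0             # assumed: missing neighbour = 0
        right = lc[i + 1] if i + 1 < len(lc) else 0
        if value - left > k and value - right > k:
            peaks.append(i)
    return peaks


def has_two_separated_peaks(lc, k, p):
    """True when exactly two peaks exist and they are more than p bins apart."""
    peaks = find_peaks(lc, k)
    return len(peaks) == 2 and abs(peaks[0] - peaks[1]) > p
```

A string for which `has_two_separated_peaks` returns True would be marked English; the thresholds `k` and `p` would need tuning to the scan resolution and bin width.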

Process 808: A code indicating that the character-string rectangle Lj is English is recorded in column F of the table 52.

Process 809: The variance of the count values of the counters LC is compared with a predetermined threshold. If the variance is larger than the threshold, the procedure proceeds to process 810; otherwise it proceeds to process 811.

Process 810: The character-string rectangle Lj is judged to be Japanese, and a code to that effect is recorded in column F of the table 52.

Process 811: It is determined whether all character strings have been processed. If unprocessed character strings remain, the procedure returns to process 800; otherwise this routine ends.

The determination processes 64 and 65 have been described above. Their order may be reversed, or only one of them may be used. When the character type cannot be determined in process 64 or 65, the type of the undetermined character string may be inferred from the determination results for the preceding and following character strings.
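The source does not specify how an undetermined string is inferred from its neighbours, so the following is only one plausible rule, stated as an assumption: an undetermined string takes the label of its nearest determined neighbours when they agree, or of the only determined side when it sits at the start or end of the sequence. The function name `fill_from_neighbours` and the string labels are hypothetical.

```python
def fill_from_neighbours(labels):
    """Replace "undetermined" labels using nearby determined strings.

    `labels` is a list of "japanese", "english" or "undetermined",
    in reading order. Disagreeing neighbours leave the label unchanged.
    """
    result = list(labels)
    for i, label in enumerate(labels):
        if label != "undetermined":
            continue
        # Nearest determined label before and after position i, if any.
        prev = next((l for l in reversed(labels[:i]) if l != "undetermined"), None)
        nxt = next((l for l in labels[i + 1:] if l != "undetermined"), None)
        if prev is not None and (nxt is None or nxt == prev):
            result[i] = prev
        elif prev is None and nxt is not None:
            result[i] = nxt
    return result
```

A stricter variant could require both neighbours to exist and agree, at the cost of leaving more strings undetermined.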

Referring to the English-text region 21 of FIG. 3, the strokes of most letters of the alphabet are connected, and cases in which a letter's black-pixel circumscribed rectangles are split vertically (for example lowercase "i" and "j") are extremely rare. In English text the variances of RYmin and RYmax are therefore small, and this can be used to distinguish it from Japanese text. For example, letting m be the number of black-pixel circumscribed rectangles contained in one character string, the variance Smin of RYmin is obtained as

Smin = (1/m)・Σ (RYmin(k) − M)², summed over k = 1 to m, where M = (1/m)・Σ RYmin(k)

and the variance Smax of RYmax is obtained in the same way. Accordingly, for variance thresholds Smaxφ and Sminφ, Japanese text can be identified by

Smin > Sminφ and Smax > Smaxφ

The values of Smaxφ and Sminφ can be varied with character size by defining them as functions of the character-string height.
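A sketch of this variance test follows. It assumes the ordinary population variance (division by m), since the exact formula is not legible in the source, and it takes the thresholds as plain arguments; the names `variance` and `is_japanese_by_variance` are hypothetical.

```python
def variance(values):
    """Population variance: mean squared deviation from the mean."""
    m = len(values)
    mean = sum(values) / m
    return sum((v - mean) ** 2 for v in values) / m


def is_japanese_by_variance(boxes, smin_threshold, smax_threshold):
    """Apply Smin > Smin_phi and Smax > Smax_phi to one character string.

    `boxes` are (xmin, ymin, xmax, ymax) tuples; ymin and ymax play the
    roles of RYmin and RYmax. An empty string is reported as not Japanese.
    """
    if not boxes:
        return False
    smin = variance([b[1] for b in boxes])
    smax = variance([b[3] for b in boxes])
    return smin > smin_threshold and smax > smax_threshold
```

To follow the suggestion of scaling the thresholds with character size, a caller could pass, for example, a fixed fraction of the squared line height for each threshold; that fraction would have to be chosen empirically.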

This determination can be added to the flowchart of FIG. 10 in place of the first and second determinations, or as a third determination that supplements them.

Because the method described above automatically determines the character type of each character string, character segmentation or character recognition can then be performed on the basis of those character types, as shown in FIG. 2.

[Effect of the invention]

According to the present invention, the type of the constituent characters in the character regions of document image data can be determined automatically for each character string. The determination result can then be used to extract (cut out) specific character regions, perform character recognition, and so on. The invention is therefore effective in document image processing systems that automatically register such results as an index to image files or as auxiliary information such as abstracts.

[Brief description of drawings]

FIG. 1 is an overall configuration diagram showing an example of a document image processing system for carrying out the present invention. FIG. 2 is a diagram explaining the outline of the document image processing procedure according to the present invention. FIGS. 3(A) and 3(B) show, respectively, an input document image and specific examples of the circumscribed rectangles of connected black-pixel regions and the character-string circumscribed rectangles extracted from it. FIG. 4 is an explanatory diagram of the notation for the character-string circumscribed rectangle and the circumscribed rectangles of connected black-pixel regions. FIG. 5 shows the structure of the tables that store these rectangles. FIGS. 6 and 7 explain the counting of the frequency distribution and the characteristics of the frequency distribution in the first character-type determination method. FIGS. 8 and 9 explain the counting of the distribution and the characteristics of the frequency distribution in the second character-type determination method. FIG. 10 is a flowchart of the overall image processing according to the present invention. FIG. 11 is a flowchart for carrying out the first determination method. FIG. 12 is a flowchart for carrying out the second determination method.

Explanation of reference numerals: 1: processor; 2: program memory; 3: dictionary memory; 4: data memory; 5: keyboard; 6: image input device; 7: frame memory; 8: display device; 20: Japanese-text region; 21: English-text region; L: character-string circumscribed rectangle; R: circumscribed rectangle of a connected black-pixel region.

Claims

[Claims]

1. A document image processing method comprising the steps of: capturing document information including character strings into a memory as image data; obtaining, from the image data, a circumscribed rectangle for each connected black-pixel region; obtaining, from the image data, a circumscribed rectangle for each character string; obtaining, for each of the character strings, a frequency distribution of the relative vertical positions of the upper and lower sides of the circumscribed rectangles of the connected black-pixel regions contained in the circumscribed rectangle of that character string; and determining, from the peak positions of the frequency distribution, whether the characters making up that character string are English characters or Japanese characters.

2. The document image processing method according to claim 1, further comprising the steps of: obtaining a frequency distribution of the horizontal spacing between the circumscribed rectangles of adjacent connected black-pixel regions; and determining, from the number of peaks in the frequency distribution of the spacing, whether the characters making up the character string are English characters or Japanese characters.