JPH07192083A

JPH07192083A - Document picture layout analysis device

Info

Publication number: JPH07192083A
Application number: JP5330555A
Authority: JP
Inventors: Noboru Nakajima; 昇中島; Takeshi Kamimura; 健上村
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1993-12-27
Filing date: 1993-12-27
Publication date: 1995-07-28
Anticipated expiration: 2013-06-25
Also published as: JP2768249B2

Abstract

PURPOSE: To perform highly accurate layout analysis on a document in which characters of different point sizes coexist. CONSTITUTION: Connected components of a document image supplied by a document image input means are extracted by a connected component extraction means 11 and classified according to size information by a connected component classification means 12. A partial image generation means 13 then generates, for each class obtained by the classification, an image plane containing only the connected components belonging to that class. Layout analysis means 14 and 15 perform layout analysis on the respective image planes, and a layout information synthesis means 16 combines the layout information extracted from each plane to obtain a layout analysis result for the entire document image. When the layout information conflicts during synthesis, the layout information of the plane that contains many connected components, and whose connected components have a size appropriate for characters, is given priority in producing the final layout analysis result.

Description

Detailed Description of the Invention

[0001]

[Field of industrial application] The present invention relates to a layout analysis device for document images.

[0002]

[Prior art] In a document image recognition device, a document is input by image scanning or similar means, layout analysis is performed on the resulting digital image, and character recognition is performed on each extracted character image to convert it into a character code. Here, layout analysis refers to the processing from extracting text lines from the input image up to segmenting the individual characters.

[0003] One layout analysis method proposed to date is described in "Document recognition system using highly accurate character segmentation processing based on character structure information" (Sun et al., Transactions of the Information Processing Society of Japan, Vol. 33, No. 99, 1992). This method targets Japanese documents and is outlined below. The input document image is binarized and skew-corrected, and then subjected to layout analysis. The image is divided into regions, only the text regions are retained, and black-pixel connected components are extracted from them by labeling. Each resulting cluster of black pixels is called a candidate figure. Many characters are separated characters made up of several connected components, such as "北" and "ハ", and many are touching characters that join into a single connected component, so character figures must be generated from the candidate figures. The most frequent candidate-figure size is taken as the average character size. The candidate figures are projected onto the horizontal axis, and columns are extracted from the resulting distribution. Candidate figures are merged within each column to handle separated characters. Candidate figures whose size is close to the average size are segmented first; the remaining figures are force-merged if they are separated characters or split if they are touching characters, punctuation marks are extracted, and the final character segmentation result is obtained.

[0004] Another layout analysis method, aimed at English documents, is described by Tsujimoto et al. in "Major Components of a Complete Text Reading System" (Tsujimoto et al., Proceedings of the IEEE, Vol. 80, No. 7, 1992). The document image is represented by run lengths. (1) Black-pixel connected components are extracted, and neighboring components lying within a relatively short distance of about 1 mm are merged into segments. (2) Each segment is classified as a text line, figure, picture, and so on. (3) Segments classified as text lines are merged with their neighbors, which yields blocks of text. According to the paper, words are extracted by steps (1) and (2), and text lines by step (3).

[0005] With these methods, text lines of different point sizes that lie close to each other in the same document cannot be extracted correctly. Layout analysis assumes that the typesetting rules hold for the input document image: the space between columns is larger than the space between lines, the space between lines is larger than the space between characters, and the space between characters is larger than the space within a character. If, as in FIG. 10 for example, characters of different point sizes lie closer together than the character spacing, the line spacing of the small-point text lines becomes smaller than the character spacing of the large-point characters. The nearby large-point characters are then merged into the small-point text lines, and the correct lines and characters can no longer be extracted.
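The assumption and its failure case can be written compactly. The symbols below are introduced here for illustration only and do not appear in the original text: s_col, s_line, s_char and s_in denote the spacing between columns, between lines, between characters, and within a character, and the superscripts L and S mark large-point and small-point text.

```latex
% Typesetting rule assumed by conventional layout analysis
% (notation is an assumption made for this illustration):
\begin{align}
  s_{\mathrm{col}} > s_{\mathrm{line}} > s_{\mathrm{char}} > s_{\mathrm{in}}
\end{align}
% Failure case of FIG. 10: the line spacing of the small-point text
% falls below the character spacing of the large-point text.
\begin{align}
  s_{\mathrm{line}}^{S} < s_{\mathrm{char}}^{L}
\end{align}
```

When the second inequality holds, a grouping rule tuned to one point size merges large-point characters into the neighboring small-point lines.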

[0006] Thus, layout analysis has been difficult when characters of different point sizes are mixed in the same document.

[0007]

[Problem to be solved by the invention] Conventional layout analysis works correctly only when the typesetting rules hold for the input document image, namely that the space between columns is larger than the space between lines, the space between lines is larger than the space between characters, and the space between characters is larger than the space within a character. In a document that mixes characters of different point sizes, these rules do not necessarily hold.

[0008] An object of the present invention is to achieve highly accurate layout analysis for documents in which characters of different point sizes are mixed, which has been difficult with conventional methods.

[0009]

[Means for solving the problem] The present invention is characterized by comprising: document image input means; means for extracting connected components from the document image; means for classifying the connected components according to size information; means for generating, for each class obtained by the classification, a partial image containing only the connected components belonging to that class; layout analysis means for performing layout analysis on each partial image and extracting layout information; and means for synthesizing the layout information.

[0010]

[Operation] The connected components are classified according to size information, and a partial image containing only the connected components of each resulting class is generated, so that characters of different point sizes are distributed to separate partial images. Layout analysis by a conventional method is performed on each partial image. The layout information obtained from the layout analysis of each partial image is then synthesized to give the final layout analysis result. If the layout information from different partial images conflicts during synthesis, priority is given to the layout analysis result of the class whose partial image contains many connected components and whose connected components have a character-like size.

[0011]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings.

[0012] FIG. 1 is a block diagram showing this embodiment. A document image supplied by the image input means is binarized and converted into a scanned image signal. FIG. 3 shows an example of an image obtained by scanning an input document and binarizing it. In addition to the body-text characters that occupy most of the page, this input document contains large title characters, small ruby characters, and granular noise, giving four components of different sizes.

[0013] The connected component extraction means 11 extracts each black-pixel connected component and obtains circumscribed rectangle information 101 for the components. Here, black pixels correspond to the elements that make up the document, such as characters and figures, and white pixels correspond to the background. Next, when one of two circumscribed rectangles contains the other, or when their degree of overlap exceeds a certain level, the two rectangles are merged. The degree of overlap is expressed, for example, as the ratio of the area of the overlapping portion to the total area of the two overlapping rectangles. FIG. 4 shows the result of merging the circumscribed rectangles of the connected components, both by containment and by degree of overlap.
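The merging rule for circumscribed rectangles can be sketched as follows. This is a minimal illustration and not the patent's implementation: the rectangle representation `(x0, y0, x1, y1)`, the function names, and the threshold value 0.2 are assumptions, since the text only requires the overlap to exceed "a certain level".

```python
from typing import List, Tuple

# Hypothetical representation: (x0, y0, x1, y1) with x1 and y1 exclusive.
Rect = Tuple[int, int, int, int]


def area(r: Rect) -> int:
    """Area of a rectangle; degenerate rectangles count as 0."""
    return max(0, r[2] - r[0]) * max(0, r[3] - r[1])


def contains(a: Rect, b: Rect) -> bool:
    """True if rectangle a fully contains rectangle b."""
    return a[0] <= b[0] and a[1] <= b[1] and a[2] >= b[2] and a[3] >= b[3]


def overlap_degree(a: Rect, b: Rect) -> float:
    """Overlapping area divided by the sum of the two rectangle areas."""
    inter = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    total = area(a) + area(b)
    return area(inter) / total if total else 0.0


def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle enclosing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_rects(rects: List[Rect], threshold: float = 0.2) -> List[Rect]:
    """Repeatedly merge rectangle pairs that are nested or overlap strongly.

    The threshold is an assumed value chosen for this sketch.
    """
    rects = list(rects)
    merged = True
    while merged:
        merged = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                a, b = rects[i], rects[j]
                if contains(a, b) or contains(b, a) or overlap_degree(a, b) > threshold:
                    rects[i] = union(a, b)
                    del rects[j]
                    merged = True
                    break
            if merged:
                break
    return rects
```

With this definition the degree of overlap never exceeds 0.5, the value reached by two identical rectangles, so any threshold has to lie below that.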

[0014] The connected component classification means 12 classifies each connected component into a class according to the size of its circumscribed rectangle, as extracted and merged by the connected component extraction means 11, and outputs the class assignment of each connected component. Here, a class is a set of connected components of similar size. For example, four classes are conceivable: class C₃ for heading characters, class C₂ for body-text characters, class C₁ for ruby characters, character radicals (hen and tsukuri), and punctuation marks, and class C₀ for small noise. Possible classification criteria include the height, width, area, and contour length of the circumscribed rectangle; here, as one example, classification is based on rectangle area. Specifically, the area of each rectangle is computed, and the rectangles are assigned to classes C₀, C₁, C₂, C₃, ... by clustering, for example with the K-means method (for example, K = 4).
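As a rough illustration of the area-based clustering, the sketch below runs one-dimensional K-means on rectangle areas. The function name `classify_by_area`, the way the initial centers are chosen, and the sample areas are assumptions made for this example; the patent does not specify them.

```python
from typing import List, Sequence


def classify_by_area(areas: Sequence[float], k: int = 4, max_iter: int = 100) -> List[int]:
    """Cluster rectangle areas into k size classes with 1-D K-means.

    Returns one label per area; label 0 is the class with the smallest
    mean area (C0) and label k-1 the class with the largest.
    """
    distinct = sorted(set(areas))
    if k < 2 or len(distinct) < k:
        raise ValueError("need k >= 2 and at least k distinct areas")
    # Assumed initialization: k distinct values spread over the sorted areas.
    centers = [float(distinct[i * (len(distinct) - 1) // (k - 1)]) for i in range(k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: abs(a - centers[c])) for a in areas]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [a for a, lab in zip(areas, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = sum(members) / len(members)
    # Renumber clusters so that labels increase with the cluster center.
    order = sorted(range(k), key=lambda c: centers[c])
    rank = {c: r for r, c in enumerate(order)}
    return [rank[lab] for lab in labels]


# Made-up areas: noise, ruby-sized, body-sized, and title-sized rectangles.
areas = [4, 6, 5, 3, 100, 110, 90, 900, 950, 1000, 980, 920, 6400, 6200]
labels = classify_by_area(areas, k=4)
```

Clustering on raw area is what the text describes; in practice a different feature or a logarithmic scale might separate the classes more evenly, but that is outside what the source states.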

[0015] Based on this, the partial image generation means 13 assigns the connected components of each class to a separate partial image and generates the per-class partial images 103 to 104. For example, let P_i (i = 1, 2, 3) denote the partial image containing only the connected components belonging to class C_i. FIGS. 5(a) to 5(d) show the result of assigning connected components to the partial images according to the class of their circumscribed rectangles. In this example, partial image P₀ contains noise; partial image P₁ contains ruby characters and fragments of characters with internal separations that were not merged during rectangle merging and so remained separate; partial image P₂ contains the body-text characters; and partial image P₃ contains the title characters.
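Generating the per-class partial images might look like the following sketch. It assumes a binary image stored as a list of rows of 0/1 values and rectangles given as `(x0, y0, x1, y1)` with exclusive upper bounds; these representations and the function name `make_partial_images` are illustrative choices, not taken from the patent.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # assumed (x0, y0, x1, y1), x1 and y1 exclusive
Image = List[List[int]]           # assumed binary image: 1 = black, 0 = white


def make_partial_images(image: Image, rects: Sequence[Rect],
                        labels: Sequence[int], k: int) -> List[Image]:
    """Return k image planes; plane c holds the black pixels of class c.

    Simplification: pixels are copied by bounding rectangle, so a pixel of a
    different component lying inside the rectangle would be copied as well.
    """
    height, width = len(image), len(image[0])
    planes = [[[0] * width for _ in range(height)] for _ in range(k)]
    for (x0, y0, x1, y1), c in zip(rects, labels):
        for y in range(y0, y1):
            for x in range(x0, x1):
                if image[y][x]:
                    planes[c][y][x] = 1
    return planes


# Tiny made-up example: a small blob (class 0) and a larger ring (class 1).
image = [
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 1],
]
rects = [(0, 0, 2, 2), (3, 1, 6, 4)]
planes = make_partial_images(image, rects, labels=[0, 1], k=2)
```

Because every black pixel here belongs to exactly one rectangle, superimposing the planes reproduces the input image.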

[0016] Each partial image is processed by its corresponding layout analysis means 14 to 15. Because rectangles of nearly equal area have already been sorted into each partial image, the layout analysis of a partial image can use the methods described in the prior art section, such as that of Sun et al. or that of Tsujimoto et al. Here, as one example, the method proposed by Tsuji in 1985 in "Structural analysis of page images based on the split detection method" (Technical Research Report of the Institute of Electronics and Communication Engineers, Pattern Recognition and Learning, PRL85-17) is applied in common to the partial images of all classes. In this layout analysis method, projection patterns are computed in the horizontal and vertical directions within each partial image, lines are extracted taking the periodicity of the line arrangement into account, and the character pitch is estimated from the line width on the assumption that characters are roughly square. Using the estimated pitch, characters are cut out of each text line at roughly equal intervals, allowing for some variation. With such a conventional technique, characters and lines can be extracted from scattered characters by taking their positional relationships and the periodicity of their arrangement into account. FIGS. 6(a) to 6(d) show the text lines and characters extracted from the partial image of each class for the same example input image.
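The projection-based idea can be illustrated with a much simplified stand-in. The sketch below is not Tsuji's split detection method: it assumes horizontal text lines, finds lines as runs of non-empty rows, takes the line height as the character pitch (the square-character assumption), and cuts each line at equal intervals without modeling periodicity or pitch variation. All function names are hypothetical.

```python
from typing import List, Tuple

Plane = List[List[int]]           # assumed binary partial image: 1 = black
Box = Tuple[int, int, int, int]   # assumed (x0, y0, x1, y1), exclusive upper bounds


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of consecutive non-zero entries."""
    spans, start = [], None
    for i, v in enumerate(profile):
        if v and start is None:
            start = i
        elif not v and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def extract_lines(plane: Plane) -> List[Tuple[int, int]]:
    """Text lines as row ranges, from the projection onto the vertical axis."""
    return runs([sum(row) for row in plane])


def cut_characters(plane: Plane, y0: int, y1: int) -> List[Box]:
    """Cut one text line into equally spaced boxes.

    The pitch is taken to be the line height, on the assumption that
    characters are roughly square.
    """
    width = len(plane[0])
    col_profile = [sum(plane[y][x] for y in range(y0, y1)) for x in range(width)]
    spans = runs(col_profile)
    if not spans:
        return []
    x_start, x_end = spans[0][0], spans[-1][1]
    pitch = y1 - y0
    n = max(1, round((x_end - x_start) / pitch))
    step = (x_end - x_start) / n
    return [(round(x_start + i * step), y0, round(x_start + (i + 1) * step), y1)
            for i in range(n)]


# Made-up plane: one solid text line, 4 pixels high and 12 pixels wide.
plane = [[1 if 2 <= y < 6 and 1 <= x < 13 else 0 for x in range(15)] for y in range(8)]
lines = extract_lines(plane)
boxes = cut_characters(plane, *lines[0])
```

Running the same routine on each class plane separately is what lets a single pitch estimate fit every line in that plane.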

[0017] The layout information 105 to 106 of the partial images is synthesized by the layout information synthesis means 16, which outputs the final layout analysis result for the whole document. An embodiment of the layout information synthesis means 16 is described with reference to FIG. 2.

[0018] First, the connected component count histogram generation means 20 computes a connected component count histogram 200, which is a histogram of the number of rectangles belonging to each class, and stores the number of connected components in each class. FIG. 7 shows the connected component count histogram for the example input image of FIG. 3.

[0019] When the layout information selection means 22 synthesizes the layout information, simply superimposing the layout information of the partial images may leave a single connected component with conflicting assignments in several pieces of layout information. In that case, the layout information selection means 22 issues a priority request signal 202 to the priority determination means 21.

[0020] The layout information selection means 22 synthesizes the pieces of layout information. If it finds a conflicting portion of the layout information, it generates a priority request signal 202 and sends it to the priority determination means 21. It then selects the layout information of the higher-priority class according to the priority signal 201 returned by the priority determination means 21. This is repeated for every conflict in the layout information, and the layout information synthesis result 107 is output.

[0021] On receiving the priority request signal 202, the priority determination means 21 refers to the connected component count histogram 200, compares the frequencies of the classes involved in the conflicting portion of the layout information, and gives priority in layout information synthesis to the class with the highest frequency. However, no priority is given to a class whose connected components have small circumscribed rectangles, so that most of the connected components in its partial image are regarded as noise, or to a class whose connected components have large circumscribed rectangles, so that most of them are regarded as figures, tables, and the like. The resulting priority signal 201 is returned to the layout information selection means 22.
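The priority rule can be sketched as below. The function names, the use of mean rectangle area per class as the size measure, and the `min_area` and `max_area` limits are assumptions for illustration; the patent states no numeric thresholds for "noise" or "figures and tables".

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Sequence


def component_histogram(labels: Sequence[int]) -> Counter:
    """Count the connected components assigned to each class."""
    return Counter(labels)


def choose_priority(conflicting: Iterable[int], histogram: Counter,
                    mean_area: Dict[int, float],
                    min_area: float, max_area: float) -> Optional[int]:
    """Pick the class whose layout information wins a conflict.

    Classes whose typical rectangle is too small (assumed noise) or too
    large (assumed figures or tables) are never given priority. Among the
    rest, the class with the most connected components wins. Returns None
    if no conflicting class is eligible.
    """
    eligible = [c for c in conflicting if min_area <= mean_area[c] <= max_area]
    if not eligible:
        return None
    return max(eligible, key=lambda c: histogram[c])


# Made-up counts: many noise specks, few ruby, many body, few title components.
labels = [0] * 50 + [1] * 12 + [2] * 40 + [3] * 3
hist = component_histogram(labels)
mean_area = {0: 4.0, 1: 100.0, 2: 950.0, 3: 6300.0}

winner = choose_priority({1, 2}, hist, mean_area, min_area=20, max_area=5000)
```

In this made-up example class 0 has the most components, yet it is excluded by the size limit, which mirrors how the noise class is denied priority in the text.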

[0022] FIG. 8 shows the result of simply superimposing the layout information of the partial images of all classes. In this figure, the layout information conflicts at the horizontal strokes of the character "行", the right-hand part of "か", and the connected components of "能". Referring to the character-count frequencies of partial images P₁ and P₂, which are involved in these conflicts, the frequency for partial image P₂ is higher, so the layout analysis result of partial image P₂ is given priority at these locations during synthesis. The connected components belonging to partial image P₀ have small rectangles and are regarded as consisting largely of noise, so P₀ is given no priority. Consequently, during synthesis, its connected components are deleted except for those that overlap the layout information of other partial images. FIG. 9 shows the result of synthesizing the layout information: a correct layout analysis result is obtained even though the document mixes characters of different point sizes.

[0023] In this way, a layout analysis result for the entire document can be obtained regardless of character size.

[0024]

[Effect of the invention] For a document in which lines composed of characters of different point sizes lie close together, as in FIG. 10 for example, a correct analysis result is obtained by generating a partial image for each connected component size class and performing layout analysis on the partial image of each class.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a layout analysis system according to an embodiment of the present invention.

FIG. 2 is a block diagram of an embodiment of the layout information synthesis means in FIG. 1.

FIG. 3 is an example of an input document image.

FIG. 4 shows the circumscribed rectangles of the connected components extracted by the connected component extraction means when the document image of FIG. 2 is input.

FIG. 5 shows the partial images generated for the same input image by the partial image generation means, which distributes the connected components of each class to the corresponding partial image: (a) partial image P₀, (b) partial image P₁, (c) partial image P₂, (d) partial image P₃.

FIG. 6 shows the layout analysis result of each partial image for the same input image: (a) layout analysis result of partial image P₀, (b) layout analysis result of partial image P₁, (c) layout analysis result of partial image P₂, (d) layout analysis result of partial image P₃.

FIG. 7 is the connected component count histogram obtained by counting, for each class, the connected components assigned to that class by the connected component classification means when the document image of FIG. 2 is input.

FIG. 8 shows the result of superimposing, in the layout information synthesis means, the layout analysis results of the partial images of all classes for the same input image.

FIG. 9 shows the result, for the same input image, of integrating the superimposed layout information of FIG. 7 by giving precedence to the layout information of the higher-priority class.

FIG. 10 is a conceptual diagram of the effect of applying this method to an example document image that contains characters of different point sizes in the same line: (a) original image, (b) partial image of the class containing the large-point characters, (c) partial image of the class containing the small-point characters.

[Explanation of symbols]

11: connected component extraction means
12: connected component classification means
13: partial image generation means
14 to 15: layout analysis means
16: layout information synthesis means
20: connected component count histogram generation means
21: priority determination means
22: layout information selection means
101: circumscribed rectangle information
102: classification information of connected components into classes
103 to 104: image planes
105 to 106: layout information of image planes
107: layout information synthesis result

Claims

[Claims]

1. A document image layout analysis device comprising: connected component extraction means for receiving a document image as a binary image signal and extracting connected components; connected component classification means for classifying the connected components according to size information; partial image generation means for generating and storing, for each class obtained by the classification according to size information, a partial image containing only the connected components belonging to that class; layout analysis means for performing layout analysis on each of the partial images and extracting layout information; and layout information synthesis means for synthesizing the pieces of layout information.

2. The document image layout analysis device according to claim 1, wherein the layout information synthesis means comprises: connected component count histogram generation means for measuring, from the layout information output by each layout analysis means, the frequency distribution of connected components for each class and storing it as a connected component count histogram; priority determination means for determining, with reference to the connected component count histogram, the priority to be used in synthesizing the layout information when a conflict arises among the layout information output by the layout analysis means; and layout information selection means for synthesizing the layout information by giving precedence to layout information according to the priority and outputting the final layout information.