JPH083832B2

JPH083832B2 - Document image structure extraction method

Info

Publication number: JPH083832B2
Application number: JP61154184A
Authority: JP
Inventors: 公一江尻; 彰桜井
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1986-07-02
Filing date: 1986-07-02
Publication date: 1996-01-17
Anticipated expiration: 2011-01-17
Also published as: JPS6310282A

Description

[Detailed Description of the Invention] (Technical Field) The present invention relates to a method for recognizing and extracting regions of a document image, such as character lines, graphic portions, and text.

(Prior Art) A well-known conventional method for extracting the structure of a document image cuts out connected regions of non-white pixels, using either black pixels or coarse mesh cells as the unit, approximates each region by a rectangle, and then links neighboring rectangles (see, for example, Computer Vision 22-1, 1983.1.27).

With this method, however, any text or figure drawn inside a rectangular frame is hidden within that frame. In addition, a small black smudge lying between two regions causes them to be connected. For example, as shown in FIG. 2, suppose there are two pictures or diagrams drawn with solid lines. If a smudge such as B, or a single thin line, lies between the connected image regions A and C drawn with broken lines, then A-B-C is merged into a single graphic region.

(Object of the Invention) In a document image, structure is often expressed not only by the printed or handwritten black information but also by the white information, including margins. The present invention provides a document image structure extraction method that actively uses these white portions to extract image regions such as character lines, graphic portions, and text.

(Structure of the Invention) Vertical and horizontal line segments are removed from the target document image, the white portion is then divided into rectangles, and connected-component labeling is applied to the remaining non-white portions to extract text or graphic portions.

(Embodiment) FIG. 1 shows the processing steps of one embodiment of the present invention. First, in step 1, long line segment elements (vertical and horizontal components only) are extracted from the target document image and erased. (For a line segment extraction method see, for example, Proceedings 5B-4 of the 25th National Convention of the Information Processing Society of Japan, second half of 1982.)
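As a rough illustration of step 1, the sketch below erases long horizontal and vertical black runs from a binary image. It uses a plain run-length filter, which is an assumption made here for illustration and not the extraction method of the cited proceedings. The function name `remove_long_lines`, the parameter `min_len`, and the convention 1 = black, 0 = white are likewise hypothetical.

```python
def remove_long_lines(img, min_len):
    """Return a copy of a binary image (1 = black, 0 = white) in which
    every horizontal or vertical black run of at least min_len pixels
    has been erased. Assumed run-length filter, for illustration only."""
    h, w = len(img), len(img[0])
    erase = [[False] * w for _ in range(h)]

    # Mark long horizontal runs, row by row.
    for y in range(h):
        x = 0
        while x < w:
            if img[y][x]:
                start = x
                while x < w and img[y][x]:
                    x += 1
                if x - start >= min_len:
                    for i in range(start, x):
                        erase[y][i] = True
            else:
                x += 1

    # Mark long vertical runs, column by column.
    for x in range(w):
        y = 0
        while y < h:
            if img[y][x]:
                start = y
                while y < h and img[y][x]:
                    y += 1
                if y - start >= min_len:
                    for i in range(start, y):
                        erase[i][x] = True
            else:
                y += 1

    return [[0 if erase[y][x] else img[y][x] for x in range(w)]
            for y in range(h)]
```

Runs are measured on the original image in both passes, so a pixel at the crossing of a long horizontal and a long vertical line is erased regardless of scan order. Short runs, such as character strokes, are left untouched as long as `min_len` exceeds the expected stroke length.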

Next, in step 2, the image is divided into meshes. A well-known approach is the quad-tree method, which, as shown in FIG. 4, represents the image by successive reduction: layer 0 (the original image), layer 1 (each group of four pixels of the original replaced by one pixel), layer 2 (each group of four pixels of layer 1 replaced by one pixel), and so on. Many reduction rules are possible; here, the sum T = a + b + c + d of the four pixels (a, b, c, d) is computed, and the reduced pixel is set to black when T is at or above a given threshold and to white (0) otherwise.
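The reduction rule of step 2 can be sketched as follows. Each 2x2 block (a, b, c, d) of one layer becomes a single pixel of the next layer, black when T = a + b + c + d reaches the threshold. The names `reduce_layer` and `build_pyramid`, the default threshold of 1, and the handling of odd image sizes (a trailing row or column is dropped) are assumptions made for this sketch.

```python
def reduce_layer(img, threshold=1):
    """Replace each 2x2 block of a binary image (1 = black) by one pixel:
    black if the block sum T = a + b + c + d >= threshold, else white (0).
    An odd trailing row or column is dropped (assumption of this sketch)."""
    h, w = len(img) // 2, len(img[0]) // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            t = (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
                 + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1])
            row.append(1 if t >= threshold else 0)
        out.append(row)
    return out


def build_pyramid(img, levels, threshold=1):
    """Return [layer 0, layer 1, ..., layer `levels`], each layer being the
    2x2 reduction of the previous one."""
    layers = [img]
    for _ in range(levels):
        layers.append(reduce_layer(layers[-1], threshold))
    return layers
```

A low threshold keeps thin strokes visible in the coarse layers, while a higher threshold suppresses isolated pixels; which is preferable depends on the scan quality.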

In step 3, the white region is divided into rectangles. The division may follow the method of Aoki, "Rectangular Region Coding for Image Data Compression," Pattern Recognition, Vol. 11, pp. 297-312. FIG. 3 shows an example of such a division.
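To make the idea of step 3 concrete, the sketch below partitions the white pixels of a small binary image into non-overlapping rectangles. It is a simple greedy raster-scan decomposition chosen for illustration, not the coding method of the cited Pattern Recognition paper. The name `partition_white` and the (top, left, height, width) tuple layout are assumptions.

```python
def partition_white(img):
    """Greedily split the white pixels (0) of a binary image into
    non-overlapping rectangles, returned as (top, left, height, width).
    Illustrative greedy scheme only, not the cited coding method."""
    h, w = len(img), len(img[0])
    covered = [[False] * w for _ in range(h)]
    rects = []

    def free(y, x):
        # A pixel is available if it is white and not yet in a rectangle.
        return img[y][x] == 0 and not covered[y][x]

    for y in range(h):
        for x in range(w):
            if not free(y, x):
                continue
            # Grow to the right along the current row.
            x2 = x
            while x2 < w and free(y, x2):
                x2 += 1
            # Grow downward while the whole row span stays available.
            y2 = y + 1
            while y2 < h and all(free(y2, i) for i in range(x, x2)):
                y2 += 1
            for yy in range(y, y2):
                for xx in range(x, x2):
                    covered[yy][xx] = True
            rects.append((y, x, y2 - y, x2 - x))
    return rects
```

Every white pixel ends up in exactly one rectangle, so the rectangle areas sum to the number of white pixels. The greedy choice does not minimize the number of rectangles.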

Next, in step 4, labeling is performed on the non-white rectangular regions of FIG. 3(b). This cuts out the character lines and graphic portions as rectangles. A region that is large in both height and width can be regarded as a figure, and the remaining regions as characters or text. That is:

◎ A horizontally long non-white region is likely to be a character line.

◎ A large non-white region is likely to be a graph or picture.
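The labeling and the size-based distinction of step 4 might look like the following sketch. It labels 4-connected non-white regions by breadth-first search, records each region's bounding box, and calls a region a figure when both its height and width reach a cutoff. The choice of 4-connectivity, the names `label_regions` and `classify`, and the cutoff `big` are assumptions; the patent gives no numeric criteria.

```python
from collections import deque


def label_regions(img):
    """Return bounding boxes (top, left, height, width) of the 4-connected
    non-white regions of a binary image (1 = non-white)."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first search over one connected region.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and img[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom - top + 1, right - left + 1))
    return boxes


def classify(box, big=3):
    """Hypothetical rule: large in both directions -> figure, else text."""
    _, _, height, width = box
    return "figure" if height >= big and width >= big else "text"
```

Running the labeling on a coarse layer rather than on the original pixels keeps the number of regions small, since the characters of one line tend to fuse into a single horizontally long block.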

Finally, in step 5, the non-white regions are merged and the image regions of the document image are extracted.

The method is even more effective in the following case. A narrow non-white region with wide white regions on both sides, such as region B in FIG. 2, is likely to be unwanted noise. A non-white region such as D in FIG. 5 can therefore be erased, and the two white regions E and F merged.
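A one-dimensional sketch of this noise rule is given below. It works on a profile of 0/1 values, for example one row of a coarse mesh layer, and erases any narrow non-white run that has a wide white run on each side, which in effect merges the two white runs. The name `erase_narrow_runs` and the limits `max_noise` and `min_white` are hypothetical; the patent does not specify how narrow or how wide the regions must be.

```python
from itertools import groupby


def erase_narrow_runs(profile, max_noise=1, min_white=3):
    """Erase non-white runs (1) no longer than max_noise that are flanked
    on both sides by white runs (0) of at least min_white cells.
    `profile` must contain only 0 and 1."""
    # Run-length encode the profile; consecutive runs alternate in value.
    runs = [(value, len(list(group))) for value, group in groupby(profile)]
    out = []
    for i, (value, length) in enumerate(runs):
        flanked = (0 < i < len(runs) - 1
                   and runs[i - 1][1] >= min_white
                   and runs[i + 1][1] >= min_white)
        if value == 1 and length <= max_noise and flanked:
            value = 0  # treat the narrow run as noise
        out.extend([value] * length)
    return out
```

A run touching the border of the profile is never erased, because it has a white neighbor on one side only.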

(Effects of the Invention) As described above, according to the present invention, non-white regions can be cut out as large blocks, so the structure of a document image can be extracted quickly and stably.

[Brief description of drawings]

FIG. 1 is a diagram showing the processing steps of one embodiment of the present invention; FIG. 2 is a diagram showing an example of a document image containing two pictures or diagrams; FIG. 3 is a diagram showing an example of the rectangular division of a white region; FIG. 4 is a diagram showing an example of an image mesh division method; and FIG. 5 is a diagram showing the processing applied when a non-white region lies between two adjacent white regions.

A, C: connected image regions; B, D: narrow non-white regions; E, F: white regions.

Claims

1. A method for extracting the structure of a document image, comprising: a process of removing vertical line segments and horizontal line segments from a target document image; a process of dividing the white portion into rectangles; and a process of performing connected-component labeling on the non-white portion, wherein the labeled portions are extracted as text or graphic portions.

2. The method for extracting the structure of a document image according to claim 1, wherein, in the process of dividing the white portion into rectangles, when a plurality of white regions are adjacent to each other across a narrow gap, the plurality of white regions are regarded as one region.