JP4159071B2

JP4159071B2 - Image processing method, image processing apparatus, and computer-readable recording medium storing program for realizing the processing method

Info

Publication number: JP4159071B2
Application number: JP2000058174A
Authority: JP
Inventors: 史裕長谷川
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2000-03-03
Filing date: 2000-03-03
Publication date: 2008-10-01
Anticipated expiration: 2020-03-03
Also published as: JP2001250084A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像処理方法，画像処理装置および該処理方法を実現させるためのプログラムを記録したコンピュータ読み取り可能な記録媒体、より詳細には、紙面に記入された文字を光学的に認識する方法，装置に関し、帳票画像から、記入された文字を精度良く抽出することを目的とする画像処理方法および装置に関する。
【０００２】
【従来の技術】
従来、帳票画像から、記入された文字部分を抽出する処理は、完全な定型文書が対象で、形状も、全体としての文字や記入枠の位置も、全く同一の帳票に文字が書かれているものを想定していた。従って、処理対象帳票と同一フォーマットを持つ画像上で文字記入位置をあらかじめ座標で指定しておき、処理対象画像が入力されたら、同一フォーマット画像との位置合わせを行ない、処理対象画像中における文字記入位置を推定して、文字を抽出するという方法で行なっていた。上述のような位置合わせに対して、例えば、特開平１０−９１７８３号公報に開示されたものは、画像から線分の交叉点を抽出し、それを目印に画像の位置合わせを行うもので、正確な文字抽出のために有効な手段であった。
【０００３】
【発明が解決しようとする課題】
しかしながら、従来流通している帳票には、完全に定型ではなく、おおまかには定型であるが微妙に形状の異なる帳票が多く存在する。図１を用いてその例を説明する。
図１は、従来流通している帳票の一例を説明するための図で、図１（Ａ），図１（Ｂ）に示した例のように、一見すると、両者は同じ帳票に見える。しかし、実際は、以下の点で異なっている。
（１）住所，氏名欄に用いられる線種（実線／点線）が違う。
（２）人数欄の性別と大人／子供の欄が、縦横逆になっている。
（３）人数欄の枠の大きさが違う。
【０００４】
図１に示した例のように、枠や印刷文字の位置や大きさが画像内で変化すると、前述の手法により、一方の画像をもとに、その文字記入位置の座標値を指定しても、また、他方の帳票に文字が書かれても、それらの文字を正確に抽出することができなかった。
【０００５】
本発明は、上述のような実情を考慮してなされたもので、微妙に形状の異なる帳票からでも文字を抽出することが可能な画像処理方法，画像処理装置および該処理方法を実現させるためのプログラムを記録したコンピュータ読み取り可能な記録媒体を提供することを目的としてなされたものである。
【０００６】
【課題を解決するための手段】
上記目的を達成するために、請求項１の発明は、帳票画像中の文字を抽出する画像処理方法であって、帳票画像を入力する画像入力工程と、該帳票画像中に存在するキーワードを入力するキーワード入力工程と、前記帳票画像中の罫線で囲まれた枠を抽出する枠抽出工程と、該枠内の文字を認識する文字認識工程と、該認識結果の文字と前記キーワードとを比較するキーワード照合工程と、比較の結果前記キーワードと同一の文字が存在する枠の所属枠を探索して抽出する所属枠抽出工程とを備え、前記枠抽出工程は、所定の長さ以上の黒ランを抽出する黒ラン抽出工程と、該抽出された黒ラン同士が所定の距離よりも近い場合は該黒ランの間を黒画素で埋める画素補完工程と、該黒画素が埋め込まれた帳票画像から白画素の連結成分の外接矩形を抽出する外接矩形抽出工程と、該抽出によって得られた外接矩形同士が重なっている場合はその重なり具合に応じて外接矩形同士を統合する矩形統合工程と、該統合された外接矩形の大きさを所定の大きさと比較し、所定の大きさよりも小さいものを除外する矩形サイズ吟味工程と、該吟味に基づいて得られた外接矩形の座標値をもって枠の座標値とする枠位置決定工程とを備えたことを特徴とする。
【０００７】
請求項２の発明は、帳票画像中の文字を抽出する画像処理装置であって、帳票画像を入力する画像入力手段と、該帳票画像中に存在するキーワードを入力するキーワード入力手段と、前記帳票画像中の罫線で囲まれた枠を抽出する枠抽出手段と、該枠内の文字を認識する文字認識手段と、該認識結果の文字と前記キーワードとを比較するキーワード照合手段と、比較の結果前記キーワードと同一の文字が存在する枠の所属枠を探索して抽出する所属枠抽出手段とを備え、
前記枠抽出手段は、所定の長さ以上の黒ランを抽出する黒ラン抽出手段と、該抽出された黒ラン同士が所定の距離よりも近い場合は該黒ランの間を黒画素で埋める画素補完手段と、該黒画素が埋め込まれた帳票画像から白画素の連結成分の外接矩形を抽出する外接矩形抽出手段と、該抽出によって得られた外接矩形同士が重なっている場合はその重なり具合に応じて外接矩形同士を統合する矩形統合手段と、該統合された外接矩形の大きさを所定の大きさと比較し、所定の大きさよりも小さいものを除外する矩形サイズ吟味手段と、該吟味に基づいて得られた外接矩形の座標値をもって枠の座標値とする枠位置決定手段とを備えたことを特徴とする。
【０００８】
請求項３の発明は、帳票画像中の文字を抽出する画像処理方法を実行するための各機能を、コンピュータで実現させるためのプログラムを記録したコンピュータ読み取り可能な記録媒体であって、前記各機能は、帳票画像を入力する画像入力機能と、該帳票画像中に存在するキーワードを入力するキーワード入力機能と、前記帳票画像中の罫線で囲まれた枠を抽出する枠抽出機能と、該枠内の文字を認識する文字認識機能と、該認識結果の文字と前記キーワードとを比較するキーワード照合機能と、比較の結果前記キーワードと同一の文字が存在する枠の所属枠を探索して抽出する所属枠抽出機能であって、かつ前記枠抽出機能は、所定の長さ以上の黒ランを抽出する黒ラン抽出機能と、該抽出された黒ラン同士が所定の距離よりも近い場合は該黒ランの間を黒画素で埋める画素補完機能と、該黒画素が埋め込まれた帳票画像から白画素の連結成分の外接矩形を抽出する外接矩形抽出機能と、該抽出によって得られた外接矩形同士が重なっている場合はその重なり具合に応じて外接矩形同士を統合する矩形統合機能と、該統合された外接矩形の大きさを所定の大きさと比較し、所定の大きさよりも小さいものを除外する矩形サイズ吟味機能と、該吟味に基づいて得られた外接矩形の座標値をもって枠の座標値とする枠位置決定機能であることを特徴とする。
【００１５】
【発明の実施の形態】
（実施例１）
図１（Ａ），図１（Ｂ）に示した例では、完全な定型帳票ではないが、項目名は同一である。例えば、氏名は、「氏名」と印刷された枠の隣に記入されることになっているし、男性は、縦横の違いこそあれ、「男性」と印刷された枠に隣り合った枠に記入されることについては、両者の間に差はない。そこで、まず、「氏名」，「男性」などの項目名を見付け出し、これらをキーワードとしてそこに隣接する枠を探して、そこを記入枠とする方法を取ることにする。
【００１６】
図２は、本発明による画像処理方法，画像処理装置および該処理方法を実現させるためのプログラムを記録したコンピュータ読み取り可能な記録媒体の一実施例を説明するための構成図で、図中、１は画像入力手段、２は画像格納手段、３はキーワード入力手段、４はキーワード格納手段、５は枠抽出手段、６は枠情報格納手段、７は文字認識手段、８はキーワード照合手段、９は所属枠グループ化手段である。
図３は、図２に示した実施例の動作の一実施例を説明するためのフローチャートである。
【００１７】
（１）スキャナ等の画像入力手段１で処理対象となる画像を入力し、画像格納手段２に格納する（ステップＳ１）。
（２）キーワードをキーワード入力手段３で入力し、キーワード格納手段４に格納する（ステップＳ２）。このキーワードは、文字記入枠の属性を表すもので、図１に示した例では、「氏名」や「男性」等に相当するもので、抽出したい属性をあらかじめ人手で指定しておく。
（３）枠の抽出を枠抽出手段５で行い、枠情報格納手段６にその情報を蓄えておく（ステップＳ３）。枠の抽出法には、罫線を抽出してその情報を利用する方法が考えられるが、実施例１では、以下のように、白画素連結成分から求める。
【００１８】
図４は、図２に示した枠抽出手段の一実施例を説明するための構成図で、図中、１１は黒ラン抽出手段、１２は黒ラン間画素補完手段、１３は補完画像情報格納手段、１４は白画素連結成分外接矩形抽出手段、１５は矩形情報格納手段、１６は矩形統合手段、１７は矩形サイズ吟味手段、１８は枠位置決定手段である。
図５は、図４に示した実施例の動作の一実施例を説明するためのフローチャートである。
図６，図７は、図５に示した実施例の具体例を示した図である。
【００１９】
（ａ）黒ラン抽出手段１１により、あらかじめ決めておいた値より長い黒ランを抽出する（ステップＳ３−１）。
（ｂ）ラン同士の距離が接近しているものを探し、黒ラン間画素補完手段１２によって画像上でランの間に黒画素を埋めこむ（ステップＳ３−２）。この処理は、あとで行う白連結成分抽出による枠位置決定時に、罫線にかすれがあると枠内の連結成分が枠の外とつながってしまい、枠を正しく抽出できないことから行う。また、補完対象を長いランに限定する理由は、網掛け領域などで細かいノイズが多数存在する領域において、ノイズ同士の間を統合して偽の罫線を生成しないようにするためである（図６を参照）。黒画素が埋め込まれた画像情報は、補完画像情報格納手段１３に格納する。
【００２０】
（ｃ）白画素連結成分の外接矩形（以下、白矩形とも呼ぶ）を白画素連結成分外接矩形抽出手段１４によって抽出し、矩形情報格納手段１５に格納する（ステップＳ３−３）。枠を構成する黒画素に囲まれた領域は、ここで外接矩形として抽出される。この矩形の位置が枠候補となる。
（ｄ）この枠候補で、ある白矩形同士で重なっているものを選び出し、矩形統合手段１６で統合する（ステップＳ３−４）。この処理は、図７に示したように、枠が文字で分断されている場合に、別々の枠として抽出しないように、白矩形同士がある程度以上重なっている場合は、両者を統合してひとつの矩形として統合してしまう処理である。
（ｅ）最後に、矩形サイズ吟味手段１７により、矩形の大きさが小さすぎるものは文字内部の可能性が高いので除外し（ステップＳ３−５）、枠位置決定手段１８により、白矩形の位置を枠の位置として決定する（ステップＳ３−５）。
【００２１】
（４）以上により、枠が抽出できたので、次に、文字認識手段７により、枠内の文字認識処理を行う（ステップＳ４）。
（５）枠の位置がわかっているので、この内部のみに文字認識処理を施す。次に、得られた認識結果と登録しておいたキーワードとをキーワード照合手段８によって比較し、同一のものを探す（ステップＳ５）。キーワードと同一の認識結果を与える画像領域（枠領域）がキーワードの存在する枠ということになる。これで、キーワード（と同定された文字）の存在するキーワード枠が抽出できたことになる。
【００２２】
（６）キーワード枠に属する（関連する）文字記入枠（以下、所属枠と呼ぶ）を探索する（ステップＳ６）。例えば、図１に示した例において、「大人」という項目（キーワード）の存在する枠に属する文字記入枠は２つあるので、その両方を抽出して「大人」に属する文字記入枠グループとする。また、「氏名」という項目に属する文字記入枠はひとつだけなので、この枠ひとつを抽出することになる。キーワードの存在する枠の右，下に接する枠を次々に探していき、大きさの異なる枠に接するまで、同一の属性を持つ枠としてグループ化するという方針で行う。処理手順は以下の通りである。
【００２３】
図８は、図２に示した所属枠グループ化手段の一実施例を説明するための図で、図中、２１は基準枠設定手段、２２は隣接枠検索手段、２３は枠形状検証手段、２４は枠内キーワード検証手段、２５は所属枠候補情報格納手段、２６は所属枠候補層数計数手段、２７は所属枠情報格納手段で、その他、図２に示した実施例と同じ作用をする部分には、図２に示した実施例と同じ符号が付してある。
図９は、図８に示した実施例の動作の一実施例を説明するためのフローチャートである。
図１０，図１１は、図９に示した実施例の具体例を示した図である。
【００２４】
（ａ）基準枠設定手段２１により、キーワードが存在する枠を初期基準枠に定める（ステップＳ６−１）。
（ｂ）隣接枠探索手段２２により、基準枠の右側に接する枠を探す（ステップＳ６−２）。具体的には、基準枠の右端座標と隣接枠の左端座標とが近いものを接していると見なす。
（ｃ）接する枠がない場合は（ステップＳ６−３のＮＯ）、基準枠の右側探索を終了し、下側探索を行う（ステップＳ６−１０）。
【００２５】
（ｄ）枠形状検証手段２３で枠の大きさや位置の検証を行う（ステップＳ６−４）。枠の上下端が基準枠のそれよりも内側に入っており、高さ差が少ない場合に、枠の形状が基準に適合していると判断し、そうでない場合は（ステップＳ６−４のＮＯ）、基準枠の右側探索を終了して下側探索を行う（ステップＳ６−１０）。
（ｅ）枠内キーワード検証手段２４で、隣接する探索対象枠がキーワード枠であるか否かを吟味し（ステップＳ６−５）、キーワード枠でないなら（ステップＳ６−５のＮＯ）、枠形状検証手段２３を用いて基準枠と隣接する探索対象枠との幅の差を吟味し（ステップＳ６−６）、差が大きい場合には、この隣接する探索対象枠はグループ化の対象外と判定し（ステップＳ６−６のＮＯ）、右側探索を終えて下側探索に移る（ステップＳ６−１０）。
（ｆ）枠内キーワード検証手段２４で、隣接する探索対象枠が他のキーワード枠であるかどうかを吟味し（ステップＳ６−７）、キーワード枠なら（ステップＳ６−７のＹＥＳ）、グループ化の対象外と判定し、右側探索を終えて下側探索に移る（ステップＳ６−１０）。
【００２６】
（ｇ）以上の条件をクリアした（ステップＳ６−７のＮＯ）隣接する探索対象枠は、グループ化され、当該キーワードの所属枠候補と判断されるので、所属枠候補情報格納手段２５に記録しておく（ステップＳ６−８）。
（ｈ）基準枠設定手段２１により、グループ化された枠を基準枠に設定し直し（ステップＳ６−９）、さらに右隣の枠との接続関係を調べる。このように、新たにグループ化される枠が見つからなくなるまで処理を繰り返す。
（ｉ）続いて、キーワード枠の下の枠にもグループ化を行う（ステップＳ６−１０）。手順は右側を探す場合と同様である。
【００２７】
（ｊ）右側にも下側にもグループ化ができたら、どちらか一方だけを所属枠として登録する。どちらを選ぶかの基準は、所属枠候補層数計数手段２６を用い、図１１に示したように、縦横の各グループ内の所属枠の層数を数え（ステップＳ６−１１）、多い方を選択し、所属枠情報格納手段２７に格納する（ステップＳ６−１２）。図１１に示した例では、縦方向の層数が多いので、グループは縦方向（下側）のものを採用する。なお、同数の場合は横方向を採用する。
（ｋ）全キーワードを吟味し終えたら（ステップＳ６−１３のＹＥＳ）、結果を出力し（ステップＳ６−１４）、処理を終了する。
以上の処理により、各キーワードに属する文字記入領域が特定できるので、その位置にある画像が文字画像ということになり、文字が抽出できたことになる。
【００２８】
（実施例２）
図１２は、本発明による画像処理方法を実現させるためのプログラムを記録したコンピュータ読み取り可能な記録媒体の一実施例を説明するための要部構成図で、本発明をソフトウェアによって実現する場合の実施例である。
ＣＰＵ３１、メモリ３２、ハードディスク３３、入力装置３４、ＣＤ−ＲＯＭドライブ３５、ディスプレイ３６、マウスなどからなる汎用の処理装置を用意する。ＣＤ−ＲＯＭなどの記録媒体３７には、本発明の画像処理の処理機能や処理手順を実現させるためのプログラムが記録されている。また、処理対象の原稿画像は、例えば、ハードディスク３３などに格納されている。ＣＰＵ３１は、記録媒体３７から上記した処理機能，手順を実現するプログラムを読み出し実行し、画像処理の結果をディスプレイ３６などに出力する。
【００２９】
【発明の効果】
以上の説明から明らかなように、本発明によれば、完全に定型でない帳票からも文字抽出が行え、罫線が多少かすれていても、正しく、文字抽出を行うことができる。
【図面の簡単な説明】
【図１】従来流通している帳票の一例を説明するための図である。
【図２】本発明による画像処理方法，画像処理装置および該処理方法を実現させるためのプログラムを記録したコンピュータ読み取り可能な記録媒体の一実施例を説明するための要部構成図である。
【図３】図２に示した実施例の動作の一実施例を説明するためのフローチャートである。
【図４】図２に示した枠抽出手段の一実施例を説明するための構成図である。
【図５】図４に示した実施例の動作の一実施例を説明するためのフローチャートである。
【図６】図５に示した実施例の具体例を示した図である。
【図７】図５に示した実施例の具体例を示した図である。
【図８】図２に示した所属枠グループ化手段の一実施例を説明するための図である。
【図９】図８に示した実施例の動作の一実施例を説明するためのフローチャートである。
【図１０】図９に示した実施例の具体例を示した図である。
【図１１】図９に示した実施例の具体例を示した図である。
【図１２】本発明による画像処理方法を実現させるためのプログラムを記録したコンピュータ読み取り可能な記録媒体の一実施例を説明するための要部構成図である。
【符号の説明】
１…画像入力手段、２…画像格納手段、３…キーワード入力手段、４…キーワード格納手段、５…枠抽出手段、６…枠情報格納手段、７…文字認識手段、８…キーワード照合手段、９…所属枠グループ化手段、１１…黒ラン抽出手段、１２…黒ラン間画素補完手段、１３…補完画像情報格納手段、１４…白画素連結成分外接矩形抽出手段、１５…矩形情報格納手段、１６…矩形統合手段、１７…矩形サイズ吟味手段、１８…枠位置決定手段、２１…基準枠設定手段、２２…隣接枠検索手段、２３…枠形状検証手段、２４…枠内キーワード検証手段、２５…所属枠候補情報格納手段、２６…所属枠候補層数計数手段、２７…所属枠情報格納手段、３１…ＣＰＵ、３２…メモリ、３３…ハードディスク、３４…入力装置、３５…ＣＤ−ＲＯＭドライブ、３６…ディスプレイ、３７…記録媒体。
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an image processing method, an image processing apparatus, and a computer-readable recording medium on which a program for realizing the processing method is recorded. More specifically, it relates to a method and apparatus for optically recognizing characters written on paper, and to an image processing method and apparatus whose purpose is to extract written characters from a form image with high accuracy.
[0002]
[Prior art]
Conventionally, processing that extracts the written character portions from a form image targeted completely fixed-format documents: it assumed that characters were written on forms that were exactly identical in shape and in the overall positions of printed characters and entry frames. Accordingly, character entry positions were designated in advance by coordinates on an image having the same format as the form to be processed. When an image to be processed was input, it was aligned with that same-format image, the character entry positions in the image to be processed were estimated, and the characters were extracted. For this kind of alignment, for example, the technique disclosed in Japanese Patent Application Laid-Open No. 10-91783 extracts the intersection points of line segments from an image and aligns images using those points as landmarks, and it was an effective means for accurate character extraction.
[0003]
[Problems to be solved by the invention]
However, many of the forms in circulation are not completely fixed-format; they follow roughly the same format but differ slightly in shape. An example is described with reference to FIG. 1.
FIG. 1 is a diagram for explaining an example of forms in circulation. At first glance, the two forms shown in FIGS. 1(A) and 1(B) look the same. In fact, they differ in the following points.
(1) The line type (solid line / dotted line) used in the address and name fields is different.
(2) In the number-of-people field, the gender columns and the adult / child columns are transposed (rows and columns are swapped).
(3) The size of the frame in the number of people column is different.
[0004]
When the positions or sizes of frames and printed characters vary between images, as in the example shown in FIG. 1, the above-described method fails: even if the coordinate values of the character entry positions are designated on the basis of one image, characters written on the other form cannot be extracted accurately.
[0005]
The present invention has been made in view of the above circumstances, and its object is to provide an image processing method, an image processing apparatus, and a computer-readable recording medium on which a program for realizing the processing method is recorded, each capable of extracting characters even from forms whose shapes differ slightly.
[0006]
[Means for Solving the Problems]
In order to achieve the above object, the invention of claim 1 is an image processing method for extracting characters in a form image, comprising: an image input step of inputting a form image; a keyword input step of inputting a keyword existing in the form image; a frame extraction step of extracting frames surrounded by ruled lines in the form image; a character recognition step of recognizing characters in the frames; a keyword matching step of comparing the recognized characters with the keyword; and an affiliation frame extraction step of searching for and extracting the affiliation frames of a frame in which, as a result of the comparison, the same characters as the keyword exist, wherein the frame extraction step comprises: a black run extraction step of extracting black runs of at least a predetermined length; a pixel complementing step of filling the space between extracted black runs with black pixels when the runs are closer to each other than a predetermined distance; a circumscribed rectangle extraction step of extracting circumscribed rectangles of connected components of white pixels from the form image in which the black pixels have been filled in; a rectangle integration step of, when circumscribed rectangles obtained by the extraction overlap, integrating them according to the degree of overlap; a rectangle size examination step of comparing the size of each integrated circumscribed rectangle with a predetermined size and excluding those smaller than the predetermined size; and a frame position determination step of taking the coordinate values of the circumscribed rectangles obtained on the basis of the examination as the coordinate values of the frames.
[0007]
The invention of claim 2 is an image processing apparatus for extracting characters in a form image, comprising: image input means for inputting a form image; keyword input means for inputting a keyword existing in the form image; frame extraction means for extracting frames surrounded by ruled lines in the form image; character recognition means for recognizing characters in the frames; keyword matching means for comparing the recognized characters with the keyword; and affiliation frame extraction means for searching for and extracting the affiliation frames of a frame in which, as a result of the comparison, the same characters as the keyword exist,
wherein the frame extraction means comprises: black run extraction means for extracting black runs of at least a predetermined length; pixel complementing means for filling the space between extracted black runs with black pixels when the runs are closer to each other than a predetermined distance; circumscribed rectangle extraction means for extracting circumscribed rectangles of connected components of white pixels from the form image in which the black pixels have been filled in; rectangle integration means for, when circumscribed rectangles obtained by the extraction overlap, integrating them according to the degree of overlap; rectangle size examination means for comparing the size of each integrated circumscribed rectangle with a predetermined size and excluding those smaller than the predetermined size; and frame position determination means for taking the coordinate values of the circumscribed rectangles obtained on the basis of the examination as the coordinate values of the frames.
[0008]
The invention of claim 3 is a computer-readable recording medium on which is recorded a program for causing a computer to realize the functions for executing an image processing method that extracts characters in a form image, the functions being: an image input function of inputting a form image; a keyword input function of inputting a keyword existing in the form image; a frame extraction function of extracting frames surrounded by ruled lines in the form image; a character recognition function of recognizing characters in the frames; a keyword matching function of comparing the recognized characters with the keyword; and an affiliation frame extraction function of searching for and extracting the affiliation frames of a frame in which, as a result of the comparison, the same characters as the keyword exist, wherein the frame extraction function consists of: a black run extraction function of extracting black runs of at least a predetermined length; a pixel complementing function of filling the space between extracted black runs with black pixels when the runs are closer to each other than a predetermined distance; a circumscribed rectangle extraction function of extracting circumscribed rectangles of connected components of white pixels from the form image in which the black pixels have been filled in; a rectangle integration function of, when circumscribed rectangles obtained by the extraction overlap, integrating them according to the degree of overlap; a rectangle size examination function of comparing the size of each integrated circumscribed rectangle with a predetermined size and excluding those smaller than the predetermined size; and a frame position determination function of taking the coordinate values of the circumscribed rectangles obtained on the basis of the examination as the coordinate values of the frames.
[0015]
DETAILED DESCRIPTION OF THE INVENTION
(Example 1)
In the example shown in FIGS. 1(A) and 1(B), the forms are not completely fixed-format, but their item names are the same. For example, the name is to be written in the frame next to the frame printed with "Name", and the entry for males is to be written in a frame adjacent to the frame printed with "Male", whether that adjacency is vertical or horizontal; in this respect the two forms do not differ. The approach taken is therefore to first find item names such as "Name" and "Male", then use them as keywords to search for the adjacent frames and treat those as the entry frames.
[0016]
FIG. 2 is a configuration diagram for explaining an embodiment of the image processing method, the image processing apparatus, and the computer-readable recording medium on which a program for realizing the processing method is recorded according to the present invention. In the figure, 1 is image input means, 2 is image storage means, 3 is keyword input means, 4 is keyword storage means, 5 is frame extraction means, 6 is frame information storage means, 7 is character recognition means, 8 is keyword matching means, and 9 is affiliation frame grouping means.
FIG. 3 is a flowchart for explaining an example of the operation of the embodiment shown in FIG. 2.
[0017]
(1) An image to be processed is input by the image input means 1 such as a scanner and stored in the image storage means 2 (step S1).
(2) A keyword is input by the keyword input means 3 and stored in the keyword storage means 4 (step S2). This keyword represents the attribute of the character entry box. In the example shown in FIG. 1, this keyword corresponds to “name”, “male”, etc., and the attribute to be extracted is designated manually in advance.
(3) The frame extraction unit 5 performs frame extraction, and stores the information in the frame information storage unit 6 (step S3). As a frame extraction method, a method of extracting ruled lines and using the information can be considered, but in the first embodiment, it is obtained from white pixel connected components as follows.
[0018]
FIG. 4 is a configuration diagram for explaining an embodiment of the frame extraction means shown in FIG. 2. In the figure, 11 is black run extraction means, 12 is inter-black-run pixel complementing means, 13 is complemented image information storage means, 14 is white pixel connected component circumscribed rectangle extraction means, 15 is rectangle information storage means, 16 is rectangle integration means, 17 is rectangle size examination means, and 18 is frame position determination means.
FIG. 5 is a flowchart for explaining an example of the operation of the embodiment shown in FIG. 4.
FIGS. 6 and 7 are diagrams showing specific examples of the embodiment shown in FIG. 5.
[0019]
(A) The black run extraction means 11 extracts a black run longer than a predetermined value (step S3-1).
(B) Runs that lie close to each other are found, and the inter-black-run pixel complementing means 12 fills the space between them with black pixels on the image (step S3-2). This is done because, when the frame positions are later determined by extracting white connected components, a faint or broken ruled line would let the connected component inside a frame join the area outside it, so the frame could not be extracted correctly. Complementing is restricted to long runs so that, in regions containing many small noise specks such as shaded (halftone) areas, the gaps between noise specks are not bridged into false ruled lines (see FIG. 6). The image information with the black pixels filled in is stored in the complemented image information storage means 13.
[0020]
(C) The circumscribed rectangles of the white pixel connected components (hereinafter also called white rectangles) are extracted by the white pixel connected component circumscribed rectangle extraction means 14 and stored in the rectangle information storage means 15 (step S3-3). A region surrounded by the black pixels that form a frame is extracted here as a circumscribed rectangle. The position of this rectangle becomes a frame candidate.
(D) Among these frame candidates, white rectangles that overlap each other are selected and integrated by the rectangle integration means 16 (step S3-4). As shown in FIG. 7, when a frame is divided by a character, this processing prevents the parts from being extracted as separate frames: if white rectangles overlap by more than a certain degree, they are integrated into a single rectangle.
(E) Finally, the rectangle size examination means 17 excludes rectangles that are too small, since they are likely to be the interiors of characters (step S3-5), and the frame position determination means 18 determines the positions of the remaining white rectangles as the frame positions (step S3-5).
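Steps (A) to (E) above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the image is assumed to be a list of rows with 1 for black and 0 for white, rectangles are `(x0, y0, x1, y1)` tuples with inclusive corners, and all function names, the 4-connectivity, the overlap measure (intersection area over the smaller rectangle's area), and the threshold parameters are assumptions introduced here.

```python
from collections import deque

def long_runs(line, min_len):
    """Return (start, end) pairs, end exclusive, of black runs with length >= min_len."""
    runs, start = [], None
    for i, v in enumerate(list(line) + [0]):  # trailing 0 closes a run at the edge
        if v and start is None:
            start = i
        elif not v and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs

def fill_gaps(img, min_len, max_gap):
    """Steps S3-1/S3-2: bridge short gaps between long runs, row-wise and column-wise."""
    h, w = len(img), len(img[0])
    out = [list(row) for row in img]
    for y in range(h):
        runs = long_runs(img[y], min_len)
        for (_, end), (nxt, _) in zip(runs, runs[1:]):
            if nxt - end <= max_gap:
                for x in range(end, nxt):
                    out[y][x] = 1
    for x in range(w):
        runs = long_runs([img[y][x] for y in range(h)], min_len)
        for (_, end), (nxt, _) in zip(runs, runs[1:]):
            if nxt - end <= max_gap:
                for y in range(end, nxt):
                    out[y][x] = 1
    return out

def white_rectangles(img):
    """Step S3-3: bounding boxes of 4-connected white regions (assumed connectivity)."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    rects = []
    for y in range(h):
        for x in range(w):
            if img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and not img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            rects.append((x0, y0, x1, y1))
    return rects

def area(r):
    return (r[2] - r[0] + 1) * (r[3] - r[1] + 1)

def overlap_ratio(a, b):
    """Intersection area divided by the smaller rectangle's area (assumed measure)."""
    w = min(a[2], b[2]) - max(a[0], b[0]) + 1
    h = min(a[3], b[3]) - max(a[1], b[1]) + 1
    return 0.0 if w <= 0 or h <= 0 else w * h / min(area(a), area(b))

def merge_overlapping(rects, min_ratio):
    """Step S3-4: repeatedly union any pair whose overlap ratio reaches min_ratio."""
    rects = list(rects)
    changed = True
    while changed:
        changed = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                a, b = rects[i], rects[j]
                if overlap_ratio(a, b) >= min_ratio:
                    rects[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del rects[j]
                    changed = True
                    break
            if changed:
                break
    return rects

def drop_small(rects, min_w, min_h):
    """Step S3-5: discard rectangles too small to be entry frames."""
    return [r for r in rects if r[2] - r[0] + 1 >= min_w and r[3] - r[1] + 1 >= min_h]

def extract_frames(img, min_len, max_gap, min_ratio, min_w, min_h):
    rects = white_rectangles(fill_gaps(img, min_len, max_gap))
    return drop_small(merge_overlapping(rects, min_ratio), min_w, min_h)
```

On a small frame whose top ruled line has a one-pixel break, `white_rectangles` alone returns a rectangle that leaks through the break, while `white_rectangles(fill_gaps(img, 3, 1))` returns only the frame interior, which is the effect the text attributes to step (B). The sketch does not treat the page background specially: on a real scan the white area outside the frames would also come out as a rectangle and would need to be filtered by the caller.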
[0021]
(4) Since the frame has been extracted as described above, the character recognition means 7 performs character recognition processing within the frame (step S4).
(5) Since the positions of the frames are known, character recognition is applied only to their interiors. Next, the keyword matching means 8 compares the recognition results with the registered keywords and searches for identical ones (step S5). An image region (frame region) whose recognition result is identical to a keyword is a frame in which that keyword exists. This yields the keyword frames, that is, the frames containing characters identified as keywords.
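The comparison in step S5 can be illustrated with a short Python sketch. It assumes that character recognition has already produced one text string per frame; the function name `find_keyword_frames`, the dictionary layout, and the whitespace trimming before comparison are assumptions made here for illustration and are not taken from the description.

```python
def find_keyword_frames(recognized, keywords):
    """Map each keyword to the frames whose recognized text equals it.

    recognized: dict mapping a frame rectangle (x0, y0, x1, y1) to the text
                returned for that frame by some recognition step (not shown).
    keywords:   list of keyword strings registered in advance.
    """
    wanted = set(keywords)
    hits = {k: [] for k in keywords}
    for frame, text in recognized.items():
        cleaned = text.strip()  # assumed normalization; the text only requires identity
        if cleaned in wanted:
            hits[cleaned].append(frame)
    return hits
```

For instance, with recognition results for three frames reading "Name", "Taro" and " Male ", and the keywords "Name" and "Male", the first and third frames would be reported as keyword frames and the second would be left as a candidate entry frame.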
[0022]
(6) The character entry frames belonging to (associated with) a keyword frame (hereinafter, affiliation frames) are searched for (step S6). For example, in the example shown in FIG. 1, two character entry frames belong to the frame containing the item (keyword) "Adult", so both are extracted to form the group of character entry frames belonging to "Adult". Only one character entry frame belongs to the item "Name", so that single frame is extracted. The policy is to search successively for frames touching the right side and the bottom of the keyword frame and to group them as frames with the same attribute until a frame of a different size is reached. The processing procedure is as follows.
[0023]
FIG. 8 is a diagram for explaining an embodiment of the affiliation frame grouping means shown in FIG. 2. In the figure, 21 is reference frame setting means, 22 is adjacent frame search means, 23 is frame shape verification means, 24 is in-frame keyword verification means, 25 is affiliation frame candidate information storage means, 26 is affiliation frame candidate layer counting means, and 27 is affiliation frame information storage means. Parts that operate in the same way as in the embodiment shown in FIG. 2 carry the same reference numerals as in FIG. 2.
FIG. 9 is a flowchart for explaining an example of the operation of the embodiment shown in FIG. 8.
FIGS. 10 and 11 are diagrams showing specific examples of the embodiment shown in FIG. 9.
[0024]
(A) The reference frame setting means 21 determines the frame in which the keyword exists as an initial reference frame (step S6-1).
(B) The adjacent frame search means 22 searches for a frame touching the right side of the reference frame (step S6-2). Specifically, a frame is regarded as touching when its left-edge coordinate is close to the right-edge coordinate of the reference frame.
(C) If no touching frame exists (NO in step S6-3), the rightward search from the reference frame ends and the downward search is performed (step S6-10).
[0025]
(D) The frame shape verification means 23 verifies the size and position of the frame (step S6-4). When the upper and lower edges of the frame lie within those of the reference frame and the difference in height is small, the shape of the frame is judged to conform to the reference; otherwise (NO in step S6-4), the rightward search from the reference frame ends and the downward search is performed (step S6-10).
(E) The in-frame keyword verification means 24 examines whether or not the adjacent search target frame is a keyword frame (step S6-5), and if it is not a keyword frame (NO in step S6-5), frame shape verification The means 23 is used to examine the difference in width between the reference frame and the adjacent search target frame (step S6-6). If the difference is large, it is determined that the adjacent search target frame is not a grouping target. (NO in step S6-6), the right side search is finished and the process proceeds to the lower side search (step S6-10).
(F) The in-frame keyword verification means 24 examines whether the adjacent search target frame is another keyword frame (step S6-7). If it is a keyword frame (YES in step S6-7), it is judged not to be a grouping target, the rightward search ends, and the process moves to the downward search (step S6-10).
[0026]
(G) An adjacent search target frame that has cleared the above conditions (NO in step S6-7) is grouped and judged to be an affiliation frame candidate of the keyword, so it is recorded in the affiliation frame candidate information storage means 25 (step S6-8).
(H) The reference frame setting means 21 resets the newly grouped frame as the reference frame (step S6-9), and the connection with the frame to its right is examined in turn. The process repeats in this way until no further frame to be grouped is found.
(I) Next, grouping is also performed on the frames below the keyword frame (step S6-10). The procedure is the same as for the rightward search.
[0027]
(J) If groups have been formed on both the right side and the lower side, only one of them is registered as the affiliation frames. To choose between them, the affiliation frame candidate layer counting means 26 counts the number of layers of affiliation frames in the horizontal and vertical groups, as shown in FIG. 11 (step S6-11); the group with more layers is selected and stored in the affiliation frame information storage means 27 (step S6-12). In the example shown in FIG. 11, the vertical direction has more layers, so the vertical (lower-side) group is adopted. If the counts are equal, the horizontal direction is adopted.
(K) When all keywords have been examined (YES in step S6-13), the result is output (step S6-14), and the process is terminated.
With the above processing, the character entry area belonging to each keyword can be identified; the image at that position is the character image, and the characters have thus been extracted.
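The grouping procedure of steps (A) to (K) can be sketched in Python as below. This is a simplified illustration under stated assumptions rather than the method as claimed: frames are `(x0, y0, x1, y1)` tuples, a single tolerance `tol` stands in for every "close" or "small difference" test, each chain step adds one frame (so the layer count is simply the chain length), and the size comparison of step S6-6 is skipped while the reference is still the keyword frame itself, which is one possible reading of steps S6-5 and S6-6. All names are hypothetical.

```python
def width(r):
    return r[2] - r[0]

def transpose(r):
    """Swap x and y so the downward search can reuse the rightward logic."""
    x0, y0, x1, y1 = r
    return (y0, x0, y1, x1)

def touches_right(ref, cand, tol):
    """Steps S6-2/S6-4: cand starts near ref's right edge, fits inside ref's
    vertical extent, and has a similar height."""
    return (abs(cand[0] - ref[2]) <= tol
            and cand[1] >= ref[1] - tol
            and cand[3] <= ref[3] + tol
            and abs((cand[3] - cand[1]) - (ref[3] - ref[1])) <= tol)

def chain(start, frames, keyword_frames, tol, downward=False):
    """Follow touching frames from the keyword frame, rightward or downward."""
    flip = transpose if downward else (lambda r: r)
    ref, group = start, []
    while True:
        nxt = next((f for f in frames
                    if f != ref and f not in group
                    and touches_right(flip(ref), flip(f), tol)), None)
        if nxt is None or nxt in keyword_frames:
            return group  # nothing adjacent (S6-3) or another keyword frame (S6-7)
        if ref != start and abs(width(flip(nxt)) - width(flip(ref))) > tol:
            return group  # size differs too much from the previous frame (S6-6)
        group.append(nxt)  # S6-8
        ref = nxt          # S6-9

def affiliation_frames(keyword_frame, frames, keyword_frames, tol=2):
    """Steps S6-11/S6-12: keep the direction with more frames; ties go horizontal."""
    right = chain(keyword_frame, frames, keyword_frames, tol)
    down = chain(keyword_frame, frames, keyword_frames, tol, downward=True)
    return down if len(down) > len(right) else right
```

A keyword frame with two equally sized frames to its right, followed by another keyword frame, and one frame below it would yield the two right-hand frames; if instead two frames were stacked below and only one lay to the right, the downward group would be chosen.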
[0028]
(Example 2)
FIG. 12 is a configuration diagram of the main part for explaining an embodiment of a computer-readable recording medium on which a program for realizing the image processing method according to the present invention is recorded; it is an embodiment in which the present invention is realized by software.
A general-purpose processing device including a CPU 31, a memory 32, a hard disk 33, an input device 34, a CD-ROM drive 35, a display 36, a mouse, and the like is prepared. A recording medium 37 such as a CD-ROM stores a program for realizing the processing functions and processing procedures of the image processing of the present invention. The document image to be processed is stored in the hard disk 33, for example. The CPU 31 reads out and executes a program for realizing the processing functions and procedures described above from the recording medium 37, and outputs the image processing result to the display 36 or the like.
[0029]
[Effects of the Invention]
As is clear from the above description, according to the present invention, characters can be extracted even from forms that are not completely fixed, and characters can be extracted correctly even if the ruled lines are somewhat blurred.
[Brief description of the drawings]
FIG. 1 is a diagram for explaining an example of a form that has been distributed in the past.
FIG. 2 is a main part configuration diagram for explaining an embodiment of an image processing method, an image processing apparatus and a computer-readable recording medium storing a program for realizing the processing method according to the present invention.
FIG. 3 is a flowchart for explaining an example of the operation of the embodiment shown in FIG. 2.
FIG. 4 is a configuration diagram for explaining an embodiment of the frame extraction means shown in FIG. 2.
FIG. 5 is a flowchart for explaining an example of the operation of the embodiment shown in FIG. 4.
FIG. 6 is a diagram showing a specific example of the embodiment shown in FIG. 5.
FIG. 7 is a diagram showing a specific example of the embodiment shown in FIG. 5.
FIG. 8 is a diagram for explaining an embodiment of the affiliation frame grouping means shown in FIG. 2.
FIG. 9 is a flowchart for explaining an example of the operation of the embodiment shown in FIG. 8.
FIG. 10 is a diagram showing a specific example of the embodiment shown in FIG. 9.
FIG. 11 is a diagram showing a specific example of the embodiment shown in FIG. 9.
FIG. 12 is a block diagram illustrating a main part of an embodiment of a computer-readable recording medium on which a program for realizing an image processing method according to the present invention is recorded.
[Explanation of symbols]
1 ... image input means, 2 ... image storage means, 3 ... keyword input means, 4 ... keyword storage means, 5 ... frame extraction means, 6 ... frame information storage means, 7 ... character recognition means, 8 ... keyword matching means, 9 ... affiliation frame grouping means, 11 ... black run extraction means, 12 ... inter-black-run pixel complementing means, 13 ... complemented image information storage means, 14 ... white pixel connected component circumscribed rectangle extraction means, 15 ... rectangle information storage means, 16 ... rectangle integration means, 17 ... rectangle size examination means, 18 ... frame position determination means, 21 ... reference frame setting means, 22 ... adjacent frame search means, 23 ... frame shape verification means, 24 ... in-frame keyword verification means, 25 ... affiliation frame candidate information storage means, 26 ... affiliation frame candidate layer counting means, 27 ... affiliation frame information storage means, 31 ... CPU, 32 ... memory, 33 ... hard disk, 34 ... input device, 35 ... CD-ROM drive, 36 ... display, 37 ... recording medium.

Claims

An image processing method for extracting characters in a form image,
comprising an image input step of inputting a form image, a keyword input step of inputting a keyword existing in the form image, a frame extraction step of extracting frames surrounded by ruled lines in the form image, a character recognition step of recognizing characters in the frames, a keyword matching step of comparing the recognized characters with the keyword, and an affiliation frame extraction step of searching for and extracting the affiliation frames of a frame in which, as a result of the comparison, the same characters as the keyword exist,
wherein the frame extraction step comprises a black run extraction step of extracting black runs of at least a predetermined length, a pixel complementing step of filling the space between extracted black runs with black pixels when the runs are closer to each other than a predetermined distance, a circumscribed rectangle extraction step of extracting circumscribed rectangles of connected components of white pixels from the form image in which the black pixels have been filled in, a rectangle integration step of, when circumscribed rectangles obtained by the extraction overlap, integrating them according to the degree of overlap, a rectangle size examination step of comparing the size of each integrated circumscribed rectangle with a predetermined size and excluding those smaller than the predetermined size, and a frame position determination step of taking the coordinate values of the circumscribed rectangles obtained on the basis of the examination as the coordinate values of the frames.

An image processing apparatus for extracting characters in a form image, comprising: image input means for inputting a form image; keyword input means for inputting a keyword present in the form image; frame extraction means for extracting frames enclosed by ruled lines in the form image; character recognition means for recognizing characters within the frames; keyword collation means for comparing the recognized characters with the keyword; and belonging frame extraction means for searching for and extracting the belonging frames of a frame in which, as a result of the comparison, the same characters as the keyword are present,
wherein the frame extraction means comprises: black run extraction means for extracting black runs of at least a predetermined length; pixel complementing means for filling the space between the extracted black runs with black pixels when those black runs are closer to each other than a predetermined distance; circumscribed rectangle extraction means for extracting circumscribed rectangles of connected components of white pixels from the form image in which the black pixels have been filled in; rectangle integration means for, when circumscribed rectangles obtained by the extraction overlap each other, integrating those circumscribed rectangles according to the degree of overlap; rectangle size examination means for comparing the size of each integrated circumscribed rectangle with a predetermined size and excluding those smaller than the predetermined size; and frame position determination means for taking the coordinate values of the circumscribed rectangles obtained through the examination as the coordinate values of the frames.

A computer-readable recording medium storing a program for causing a computer to realize functions for executing an image processing method for extracting characters in a form image,
wherein the functions include: an image input function of inputting a form image; a keyword input function of inputting a keyword present in the form image; a frame extraction function of extracting frames enclosed by ruled lines in the form image; a character recognition function of recognizing characters within the frames; a keyword collation function of comparing the recognized characters with the keyword; and a belonging frame extraction function of searching for and extracting the belonging frames of a frame in which, as a result of the comparison, the same characters as the keyword are present,
and wherein the frame extraction function comprises: a black run extraction function of extracting black runs of at least a predetermined length; a pixel complementing function of filling the space between the extracted black runs with black pixels when those black runs are closer to each other than a predetermined distance; a circumscribed rectangle extraction function of extracting circumscribed rectangles of connected components of white pixels from the form image in which the black pixels have been filled in; a rectangle integration function of, when circumscribed rectangles obtained by the extraction overlap each other, integrating those circumscribed rectangles according to the degree of overlap; a rectangle size examination function of comparing the size of each integrated circumscribed rectangle with a predetermined size and excluding those smaller than the predetermined size; and a frame position determination function of taking the coordinate values of the circumscribed rectangles obtained through the examination as the coordinate values of the frames, the recording medium storing a program for causing a computer to realize each of the above functions.