JP4869365B2

JP4869365B2 - Image processing apparatus and image processing method

Info

Publication number: JP4869365B2
Application number: JP2009026104A
Authority: JP
Inventors: 聡一郎小野; 博之水谷
Original assignee: Toshiba Corp; Toshiba Solutions Corp
Current assignee: Toshiba Corp; Toshiba Digital Solutions Corp
Priority date: 2009-02-06
Filing date: 2009-02-06
Publication date: 2012-02-08
Anticipated expiration: 2029-02-06
Also published as: JP2010182167A

Description

The present invention relates to an image processing apparatus and an image processing method.

One example of an image processing apparatus is a character recognition apparatus that recognizes characters contained in an image. For such apparatuses, a recognition technique called the mutual subspace method has been proposed as one way of performing character recognition using a plurality of feature quantities. This technique performs recognition by focusing on the diverse features of a single character image (see, for example, Non-Patent Document 1).

On the other hand, each character image is subject to variation factors such as positional deviation and angle. A technique called the subspace method has been published as one approach to pattern recognition that copes with such variation, but its ability to cope is not necessarily complete (see, for example, Non-Patent Document 2).

Non-Patent Document 1: Kenichi Maeda (前田賢一) and Sadakazu Watanabe (渡辺貞一), "Pattern Matching Method Introducing Local Structure," IEICE Transactions, Vol. J68-D, No. 3, 1985.
Non-Patent Document 2: Kenichiro Ishii (石井健一郎) et al., "Easy-to-Understand Pattern Recognition," Ohmsha, 1998.

In the subspace method, unless the learning patterns contain many of these variation factors, the method may not cope well with them in practice. In addition, when the variation of the input image is too large, even the subspace method cannot cope, and recognition accuracy decreases.

The present invention has been made to solve these problems, and an object thereof is to provide an image processing apparatus and an image processing method capable of improving the accuracy of recognizing characters from a character image.

To solve the above problems, an image processing apparatus according to the present invention comprises: a memory in which a document image is stored; a recognition dictionary in which characters and their feature data are stored in association with each other; a preprocessing unit that performs predetermined preprocessing on the document image read from the memory to generate a character image; a first pattern generation unit that performs first image processing on the character image generated by the preprocessing unit to generate a first character pattern group having a plurality of different character patterns; a second pattern generation unit that performs second image processing on the character image generated by the preprocessing unit to generate a second character pattern group having a plurality of different character patterns; a first feature extraction unit that extracts feature data from each of the plurality of character patterns of the first character pattern group generated by the first pattern generation unit; a second feature extraction unit that extracts feature data from each of the plurality of character patterns of the second character pattern group generated by the second pattern generation unit; a first similarity calculation unit that calculates the similarity between the plurality of feature data extracted by the first feature extraction unit and the feature data of the characters stored in the recognition dictionary; a second similarity calculation unit that calculates the similarity between the plurality of feature data extracted by the second feature extraction unit and the feature data of the characters stored in the recognition dictionary; and a similarity integration unit that integrates the similarity calculated by the first similarity calculation unit and the similarity calculated by the second similarity calculation unit into one by a predetermined calculation formula, and uses the integrated similarity to select a character having a high similarity from the recognition dictionary.

An image processing method according to the present invention is performed by an image processing apparatus having a memory in which a document image is stored, a recognition dictionary in which characters and their feature data are stored in association with each other, a preprocessing unit, a first pattern generation unit, a second pattern generation unit, a first feature extraction unit, a second feature extraction unit, a first similarity calculation unit, a second similarity calculation unit, and a similarity integration unit. The method comprises: a step in which the preprocessing unit reads the document image from the memory and performs predetermined preprocessing on the read document image to generate a character image; a step in which the first pattern generation unit performs first image processing on the character image to generate a first character pattern group having a plurality of different character patterns; a step in which the second pattern generation unit performs second image processing on the character image to generate a second character pattern group having a plurality of different character patterns; a step in which the first feature extraction unit extracts feature data from each of the plurality of character patterns of the first character pattern group generated by the first image processing; a step in which the second feature extraction unit extracts feature data from each of the plurality of character patterns of the second character pattern group generated by the second image processing; a step in which the first similarity calculation unit calculates the similarity between the plurality of feature data extracted by the first feature extraction unit and the feature data of the characters in the recognition dictionary; a step in which the second similarity calculation unit calculates the similarity between the plurality of feature data extracted by the second feature extraction unit and the feature data of the characters in the recognition dictionary; and a step in which the similarity integration unit integrates the similarity calculated by the first similarity calculation unit and the similarity calculated by the second similarity calculation unit into one by a predetermined calculation formula, and uses the integrated similarity to select a character having a high similarity from the recognition dictionary.

According to the present invention, the accuracy of recognizing characters from a character image can be improved.

FIG. 1 is a diagram showing the configuration of an image processing apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart showing the operation of the image processing apparatus.
FIG. 3 is a diagram for explaining a 4-neighbor Gaussian filter.
FIG. 4 is a diagram for explaining an 8-neighbor Gaussian filter.

Hereinafter, an image processing apparatus according to an embodiment of the present invention will be described in detail with reference to the drawings.

As shown in FIG. 1, the image processing apparatus of this embodiment includes an input unit 1, a computer 2 (hereinafter referred to as "PC 2"), an output unit 3, and the like.

The input unit 1 is an external input device such as a camera or scanner, and inputs to the PC 2 a document image optically read from a paper document by a CCD sensor or the like. The output unit 3 is, for example, a display device such as a monitor or a printing device such as a printer, and outputs (displays or prints) the recognition result data output from the PC 2.

The PC 2 includes a memory 10, a preprocessing unit 11, pattern generation units 12a and 12b, feature extraction units 13a and 13b, recognition dictionaries 14a and 14b, partial similarity calculation units 15a and 15b, a similarity integration unit 16, and the like. Each of these units is realized as a software module installed on the hard disk of the computer. These units may alternatively be configured in hardware.

The pattern generation unit 12a, the feature extraction unit 13a, the recognition dictionary 14a, and the partial similarity calculation unit 15a constitute a first calculation system 4. The first calculation system 4 transforms the character image 22 generated by the preprocessing unit 11 into several character patterns 23 and 24, and then calculates the similarity with the characters 27a stored in the recognition dictionary 14a.

The pattern generation unit 12b, the feature extraction unit 13b, the recognition dictionary 14b, and the partial similarity calculation unit 15b constitute a second calculation system 5. The second calculation system 5 transforms the character image 22 generated by the preprocessing unit 11 into several character patterns 25 and 26 by processing different from that of the first calculation system 4, and then calculates the similarity with the characters 27b stored in the recognition dictionary 14b.
The first calculation system 4 and the second calculation system 5 are thus a plurality of calculation systems, each of which transforms the character image generated by the preprocessing unit 11 into several patterns by processing that differs from system to system and then calculates the similarity with the characters stored in the corresponding recognition dictionary 14a or 14b.

The memory 10 is used as an area into which computer control programs such as the operating system (OS) are loaded, and also as a storage area for the calculations and processing performed by the units described above. The memory 10 stores, for example, image data for comparison processing and processing result data.

The preprocessing unit 11 performs, as the predetermined preprocessing, at least one of partial cropping of the image used for character recognition (the character image), binarization, noise removal, contour enhancement, and the like. These individual image processing techniques are well known, so a detailed description is omitted.

The pattern generation units 12a and 12b perform image processing on the image preprocessed by the preprocessing unit 11, such as expansion, contraction, rotation, translation, blurring, camera shake, and perspective transformation, to generate new images in which the original character image is changed (deformed or altered).
The pattern generation unit 12a generates a first character pattern group having a plurality of different character patterns 23 and 24 by performing the first image processing on the character image. As the first image processing, the pattern generation unit 12a performs, for example, translation of the character (black pixels).
The pattern generation unit 12b generates a second character pattern group having a plurality of different character patterns 25 and 26 by performing the second image processing on the character image. As the second image processing, the pattern generation unit 12b performs, for example, translation of the character (black pixels) and expansion of the character (black pixels). Character translation shifts the position of the character (translates the black pixels) within the range the character can occupy (the character frame). Character expansion thickens the strokes of the character in units of pixels.

The feature extraction unit 13a extracts the feature quantity (hereinafter referred to as feature data) of each of the character patterns 23 and 24 in the first character pattern group generated by the pattern generation unit 12a. The feature extraction unit 13b extracts the feature data of each of the character patterns 25 and 26 in the second character pattern group generated by the pattern generation unit 12b.

The character patterns 23 and 24 and the character patterns 25 and 26 are images resulting from different types of image processing. The different types of image processing may partly include the same processing.

The recognition dictionaries 14a and 14b store in advance a large number of characters in association with their feature data. The recognition dictionary 14a stores feature patterns (feature data and text data) of characters 27a for recognizing the character patterns 23 and 24 generated by the pattern generation unit 12a. The recognition dictionary 14b stores feature patterns (feature data and text data) of characters 27b for recognizing the character patterns 25 and 26 generated by the pattern generation unit 12b.

The partial similarity calculation unit 15a computes the similarity between the plurality of feature data extracted by the feature extraction unit 13a and the feature data of the characters stored in the recognition dictionary 14a.
The partial similarity calculation unit 15b computes the similarity between the plurality of feature data extracted by the feature extraction unit 13b and the feature data of the characters stored in the recognition dictionary 14b. Here, the computation means evaluating the formulas (functions) shown in equations (5) to (7), which are stored in the memory 10, with the feature data as input.

The similarity integration unit 16 integrates the similarity calculated by the first calculation system 4 and the similarity calculated by the second calculation system 5 into one. More specifically, the similarity integration unit 16 integrates the plurality of partial similarities calculated by the partial similarity calculation units 15a and 15b into one. The integration uses the similarity integration function σ shown in equation (8), which is stored in the memory 10.

The operation of this image processing apparatus will be described below with reference to the flowchart of FIG. 2 and to FIGS. 3 and 4.

When a document to be recognized is set in the input unit 1 such as a camera or scanner and a shooting operation (for a digital camera) or a scanning operation (for a scanner) is performed, the input unit 1 reads the image of the document and outputs it to the PC 2 as a digital image (referred to as the "document image 21").

When the document image 21 output from the input unit 1 is input to the PC 2, the preprocessing unit 11 first stores the document image 21 in the memory 10 (step S101 in FIG. 2).

After storing the document image 21 in the memory 10, the preprocessing unit 11 reads the document image 21 from the memory 10, performs the predetermined preprocessing on it to generate the character image 22 that is the target of character recognition (step S102), and stores the character image 22 in the memory 10. The predetermined preprocessing is processing selected in advance from image processing such as partial cropping of the image, binarization, noise removal, and contour enhancement. Here, the character image 22 generated by the predetermined preprocessing is assumed to be a character such as "A".

The pattern generation unit 12a reads the character image 22 from the memory 10, performs the first image processing on the read character image 22 (the preprocessed image) to generate a group of different character patterns (character patterns 23 and 24 in FIG. 1) (step S103), and stores them in the memory 10. The first image processing is processing selected in advance from among expansion, contraction, rotation, translation, blurring, camera shake, perspective transformation, and the like. The character pattern group generated by the first image processing is referred to as the first character pattern group.

The character pattern 23 is the character "A" moved to the upper left within the character frame relative to the character image 22 (the preprocessed image). The character pattern 24 is the character "A" moved to the lower left within the character frame relative to the character image 22.
At substantially the same time as the pattern generation unit 12a, the pattern generation unit 12b reads the character image 22 from the memory 10, performs the second image processing on the read character image 22 (the preprocessed image) to generate a group of different character patterns (character patterns 25 and 26 in FIG. 1) (step S104), and stores them in the memory 10. The character pattern group generated by the second image processing is referred to as the second character pattern group.

The second image processing is processing selected in advance from among expansion, contraction, rotation, translation, blurring, camera shake, perspective transformation, and the like, and differs from the first image processing.
The character pattern 25 is the character "A" moved to the upper right within the character frame and made bold relative to the character image 22 (the preprocessed image). The character pattern 26 is the character "A" moved to the lower right within the character frame and made bold relative to the character image 22.

That is, the first image processing and the second image processing are mutually different processes, each selected in advance from among expansion, contraction, rotation, translation, blurring, camera shake, perspective transformation, and the like.

The feature extraction unit 13a reads the first character pattern group, that is, the character patterns 23 and 24, from the memory 10, extracts feature data from each of the read character patterns 23 and 24 (step S105), and stores the feature data in the memory 10. At substantially the same time, the feature extraction unit 13b reads the second character pattern group, that is, the character patterns 25 and 26, from the memory 10, extracts feature data from each of the read character patterns 25 and 26 (step S106), and stores the feature data in the memory 10.

The partial similarity calculation unit 15a reads the feature data of each of the character patterns 23 and 24 from the memory 10, calculates a partial similarity using the read feature data and the feature data of the characters 27a read from the recognition dictionary 14a (step S107), and stores it in the memory 10.

At substantially the same time, the partial similarity calculation unit 15b reads the feature data of each of the character patterns 25 and 26 from the memory 10, calculates a partial similarity using the read feature data and the feature data of the characters 27b read from the recognition dictionary 14b (step S108), and stores it in the memory 10.

The similarity integration unit 16 reads the calculated partial similarities of the respective character pattern groups from the memory 10 and integrates them (step S109).

The similarity integration unit 16 then uses the integrated similarity to select a character having a high similarity from the recognition dictionaries 14a and 14b (step S110), that is, it performs pattern recognition, and outputs the recognized character (the text data and the source character image 22) to the output unit 3. If the output unit 3 is, for example, a display device, the recognition result is displayed on its screen.

The character pattern generation processing (image processing) performed by the pattern generation units 12a and 12b will now be described.

The pattern generation units 12a and 12b perform predetermined image processing on the image preprocessed by the preprocessing unit 11, such as expansion, contraction, rotation, translation, blurring, camera shake, and perspective transformation, to vary (deform or alter) the original character image 22 within the cropping range (the range of the character frame) and generate a plurality of new images. The generated images are then grouped according to the type of image processing applied.

For example, the first group may consist of preprocessed images that have been expanded, the second group of preprocessed images that have been contracted, and the third group of preprocessed images that have been rotated.

As an example of image expansion, each pixel is set to black if the pixel itself or at least one of its four neighbors (above, below, left, and right) is black.

As an example of image contraction, each pixel is set to white if the pixel itself or at least one of its four neighbors (above, below, left, and right) is white.
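The expansion and contraction rules described above can be sketched as follows. This is only an illustrative sketch: the names `dilate` and `erode`, the encoding of black as 1 and white as 0, and the choice to ignore neighbors that fall outside the image are assumptions, not details given in the description.

```python
from typing import List

Image = List[List[int]]  # assumed encoding: 1 = black (ink), 0 = white


def _self_and_4_neighbors(img: Image, r: int, c: int) -> List[int]:
    """Values of pixel (r, c) and its in-bounds up/down/left/right neighbors."""
    h, w = len(img), len(img[0])
    coords = [(r, c), (r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [img[y][x] for y, x in coords if 0 <= y < h and 0 <= x < w]


def dilate(img: Image) -> Image:
    """Expansion: a pixel becomes black if it or any 4-neighbor is black."""
    return [[1 if any(_self_and_4_neighbors(img, r, c)) else 0
             for c in range(len(img[0]))] for r in range(len(img))]


def erode(img: Image) -> Image:
    """Contraction: a pixel becomes white if it or any 4-neighbor is white."""
    return [[1 if all(_self_and_4_neighbors(img, r, c)) else 0
             for c in range(len(img[0]))] for r in range(len(img))]
```

Applying `dilate` once thickens every stroke by one pixel in each of the four directions, which corresponds to the "bold" character patterns mentioned in the embodiment.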

Regarding rotation and translation of the image, when the pixel value at coordinate x in the preprocessed image is denoted f(x), the images $R_U[f]$ and $S_s[f]$ given by

$$R_U[f](x) = f(U^{-1}x) \tag{1}$$

$$S_s[f](x) = f(x - s) \tag{2}$$

can be taken as the rotated and translated images, with the rotation matrix U and the translation amount s as parameters, respectively. With this processing, the black points at coordinates $U^{-1}x$ and $x - s$ are rotated and translated, respectively, to the position of coordinate x.

For rotation, the image is rotated, for example, in 10-degree steps up to 90 degrees. For translation, the image is moved by a ratio such as 1/4 relative to, for example, the length of a side or diagonal of the preprocessed image.
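As a rough illustration of the rotation and translation described above, the sketch below builds each output pixel by looking up the source pixel that maps onto it (the point at U⁻¹x or x − s). The function names, the (row, column) offset convention, rotation about the image center, nearest-neighbor sampling, and filling uncovered pixels with white are all assumptions made for the example.

```python
import math
from typing import List, Tuple

Image = List[List[int]]  # assumed encoding: 1 = black, 0 = white


def translate(img: Image, s: Tuple[int, int]) -> Image:
    """Output pixel x takes the value f(x - s); s = (rows down, columns right)."""
    h, w = len(img), len(img[0])
    dy, dx = s
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            sr, sc = r - dy, c - dx
            if 0 <= sr < h and 0 <= sc < w:
                out[r][c] = img[sr][sc]
    return out


def rotate(img: Image, degrees: float) -> Image:
    """Output pixel x takes the value f(U^-1 x), rotating about the center."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    t = math.radians(degrees)
    cos_t, sin_t = math.cos(t), math.sin(t)
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            y, x = r - cy, c - cx
            # U^-1 is the rotation by -t applied to the centered coordinate.
            sx = cos_t * x + sin_t * y
            sy = -sin_t * x + cos_t * y
            sr, sc = round(sy + cy), round(sx + cx)
            if 0 <= sr < h and 0 <= sc < w:
                out[r][c] = img[sr][sc]
    return out
```

A pattern group could then be produced, for instance, by calling `rotate` for 10, 20, ..., 90 degrees, or `translate` with offsets of about a quarter of the image side.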

For blurring and camera shake, a point spread function (PSF) that realizes the effect is prepared, the preprocessed image is convolved with it, and the result is binarized again.

One PSF corresponding to blurring is the 4-neighbor Gaussian filter shown in FIG. 3: in a nine-pixel model consisting of a center pixel and the eight pixels adjacent to it, the center pixel has the value 2, the pixels above, below, left, and right of it have the value 1, and the diagonal pixels have the value 0.

Alternatively, the 8-neighbor Gaussian filter shown in FIG. 4 may be used: in the same nine-pixel model, the center pixel has the value 4, the pixels above, below, left, and right of it have the value 2, and the diagonal pixels have the value 1.
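The two blur kernels and the convolve-then-binarize step might be sketched as below. The kernel weights come from the description of FIGS. 3 and 4; dividing by the kernel sum, treating pixels outside the image as white, and the threshold value are assumptions, since the description does not specify them.

```python
from typing import List, Sequence

KERNEL_4 = ((0, 1, 0), (1, 2, 1), (0, 1, 0))  # 4-neighbor filter (FIG. 3)
KERNEL_8 = ((1, 2, 1), (2, 4, 2), (1, 2, 1))  # 8-neighbor filter (FIG. 4)


def convolve3x3(img: Sequence[Sequence[float]],
                kernel: Sequence[Sequence[float]]) -> List[List[float]]:
    """Convolve with a 3x3 kernel, normalized by the kernel sum (assumption)."""
    h, w = len(img), len(img[0])
    total = sum(sum(row) for row in kernel)
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(3):
                for j in range(3):
                    sr, sc = r - (i - 1), c - (j - 1)
                    if 0 <= sr < h and 0 <= sc < w:  # outside counts as white
                        acc += kernel[i][j] * img[sr][sc]
            out[r][c] = acc / total
    return out


def blur_and_binarize(img: Sequence[Sequence[int]],
                      kernel: Sequence[Sequence[float]],
                      threshold: float = 0.5) -> List[List[int]]:
    """Blur, then binarize again; the threshold is a hypothetical parameter."""
    gray = convolve3x3(img, kernel)
    return [[1 if v >= threshold else 0 for v in row] for row in gray]
```

A lower threshold keeps more of the blurred halo as black pixels, so the threshold effectively controls how strongly the blur changes the binarized pattern.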

As a point spread function (PSF) corresponding to camera shake, a point P is selected in the vicinity of the origin O, and the PSF 1(y) can be constructed as

[Equation (3)]

The process of convolving 1(y), as the point spread function (PSF), into the original image f(x) can be expressed as

[Equation (4)]

where h(x) is the image after the convolution.
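The camera-shake formulas themselves are not reproduced in this text, so the following is only a guess at one common construction: the PSF is taken to be uniform on the pixels of the line segment from the origin O to the chosen point P and zero elsewhere, and the convolution is taken to be the discrete sum of PSF(y)·f(x − y) over offsets y. Both forms, the normalization, and all function names are assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Offset = Tuple[int, int]  # (rows down, columns right) relative to the origin O


def segment_psf(p: Offset) -> Dict[Offset, float]:
    """Assumed PSF: equal weight on each pixel of the segment O-P, summing to 1."""
    py, px = p
    steps = max(abs(py), abs(px))
    if steps == 0:
        return {(0, 0): 1.0}
    points = {(round(py * k / steps), round(px * k / steps))
              for k in range(steps + 1)}
    weight = 1.0 / len(points)
    return {pt: weight for pt in points}


def convolve_psf(img: Sequence[Sequence[float]],
                 psf: Dict[Offset, float]) -> List[List[float]]:
    """h(x) = sum over y of psf(y) * f(x - y); outside the image counts as white."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for (dy, dx), weight in psf.items():
                sr, sc = r - dy, c - dx
                if 0 <= sr < h and 0 <= sc < w:
                    out[r][c] += weight * img[sr][sc]
    return out
```

Per the description, the convolved image would then be binarized again before being used as a character pattern.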

Perspective transformation is widely known under the name projective transformation, and the projective transformation techniques disclosed in the general literature on projective geometry, for example Yujiro Kawamata, "Geometry of Projective Spaces" (2001), Asakura Shoten, are used.

The processing by which the feature extraction units 13a and 13b extract feature quantities from the images (character patterns 23 and 24, or 25 and 26) is described below.
For example, one method is to apply the blurring described above to the image and then use the blurred image as the feature quantity as it is, regarding it as a vector whose components are the pixel values. Different feature extraction may be performed for each of the image groups.
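Treating an already blurred image as a feature vector amounts to flattening it, as in the minimal sketch below. The row-major ordering and the optional scaling to unit length are assumptions; the description only says that the pixel values are used as vector components.

```python
import math
from typing import List, Sequence


def feature_vector(gray: Sequence[Sequence[float]],
                   normalize: bool = True) -> List[float]:
    """Flatten a (blurred) image row by row into one vector of pixel values."""
    vec = [float(v) for row in gray for v in row]
    if normalize:  # unit-length scaling is an assumption, not from the text
        norm = math.sqrt(sum(v * v for v in vec))
        if norm > 0.0:
            vec = [v / norm for v in vec]
    return vec
```

Each character pattern in a group yields one such vector, so a group of patterns yields a set of vectors that can be compared against the dictionary as a whole.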

As for the method by which the partial similarity calculation units 15a and 15b perform pattern recognition and the method of creating the recognition dictionaries 14a and 14b, the recognition dictionaries 14a and 14b are created based on the CLAFIC method disclosed in Non-Patent Document 2, and the similarity between the feature quantities of the plurality of generated patterns and each character type registered in the recognition dictionaries 14a and 14b is then calculated using, for example, the mutual subspace method.

As a method of calculating the similarity using the mutual subspace method, for example, a matrix is computed from the elements of a group $\alpha$ of the plurality of feature vectors input from the feature extraction units 13a and 13b, and its eigenvectors are denoted $u^\alpha_1, \dots, u^\alpha_m$. Integers $p$ and $q$ satisfying $0 \le p \le m$ and $0 \le q \le n$ are then chosen, and the maximum eigenvalue $\rho^\alpha$ of the matrix defined using $U_p = (u^\alpha_1, \dots, u^\alpha_p)$ and $V_q = (v_1, \dots, v_q)$ is obtained. This $\rho^\alpha$ is used as the similarity. Here, a superscript t on the left of a matrix denotes transposition. $\rho^\alpha$ is calculated using a known method such as the power method disclosed in, for example, Ichizo Ninomiya (ed.), "Techniques of Numerical Computation" (2006), Kyoritsu Shuppan. The vectors $v_1, \dots, v_n$ are dictionary data. They are obtained by computing a matrix from learning patterns $y^1, \dots, y^n$ prepared in advance for each character type and taking its eigenvectors as $v_1, \dots, v_n$.
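The final step of this calculation can be sketched as follows. The defining matrix is not reproduced in this text, so the sketch assumes the form commonly used for the mutual subspace method, ${}^tU_p V_q\,{}^tV_q U_p$, and assumes that both bases are already orthonormal. The function names are hypothetical, and the power iteration is a simple variant restarted from every coordinate axis so that the dominant eigenvector cannot be missed.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def largest_eigenvalue(M, iters=200):
    """Power method for a symmetric positive semidefinite matrix M."""
    n = len(M)
    best = 0.0
    for start in range(n):
        w = [1.0 if i == start else 0.0 for i in range(n)]
        lam = 0.0
        for _ in range(iters):
            z = [dot(row, w) for row in M]
            norm = math.sqrt(dot(z, z))
            if norm < 1e-15:  # w lies in the null space of M
                break
            w = [x / norm for x in z]
            lam = dot(w, [dot(row, w) for row in M])  # Rayleigh quotient
        best = max(best, lam)
    return best


def mutual_subspace_similarity(U, V):
    """Largest eigenvalue of tU V tV U (assumed form).

    U: list of p orthonormal basis vectors of the input group.
    V: list of q orthonormal basis vectors from the dictionary.
    """
    if not U or not V:
        return 0.0
    # C[i][k] = <u_i, v_k>, so that M = C tC = tU V tV U (p x p)
    C = [[dot(u, v) for v in V] for u in U]
    M = [[dot(C[i], C[j]) for j in range(len(U))] for i in range(len(U))]
    return largest_eigenvalue(M)
```

Under these assumptions the value equals the squared cosine of the smallest angle between the two subspaces, so it is 1 when they share a direction and 0 when they are orthogonal.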

In this example, the recognition dictionaries 14a and 14b are provided individually, with separate learning patterns prepared for each of the partial similarity calculation units 15a and 15b corresponding to the pattern generation units 12a and 12b. However, a common recognition dictionary may be used by all of the partial similarity calculation units 15a and 15b.

As the processing of the similarity integration unit 16, when the partial similarities of the groups are denoted $\rho^1, \dots, \rho^\mu$, the value $\rho = \sigma(\rho^1, \dots, \rho^\mu)$ determined by a similarity integration function $\sigma$ is used as the similarity.

The similarity integration function $\sigma$ may be defined by selecting the largest of the partial similarities, by selecting several partial similarities in descending order and averaging them, or by averaging all of the partial similarities. Instead of averaging the selected partial similarities directly, a monotonically increasing function may first be applied to emphasize the differences between them.

To do so, a monotonically increasing function $\tau$ may be applied to the $r$ largest partial similarities $\rho^{t(1)}, \dots, \rho^{t(r)}$ before they are combined. Here, $r$ is an integer satisfying $1 \le r \le \mu$, and $\rho^{t(s)}$ is the $s$-th largest value among $\rho^1, \dots, \rho^\mu$.

Examples of $\tau$ include functions parameterized by constants $\rho_0$ and $\beta$, which are chosen separately as appropriate.
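The integration choices described above can be sketched in one small function. The concrete formulas are not reproduced in this text, so two assumptions are made here and should be read as illustrative only: the integrated value is taken to be the average of $\tau$ applied to the $r$ largest partial similarities, and the example $\tau$ is a logistic function with centre $\rho_0$ and steepness $\beta$. The names `integrate` and `logistic_tau` are hypothetical.

```python
import math


def integrate(rhos, r=None, tau=None):
    """Average tau(rho) over the r largest partial similarities.

    r = 1 selects the maximum; r = len(rhos) averages all of them.
    tau defaults to the identity, i.e. a plain average.
    """
    if r is None:
        r = len(rhos)
    if not 1 <= r <= len(rhos):
        raise ValueError("r must satisfy 1 <= r <= number of similarities")
    if tau is None:
        tau = lambda rho: rho
    top = sorted(rhos, reverse=True)[:r]
    return sum(tau(rho) for rho in top) / r


def logistic_tau(rho0, beta):
    """Assumed example of a monotonically increasing function tau."""
    return lambda rho: 1.0 / (1.0 + math.exp(-beta * (rho - rho0)))
```

For example, `integrate([0.2, 0.9, 0.5], r=2)` averages the two largest values, and passing `tau=logistic_tau(0.5, 10)` stretches the differences around 0.5 before averaging.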

As another way of realizing the similarity integration unit 16, an existing method such as the neural network disclosed in Non-Patent Document 2 may be used. In addition, when the partial similarity calculation units 15a and 15b or the similarity integration unit 16 find no character type with a markedly high similarity, the input may be rejected as having an unknown result.
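One possible reading of the rejection rule is sketched below. The text does not state how "markedly high" is judged, so the two thresholds `min_score` and `min_margin`, their default values, and the function name `decide` are all hypothetical.

```python
def decide(similarities, min_score=0.8, min_margin=0.05):
    """Return the best character, or None to reject the input.

    similarities maps each candidate character to its integrated
    similarity. The input is rejected when the best score is too low
    or not clearly ahead of the runner-up (both thresholds assumed).
    """
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return None
    best_char, best = ranked[0]
    second = ranked[1][1] if len(ranked) > 1 else float("-inf")
    if best < min_score or best - second < min_margin:
        return None
    return best_char
```

Returning `None` corresponds to reporting the result as unknown rather than forcing a choice between near-ties.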

Thus, according to the image processing apparatus of this embodiment, a plurality of character pattern groups are generated, each containing different character patterns obtained by deliberately applying partial changes to the characters (black pixels) of the character image to be recognized (for example, shifting or thickening the black pixels according to a predetermined rule). The partial similarities between the feature data of each character pattern group and the feature data of the corresponding recognition dictionary 14a or 14b are calculated, and the resulting partial similarities are integrated into one, so that character recognition accuracy can be improved.

The present invention is not limited to the above embodiment. At the implementation stage, the constituent elements may be modified without departing from the gist of the invention.
In the above embodiment, two calculation systems, the first calculation system 4 and the second calculation system 5, were illustrated. However, a third and a fourth calculation system, or still more calculation systems, may be added, and the similarities obtained by the respective calculation systems may be integrated.
In this case, each calculation system changes the character image generated by the preprocessing unit 11 into several patterns by processing that differs from system to system, and then calculates the similarity with the characters stored in the corresponding recognition dictionary 14a or 14b. The similarity integration unit 16 integrates the similarities obtained as the calculation results of the calculation systems into one.

Each component of the above embodiment may be realized by a program installed in storage such as a hard disk device of a computer.
The program may also be stored on a computer-readable storage medium such as a CD-ROM and read from the storage medium by a computer. Alternatively, the components may be distributed and stored on different computers connected via a network, and realized through communication between the computers on which the respective components run.

DESCRIPTION OF SYMBOLS: 1 ... input unit, 2 ... computer (PC), 3 ... output unit, 4 ... first calculation system, 5 ... second calculation system, 10 ... memory, 11 ... preprocessing unit, 12a, 12b ... pattern generation units, 13a, 13b ... feature extraction units, 14a, 14b ... recognition dictionaries, 15a, 15b ... partial similarity calculation units, 16 ... similarity integration unit.

Claims

An image processing apparatus comprising:
a memory storing document images;
a recognition dictionary in which characters and their feature data are stored in association with each other;
a preprocessing unit that generates a character image by performing predetermined preprocessing on the document image read from the memory;
a first pattern generation unit that generates a first character pattern group having a plurality of different character patterns by performing first image processing on the character image generated by the preprocessing unit;
a second pattern generation unit that generates a second character pattern group having a plurality of different character patterns by performing second image processing on the character image generated by the preprocessing unit;
a first feature extraction unit that extracts feature data from each of the plurality of character patterns of the first character pattern group generated by the first pattern generation unit;
a second feature extraction unit that extracts feature data from each of the plurality of character patterns of the second character pattern group generated by the second pattern generation unit;
a first similarity calculation unit that calculates a similarity between the plurality of feature data extracted by the first feature extraction unit and the character feature data stored in the recognition dictionary;
a second similarity calculation unit that calculates a similarity between the plurality of feature data extracted by the second feature extraction unit and the character feature data stored in the recognition dictionary; and
a similarity integration unit that integrates the similarity calculated by the first similarity calculation unit and the similarity calculated by the second similarity calculation unit into one by a predetermined calculation formula, and selects a character with a high similarity from the recognition dictionary using the integrated similarity.

The image processing apparatus according to claim 1, wherein
the preprocessing unit performs, as the predetermined preprocessing, at least one of partial cutout, binarization, noise removal, and contour enhancement of the character image.

The image processing apparatus according to claim 1, wherein
the first pattern generation unit performs, as the first image processing, at least one of expansion, contraction, rotation, movement, blurring, camera shake, and perspective transformation, and
the second pattern generation unit performs, as the second image processing, image processing that differs from the first image processing among expansion, contraction, rotation, movement, blurring, camera shake, and perspective transformation.

The image processing apparatus according to claim 1, wherein
the similarity integration unit integrates the plurality of similarities respectively calculated by the first similarity calculation unit and the second similarity calculation unit by using any one of the following calculation formulas: selecting the maximum similarity; selecting several similarities in descending order and averaging them; averaging all of the similarities; or emphasizing the differences between the similarities using a monotonically increasing function.

An image processing method performed by an image processing apparatus having a memory in which document images are stored, a recognition dictionary in which characters and their feature data are stored in association with each other, a preprocessing unit, a first pattern generation unit, a second pattern generation unit, a first feature extraction unit, a second feature extraction unit, a first similarity calculation unit, a second similarity calculation unit, and a similarity integration unit, the method comprising:
a step of reading a document image from the memory and performing predetermined preprocessing on the read document image to generate a character image;
a step in which the first pattern generation unit generates a first character pattern group having a plurality of different character patterns by performing first image processing on the character image;
a step in which the second pattern generation unit generates a second character pattern group having a plurality of different character patterns by performing second image processing on the character image;
a step in which the first feature extraction unit extracts feature data from each of the plurality of character patterns of the first character pattern group generated by performing the first image processing;
a step in which the second feature extraction unit extracts feature data from each of the plurality of character patterns of the second character pattern group generated by performing the second image processing;
a step in which the first similarity calculation unit calculates a similarity between the plurality of feature data extracted by the first feature extraction unit and the character feature data of the recognition dictionary;
a step in which the second similarity calculation unit calculates a similarity between the plurality of feature data extracted by the second feature extraction unit and the character feature data of the recognition dictionary; and
a step in which the similarity integration unit integrates the similarity calculated by the first similarity calculation unit and the similarity calculated by the second similarity calculation unit into one by a predetermined calculation formula, and selects a character with a high similarity from the recognition dictionary using the integrated similarity.