JP3072518B2 - Recognition result display device

Info

Publication number: JP3072518B2
Application number: JP1093632A
Authority: JP
Inventors: 啓嗣小島
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1989-04-13
Filing date: 1989-04-13
Publication date: 2000-07-31
Anticipated expiration: 2015-07-31
Also published as: JPH02271470A

Description

DETAILED DESCRIPTION OF THE INVENTION

[Industrial application field]

The present invention relates to a recognition result display device that is used in automatic explanation machines, automatic presentation machines, and the like, and that displays recognition results such as characters recognized by an optical character reader (OCR).

[Conventional technology]

Conventionally, an optical character reader performs pattern recognition on the characters and the like in a scanned document, and either shows the recognition result on a display or outputs it as voice by speech synthesis.

[Problems to be solved by the invention]

However, the conventional recognition result display device described above merely shows the recognition result on a display or outputs it as voice. It cannot combine image information such as figures and tables in the document with the recognition result, so it cannot display or speak the recognition result while showing those figures and tables on the display in step with the flow of the document's body text.

As a result, the contents of the document are difficult to grasp and cannot be conveyed efficiently to another party.

An object of the present invention is to provide a recognition result display device that combines the recognition result with image information so that the contents of a document are easy to grasp and can be conveyed efficiently to another party.

[Means for solving the problem]

To achieve the above object, the present invention is characterized in that, while the recognition result is being output, an image is displayed in response to a keyword in the recognition result.

[Action]

In a recognition result display device configured as above, while the recognition result is shown on the display or produced as voice, an image is shown on the display in response to a keyword such as "figure" or "table" in the recognition result.

[Example]

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram of an example of the recognition result display device of the present invention.

The device shown in FIG. 1 comprises a scanner 1 that reads a document, a central processing unit 2 that controls the overall processing, a ROM 3 that stores the program for optical character reading, a ROM 4 that stores the program for speech synthesis, an original image memory 5 that stores the original image read by the scanner 1, a RAM 6 that stores coordinate values in the original image together with the additional information of the text areas and image areas, a dictionary 7 for recognition, a keyword table 8, a voice generator 9, and a data storage RAM 10.

Next, the operation of a device configured in this way is described using the flowchart of FIG. 2.

In step S1 of FIG. 2, the scanner 1 reads an image from the document and stores it in the original image memory 5 as the original image. In step S2, the text areas, where characters are present, and the image areas, where figures, tables, photographs, and the like are present, are identified in the original image stored in the original image memory 5. In step S3, the image and information of each area are saved in the RAM 6.

The area identification in step S2, that is, the division into areas, can be done in two ways: automatically or manually.

When areas are identified automatically, the state of the image, for example the amount of connected black pixels, is examined to decide whether an area is a text area or an image area. For an area judged to be an image area, a keyword KW such as those shown in the example of FIG. 3 must be searched for, with reference to the keyword table 8, within the area or around it, and the combination of the keyword KW and the character that follows it must be attached to the image as information (FIG. 1, Table 1, etc.). A function such as asking the user is also needed for the case where no keyword can be found.
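The automatic path described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not say how the "amount of connected black pixels" is measured or thresholded, so the longest horizontal black run and the `run_threshold` value are hypothetical, as are the function names and the `KEYWORDS` list standing in for the keyword table 8.

```python
import re

# Hypothetical stand-in for keyword table 8 (FIG. 3 is not reproduced in the text).
KEYWORDS = ["Fig.", "Figure", "Table", "Photo"]


def longest_black_run(bitmap):
    """Return the longest horizontal run of black pixels (1s) in a binary bitmap."""
    longest = 0
    for row in bitmap:
        run = 0
        for px in row:
            run = run + 1 if px else 0
            longest = max(longest, run)
    return longest


def classify_area(bitmap, run_threshold=8):
    """Classify an area as 'image' when black pixels are connected over long runs.

    Assumption: characters give short runs, figures and table rules give long ones.
    """
    return "image" if longest_black_run(bitmap) >= run_threshold else "text"


def find_label(nearby_text, keywords=KEYWORDS):
    """Find a keyword plus the number that follows it, e.g. 'Fig. 1'.

    Returns None when no keyword is found; the device would then
    have to ask the user for the label.
    """
    for kw in keywords:
        m = re.search(re.escape(kw) + r"\s*(\d+)", nearby_text)
        if m:
            return f"{kw} {m.group(1)}"
    return None
```

A caller would run `classify_area` on each candidate area and, for areas classified as images, pass the text recognized in or around the area to `find_label`, falling back to a user prompt when it returns `None`.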

When areas are identified manually, on the other hand, the user specifies whether an area is a text area or an image area with a mouse, a cursor, or the like. For example, the user encloses the image by specifying the upper-left and lower-right positions of a rectangular area, thereby specifying the area. The user is then asked to enter the information corresponding to the image (FIG. 1, Table 1, etc.).
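The manual path above amounts to building a labelled rectangle from two corner points. The following Python sketch shows one way to represent that; the `Area` class, its field names, and the helper functions are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class Area:
    left: int
    top: int
    right: int    # exclusive
    bottom: int   # exclusive
    kind: str     # "text" or "image"
    label: str = ""  # user-entered information such as "Fig. 1"


def area_from_corners(p1, p2, kind, label=""):
    """Build an Area from two clicked corner points given as (x, y) tuples.

    The corners are normalised, so the order of the two clicks does not matter.
    """
    (x1, y1), (x2, y2) = p1, p2
    return Area(min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2), kind, label)


def crop(image, area):
    """Cut the rectangle out of a row-major image (a list of rows)."""
    return [row[area.left:area.right] for row in image[area.top:area.bottom]]
```

Normalising the corners is a design choice made here for robustness; the text itself only mentions specifying the upper-left and lower-right positions.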

FIG. 4 shows a specific example of the area identification processing described above. The original image OG has been divided into a text area image TX and an image area image IM, and the information I₁ and I₂, "FIG. 1" and "FIG. 2", has been attached to the images IM₁ and IM₂ of the image area image IM, respectively.
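The result of the division described for FIG. 4 could be held in a structure like the Python sketch below. The class and field names are hypothetical; the text only states that each area's image and information are saved in the RAM 6.

```python
from dataclasses import dataclass, field


@dataclass
class ImageArea:
    pixels: list  # image data of one figure, table, or photograph
    info: str     # attached information, e.g. "FIG. 1"


@dataclass
class SegmentedOriginal:
    text_areas: list = field(default_factory=list)   # corresponds to TX
    image_areas: list = field(default_factory=list)  # corresponds to IM

    def image_for(self, info):
        """Return the image area whose attached information equals `info`, or None."""
        for area in self.image_areas:
            if area.info == info:
                return area
        return None
```

With IM₁ and IM₂ stored as `ImageArea(..., "FIG. 1")` and `ImageArea(..., "FIG. 2")`, a later lookup by label is a single call to `image_for`.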

After the area identification processing has been performed in this way, the text area is recognized using the dictionary 7 in steps S4, S5, and S6. In the recognition processing of steps S4 to S6, keywords are searched for while recognition proceeds one unit at a time. In step S4, recognition is performed in some unit, for example word by word or line by line, and the recognition result is displayed, or it is synthesized into speech according to the ROM 4 and output as voice from the voice generator 9. In step S5, it is determined whether the recognition result contains a keyword registered as shown in FIG. 3. When a keyword is detected, the combination pattern formed by the keyword and what follows it, such as a number, is compared with the information attached to the stored images, and the image whose information matches is shown on the display in step S6. It is reasonable to keep an image on the display until the next image is shown.
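Steps S4 to S6 can be sketched as a simple loop. The Python code below is a minimal illustration, assuming that recognition has already produced a sequence of text units and that labels take the form "keyword + number"; the function name `present`, the callbacks, and the `KEYWORDS` entries are hypothetical.

```python
import re

KEYWORDS = ["Fig.", "Table"]  # hypothetical keyword table entries


def present(recognized_units, labelled_images, show_text, show_image,
            keywords=KEYWORDS):
    """Output recognized text unit by unit and switch the image on keyword hits.

    recognized_units: recognized words or lines in reading order (S4 output).
    labelled_images: dict mapping a label such as "Fig. 1" to stored image data.
    show_text / show_image: callbacks standing in for display or voice output.
    Returns the label of the image left on the display, or None.
    """
    pattern = re.compile(
        "(" + "|".join(re.escape(k) for k in keywords) + r")\s*(\d+)"
    )
    current = None  # label of the image currently on the display
    for unit in recognized_units:
        show_text(unit)                        # S4: display or speak the result
        for kw, num in pattern.findall(unit):  # S5: look for keyword + number
            label = f"{kw} {num}"
            if label in labelled_images and label != current:
                show_image(labelled_images[label])  # S6: show the matching image
                current = label  # stays up until the next image replaces it
    return current
```

Because matching is done per unit, a keyword and its number must fall within the same unit to be detected; with word-by-word recognition a real implementation would need to look ahead to the next word.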

In this way, the present embodiment displays the recognition result, or synthesizes it into speech and outputs it as voice, while showing images such as figures and tables on the display in step with the flow of the body text. When applied to an automatic explanation machine, it therefore allows the contents of a document to be grasped efficiently and conveyed to the other party.

[Effect of the invention]

As described above, according to the present invention, an image is displayed in response to a keyword in the recognition result while the recognition result is being output. When the invention is applied to an automatic explanation machine, for example, figures and tables can be viewed on the display in step with the flow of the body text, so the contents of the document are easy to grasp and can be conveyed efficiently to the other party.

[Brief description of the drawings]

FIG. 1 is a block diagram of an example of the recognition result display device of the present invention, FIG. 2 is a flowchart showing an example of the processing of the device shown in FIG. 1, FIG. 3 is a diagram showing keywords, and FIG. 4 is a diagram for explaining a specific example of the area identification processing.

1: scanner, 2: central processing unit, 3, 4: ROM, 5: original image memory, 6: RAM, 7: dictionary, 8: keyword table, 9: voice generator, 10: data storage RAM

Claims

(57) [Claims]

1. A recognition result display device comprising: keyword registration means in which a plurality of types of area identification keywords are registered in advance; area identification means for identifying a text area and an image area in an input image; association means for detecting, for an area identified as an image area, whether any of the plurality of types of area identification keywords registered in the keyword registration means is present within the area or at any position around it and, when an area identification keyword is detected, associating the detected area identification keyword with the image information of the image area; recognition processing means for performing character recognition processing on the text area identified by the area identification means; and display means for displaying the character recognition result of the text area processed by the recognition processing means and the image information of the image area, wherein, while the character recognition result of the text area from the recognition processing means is being displayed, when the character recognition result of the text area matches an area identification keyword, the image information associated with that area identification keyword by the association means is displayed.

2. A recognition result display device comprising: keyword registration means in which a plurality of types of area identification keywords are registered in advance; area identification means for identifying a text area and an image area in an input image; association means for detecting, for an area identified as an image area, whether any of the plurality of types of area identification keywords registered in the keyword registration means is present within the area or at any position around it and, when an area identification keyword is detected, associating the detected area identification keyword with the image information of the image area; recognition processing means for performing character recognition processing on the text area identified by the area identification means; voice output means for synthesizing into speech and outputting the character recognition result of the text area processed by the recognition processing means; and display means for displaying the image information of the image area, wherein, while the character recognition result of the text area from the recognition processing means is being output as voice, when the character recognition result of the text area matches an area identification keyword, the image information associated with that area identification keyword by the association means is displayed.

3. A recognition result display method comprising: identifying a text area and an image area in an input image by area identification means; detecting, for an area determined to be an image area by the area identification means, whether any of a plurality of types of area identification keywords registered in advance in keyword registration means is present within the area or at any position around it; when an area identification keyword is detected, associating the detected area identification keyword with the image information of the image area; performing character recognition processing, by recognition processing means, on the area determined to be a text area by the area identification means; and, while displaying the character recognition result of the text area, displaying the image information associated with an area identification keyword when the character recognition result matches that area identification keyword.

4. A recognition result display method comprising: identifying a text area and an image area in an input image by area identification means; detecting, for an area determined to be an image area by the area identification means, whether any of a plurality of types of area identification keywords registered in advance in keyword registration means is present within the area or at any position around it; when an area identification keyword is detected, associating the detected area identification keyword with the image information of the image area; performing character recognition processing, by recognition processing means, on the area determined to be a text area by the area identification means; and, while outputting the character recognition result of the text area as voice, displaying the image information associated with an area identification keyword when the character recognition result matches that area identification keyword.