JP2008040753A

JP2008040753A - Image processor and method, program and recording medium

Info

Publication number: JP2008040753A
Application number: JP2006213612A
Authority: JP
Inventors: Koji Kobayashi; 幸二小林; Yukiko Yamazaki; 由希子山崎; Hirohisa Inamoto; 浩久稲本
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2006-08-04
Filing date: 2006-08-04
Publication date: 2008-02-21

Abstract

PROBLEM TO BE SOLVED: To improve the visibility of document images in a list display of search results, to make it possible to find the target image from the list display alone, and to improve the operability of keyword search in a document image DB. SOLUTION: When a server device 110 receives a search key, a keyword search processing unit 119 performs keyword search processing on the text information in an image information DB 117. A display screen control processing unit 121 displays the thumbnail, the text information, and a representative partial image of each hit page. COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to a technique for displaying search results in text search. More specifically, it relates to an image processing apparatus, method, program, and recording medium having a search function that performs a text search according to a search keyword entered by a user and a function that creates a display screen showing the search results. The technique is suitable for, for example, copier-based multifunction peripherals, file servers, and image processing programs.

Devices such as electronic filing systems, which digitize paper documents using an input device such as a scanner, have existed for some time, but they were used almost exclusively in business settings that handle large volumes of paper documents. In recent years, falling scanner prices, the spread of MFPs (MultiFunction Printers) with built-in scan functions, and legislation such as the e-Document Law have made the ease of handling and convenience of digitized documents widely recognized even in ordinary offices, and opportunities to scan and digitize paper documents are increasing. The use of image DBs, which centrally manage digitized document image data, photographic image data, and document data created by applications on a PC or the like in a database (hereinafter, DB), is also increasing. For example, an image DB may be built for ease of management and search even when the original paper documents are retained.

Such image DBs range from large-scale systems in which a server device is installed and accessed by many people to personal uses in which a DB is built on an individual's PC. Recent MFPs have a function for storing documents on a built-in HDD, and in some cases an image DB is built on an MFP. Some image DBs provide a search function for retrieving a desired image from a large number of images. The search functions that are currently mainstream generally perform full-text search, or synonym or concept search, using as keywords the character recognition results obtained by OCR (Optical Character Reader) processing or the text information extracted from document data.

The results of a document image search performed in this way are usually presented to the user through some display means. There are three approaches to such display (here meaning the software that creates the display screen, not the hardware):
・Method 1: display document information such as the document file name and creation date as a list in text form
・Method 2: display the document image data that hit the keyword as a list of thumbnails (reduced images)
・Method 3: display the text surrounding the keyword hit as text information
Method 1 is used in devices with a small display screen and low processing capability, Method 2 is used in current document image DBs and the like, and Method 3 is used in Internet search and the like.

With the conventional display methods described above, it is difficult for the user to find the desired document image from the search result screen alone. As a result, the user has to open each document image that hit the search in a viewer or similar screen and look for the target image, which makes searching cumbersome.

For example, with Method 1, a document cannot be found unless document information such as the file name or creation date is known in advance, so a user who does not know that information has to check each document in a viewer. With Method 2, document image data with a distinctive appearance can be found, but when many of the hit document images look alike, for example when a large number of fixed-form or nearly fixed-form document images such as papers, business forms, or patents are hit, the user still ends up checking them in a viewer. With Method 3, there is no information such as a thumbnail that conveys the appearance of the image, so intuitive judgment is difficult; each piece of text has to be read, which is tedious; and a small amount of text is often not enough to decide whether the image is the one being sought, so the user frequently ends up checking in a viewer (increasing the amount of displayed text makes the list harder to scan). These problems become more pronounced as the image DB grows and the number of keyword search hits increases.

The search system of Patent Document 1 addresses this problem. In Patent Document 1, a reduced simplified image is generated and displayed in which text objects matching the search key are marked so that they can be distinguished from other parts. The frequency with which the text object appears can then be seen at a glance, how the object is used in the document can be inferred from where it appears within paragraphs, and the content of the document can be determined.

Patent Document 1: JP 2004-157668 A

With the above method, however, the appearance of the document image and the position of the keyword are visible, but it is difficult to grasp the content of the document image. Document image search falls broadly into two cases: searching for a known document that the user wants to locate, and searching for unknown documents in order to obtain information. The above method cannot be used to search for unknown documents. In addition, when, for example, a page of a document image consisting almost entirely of characters is displayed by the above method, it is difficult to judge whether it is the image being sought.

The present invention has been made in view of the above problems. An object of the present invention is to provide an image processing apparatus, method, program, and recording medium that, in an image processing apparatus having a function of performing a keyword search of a document image DB based on a keyword entered by a user and generating a display screen of the search results, improve the visibility of document images in the list display of search results, make it possible to find the target image from the list display alone, and thereby improve the operability of keyword search in the document image DB.

The present invention is an image processing apparatus having a function of searching document image data stored in an image database. Its principal feature is that it comprises image dividing means for dividing a document image into partial images, keyword search means for receiving a keyword as a search key and searching the document images page by page, display screen control means for generating a display screen that displays one or more document images hit by the keyword search, and partial image selecting means for selecting from the divided partial images, and that the display screen arranges, for each page, a thumbnail showing the appearance of the page, text information containing the hit keyword, and the partial image selected by the partial image selecting means.

According to the present invention, the thumbnail of the document image, the text information that hit the search key, and a representative partial image are displayed together in the list display of search results. The thumbnail makes the appearance of the image easy to grasp, the text information makes the document content around the search key easy to grasp, and the representative partial image shows details that cannot be made out in the thumbnail. Consequently, whether the user is searching for a known document image from a vague memory or searching document images to obtain unknown information, the display helps the user understand the content of the hit document images, makes it possible to find the target image from the list display alone, and improves the operability of keyword search in the document image DB.

According to the present invention, the partial image selected for display is one showing the title portion, which is likely to express the content of the text concisely, so the document content is easy to understand.

According to the present invention, the partial image selected for display is one other than the body text region, such as a photograph, figure, or table, so the document content can be grasped intuitively.

According to the present invention, the partial image hit by the search key is selected as the representative partial image, so by comparing it with the thumbnail the user can easily see which part of the document image the hit corresponds to. In addition, because the appearance of the characters (font style, font size, and so on) can be seen in the actual image, the document content is easy to grasp.

According to the present invention, the partial image selected for display is one near the partial image hit by the keyword search, so a partial image related to the search keyword can be selected, which makes the content of the document image easier to understand in terms of the search keyword.

According to the present invention, a large partial image is selected as the representative partial image, so a partial image that concisely represents the content of the document image can be selected, which makes the content of the document image easier to understand.

According to the present invention, the partial image hit by the search key is searched for keywords that refer to objects such as figures, photographs, or tables, and the representative partial image is selected based on those keywords. A partial image related to the search keyword can therefore be selected, which makes the content of the document image easier to understand in terms of the search keyword.
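Several of the selection strategies described in the preceding paragraphs (title portion, non-body region, the hit object itself, an object near the hit, and a large object) can be sketched as small functions over the objects of one page. This is only an illustration under assumptions: the `PageObject` type, its field names, the attribute labels, and the use of center-to-center distance for "near" are all hypothetical and are not specified by the source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageObject:
    """Hypothetical record for one partial image (object) on a page."""
    object_id: int
    attribute: str   # assumed labels: "title", "text", "photo", "figure", "table"
    x: int
    y: int
    width: int
    height: int
    hit: bool = False  # True if the search key matched this object's text

    @property
    def area(self) -> int:
        return self.width * self.height

    @property
    def center(self):
        return (self.x + self.width / 2, self.y + self.height / 2)


def select_title(objects: List[PageObject]) -> Optional[PageObject]:
    """Strategy: show the title portion."""
    return next((o for o in objects if o.attribute == "title"), None)


def select_largest_non_body(objects: List[PageObject]) -> Optional[PageObject]:
    """Strategy: show the largest photo, figure, or table."""
    candidates = [o for o in objects if o.attribute in ("photo", "figure", "table")]
    return max(candidates, key=lambda o: o.area, default=None)


def select_hit(objects: List[PageObject]) -> Optional[PageObject]:
    """Strategy: show the object that the search key hit."""
    return next((o for o in objects if o.hit), None)


def select_nearest_to_hit(objects: List[PageObject]) -> Optional[PageObject]:
    """Strategy: show the object closest to the hit (assumed: center distance)."""
    hit = select_hit(objects)
    others = [o for o in objects if o is not hit]
    if hit is None or not others:
        return None
    hx, hy = hit.center
    return min(others, key=lambda o: (o.center[0] - hx) ** 2 + (o.center[1] - hy) ** 2)
```

Keeping each strategy as a separate function makes it easy to try them in a fixed priority order or to let the client choose one.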

Embodiments of the invention are described in detail below with reference to the drawings.
Embodiment 1:
FIG. 1 shows the system configuration of Embodiment 1 of the present invention. In FIG. 1, 100 is a client device such as a personal computer (hereinafter, PC) or a mobile terminal such as a PDA or mobile phone. 101 is a display device such as a monitor; 102 is an application program that interprets user instructions, communicates with the server 110, and controls the display device 101; 103 is an input device such as a keyboard or mouse through which the user enters instructions; and 104 is an external communication path such as a LAN or the Internet.

110 is a server device that performs image classification in response to commands from the client and outputs the classification results to the client 100. 111 is an interface (hereinafter, I/F) to the external communication path 104; 112 is registered image data to be registered in the image DB 120; 113 is a thumbnail generation processing unit that scales the registered image 112 to a predetermined size or smaller to generate a thumbnail image; 114 is an image division processing unit that performs layout analysis on the registered image 112 and divides the image to generate partial images (hereinafter, objects); 115 is an attribute identification processing unit that identifies the attributes of the objects generated by the image division processing unit 114; 116 is a character code extraction processing unit that extracts character codes from the object images; and 117 is an image information DB that stores information on each item of document image data registered in the image DB 120. This information includes, for example, the file name and creation date of the registered image data; link information to the image data, thumbnail image data, and object image data (link information being, for example, the ID or file name assigned uniquely to each item of data when it is stored in the image DB 120); the attribute of each object; the position of each object within the page; and the text information (character codes) contained in each object.

118 is a search key consisting of text information specified and entered at the client 100; 119 is a keyword search processing unit that searches the information stored in the object information DB for objects and document images corresponding to the search key 118 and outputs the search results; 120 is an image DB that stores the image data of the registered image 112, the thumbnail image data of the registered image 112, and the object images of the registered image 112 generated by the image division processing unit 114; 121 is a display screen control processing unit that generates a display screen for the client 100 according to the search results of the keyword search processing unit 119 and the selection results of the object selection processing unit; 122 is the display screen data displayed on the display device 101 of the client 100; and 123 is an object selection processing unit that selects the objects to be displayed on the client 100 according to the search results of the keyword search processing unit 119. In the figure, dotted lines represent the flow of data during image registration, and solid lines represent the flow of data when keyword search processing is performed and the display screen is generated.

FIG. 2 shows the configuration of the server device 110. In FIG. 2, 201 is a CPU that performs computation and processing according to programs; 202 is a volatile memory used as a work area that temporarily stores data such as program code and coded image data; and 203 is a hard disk (hereinafter, HDD) for saving and storing image data, programs, and the like, which holds the image DB 120 and the image information DB 117. 204 is a video memory serving as a data buffer for display on the monitor 205; image data written to the video memory 204 is displayed on the monitor 205 at regular intervals. 206 is an input device such as a mouse or keyboard; 207 is an external I/F that sends and receives data via the external communication path 104, such as the Internet or a LAN; and 208 is a bus connecting the components.

In this embodiment, the server device 110 is a server computer, and processing such as image search is implemented in software. That is, the processing in the server is carried out by an application program (not shown). Embodiments of the present invention are not limited to this: the processing may be performed by hardware within a device such as an MFP, or the configuration of FIG. 1 may be built within a single device such as a PC or MFP, without adopting a server-client configuration.

The operation at image registration and the display of search results during an image search in this embodiment are described below.

FIG. 3 is a flowchart of the image registration operation of Embodiment 1. The image registration operation is described with reference to FIG. 1 (where the broken lines indicate the operation during registration) and FIG. 3.

In step S001, the user, via the application program 102 on the client device 100, instructs the server device 110 to register image data and specifies the registered image data 112 to be registered.

In step S002, the registered image data 112 is input to the server device 110 via the external communication path 104 together with file information such as the file name and creation date, and is registered in the image DB 120, with an ID number assigned, by way of the external I/F 111. At the same time, the thumbnail generation processing unit 113 scales the registered image 112 to generate a thumbnail image of a predetermined size or smaller and registers it in the image DB 120 with an ID number assigned. When the registered image data 112 has multiple pages, a thumbnail is generated for each page.
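The scaling to "a predetermined size or smaller" in step S002 can be illustrated with a small size calculation. This is a sketch under assumptions: the 160-pixel limit, the aspect-ratio-preserving fit, and the rule of never enlarging small pages are illustrative choices, not values given in the source, and the function names are hypothetical.

```python
from typing import List, Tuple


def thumbnail_size(width: int, height: int, max_side: int = 160) -> Tuple[int, int]:
    """Return (w, h) that fits inside max_side x max_side.

    The aspect ratio is preserved and images already small enough are
    left unchanged (scale is capped at 1.0). max_side is an assumed limit.
    """
    if width <= 0 or height <= 0:
        raise ValueError("page dimensions must be positive")
    scale = min(1.0, max_side / width, max_side / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def thumbnail_sizes_per_page(page_sizes: List[Tuple[int, int]],
                             max_side: int = 160) -> List[Tuple[int, int]]:
    """One thumbnail size per page, as for multi-page registered image data."""
    return [thumbnail_size(w, h, max_side) for w, h in page_sizes]
```

For example, an A4 scan of 2480 x 3508 pixels would map to a 113 x 160 thumbnail with the assumed limit.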

In step S003, the registered image data 112 is input to the image division processing unit 114, which divides it into regions of uniform attribute by known layout analysis processing and generates object image data. The coordinate data of each object within the image data is also extracted at this point. The generated object image data is registered in the image DB 120 with an ID number assigned to each object image, and is also output to the attribute identification processing unit 115.

In step S004, the generated object image data is input, object image by object image, to the attribute identification processing unit 115, and the attribute of each object is identified by known layout analysis processing.

The layout analysis processing in steps S003 and S004 is of the kind used as preprocessing for OCR and the like; various methods exist, and any of these known methods may be used. For example, in one technique the background color of the document image is identified, pixels outside the background region are extracted from the document image using the background color, the pixels are merged into connected components, and the connected components are classified into predetermined regions using at least their shape features, thereby identifying character regions and photograph regions (see JP 2001-297303 A). As another example of character region identification, there is a technique that performs adaptive binarization and then identifies character regions using the shapes of circumscribed rectangles (see JP H07-73271 A). There is also a technique that analyzes the adjacency of the black regions of the input image and separates them into rectangles (image division processing), then identifies the character, photograph, figure, and table regions of the input image based on the sizes of the rectangles and the distribution density of the black regions (attribute identification processing) (see JP H07-221968 A).
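The building blocks mentioned above (connected components, circumscribed rectangles, and black-pixel density) can be sketched on a tiny binary image. This is a generic illustration, not the method of any of the cited publications: the 4-connectivity, the list-of-lists image format, and the function names are assumptions.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def connected_component_boxes(grid: List[List[int]]) -> List[Box]:
    """Circumscribed rectangles of the 4-connected components of 1-pixels.

    grid is a binarized image (1 = black, 0 = background). Boxes are
    returned in raster-scan order of each component's first pixel.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes: List[Box] = []
    for y in range(rows):
        for x in range(cols):
            if grid[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited black pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            x_min = x_max = x
            y_min = y_max = y
            while queue:
                cx, cy = queue.popleft()
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < cols and 0 <= ny < rows
                            and grid[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes


def black_density(grid: List[List[int]], box: Box) -> float:
    """Fraction of black pixels inside a rectangle (a simple region feature)."""
    x0, y0, x1, y1 = box
    total = (x1 - x0 + 1) * (y1 - y0 + 1)
    black = sum(grid[y][x] for y in range(y0, y1 + 1) for x in range(x0, x1 + 1))
    return black / total
```

A real layout analyzer would go on to merge nearby rectangles into regions and classify them by size and density; that step is omitted here.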

By using (or combining) the conventional techniques above, the image can be divided into regions (objects) by attribute, such as character, photograph, figure, and table regions, and the attribute of each region can be determined. In doing so, title regions are identified based on the position and size of the character region, the size of the characters, and so on, and are treated as distinct from ordinary character regions.
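The source says only that title regions are identified from the position and size of the character region and the size of the characters. One possible heuristic is sketched below; the thresholds (`top_fraction`, `size_ratio`), the comparison against the median character height, and the `TextRegion` type are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TextRegion:
    """Hypothetical character region produced by layout analysis."""
    x: int
    y: int
    width: int
    height: int
    char_height: float  # estimated character height in pixels


def find_title(regions: List[TextRegion], page_height: int,
               top_fraction: float = 0.3, size_ratio: float = 1.5) -> Optional[int]:
    """Return the index of the region judged to be the title, or None.

    Assumed rule: a title starts in the top part of the page and its
    characters are clearly larger than the median character height.
    """
    if not regions:
        return None
    heights = sorted(r.char_height for r in regions)
    median = heights[len(heights) // 2]
    candidates = [
        i for i, r in enumerate(regions)
        if r.y < page_height * top_fraction and r.char_height >= median * size_ratio
    ]
    if not candidates:
        return None
    # Among candidates, prefer the one with the largest characters.
    return max(candidates, key=lambda i: regions[i].char_height)
```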

As another approach to identifying the attributes of the divided objects, features such as the histogram or frequency characteristics of each divided region may be obtained and fed to a pattern recognition method, such as a neural network or support vector machine, that has been trained in advance on the relationship between features and attributes. It is also preferable to apply preprocessing such as skew correction and show-through removal to the input image before layout analysis in order to improve its accuracy.

FIG. 4(2) shows an example of the result of layout analysis performed on the document of FIG. 4(1).

In step S005, the object image data is input to the character code extraction processing unit 116, and if an object contains characters, character recognition such as OCR processing is performed to extract the character codes. OCR processing is required when the input registered image data is bitmap data such as a scanned image; when it is document data or the like that already contains character codes, the character codes can simply be extracted from the data as they are.
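The branch in step S005 (use embedded character codes when they exist, otherwise run OCR on the bitmap) can be sketched as follows. The dictionary keys `has_characters`, `embedded_text`, and `bitmap` are hypothetical, and the OCR engine is passed in as a callable because the source does not name one.

```python
from typing import Callable, Optional


def extract_text(obj: dict, ocr: Callable[[bytes], str]) -> Optional[str]:
    """Return the text of one object, or None if it contains no characters.

    obj is a hypothetical record with these assumed keys:
      "has_characters": bool, set by attribute identification
      "embedded_text":  str, present when the source document carried codes
      "bitmap":         bytes, the object image
    """
    if not obj.get("has_characters"):
        return None
    embedded = obj.get("embedded_text")
    if embedded is not None:
        # Document data with character codes: take them as they are.
        return embedded
    # Bitmap data such as a scan: character recognition is required.
    return ocr(obj["bitmap"])
```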

In step S006, the following image information data generated and extracted in steps S002 to S005 is registered in the image information DB 117.
・File name, creation date
・Image data ID
・Thumbnail image data ID
・Object image data ID
・Object coordinates
・Object attributes
・Text information (character codes) extracted from the object
Using a general-purpose RDB (relational database) for the image information DB 117 makes it simple to register, manage, and search the above information. As long as the functions described above are provided, the image DB 120 and the image information DB 117 may be stored in the same DB, for example by using a language such as XML (eXtensible Markup Language) to build a hierarchical data structure, or may be stored as separate DBs on different servers. For image registration, image data may also be registered in the server device 110 directly from an image input device such as a scanner or digital camera.
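Since the text suggests a general-purpose relational database for the image information DB 117, the listed items can be laid out as two tables. The sketch below uses SQLite through Python's standard `sqlite3` module purely as an example; the table names, column names, and the split into a page table and an object table are assumptions, not a schema given in the source.

```python
import sqlite3

# Hypothetical schema: one row per page image, one row per object on it.
SCHEMA = """
CREATE TABLE page (
    image_id     INTEGER PRIMARY KEY,
    file_name    TEXT NOT NULL,
    created      TEXT NOT NULL,
    thumbnail_id INTEGER NOT NULL
);
CREATE TABLE page_object (
    object_id INTEGER PRIMARY KEY,
    image_id  INTEGER NOT NULL REFERENCES page(image_id),
    x         INTEGER NOT NULL,
    y         INTEGER NOT NULL,
    width     INTEGER NOT NULL,
    height    INTEGER NOT NULL,
    attribute TEXT NOT NULL,
    text      TEXT
);
"""


def open_image_info_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the tables in a new database (in memory by default)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def register_page(conn, image_id, file_name, created, thumbnail_id, objects):
    """Register one page and its objects in a single transaction."""
    with conn:
        conn.execute("INSERT INTO page VALUES (?, ?, ?, ?)",
                     (image_id, file_name, created, thumbnail_id))
        conn.executemany(
            "INSERT INTO page_object VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            [(o["object_id"], image_id, o["x"], o["y"], o["width"],
              o["height"], o["attribute"], o.get("text")) for o in objects],
        )
```

The IDs stored here play the role of the link information to the image DB 120: the image data itself stays in that DB and is looked up by ID.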

FIG. 5 is a flowchart of the image search operation in Embodiment 1. The image search operation is described with reference to FIG. 1 and FIG. 5.

In step S101, the user uses the application program 102 on the client device 100 to specify a search key and instruct the server device 110 to perform an image search. The search key here is assumed to be a keyword such as a word. For this instruction, a search key specification screen such as the one shown in FIG. 6 is displayed on the display device 101 of the client device 100. In FIG. 6, 301 is a keyword input window, 302 is a search button that starts the search, and 303 is a cancel button that cancels the search.

The user enters a search keyword in the keyword input window 301 using the keyboard or the like of the input device 103, and clicks the search button 302 using a pointing device such as the mouse of the input device 103, whereupon an image classification instruction is transferred to the server 110 via the external communication path 104.

In step S102, when the server device 110 receives the search key together with the search instruction, the keyword search processing unit 119 performs keyword search processing on the object text information stored in the image information DB 117. Keyword search processing may retrieve only exact matches for the keyword specified as the search key, may include synonyms of the keyword and associated words, or may take a natural-language sentence as the search key and search on the words extracted from it. The present invention may use any of these methods, and the method may be made selectable on the client device 100 side.
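Two of the search variants named in step S102, exact keyword match and synonym-expanded match, can be sketched over a list of object records. The record keys, the case-insensitive substring matching, and the shape of the synonym table are assumptions for illustration; natural-language queries are not covered.

```python
from typing import Dict, List, Optional


def search_objects(objects: List[dict], keyword: str,
                   synonyms: Optional[Dict[str, List[str]]] = None) -> List[dict]:
    """Return one hit per object whose text contains the keyword.

    objects:  hypothetical records with "object_id", "image_id", "text".
    synonyms: optional table mapping a keyword to extra terms to try;
              when omitted, only the keyword itself is matched.
    """
    terms = [keyword] + list((synonyms or {}).get(keyword, []))
    hits = []
    for obj in objects:
        text = (obj.get("text") or "").lower()
        for term in terms:
            if term.lower() in text:
                hits.append({
                    "object_id": obj["object_id"],
                    "image_id": obj["image_id"],
                    "term": term,  # which term actually matched
                })
                break  # one hit per object is enough
    return hits
```

Passing or omitting `synonyms` corresponds to letting the client choose between the two search modes.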

The keyword search processing unit 119 outputs the following information about the objects hit by the search from the image information DB 117 to the display screen control processing unit 121.
・File name, creation date
・Image data ID
・Thumbnail image data ID
・Object image data ID
・Object coordinates
・Object attribute
・Text information (character codes) extracted from the object

In step S103, the object selection processing unit 123 selects the objects to display from among the objects contained in each display target page (a page containing an object hit by the keyword search), according to the search result of the keyword search processing unit 119. The selection method is described in detail later.
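The information passed from the keyword search processing unit to the display screen control processing unit can be pictured as one record per hit object. The sketch below is only an illustration: the field names, the `(x, y, width, height)` coordinate layout, and the assumption that one image data ID corresponds to one page are not stated in the source.

```python
from dataclasses import dataclass


@dataclass
class HitRecord:
    file_name: str
    creation_date: str
    image_data_id: int            # assumed here to identify one page
    thumbnail_image_data_id: int
    object_image_data_id: int
    coordinates: tuple            # assumed layout: (x, y, width, height)
    attribute: str                # e.g. "title", "body", "figure"
    text: str                     # character-code text extracted from the object


def display_target_pages(hits):
    """Return the IDs of pages containing at least one hit, in first-hit order."""
    pages = []
    for hit in hits:
        if hit.image_data_id not in pages:
            pages.append(hit.image_data_id)
    return pages


hits = [
    HitRecord("a.tif", "2006-08-04", 10, 110, 1001, (0, 20, 50, 40), "body", "attribute ..."),
    HitRecord("a.tif", "2006-08-04", 10, 110, 1002, (0, 70, 50, 40), "body", "... attribute"),
    HitRecord("b.tif", "2006-08-04", 11, 111, 1003, (0, 20, 50, 40), "body", "attribute"),
]
pages = display_target_pages(hits)
```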

In step S104, the display screen control processing unit 121 determines the layout of the search result display screen 122 according to the search result of the keyword search processing unit 119 and the selection result of the object selection processing unit 123. It then reads the image data to be displayed, the thumbnails of the image data, and the object image data from the image DB 120, reads the text information hit by the keyword search from the image information DB 117, generates the search result display screen 122, and transmits it from the external I/F 111 to the client 100 via the external communication path 104.

There are various methods for creating such a display screen and for communicating between server and client. A commonly used approach is to run the server device 110 as a Web server and use World Wide Web based technology. In that case the display screen 122 is described in HTML (HyperText Markup Language), and a general Web browser can be used as the application 102.

FIG. 7 shows an example of the search result display screen. The display screen of FIG. 7 is referred to as the page detail display screen. In FIG. 7, 311 is a radio button for switching the display to the page detail display screen, 312 is a radio button for selecting an ordinary thumbnail display, 313 is a frame that displays the search results, 314 is a (per-page) page display area, and 315 is a slider for scrolling the frame.

Thus, the detail display screen of the first embodiment provides a display area 314 for each page hit by the search, and the user looks for the target image on the keyword search result display screen while scrolling through these areas with the slider 315.

FIG. 8 shows an example in which the page display area 314 displays the result of searching the image shown in FIG. 4 with "attribute" as the keyword. In FIG. 8, 321 is the file name, 322 is the thumbnail image of the page of the image data hit by the search, 323 is the title extracted from the image data, 324 is the selected representative object, and 325 is the text data hit by the keyword search.

In this example, the text data 325 is character code data (hereinafter abbreviated as text data; a bitmap image of characters is referred to as a character image). The character codes are interpreted by the application 102 of the client 100, so the user can read them as characters on the display screen. This part could instead be displayed by cutting out the character image of the relevant part of the image data, but that has drawbacks: the sizes become uneven, and because it is image data, tidying its appearance so that it is easy to read is cumbersome. Text data is therefore preferable. With text data, character decoration such as emphasizing or coloring the keyword is also easy. Furthermore, the text may be linked to the actual object image data so that the object image data is displayed when the text is clicked with a pointing device such as a mouse.
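Because the display screen may be written in HTML, keyword emphasis and the link to the object image can be sketched as a small fragment generator. The function name `render_hit_text`, the choice of `<strong>` for emphasis, and the URL are assumptions made for illustration; the source only says that emphasis, coloring, and a link are possible.

```python
import html


def render_hit_text(text, keyword, object_image_url):
    """Build an HTML fragment: the hit text with the keyword emphasized,
    wrapped in a link to the object image (hypothetical URL scheme)."""
    if keyword:
        # Split on the raw keyword first, then escape each piece, so that
        # escaping never changes what counts as a keyword occurrence.
        emphasized = "<strong>" + html.escape(keyword) + "</strong>"
        body = emphasized.join(html.escape(part) for part in text.split(keyword))
    else:
        body = html.escape(text)
    href = html.escape(object_image_url, quote=True)
    return '<a href="' + href + '">' + body + "</a>"


fragment = render_hit_text("a < attribute value", "attribute", "/objects/3.png")
```

Coloring instead of bolding would only change the wrapper element, for example a `<span>` with a style attribute.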

The first embodiment shows a page-by-page display form, and keyword searches often take pages as the unit of search. For example, when the user ultimately needs a multi-page document that contains the target page, it is sufficient to first find the target image page by page and then have a means of getting from that page to the document, so in most cases the method described above poses no problem. If the user needs a multi-page document directly from the keyword search, the thumbnail displayed as the thumbnail image 322 may be the thumbnail of the first page, or the first-page thumbnails may be displayed side by side.

In step S105, the client device 100 displays the display screen 122 on the display device 101.

Next, the operation of the object selection processing unit will be described. In the first embodiment, objects are selected under the following conditions. FIG. 9 shows a flowchart of the object selection process.
・If a title exists, the title object is selected (step S201).
・From the objects other than body text and title objects, such as figures, photographs, and tables, the one object nearest to the object hit by the keyword is selected (steps S202 and S203).

For example, when the keyword is "claims", the body text 1 object in FIG. 4(2) is the hit object, and the figure object, which is its nearest neighbor, is selected.
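The selection rule of the first embodiment can be sketched as follows. The source does not define how "nearest" is measured, so this sketch assumes the Euclidean distance between bounding-box centers; the `PageObject` class, attribute names, and coordinates are likewise hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class PageObject:
    name: str
    attribute: str  # "title", "body", "figure", "photograph", or "table"
    x: float
    y: float
    width: float
    height: float

    def center(self):
        return (self.x + self.width / 2, self.y + self.height / 2)


NON_TEXT = {"figure", "photograph", "table"}


def select_objects_example1(objects, hit):
    """Select the title (if any) and the non-text object nearest to the hit."""
    selected = [o for o in objects if o.attribute == "title"][:1]  # step S201
    candidates = [o for o in objects if o.attribute in NON_TEXT]   # steps S202, S203
    if candidates:
        hx, hy = hit.center()
        selected.append(min(
            candidates,
            key=lambda o: hypot(o.center()[0] - hx, o.center()[1] - hy),
        ))
    return selected


page = [
    PageObject("title", "title", 0, 0, 100, 10),
    PageObject("body text 1", "body", 0, 20, 50, 40),
    PageObject("figure", "figure", 55, 20, 40, 40),
    PageObject("photograph", "photograph", 0, 200, 50, 50),
]
chosen = select_objects_example1(page, hit=page[1])
```

Other distance measures, such as the gap between bounding-box edges, would fit the same structure by swapping the `key` function.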

The first embodiment shows an example in which one object other than the title is selected, but more than one may be selected. In selecting neighboring objects, figure, photograph, and table objects may also be associated in advance with text information obtained from nearby text-based objects, so that those objects are selected automatically by the keyword search. In addition, although the first embodiment uses an image division method that divides the image into regions by attribute, the image may instead be divided equally, as shown by the broken lines in FIG. 10, and the divided regions displayed as representative objects. This yields a display screen like that of FIG. 11; the contents of the document image are presented less accurately, but the load on the server device is reduced because attributes no longer need to be determined.
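The equal-division alternative needs no attribute identification, only arithmetic on the page size. The sketch below computes tile rectangles for a page; the function name and the 2 x 2 grid in the usage line are assumptions, since the source does not state how many regions FIG. 10 uses.

```python
def split_equally(width, height, rows, cols):
    """Return (left, top, right, bottom) boxes that tile a width x height page.

    Integer division of the running edge positions keeps the tiles
    contiguous even when the size is not an exact multiple of the grid.
    """
    tiles = []
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols
            top = r * height // rows
            right = (c + 1) * width // cols
            bottom = (r + 1) * height // rows
            tiles.append((left, top, right, bottom))
    return tiles


tiles = split_equally(100, 200, rows=2, cols=2)
```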

As described above, this embodiment concerns an image processing apparatus that performs a keyword search of a document image DB based on a keyword entered by the user and generates a display screen of the search results. The list display of search results shows, at the same time, the thumbnail of each document image, the text information hit by the search key, and representative objects. The thumbnail makes the overall appearance of the image easy to grasp, the text information makes the document content around the search key easy to grasp, and the representative objects make visible the details that cannot be made out in the thumbnail. Consequently, whether the user is searching for a known document image from a vague memory or searching document images to obtain unknown information, the display helps the user understand the contents of the hit document images, the target image can be found from the list display alone, and the operability of keyword search in the document image DB is improved.

In selecting representative objects, the title object, which is highly likely to express the content of the text concisely, is selected, so the document content is easy to understand. Regions other than body text, such as photographs, figures, and tables, are selected, so the document content can be grasped intuitively. And an object near the object hit by the keyword search is selected, so an object related to the search keyword can be chosen, which makes it easier to understand the document image in terms of the search keyword.

Example 2:
The second embodiment differs from the first embodiment in the object selection means. The other components are the same as in the first embodiment.

In the second embodiment, objects are selected under the following conditions. FIG. 13 shows a flowchart of the object selection process of the second embodiment.
・If a title exists, the title object is selected (step S301).
・The object hit by the search key is selected (step S302).
・From the objects other than body text and title objects, such as figures, photographs, and tables, the one object of maximum size is selected (step S303).

FIG. 12 shows an example of the result of the second embodiment's processing for the image of FIG. 4. For example, when the keyword is "claims", the body text 1 object in FIG. 4(2) is the hit object, and as shown in FIG. 12 the hit object 326 is selected as image data. The photograph object 324, which has the largest size, is also selected.
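The three selection steps of the second embodiment can be sketched as below. "Maximum size" is interpreted here as the largest bounding-box area, which is an assumption; the class and attribute names are also hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PageObject:
    name: str
    attribute: str  # "title", "body", "figure", "photograph", or "table"
    width: float
    height: float


NON_TEXT = {"figure", "photograph", "table"}


def select_objects_example2(objects, hit):
    """Select the title (if any), the hit object, and the largest non-text object."""
    selected = [o for o in objects if o.attribute == "title"][:1]  # step S301
    selected.append(hit)                                           # step S302
    candidates = [o for o in objects if o.attribute in NON_TEXT]
    if candidates:                                                 # step S303
        selected.append(max(candidates, key=lambda o: o.width * o.height))
    return selected


page = [
    PageObject("title", "title", 100, 10),
    PageObject("body text 1", "body", 50, 40),
    PageObject("figure", "figure", 40, 40),
    PageObject("photograph", "photograph", 60, 50),
]
chosen = select_objects_example2(page, hit=page[1])
```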

As described above, according to this embodiment the object hit by the search key is selected as a representative display object, so by comparing it with the thumbnail the user can easily see which part of the document image the hit corresponds to. The appearance of the characters in the actual image (font style, font size, and so on) is also easy to see. In addition, because a representative object is selected according to object size, an object that concisely represents the content of the document image can be selected, making the document image easier to understand.

Example 3:
The third embodiment differs from the first embodiment in the object selection means. The other components are the same as in the first embodiment.

In the third embodiment, objects are selected under the following conditions. FIG. 14 shows a flowchart of the object selection process of the third embodiment.
・If a title exists, the title object is selected (step S401).
・The text information of the object hit by the search key is searched for keywords that indicate figure, photograph, or table objects, such as "table", "Fig", "figure", and "photograph" (step S403). If one is found (Yes in step S404), a keyword search using the found keyword is performed on the captions of the figure, photograph, and table objects (in this embodiment, captions are assumed to be included in those objects), and an object is selected (step S407). An example of the processing result is the same as in the first embodiment (provided that body text 3 contains the keyword "figure").
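A minimal sketch of the third embodiment's caption-based selection follows. The English word list stands in for the keywords quoted in the text, the substring matching is a simplification, and the behavior when no figure keyword is found (only the title is returned) is an assumption, because the source describes only the "Yes" branch of step S404.

```python
from dataclasses import dataclass


@dataclass
class PageObject:
    name: str
    attribute: str     # "title", "body", "figure", "photograph", or "table"
    caption: str = ""  # caption text, assumed to be stored with the object


NON_TEXT = {"figure", "photograph", "table"}
FIGURE_WORDS = ("table", "Fig", "figure", "photograph")


def select_objects_example3(objects, hit_text):
    """Select the title (if any) and the first non-text object whose caption
    contains a figure-indicating keyword found in the hit text."""
    selected = [o for o in objects if o.attribute == "title"][:1]  # step S401
    found = [w for w in FIGURE_WORDS if w in hit_text]             # step S403
    if found:                                                      # step S404: Yes
        for obj in objects:
            if obj.attribute in NON_TEXT and any(w in obj.caption for w in found):
                selected.append(obj)                               # step S407
                break
    return selected


page = [
    PageObject("title", "title"),
    PageObject("body text 3", "body"),
    PageObject("figure", "figure", caption="Fig. 1 System configuration"),
    PageObject("photograph", "photograph", caption="Exterior view"),
]
chosen = select_objects_example3(page, hit_text="as shown in Fig. 1, the claims")
```

A practical version would probably match the full reference, such as the figure number, rather than the bare keyword, but the source does not go into that detail.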

As described above, according to this embodiment, the object hit by the search key is searched for keywords that indicate figure, photograph, or table objects, and a representative object is selected based on the keyword found. An object related to the search keyword can therefore be selected, which makes it easier to understand the document image in terms of the search keyword.

FIG. 1 shows the system configuration of an embodiment of the present invention.
FIG. 2 shows the configuration of the server device.
FIG. 3 shows a flowchart of the image registration operation of the first embodiment.
FIG. 4 shows an example of image division and attribute identification results.
FIG. 5 shows a flowchart of the image search operation of the first embodiment.
FIG. 6 shows an example of the search key designation screen.
FIG. 7 shows an example of the thumbnail list display screen.
FIG. 8 shows an example of the page display area detail screen of the first embodiment.
FIG. 9 shows a flowchart of the object selection process of the first embodiment.
FIG. 10 shows another method of image division.
FIG. 11 shows an example of the display screen produced by the division method of FIG. 10.
FIG. 12 shows an example of the page display area detail screen of the second embodiment.
FIG. 13 shows a flowchart of the object selection process of the second embodiment.
FIG. 14 shows a flowchart of the object selection process of the third embodiment.

Explanation of symbols

100 Client device
101 Display device
102 Application program
103 Input device
104 External communication path
110 Server device
111 Interface
112 Registered image data
113 Thumbnail generation processing unit
114 Image division processing unit
115 Attribute identification processing unit
116 Character code extraction processing unit
117 Image information DB
118 Search key
119 Keyword search processing unit
120 Image DB
121 Display screen control processing unit
122 Display screen data
123 Object selection processing unit

Claims

An image processing apparatus having a function of searching document image data stored in an image database, comprising: an image dividing unit for dividing a document image into partial images; a keyword search unit for receiving a search key based on a keyword and performing a search; a display screen control unit for generating a display screen that displays one or more document images hit by the keyword search; and a partial image selection unit for selecting from the partial images obtained by the division, wherein the display screen arranges, for each page, a thumbnail showing an overview of the page, text information including the hit keyword, and the partial image selected by the partial image selection unit.

The image processing apparatus according to claim 1, further comprising an attribute identification unit that identifies an attribute of the partial image, wherein the partial image selection unit selects a partial image according to the identification result of the attribute identification unit.

The image processing apparatus according to claim 2, wherein the attribute selected by the partial image selection unit is a page title.

The image processing apparatus according to claim 2, wherein the attribute selected by the partial image selection unit is one or more of a photograph area, a graphic area, and a table area.

The image processing apparatus according to claim 2, wherein the attribute selected by the partial image selection unit is an area including a hit keyword.

The image processing apparatus according to claim 1, wherein the partial image selection unit selects a partial image according to a distance from the partial image including the hit keyword.

The image processing apparatus according to claim 1, wherein the partial image selection unit selects a partial image according to a size of the partial image.

The image processing apparatus according to claim 1, wherein the partial image selection unit selects a partial image according to character information included in the partial image including the hit keyword.

An image processing method having a function of searching document image data stored in an image database, comprising: an image dividing step of dividing a document image into partial images; a keyword search step of receiving a search key based on a keyword and searching the document images page by page; a display screen control step of generating a display screen that displays one or more document images hit by the keyword search; and a partial image selection step of selecting from the partial images obtained by the division, wherein the display screen arranges, for each page, a thumbnail showing an overview of the page, text information including the hit keyword, and the partial image selected in the partial image selection step.

The image processing method according to claim 9, further comprising an attribute identification step that identifies an attribute of the partial image, wherein the partial image selection step selects a partial image according to an identification result of the attribute identification step.

The image processing method according to claim 10, wherein the attribute selected in the partial image selection step is a page title.

The image processing method according to claim 10, wherein the attribute selected in the partial image selection step is one or more of a photograph area, a graphic area, and a table area.

The image processing method according to claim 10, wherein the attribute selected in the partial image selection step is an area including a hit keyword.

The image processing method according to claim 9, wherein the partial image selection step selects a partial image according to a distance from the partial image including the hit keyword.

The image processing method according to claim 9, wherein the partial image selection step selects a partial image according to a size of the partial image.

The image processing method according to claim 9, wherein the partial image selection step selects a partial image according to character information included in the partial image including the hit keyword.

A program for causing a computer to implement the image processing method according to any one of claims 9 to 16.

A computer-readable recording medium on which is recorded a program for causing a computer to implement the image processing method according to any one of claims 9 to 16.