JP4917575B2

JP4917575B2 - Database device for image search, database management method for image search, and program

Info

Publication number: JP4917575B2
Application number: JP2008168136A
Authority: JP
Inventors: 満安倍; 悠一吉田
Original assignee: Denso IT Laboratory Inc
Current assignee: Denso IT Laboratory Inc
Priority date: 2008-06-27
Filing date: 2008-06-27
Publication date: 2012-04-18
Anticipated expiration: 2028-06-27
Also published as: JP2010009332A

Description

本発明は、入力した画像に対して類似または関連する画像を出力する画像検索に用いられる画像検索用データベース装置に関する。 The present invention relates to an image search database device used for image search for outputting an image similar or related to an input image.

従来、画像検索の方式の一つとして、入力した画像に対して類似する画像または関連する画像を出力する方式（画像入力方式）が知られている。近年、この画像入力方式の新たなアプローチとして、入力画像とデータベース内の画像間で対応点探索を行い、対応点の数が多ければ検索をヒットさせる方式が提案されている。ここで、対応点とは、二つの画像間でパターンが一致する座標位置（座標点）の組み合わせのことである。この方式では、入力した画像と非常に類似度の高い画像を、検索結果として出力することができる。 Conventionally, as one of image search methods, a method (image input method) for outputting an image similar to or related to an input image is known. In recent years, as a new approach of this image input method, a method of searching for corresponding points between an input image and an image in a database and hitting the search if the number of corresponding points is large has been proposed. Here, the corresponding point is a combination of coordinate positions (coordinate points) at which patterns match between two images. In this method, an image having a very high degree of similarity with the input image can be output as a search result.

この対応点を求める方法については、例えばSIFTなどの計算手法を用いて、画像中の特徴点から局所特徴量を算出する方法が提案されている。このような手法によれば、正確に対応点を求めることができる（例えば非特許文献１参照）。
David G. Lowe、”Distinctive Image Features from Scale-Invariant Keypoints”、International Journal of Computer Vision (2004)、カナダ、２００４年1月５日 As a method for obtaining the corresponding points, for example, a method of calculating local feature amounts from feature points in an image using a calculation method such as SIFT has been proposed. According to such a method, corresponding points can be obtained accurately (see, for example, Non-Patent Document 1).
David G. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal of Computer Vision (2004), Canada, January 5, 2004

ところが、上記のような局所特徴量を用いた画像検索では、検索精度は非常に高いものの、データベースのすべての画像について対応点を求める必要があり、データベースに大量の画像が記憶されている場合には、画像検索のために必要な計算量が膨大になり、検索実行までに非常に時間がかかることになる。そこで、従来、Visual Wordsという方法が提案されている（例えば非特許文献２および３参照）。
David Nister、外１名、”Scalable Recognition with a Vocabulary Tree”、IEEE Conference on Computer Vision and Pattern Recognition (2006)、２００６年 James Philbin、外４名、”Object retrieval with large vocabularies and fast spatial matching”、 IEEE Conference on Computer Vision and Pattern Recognition (2007)、２００７年 However, in the image search using the local feature amount as described above, although the search accuracy is very high, it is necessary to obtain corresponding points for all images in the database, and a large number of images are stored in the database. Therefore, the amount of calculation required for image search becomes enormous, and it takes a very long time to execute the search. Therefore, conventionally, a method called Visual Words has been proposed (see, for example, Non-Patent Documents 2 and 3).
David Nister, 1 other, “Scalable Recognition with a Vocabulary Tree”, IEEE Conference on Computer Vision and Pattern Recognition (2006), 2006 James Philbin, 4 others, “Object retrieval with large vocabularies and fast spatial matching”, IEEE Conference on Computer Vision and Pattern Recognition (2007), 2007

Visual Wordsでは、それぞれの画像に含まれるすべての特徴点について、局所特徴量を１２８次元のベクトル（特徴量ベクトルともいう）として算出し、その特徴量ベクトルをベクトル空間内にプロットする。そして、データベースの全画像についてすべての特徴量ベクトルのプロットが完了した後、ある程度のプロットのまとまり（近くに集まっているプロットのグループ）を基準にして、ベクトル空間を複数の部分空間（小さな部屋）に分割する処理（クラスタリング処理という）を行う。この場合、データベース内のすべての特徴点の局所特徴量をメモリ上に展開してクラスタリング処理を行う。そして、各部分空間と各画像との対応関係を示す検索テーブル（検索インデックスともいう）を作成する。Visual Wordsでは、この検索テーブルを用いて画像検索が行われるので、データベースの画像が多くても計算量が少なくて済む。 In Visual Words, local feature amounts are calculated as 128-dimensional vectors (also referred to as feature amount vectors) for all feature points included in each image, and the feature amount vectors are plotted in a vector space. After all feature vectors have been plotted for all images in the database, the vector space is divided into multiple subspaces (small rooms) based on a certain set of plots (a group of nearby plots). A process of dividing into two (referred to as clustering process) is performed. In this case, clustering processing is performed by expanding the local feature amounts of all feature points in the database on the memory. Then, a search table (also referred to as a search index) indicating the correspondence between each partial space and each image is created. In Visual Words, image search is performed using this search table, so even if there are many images in the database, the calculation amount is small.

しかしながら、従来のVisual Wordsを用いた方法では、一旦、検索テーブルの準備ができてしまえば、データベースの画像が多くても少ない計算量で高精度の画像検索を行うことができるものの、その準備（特にクラスタリング処理）のために大容量のメモリ（作業用メモリ）が必要であるという問題があった。 However, in the conventional method using Visual Words, once the search table is prepared, high-precision image search can be performed with a small amount of calculation even if there are many images in the database. In particular, there is a problem that a large capacity memory (working memory) is required for clustering processing.

特に、データベースに大量の画像が格納されている場合、すべての特徴点の局所特徴量をメモリ上に展開すると、クラスタリング処理のためのメモリ消費が非常に大きくなるという問題があった。例えば、データベースに数百万枚の画像が格納されている場合、一つの画像に数千個の特徴点が含まれているとすると、そのすべての特徴点の局所特徴量を１２８次元のベクトルで表してベクトル空間にプロットするためには、少なくとも数十〜数百ＧＢという膨大なメモリ容量が必要となる。この場合、ＨＤＤを仮想メモリとして代用することも考えられるが、その場合にはクラスタリング処理の実行速度が大幅に低下してしまうという問題がある。 In particular, when a large amount of images are stored in the database, there is a problem that memory consumption for clustering processing becomes very large if local feature amounts of all feature points are expanded on a memory. For example, if millions of images are stored in the database, and if there are several thousand feature points in one image, the local feature values of all the feature points are expressed as 128-dimensional vectors. In order to represent and plot in the vector space, an enormous memory capacity of at least several tens to several hundreds GB is required. In this case, it is conceivable to substitute the HDD as a virtual memory, but in this case, there is a problem that the execution speed of the clustering process is significantly reduced.

そして、一旦、検索テーブルの準備が完了した後に、データベースを更新（新たな画像を追加）しようとした場合、再度、すべての特徴点の局所特徴量をメモリ上に展開して、クラスタリング処理を行う必要があるという問題があった。したがって、逐次的に（新たな画像がデータベースに追加される度に）データベースを更新することが容易でないという問題があった。 Once the search table is prepared and the database is updated (adding a new image), the local feature amounts of all feature points are expanded on the memory and clustering is performed again. There was a problem that it was necessary. Therefore, there is a problem that it is not easy to update the database sequentially (every time a new image is added to the database).

本発明は、上記の課題に鑑みてなされたもので、データベースに大量の画像が格納される場合であっても、クラスタリング処理のために必要なメモリ容量が少なくて済み、また、データベースを逐次的に更新することができる画像検索用データベース装置を提供することを目的とする。 The present invention has been made in view of the above problems, and even when a large amount of images are stored in the database, the memory capacity required for the clustering process can be reduced, and the database can be sequentially stored. An object of the present invention is to provide a database device for image search that can be updated.

本発明の画像検索用データベース装置は、画像検索用の複数の画像が登録される画像検索用データベース装置であって、各画像に含まれる複数の特徴点の各々の局所特徴量が特徴量ベクトルとして表され、複数の前記特徴量ベクトルが含まれるベクトル空間においてクラスタリング処理が行われ前記ベクトル空間が複数の部分空間に分割され、各部分空間と各画像との対応関係を示す検索テーブルが生成されており、前記画像検索用データベース装置は、前記クラスタリング処理に用いるために、前記ベクトル空間における前記部分空間の範囲を記憶する作業用記憶手段と、前記作業用記憶手段より大きな記憶容量を有し、前記クラスタリング処理に用いるために、前記複数の画像の特徴量ベクトルを各特徴量ベクトルが属する部分空間に関連付けて記憶する大容量記憶手段と、を備えている。 The image search database device of the present invention is an image search database device in which a plurality of images for image search are registered, and local feature amounts of a plurality of feature points included in each image are used as feature amount vectors. And a clustering process is performed in a vector space including a plurality of the feature amount vectors, the vector space is divided into a plurality of partial spaces, and a search table indicating a correspondence relationship between each partial space and each image is generated. The database device for image search has working storage means for storing a range of the partial space in the vector space and a larger storage capacity than the working storage means for use in the clustering process; For use in clustering processing, the feature vectors of the plurality of images are related to the subspace to which each feature vector belongs. And it includes a large capacity storage means for attaching and storing, the.

この画像検索用データベース装置によれば、クラスタリング処理のために、作業用記憶手段（メモリなど）に、ベクトル空間における部分空間の範囲を記憶しておき、局所特徴量を表す特徴量ベクトルは、部分空間に関連付けて大容量記憶手段（ＨＤＤなど）に記憶しておけばよい。したがって、データベースに大量の画像が格納される場合であっても、クラスタリング処理のために必要な作業用記憶手段（メモリ）の容量が少なくて済む。 According to this database device for image search, for clustering processing, the range of the partial space in the vector space is stored in the working storage means (memory or the like), and the feature quantity vector representing the local feature quantity is the partial What is necessary is just to memorize | store in mass storage means (HDD etc.) linked | related with space. Therefore, even when a large amount of images are stored in the database, the capacity of the working storage means (memory) necessary for the clustering process can be reduced.

また、本発明の画像検索用データベース装置は、画像検索用の画像として新たに登録される追加の画像に含まれる特徴点の局所特徴量を追加の特徴量ベクトルとして算出するベクトル算出手段と、前記追加の特徴量ベクトルが属する部分空間に関連付けられた特徴量ベクトルを、前記大容量記憶手段から前記作業用記憶手段に読み出す読出し制御手段と、前記読み出した特徴量ベクトルおよび前記追加の特徴量ベクトルに基づいて、前記追加の特徴量ベクトルが属する部分空間においてクラスタリング処理を行って前記部分空間を再分割するクラスタリング処理手段と、を備えてもよい。 Further, the image search database device of the present invention includes a vector calculation means for calculating a local feature amount of a feature point included in an additional image newly registered as an image for image search as an additional feature amount vector, Read control means for reading out the feature quantity vector associated with the partial space to which the additional feature quantity vector belongs from the large-capacity storage means to the working storage means, the read feature quantity vector, and the additional feature quantity vector And a clustering processing means for performing a clustering process on the partial space to which the additional feature vector belongs, and subdividing the partial space.

これにより、データベースに新たに画像を追加しようとする場合には、追加の画像に含まれる特徴点の特徴量ベクトル（追加の特徴量ベクトル）を算出し、その追加の特徴量ベクトルが属する部分空間に関連付けられた特徴量ベクトルが読み出される。そして、これらの特徴量ベクトルについてクラスタリング処理を行って部分空間を再分割する。このようにして、逐次的に（新たな画像がデータベースに追加される度に）データベースを更新することが可能になる。 As a result, when a new image is to be added to the database, the feature amount vector (additional feature amount vector) of the feature point included in the additional image is calculated, and the subspace to which the additional feature amount vector belongs The feature vector associated with is read out. Then, a clustering process is performed on these feature quantity vectors to subdivide the partial space. In this way, the database can be updated sequentially (each time a new image is added to the database).

また、本発明の画像検索用データベース装置は、前記追加の特徴量ベクトルが属する部分空間における特徴量ベクトルの分離の度合いを示す分離度を算出する分離度算出手段と、前記分離度が所定のしきい値より大きいか否かを判定し、前記分離度が所定のしきい値より大きい場合には、前記部分空間におけるクラスタリング処理を行い、前記分離度が所定のしきい値より小さい場合には、前記部分空間におけるクラスタリング処理を行わないように、前記部分空間におけるクラスタリング処理の制御を行う判定制御手段と、
を備えてもよい。 In addition, the image search database device of the present invention includes a degree-of-separation calculating means for calculating a degree of separation indicating a degree of separation of the feature amount vector in the partial space to which the additional feature amount vector belongs, and the degree of separation is a predetermined value. It is determined whether or not the threshold value is greater than the threshold value, and when the degree of separation is greater than a predetermined threshold value, clustering processing in the subspace is performed, and when the degree of separation is smaller than the predetermined threshold value, Determination control means for controlling clustering processing in the subspace so as not to perform clustering processing in the subspace;
May be provided.

これにより、部分空間における特徴量ベクトルの分離度に基づいて、その部分空間を分割するクラスタリング処理を行うか否かの判定が行われる。例えば、分離度が大きい場合には、部分空間におけるクラスタリング処理が行われ、分離度が小さい場合には、部分空間におけるクラスタリング処理は行われない。このように、部分空間における特徴量ベクトルの分離度に応じた適切なクラスタリング処理が行われる。 Thereby, based on the separation degree of the feature vector in the partial space, it is determined whether or not to perform the clustering process for dividing the partial space. For example, when the degree of separation is large, clustering processing in the subspace is performed, and when the degree of separation is small, clustering processing in the subspace is not performed. As described above, an appropriate clustering process is performed according to the degree of separation of the feature vector in the partial space.

また、本発明の画像検索用データベース装置は、前記クラスタリング処理手段によって前記部分空間が再分割された場合に、前記検索テーブルにおいて、前記分割された部分空間と各画像との対応関係を更新するテーブル更新手段を備えてもよい。 The database device for image search according to the present invention is a table for updating a correspondence relationship between the divided partial space and each image in the search table when the partial space is subdivided by the clustering processing unit. Update means may be provided.

これにより、クラスタリング処理によって部分空間が再分割された場合には、それに応じて検索テーブルが適切に更新される。したがって、逐次的なデータベースの更新が反映された画像検索が可能になる。 Thereby, when the partial space is subdivided by the clustering process, the search table is appropriately updated accordingly. Therefore, it is possible to search for an image reflecting a sequential database update.

本発明の画像検索用データベース管理方法は、画像検索用の複数の画像が登録される画像検索用データベースの管理方法であって、前記画像検索用データベースでは、各画像に含まれる複数の特徴点の各々の局所特徴量が特徴量ベクトルとして表され、複数の前記特徴量ベクトルが含まれるベクトル空間においてクラスタリング処理が行われ前記ベクトル空間が複数の部分空間に分割され、各部分空間と各画像との対応関係を示す検索テーブルが生成されており、前記画像検索用データベース管理方法は、前記クラスタリング処理に用いるために、作業用記憶手段に、前記ベクトル空間における前記部分空間の範囲を記憶することと、前記クラスタリング処理に用いるために、前記作業用記憶手段より大きな記憶容量を有する大容量記憶手段に、前記複数の画像の特徴量ベクトルを各特徴量ベクトルが属する部分空間に関連付けて記憶することと、を含んでいる。 The image search database management method of the present invention is an image search database management method in which a plurality of images for image search are registered. In the image search database, a plurality of feature points included in each image are stored. Each local feature is represented as a feature vector, clustering processing is performed in a vector space including a plurality of the feature vectors, and the vector space is divided into a plurality of subspaces. A search table indicating a correspondence relationship has been generated, and the image search database management method stores a range of the partial space in the vector space in a work storage unit for use in the clustering process, Mass storage means having a larger storage capacity than the working storage means for use in the clustering process Includes, and storing in association with feature vectors of the plurality of images to each feature quantity subspace vector belongs.

この方法によれば、クラスタリング処理のために、作業用記憶手段（メモリなど）に、ベクトル空間における部分空間の範囲を記憶しておき、局所特徴量を表す特徴量ベクトルは、部分空間に関連付けて大容量記憶手段（ＨＤＤなど）に記憶しておけばよい。したがって、データベースに大量の画像が格納される場合であっても、クラスタリング処理のために必要な作業用記憶手段（メモリ）の容量が少なくて済む。 According to this method, for clustering processing, the range of the subspace in the vector space is stored in the working storage means (memory or the like), and the feature quantity vector representing the local feature quantity is associated with the subspace. What is necessary is just to memorize | store in a mass storage means (HDD etc.). Therefore, even when a large amount of images are stored in the database, the capacity of the working storage means (memory) necessary for the clustering process can be reduced.

また、本発明の画像検索用データベース管理方法は、画像検索用の画像として新たに登録される追加の画像に含まれる特徴点の局所特徴量を追加の特徴量ベクトルとして算出することと、前記追加の特徴量ベクトルが属する部分空間に関連付けられた特徴量ベクトルを、前記大容量記憶手段から前記作業用記憶手段に読み出すことと、前記読み出した特徴量ベクトルおよび前記追加の特徴量ベクトルに基づいて、前記追加の特徴量ベクトルが属する部分空間においてクラスタリング処理を行って前記部分空間を再分割することと、
を含んでもよい。 Further, the image search database management method of the present invention calculates a local feature amount of a feature point included in an additional image newly registered as an image for image search as an additional feature amount vector, and the addition Reading out the feature vector associated with the partial space to which the feature vector belongs to the working storage unit from the large-capacity storage unit, and based on the read feature vector and the additional feature vector, Re-dividing the partial space by performing a clustering process in the partial space to which the additional feature vector belongs;
May be included.

本発明の画像検索用データベース管理プログラムは、画像検索用の複数の画像が登録される画像検索用データベースの管理プログラムであって、前記画像検索用データベースでは、各画像に含まれる複数の特徴点の各々の局所特徴量が特徴量ベクトルとして表され、複数の前記特徴量ベクトルが含まれるベクトル空間においてクラスタリング処理が行われ前記ベクトル空間が複数の部分空間に分割され、各部分空間と各画像との対応関係を示す検索テーブルが生成されており、前記画像検索用データベース管理プログラムは、コンピュータに、前記クラスタリング処理に用いるために、作業用記憶手段に、前記ベクトル空間における前記部分空間の範囲を記憶する処理と、前記クラスタリング処理に用いるために、前記作業用記憶手段より大きな記憶容量を有する大容量記憶手段に、前記複数の画像の特徴量ベクトルを各特徴量ベクトルが属する部分空間に関連付けて記憶する処理と、を実行させるものである。 The image search database management program of the present invention is an image search database management program in which a plurality of images for image search are registered. In the image search database, a plurality of feature points included in each image are stored. Each local feature is represented as a feature vector, clustering processing is performed in a vector space including a plurality of the feature vectors, and the vector space is divided into a plurality of subspaces. A search table indicating the correspondence relationship is generated, and the image search database management program stores the range of the partial space in the vector space in a work storage means for use in the clustering process in a computer. Larger than the working storage means for use in processing and clustering processing. The mass storage means having a Do storage capacity, it is intended to execute a process of storing in association with feature vectors of the plurality of images into subspaces belonging each feature vector.

このプログラムによっても、クラスタリング処理のために、作業用記憶手段（メモリなど）に、ベクトル空間における部分空間の範囲を記憶しておき、局所特徴量を表す特徴量ベクトルは、部分空間に関連付けて大容量記憶手段（ＨＤＤなど）に記憶しておけばよい。したがって、データベースに大量の画像が格納される場合であっても、クラスタリング処理のために必要な作業用記憶手段（メモリ）の容量が少なくて済む。 Also with this program, for clustering processing, the range of the partial space in the vector space is stored in the working storage means (memory, etc.), and the feature quantity vector representing the local feature quantity is associated with the partial space. What is necessary is just to memorize | store in capacity storage means (HDD etc.). Therefore, even when a large amount of images are stored in the database, the capacity of the working storage means (memory) necessary for the clustering process can be reduced.

また、本発明の画像検索用データベース管理プログラムは、コンピュータに、画像検索用の画像として新たに登録される追加の画像に含まれる特徴点の局所特徴量を追加の特徴量ベクトルとして算出する処理と、前記追加の特徴量ベクトルが属する部分空間に関連付けられた特徴量ベクトルを、前記大容量記憶手段から前記作業用記憶手段に読み出す処理と、前記読み出した特徴量ベクトルおよび前記追加の特徴量ベクトルに基づいて、前記追加の特徴量ベクトルが属する部分空間においてクラスタリング処理を行って前記部分空間を再分割する処理と、を実行させてもよい。 Further, the image search database management program according to the present invention includes a process of calculating a local feature amount of a feature point included in an additional image newly registered as an image search image in the computer as an additional feature amount vector. , A process of reading out the feature quantity vector associated with the partial space to which the additional feature quantity vector belongs from the large-capacity storage means to the working storage means, and the read feature quantity vector and the additional feature quantity vector Based on this, a process of performing a clustering process in the partial space to which the additional feature quantity vector belongs to re-divide the partial space may be executed.

本発明によれば、ベクトル空間における部分空間の範囲を記憶する作業用記憶手段と、部分空間に関連付けて特徴量ベクトルを記憶する大容量記憶手段とを設けることにより、データベースに大量の画像が格納される場合であっても、クラスタリング処理のために必要なメモリ容量が少なくて済み、また、データベースを逐次的に更新することができる。 According to the present invention, a large amount of images can be stored in a database by providing working storage means for storing a range of a partial space in a vector space and large capacity storage means for storing a feature quantity vector in association with the partial space. Even in such a case, the memory capacity required for the clustering process can be reduced, and the database can be updated sequentially.

以下、本発明の実施の形態の画像検索用データベース装置について、図面を用いて説明する。本実施の形態では、Visual Wordsを用いた画像検索用データベース等に用いられる装置の場合を例示する。この装置は、画像検索用データベースの管理する機能を備えており、この機能は、装置のメモリやＨＤＤ等に記憶されたプログラムによって実現される。 Hereinafter, an image search database apparatus according to an embodiment of the present invention will be described with reference to the drawings. In the present embodiment, the case of an apparatus used for an image search database using Visual Words is exemplified. This apparatus has a function of managing an image search database, and this function is realized by a program stored in a memory, an HDD, or the like of the apparatus.

まず、本発明の実施の形態の画像検索用データベース装置（単にデータベース装置ともいう）の構成を、図面を用いて説明する。 First, the configuration of an image search database device (also simply referred to as a database device) according to an embodiment of the present invention will be described with reference to the drawings.

図１は、本実施の形態のデータベース装置の全体構成を説明するためのブロック図である。図１に示すように、データベース装置１は、入力された画像を受け付けて画像データベース２に登録する画像受付部３と、複数の画像（入力された画像または登録された画像）から処理対象の画像を選択する画像選択部４と、選択された画像から特徴点を抽出する特徴点抽出部５と、抽出した特徴点の局所特徴量を算出する局所特徴量算出部６を備えている。 FIG. 1 is a block diagram for explaining the overall configuration of the database apparatus according to the present embodiment. As shown in FIG. 1, the database device 1 receives an input image and registers it in an image database 2, and an image to be processed from a plurality of images (input image or registered image). An image selection unit 4 that selects a feature point, a feature point extraction unit 5 that extracts a feature point from the selected image, and a local feature amount calculation unit 6 that calculates a local feature amount of the extracted feature point.

また、データベース装置１は、特徴点の局所特徴量を高次元（例えば１２８次元）のベクトル（特徴量ベクトル）として算出するベクトル算出部７と、特徴量ベクトルが含まれるベクトル空間においてクラスタリング処理を行うクラスタリング処理部８を備えている。クラスタリング処理によって、ベクトル空間が複数の部分空間（Visual Wordとも呼ばれる小さな部屋）に分割される。ここでは、局所特徴量を１２８次元のベクトルで表す例について説明するが、本発明の範囲はこれに限定されるものではない。なお、このクラスタリング処理の内容については、図面を用いて後述する。 Further, the database device 1 performs a clustering process in a vector space 7 that calculates a local feature amount of a feature point as a high-dimensional (for example, 128-dimensional) vector (a feature amount vector), and a vector space including the feature amount vector. A clustering processing unit 8 is provided. By the clustering process, the vector space is divided into a plurality of subspaces (small rooms called Visual Word). Here, an example in which the local feature amount is represented by a 128-dimensional vector will be described, but the scope of the present invention is not limited to this. The contents of this clustering process will be described later with reference to the drawings.

ここで、クラスタリング処理部８の構成について、図２を用いて説明する。図２に示すように、クラスタリング処理部８は、ベクトル空間における部分空間の範囲として、部分空間の代表点（例えば重心など）を表す代表ベクトルを算出する代表ベクトル算出部９を備えている。また、クラスタリング処理部８は、部分空間における特徴量ベクトルの分離の度合いを示す分離度を算出する分離度算出部１０と、分離度に基づいて部分空間におけるクラスタリングを行うか否かを判定する判定制御部１１を備えている。ここでは、分離度算出部１０が、本発明の分離度算出手段に相当し、判定制御部１１が、本発明の判定制御手段に相当する。なお、この分離度に基づく判定制御の内容についても、図面を用いて後述する。 Here, the configuration of the clustering processing unit 8 will be described with reference to FIG. As shown in FIG. 2, the clustering processing unit 8 includes a representative vector calculation unit 9 that calculates a representative vector representing a representative point (for example, the center of gravity) of the partial space as the range of the partial space in the vector space. In addition, the clustering processing unit 8 determines a degree of separation that calculates the degree of separation indicating the degree of separation of the feature quantity vectors in the subspace, and determines whether to perform clustering in the subspace based on the degree of separation. A control unit 11 is provided. Here, the degree-of-separation calculating unit 10 corresponds to the degree-of-separation calculating unit of the present invention, and the determination control unit 11 corresponds to the determination control unit of the present invention. The contents of the determination control based on the degree of separation will be described later with reference to the drawings.

図１に戻って、データベース装置１の構成の説明を続ける。データベース装置１は、データ処理速度が速く比較的小容量のメモリ１２と、メモリ１２よりデータ処理速度は遅いものの大きな記憶容量を有する大容量のＨＤＤ１３（ハードディスクドライブ）と、メモリ１２やＨＤＤ１３への書込みや読出しを制御する記憶制御部１４を備えている。メモリ１２には、ベクトル空間における部分空間の範囲として代表ベクトルが記憶され、ＨＤＤ１３には、その特徴量ベクトルが属する部分空間に関連付けて特徴量ベクトルが記憶される。ここでは、メモリ１２が、本発明の作業用記憶手段に相当し、ＨＤＤ１３が、本発明の大容量記憶手段に相当する。 Returning to FIG. 1, the description of the configuration of the database apparatus 1 will be continued. The database apparatus 1 has a high data processing speed and a relatively small capacity memory 12, a large capacity HDD 13 (hard disk drive) having a large storage capacity although the data processing speed is slower than the memory 12, and writing to the memory 12 and the HDD 13. And a storage control unit 14 for controlling reading. The memory 12 stores a representative vector as a range of a partial space in the vector space, and the HDD 13 stores a feature vector associated with the partial space to which the feature vector belongs. Here, the memory 12 corresponds to the working storage means of the present invention, and the HDD 13 corresponds to the mass storage means of the present invention.

ベクトル算出部７は、画像データベース２に新たに登録される画像（追加の画像）についても、特徴点の局所特徴量を特徴量ベクトル（追加の特徴量ベクトル）として算出する。記憶制御部１４は、追加の特徴量ベクトルが属する部分空間に関連付けられた特徴量ベクトルをＨＤＤ１３からメモリ１２に読み出す。そして、クラスタリング処理部８は、その部分空間においてクラスタリング処理を行って部分空間を再分割する機能を備えている。したがって、このベクトル算出部７が、本発明のベクトル算出手段に相当し、記憶制御部１４が、本発明の読出し制御手段に相当する。また、クラスタリング処理部８は、本発明のクラスタリング処理手段に相当する。 The vector calculation unit 7 also calculates a local feature amount of a feature point as a feature amount vector (additional feature amount vector) for an image (additional image) newly registered in the image database 2. The storage control unit 14 reads out the feature vector associated with the partial space to which the additional feature vector belongs from the HDD 13 into the memory 12. The clustering processing unit 8 has a function of performing a clustering process in the partial space and re-dividing the partial space. Therefore, the vector calculation unit 7 corresponds to the vector calculation unit of the present invention, and the storage control unit 14 corresponds to the read control unit of the present invention. The clustering processing unit 8 corresponds to clustering processing means of the present invention.

また、データベース装置１は、クラスタリング処理の結果に基づいて、各部分空間と各画像との対応関係を示す検索テーブルを生成する検索テーブル生成部１５と、生成した検索テーブルを記憶する検索テーブル記憶部１６を備えている。この検索テーブル生成部１５は、部分空間が再分割されたときに検索テーブルを更新する機能を備えている。したがって、この検索テーブル生成部１５は、本発明の検索テーブル更新手段に相当する。そして、データベース装置１は、この検索テーブルを用いて画像検索を行う検索処理部１７を備えている。検索テーブルおよび画像検索の内容については、図面を用いて後述する。 Further, the database device 1 includes a search table generation unit 15 that generates a search table that indicates the correspondence between each partial space and each image based on the result of the clustering process, and a search table storage unit that stores the generated search table 16 is provided. The search table generation unit 15 has a function of updating the search table when the partial space is subdivided. Therefore, the search table generation unit 15 corresponds to search table update means of the present invention. The database device 1 includes a search processing unit 17 that performs image search using the search table. The contents of the search table and image search will be described later with reference to the drawings.

以上のように構成されたデータベース装置１について、図面を用いてその動作を説明する。ここでは、本発明の特徴的な３つの動作、すなわち、画像データベース２の初期化（１枚目の画像の登録）、画像データベース２の更新（２枚目以降の画像の登録）、画像検索を中心に説明する。 The operation of the database apparatus 1 configured as described above will be described with reference to the drawings. Here, three operations characteristic of the present invention, that is, initialization of the image database 2 (registration of the first image), update of the image database 2 (registration of the second and subsequent images), and image search are performed. The explanation is centered.

（画像データベースの初期化）
まず、図３〜図６を参照して、本実施の形態のデータベース装置１において、画像データベース２の初期化（１枚目の画像の登録）を行うときの動作について説明する。 (Initialization of image database)
First, with reference to FIG. 3 to FIG. 6, an operation when the database apparatus 1 of the present embodiment performs initialization of the image database 2 (registration of the first image) will be described.

画像データベース２の初期化を行うときには、まず、画像選択部４によって最初に登録する画像が選択される。図３の例では、コンピュータの画像ａが選択されている。そして、特徴抽出部５によってこの画像ａに含まれる特徴点が抽出される。この例では、コンピュータの角部などの８つの特徴的な部分が特徴点（図８では黒丸印として図示）として抽出されている。なお、ここでは、説明の便宜のため、特徴点の数を８つとしたが、特徴点の数がこれに限られないことは言うまでもない。 When the image database 2 is initialized, first, an image to be registered first is selected by the image selection unit 4. In the example of FIG. 3, the image a of the computer is selected. The feature extraction unit 5 extracts feature points included in the image a. In this example, eight characteristic parts such as corners of the computer are extracted as characteristic points (shown as black circles in FIG. 8). Although the number of feature points is eight here for convenience of explanation, it goes without saying that the number of feature points is not limited to this.

つぎに、局所特徴量算出部６によってこれらの特徴点の局所特徴量が計算され、ベクトル算出部７によって局所特徴量が１２８次元の特徴量ベクトルとして表される。そして、この特徴量ベクトルが１２８次元のベクトル空間（特徴量空間ともいえる）にプロットされる。図３では、ａ１〜ａ８の８つの特徴量ベクトルがベクトル空間にプロットされた例が図示されている。なお、この図では、説明の便宜のため、ベクトル空間が２次元として図示されている。 Next, the local feature amount calculation unit 6 calculates the local feature amounts of these feature points, and the vector calculation unit 7 expresses the local feature amount as a 128-dimensional feature amount vector. This feature vector is plotted in a 128-dimensional vector space (also referred to as a feature space). FIG. 3 illustrates an example in which eight feature quantity vectors a1 to a8 are plotted in a vector space. In this figure, the vector space is shown as two-dimensional for convenience of explanation.

クラスタリング処理部８は、ベクトル空間においてクラスタリング処理を行って、このベクトル空間を複数の部分空間に分割する。図４では、ベクトル空間が３つの部分空間（部分空間Ａ〜Ｃ）に分割された例が図示されている。クラスタリング処理（部分空間への分割）では、ある程度まとまった特徴量ベクトルごとに１つの部分空間が割り当てられる。このクラスタリング処理は、特徴量ベクトルの分離の度合い（または、まとまりの度合い）に基づいて行われる。この特徴量ベクトルの分離の度合いを表す分離度として、例えば分散比（＝クラス内分散／クラス外分散）などが用いられる。 The clustering processing unit 8 performs clustering processing in the vector space, and divides the vector space into a plurality of partial spaces. FIG. 4 illustrates an example in which the vector space is divided into three partial spaces (partial spaces A to C). In the clustering process (division into partial spaces), one partial space is assigned for each feature quantity vector that is gathered to some extent. This clustering process is performed based on the degree of separation (or the degree of unity) of feature quantity vectors. As the degree of separation representing the degree of separation of the feature vector, for example, a dispersion ratio (= in-class dispersion / out-class dispersion) is used.

そして、代表ベクトル算出部９によって、各部分空間の範囲として代表ベクトルが算出される。例えば、部分空間に含まれる複数の特徴量ベクトルの重心ベクトルなどが、代表ベクトルとして算出される。図４の例では、部分空間Ａの代表ベクトルは、特徴量ベクトルａ１〜ａ４の重心ベクトル（図では丸Ａ印で図示されている）であり、部分空間Ｂの代表ベクトルは、特徴量ベクトルａ７、ａ８の重心ベクトル（図では丸Ｂ印で図示されている）であり、部分空間Ｃの代表ベクトルは、特徴量ベクトルａ５、ａ６の重心ベクトル（図では丸Ｃ印で図示されている）である。 Then, the representative vector calculation unit 9 calculates a representative vector as a range of each partial space. For example, a centroid vector of a plurality of feature amount vectors included in the partial space is calculated as a representative vector. In the example of FIG. 4, the representative vector of the subspace A is the centroid vector (shown by a circle A in the figure) of the feature vectors a1 to a4, and the representative vector of the subspace B is the feature vector a7. , A8 centroid vector (shown by a circle B in the figure), and the representative vector of the partial space C is a centroid vector of feature vectors a5 and a6 (shown by a circle C in the figure). is there.

このようなクラスタリング処理の結果に基づいて、検索テーブル作成部は、図５に示すような検索テーブルを作成する。検索テーブルは、各部分空間と各画像との対応関係を示すテーブルであり、この場合、部分空間Ａ〜Ｃと画像ａとの関係が示されている。つまり、図５の例では、部分空間Ａ〜Ｃと画像ａが対応していることが示されている。 Based on the result of such clustering processing, the search table creation unit creates a search table as shown in FIG. The search table is a table showing the correspondence between each partial space and each image. In this case, the relationship between the partial spaces A to C and the image a is shown. That is, in the example of FIG. 5, it is shown that the partial spaces A to C correspond to the image a.

上記のようなクラスタリング処理の結果として得られた代表ベクトルは、記憶制御部１４によってメモリ１２に記憶される。また、このときクラスタリング処理された特徴量ベクトルは、どの部分空間に属するかという情報を関連付けて、ＨＤＤ１３に記憶される。つまり、メモリ１２には、各部分空間の代表ベクトルのみが記憶され、ＨＤＤ１３には、各部分空間に属する特徴量ベクトルが部分空間に関連付けて記憶される。この場合、メモリ１２には、部分空間の範囲が記憶されており、ＨＤＤ１３には、部分空間を表す識別ラベル（Ａ、Ｂ、Ｃなど）と、その部分空間に属する特徴量ベクトルが記憶されているともいえる。 The representative vector obtained as a result of the clustering process as described above is stored in the memory 12 by the storage control unit 14. Further, the feature quantity vector subjected to the clustering process at this time is stored in the HDD 13 in association with information indicating which partial space belongs. That is, only the representative vector of each partial space is stored in the memory 12, and the feature amount vector belonging to each partial space is stored in the HDD 13 in association with the partial space. In this case, a range of the partial space is stored in the memory 12, and an identification label (A, B, C, etc.) representing the partial space and a feature vector belonging to the partial space are stored in the HDD 13. It can be said that there is.

図６の例では、メモリ１２に、３つの部分空間（部分空間Ａ〜Ｃ）の代表ベクトルが記憶される。画像の数や特徴点の数が多い場合であっても、メモリ１２は代表ベクトルのみを記憶すればよく、メモリ１２の容量は小さくて済む。一方、ＨＤＤ１３には、部分空間Ａに属する特徴量ベクトルとして４つの特徴量ベクトルａ１〜ａ４が記憶され、部分空間Ｂに属する特徴量ベクトルとして２つの特徴量ベクトルａ７、ａ８が記憶され、部分空間Ｃに属する特徴量ベクトルとして２つの特徴量ベクトルａ５、ａ６が記憶される。画像の数や特徴点の数が多い場合、これらの特徴量ベクトルのデータ量は膨大になるため、ＨＤＤ１３は大容量の記憶容量が必要になる。 In the example of FIG. 6, representative vectors of three partial spaces (partial spaces A to C) are stored in the memory 12. Even when the number of images and the number of feature points are large, the memory 12 only needs to store representative vectors, and the capacity of the memory 12 can be small. On the other hand, the HDD 13 stores four feature quantity vectors a1 to a4 as feature quantity vectors belonging to the partial space A, and stores two feature quantity vectors a7 and a8 as feature quantity vectors belonging to the partial space B. Two feature quantity vectors a5 and a6 are stored as feature quantity vectors belonging to C. When the number of images and the number of feature points are large, the data amount of these feature amount vectors becomes enormous, and the HDD 13 needs a large storage capacity.

（画像データベースの更新）
つぎに、図７〜図１１を参照して、本実施の形態のデータベース装置１において、画像データベース２の更新（２枚目以降の画像の登録）を行うときの動作について説明する。ここでは、図３〜図６で説明した初期化が行われた画像データベース２に２枚目の画像を追加するときの動作を例示して説明する。 (Image database update)
Next, with reference to FIG. 7 to FIG. 11, an operation when updating the image database 2 (registering the second and subsequent images) in the database apparatus 1 of the present embodiment will be described. Here, the operation when the second image is added to the image database 2 subjected to the initialization described with reference to FIGS. 3 to 6 will be described as an example.

画像データベース２の更新を行うときには、まず、画像選択部４によって次に登録する画像が選択される。図７の例では、コンピュータの画像ｂが選択されている。そして、特徴抽出部５によってこの画像ｂに含まれる特徴点が抽出される。通常、画像ｂからは複数の特徴点が抽出され、以下の処理は複数の特徴点の各々について行われるが、ここでは、説明の便宜のため、まず、１つの特徴点について説明する。 When updating the image database 2, first, the image to be registered next is selected by the image selection unit 4. In the example of FIG. 7, the computer image b is selected. Then, the feature points included in the image b are extracted by the feature extraction unit 5. Usually, a plurality of feature points are extracted from the image b, and the following processing is performed for each of the plurality of feature points. Here, for convenience of explanation, one feature point will be described first.

図７では、特徴抽出部５によって画像ｂに含まれる１つ目の特徴点（図７では黒三角印として図示）が抽出されている。そして、この特徴点について、局所特徴量算出部６によって局所特徴量が計算され、ベクトル算出部７によって追加の特徴量ベクトルとして表され、図７に示すようにベクトル空間にプロットされる。この場合、追加の特徴量ベクトルｂ１は、部分空間Ａにプロットされている。 In FIG. 7, the first feature point (shown as a black triangle mark in FIG. 7) included in the image b is extracted by the feature extraction unit 5. Then, the local feature amount calculation unit 6 calculates a local feature amount for this feature point, the vector calculation unit 7 expresses it as an additional feature amount vector, and is plotted in a vector space as shown in FIG. In this case, the additional feature amount vector b1 is plotted in the partial space A.

このような場合、記憶制御部１４は、この追加の特徴量ベクトルｂ１が属する部分空間Ａに関連付けられた特徴量ベクトルａ１〜ａ４をＨＤＤ１３からメモリ１２に読み出して、図８に示すように、特徴量ベクトルａ１〜ａ４をベクトル空間（部分空間Ａ）にプロットする。 In such a case, the storage control unit 14 reads out the feature quantity vectors a1 to a4 associated with the partial space A to which the additional feature quantity vector b1 belongs from the HDD 13 to the memory 12, and as shown in FIG. The quantity vectors a1 to a4 are plotted in the vector space (subspace A).

そして、クラスタリング処理部８は、部分空間Ａにおいてクラスタリング処理を行って、この部分空間Ａを複数の部分空間に再分割する。図９では、部分空間Ａが２つの部分空間（部分空間ＡとＤ）に再分割された例が図示されている。このクラスタリング処理も、上記と同様、部分空間における特徴量ベクトルの分離度に基づいて行われる。この場合にも、分離度として、例えば分散比（＝クラス内分散／クラス外分散）などが用いられる。この分離度は、分離度算出部１０で算出され、判定制御部１１は、この分離度が所定のしきい値より大きいか否かを判定する。この場合、分離度がしきい値より大きいと判定され、部分空間Ａがクラスタリング処理によって再分割されている。 Then, the clustering processing unit 8 performs clustering processing in the partial space A, and subdivides the partial space A into a plurality of partial spaces. FIG. 9 illustrates an example in which the partial space A is subdivided into two partial spaces (partial spaces A and D). This clustering process is also performed based on the degree of feature vector separation in the subspace, as described above. Also in this case, as the degree of separation, for example, a dispersion ratio (= in-class dispersion / out-class dispersion) is used. This degree of separation is calculated by the degree-of-separation calculation unit 10, and the determination control unit 11 determines whether this degree of separation is greater than a predetermined threshold value. In this case, it is determined that the degree of separation is greater than the threshold value, and the subspace A is subdivided by clustering processing.

なお、本発明の分離度は、上記の分散比に限られるものではない。この分離度として、例えば、部分空間に含まれる特徴量ベクトルの数などが用いられてもよい。その場合、判定制御部１１は、部分空間に含まれる特徴量ベクトルの数が所定の数（しきい値）以上になったときに、その部分空間の再分割を行うように部分空間におけるクラスタリング処理を制御してもよい。 The degree of separation of the present invention is not limited to the above dispersion ratio. As the degree of separation, for example, the number of feature quantity vectors included in the partial space may be used. In this case, the determination control unit 11 performs clustering processing in the subspace so that the subspace is subdivided when the number of feature amount vectors included in the subspace exceeds a predetermined number (threshold value). May be controlled.

部分空間の再分割が行われると、代表ベクトル算出部９によって、再分割された部分空間の代表ベクトルが算出される。この場合、再分割された部分空間Ａの代表ベクトルは、特徴量ベクトルａ１とａ２の重心ベクトル（図では丸Ａ印で図示されている）であり、部分空間Ｄの代表ベクトルは、特徴量ベクトルａ３、ａ４、ｂ１の重心ベクトル（図では丸Ｄ印で図示されている）である。 When the subspace is subdivided, the representative vector calculation unit 9 calculates a representative vector of the subspace that has been subdivided. In this case, the representative vector of the subspace A that has been subdivided is the centroid vector (shown by a circle A in the figure) of the feature vectors a1 and a2, and the representative vector of the subspace D is the feature vector It is a center-of-gravity vector of a3, a4, and b1 (indicated by a circle D in the figure).

このような部分空間Ａのクラスタリング処理の結果に基づいて、検索テーブル作成部は、図１０に示すように検索テーブルの更新を行う。この場合、検索テーブルにおいて、部分空間Ａに画像ｂの対応が追加されるとともに、新たに部分空間Ｄの項目が追加され、部分空間Ｄには画像ａが対応するように検索テーブルの更新が行われる。 Based on the result of such clustering processing of the partial space A, the search table creation unit updates the search table as shown in FIG. In this case, in the search table, the correspondence of the image b is added to the partial space A, the item of the partial space D is newly added, and the search table is updated so that the image a corresponds to the partial space D. Is called.

また、図１１に示すように、部分空間Ａのクラスタリング処理の結果として得られた代表ベクトルが、記憶制御部１４によってメモリ１２に記憶される。また、部分空間Ａのクラスタリング処理の結果、クラスタリング処理された特徴量ベクトルが、部分空間に情報を関連付けてＨＤＤ１３に記憶される。図１１の例では、メモリ１２の記憶データが更新されて、２つの部分空間ＡとＤの代表ベクトルが記憶される。また、ＨＤＤ１３の記憶データが更新されて、部分空間Ａに属する特徴量ベクトルとして２つの特徴量ベクトルａ１、ａ２が記憶され、部分空間Ｄに属する特徴量ベクトルとして３つの特徴量ベクトルａ３、ａ４、ｂ１が記憶される。 Also, as shown in FIG. 11, the representative vector obtained as a result of the clustering process of the subspace A is stored in the memory 12 by the storage control unit 14. Further, as a result of the clustering process of the subspace A, the feature quantity vector subjected to the clustering process is stored in the HDD 13 in association with information in the subspace. In the example of FIG. 11, the data stored in the memory 12 is updated and the representative vectors of the two partial spaces A and D are stored. Also, the storage data of the HDD 13 is updated, two feature quantity vectors a1 and a2 are stored as feature quantity vectors belonging to the partial space A, and three feature quantity vectors a3, a4, b1 is stored.

（画像データベースの更新の他の例）
以上の説明では、画像データベース２の更新を行うときに部分空間の再分割が行われる例について説明した。つづいて、図１２〜図１５を参照して、画像データベース２の更新を行うときに部分空間の再分割が行われない例について説明する。ここでは、図７〜図１１で説明した画像ｂに含まれる１つ目の特徴点の抽出に続いて、２つ目の特徴点を抽出したときの動作を例にして説明する。 (Another example of updating the image database)
In the above description, the example in which the subspace is subdivided when the image database 2 is updated has been described. Next, an example in which the subspace is not redivided when the image database 2 is updated will be described with reference to FIGS. Here, the operation when the second feature point is extracted following the extraction of the first feature point included in the image b described in FIGS. 7 to 11 will be described as an example.

図１２では、特徴抽出部５によって画像ｂに含まれる２つ目の特徴点（図１２では黒三角印として図示）が抽出されている。そして、この特徴点についても、局所特徴量算出部６によって局所特徴量が計算され、ベクトル算出部７によって追加の特徴量ベクトルとして表され、図１２に示すようにベクトル空間にプロットされる。この場合、追加の特徴量ベクトルｂ２は、部分空間Ｄにプロットされている。 In FIG. 12, the second feature point (shown as a black triangle mark in FIG. 12) included in the image b is extracted by the feature extraction unit 5. Also for this feature point, the local feature quantity is calculated by the local feature quantity calculation unit 6, expressed as an additional feature quantity vector by the vector calculation unit 7, and plotted in a vector space as shown in FIG. 12. In this case, the additional feature vector b2 is plotted in the subspace D.

つぎに、記憶制御部１４は、この追加の特徴量ベクトルｂ２が属する部分空間Ｄに関連付けられた特徴量ベクトルａ３、ａ４、ｂ１をＨＤＤ１３からメモリ１２に読み出して、図１３に示すように、特徴量ベクトルａ３、ａ４、ｂ１をベクトル空間（部分空間Ｄ）にプロットする。 Next, the storage control unit 14 reads out the feature quantity vectors a3, a4, b1 associated with the partial space D to which the additional feature quantity vector b2 belongs from the HDD 13 to the memory 12, and as shown in FIG. The quantity vectors a3, a4, b1 are plotted in the vector space (subspace D).

この場合、部分空間Ｄに属する特徴量ベクトルがある程度まとまっており（あまり分離しておらず）、部分空間Ｄにおける特徴量ベクトルの分離度が、しきい値より小さいと判定される。したがって、クラスタリング処理部８は、部分空間Ｄにおいてクラスタリング処理を行わない。 In this case, the feature amount vectors belonging to the partial space D are gathered to some extent (not separated so much), and it is determined that the degree of separation of the feature amount vectors in the partial space D is smaller than the threshold value. Therefore, the clustering processing unit 8 does not perform the clustering process in the subspace D.

その後、代表ベクトル算出部９によって、部分空間Ｄの代表ベクトルが算出される。なお、この場合の部分空間Ｄの代表ベクトルは、特徴量ベクトルａ３、ａ４、ｂ１、ｂ２の重心ベクトル（図では丸Ｄ印で図示されている）である。 Thereafter, the representative vector calculation unit 9 calculates the representative vector of the partial space D. In this case, the representative vector of the partial space D is the centroid vector (shown by a circle D in the figure) of the feature amount vectors a3, a4, b1, and b2.

このような特徴量ベクトルｂ２の追加に基づいて、検索テーブル作成部は、図１４に示すように検索テーブルの更新を行う。この場合、検索テーブルにおいて、部分空間Ｄに、画像ａと画像ｂが対応するように検索テーブルの更新が行われる。 Based on the addition of the feature quantity vector b2, the search table creation unit updates the search table as shown in FIG. In this case, in the search table, the search table is updated so that the image a and the image b correspond to the partial space D.

また、図１５に示すように、特徴量ベクトルｂ２の追加の結果として得られた代表ベクトルが、記憶制御部１４によってメモリ１２に記憶される。また、特徴量ベクトルｂ２の追加の結果、ＨＤＤ１３に記憶されている部分空間と特徴量ベクトルの関連付けが更新される。図１５の例では、部分空間Ｄに追加の特徴量ベクトルｂ２が追加される。 Further, as shown in FIG. 15, the representative vector obtained as a result of adding the feature vector b2 is stored in the memory 12 by the storage control unit 14. Further, as a result of the addition of the feature vector b2, the association between the partial space and the feature vector stored in the HDD 13 is updated. In the example of FIG. 15, an additional feature vector b2 is added to the subspace D.

（画像検索）
最後に、図１６〜図２０を参照して、本実施の形態のデータベース装置１において、画像検索を行うときの動作について説明する。ここでは、上記のような初期化・更新によって生成された画像データベース２を用いて画像検索する例を説明する。 (Image search)
Finally, with reference to FIG. 16 to FIG. 20, an operation when performing an image search in the database apparatus 1 of the present embodiment will be described. Here, an example in which an image search is performed using the image database 2 generated by the initialization / update as described above will be described.

まず、画像検索に用いられる画像データベース２について説明する。図１６には、コンピュータの３つの画像（画像ａ〜ｃ）が画像データベース２に登録されたときの特徴量ベクトルのプロットが例示されている。この例では、画像ａから８つの特徴点が抽出され、８つの特徴量ベクトル（ａ１〜ａ８）がプロットされている。また、画像ｂからは６つの特徴点が抽出され、６つの特徴量ベクトル（ｂ１〜ｂ６）がプロットされている。また、画像ｃからは６つの特徴点が抽出され、６つの特徴量ベクトル（ｃ１〜ｃ６）がプロットされている。そして、ベクトル空間は、７つの部分空間（部分空間Ａ〜Ｇ）に分割されている。なお、図１６の例では、説明の便宜上、画像データベース２に３つの画像が登録された例について説明するが、画像の数はこれに限定されないことは言うまでもない。 First, the image database 2 used for image search will be described. FIG. 16 illustrates a plot of feature quantity vectors when three images (images a to c) of a computer are registered in the image database 2. In this example, eight feature points are extracted from the image a, and eight feature quantity vectors (a1 to a8) are plotted. Also, six feature points are extracted from the image b, and six feature quantity vectors (b1 to b6) are plotted. Further, six feature points are extracted from the image c, and six feature quantity vectors (c1 to c6) are plotted. The vector space is divided into seven partial spaces (partial spaces A to G). In the example of FIG. 16, an example in which three images are registered in the image database 2 will be described for convenience of explanation, but it goes without saying that the number of images is not limited to this.

上記のような例では、図１７に示すような検索テーブルが作成される。この検索テーブルでは、部分空間Ａには画像ａとｃが対応し、部分空間Ｂには画像ａとｃが対応している。また、部分空間Ｃには画像ａが対応し、部分空間Ｄには画像ｂが対応している。さらに、部分空間Ｅには画像ａとｂが対応し、部分空間Ｆには画像ｂとｃが対応している。そして、部分空間Ｇには画像ａ〜ｃが対応している。 In the above example, a search table as shown in FIG. 17 is created. In this search table, the images a and c correspond to the partial space A, and the images a and c correspond to the partial space B. Further, the image a corresponds to the partial space C, and the image b corresponds to the partial space D. Further, the images a and b correspond to the partial space E, and the images b and c correspond to the partial space F. Then, the images a to c correspond to the partial space G.

なお、この場合、図１８に示すように、メモリ１２には、部分空間Ａ〜Ｇの代表ベクトルが記憶されており、ＨＤＤ１３には、部分空間に属する特徴量ベクトルがその部分空間に関連付けて記憶されている。具体的には、部分空間Ａに属する特徴量ベクトルとして３つの特徴量ベクトルａ１、ａ２、ｃ１が記憶され、部分空間Ｂに属する特徴量ベクトルとして３つの特徴量ベクトルａ７、ａ８、ｃ６が記憶されている。また、部分空間Ｃに属する特徴量ベクトルとして２つの特徴量ベクトルａ５、ａ６が記憶され、部分空間Ｄに属する特徴量ベクトルとして３つの特徴量ベクトルｂ２、ｂ３、ｂ４が記憶されている。また、部分空間Ｅに属する特徴量ベクトルとして２つの特徴量ベクトルａ３、ｂ１が記憶され、部分空間Ｆに属する特徴量ベクトルとして２つの特徴量ベクトルｂ５、ｃ５が記憶されている。そして、部分空間Ｇに属する特徴量ベクトルとして５つの特徴量ベクトルａ４、ｂ６、ｃ２、ｃ３、ｃ４が記憶されている。 In this case, as shown in FIG. 18, the memory 12 stores the representative vectors of the partial spaces A to G, and the HDD 13 stores the feature vector belonging to the partial space in association with the partial space. Has been. Specifically, three feature quantity vectors a1, a2, and c1 are stored as feature quantity vectors belonging to the partial space A, and three feature quantity vectors a7, a8, and c6 are stored as feature quantity vectors belonging to the partial space B. ing. Further, two feature quantity vectors a5 and a6 are stored as feature quantity vectors belonging to the partial space C, and three feature quantity vectors b2, b3 and b4 are stored as feature quantity vectors belonging to the partial space D. Further, two feature quantity vectors a3 and b1 are stored as feature quantity vectors belonging to the partial space E, and two feature quantity vectors b5 and c5 are stored as feature quantity vectors belonging to the partial space F. Then, five feature quantity vectors a4, b6, c2, c3, c4 are stored as feature quantity vectors belonging to the partial space G.

つぎに、検索テーブルを用いた画像検索の動作について説明する。画像検索を行うときには、まず、図１９に示すように、画像選択部４によって入力画像が選択され、特徴抽出部５によってこの入力画像に含まれる特徴点が抽出される。この例では、６つの特徴点（図１９では黒星印として図示）が抽出されている。 Next, an image search operation using the search table will be described. When performing an image search, first, as shown in FIG. 19, an input image is selected by the image selection unit 4, and feature points included in the input image are extracted by the feature extraction unit 5. In this example, six feature points (shown as black stars in FIG. 19) are extracted.

つぎに、局所特徴量算出部６によってこれらの特徴点の局所特徴量が計算され、ベクトル算出部７によって局所特徴量が特徴量ベクトルとして表される。そして、この特徴量ベクトルがベクトル空間にプロットされる。図１９の例では、部分空間Ｄ、Ｅ、Ｆ、Ｇに、特徴量ベクトルがプロットされている。 Next, the local feature quantity of these feature points is calculated by the local feature quantity calculation unit 6, and the local feature quantity is expressed as a feature quantity vector by the vector calculation unit 7. Then, this feature vector is plotted in a vector space. In the example of FIG. 19, feature vectors are plotted in the subspaces D, E, F, and G.

検索処理部１７は、検索テーブルを読み出して、上記の特徴量ベクトルがプロットされた部分空間（部分空間Ｄ、Ｅ、Ｆ、Ｇ）に対応する画像の数をカウントする。図２０の例では、部分空間Ｄ、Ｅ、Ｆ、Ｇに対応する画像ａの数は２つであり、部分空間Ｄ、Ｅ、Ｆ、Ｇに対応する画像ｂの数は４つであり、部分空間Ｄ、Ｅ、Ｆ、Ｇに対応する画像ｃの数は２つである。その結果、画像ｂの数が最も多いことが分かる。したがって、この場合、入力画像に最も類似する画像として、画像ｂという検索結果が得られることになる。 The search processing unit 17 reads the search table and counts the number of images corresponding to the partial spaces (partial spaces D, E, F, and G) on which the feature amount vectors are plotted. In the example of FIG. 20, the number of images a corresponding to the partial spaces D, E, F, and G is two, and the number of images b corresponding to the partial spaces D, E, F, and G is four, The number of images c corresponding to the partial spaces D, E, F, and G is two. As a result, it can be seen that the number of images b is the largest. Therefore, in this case, a search result called image b is obtained as the image most similar to the input image.

このようなVisual Wordsを用いた検索では、入力した画像の局所特徴量が、どの部分空間（Visual Word）に属するかのみを判定すればよいため、極めて高速に検索を実行することができる。なお、特徴量ベクトルがどの部分空間（Visual Word）に属するかは、例えばBest-Bin-Firstアルゴリズムなどの最近傍点探索アルゴリズムを用いることによって、効率的に見つけることができる。 In such a search using Visual Words, it is only necessary to determine which partial space (Visual Word) the local feature amount of the input image belongs to, so that the search can be executed very quickly. Note that the subspace (Visual Word) to which the feature vector belongs can be efficiently found by using a nearest neighbor search algorithm such as a Best-Bin-First algorithm.

このような本実施の形態の画像検索用データベース装置１によれば、ベクトル空間における部分空間の範囲を記憶するメモリ１２と、部分空間に関連付けて特徴量ベクトルを記憶するＨＤＤ１３とを設けることにより、データベースに大量の画像が格納される場合であっても、クラスタリング処理のために必要なメモリ１２の容量が少なくて済み、また、データベースを逐次的に更新することができる。 According to such an image search database device 1 of the present embodiment, by providing the memory 12 that stores the range of the partial space in the vector space and the HDD 13 that stores the feature quantity vector in association with the partial space, Even when a large amount of images are stored in the database, the capacity of the memory 12 required for the clustering process can be reduced, and the database can be updated sequentially.

すなわち、本実施の形態では、クラスタリング処理のために、部分空間の代表ベクトルをメモリ１２に記憶しておき、局所特徴量を表す特徴量ベクトルは、部分空間に関連付けてＨＤＤ１３に記憶しておけばよい。したがって、データベースに大量の画像が格納される場合であっても、クラスタリング処理のために必要なメモリ１２の容量が少なくて済む。 That is, in this embodiment, for clustering processing, a representative vector of a partial space is stored in the memory 12, and a feature vector representing a local feature is stored in the HDD 13 in association with the partial space. Good. Therefore, even when a large amount of images are stored in the database, the capacity of the memory 12 required for the clustering process can be reduced.

また、本実施の形態では、データベースに新たに画像を追加しようとする場合に、追加の画像に含まれる特徴点の特徴量ベクトル（追加の特徴量ベクトル）を算出し、その追加の特徴量ベクトルが属する部分空間に関連付けられた特徴量ベクトルが、ＨＤＤ１３からメモリ１２に読み出される。そして、これらの特徴量ベクトルについてクラスタリング処理を行って部分空間を再分割する。このようにして、逐次的に（新たな画像がデータベースに追加される度に）データベースを更新することが可能になる。 In this embodiment, when a new image is to be added to the database, a feature amount vector (additional feature amount vector) of a feature point included in the additional image is calculated, and the additional feature amount vector The feature quantity vector associated with the partial space to which the data belongs is read from the HDD 13 to the memory 12. Then, a clustering process is performed on these feature quantity vectors to subdivide the partial space. In this way, the database can be updated sequentially (each time a new image is added to the database).

具体的には、従来のVisual Wordsの方法では、ｎ枚の画像が画像データベース２に登録されている場合、クラスタリング処理の実行に要する計算量ｏ（ｎ）が画像の数ｎに応じて増大する。したがって、データベースに大量の画像が登録されている場合には、クラスタリング処理の実行に非常に時間がかかる。それに対して、本実施の形態では、クラスタリング処理の実行に要する計算量ｏ（１）は画像の数ｎによらない。したがって、データベースに大量の画像が登録されている場合であっても、短時間でクラスタリング処理を実行することができ、逐次的なデータベースの更新が可能になる。 Specifically, in the conventional Visual Words method, when n images are registered in the image database 2, the calculation amount o (n) required for executing the clustering process increases according to the number n of images. . Therefore, when a large number of images are registered in the database, it takes a very long time to execute the clustering process. On the other hand, in the present embodiment, the calculation amount o (1) required for executing the clustering process does not depend on the number n of images. Therefore, even when a large number of images are registered in the database, the clustering process can be executed in a short time, and the database can be updated sequentially.

また、本実施の形態では、部分空間における特徴量ベクトルの分離度に基づいて、その部分空間を分割するクラスタリング処理を行うか否かの判定が行われる。例えば、分離度が大きい場合には、部分空間におけるクラスタリング処理が行われ、分離度が小さい場合には、部分空間におけるクラスタリング処理は行われない。これにより、部分空間における特徴量ベクトルの分離度に応じた適切なクラスタリング処理が行われる。 In the present embodiment, based on the degree of separation of the feature quantity vectors in the partial space, it is determined whether or not to perform clustering processing for dividing the partial space. For example, when the degree of separation is large, clustering processing in the subspace is performed, and when the degree of separation is small, clustering processing in the subspace is not performed. Thereby, an appropriate clustering process according to the degree of separation of the feature quantity vectors in the partial space is performed.

また、本実施の形態では、クラスタリング処理によって部分空間が再分割された場合には、それに応じて検索テーブルが適切に更新される。したがって、逐次的なデータベースの更新が反映された画像検索が可能になる。 In the present embodiment, when the subspace is subdivided by the clustering process, the search table is appropriately updated accordingly. Therefore, it is possible to search for an image reflecting a sequential database update.

以上、本発明の実施の形態を例示により説明したが、本発明の範囲はこれらに限定されるものではなく、請求項に記載された範囲内において目的に応じて変更・変形することが可能である。 The embodiments of the present invention have been described above by way of example, but the scope of the present invention is not limited to these embodiments, and can be changed or modified according to the purpose within the scope of the claims. is there.

以上のように、本発明にかかる画像検索用データベース装置は、データベースに大量の画像が格納される場合であっても、クラスタリング処理のために必要なメモリ容量が少なくて済み、また、データベースを逐次的に更新することができるという効果を有し、Visual Wordsを用いた画像検索用データベース等として有用である。 As described above, the database device for image search according to the present invention requires a small memory capacity for the clustering process even when a large amount of images are stored in the database. It is useful as an image search database using Visual Words.

本実施の形態における画像検索用データベース装置の構成を示すブロック図である。It is a block diagram which shows the structure of the database apparatus for image search in this Embodiment. クラスタリング処理部の構成を示すブロック図である。It is a block diagram which shows the structure of a clustering process part. １枚目の画像を登録するときの特徴量ベクトルのプロットの説明図である。It is explanatory drawing of the plot of the feature-value vector when registering the 1st image. ベクトル空間におけるクラスタリング処理の説明図である。It is explanatory drawing of the clustering process in vector space. １枚目の画像を登録したときの検索テーブルを示す図である。It is a figure which shows the search table when registering the 1st image. メモリおよびＨＤＤに記憶されるデータを示す図である。It is a figure which shows the data memorize | stored in a memory and HDD. ２枚目の画像を登録するときの追加の特徴量ベクトルのプロットの説明図である。It is explanatory drawing of the plot of the additional feature-value vector when registering the 2nd image. 部分空間における特徴量ベクトルの読出しの説明図である。It is explanatory drawing of the reading of the feature-value vector in a partial space. 部分空間におけるクラスタリング処理の説明図である。It is explanatory drawing of the clustering process in a partial space. 検索テーブルの更新の説明図である。It is explanatory drawing of the update of a search table. メモリおよびＨＤＤに記憶されるデータを示す図である。It is a figure which shows the data memorize | stored in a memory and HDD. ２枚目の画像を登録するときの追加の特徴量ベクトルのプロットの説明図である。It is explanatory drawing of the plot of the additional feature-value vector when registering the 2nd image. 部分空間における特徴量ベクトルの読出しの説明図である。It is explanatory drawing of the reading of the feature-value vector in a partial space. 検索テーブルの更新の説明図である。It is explanatory drawing of the update of a search table. メモリおよびＨＤＤに記憶されるデータを示す図である。It is a figure which shows the data memorize | stored in a memory and HDD. 複数の画像が登録されたときの特徴量ベクトルのプロットの説明図である。It is explanatory drawing of the plot of the feature-value vector when a some image is registered. 複数の画像が登録されたときの検索テーブルを示す図である。It is a figure which shows a search table when a several image is registered. メモリおよびＨＤＤに記憶されるデータを示す図である。It is a figure which shows the data memorize | stored in a memory and HDD. 入力画像に含まれる特徴量ベクトルのプロットの説明図である。It is explanatory drawing of the plot of the feature-value vector contained in an input image. 検索テーブルを用いた画像検索の説明図である。It is explanatory drawing of the image search using a search table.

Explanation of symbols

１データベース装置
２画像データベース
３画像受付部
４画像選択部
５特徴点抽出部
６局所特徴量算出部
７ベクトル算出部
８クラスタリング処理部
９代表ベクトル算出部
１０分離度算出部
１１判定制御部
１２メモリ
１３ＨＤＤ
１４記憶制御部
１５検索テーブル生成部
１６検索テーブル記憶部
１７検索処理部 DESCRIPTION OF SYMBOLS 1 Database apparatus 2 Image database 3 Image reception part 4 Image selection part 5 Feature point extraction part 6 Local feature-value calculation part 7 Vector calculation part 8 Clustering process part 9 Representative vector calculation part 10 Separation degree calculation part 11 Judgment control part 12 Memory 13 HDD
14 Storage Control Unit 15 Search Table Generation Unit 16 Search Table Storage Unit 17 Search Processing Unit

Claims

An image search database device in which a plurality of images for image search are registered,
Each local feature amount of a plurality of feature points included in each image is represented as a feature amount vector, and clustering processing is performed in a vector space including the plurality of feature amount vectors to divide the vector space into a plurality of partial spaces. And a search table showing the correspondence between each partial space and each image is generated,
The image search database device comprises:
Working storage means for storing a range of the subspace in the vector space for use in the clustering process;
A large-capacity storage unit that has a larger storage capacity than the working storage unit, and stores the feature amount vectors of the plurality of images in association with a partial space to which each feature vector belongs, for use in the clustering process;
Vector calculation means for calculating a local feature amount of a feature point included in an additional image newly registered as an image for image search as an additional feature amount vector;
Read control means for reading out the feature quantity vector associated with the partial space to which the additional feature quantity vector belongs from the mass storage means to the working storage means;
Clustering processing means for performing a clustering process on the partial space to which the additional feature quantity vector belongs based on the read feature quantity vector and the additional feature quantity vector, and subdividing the partial space;
When the partial space is subdivided by the clustering processing means, a table updating means for updating a correspondence relationship between the divided partial space and each image in the search table;
An image search database device comprising:

A degree-of-separation calculating means for calculating a degree of separation indicating the degree of separation of the feature amount vector in the partial space to which the additional feature amount vector belongs;
It is determined whether or not the degree of separation is greater than a predetermined threshold value. If the degree of separation is greater than the predetermined threshold value, clustering processing in the subspace is performed, and the degree of separation is determined to be a predetermined threshold value. A determination control means for controlling the clustering process in the subspace so as not to perform the clustering process in the subspace, if smaller than the value;
The database device for image search according to claim 1 , further comprising:

A method of managing an image search database in which a plurality of images for image search are registered,
In the image search database, local feature amounts of a plurality of feature points included in each image are represented as feature amount vectors, and clustering processing is performed in a vector space including the plurality of feature amount vectors. Is divided into a plurality of subspaces, and a search table showing the correspondence between each subspace and each image is generated,
The image search database management method includes:
Storing the range of the subspace in the vector space in a working storage means for use in the clustering process;
Storing the feature quantity vectors of the plurality of images in association with the partial space to which each feature vector belongs in a large capacity storage means having a larger storage capacity than the working storage means for use in the clustering process;
Calculating a local feature amount of a feature point included in an additional image newly registered as an image for image search as an additional feature amount vector;
Reading out the feature vector associated with the partial space to which the additional feature vector belongs from the large-capacity storage unit to the working storage unit;
Based on the read feature quantity vector and the additional feature quantity vector, clustering processing is performed in the partial space to which the additional feature quantity vector belongs, and the subspace is subdivided;
When the partial space is subdivided by the clustering process, updating the correspondence relationship between the divided subspace and each image in the search table;
A database management method for image retrieval, comprising:

An image search database management program in which a plurality of images for image search are registered,
In the image search database, local feature amounts of a plurality of feature points included in each image are represented as feature amount vectors, and clustering processing is performed in a vector space including the plurality of feature amount vectors. Is divided into a plurality of subspaces, and a search table showing the correspondence between each subspace and each image is generated,
The image search database management program is stored in a computer.
A process for storing a range of the partial space in the vector space in a working storage means for use in the clustering process;
A process of storing the feature quantity vectors of the plurality of images in association with the partial space to which each feature quantity vector belongs in a large capacity storage means having a larger storage capacity than the working storage means for use in the clustering process;
Processing for calculating a local feature amount of a feature point included in an additional image newly registered as an image for image search as an additional feature amount vector;
A process of reading a feature vector associated with a partial space to which the additional feature vector belongs from the large-capacity storage unit to the working storage unit;
Based on the read feature quantity vector and the additional feature quantity vector, a process of performing a clustering process in the partial space to which the additional feature quantity vector belongs to re-divide the partial space;
When the subspace is subdivided by the clustering process, in the search table, a process of updating a correspondence relationship between the divided subspace and each image;
The database management program for image retrieval characterized by performing this.