JP5382383B2

JP5382383B2 - Database processing apparatus, database processing method, program, and database data structure

Info

Publication number: JP5382383B2
Application number: JP2012048079A
Authority: JP
Inventors: 岳彦柏木; 純平上村
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2011-03-24
Filing date: 2012-03-05
Publication date: 2014-01-08
Anticipated expiration: 2032-03-05
Also published as: US8838552B2; JP2012212427A; US20120246128A1; CN102708145A

Description

The present invention relates to database technology that uses GPGPU (general-purpose computing on graphics processing units).

In recent years, GPGPU technology, in which a parallel computing unit such as a GPU (graphics processing unit) performs general-purpose computation, has attracted attention. A GPU has a higher degree of arithmetic-unit parallelism and higher computational throughput than a CPU (central processing unit). Its configuration also resembles that of a SIMD unit, which supplies instructions to multiple arithmetic units. To achieve high performance with GPGPU, instruction branching must be kept low, and the data supplied to a given set of arithmetic units must be equal in amount and contiguous.

A column-store data structure is considered well suited to processing by a parallel computing unit such as a GPGPU. A column store represents fixed-length data as a fixed-length array for each column, so the array can be handed to the GPGPU for processing as it is.

For example, Non-Patent Document 1 discloses a technique that uses GPGPU to assist a full-text search over the content of a single large text.

Non-Patent Document 1: Ryuichi Higashi, Noriyuki Fujimoto, Kenichi Hagihara, "Examination of high-speed full-text search for large text on main memory using CUDA, a general-purpose computing environment for GPUs", IPSJ SIG Technical Report, No. 19, 2008, pp. 139-144.

However, neither a database that efficiently stores data sets containing variable-length data nor a database processing method that can process such a database efficiently has yet been realized.

The present invention has been made in view of the above problem. Its object is to provide a database system, a database processing method, and the like that use a parallel computing unit to achieve efficient database processing even for variable-length data.

The present invention is a database processing apparatus having a parallel computing unit, comprising: data storage means that determines a fragment length according to the data processing unit of the parallel computing unit and, in a column-store database, stores tuple data including variable-length data in fragments and stores metadata of each fragment as a fragment header; and parallel operation means that, when processing data stored in the column-store database, refers to the metadata to determine the fragments to be assigned to each thread, assigns fragments to each thread based on that determination, and causes the threads to execute a parallel operation. The parallel operation means comprises: means by which each thread searches its assigned fragments for a character string and records, as a search result, a result in which the bit at the byte position within the fragment is inverted; and means that assigns to each thread a number of the search results according to the data processing unit of the parallel computing unit, and by which each thread detects bit inversion in its assigned search results and records a detection result flag as a tuple-level search result.

The present invention is a data processing method for a database processing apparatus having a parallel computing unit, comprising: a data storage step of determining a fragment length according to the data processing unit of the parallel computing unit and, in a column-store database, storing tuple data including variable-length data in fragments and storing metadata of each fragment as a fragment header; and a parallel operation step of, when processing data stored in the column-store database, referring to the metadata to determine the fragments to be assigned to each thread, assigning fragments to each thread based on that determination, and causing the threads to execute a parallel operation. The parallel operation step comprises: a step in which each thread searches its assigned fragments for a character string and records, as a search result, a result in which the bit at the byte position within the fragment is inverted; and a step of assigning to each thread a number of the search results according to the data processing unit of the parallel computing unit, in which each thread detects bit inversion in its assigned search results and records a detection result flag as a tuple-level search result.

The present invention is a program that causes a computer having a parallel computing unit to execute: a data storage step of determining a fragment length according to the data processing unit of the parallel computing unit and, in a column-store database, storing tuple data including variable-length data in fragments and storing metadata of each fragment as a fragment header; and a parallel operation step of, when processing data stored in the column-store database, referring to the metadata to determine the fragments to be assigned to each thread, assigning fragments to each thread based on that determination, and causing the threads to execute a parallel operation. The parallel operation step comprises: a step in which each thread searches its assigned fragments for a character string and records, as a search result, a result in which the bit at the byte position within the fragment is inverted; and a step of assigning to each thread a number of the search results according to the data processing unit of the parallel computing unit, in which each thread detects bit inversion in its assigned search results and records a detection result flag as a tuple-level search result.

The present invention is a data structure of a column-store database, comprising: fragments that have a fixed fragment length determined by a computer according to the data processing unit of a parallel computing unit, and in which tuple data including variable-length data is stored by the computer; and fragment headers, assigned by the computer to each of the fragments, each indicating a number that gives the order of the fragment and the offset of the fragment's last data position from the head position of the tuple.

According to the present invention, efficient database processing can be achieved even for variable-length data by using a parallel computing unit.

FIG. 1 is a schematic diagram of the system configuration of a database system according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the data structure in the database.
FIG. 3 is a flowchart explaining the operation of the database system.
FIG. 4 is a diagram illustrating search results recorded in a storage area.
FIG. 5 is a diagram illustrating reprocessing results recorded in a predetermined storage area.
FIG. 6 is a diagram illustrating an outline of the operation of the database system.
FIG. 7 is a flowchart explaining the process by which the parallel processing unit determines the fragments to be assigned to each thread.
FIG. 8 is a flowchart explaining the process by which the variable-length data storage and processing unit stores data in the database.

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 is a schematic diagram of the system configuration of a database system according to an embodiment of the present invention. As illustrated, the system includes a database 10 and a database processing apparatus 20, which are connected by a network such as a LAN (local area network).

The database 10 is a column-store database. Its management units are the tuple, column, table, and schema, and each higher-level structure can hold multiple instances of the unit below it. A tuple contains the data of one row in the database. Within a given column store, the data of a specific column is collected tuple by tuple.

The database processing apparatus 20 consists of a host computer, a coprocessor (the parallel computing unit), and the like. It includes a parallel computing unit environment detection unit 21, a variable-length data storage length determination unit 22, a variable-length data storage and processing unit 23, a parallel processing unit 24, and a data processing result storage and reprocessing unit 25.

The parallel computing unit environment detection unit 21 acquires information about the processing capability of the parallel computing unit (GPU) in the apparatus, such as its data processing unit. The variable-length data storage length determination unit 22 determines the data storage length (fragment length) used in the database 10 based on the information acquired by the parallel computing unit environment detection unit 21.

The variable-length data storage and processing unit 23 stores data in the database 10 based on the data storage length determined by the variable-length data storage length determination unit 22. Per-column variable-length tuple data (a column tuple), which may include fixed-length data, is stored in fixed-length fragments. The tuple data of a column is stored, as needed, in a fragment set consisting of multiple fragments.

FIG. 2 shows an example of the data structure of the database 10. As illustrated, the data stored in the database 10 consists of fragments and a fragment header attached to each fragment. A fragment stores part or all of one column tuple. The fragment header stores metadata about the column tuple. Fragments belonging to the same tuple are always stored contiguously. The fragment header indicates a number (ct id) giving the order of the fragment that holds partial data of the column tuple, and the offset (Length) of the fragment's last data position from the head position of the tuple.
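
The layout described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the field names `ct_id` and `length` mirror the header items ct id and Length, while the 0-based numbering, the zero-byte padding, the reading of Length as the cumulative number of tuple bytes stored so far, and the helper `pack_tuple` are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fragment:
    ct_id: int    # order of this fragment within its column tuple (0-based: an assumption)
    length: int   # offset of the fragment's last data byte from the tuple head,
                  # taken here as the cumulative byte count (an assumption)
    data: bytes   # fixed-length payload, zero-padded (padding byte is an assumption)


def pack_tuple(value: bytes, frag_len: int) -> List[Fragment]:
    """Split one column tuple into consecutive fixed-length fragments."""
    if frag_len <= 0:
        raise ValueError("frag_len must be positive")
    count = max(1, -(-len(value) // frag_len))  # ceiling division; empty tuples use one fragment
    fragments = []
    for i in range(count):
        chunk = value[i * frag_len:(i + 1) * frag_len]
        fragments.append(
            Fragment(ct_id=i,
                     length=i * frag_len + len(chunk),
                     data=chunk.ljust(frag_len, b"\x00")))
    return fragments


fragments = pack_tuple(b"parallel", 4)
```

With a 4-byte fragment length, `b"parallel"` occupies two full fragments, and a 5-byte value such as `b"GPGPU"` occupies one full fragment plus one fragment holding a single data byte and three padding bytes.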

The fragment length can be fixed for each column within a fragment set, but it can be set independently for each column or for each fragment set. The variable-length data storage length determination unit 22 measures the data lengths appearing in a column or fragment set and, using a suitable algorithm, derives a fragment length that fits the processing capability of the parallel computing unit and improves processing efficiency or space efficiency. For example, a data length suited to the data processing unit of the parallel computing unit (4, 8, 16, 32, 64 bytes, etc.) may be set. The average data length of the variable-length data may also be used, as may a length near that average that suits the data processing unit of the parallel computing unit.
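
The text leaves the "suitable algorithm" open. One possible rule, sketched below in Python, picks the candidate length nearest to the average observed data length. The function name `choose_fragment_length`, the candidate list, and the tie-break toward the smaller length are assumptions for illustration only.

```python
from statistics import mean


def choose_fragment_length(data_lengths, candidates=(4, 8, 16, 32, 64)):
    """Pick the candidate fragment length closest to the average data length.

    Ties are broken toward the smaller candidate, which favors space efficiency
    (an assumption; the source does not specify a tie-break).
    """
    average = mean(data_lengths)
    return min(candidates, key=lambda c: (abs(c - average), c))


# Byte lengths of "HELLO!", "parallel", "GPU", "GPGPU": average 5.5, nearest candidate 4.
chosen = choose_fragment_length([6, 8, 3, 5])
```

A smaller fragment length wastes less padding on short values but spreads long values over more fragments, so a real system would weigh this choice against the search workload.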

The parallel processing unit 24 performs parallel processing on the data stored in the database 10. The data processing result storage and reprocessing unit 25 processes the results computed by the parallel processing unit 24.

Next, the operation of the database system according to this embodiment is described, taking as an example a search for a character string in the database 10.

Search execution falls into two cases: Case 1, in which processing that spans fragments is required, and Case 2, in which processing that spans fragments is not required.

Case 1 includes the situation where the search string (a variable-length byte string) is longer than the byte length of the fragment storage unit and the tuple being searched consists of multiple fragments. In this case, the fragments assigned to the threads overlap and are processed in multiple stages. The Nth thread handles the Nth and (N+1)th fragments, and the (N+1)th thread handles the (N+1)th and (N+2)th fragments. The last thread handles only the final (END) fragment.
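
The overlapping assignment for Case 1 can be written down directly. The Python sketch below uses 0-based fragment indices and a hypothetical function name, `overlapped_assignment`; it only enumerates which fragments each thread would read.

```python
def overlapped_assignment(num_fragments):
    """Thread n reads fragments n and n+1; the last thread reads only the final fragment."""
    return [
        (n, n + 1) if n + 1 < num_fragments else (n,)
        for n in range(num_fragments)
    ]


assignment = overlapped_assignment(4)
```

Because every boundary between adjacent fragments falls inside exactly one thread's pair, a match that straddles a boundary is visible to the thread that owns the fragment in which the match begins.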

Case 2 includes a search for a single unit of the minimum data length managed in a fragment (one byte), and a search in which every tuple fits in exactly one fragment. These can be handled as special cases of Case 1, and because they do not require multiple processing stages, they can be processed at high speed. The following description therefore focuses on Case 1, with reference to FIG. 3.

First, the parallel processing unit 24 allocates a storage area for the result set of the search (step S1). To accommodate multiple hits, the search result is given as one bit for each byte in a fragment. Specifically, the bit at the position corresponding to the first byte of the search byte string is inverted.

The parallel processing unit 24 determines the fragments to be assigned to each thread executed on the parallel computing unit by referring to the metadata attached to the fragments (step S2). The number of fragments processed by one thread varies with the fragment length. For example, setting the amount of data per thread to the least common multiple of 8 (the number of bits in a byte) and the byte length of a fragment leaves no gaps in the result set and gives every thread data of the same length. For fragments with a fragment length of 4 bytes, two fragments may be given to a thread. For fragments with a fragment length of 8 bytes, one fragment may be given to a thread. For fragments with a fragment length of 16 bytes, one fragment may be given to a thread. The fragments assigned to each thread may also be determined based on the data processing unit of the parallel computing unit. When the arithmetic unit is 32 bits wide, the length of the data given to each thread may be made a multiple of 32 bits.
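
The least-common-multiple rule reproduces the three examples in the text. The short Python sketch below shows the arithmetic; the function name `fragments_per_thread` is hypothetical.

```python
from math import gcd


def fragments_per_thread(frag_len_bytes):
    """Number of fragments per thread so that each thread covers lcm(8, frag_len) bytes."""
    lcm = 8 * frag_len_bytes // gcd(8, frag_len_bytes)
    return lcm // frag_len_bytes


counts = {length: fragments_per_thread(length) for length in (4, 8, 16)}
```

Since the result set holds one bit per data byte, a thread that covers a multiple of 8 bytes writes a whole number of result bytes, which is why the result set has no gaps between threads.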

Parallel processing is then executed based on the determination made in step S2 (step S3). Specifically, each thread performs the following processing. When given the variable-length byte string to search for, the thread computes the length of that byte string. The thread is also given the position of the fragment set it is responsible for. It reads the specified fragments in order and performs a search to determine whether the search byte string appears in the fragment set it has read.

As a condition for running the search, the thread may compute the total length of the data stored in the sequential fragments given so far and run the search when that total satisfies a predetermined condition, namely that it is at least (search byte string length − 1) + sequential fragment data length. In other words, the search may be run when the unprocessed data length in the fragment currently being processed is longer than the search byte string. If the condition is not satisfied, the thread may read the next fragment, recompute the total data length, and check the condition again.

In the search, when the starting offset of the search byte string lies inside the first fragment assigned to the thread and only one tuple ID is found in the matched region, the thread treats the search byte string as detected and inverts the bit at its start position to form the search result. The search result is recorded in a byte-aligned area for each thread. FIG. 4 shows an example of search results recorded in the predetermined storage area.
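
The per-thread search can be simulated sequentially in Python. In this sketch each loop iteration stands in for one thread that owns one fragment; the fragment representation `(tuple_id, valid_bytes, payload)`, the helper `to_fragments`, and the use of a Python list of 0/1 values instead of a packed bitmap are assumptions made to keep the example short. The sample tuples are the ones named in the description of FIG. 6.

```python
def to_fragments(tuples, frag_len):
    """Hypothetical helper: split each tuple into (tuple_id, valid_bytes, payload) fragments."""
    frags = []
    for tid, value in enumerate(tuples):
        for off in range(0, max(len(value), 1), frag_len):
            chunk = value[off:off + frag_len]
            frags.append((tid, len(chunk), chunk.ljust(frag_len, b"\x00")))
    return frags


def search_fragments(frags, frag_len, pattern):
    """Return one hit bit per fragment byte; a set bit marks the first byte of a match."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    bits = [0] * (len(frags) * frag_len)
    for n, (tid, valid, payload) in enumerate(frags):  # one iteration per simulated thread
        window = payload[:valid]
        k = n + 1
        # Read following fragments of the same tuple until a match that starts
        # on the last byte of fragment n could be completed.
        while (len(window) < valid + len(pattern) - 1
               and k < len(frags) and frags[k][0] == tid):
            window += frags[k][2][:frags[k][1]]
            k += 1
        for start in range(valid):  # a match must start inside the thread's own fragment
            if window[start:start + len(pattern)] == pattern:
                bits[n * frag_len + start] = 1  # invert the bit at the match start
    return bits


frags = to_fragments([b"HELLO!", b"parallel", b"GPU", b"GPGPU"], 4)
hits = search_fragments(frags, 4, b"GPU")
hit_positions = [i for i, bit in enumerate(hits) if bit]
```

Each simulated thread writes only the bits that belong to its own fragment, so no two threads write the same result position, and a match is never reported across a tuple boundary because the window stops at the first fragment with a different tuple ID.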

Next, to recompute the search results, the data processing result storage and reprocessing unit 25 allocates a storage area for the result set of the recomputation (step S4). The recomputed result needs only one bit of capacity per tuple. The required size is therefore the number of tuples multiplied by the number of bits per tuple, plus alignment padding.

The data processing result storage and reprocessing unit 25 determines the fragments to be assigned to each thread (step S5). Setting the number of tuples assigned to one thread to a multiple of 8, the number of bits in a byte, leaves no gaps in the tuple-level result set and gives every thread the same length. The fragments to be assigned may also be determined based on the data processing unit of the parallel computing unit. When the data processing unit of the arithmetic unit is 32 bits, the length of the data given to each thread may be made a multiple of 32 bits.

Parallel processing is then executed based on the determination made in step S5 (step S6). Specifically, each thread performs the following processing. The thread is notified of the tuple IDs it is responsible for, the position from which it reads the search result set, and its position in the fragment set.

Each thread performs the following processing for the tuple IDs it is responsible for. It reads a fragment from the notified position and computes its tuple ID. It then determines whether that tuple ID is one it is responsible for. If so, it reads the corresponding search result and, if a flag is set there, sets the tuple-level flag to indicate that the search byte string exists in that tuple. The result is recorded in a byte-aligned area for each thread. FIG. 5 shows an example of reprocessing results recorded in the predetermined storage area. The result indicates, for each tuple, whether the search byte string is present.
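
The reduction from byte-level hit bits to one flag per tuple can be sketched as follows. This Python simulation assumes that the tuple ID of each fragment is already known as a list, that flags are packed most-significant-bit first, and that one simulated thread handles `tuples_per_thread` tuples; none of these details is fixed by the text. The example input marks hits at byte positions 16 and 22 of seven 4-byte fragments.

```python
def tuple_level_flags(hit_bits, fragment_tuple_ids, frag_len, num_tuples,
                      tuples_per_thread=8):
    """Collapse per-byte hit bits into one flag bit per tuple, packed into bytes."""
    if tuples_per_thread % 8 != 0:
        raise ValueError("tuples_per_thread must be a multiple of 8")
    out = bytearray((num_tuples + 7) // 8)  # 1 bit per tuple plus alignment padding
    for first in range(0, num_tuples, tuples_per_thread):  # one iteration per simulated thread
        last = min(first + tuples_per_thread, num_tuples)
        for n, tid in enumerate(fragment_tuple_ids):
            in_charge = first <= tid < last
            if in_charge and any(hit_bits[n * frag_len:(n + 1) * frag_len]):
                out[tid // 8] |= 0x80 >> (tid % 8)  # MSB-first packing (an assumption)
    return bytes(out)


fragment_tuple_ids = [0, 0, 1, 1, 2, 3, 3]  # tuple ID of each 4-byte fragment
hit_bits = [0] * 28
hit_bits[16] = 1  # match starting in the fragment of tuple 2
hit_bits[22] = 1  # match starting in the first fragment of tuple 3
flags = tuple_level_flags(hit_bits, fragment_tuple_ids, 4, 4)
```

Because each simulated thread owns a multiple of 8 tuples, it writes whole bytes of the output and never shares an output byte with another thread.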

FIG. 6 illustrates an outline of the operation of the database system described above. In the example of FIG. 6, tuple data such as "HELLO!", "parallel", "GPU", and "GPGPU" is stored in fragments in the database. For a search for the byte string "GPU", the parallel processing unit 24 allocates an area for storing the search results and determines the fragments to be assigned to each thread executed on the parallel computing unit, based on the metadata and the data processing unit of the parallel computing unit. Fragments are assigned to the threads according to this determination. Each thread searches its assigned fragments for the byte string "GPU" and records the result. In FIG. 6, in the area where the results are recorded, the bit at the head position of the area corresponding to each byte string in which "GPU" was detected is inverted. Next, the data processing result storage and reprocessing unit 25 allocates an area for storing the recomputed results and determines the tuples to be assigned to each thread. The number of tuples assigned to each thread is set to a multiple of 8. Each thread reads the search results corresponding to its assigned tuples and, if a flag is set in a result it has read, sets a flag as the result for that tuple. In FIG. 6, the flag "1" is set as the result for each of the search results "100" and "00100", in which a flag is set.

Next, the process by which the parallel processing unit 24 determines the fragments to be assigned to each thread (step S2 above) is described in detail with reference to FIG. 7. The parallel processing unit 24 identifies a fragment to be assigned from the data to be processed and obtains its fragment length (step S11). It then determines whether the fragment length of the fragments to be assigned equals a predetermined number that is a multiple of 8 (for example, the least common multiple of 8 and the fragment length) (step S12). If it does not (step S12: NO), the parallel processing unit 24 adds another fragment from the data to be processed to the fragments to be assigned, recomputes their combined fragment length (step S13), and returns to step S12. If it does (step S12: YES), the parallel processing unit 24 assigns the fragments to one of the threads (step S14). It then determines whether all fragments in the data to be processed have been assigned (step S15). If so (step S15: YES), the process ends. If not (step S15: NO), the process returns to step S11 and continues with the remaining fragments. This process is only an example, and other assignment processes may be used.
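
The FIG. 7 loop can be sketched in Python under the assumption that all fragments share one fragment length and that the predetermined number is the least common multiple of 8 and that length. The function name `assign_fragments` and the handling of leftover fragments at the end of the data are illustrative choices, not taken from the source.

```python
from math import gcd


def assign_fragments(num_fragments, frag_len):
    """Group consecutive fragments so each group spans lcm(8, frag_len) bytes."""
    target = 8 * frag_len // gcd(8, frag_len)  # predetermined number checked in step S12
    assignments = []
    current, total = [], 0
    for index in range(num_fragments):
        current.append(index)            # steps S11 / S13: take the next fragment
        total += frag_len
        if total == target:              # step S12: group length reached the target
            assignments.append(current)  # step S14: hand the group to one thread
            current, total = [], 0
    if current:
        # Fewer fragments remain than a full group needs; give them to a
        # final thread (an assumption, the source does not cover this case).
        assignments.append(current)
    return assignments


groups = assign_fragments(5, 4)
```

Here `groups` lists, for each thread in turn, the indices of the fragments it receives; with 4-byte fragments every full group contains two fragments.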

Next, the process by which the variable-length data storage and processing unit 23 stores data in the database 10 is described with reference to FIG. 8. The variable-length data storage and processing unit 23 reads the data to be stored (step S21). The data read is per-column variable-length tuple data, which may include fixed-length data. The unit stores the tuple data it has read in fragments in the database 10 (step S22). A fragment has a fixed length, which the variable-length data storage length determination unit 22 has determined based on, for example, the data processing unit of the parallel computing unit used in the system. The variable-length data storage and processing unit 23 stores the metadata corresponding to the tuple data held in each fragment in the fragment header in the database 10 (step S23). The metadata includes an ID that identifies the tuple data and the offset of the end of the fragment from the head position of the tuple. The unit then determines whether all the data to be stored has been stored in the database 10 (step S24). If so (step S24: YES), the process ends. If not (step S24: NO), the process returns to step S21 and continues with the remaining data. This process is only an example, and other storage processes may be used. The fragment length may also be a preset value (for example, 4, 8, or 16 bytes) based on the data processing unit of a typical parallel computing unit.

As described above, this embodiment achieves efficient processing of variable-length data on a GPU by balancing space efficiency against processing efficiency.

The parallel computing unit environment detection unit 21, the variable-length data storage length determination unit 22, the variable-length data storage and processing unit 23, the parallel processing unit 24, and the data processing result storage and reprocessing unit 25 according to the embodiment described above may be realized by the CPU or coprocessor of the database processing apparatus 20 reading and executing an operation program or the like stored in a storage unit, or they may be implemented in hardware. It is also possible to realize only some of the functions of the embodiment described above by a computer program.

Although the present invention has been described above with reference to a preferred embodiment, the present invention is not necessarily limited to that embodiment and can be modified and practiced in various ways within the scope of its technical idea. This application claims priority based on Japanese Patent Application No. 2011-065171, filed on March 24, 2011, the entire disclosure of which is incorporated herein.

DESCRIPTION OF SYMBOLS
10 Database
20 Database processing apparatus
21 Parallel computing unit environment detection unit
22 Variable-length data storage length determination unit
23 Variable-length data storage/processing unit
24 Parallel arithmetic processing unit
25 Data processing result storage/reprocessing unit

Claims

A database processing apparatus having a parallel computing unit, comprising:
data storage means for determining a fragment length according to a data processing unit of the parallel computing unit, storing tuple data including variable-length data in fragments in a column store database, and storing metadata of each fragment as a fragment header; and
parallel operation means for, when processing the data stored in the column store database, determining the fragment to be assigned to each thread with reference to the metadata, assigning a fragment to each thread based on the determined content, and executing a parallel operation,
wherein the parallel operation means includes means for searching, in each thread, for a character string in the assigned fragment, recording as a search result a result obtained by bit-inverting the byte position of the fragment, assigning to each thread a number of search results according to the data processing unit of the parallel computing unit, detecting, in each thread, bit inversion in each assigned search result, and recording a detection result flag as a tuple-level search result.

A data processing method in a database processing apparatus having a parallel computing unit, comprising:
a data storage step of determining a fragment length according to a data processing unit of the parallel computing unit, storing tuple data including variable-length data in fragments in a column store database, and storing metadata of each fragment as a fragment header; and
a parallel operation step of, when processing the data stored in the column store database, determining the fragment to be assigned to each thread with reference to the metadata, assigning a fragment to each thread based on the determined content, and executing a parallel operation,
wherein the parallel operation step includes searching, in each thread, for a character string in the assigned fragment, recording as a search result a result obtained by bit-inverting the byte position of the fragment, assigning to each thread a number of search results according to the data processing unit of the parallel computing unit, detecting, in each thread, bit inversion in the assigned search results, and recording a detection result flag as a tuple-level search result.

A program causing a computer having a parallel computing unit to execute:
a data storage step of determining a fragment length according to a data processing unit of the parallel computing unit, storing tuple data including variable-length data in fragments in a column store database, and storing metadata of each fragment as a fragment header; and
a parallel operation step of, when processing the data stored in the column store database, determining the fragment to be assigned to each thread with reference to the metadata, assigning a fragment to each thread based on the determined content, and executing a parallel operation,
wherein the parallel operation step includes searching, in each thread, for a character string in the assigned fragment, recording as a search result a result obtained by bit-inverting the byte position of the fragment, assigning to each thread a number of search results according to the data processing unit of the parallel computing unit, detecting, in each thread, bit inversion in the assigned search results, and recording a detection result flag as a tuple-level search result.