JP5836422B2

JP5836422B2 - Method and system for data matching

Info

Publication number: JP5836422B2
Application number: JP2014083183A
Authority: JP
Inventors: 飯塚　真玄; 拓橋本; 伸介岩元
Original assignee: 株式会社Ｔｋｃ
Priority date: 2014-04-14
Filing date: 2014-04-14
Publication date: 2015-12-24
Anticipated expiration: 2034-04-14
Also published as: JP2015203985A

Description

本発明は、一般に、データ一致化のための技術に関し、さらに詳しくは、ソース・データベースと複製データベースとの間の高速なデータ一致化のための方法およびシステムに関する。 The present invention relates generally to techniques for data matching, and more particularly to a method and system for fast data matching between a source database and a duplicate database.

ソース・データベースと複製データベースとを備えてなるデータベース・システムでは、ソース・データベースと複製データベースとの間でのデータ一致化が重要である。 In a database system including a source database and a duplicate database, data matching between the source database and the duplicate database is important.

図４は、従来のデータ一致化方式を示す概念図である。 FIG. 4 is a conceptual diagram showing a conventional data matching method.

すなわち、図４（ａ）に示す方式によれば、ソース・データベースから、１件目のデータ・レコードが取得され、取得された１件目のデータ・レコードが、複製データベースに書き込まれる。このようにして複製データベースへの１件目のデータ・レコードの書き込みが終了すると、ソース・データベースから２件目のデータ・レコードが取得され、同様に、複製データベースに書き込まれる。 That is, according to the method shown in FIG. 4A, the first data record is acquired from the source database, and the acquired first data record is written in the duplicate database. When the writing of the first data record to the duplicate database is completed in this way, the second data record is acquired from the source database and similarly written to the duplicate database.

このような取出／書込処理が、最後のデータ・レコードの書き込みが完了するまで、データ・レコード１件ずつシーケンシャルに行われることによって、ソース・データベースと、複製データベースとの間でのデータ一致化が達成される。 This retrieval/write processing is performed sequentially, one data record at a time, until the last data record has been written, thereby achieving data matching between the source database and the duplicate database.

しかしながら、このようなシーケンシャル方式では、データ・レコードを取得するためのソース・データベースへのアクセスと、データ・レコードを書き込むための複製データベースへのアクセスとが、データ・レコード数に等しい回数、繰り返しなされるので、最後のデータ・レコードが複製データベースに書き込まれるまでに、非常に時間がかかってしまうという問題があった。 However, in such a sequential method, access to the source database to obtain data records and access to the duplicate database to write data records are repeated a number of times equal to the number of data records. Therefore, there is a problem that it takes a very long time before the last data record is written in the duplicate database.

このような問題を解決する１つの方法として、図４（ｂ）に示すようなバルク・コピー方式がある。この方式では、ソース・データベースから大量（例えば、１０万件ずつ）のデータ・レコードが一括して取得され、しかる後に、まとめて複製データベースに書き込まれる。この方式によれば、データ・レコードを取得するためのソース・データベースへのアクセスと、データ・レコードを書き込むための複製データベースへのアクセスとを、レコード数に等しい回数、繰り返し行う必要はないので、最後のデータ・レコードが複製データベースに書き込まれるまでに要する時間は、図４（ａ）に示す方式よりも大幅に短縮される。 One way to solve this problem is the bulk copy method shown in FIG. 4B. In this method, a large number of data records (for example, 100,000 at a time) are acquired from the source database in a single batch and then written to the duplicate database together. With this method, the access to the source database for acquiring data records and the access to the duplicate database for writing them need not be repeated as many times as there are records, so the time required until the last data record is written to the duplicate database is greatly reduced compared with the method shown in FIG. 4A.

一例として、１０万件のデータ・レコードを、ソース・データベースから複製データベースにコピーする場合、図４（ａ）に示すシーケンシャル・コピー方式では、取得と書込とを１０万回ループすることにより、約１０時間を要するのに対し、図４（ｂ）に示すバルク・コピー方式では、約３４分しか要しない。 As an example, when 100,000 data records are copied from a source database to a duplicate database, in the sequential copy method shown in FIG. 4A, acquisition and writing are looped 100,000 times, While it takes about 10 hours, the bulk copy method shown in FIG. 4B requires only about 34 minutes.
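
The difference between the two conventional methods comes down to how many times each database is accessed. The following is a minimal sketch of that difference, assuming in-memory SQLite databases as stand-ins for the source and duplicate databases and a hypothetical `records` table; none of these names come from the patent, and the sketch does not attempt to reproduce the timings quoted above.

```python
import sqlite3

def make_db(rows=()):
    """Create an in-memory database with a hypothetical `records` table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
    db.executemany("INSERT INTO records VALUES (?, ?)", rows)
    db.commit()
    return db

def sequential_copy(src, dst):
    """FIG. 4A style: one read and one write per record.

    Returns the number of per-record database accesses
    (the initial listing of ids is not counted).
    """
    accesses = 0
    ids = [row[0] for row in src.execute("SELECT id FROM records ORDER BY id")]
    for rec_id in ids:
        row = src.execute(
            "SELECT id, payload FROM records WHERE id = ?", (rec_id,)
        ).fetchone()
        dst.execute("INSERT INTO records VALUES (?, ?)", row)
        dst.commit()
        accesses += 2
    return accesses

def bulk_copy(src, dst):
    """FIG. 4B style: read everything into memory, then write it in one batch."""
    rows = src.execute("SELECT id, payload FROM records ORDER BY id").fetchall()
    dst.executemany("INSERT INTO records VALUES (?, ?)", rows)
    dst.commit()
    return 2  # one read access and one write access
```

With 100 records, `sequential_copy` performs 200 per-record accesses while `bulk_copy` performs 2, at the price of holding all 100 records in memory at once.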

なお、バルク・コピーに関しては、非特許文献１に示すように、ＭｉｃｒｏｓｏｆｔＳＱＬＳｅｒｖｅｒのテーブルに、サイズの大きなデータを高速で一括書き込みするＳｑｌＢｕｌｋＣｏｐｙという一般的な部品が用意されている。 Regarding bulk copy, as shown in Non-Patent Document 1, a general-purpose component called SqlBulkCopy is available that writes large amounts of data to a Microsoft SQL Server table in a single high-speed batch.

また、非特許文献２には、ＯｒａｃｌｅＤａｔａｂａｓｅのテーブルに、サイズの大きなデータを高速で一括書き込みするＯｒａｃｌｅＢｕｌｋＣｏｐｙという一般的な部品が用意されている。 Similarly, Non-Patent Document 2 describes a general-purpose component called OracleBulkCopy that writes large amounts of data to an Oracle Database table in a single high-speed batch.

また、非特許文献３には、ＩＢＭＤＢ２Ｄａｔａｂａｓｅのテーブルに、サイズの大きなデータを高速で一括書き込みするＤＢ２ＢｕｌｋＣｏｐｙという一般的な部品が用意されている。 Non-Patent Document 3 describes a general-purpose component called DB2BulkCopy that writes large amounts of data to an IBM DB2 Database table in a single high-speed batch.

Non-Patent Document 1: Microsoft Developer Network home page, http://msdn.microsoft.com/ja-jp/library/7ek5da1a(v=vs.110).aspx
Non-Patent Document 2: Oracle Data Provider for .NET Developer's Guide home page, http://docs.oracle.com/cd/E16635_01/win.111/e06104/OracleBulkCopyClass.htm
Non-Patent Document 3: IBM DB2 Database for Linux, UNIX, and Windows Information Center home page, http://pic.dhe.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=%2Fcom.ibm.swg.im.dbclient.adonet.ref.doc%2Fdoc%2FDB2BulkCopyClass.html

しかしながら、このような従来のバルク・コピー方式では、以下のような問題がある。 However, such a conventional bulk copy method has the following problems.

すなわち、バルク・コピー方式では、前述したように、ソース・データベースから一度に大量（例えば、１０万件ずつ）のデータ・レコードが取得される。図５は、このようにして、ソース・データベース６０から、大量のデータ・レコードが、サーバ６２によって取得され、複製データベース７０へ書き込まれる処理を説明するための概念図である。 That is, in the bulk copy method, as described above, a large amount (for example, 100,000 records) of data records are acquired from the source database at a time. FIG. 5 is a conceptual diagram for explaining a process in which a large amount of data records are acquired by the server 62 from the source database 60 and written to the duplicate database 70 in this way.

ソース・データベース６０および複製データベース７０に接続されたサーバ６２は、データ取得部６４と、メモリ６６と、データ書込部６８とを備えている。そして、ソース・データベース６０に格納されているデータ・レコードは、データ取得部６４によって取得され、データ取得部６４からメモリ６６に渡され、メモリ６６において保持される。データ取得部６４によって取得されたデータ・レコードがすべてメモリ６６に格納されると、メモリ６６に格納されたデータ・レコードがデータ書込部６８によって取り出され、複製データベース７０へと書き込まれる。 The server 62 connected to the source database 60 and the duplicate database 70 includes a data acquisition unit 64, a memory 66, and a data writing unit 68. The data record stored in the source database 60 is acquired by the data acquisition unit 64, transferred from the data acquisition unit 64 to the memory 66, and held in the memory 66. When all the data records acquired by the data acquisition unit 64 are stored in the memory 66, the data records stored in the memory 66 are retrieved by the data writing unit 68 and written to the duplicate database 70.

すなわち、ソース・データベース６０からどれだけの量のデータ・レコードを一度に取得することができるかは、メモリ６６の容量に依存する。 That is, how much data records can be obtained from the source database 60 at a time depends on the capacity of the memory 66.

メモリ６６が十分な容量を有していれば、ソース・データベース６０から、すべてのデータ・レコードを一度で取得できるだろう。しかしながら、十分な容量のメモリ６６を準備することは、コスト・アップをもたらすという問題がある。また、現時点ではメモリ６６の容量が十分であったとしても、将来的に、ソース・データベース６０に格納されているデータ・サイズが増えれば、メモリの容量を増設する必要が生じうるという問題が常に付きまとう。 If the memory 66 has sufficient capacity, all data records can be acquired from the source database 60 at once. However, providing a memory 66 of sufficient capacity increases cost. Moreover, even if the capacity of the memory 66 is sufficient at present, there is always the concern that the memory capacity may have to be expanded if the amount of data stored in the source database 60 grows in the future.

一方、メモリ６６が十分な容量を有していなければ、ソース・データベース６０に格納されているすべてのデータ・レコードを一度に保持しきることはできないので、ソース・データベース６０からのデータ・レコードの取得と、取得したデータ・レコードの複製データベース７０への書込とからなるプロセスを、何度か繰り返す必要がある。その場合、複製データベース７０へのすべてのデータ・レコードの書込が完了するまでの時間が長くなるという問題がある。 On the other hand, if the memory 66 does not have sufficient capacity, it cannot hold all the data records stored in the source database 60 at once, so the process of acquiring data records from the source database 60 and writing the acquired data records to the duplicate database 70 must be repeated several times. In that case, it takes longer to finish writing all the data records to the duplicate database 70.
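
The repeated acquire-then-write process described for a memory of limited capacity can be sketched as follows. This is only an illustration under assumptions: SQLite connections stand in for the two databases, and the `records` table and the `chunk_size` parameter (playing the role of the capacity of the memory 66) are hypothetical names.

```python
import sqlite3

SCHEMA = "CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)"

def chunked_bulk_copy(src, dst, chunk_size):
    """Copy in rounds: each round reads up to `chunk_size` records into
    memory and only then writes them, so reading and writing never overlap.

    Returns the number of acquire/write rounds that were needed.
    """
    cursor = src.execute("SELECT id, payload FROM records ORDER BY id")
    rounds = 0
    while True:
        chunk = cursor.fetchmany(chunk_size)  # at most chunk_size records held
        if not chunk:
            break
        dst.executemany("INSERT INTO records VALUES (?, ?)", chunk)
        dst.commit()
        rounds += 1
    return rounds
```

A smaller `chunk_size` lowers the memory requirement but increases the number of rounds, which is the trade-off the paragraph above describes.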

本発明はこのような事情に鑑みてなされたものであり、ソース・データベースに格納されたデータ・レコードを複製データベースにコピーする場合、ソース・データベースに格納されたデータ・レコードの数が増えても、メモリ容量を増設する必要なく、複製データベースへのコピーを高速で実施することが可能な、データ一致化のための方法およびシステムを提供することを目的とする。 The present invention has been made in view of these circumstances. Its object is to provide a method and system for data matching that, when data records stored in a source database are copied to a duplicate database, can perform the copy at high speed without requiring additional memory capacity even if the number of data records stored in the source database increases.

上記の目的を達成するために、本発明では、以下のような手段を講じる。 In order to achieve the above object, the present invention takes the following measures.

すなわち、請求項１の発明は、ソース・データベースと複製データベースとの間のデータ一致化のためのシステムであって、ソース・データベースに格納された複数のデータ・レコードを順次取得しながら順次送出する取得手段と、取得手段によって順次送出されたデータ・レコードを順次受け取ってメモリに順次格納する格納手段と、メモリにデータ・レコードが格納されると、格納されたデータ・レコードを、後続するデータ・レコードがメモリに格納されている途中であっても、複製データベースに書き込み、書き込み終了後、このデータ・レコードを、メモリから削除する書込手段とを備えている。そして、書込手段が、メモリに格納されたデータ・レコードの複製データベースへの書き込みと、書き込み終了後のメモリからの削除とを、メモリに格納されたデータ・レコードが存在しなくなるまで繰り返すことによって、ソース・データベースに格納されたデータ・レコードと、複製データベースに格納されたデータ・レコードとの一致化を図るようにしている。
ここで、取得手段と格納手段とは通信ネットワークを介して接続され、取得手段が、Ｗｅｂサーバである。
また、ソース・データベースと書込手段とがダイレクトに接続しないように、ソース・データベースと書込手段との間に、Ｗｅｂサーバを設けるようにしている。 That is, the invention of claim 1 is a system for data matching between a source database and a duplicate database, comprising: acquisition means for sequentially sending out a plurality of data records stored in the source database while sequentially acquiring them; storage means for sequentially receiving the data records sent out by the acquisition means and sequentially storing them in a memory; and writing means for, when a data record is stored in the memory, writing the stored data record to the duplicate database even while subsequent data records are still being stored in the memory, and deleting the data record from the memory after the writing is completed. The writing means repeats the writing of data records stored in the memory to the duplicate database and their deletion from the memory after writing until no data record remains in the memory, thereby matching the data records stored in the source database with the data records stored in the duplicate database.
Here, the acquisition means and the storage means are connected via a communication network, and the acquisition means is a Web server.
Further, a Web server is provided between the source database and the writing means so that the source database and the writing means are not directly connected.

請求項２の発明は、取得手段および格納手段と、書込手段とは、独立して動作する、請求項１の発明のデータ一致化システムである。 The invention of claim 2 is the data matching system of claim 1, wherein the acquisition means and the storage means operate independently of the writing means.

請求項３の発明は、ソース・データベースと複製データベースとの間のデータ一致化のための方法であって、ソース・データベースに格納された複数のデータ・レコードをＷｅｂサーバが順次取得しながら順次送出し、順次送出されたデータ・レコードを順次受け取ってメモリに順次格納し、メモリにデータ・レコードが格納されると、格納されたデータ・レコードを、後続するデータ・レコードがメモリに格納されている途中であっても、複製データベースに書き込み、書き込み終了後、このデータ・レコードを、メモリから削除する。そして、メモリに格納されたデータ・レコードの複製データベースへの書き込みと、書き込み終了後のメモリからの削除とを、メモリに格納されたデータ・レコードが存在しなくなるまで繰り返すことによって、ソース・データベースに格納されたデータ・レコードと、複製データベースに格納されたデータ・レコードとの一致化を図る。
また、ソース・データベースとメモリとがダイレクトに接続しないように、ソース・データベースとメモリとの間に、Ｗｅｂサーバを設けるようにしている。 The invention of claim 3 is a method for data matching between a source database and a duplicate database, in which a Web server sequentially sends out a plurality of data records stored in the source database while sequentially acquiring them; the data records sent out are sequentially received and sequentially stored in a memory; and, when a data record is stored in the memory, the stored data record is written to the duplicate database even while subsequent data records are still being stored in the memory, and is deleted from the memory after the writing is completed. The writing of data records stored in the memory to the duplicate database and their deletion from the memory after writing are repeated until no data record remains in the memory, thereby matching the data records stored in the source database with the data records stored in the duplicate database.
Also, a Web server is provided between the source database and the memory so that the source database and the memory are not directly connected.

本発明に係るデータ一致化のための方法およびシステムによれば、ソース・データベースに格納されたデータ・レコードを複製データベースにコピーする場合、ソース・データベースに格納されたデータ・レコードの数が増えても、メモリ容量を増やす必要なく、高速で、複製データベースへのコピーを完了することが可能となる。 According to the method and system for data matching according to the present invention, when data records stored in a source database are copied to a duplicate database, the number of data records stored in the source database increases. However, it is possible to complete copying to the duplicate database at high speed without having to increase the memory capacity.

本発明の実施形態に係るデータ一致化方法が適用されたシステムの構成例を示す概念図である。 FIG. 1 is a conceptual diagram showing a configuration example of a system to which the data matching method according to the embodiment of the present invention is applied.
本発明の実施形態に係るシステムの動作を示すフローチャートである。 FIG. 2 is a flowchart showing the operation of the system according to the embodiment of the present invention.
本発明の実施形態に係るシステムによるソース・データベースからのデータ・レコードの取得と、複製データベースへのデータ・レコードの書込とのタイミングを示す図である。 FIG. 3 is a diagram showing the timing of the acquisition of data records from the source database and the writing of data records to the duplicate database by the system according to the embodiment of the present invention.
従来のデータ一致化方式を示す概念図である。 FIG. 4 is a conceptual diagram showing a conventional data matching method.
従来のバルク・コピー方式を説明するための概念図である。 FIG. 5 is a conceptual diagram for explaining the conventional bulk copy method.

以下に、本発明を実施するための最良の形態について図面を参照しながら説明する。 The best mode for carrying out the present invention will be described below with reference to the drawings.

図１は、本発明の実施形態に係るデータ一致化方法が適用されたシステムの構成例を示す概念図である。 FIG. 1 is a conceptual diagram showing a configuration example of a system to which a data matching method according to an embodiment of the present invention is applied.

このシステム１０は、ソース・データベース１２と複製データベース１４との間のデータ一致化のためのシステムであって、ソース・データベース１２と複製データベース１４の他に、例えばＷｅｂサーバであるデータ取得部１６と、データ書込部１８とを備えている。そして、データ取得部１６とデータ書込部１８とは、例えばインターネットのような通信ネットワーク１９によって接続されている。 This system 10 is a system for data matching between a source database 12 and a duplicate database 14, and in addition to the source database 12 and the duplicate database 14, for example, a data acquisition unit 16 that is a Web server, The data writing unit 18 is provided. The data acquisition unit 16 and the data writing unit 18 are connected by a communication network 19 such as the Internet.

このような通信ネットワーク１９は、イーサネット（登録商標）等のＬＡＮ、あるいは公衆回線や専用回線を介して複数のＬＡＮが接続されるＷＡＮ等からなる。ＬＡＮの場合には、必要に応じてルータを介した多数のサブネットから構成される。また、ＷＡＮの場合には、公衆回線に接続するためのファイアウォール等を適宜備えているが、ここではその図示及び詳細説明を省略する。 Such a communication network 19 includes a LAN such as Ethernet (registered trademark) or a WAN to which a plurality of LANs are connected through a public line or a dedicated line. In the case of a LAN, it is composed of a number of subnets via routers as necessary. In the case of a WAN, a firewall or the like for connecting to a public line is provided as appropriate, but illustration and detailed description thereof are omitted here.

データ取得部１６は、ソース・データベース１２に接続されており、ソース・データベース１２に格納された複数のデータ・レコードを順次取得しながら、通信ネットワーク１９を介してデータ書込部１８へ順次送出する。すなわち、データ取得部１６は、後続するデータ・レコードを取得している途中であっても、既に取得したデータ・レコードを、データ書込部１８へ順次送出する。 The data acquisition unit 16 is connected to the source database 12 and, while sequentially acquiring the plurality of data records stored in the source database 12, sequentially sends them to the data writing unit 18 via the communication network 19. In other words, the data acquisition unit 16 sends the data records it has already acquired to the data writing unit 18 even while it is still acquiring subsequent data records.

データ書込部１８は、受信部２０と、メモリ２２と、書込部２４と備えている。そして、データ取得部１６によって順次送出されたデータ・レコードを受信部２０が順次受信して、メモリ２２に順次格納する。書込部２４は、メモリ２２にデータ・レコード（例えば、データ・レコード＃１）が格納されると、後続するデータ・レコード（例えば、データ・レコード＃２，＃３，＃４・・・・）がメモリ２２に格納されている途中であっても、格納されたデータ・レコード（データ・レコード＃１）を複製データベース１４へ書き込む。そして、複製データベース１４にデータ・レコード（データ・レコード＃１）が書き込まれると、このデータ・レコード（データ・レコード＃１）を、メモリ２２から削除する。 The data writing unit 18 includes a receiving unit 20, a memory 22, and a writing unit 24. The receiving unit 20 sequentially receives the data records sent out by the data acquisition unit 16 and sequentially stores them in the memory 22. When a data record (for example, data record #1) is stored in the memory 22, the writing unit 24 writes the stored data record (data record #1) to the duplicate database 14 even while subsequent data records (for example, data records #2, #3, #4, ...) are still being stored in the memory 22. Once the data record (data record #1) has been written to the duplicate database 14, the writing unit 24 deletes it from the memory 22.

書込部２４は、その後、次のデータ・レコード（例えば、データ・レコード＃２）に対しても同様なことを行う。書込部２４は、このような動作を、メモリ２２に格納された最後のデータ・レコードまで連続的に繰り返すことによって、メモリ２２に格納されたすべてのデータ・レコードを複製データベース１４に書き込む。 Thereafter, the writing unit 24 performs the same operation for the next data record (for example, data record # 2). The writing unit 24 writes all the data records stored in the memory 22 in the duplicate database 14 by continuously repeating such an operation until the last data record stored in the memory 22.

次に、以上のように構成した本発明の実施形態に係るシステムの動作を、図２のフローチャートを用いて説明する。 Next, the operation of the system according to the embodiment of the present invention configured as described above will be described with reference to the flowchart of FIG.

本実施形態に係るシステム１０では、ソース・データベース１２に格納された複数のデータ・レコードが、例えばＷｅｂサーバのようなデータ取得部１６によって順次取得され、データ書込部１８へ順次送出される（ステップＳ１）。 In the system 10 according to the present embodiment, a plurality of data records stored in the source database 12 are sequentially acquired by the data acquisition unit 16, for example a Web server, and sequentially sent to the data writing unit 18 (step S1).

ステップＳ１で順次送出されたデータ・レコードが、データ書込部１８の受信部２０によって順次取得され、メモリ２２に順次格納される（ステップＳ２）。 The data records sequentially transmitted in step S1 are sequentially acquired by the receiving unit 20 of the data writing unit 18 and sequentially stored in the memory 22 (step S2).

メモリ２２へデータ・レコード（例えば、データ・レコード＃１）が格納されると、後続するデータ・レコード（例えば、データ・レコード＃２，＃３，＃４・・・・）がメモリ２２に格納されている途中であっても、格納されたデータ・レコード（データ・レコード＃１）は、書込部２４によって複製データベース１４へ書き込まれる（ステップＳ３）。 When a data record (for example, data record #1) is stored in the memory 22, the stored data record (data record #1) is written to the duplicate database 14 by the writing unit 24, even while subsequent data records (for example, data records #2, #3, #4, ...) are still being stored in the memory 22 (step S3).

このようにして、複製データベース１４にデータ・レコード（データ・レコード＃１）が書き込まれると、このデータ・レコード（データ・レコード＃１）は、書込部２４によって、メモリ２２から削除される（ステップＳ４）。 Once the data record (data record #1) has been written to the duplicate database 14 in this way, the writing unit 24 deletes the data record (data record #1) from the memory 22 (step S4).

その後、次のデータ・レコード（例えば、データ・レコード＃２）に対しても、書込部２４によって、ステップＳ３およびステップＳ４と同様な処理がなされる。このような動作が、メモリ２２に格納された最後のデータ・レコードまで連続的に繰り返される（ステップＳ５）ことによって、メモリ２２に格納されたすべてのデータ・レコードが複製データベース１４に書き込まれる。 Thereafter, the writing unit 24 performs the same processing as in step S3 and step S4 for the next data record (for example, data record # 2). Such an operation is continuously repeated until the last data record stored in the memory 22 (step S5), whereby all the data records stored in the memory 22 are written in the duplicate database 14.
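
Steps S1 to S5 resemble a producer/consumer pipeline, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a bounded `queue.Queue` stands in for the memory 22, a Python list stands in for the duplicate database 14, and the names `acquire`, `write`, `match`, and `buffer_size` are hypothetical. The bounded buffer is an added assumption; the description does not say how the memory is kept small when acquisition outpaces writing. Note also that here a record leaves the queue when the writer takes it, slightly earlier than step S4, which deletes the record after the write has finished.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the record stream

def acquire(source_records, buffer):
    """Steps S1-S2: obtain records one by one and store each in the buffer."""
    for record in source_records:
        buffer.put(record)   # blocks while the buffer is full
    buffer.put(_DONE)

def write(buffer, replica):
    """Steps S3-S5: write each buffered record until none remain."""
    while True:
        record = buffer.get()    # takes the record out of the buffer
        if record is _DONE:
            break
        replica.append(record)   # stand-in for the write to the duplicate database

def match(source_records, buffer_size=8):
    """Run acquisition and writing concurrently and return the replica."""
    buffer = queue.Queue(maxsize=buffer_size)
    replica = []
    writer = threading.Thread(target=write, args=(buffer, replica))
    writer.start()                    # writing can begin with the first record
    acquire(source_records, buffer)
    writer.join()
    return replica
```

Because the writer starts consuming as soon as the first record arrives, at most `buffer_size` records are waiting in the buffer at any time, regardless of how many records the source holds.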

図３は、本実施形態に係るシステム１０によるソース・データベース１２からのデータ・レコードの取得と、複製データベース１４へのデータ・レコードの書込とのタイミングを、従来技術と比較して示す図である。図３（ａ）は、図４（ａ）でも示したシーケンシャル・コピー方式であり、図３（ｂ）は、図４（ｂ）でも示したバルク・コピー方式である。 FIG. 3 is a diagram showing, in comparison with the prior art, the timing of the acquisition of data records from the source database 12 and the writing of data records to the duplicate database 14 by the system 10 according to the present embodiment. FIG. 3A shows the sequential copy method also shown in FIG. 4A, and FIG. 3B shows the bulk copy method also shown in FIG. 4B.

ステップＳ１およびステップＳ２が、図３（ｃ）におけるデータ・レコードの取得プロセスａに相当し、ステップＳ３およびステップＳ４が、図３（ｃ）におけるデータ・レコードの書込プロセスｂに相当する。図３（ｃ）に示すように、本実施形態に係るシステム１０では、メモリ２２に１件のデータ・レコードでも格納されると、このデータ・レコードの、複製データベース１４への書込が開始されるので、データ・レコードの取得プロセスａが完了していなくても、データ・レコードの書込プロセスｂが開始される。すなわち、ステップＳ２において、メモリ２２にデータ・レコードが格納されると、ステップＳ３の動作が開始され、その後は、ステップＳ１およびステップＳ２の動作と、ステップＳ３およびステップＳ４の動作とは、独立して実行される。 Steps S1 and S2 correspond to the data record acquisition process a in FIG. 3C, and steps S3 and S4 correspond to the data record writing process b in FIG. 3C. As shown in FIG. 3C, in the system 10 according to the present embodiment, when even one data record is stored in the memory 22, writing of this data record to the duplicate database 14 is started. Therefore, even if the data record acquisition process a is not completed, the data record writing process b is started. That is, when a data record is stored in the memory 22 in step S2, the operation of step S3 is started, and thereafter, the operations of step S1 and step S2 and the operations of step S3 and step S4 are independent. Executed.

このような独立動作によって、ソース・データベース１２から最初のデータ・レコードが取得されてから、最後のデータ・レコードの複製データベース１４への書込が終了するまでに要する時間は、図３（ｂ）に示すバルク・コピー方式よりもさらに大幅に短縮される。 Owing to this independent operation, the time from the acquisition of the first data record from the source database 12 until the writing of the last data record to the duplicate database 14 is completed is considerably shorter than with the bulk copy method shown in FIG. 3B.

一例として、１０万件のデータ・レコードを、ソース・データベース１２から複製データベース１４にコピーする場合、図３（ａ）に示すシーケンシャル・コピー方式では、約１０時間を要するところ、図３（ｂ）に示すバルク・コピー方式では、約３４分しか要しないことを前述したが、図３（ｃ）に示す本実施形態に係るシステム１０では、約１８分しか要しない。すなわち、本実施形態に係るシステム１０によれば、従来のバルク・コピー方式で要していた処理時間が、さらに約半分に短縮される。しかも、ステップＳ４で説明したように、データ・レコードは、複製データベース１４に書き込まれると、順次、メモリ２２から削除されるので、メモリ２２は、さほど大きな容量を要しない。 As an example, when 100,000 data records are copied from the source database 12 to the duplicate database 14, the sequential copy method shown in FIG. 3A takes about 10 hours and the bulk copy method shown in FIG. 3B takes only about 34 minutes, as described above, whereas the system 10 according to the present embodiment shown in FIG. 3C takes only about 18 minutes. That is, the system 10 according to the present embodiment cuts the processing time of the conventional bulk copy method roughly in half again. Moreover, as described for step S4, each data record is deleted from the memory 22 once it has been written to the duplicate database 14, so the memory 22 does not need a very large capacity.
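
One way to see why overlapping acquisition and writing could roughly halve the bulk copy time is a simple timing model. The sketch below is only a back-of-the-envelope illustration: the patent does not break the 34 minutes down into acquisition time and writing time, so the even 17/17 split used here is purely an assumption, and the function names are hypothetical.

```python
def per_record_seconds(total_hours, records):
    """Average time per record for the sequential method."""
    return total_hours * 3600 / records

def bulk_minutes(acquire_min, write_min):
    """Bulk copy: writing starts only after acquisition has finished."""
    return acquire_min + write_min

def overlapped_minutes(acquire_min, write_min):
    """Idealized overlap: the slower of the two phases dominates."""
    return max(acquire_min, write_min)

# Sequential method: about 10 hours for 100,000 records.
print(per_record_seconds(10, 100_000))

# Assumed even split of the 34 minutes reported for bulk copy.
print(bulk_minutes(17, 17))
print(overlapped_minutes(17, 17))
```

Under this assumed split the idealized overlapped time is 17 minutes, which is in the same range as the roughly 18 minutes reported for the embodiment; the actual breakdown may differ.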

このような動作は、前述したＳｑｌＢｕｌｋＣｏｐｙや、ＯｒａｃｌｅＢｕｌｋＣｏｐｙや、ＤＢ２ＢｕｌｋＣｏｐｙによっても実現可能である。例えば、ＳｑｌＢｕｌｋＣｏｐｙは、図５に示すように、ソース・データベース６０と複製データベース７０とにダイレクトに接続されたサーバ６２のデータ書込部６８において使用されることを想定している。しかし、ソース・データベース６０とサーバ６２の間に、ネットワーク等の制約や、テーブル間の単純コピーでなくデータ加工のプロセスが入っているなどの理由で、ソース・データベース６０とデータ書き込み部６８がダイレクトに接続できない場合は、データ書き込み部６８で使用するＳｑｌＢｕｌｋＣｏｐｙに読み込ませるために、データ取得部６４によりメモリ６６にデータを一時的に全てのデータを蓄積する必要がある。したがって、ＳｑｌＢｕｌｋＣｏｐｙをそのまま適用した場合、前述したように、ソース・データベース６０からの大量（例えば、１０万件ずつ）のデータ・レコードがメモリ６６に格納されるので、メモリ６６は十分な容量を有している必要がある。 Such an operation can also be realized with the aforementioned SqlBulkCopy, OracleBulkCopy, or DB2BulkCopy. For example, SqlBulkCopy is assumed to be used in the data writing unit 68 of a server 62 that is directly connected to both the source database 60 and the duplicate database 70, as shown in FIG. 5. However, if the source database 60 and the data writing unit 68 cannot be connected directly, for example because of network constraints between the source database 60 and the server 62 or because a data processing step is involved rather than a simple table-to-table copy, the data acquisition unit 64 must temporarily accumulate all of the data in the memory 66 so that the SqlBulkCopy used by the data writing unit 68 can read it. Therefore, when SqlBulkCopy is applied as it is, a large number of data records (for example, 100,000 at a time) from the source database 60 are stored in the memory 66 as described above, and the memory 66 must have sufficient capacity.

しかしながら、図１に示すように、Ｗｅｂサーバのようなデータ取得部１６を設け、ソース・データベース１２とダイレクトに接続しないようにしたデータ書込部１８においてＳｑｌＢｕｌｋＣｏｐｙを適用すれば、メモリ２２にデータ・レコード（例えば、データ・レコード＃１）が格納されると、後続するデータ・レコード（例えば、データ・レコード＃２，＃３，＃４・・・・）がメモリ２２に格納されている途中であっても、格納されたデータ・レコード（データ・レコード＃１）が複製データベース１４へ書き込まれ、しかる後に、このデータ・レコード（データ・レコード＃１）が、メモリ２２から削除されるようになり、本実施形態に係るシステム１０が実現される。 However, if, as shown in FIG. 1, a data acquisition unit 16 such as a Web server is provided and SqlBulkCopy is applied in a data writing unit 18 that is not directly connected to the source database 12, then when a data record (for example, data record #1) is stored in the memory 22, the stored data record (data record #1) is written to the duplicate database 14 even while subsequent data records (for example, data records #2, #3, #4, ...) are still being stored in the memory 22, and the data record (data record #1) is thereafter deleted from the memory 22. The system 10 according to the present embodiment is thus realized.
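
The idea of feeding a bulk-write component from a stream of incoming records, rather than from a fully accumulated batch, can be illustrated with a rough Python analogue. This is not SqlBulkCopy, which is a .NET class; the sketch only assumes that `sqlite3`'s `executemany` plays a comparable role because it accepts any iterable of rows and consumes it lazily. The generator `received_records` is a hypothetical stand-in for records arriving from the Web server, and the `records` table is likewise hypothetical.

```python
import sqlite3

def received_records(n):
    """Hypothetical stand-in for records arriving one by one from the Web server."""
    for i in range(n):
        yield (i, f"payload-{i}")

def stream_into_replica(records, replica):
    """Write records to the replica as they are produced.

    `records` may be a generator, so the full set of records is never
    materialized as a list before writing starts.
    """
    replica.execute(
        "CREATE TABLE IF NOT EXISTS records (id INTEGER PRIMARY KEY, payload TEXT)"
    )
    replica.executemany("INSERT INTO records VALUES (?, ?)", records)
    replica.commit()
    return replica.execute("SELECT COUNT(*) FROM records").fetchone()[0]
```

A call such as `stream_into_replica(received_records(500), sqlite3.connect(":memory:"))` writes all 500 records without ever holding them in a list, which mirrors the small-memory behaviour the paragraph above attributes to the data writing unit 18.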

上述したように、本実施形態に係るシステム１０においては、上記のような作用により、ソース・データベース１２に格納されたデータ・レコードを複製データベース１４にコピーする場合、ソース・データベース１２に格納されたデータ・レコードのサイズが増えても、メモリ２２の容量を増設する必要なく、高速で、複製データベース１４へのコピーを完了することが可能となる。 As described above, in the system 10 according to the present embodiment, when the data record stored in the source database 12 is copied to the duplicate database 14 by the operation as described above, the data record stored in the source database 12 is stored. Even if the size of the data record increases, copying to the replication database 14 can be completed at high speed without the need to increase the capacity of the memory 22.

また、データ取得部１６およびデータ書込部のメモリの容量が十分にある場合には、並行して複数のバルク・コピーを実施しても良い。以上、本発明を実施するための最良の形態について、添付図面を参照しながら説明したが、本発明はかかる構成に限定されない。特許請求の範囲の発明された技術的思想の範疇において、当業者であれば、各種の変更例及び修正例に想到し得るものであり、それら変更例及び修正例についても本発明の技術的範囲に属するものと了解される。 If the data acquisition unit 16 and the data writing unit have sufficient memory capacity, a plurality of bulk copies may be performed in parallel. The best mode for carrying out the present invention has been described above with reference to the accompanying drawings, but the present invention is not limited to this configuration. Within the scope of the technical idea set forth in the claims, a person skilled in the art can conceive of various changes and modifications, and it is understood that such changes and modifications also fall within the technical scope of the present invention.

例えば、本実施形態では、図１に示すように、データ取得部１６とデータ書込部１８とがインターネットのような通信ネットワーク１９を介して接続された構成例を説明したが、本発明に係るシステムおよび方法は、このような構成に限定されるものではなく、通信ネットワークを排して、データ取得部１６とデータ書込部１８とをダイレクトに接続するようにしても良い。 For example, in the present embodiment, as illustrated in FIG. 1, the configuration example in which the data acquisition unit 16 and the data writing unit 18 are connected via the communication network 19 such as the Internet has been described. The system and method are not limited to such a configuration, and the data acquisition unit 16 and the data writing unit 18 may be directly connected without the communication network.

本発明に係るデータ一致化のためのシステムおよび方法は、ソース・データベースと複製データベースとを備えてなる、例えば、自治体の住民データ管理のためのデータベース・システムや、金融機関、証券会社、保険会社等のデータベース・システムにも好適に適用される。 The system and method for data matching according to the present invention are also well suited to database systems that include a source database and a duplicate database, such as database systems for managing resident data of local governments and database systems of financial institutions, securities companies, insurance companies, and the like.

DESCRIPTION OF SYMBOLS
10 System
12 Source database
14 Duplicate database
16 Data acquisition unit
18 Data writing unit
19 Communication network
20 Receiving unit
22 Memory
24 Writing unit
60 Source database
62 Server
64 Data acquisition unit
66 Memory
68 Data writing unit
70 Duplicate database

Claims

A system for achieving data matching between a source database and a duplicate database using bulk copy, comprising:
acquisition means for sequentially sending out a plurality of data records stored in the source database;
storage means for sequentially receiving the data records sequentially sent out by the acquisition means and sequentially storing them in a memory; and
writing means for, when a data record is stored in the memory, writing the stored data record to the duplicate database even while a subsequent data record is being stored in the memory, and for deleting the data record from the memory after the writing is completed,
wherein the writing means repeats the writing of the data record stored in the memory to the duplicate database and the deletion of the data record from the memory after the writing is completed until no data record remains in the memory, whereby the data records stored in the source database and the data records stored in the duplicate database are matched,
the acquisition means and the storage means are connected via a communication network, and the acquisition means is a Web server, and
the Web server is provided between the source database and the writing means so that the source database and the writing means are not directly connected.

The data matching system according to claim 1, wherein the acquisition means and the storage means operate independently of the writing means.

A method for achieving data matching between a source database and a duplicate database using bulk copy, wherein:
a Web server sequentially sends out a plurality of data records stored in the source database;
the sequentially sent data records are sequentially received and sequentially stored in a memory;
when a data record is stored in the memory, the stored data record is written to the duplicate database even while a subsequent data record is being stored in the memory, and the data record is deleted from the memory after the writing is completed;
the writing of the data record stored in the memory to the duplicate database and the deletion of the data record from the memory after the writing is completed are repeated until no data record remains in the memory, whereby the data records stored in the source database are matched with the data records stored in the duplicate database; and
the Web server is provided between the source database and the memory so that the source database and the memory are not directly connected.