CN111460000B

CN111460000B - Backtracking data query method and system based on relational database

Info

Publication number: CN111460000B
Application number: CN202010229649.6A
Authority: CN
Inventors: 谭凌
Original assignee: Individual
Current assignee: Individual
Priority date: 2020-03-27
Filing date: 2020-03-27
Publication date: 2021-01-12
Anticipated expiration: 2040-03-27
Also published as: CN111460000A; CN112463857B; CN112463857A

Abstract

The invention discloses a backtracking data query method based on a relational database, which can ensure that a program developer of business logic does not need to support the requirement of backtracking query when developing and compiling a business program, and a great amount of extra business codes are added or modified, for example, only tuple statements are needed to be updated once when the backtracking query is not needed to be supported, so that under the condition of supporting the backtracking query when the method is used, the tuple statements are also needed to be updated once and the same in the program codes, no extra business logic appears, the codes are easy to maintain, the readability is high, and the development is simpler; the extra conversion, execution and other work generated by supporting the backtracking query are automatically completed by the method and the system, and a large amount of business logic codes are not required to be changed.

Description

Backtracking data query method and system based on relational database

Technical Field

The invention relates to the technical field of data processing, in particular to a backtracking data query method and a backtracking data query system based on a relational database.

Background

When a data information system is developed, a relational database (such as MySQL, SQLServer, and the like) is mostly used, the current latest data state is stored in the relational database, and the data state at the past time cannot be inquired. However, in many cases, the past data state needs to be backtracked and inquired, in the prior art, a log table of some fields is separately established in a database, when data is updated, a main table used by business logic needs to be updated, records are inserted into the log table of the corresponding fields, and a certain field is recorded to be updated from a certain value to a certain value at a certain moment. When backtracking query is needed, the main table is queried, the log table is queried respectively, the relevant fields are replaced by the state of a certain past moment, and then the result is returned.

It can be seen that the above conventional backtracking query method is complex to implement, and to meet the need of "backtracking query support", originally simple program codes become very complex: 1. when "writing a database" (add, delete, update, etc.: an INSERT statement, a DELETE statement, or an UPDATE statement need only be executed once per script. However, in order to support the "backtracking query" information about how data is changed and the time of change need to be inserted into the log table of multiple fields, the number of corresponding tables into which records are to be inserted is how many fields. The complexity of program code has increased dramatically, making code difficult to maintain and debug. 2. When "reading a database" (query operation): only the SELECT statement needs to be executed originally. But in order to support the 'backtracking query', a plurality of field log tables need to be queried, and the results are spliced and then returned. The complexity of the program will increase further when the query involves multiple raw data tables.

Therefore, a new backtracking data query strategy is urgently needed in the market at present, and backtracking data query is realized under the condition that program developers do not need to perform additional code editing work, so that codes are easy to maintain, readability is high, and program development is simpler.

Disclosure of Invention

The invention provides a backtracking data query method and a backtracking data query system based on a relational database, which are used for solving the technical problem and further realizing

In order to solve the above technical problem, an embodiment of the present invention provides a backtracking data query method based on a relational database, including:

in a data table of a database needing to support backtracking query, two additional 'time-of-day' type fields are added, which respectively represent the 'start time' and 'end time' of a tuple. Also, a special value is defined to mean "current" meaning, which is called NOW value. The NOW value must you be able to fill in the "start time" and "end time" fields and has no business meaning.

Determining a constraint mode in a configuration file, determining names of tables supporting backtracking query, setting field names representing starting time and ending time in the tables, determining an expression of NOW value, and obtaining an SQL expression character string of current time to obtain a configuration file supporting backtracking query;

receiving an original newly added tuple statement, setting the initial time of a row to be newly added in the original newly added tuple statement as transaction execution time and setting the end time of the row to be newly added in the original newly added tuple statement as an NOW value when the table corresponding to the original newly added tuple statement is judged to belong to a table supporting backtracking query through the backtracking query supporting configuration file, generating a new newly added tuple statement, and executing and submitting the new newly added tuple statement;

receiving an original deletion tuple statement, and extracting a deletion condition in the original deletion tuple statement when the table related in the original deletion tuple statement is judged to belong to a table supporting backtracking query through the backtracking query supporting configuration file; establishing an update statement for the same table, regarding all tuples with NOW termination time, taking a deletion condition in the original deletion tuple statement as an update condition of the update statement, setting a termination time field value in the update statement as a transaction execution time, obtaining a new update statement, executing and submitting the new update statement;

receiving an original updating tuple statement, obtaining a current time T1 when a table corresponding to the original updating tuple statement is judged to belong to a table supporting backtracking query through the backtracking query supporting configuration file, and generating three new statements according to the original updating tuple statement; sequentially executing the three new statements and submitting the transaction in the same transaction; wherein, the three new sentences include: 1. generating a duplicate tuple statement for all tuples with NOW termination time according to the updating condition of the original update tuple statement, copying all tuples to be updated, setting the start time of all newly copied tuples as T1, and setting the termination time as NOW; 2. generating a first new updating statement according to the original updating tuple statement, and setting the termination time of all copied tuples as T1; 3. generating a second new updating statement according to the original updating tuple statement, and executing the updating action related to the original updating tuple statement on the newly copied tuple;

receiving an original query statement, recursively splitting nested statements in the original query statement until the innermost query statement is a single query statement, sequentially processing the single query statement from the innermost layer, replacing a placeholder string with the processed single query statement, splicing the replaced placeholder string into the previous layer of query statement, sequentially recursively until the outermost query statement is processed, and then sequentially replacing the placeholder string with the corresponding processed single query statement to generate the converted nested query statement.

Preferably, the step of setting the start time and the end time includes:

and combining the field or field group with the unique constraint in the data table needing backtracking query with the set termination time field to form the combined unique constraint.

Preferably, in the step of determining an expression for a NOW value, the NOW value is valid for the combined unique constraint.

Preferably, after the executing and submitting the newly generated statement, the method further includes:

and when the abnormity or the error is judged to occur, rolling back the transaction.

As a preferred scheme, the step of processing the single query statement includes:

extracting all table names and aliases from the original query sentence;

extracting a query condition from an original query statement, and generating a new query condition according to the query condition and the alias;

and replacing the query conditions in the original query statement with the new query conditions to generate a new query statement.

Preferably, the query condition includes a query current time and a query past time.

Preferably, the step of replacing the processed single query statement with a placeholder string includes:

and sequentially pulling out the character string segments of the single query statement temporarily stored in a list and sequentially replacing the corresponding placeholders.

The embodiment of the invention also provides a backtracking data query system based on the relational database, which comprises:

the file configuration module is used for determining a constraint mode in a configuration file, determining a table name supporting backtracking query, setting starting time and ending time, determining an expression of a NOW value, and acquiring an SQL expression character string of the current time to obtain a backtracking query supporting configuration file;

the new tuple adding module is used for receiving an original new tuple statement, setting the initial time of a row to be newly added in the original new tuple statement as transaction execution time and setting the end time of the row to be newly added in the original new tuple statement as an NOW value when the table corresponding to the original new tuple statement belongs to a table supporting backtracking query according to the backtracking query supporting configuration file, generating a new tuple statement, and executing and submitting the new tuple statement;

the tuple deleting module is used for receiving an original tuple deleting statement, and extracting a deleting condition in the original tuple deleting statement when the table corresponding to the original tuple deleting statement is judged to belong to a table supporting backtracking query through the backtracking query supporting configuration file; establishing an update statement for the same table, regarding all tuples with NOW termination time, taking a deletion condition in the original deletion tuple statement as an update condition of the update statement, setting a termination time field value in the update statement as a transaction execution time, obtaining a new update statement, executing and submitting the new update statement;

the updating tuple module is used for receiving an original updating tuple statement, acquiring the current time T1 when the table corresponding to the original updating tuple statement is judged to belong to a table supporting backtracking query through the backtracking query supporting configuration file, and generating three new statements according to the original updating tuple statement; sequentially executing the three new statements and submitting the transaction in the same transaction; wherein, the three new sentences include: 1. generating a duplicate tuple statement for all tuples with NOW termination time according to the updating condition of the original update tuple statement, copying all tuples to be updated, setting the start time of all newly copied tuples as T1, and setting the termination time as NOW; 2. generating a first new updating statement according to the original updating tuple statement, and setting the termination time of all copied tuples as T1; 3. generating a second new updating statement according to the original updating tuple statement, and executing the updating action related to the original updating tuple statement on the newly copied tuple;

and the backtracking query module is used for receiving the original query statement, recursively splitting the nested statement in the original query statement until the query statement at the innermost layer is a single query statement, sequentially processing the single query statement from the innermost layer, replacing a placeholder string with the well-processed single query statement, splicing the replaced placeholder string into the query statement at the previous layer, sequentially recursively until the query statement at the outermost layer is processed, and sequentially replacing the placeholder with the corresponding well-processed single query statement to generate the converted nested query statement.

Preferably, the step of setting the start time and the end time includes:

and combining the field or field group of the unique constraint in the data table needing backtracking query with the set termination time field to form a combined unique constraint.

extracting all table names and aliases from the original query sentence;

An embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program; wherein, when running, the computer program controls the device where the computer readable storage medium is located to execute the method for querying the backtracking data based on the relational database as described in any one of the above.

The embodiment of the present invention further provides a terminal device, which includes a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the processor implements the trace-back data query method based on the relational database according to any one of the above items when executing the computer program.

Compared with the prior art, the embodiment of the invention has the following beneficial effects:

the technical scheme of the invention can ensure that a program developer of business logic does not need to support the requirement of 'backtracking query' when developing and writing the business program, and a great amount of extra business codes are added or modified, for example, only one tuple statement is needed to be updated in the codes when the 'backtracking query' is not needed to be supported, so that under the condition of supporting the 'backtracking query' in the invention, the tuple statement is also needed to be updated in the program codes in a mode of one time, no extra business logic appears, the codes are easy to maintain, the readability is high, and the development is simpler; the extra conversion, execution and other work generated by supporting the backtracking query are automatically completed by the method and the system, and a large amount of business logic codes are not required to be changed.

Drawings

FIG. 1: the first example of the character string splitting in the embodiment of the invention is shown;

FIG. 2: fig. two is an exemplary diagram of splitting a character string according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1 and fig. 2, a preferred embodiment of the present invention provides a backtracking data query method based on a relational database, including:

s1, adding two additional "time-of-day" type fields in the data table of the database to support the backtracking query, respectively representing the "start time" and "end time" of a tuple. Also, a special value is defined to mean "current" meaning, which is called NOW value. The NOW value must you be able to fill in the "start time" and "end time" fields and has no business meaning. Determining a constraint mode in a configuration file, determining names of data tables supporting backtracking query, setting field names representing start time and end time in the tables, determining an expression of NOW value, and obtaining an SQL expression character string of current time to obtain a configuration file supporting backtracking query.

Specifically, the table built in the relational database in the technical scheme must meet certain rules. If the table supporting backtracking query is not needed, the table is normally established according to the service requirement without additional change. In any data table that needs to support backtracking query, in addition to the original service-related fields, two more accurate "time-of-day" type fields are added, which are ST _ START for "START time" and ST _ STOP for "end time" (when actually used, the specific field names are not necessarily ST _ START and ST _ STOP, and can be determined by themselves as long as they do not conflict with other service field names. The type of the two fields, ST _ START and ST _ STOP, requires the use of a high precision "time of day" type for each large database, and requires the expression of the year, month, day and specific time of day (e.g. datatime (3) in MySQL, and datatime 2(3) in SQLServer database, etc.). The precision aspect suggests a precision of 0.01 seconds or more, for example a datatime (3) precision of up to milliseconds in the MySQL database, and for example a datatime 2(4) precision of 0.1 milliseconds in the SQLServer database.

The time type and the precision need to be supported by the invention are not particularly limited, and the time type and the precision need to be determined according to a specifically used relational database, because the names, the precision and the expressions of the time types in different databases may be slightly different. The time accuracy is also determined according to the actual scene, since the time accuracy determines the minimum interval for modification of the same logical record. For example: assuming that the time type used is accurate to 1 second, if modifications are made to this record at 0.5 second intervals in succession, such modifications will not be allowed. Thus requiring the use of higher time accuracy. The time accuracy is therefore also determined by the usage scenario, which may take 1 second for accuracy if the usage scenario modification frequency is low, and 1 second for high accuracy if higher, typically by default a time accuracy of milliseconds or more is used (e.g. datatime (3) in MySQL database, such as datatime 2(3) in SQLServer database, etc.).

When the relational database constraint mode is adopted, all the fields with unique constraint on logic are repeatable, and the values in the fields in the actual data table are repeatable, so that the values of the fields are repeatable among tuples representing different time periods. Therefore, some changes are needed when the table is built, and the change method comprises the following steps: all original fields or field groups with unique constraints and the additionally added ST _ STOP 'end time' field form a combined unique constraint (the multi-field combined unique constraint is supported by each large correlation coefficient database). For example: the num field in a table t1 has a unique constraint logically, so when a trace-back query is supported and a relational database constraint mode is adopted, num and ST _ STOP should form a combined unique constraint, that is, the two fields of any two pieces of data in the relational data table cannot be equal at the same time. This is also the case if the table itself contains a combination unique constraint. For example: the three fields a, b and c in a table t2 have logically combined unique constraints, and the three fields of two data can not be equal to each other. At this time, if the ST _ STOP is required to be added into the combined unique constraint when backtracking query is supported and the relational database constraint mode is adopted, namely the combined unique constraint is logically arranged in the four fields of a, b, c and ST _ STOP in the table.

With respect to the selection and expression of the NOW value, the NOW value is a value that can be filled in the ST _ START "Start time" and ST _ STOP "end time" fields, representing the meaning of "current", which is a time that elapses forward every minute and second. A special value with no specific business meaning is applied to express the NOW. The NOW value selection principle is as follows: there is no special value for the meaning of a specific service, and this value must be valid for a combined unique constraint! After the selection is determined, the SQL specific expression of the value is recorded in the configuration file of the invention. For example: it is generally recommended that the NOW be represented by a value of "9999-12-3100: 00:00.000" (a time value at the maximum date of the datetime type), and that the string expression of "9999-12-3100: 00:00.000" be recorded in the system configuration file. When using the SQLServes database, Null values may also be used to represent NOW.

The condition "valid for the combination unique constraint" needs to be specified, that is, the value is constrained by the "combination unique constraint". If a particular value is not constrained, it cannot be used to represent a NOW. Examples are: in MySQL, Oracle, etc. databases, null is not constrained by combinatorial unique constraints, other values are not allowed to be repeated, but null values are allowed to be repeated (which is the case for most databases). Therefore, when such a database is used, NOW cannot be represented by null value, and it can be set that NOW is represented by a value of "9999-12-3100: 00: 00.000". Since this value is constrained and at the same time it is sufficiently special. After the selection decision, "9999-12-3100: 00:00.000" (note that the last ".000" indicates an accuracy of milliseconds) this string is written into the configuration file. Example 2: in the SQLServer database, null is a value that is constrained by a combinatorial unique constraint, so null can be used to represent NOW. After the selection is determined, the character string of 'null' is written into the configuration file.

After a series of key information is defined, they need to be registered in configuration files, and the system operation will read the information in these configuration files and store it in the system cache at startup. The system will frequently read this initial configuration information at run-time. Operations may be performed based on the configuration information and associated methods. The configuration file specific format is not specified here (the inventors' implementation is to use an XML file). Only the contents of the configuration file are described herein. The profile contents and requirements are as follows:

(1) need to support backtracking queries for those tables in the database? The full table names of all tables that need to be supported are listed.

(2) Start time list name: the column name is self-defined, the name can not be duplicated with the original service field, and the suggested name is 'ST _ Start'.

(3) End time column name: self-defining column names, and being incapable of being renamed with original service fields, wherein the suggested name is 'ST _ Stop'.

(4) Expression of NOW values: according to the used database customization, firstly, the value is ensured to be written into the column of the 'termination time' ST _ Stop, and special values which can be accommodated by the ST _ Stop field and have no other business logic meaning are used for splicing the SQL statement. For example, using a null value, "null" is registered. If NOW is represented using a special time, a constant SQL expression for this time is registered with an accuracy that matches the ST _ Start and ST _ Stop fields at the time of the tabulation. Such as "'9999-12-3100: 00: 00.000'". (Note that if NULL IS used, the relevant syntax for SQL IS "field name IS NULL" or "field name IS NOT NULL").

(5) Acquiring the SQL expression character string of the current time: for automatic generation of new SQL statements, depending on the relational database used and the version, i.e. this database gets the functional expression of the current time (time accuracy needs to match the ST _ Start and ST _ Stop fields accuracy). For example, when the SQLServer is used, the registered content is "getdate ()", and when the MySQL database is used, the registered content is "now (3)" (this means that the current time is acquired and the accuracy is 0.001 second).

S2, receiving an original newly added tuple statement, setting the starting time of a row to be newly added in the original newly added tuple statement as transaction execution time and setting the ending time of the row to be newly added in the original newly added tuple statement as NOW value when the table corresponding to the original newly added tuple statement is judged to belong to a table supporting backtracking query through the backtracking query supporting configuration file, generating a new newly added tuple statement, executing and submitting the new newly added tuple statement.

When a tuple is newly added, a user inputs an original Insert statement, and the system firstly searches a related table in the configuration without supporting backtracking query if the original Insert statement is not required to be directly executed. If support is needed, a new record is inserted into the relational library, the value of the initial time field is the time when the transaction is executed, the value of the termination time field is set as a value of NOW, and the NOW value expresses that the record is found and filled in from the configuration. The specific method comprises the following steps:

(1) and (3) finding out whether the table to be inserted into the record needs to support backtracking query or not from the system configuration record, if not, directly executing the originally input INSERT statement, and if so, entering the step (2).

(2) Inserting the tuple data to be inserted, inserting each service field correspondingly, setting the start time of the tuple as transaction execution time when adding new tuple, and setting the end time of the tuple as NOW value. That is, according to the original Insert statement, two fields are automatically added in the Insert statement, namely "start time" and "end time", field names of the two fields are searched from a configuration file, and the field names are added to a column list in the original Insert statement. And similarly, adding two expressions of value into the inserted value list, searching the expression of 'obtaining SQL expression of current time' and NOW value from the configuration file, respectively adding the expression of 'obtaining SQL expression of current time' and NOW value into the position corresponding to 'starting time' and 'ending time' in the value list, generating a new Insert statement, executing and submitting the new Insert statement.

If the only constraint conflict occurs, the database returns an error and prompts the execution failure, and the database rolls back (the specific relational database has self-constraint to detect the conflict). Examples are: using the SQLServer as a relational database, a piece of data is inserted in table1 (table name table1) as follows: the wage for "wangwu" with the wage number "009" is "8500". In the configuration: NOW is represented by a "Null" value, the acquisition time expression is "getdate ()", the start time column name is designated as "start time", and the end time column name is designated as "end time".

Shop number	Name (I)	Payroll	Starting time	End time
					007	Zhang three	7000	2015/1/1 9:00	2016/1/1 8:00
007	Zhang three	10000	2016/1/2 8:00	2018/1/1 8:00
					008	Li four	9000	2014/1/1 9:00	2016/1/1 8:00
008	Li four	12000	2017/1/2 9:00	null

Table1, example Table one

The user's originally entered INSERT statements are as follows (since the "start time" and "end time" are read and written by the system, the user entered SQL statements may override them): INSERT intot table1 (payroll, name, payroll) VALUES ('009', 'wangwu', '8500').

The system automatically adds two fields of start time and end time and an insertion value according to the original INSERT statement and the configuration file, and automatically generates the following new INSERT statement: INSERT INTOs table1 (job number, name, payroll, start time, end time) VALUES ('009', 'wang five', '8500', getdate ()), null.

Getdate () is a function of SQL in the SQLServer relational database for obtaining the current time, and the expression string of this function is read from the system configuration. null is also a string expression defined in the configuration file that expresses the meaning of NOW. After the above statements are executed and committed, the state of the data table is as shown in Table2 below (assuming that the time at insertion is 2019/12/259: 00):

TABLE2 example Table two

S3, receiving an original deletion tuple statement, and extracting a deletion condition in the original deletion tuple statement when the table corresponding to the original deletion tuple statement is judged to belong to a table supporting backtracking query through the backtracking query supporting configuration file; establishing an update statement for the same table, regarding all tuples with NOW termination time, taking the deletion condition in the original deletion tuple statement as the update condition of the update statement, setting the termination time field value in the update statement as the transaction execution time, obtaining a new update statement, executing and submitting the new update statement.

Tuples are deleted, which is not true physical deletion in the case of backtracking queries. But logically deleted, leaving the validity time of a tuple to end. The delete tuple operation is actually an update operation in the relational database. The specific method comprises the following steps: in a relational database, the "expiry time" of all tuples (possibly more than one) that meet the logical deletion condition and correspond to an expiry time value of NOW value are set as transaction execution time in a certain table, while the values of other fields are not changed.

The user inputs the original DELETE statement, the system searches the related table in the first-come configuration without supporting backtracking query, and if not, the original DELETE statement is directly executed. If backtracking query needs to be supported, a WHERE clause (deleted condition) of the DELETE statement is extracted. An UPDATE statement is generated for the table, wherein the SET clause only modifies the "time to expire" (field name read from system configuration), which is modified to be the transaction execution time (fetch the current time expression read from system configuration), and then the WHERE clause condition of the UPDATE statement is: "WHERE clause condition of original DELETE statement" and "end time is now value" (expression of now is read from system configuration). And completing the generation of the new UPDATE statement splicing. If NOW IS represented by a Null value in the configuration, the SQL syntax IS "IS NULL at termination", and if NOW IS represented by another special value in the configuration, the SQL syntax IS "XXXXXXX'". Examples are: using the SQLServer as a relational database, it is deleted in table3 (table name table1) as follows: tuple with job number "008". Assume a system configuration in which: NOW is represented by a "Null" value, the acquisition time expression is "getdate ()", the start time column name is designated as "start time", and the end time column name is designated as "end time".

TABLE3 example Table III

The DELETE statement originally entered by the user is as follows: the DELETE FROM table1WHERE job number is '008'.

The present system automatically generates the following new UPDATE statement from the original DELETE statement: the UPDATE table1 SET end time IS getdate () WHERE number IS '008' AND end time IS NULL.

Table name table1 is not changed, only one field of "end time" is updated in the SET clause, the SET value is the transaction execution time, getdate () is the function used by SQL in the SQLServer relational database to obtain the current time, and the expression string of this function is read from the system configuration. The condition in the WHERE clause is "WHERE clause condition of original DELETE sentence" AND "end time is null". null is a string expression defined in the system configuration to express the meaning of NOW, AND in SQL statements represents the meaning of "logical AND". After the newly generated UPDATE statement is executed and committed, the state of the data table is as shown in Table 4 below (assuming that the logical delete time is 2019/12/259: 00):

TABLE 4 example Table four

S4, receiving an original update tuple statement, obtaining a current time T1 when the table corresponding to the original update tuple statement is judged to belong to a table supporting backtracking query through the backtracking query supporting configuration file, and generating three new statements according to the original update tuple statement: 1. and for all tuples with NOW termination time, generating a duplicate tuple statement according to the updating condition of the original updating statement, copying all tuples to be updated, wherein the start time of all newly copied tuples is T1, and the termination time is set to NOW. 2. From the original update statement, a new update statement is generated, setting the termination time of all copied tuples to T1. 3. And generating another new update statement according to the original update statement, and executing the update action related to the original update statement only on the newly copied tuple. The three newly generated statements are executed in sequence in the same transaction and the transaction is committed.

The updated tuple is relatively complex, and in the case of supporting backtracking query, it can be logically equivalent to deleting the original tuple and inserting the updated new tuple at the same time. An update operation may affect multiple tuples simultaneously. The updating of the tuple is to copy the tuples meeting the updating condition and having the termination time NOW in the same table, change the termination time NOW of all the copied tuples into the transaction execution time (current time), set the start time NOW of the newly copied tuples as the transaction execution time (current time) and the termination time NOW, and finally perform the updating operation expressed by the original UPDATE statement on the newly copied tuples. Particular emphasis is given to: all of the above processes need to be performed successfully together or fail to roll back to a state before the transaction is started in one transaction. The specific method is to use insert intro select statement:

(1) the user inputs the original UPDATE statement, the system searches the related table in the configuration first without supporting backtracking query, if not, the original UPDATE statement is directly executed, and the submission is finished. If the backtracking query needs to be supported, a new database transaction is started, and step (2) is entered.

(2) The data table meta-information involved is obtained, mainly including all field names in this table and their default order. These must include two fields, start time and end time. Various programming languages have a code method for obtaining data table meta information for various databases, and no additional creativity is needed in the step.

(3) The program acquires the current time, noted as T1. As the time of execution. This can be obtained by a select now () statement, a select getdate () statement, or by an API of the program. The current time is obtained separately here in order to find out which filled time values are consistent in the following steps.

(4) An insert into select statement is automatically generated according to the UPDATE statement input by a user originally: where insert into is followed by the table name in the original UPDATE statement. Where the select key is followed by a meta-information order listing all field names of the table, including the start time field but not the "start time" and "end time" fields. And filling (3) the acquired execution time (which can be a direct SQL expression or a precompiled mode and then filling parameters) in the field position of the start time, and filling the now value of the system configuration in the field position of the end time. The last where clause condition is "the original update's where clause condition" as a whole and the "end time" field is the value now. The NOW value and the field names "start time" and "end time" are obtained from the system configuration, and if the NOW in the configuration IS expressed by NULL, the expression syntax IS "end time IS NULL", and if the new value IS other value, the new value IS "end time XXXX". After the new insert into select statement is generated, the qualified tuples are copied, but the termination time of the newly copied tuples is the time point T1 obtained in step (3).

(5) Automatically generating a new update statement according to the update statement input by the user originally: the set clause of the new update statement only needs to set "end time ═ time point T1" (which may be a direct SQL expression or through a precompilation mode and then fill in parameters) acquired in step (3). The condition of the where clause of the new update statement is "the where clause condition of the original update" as a whole and (the "end time" field is a not value, while the "start time" is not equal to the time point T1 obtained in step (3)). The now value and the field names "start time", "end time" are obtained from the system configuration, and if NULL, the expression syntax IS "end time IS NULL", and if other values, "end time ═ XXXX". And executing after generating a new update statement. This statement is intended to "logically delete" the tuple being copied.

(6) Automatically generating a second new update statement according to the update statement input by the user originally: the set clause list of the new update statement replicates the list of the original statement. The condition of the where clause of the new update statement is "the where clause condition of the original update" as a whole and the "end time" field is the value now. The now value and the field names "start time", "end time" are obtained from the system configuration, and if NULL, the expression syntax IS "end time IS NULL", and if other values, "end time ═ XXXX". And executing after generating a new update statement. This statement is intended to update the newly replicated tuple.

(7) The transaction is committed and completed if successful. If any place in the above steps (2) - (6) or an exception or error occurs at the time of submission, the transaction is rolled back, which is equivalent to not doing the same execution.

Examples are: using SQLServer as a relational database, all staff wages less than 10000 were increased by 2000 in Table 5 below (table name table1, supporting backtracking queries). Assume a system configuration in which: NOW is represented by a "Null" value, the acquisition time expression is "getdate ()", the start time column name is designated as "start time", and the end time column name is designated as "end time".

TABLE 5 example Table five

The UPDATE statement originally entered by the user is as follows: UPDATE table1 SET payroll +2000WHERE payroll < 10000.

Because table1 supports backtracking queries, a new transaction is first started; acquiring meta information of a table1 table, and obtaining all field names and sequences thereof as follows: payroll number, name, payroll, start time, end time; then, the current time value (the precision reaches the standards of "start time" and "start time") is obtained, and the obtained current time T1 value is assumed to be "2020-1-212: 00:00.000", which is used in the following many times.

An insert into select statement is automatically generated from the original UPDATE statement above, with the table name being the same as the original UPDATE, and the select key followed by the meta-information order listing all the field names of the table, but excluding the "start time" and "end time" fields. And '2020-1-212:00:00.000' is filled in the "start time" field position and null is filled in the "end time" field position. The last word clause is conditioned as the "original update's word clause condition" in its entirety and the "end time" field is null. The insert into select statement generated is: INSERT INTERO table1 SELECT Job number, name, payroll, '2020-1-212:00:00.000', NULL WHERE (payroll <10000) AND (time of termination IS NULL), after execution (transaction not committed), the temporary state of Table 6 IS as follows:

TABLE 6 example Table six

Then a first new UPDATE statement IS generated from the original UPDATE statement, the set clause in the new statement only needs to set "end time ═ 2020-1-212:00:00.000 '", the condition of the new UPDATE statement's where IS the "original UPDATE's where clause condition" as a whole and ("end time" IS NULL, and "start time" IS not equal to T1: '2020-1-212:00:00.000'), then the first new UPDATE statement IS as follows: the UPDATE table1 SET end time ═ 2020-1-212:00:00.000' WHERE (payroll <10000) AND (end time IS NULL AND start time | = '2020-1-212:00:00.000'), after execution (transaction uncommitted), the temporary state of the table IS as follows:

TABLE 7 exemplary TABLE seven

A second new UPDATE statement is then generated from the original UPDATE statement. The set clause list of the new UPDATE statement replicates the list of the original statement. The condition of the where clause of the new update statement is (where clause condition of the original update: payroll <10000) in its entirety and (expiration time is null). The following were used: the UPDATE table1 SET payroll +2000 wore (payroll <10000) AND (end time IS NULL), after execution (transaction not committed), the temporary state of table 8 IS as follows:

TABLE 8 example Table eight

If all the previous steps are executed without errors, the transaction is submitted, and the operation is finished. If any step throws an exception or error, execution fails, the transaction rolls back, and the table is restored to the state before the transaction started.

S5, receiving an original query statement, recursively splitting nested statements in the original query statement until the query statement at the innermost layer is a single query statement, sequentially processing the single query statement from the innermost layer, replacing the single query statement which is well replaced by a placeholder string, splicing the single query statement into the query statement at the upper layer, sequentially recursively until the query statement at the outermost layer is processed, and sequentially replacing the placeholder with the corresponding processed single query statement to generate the converted nested query statement.

The query operation is an operation showing expected effects, and the previous methods from table building to operations of adding, deleting and the like are all used for supporting backtracking query, and the backtracking query is shown in query (SELECT). The query operation is divided into two cases: one is a normal query, which is the query current. Another is a backtracking query, which is a query at some point in time in the past. When a user inquires, an original SELECT statement needs to be input, the original SELECT can disregard two extra fields of 'starting time' and 'ending time', only the service field of the table itself needs to be concerned, and the method is just like the method without supporting backtracking inquiry. The difference between the backtracking query and the ordinary query is that, in addition to the original SELECT statement, a past time point, named "query time", needs to be input when backtracking query is performed, and represents the result of the SELECT statement when i want to query this time point. If the query is a normal query, the time point is not input. The invention adopts program API mode to input 'query time' point, which is a parameter of a function (function) or method (method) in the program, that is, the parameter of the query function is divided into a time object besides the SELECT statement.

The essence of the SELECT operation is to change the character string of the original SELECT statement, newly generate a new SELECT statement, and then execute the new SELECT statement to return the result. The newly generated SELECT is different from the original SELECT statement in that the WHERE clause is added with the query conditions of two added fields of "start time" and "end time".

The above-mentioned processing of the SELECT statement and the generation of a new statement are complicated. Because the SELECT statement is nestable, it is common to nest the SELECT statement. When processing nested SELECT statements, the statements are required to be recursively split according to brackets (), whether each pair of () internal is a SELECT statement or not is analyzed, if yes, the statements are extracted to be continuously split, and if not, no processing is carried out until the statements can not be split any more. And after the splitting is finished, processing is started from the SELECT statement of the innermost layer. This process is termed herein as "single SELECT statement processing" (the method of processing is detailed below). It should be noted that the SELECT statement processed by "single SELECT statement processing" is definitely not nested because it has been recursively split. The method comprises the steps of sequentially processing the SELECT statement of the innermost layer, recording the processed SELECT statement, representing the SELECT statement by using a special occupation string, replacing and splicing the special occupation string into the SELECT statement of the previous layer, performing single SELECT statement processing on the SELECT statement of the previous layer when the SELECT statement of the previous layer is definitely not nested in syntax, and sequentially recursing until the SELECT statement of the outermost layer is processed. And then sequentially replacing the complete-processing SELECT statement represented by all the occupation bit strings. Eventually generating a new SELECT statement.

The input of the "single SELECT statement processing" and the "single SELECT statement processing" is a set of non-nested SELECT statement character strings and a time object, the time object represents "query time", and the "query time" may be empty, and if the time object is empty, the time object represents the current state of the query. The result of the "single SELECT statement processing" is also a new string of SELECT statements. Directly executing the newly generated SELECT statement can query the state of the current data table or a certain time. The simple "single SELECT statement processing" is the string splitting and splicing conversion processing of the SELECT statement. The specific method comprises the following steps:

(1) all table names and aliases (if any) are read FROM the FROM clause of the original SELECT statement. And comparing the table names in the FROM clause with the full table names which need to support backtracking query in the system configuration, and recording which table names support backtracking query and do not need support. If the tables related to the original SELECT statement do not need to support backtracking query, no processing is carried out, the original SELECT statement is returned and the processing is finished, otherwise, the step (2) is carried out

(2) And intercepting a WHERE clause from an original SELECT statement, backtracking and querying an 'alias' list of the data tables according to the query condition in the WHERE clause and the support obtained in the last step, and generating a new query condition by directly using table names (assuming that the aliases of the tables related to the statement are respectively table1, table2 and table3 … …) if no alias exists.

a) If the current moment is queried: connecting the original query condition AND the "end time of NOW" of each table supporting backtracking query with "AND" logic (AND logic), namely:

(conditions in original WHERE clause) AND (table1. end time is NOW) AND (table2. end time is NOW) AND (table1. end time is NOW) … …

b) If the query has passed a certain time: connecting the original query condition AND the start time of each table supporting the backtracking query earlier than the query time AND (the end time is later than OR equal to the query time OR AND the end time is NOW) by using AND logic (AND logic), namely:

(conditions in the original WHERE clause) AND

(table1. start time < query time AND (table1. end time is NOW OR table1. end time >: query time)) AND

(table2. start time < query time AND (table2. end time is NOW OR table2. end time >: query time)) AND

(table3. start time < query time AND (table3. end time is NOW OR table3. end time >: query time)) … …

The "less than" ("<") can be replaced by "less than or equal to" ("< ═") and the "greater than or equal to" ("> =") can be replaced by "greater than" (">"), which can be determined according to actual situations.

(3) And replacing the query condition generated in the previous step with a WHERE clause in the original SELECT statement, and if the WHERE clause does not exist in the original SELECT statement (namely all queries), adding the WHERE clause, wherein the conditions are the same as the above. And other parts of the original SELECT statement are unchanged, and a new SELECT statement is generated. And ending the 'single-SELECT statement processing' and returning to the newly generated SELECT statement.

Examples are: the SQLServer is used as a relational database, in which there is a table name "table 1" and a field name "job number" in the table. Assume a system configuration in which: NOW is represented by a "Null" value, the Start time column name is designated "ST _ Start", and the end time column name is "ST _ Stop". The following original SELECT statement is executed: SELECT FROM table1WHERE job number is '001'.

If the current state is queried, the following SELECT statement is generated: SELECT FROM table1WHERE (job number ═ 001') AND (table1.st _ Stop IS NULL).

If data at some point in time in the past is queried, such as querying the state of the data system at time 2015.1.19: 00:00.000, then the following SELECT statement is generated: SELECT FROM table1WHERE (job number is '001') AND (table1.st _ Start < '2015.1.19: 00:00.000' AND (table1.st _ Stop is null OR table1.st _ Stop > '2015.1.19: 00: 00.000').

Based on the above "single SELECT statement processing", the "processing of a nested SELECT statement" is now described, and similarly to the above, "processing of a nested SELECT statement" inputs a string of SELECT statements that may have nesting and a time object, where the time object represents "query time", and "query time" may be empty, and if empty, represents the current state of the query. The output result of the nested SELECT statement processing is also a newly generated string of SELECT statements. Directly executing the newly generated SELECT statement can query the state of the current data table or a certain time. The 'nested SELECT statement processing' is also the string splitting and splicing conversion processing of the SELECT statement.

The basic idea of the 'processing of nested SELECT statements' is to split individual SELECT statements according to a structure divided by brackets (), perform 'single SELECT statement processing' one by one, and then re-splice the processed results into nested statements.

For example, the following sentence structure: a (B) C (D (E) F) G, where each capital letter represents a part separated by brackets, which must be included between pairs () if the Select statement is nested, e.g., "B", "E", "D (E) F", are all likely to be a legal Select statement. The algorithm first sees the string as a segment divided by "(" and ")" and sets a hierarchy for each segment. The detailed method comprises the following steps:

(1) an ordered list (abbreviated as list) is created to store character strings, a shaping variable representation level is created, and the initial value is 1.

For the character string of the originally input SELECT statement, all separating characters (blank spaces, tabulation characters and the like) before and after the character string are removed, and if the outermost layer of the character string is a pair of brackets (), the brackets on the outermost layer are removed. Traversing the whole character string, defining the first "(" the previous character substring as the first segment, adding into an ordered list, setting the level of the first segment as 1, truncating the previous character substring to form a new segment (even if an intercepted empty string is used as the new segment) when the "(" character "is encountered later), adding 1 to the level (level), and copying the value of the current level (level) to the level attribute of the segment.

And finally ensuring that the last segment of the segment chain is in the first layer, and if the last segment is in the second layer, adding an empty string segment behind to fill up in the first layer. If the last section is at a position larger than the second layer, the bracket structure is not complete, and the input statement has grammar errors. The above-described character string may be expressed in the structure shown in fig. 1 below, which is now named "fragment chain data structure". The top layer (layer 3 in the example) fragments are all directly bracketed between "(" and ")" (e.g., "E"). In the "fragment chain data structure", all fragments may be accessed in sequence, predecessors and successors of any fragment may be found, or all fragments of a certain layer may be accessed in sequence or sequence number. After the 'fragment chain data structure' is built, the step (2) is entered

(2) The first segment (segment E in the drawing) in the current top-level (layer 3 in the drawing) ordering is subjected to "segment analysis action", which is to judge whether the segment starts with SELECT (case-insensitive) after the separator is removed and converted into capitalization, and if not, no processing is performed on the segment (because the segment included between "()" is not necessarily a SELECT statement of a sub-query, but is also likely a logic condition in a logic expression in a Where clause, or a parameter included in parentheses in a database self-contained function), and if so, the above-mentioned "single SELECT statement processing" (see above).

If the 'single SELECT statement processing' is carried out, the character string result after the analysis processing is temporarily stored, and a placeholder is used for corresponding to the character string result, wherein the placeholder is equivalent to a virtual table name. The placeholder format can be self-defined as long as the SQL original character string does not conflict in meaning, and can be arranged by a special prefix and a number, and a space (such as "$ V _ TABLE _ 1") is added before and after the placeholder.

(3) Merging fragments: for the segment just processed in step 2, if the processed segment is not a SELECT statement, the three strings "predecessor of the processed segment", "(original processed segment)", "successor of the processed segment" are directly merged and the parentheses are kept to be merged. If the fragment being processed is a SELECT statement, the string fragment just processed is replaced with its corresponding placeholder. And splicing and merging the three character strings of 'predecessor of the processed fragment', 'corresponding placeholder of the processed SELECT statement' and 'successor of the processed fragment'. Make it a new string fragment, and then remove the three fragments of "predecessor of processed fragment", "processed fragment" and "successor of processed fragment" from the "fragment chain data structure" (removed in the global list, and also removed in the corresponding tier). And simultaneously, inserting the new character string segment into a 'segment chain data structure', wherein the total ordering of the new character string segment in list is the same as that of the original 'predecessor of the processed segment', and the hierarchy of the new character string segment is the same as that of the original predecessor or successor segment of the processed segment. For example, after the segment "E" is processed, its processed segment and the segments "D" and "F" are merged into a new segment "D (E) F", as shown in fig. 2 (b).

(4) After step (3) is executed, it is checked whether the top layer is empty (i.e. no fragment belongs to the top layer), if so, the top layer is cleared, and the level of the whole data structure is reduced by 1. If the number of the remaining layers of the whole structure is more than 1, the step (2), (3) and (4) are repeatedly executed by switching to the step (2). If the number of remaining layers of the entire structure is equal to 1, the algorithm proceeds to step (5). The results after the treatment are schematically shown in (c) in fig. 2 and (d) in fig. 2 below.

(5) After the above steps are completed, a new character string (shown in (d) in fig. 2) is obtained, the character string may contain a plurality of placeholders, all the character string fragments of the SELECT statement temporarily stored in step (2) are sequentially taken out, and the corresponding placeholders are sequentially replaced, so that the conversion of the nested query SELECT statement is completed, and the converted nested SELECT statement is obtained. The algorithm ends.

The invention relates to a computer technology method, and a software 'middleware' is developed according to the computer technology method, and the software 'middleware' runs on a relational database and forms a data system together with the relational database. The data system supports backtracking query, and each deleting or updating transaction does not physically delete or update the tuple state in the database, but logically deletes or updates a certain tuple by setting a time field with special meaning. When executing an UPDATE statement, the system converts a plurality of corresponding other SQL statements according to the whole original UPDATE statement and executes the SQL statements in a transaction, modifies the state of a data table and simultaneously inserts corresponding time information, thereby realizing the effect of logically updating. (INSERT and DELETE operate similarly)

The past historical state of the query data can be backtracked according to conditions during query. The system can add a time query condition in a WHERE word of each layer of nested SELECT sentences according to an original SELECT sentence and a certain method to automatically generate a new SELECT query sentence, and execute the new SELECT query sentence and backtrack the data state of a certain time point after query.

When the method and the system are used, the requirement for supporting the aspect of 'backtracking query' is independently extracted and is irrelevant to specific business logic. And is designed to be implemented as a software middleware system using the method of the present invention. All the related work of the requirement of supporting the backtracking query is completed by the system, a service logic programmer does not need to additionally consider the problem in the aspect, a program developer does not need to additionally increase or modify service codes because the requirement of supporting the backtracking query when developing and writing the service program, for example, only one UPDATE statement needs to be executed in the codes when the requirement of supporting the backtracking query is not needed, and under the condition of supporting the backtracking query when the system is used, the program codes also only need to be subjected to one-time same UPDATE statement, and no additional service logic appears, so that the codes are easy to maintain, high in readability and simpler to develop.

The above-mentioned embodiments are provided to further explain the objects, technical solutions and advantages of the present invention in detail, and it should be understood that the above-mentioned embodiments are only examples of the present invention and are not intended to limit the scope of the present invention. It should be understood that any modifications, equivalents, improvements and the like, which come within the spirit and principle of the invention, may occur to those skilled in the art and are intended to be included within the scope of the invention.

Claims

1. A backtracking data query method based on a relational database is characterized by comprising the following steps:

receiving an original newly added tuple statement, setting the initial time of a row to be newly added by the original newly added tuple statement as transaction execution time and setting the termination time of the row to be newly added by the original newly added tuple statement as an NOW value when the table corresponding to the original newly added tuple statement belongs to a table supporting backtracking query according to the backtracking query supporting configuration file, generating a new newly added tuple statement, and executing and submitting the new newly added tuple statement;

receiving an original deletion tuple statement, and extracting a deletion condition in the original deletion tuple statement when the table corresponding to the original deletion tuple statement is judged to belong to a table supporting backtracking query through the backtracking query supporting configuration file; establishing an update statement for the same table, regarding all tuples with NOW termination time, taking the deletion condition in the original deletion tuple statement as the update condition of the update statement, setting the termination time field in the update statement as transaction execution time, obtaining a new update statement, executing and submitting the new update statement;

2. The relational database-based backtracking data query method according to claim 1, wherein the step of setting the start time field and the end time field comprises:

and combining the original unique constraint field or field group in the data table in the database with the set termination time field to form a combined unique constraint.

3. The relational database-based backtracking data query method according to claim 2, wherein in the step of determining the expression of the NOW value, the NOW value is valid for the combined unique constraint.

4. The relational database-based backtracking data query method according to claim 1, wherein when three new statements are generated according to the original update tuple statement, and the three new statements are sequentially executed in the same transaction and the transaction is submitted, the method further comprises:

5. The relational database-based backtracking data query method according to claim 1, wherein the step of processing the single query statement comprises:

extracting all table names and aliases from the original query sentence;

extracting a query condition from an original query statement, and generating a new query condition according to the query condition and the table name or the alias;

6. The relational database-based backtracking data query method according to claim 5, wherein the query condition includes a query current time and a query past time.

7. A backtracking data query system based on a relational database, comprising:

the file configuration module is used for determining a constraint mode in a configuration file, determining the name of a table supporting backtracking query, setting field names representing starting time and ending time in the tables, determining an expression of NOW value, and acquiring an SQL expression character string of the current time to obtain a backtracking query supporting configuration file;

the new tuple adding module is used for receiving an original new tuple statement, setting the initial time of a row to be newly added by the original new tuple statement as transaction execution time and setting the termination time of the row to be newly added by the original new tuple statement as an NOW value when the table corresponding to the original new tuple statement is judged to belong to a table supporting backtracking query by the backtracking query supporting configuration file, generating a new tuple statement, executing and submitting the new tuple statement;

the tuple deleting module is used for receiving an original tuple deleting statement, and extracting a deleting condition in the original tuple deleting statement when the table corresponding to the original tuple deleting statement is judged to belong to a table supporting backtracking query through the backtracking query supporting configuration file; establishing an update statement for the same table, regarding all tuples with NOW termination time, taking the deletion condition in the original deletion tuple statement as the update condition of the update statement, setting the termination time field in the update statement as transaction execution time, obtaining a new update statement, executing and submitting the new update statement;

the updating tuple module is used for receiving an original updating tuple statement, receiving the original updating tuple statement when the table corresponding to the original updating tuple statement is judged to belong to a table supporting backtracking query through the backtracking-supporting query configuration file, acquiring the current time T1 when the table corresponding to the original updating tuple statement is judged to belong to the table supporting backtracking query through the backtracking-supporting query configuration file, and generating three new statements according to the original updating tuple statement; sequentially executing the three new statements and submitting the transaction in the same transaction; wherein, the three new sentences include: 1. generating a duplicate tuple statement for all tuples with NOW termination time according to the updating condition of the original update tuple statement, copying all tuples to be updated, setting the start time of all newly copied tuples as T1, and setting the termination time as NOW; 2. generating a first new updating statement according to the original updating tuple statement, and setting the termination time of all copied tuples as T1; 3. generating a second new updating statement according to the original updating tuple statement, and executing the updating action related to the original updating tuple statement on the newly copied tuple;

8. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a stored computer program; wherein the computer program controls the device where the computer readable storage medium is located to execute the method for querying the backtracking data based on the relational database according to any one of claims 1 to 6 when running.

9. A terminal device, comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the relational database-based backtracking data query method according to any one of claims 1 to 6 when executing the computer program.