CN116383242B

CN116383242B - An online index selection method and apparatus based on neural networks and decay strategies

Info

Publication number: CN116383242B
Application number: CN202310303515.8A
Authority: CN
Inventors: 王森洋; 吴小莹
Original assignee: Wuhan University WHU
Current assignee: Wuhan University WHU
Priority date: 2023-03-23
Filing date: 2023-03-23
Publication date: 2025-11-25
Anticipated expiration: 2043-03-23
Also published as: CN116383242A

Abstract

This invention discloses an online index selection method and apparatus based on neural networks and a decay strategy, relating to the database field. The method includes: training a tree convolutional neural network to output the corresponding execution cost based on the query execution plan; constructing a hash table; obtaining new values for each position in each array of the hash table based on decay parameters; disabling existing indexes in the database and obtaining the default execution cost of a new query under the default index configuration based on the tree convolutional neural network; calculating the benefit of each column index on the new query for the new query based on the default execution cost of the new query, and adding this benefit to the tail of the corresponding array in the hash table; enabling existing indexes based on index instructions in the database, sorting the indexes according to their total benefit, and selecting a preset number of indexes from high to low as the new index configuration. This invention achieves higher accuracy in calculating query execution costs.

Description

On-line index selection method and device based on neural network and attenuation strategy

Technical Field

The invention relates to the field of databases, in particular to an online index selection method and device based on a neural network and an attenuation strategy.

Background

In recent years, with the entry into the large data age, database systems have become more and more complex, and the amount of data therein has become larger and larger. The database in operation is to process the queries sent by the users continuously, however, when there is a large amount of data in the database, it is not easy to ensure that the database can have a faster query processing speed.

In general, in order to reduce the execution time of a query and increase the running speed of a database, an appropriate index configuration may be selected for the database. An index is a physical structure built on a column in a database, and occupies a certain storage space, but can improve the execution efficiency of queries in the database. The index configuration is a set of indexes. However, it is difficult for database administrators and users to choose a better index configuration because in a larger database, there are typically hundreds of tables, thousands of columns, and because of memory space limitations, only a small fraction of columns can be selected to build an index, and it is difficult to select the appropriate columns among so many columns to build an index. And because new queries are continually sent to the database, database administrators need to continually adjust the index configuration for these new queries, it is more cumbersome to select a valid index configuration in such a dynamic environment.

Index selection problems have been a hotspot problem in the database field, and many schemes have been used to deal with this problem in recent decades, most of which use a database optimizer to generate an execution plan for a query, and use the estimated execution cost of the query in the execution plan to calculate the return of the index. The database optimizer is not able to accurately estimate the execution cost of the query, which results in inaccurate estimates of index returns. Moreover, in the existing index selection methods, most of the methods are used for solving the problem of offline index selection. In the off-line index selection problem, the environment in which the method faces is static, i.e., the method needs to optimize a fixed set of queries for which to select the appropriate index configuration. However, as mentioned above, in practice, the offline index selection method cannot be used in a real-world environment because the queries are constantly coming and there is no fixed set of queries. Therefore, how to design an online index selection method capable of accurately calculating index benefits and continuously optimizing new queries is a current urgent problem to be solved.

Disclosure of Invention

Aiming at the defects in the prior art, the invention aims to provide an online index selection method and device based on a neural network and an attenuation strategy, so that higher accuracy can be realized when the execution cost of the query is calculated.

In order to achieve the above purpose, the invention provides an online index selection method based on a neural network and an attenuation strategy, which specifically comprises the following steps:

training the tree convolution neural network to enable the tree convolution neural network to output corresponding execution cost based on the query execution plan;

constructing a hash table, wherein keys of the constructed hash table are columns capable of establishing indexes in a database, and values are arrays, and the arrays are used for storing benefits of indexes on corresponding columns to past queries;

adopting an attenuation strategy, and obtaining a new value of the value at each position of each array in the hash table based on the attenuation parameters;

Disabling the existing indexes in the database, and obtaining default execution cost of the new query under default index configuration based on the tree convolutional neural network;

Calculating the income of the index pair of each column in the database based on the default execution cost of the newly-arrived query, and adding the income into the tail part of the corresponding array in the hash table;

Enabling existing indexes based on index instructions in a database, sequencing the indexes according to total benefits, and selecting a preset number of indexes from high to low as new index configuration.

Based on the technical scheme, the training tree convolutional neural network is used for enabling the tree convolutional neural network to output corresponding execution cost based on the query execution plan, and the specific steps comprise:

Acquiring an executed query execution plan and corresponding execution cost in a database;

the method comprises the steps of forming a data set by an acquired query execution plan and execution cost, and dividing data in the data set into a training set and a testing set according to a preset proportion;

Training the tree convolution neural network by using the training set, judging the training result of the tree convolution neural network by using the testing set, and obtaining the trained tree convolution neural network after the tree convolution neural network converges.

On the basis of the technical scheme, the hash table is constructed, and the specific construction process is as follows:

Traversing all columns and indexes of each table in the database, and taking the column without indexes as a column capable of establishing indexes;

and assigning the index-built columns and a null array to the hash table as key value pairs one by one to realize the construction of the hash table.

Based on the technical scheme, the method adopts an attenuation strategy, obtains a new value of the value at each position of each array in the hash table based on the attenuation parameter, and specifically comprises the following steps:

Establishing a two-layer for loop, wherein the first layer for loop is used for circularly traversing each array in the hash table, and the second layer for loop is used for circularly traversing each position in the array;

multiplying the value of each position of each array in the hash table by the attenuation parameter to obtain a new value as the value of the current position.

Based on the technical scheme, the tree convolutional neural network obtains the default execution cost of the new query under the default index configuration, and the specific steps include:

Invoking a database optimizer to obtain a query execution plan for the new query;

the new query execution plan is input into the tree convolutional neural network, and the output is the execution cost of the new query under the default index configuration, namely the default execution cost.

Based on the technical scheme, the obtaining of the income of the index pair on each column in the database based on the default execution cost of the income inquiry comprises the following specific steps:

for each column appearing in the new query, establishing a virtual index on the current column, and calculating to obtain the query execution cost of the current column at the moment based on a database optimizer and a tree convolutional neural network;

Subtracting the query execution cost of the current column from the default execution cost of the new query to obtain the benefit of the index on the current column on the new query.

Based on the technical scheme, the adding of the benefit to the tail of the corresponding array in the hash table is specifically as follows:

The benefit of the new-to-query is added to the tail of the corresponding array in the hash table and the virtual index is deleted, and for columns that do not appear in the new-to-query, 0 is added to the tail of the corresponding array in the hash table.

Based on the technical scheme, after adding the income from the new query to the tail of the corresponding array in the hash table, the method further comprises the following steps:

judging whether the length of the current array is larger than the preset length, if so, deleting the head element of the current array, and if not, not processing.

Based on the technical scheme, the indexes are ordered according to total benefits, and a preset number of indexes are selected from high to low to serve as new index configuration, and the specific steps include:

calculating the sum of each array in the hash table, and taking the sum as the total income of the index on the corresponding column;

And sorting the indexes according to the total income, selecting a preset number of indexes from high to low as new index configuration, and replacing the existing index configuration in the database with the new index configuration.

The invention provides an online index selection device based on a neural network and an attenuation strategy, which comprises the following components:

the training module is used for training the tree convolution neural network so that the tree convolution neural network outputs corresponding execution cost based on the query execution plan;

the construction module is used for constructing a hash table, keys of the constructed hash table are columns which can be used for establishing indexes in the database, the values are arrays, and the arrays are used for saving benefits of the indexes on the corresponding columns to the past query;

the first calculation module is used for obtaining a new value of the value at each position of each array in the hash table based on the attenuation parameters by adopting an attenuation strategy;

the disabling module is used for disabling the existing indexes in the database and obtaining default execution cost of the new query under default index configuration based on the tree convolution neural network;

the second calculation module is used for calculating the income of the index pair of each column in the database to the new query based on the default execution cost of the new query, and adding the income to the tail of the corresponding array in the hash table;

and the starting module is used for starting the existing indexes based on the index instructions in the database, sequencing the indexes according to the total income, and selecting a preset number of indexes from high to low as new index configuration.

Compared with the prior art, the invention has the advantages that:

(1) The invention calculates the execution cost of the query by using the tree convolution neural network, and the network uses the detector capable of identifying the specific tree structure and can capture the specific tree structure existing in the query plan, thereby having higher accuracy when calculating the execution cost of the query;

(2) The method uses the attenuation strategy to calculate the total gain of the index, and the strategy ensures that the more recent queries occupy larger weight when calculating the total gain of the index, and can finally obtain more reasonable total gain of the index;

(3) Compared with the existing index selection method, the method provided by the invention can be well performed in a dynamic environment with continuous query arrival.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart of an online index selection method based on a neural network and an attenuation strategy in an embodiment of the present invention;

fig. 2 is a schematic structural diagram of an online index selection device based on a neural network and an attenuation strategy in an embodiment of the invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application.

The present invention uses a tree convolutional neural network to estimate the execution cost of a query. Tree convolutional neural networks were developed from conventional convolutional neural networks that use probes that identify a particular tree structure. Because the execution plan generated by the database optimizer has a tree-like structure, the tree convolutional neural network can capture some specific structures in the tree-like structure, and obtain more accurate numerical values than the execution cost contained in the execution plan. In a dynamic environment where queries arrive continuously, the longer the queries are, the less relevant the queries are to the latest queries, and the weight of old queries in calculating index returns needs to be reduced. Therefore, the invention designs an attenuation strategy for calculating the total income of the index in the dynamic environment, which ensures that the more recent inquiry has larger influence on the index income and can more effectively adjust the existing index configuration.

Referring to fig. 1, the method for selecting an online index based on a neural network and an attenuation strategy according to the embodiment of the present invention specifically includes the following steps:

s1, training a tree convolutional neural network to enable the tree convolutional neural network to output corresponding execution cost, namely execution time, based on a query execution plan;

That is, the execution time of the queries executed by the database is collected, the queries are input into the database optimizer, the execution plans of the queries can be obtained, and the tree convolutional neural network is trained by the execution plan and the actual execution time of the queries. After being trained, the tree convolution neural network can output an accurate estimated execution cost by taking the execution plan of the query as input. In the following steps, the tree convolutional neural network will replace the database optimizer to calculate the execution cost of the query.

In the invention, training the tree convolution neural network to make the tree convolution neural network output corresponding execution cost based on the query execution plan, specifically comprising the following steps:

S101, acquiring an executed query execution plan and corresponding execution cost in a database;

S102, forming a data set by the acquired query execution plan and execution cost, and dividing the data in the data set into a training set and a testing set according to a preset proportion;

And S103, training the tree convolutional neural network by using a training set, judging the training result of the tree convolutional neural network by using a testing set, and obtaining the trained tree convolutional neural network after the tree convolutional neural network converges.

In the practical application process, the programming language is Python, and the database is PostgreSQL. By searching the log file of the database, the executed queries and the corresponding execution costs in the database can be obtained, the execution plans of the queries can be obtained through EXPLAIN commands, the query execution plans and the corresponding execution costs form a data set, the data in the data set are divided into a training set and a testing set according to the ratio of 9:1, the training set is used for training the tree convolutional neural network, the training result of the tree convolutional neural network is judged by adopting the testing set, and the training can be stopped after the tree convolutional neural network is observed to be converged.

S2, constructing a hash table, wherein keys of the constructed hash table are columns which can be used for establishing indexes in a database, and the values are arrays which are used for storing benefits of indexes on the corresponding columns to the past query;

that is, a preparation work before query processing is performed, an empty hash table is created, the keys of the hash table are each index-built column in the database, and the value is an array. For example, the maximum value of each array length is w, i.e., the index on the corresponding column is reserved in the array at most for the benefit of the past w queries.

In the invention, a hash table is constructed, and the specific construction process is as follows:

s201, traversing all columns and indexes of each table in a database, and taking the columns without indexes as columns capable of establishing indexes;

s202, assigning the index-built columns and a null array to the hash table one by one as key value pairs to realize the construction of the hash table.

In the actual application process, using an \d+ command to check all columns and index conditions of each table in the database one by one, wherein the column without the index is used as an index-establishing column, in Python, a hash table is realized by using a data structure of a direct, and the index-establishing column and a null array are used as key value pairs one by one to be assigned to the hash table, so that the hash table is constructed.

S3, adopting an attenuation strategy, and obtaining a new value of the value at each position of each array in the hash table based on attenuation parameters;

In the invention, a new value of the value at each position of each array in the hash table is obtained based on attenuation parameters by adopting an attenuation strategy, and the specific steps comprise:

s301, establishing a two-layer for loop, wherein the first layer for loop is used for traversing each array in the hash table in a loop mode, and the second layer for loop is used for traversing each position in the array in a loop mode;

S302, multiplying the value of each position of each array in the hash table by the attenuation parameter to obtain a new value as the value of the current position.

The value of each position of each array in the hash table is multiplied by the attenuation parameter as a new value, the attenuation parameter is between 0 and 1, and the value of the attenuation parameter can be continuously adjusted according to the actual environment and the effect of the method in actual application.

S4, disabling the existing indexes in the database, and obtaining default execution cost of the new query under default index configuration based on the tree convolutional neural network;

Since there are previously recommended index configurations in the database that have an impact on evaluating the execution cost of the query, the instructions in the database need to be invoked first to disable the existing index. Specifically, the index in the database may be disabled using the update pg_index SET INDISVALID =false instruction.

In the invention, the default execution cost of the new query under the default index configuration is obtained based on the tree convolution neural network, and the specific steps comprise:

S401, calling a database optimizer to obtain a query execution plan of the new query;

S402, inputting a query execution plan of the new query into the tree convolutional neural network, and outputting the new query to be the execution cost of the new query under the default index configuration, namely the default execution cost.

And calling the database optimizer by using EXPLAIN commands to obtain an execution plan of the new-to-query, taking the execution plan as the input of the tree convolutional neural network, and finally obtaining the execution cost of the new-to-query under the default index configuration.

S5, calculating the income of the index pair of each column in the database to the new query based on the default execution cost of the new query, and adding the income to the tail of the corresponding array in the hash table;

In the invention, based on default execution cost of the new query, the income of indexes on each column in the database to the new query is calculated, and the specific steps include:

s501, for each column appearing in the new query, establishing a virtual index on the current column, and calculating to obtain the query execution cost of the current column at the moment based on a database optimizer and a tree convolution neural network;

s502, subtracting the query execution cost of the current column from the default execution cost of the new query to obtain the benefit of the index on the current column to the new query.

In the actual application process, for each column appearing in the new query, a HypoPG (https:// gitsub. Com/HypoPG/hypopg) plug-in is used to build a virtual index on the column, then according to step S4, the execution cost of the latest query in the presence of the index is obtained, and then the value obtained in step S4 is subtracted by the value obtained in the step, so that the benefit of the index to the new query is obtained.

In the invention, the benefit is added to the tail part of the corresponding array in the hash table, and the method specifically comprises the following steps:

The benefits to the query may be added to the tail of the corresponding array in the hash table using the application method of the list in Python.

In one possible implementation, after adding the benefit of the new query to the tail of the corresponding array in the hash table, the method further includes:

Since a new element is added to each array of the hash table, the length of the current array may be greater than w, and thus, if the length of the current array is greater than w, the head element of the array is deleted. The head element is deleted and the other elements are not deleted because the head element is the benefit of the index to the oldest query.

The length of the array can be obtained using the len method, and if the length is greater than w, the head element of the array is deleted using the del statement.

And S6, starting the existing indexes based on the index instruction in the database, sequencing the indexes according to the total income, and selecting the indexes with preset numbers from high to low as new index configuration.

Since the benefit of the index on each column to the new query has now been calculated, the instructions in the database can be invoked to enable the existing index. Specifically, the database existing index may be enabled using the update pg_index SET INDISVALID =true instruction.

In the invention, indexes are ordered according to total income, and a preset number of indexes are selected from high to low to be used as new index configuration, and the specific steps include:

S601, calculating the sum of each array in the hash table, and taking the sum as the total income of the index on the corresponding column;

S602, sorting indexes according to total benefits, selecting a preset number of indexes from high to low as new index configuration, and replacing the existing index configuration in the database with the new index configuration.

In the actual application process, sum statement is used for calculating the sum of each array in the hash table, then the sort statement is used for sorting the arrays, and then indexes with preset numbers are selected from high to low to be used as new index configuration. The new index configuration is compared to the old index configuration, and the existing index configuration is changed using the CREATE statement and DELETE statement of the database, thereby transferring the index configuration in the database to this new index configuration.

When a new query needs to be processed, the steps S3 to S6 are repeated.

The method for selecting the index on line based on the neural network and the attenuation strategy comprehensively considers the collected data, the construction of the training tree convolutional neural network and the calculation of the index gain by using the attenuation strategy, the recommendation of new index configuration and the modification of the index configuration in the whole process.

In the invention, the tree convolution neural network is provided with the detector capable of identifying the structural mode, and can observe the special structure in the query execution plan, so that the method has higher accuracy compared with a method for calculating the query cost by using a database optimizer.

The method comprises the steps of setting a manually-set attenuation parameter, wherein the value of the attenuation parameter is between 0 and 1, multiplying the gain of an index obtained from the past query by the attenuation parameter when calculating the total gain of the index, and multiplying the attenuation parameter by the older query more times, so that the influence of the newer query on the total gain of the index is ensured to be larger, and the change trend of the future query can be captured.

The method comprises the steps of collecting data from log files of a database to train a tree convolutional neural network, using a hash table to save index benefits of past queries and provide a data structure required by an attenuation strategy, disabling existing index configuration in the database before calculating the index benefits so as to avoid influence on calculating the index benefits, and enabling the index configuration after calculating.

For the online index selection method based on the neural network and the attenuation strategy, firstly, the invention provides a method for calculating the query execution cost by using a tree convolution neural network. Therefore, the method can capture some specific tree structures in the query plan, and finally has higher accuracy when calculating the execution cost of the query; the invention designs a method for calculating the total gain of indexes under dynamic environment, namely the dynamic environment is continuously provided with an environment where queries arrive, and in the environment, the queries arriving close in time are similar, so that the weight of old queries in calculating index gain needs to be reduced, and the requirement can be met by the attenuation strategy.

In a possible implementation manner, the embodiment of the present invention further provides a readable storage medium, where the readable storage medium is located in a PLC (Programmable Logic Controller ) controller, and a computer program is stored on the readable storage medium, where the program is executed by a processor to implement the following steps of the online index selection method based on a neural network and an attenuation policy:

The storage media may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations of the present invention may be written in one or more programming languages, including an object oriented programming language such as Java, smalltalk, C ++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

Referring to fig. 2, an online index selection device based on a neural network and an attenuation policy provided by an embodiment of the present invention includes a training module, a building module, a first computing module, a disabling module, a second computing module, and an enabling module.

The system comprises a training module, a constructing module, a first calculating module, a disabling module, a second calculating module and an enabling module, wherein the training module is used for training the tree convolutional neural network to enable the tree convolutional neural network to output corresponding execution cost based on a query execution plan, the constructing module is used for constructing a hash table, keys of the constructed hash table are used for each column capable of establishing indexes in a database, the values are arrays used for saving benefits of indexes on the corresponding columns to past queries, the first calculating module is used for obtaining new values of the values on each position of each array in the hash table based on attenuation parameters by adopting an attenuation strategy, the disabling module is used for disabling existing indexes in the database and obtaining default execution cost of the new query under default index configuration based on the tree convolutional neural network, the second calculating module is used for calculating and obtaining benefits of the new query for each column in the database based on default execution cost of the new query, the benefits are added to the tail of the corresponding arrays in the hash table, and the enabling module is used for enabling existing indexes based on index instructions in the database, ordering the indexes according to the total benefits, and selecting the preset number of indexes from high to low as the new index configuration.

The foregoing is only a specific embodiment of the application to enable those skilled in the art to understand or practice the application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Claims

1. An online index selection method based on neural networks and decay strategies, characterized by comprising the following steps:

Train a tree convolutional neural network so that it outputs the corresponding execution cost based on the query execution plan;

Construct a hash table where the key is each indexable column in the database and the value is an array used to store the benefits of indexes on the corresponding columns for past queries.

A decay strategy is employed to obtain the new value for each position in each array of the hash table based on the decay parameter;

Disable existing indexes in the database and obtain the default execution cost of new queries under the default index configuration based on a tree convolutional neural network;

Based on the default execution cost of new queries, the benefit of indexes on each column in the database for new queries is calculated, and this benefit is added to the tail of the corresponding array in the hash table;

The existing indexes are enabled based on the index instructions in the database, and the indexes are sorted according to the total benefit, and a preset number of indexes are selected as new indexes from high to low.

The step of employing a decay strategy to obtain the new value of each position in each array of the hash table based on the decay parameter includes the following specific steps:

Create two nested for loops. The first for loop iterates through each array in the hash table, and the second for loop iterates through each position in the array.

Multiply the value at each position of each array in the hash table by the decay parameter, and use the new value as the value at the current position.

2. The online index selection method based on neural networks and decay strategies as described in claim 1, characterized in that the training of the tree convolutional neural network, so that the tree convolutional neural network outputs the corresponding execution cost based on the query execution plan, specifically includes the following steps:

Retrieve the execution plans and corresponding execution costs of queries that have already been executed in the database;

The obtained query execution plans and execution costs are combined into a dataset, and the data in the dataset is divided into training set and test set according to a preset ratio;

The tree convolutional neural network is trained using a training set and the training result is judged using a test set. Once the tree convolutional neural network converges, the trained tree convolutional neural network can be obtained.

3. The online index selection method based on neural networks and decay strategies as described in claim 1, characterized in that the specific construction process of the hash table is as follows:

Iterate through all columns and indexes of each table in the database, and identify columns without existing indexes as columns that can be indexed.

The hash table is constructed by assigning each indexable column and an empty array as key-value pairs to the hash table.

4. The online index selection method based on neural networks and decay strategies as described in claim 1, characterized in that the step of obtaining the default execution cost of a new query under the default index configuration based on a tree convolutional neural network specifically includes:

The database optimizer is invoked to obtain the query execution plan for the new query;

The execution plan of the newly arrived query is input into the tree convolutional neural network, and the output is the execution cost of the newly arrived query under the default index configuration, i.e., the default execution cost.

5. The online index selection method based on neural networks and decay strategies as described in claim 1, characterized in that the step of calculating the benefit of the index on each column of the database for the new query based on the default execution cost of the new query specifically includes:

For each column that appears in the new query, a virtual index is created on the current column, and the query execution cost of the current column is calculated based on the database optimizer and tree convolutional neural network.

Subtracting the current column's query execution cost from the default execution cost of the new query gives the benefit of the index on the current column for the new query.

6. The online index selection method based on neural networks and decay strategies as described in claim 5, characterized in that, adding the gain to the tail of the corresponding array in the hash table specifically comprises:

Add the earnings from new queries to the end of the corresponding array in the hash table, and delete the virtual index. For columns that do not appear in new queries, add 0 to the end of the corresponding array in the hash table.

7. The online index selection method based on neural networks and decay strategies as described in claim 6, characterized in that, after adding the revenue of a newly arrived query to the tail of the corresponding array in the hash table, it further includes:

Determine if the length of the current array is greater than the preset length. If it is, delete the first element of the current array; otherwise, do nothing.

8. The online index selection method based on neural networks and decay strategies as described in claim 1, characterized in that the step of sorting the indexes according to total revenue and selecting a preset number of indexes from high to low as new index configuration specifically includes:

Calculate the sum of each array in the hash table and use it as the total gain of the index on the corresponding column;

Sort the indexes according to total revenue, select a preset number of indexes from high to low as new index configurations, and replace the existing index configurations in the database with the new index configurations.

9. An online index selection device based on neural networks and a decay strategy, characterized in that it comprises:

The training module is used to train the tree convolutional neural network so that the tree convolutional neural network outputs the corresponding execution cost based on the query execution plan;

The building module is used to build a hash table, where the key of the hash table is each indexable column in the database and the value is an array used to store the benefits of the index on the corresponding column for past queries.

The first calculation module is used to obtain the new value of the value at each position of each array in the hash table based on the decay parameter using a decay strategy.

The disable module is used to disable existing indexes in the database and obtain the default execution cost of new queries under the default index configuration based on a tree convolutional neural network;

The second calculation module is used to calculate the benefit of the index on each column of the database for the new query based on the default execution cost of the new query, and add the benefit to the tail of the corresponding array in the hash table;

The enable module is used to enable existing indexes based on index instructions in the database, and sort the indexes according to total revenue, selecting a preset number of indexes from high to low as new index configurations;