Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. This description may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, as those skilled in the art will be able to make and use the present disclosure without departing from the spirit and scope of the present disclosure.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, etc. may be used in one or more embodiments herein to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first can be termed a second and, similarly, a second can be termed a first without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "at the time of ...", "when ...", or "in response to a determination", depending on the context.
In the present specification, two execution plan updating methods are provided, and the present specification relates to two execution plan updating apparatuses, a database system, a computing device, a computer-readable storage medium, and a computer program, which are described in detail one by one in the following embodiments.
Referring to the schematic diagram shown in Fig. 1, the execution plan updating method provided in this specification makes a distributed query execution plan reusable, which saves resources, while still ensuring that the plan is completely pushed down. After acquiring the distribution information and the parameter information of the sub-plans in the original execution plan, the method traverses the parameter information and, when the traversal result shows that the original execution plan meets the plan processing condition, constructs a target query statement corresponding to the original execution plan. A target execution plan corresponding to the target query statement is then created according to the distribution information and the parameter information, with the execution node of the target execution plan left in a to-be-updated state. A parameterized execution plan is thereby constructed on the basis of the original execution plan; in practical application the original execution plan can be replaced at the parameter binding stage to realize dynamic push-down, and the original execution plan can be stored at a specified location. This removes the need to regenerate the execution plan for every set of parameter values: a query can be answered quickly by direct reuse, with data read at the target data node, which saves resources and improves query efficiency.
Fig. 2 is a flowchart illustrating an execution plan updating method according to an embodiment of the present specification, which specifically includes the following steps.
Step S202, the distribution information and the parameter information of the sub-plan in the original execution plan are obtained, and the parameter information is traversed.
The execution plan updating method provided by this embodiment is applied to a distributed database scenario. To make a distributed query execution plan reusable in such a scenario, thereby saving resources, while ensuring complete push-down of the plan, a parameterized distributed execution plan that is pushed down dynamically according to its parameters is adopted. After the original distributed execution plan is generated, all pushed-down sub-plans in it can be computed, and whether the push-down of each sub-plan depends on the input parameters can be judged from the push-down conditions. A target distributed execution plan that can be pushed down is then created on the basis of these sub-plans. When parameters are subsequently bound, whether all sub-plans can be pushed down to the same data node is judged according to the parameter values; if so, the completely pushed-down distributed execution plan is executed. Plan caching is thus supported even when the execution plan is completely pushed down, which effectively improves the cache hit rate and enables reuse.
On this basis, the original execution plan specifically refers to a distributed query execution plan obtained by converting a query tree. It serves query optimization: the query path with the minimum cost can be selected according to the query tree for the data query operation. The original execution plan is usually generated from an SQL statement; if the SQL statement is executed repeatedly, the original execution plan is rebuilt each time, and each build consumes considerable computing resources. Correspondingly, a sub-plan specifically refers to one of the sub-plans that make up the original execution plan; in the query phase it pulls data from the data nodes according to the distribution information, and the pulled data is then joined with the data pulled by other sub-plans. The distribution information specifically refers to the description information of the distribution column associated with the sub-plan. It describes how the data related to the distribution column is distributed across the data nodes and includes information such as the table to which the data belongs, the ordinal position of the field, and the field type. The parameter information specifically refers to the information corresponding to the parameter values recorded in the distribution determination condition of the sub-plan.
The execution plan updating method provided by the embodiment is applied to a distributed database cluster, and a cluster architecture comprises a coordination node, a data node and a central time service node, wherein the coordination node is used for managing a distributed execution plan, the data node is used for storing and inquiring data, and the central time service node is used for ensuring clock synchronization between nodes.
On this basis, to create a reusable parameterized distributed execution plan, the distribution information and the parameter information of the sub-plans in the original execution plan may be obtained first, so as to identify the distribution description information and the related parameter values of each sub-plan in the original, not yet reusable execution plan. The parameter information of all sub-plans is then traversed to collect the distribution statistics of the sub-plans in the original execution plan, so that it can be judged whether the plan can be parameterized and cached.
Further, when acquiring the distribution information and the parameter information of the sub-plans in the original execution plan, considering that the original execution plan may include a plurality of sub-plans, and the correlation degree between the sub-plans is not high, it is necessary to determine the distribution mode and the distribution determination condition of each sub-plan, and in this embodiment, the specific implementation manner is as in step S2022 to step S2024.
Step S2022, determining a distribution mode corresponding to the child plan in the original execution plan, and determining the distribution information according to the distribution mode.
In a distributed database cluster, when the coordination node generates the original execution plan, consider a distributed execution plan that cannot be pushed down completely: each sub-plan pulls data from the data nodes according to the distribution information of the data, the data is then joined with the data pulled by other sub-plans, and the result serves as the input of other join operators or as the result data. Such a plan cannot be pushed down completely because the sub-plans are distributed differently, so data must be pulled from different nodes and the plan cannot be pushed down to a single node for execution. In particular, when the input for the distribution determination of a sub-plan is not available, that is, when parameterization leaves the parameter value unknown, the distribution of the sub-plan cannot be computed and it cannot be determined at which data node the sub-plan acquires its data; and if the sub-plans end up with different distributions, the plan as a whole cannot be pushed down completely.
Therefore, to construct a target execution plan that is pushed down completely, each sub-plan needs to be analyzed; that is, the distribution mode corresponding to each sub-plan in the original execution plan is determined, and the distribution information of the sub-plan is then determined under that distribution mode. The distribution mode specifically refers to the mode used to compute the distribution column, and includes, but is not limited to, a hash distribution mode, a modulo distribution mode, a replication distribution mode, or a roundrobin distribution mode.
Determining the distribution mode of a sub-plan determines its processing type, and the distribution column information corresponding to that processing type can then be identified for use when the parameterized distributed execution plan is subsequently constructed. In this embodiment, the specific implementation is as follows:
determining sub-plan information by reading a data structure of a sub-plan in the original execution plan; under the condition that the sub plan is determined to be the single table scanning plan according to the sub plan information, determining that the distribution mode corresponding to the sub plan is a first distribution mode, and reading the single table distribution information corresponding to the single table scanning plan based on the first distribution mode to serve as the distribution information; or, when the sub-plan is determined to be the connection plan according to the sub-plan information, determining that the distribution mode corresponding to the sub-plan is the second distribution mode, and reading the connection distribution information corresponding to the connection plan based on the second distribution mode as the distribution information.
Specifically, the data structure of a sub-plan refers to the structure in which the sub-plan is recorded; correspondingly, the sub-plan information refers to the description information determined from the structure information recorded in that data structure, and represents the plan type of the sub-plan. The single-table scanning plan refers to a sub-plan that scans a single table; the first distribution mode is the hash distribution mode or the modulo distribution mode under the single-table scanning plan; and the single-table distribution information is the distribution column information of the single table. Likewise, the connection plan refers to a sub-plan that performs a join; the second distribution mode is the hash distribution mode or the modulo distribution mode under the connection plan; and the connection distribution information is the distribution column information of the join.
Based on this, by reading the data structure of the sub-plans in the original execution plan, the sub-plan information of each sub-plan can be determined. If the sub plan is determined to be the single table scanning plan according to the sub plan information, the distribution mode corresponding to the sub plan is a first distribution mode, and at this time, the single table distribution information corresponding to the single table scanning plan, namely the distribution column information of the single table, can be read as the distribution information of the sub plan based on the first distribution mode; if the sub-plan is determined to be the connection plan according to the sub-plan information, the distribution mode corresponding to the sub-plan is described as the second distribution mode, and at this time, the connection distribution information corresponding to the connection plan, that is, the distribution column information of the join may be read as the distribution information of the sub-plan based on the second distribution mode.
For example, the distributionType in the data structure of the original execution plan is a distribution mode, the sub-plan information of the sub-plan can be determined by reading the data structure, and if the sub-plan is a single table scan and the distribution mode is determined to be a hash or modulo distribution, the distribution column information of the single table can be recorded as the distribution information of the sub-plan. If the sub-plan is a join and the join is pushable, and the distribution mode is determined to be hash or modulo distribution, the distribution column information of the join may be recorded as the distribution information of the sub-plan. The distribution column information of the join depends on both ends of the join and the mode of the join, and if the join can be pushed down, the distribution information can be generated according to the both end information of the join.
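The selection of distribution information described above can be sketched in Python. This is a minimal illustration, not the implementation of the embodiment: the names SubPlan, DistributionType, and distribution_info are hypothetical, and the rule that a pushable join takes the distribution column of its left input is a simplifying assumption (the description only states that the join's distribution column depends on both ends of the join and on the join mode).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class DistributionType(Enum):
    HASH = "hash"
    MODULO = "modulo"
    REPLICATION = "replication"
    ROUNDROBIN = "roundrobin"


@dataclass
class SubPlan:
    kind: str                                # "scan" or "join" (hypothetical labels)
    distribution_type: DistributionType
    table: str = ""
    dist_column: str = ""
    left: Optional["SubPlan"] = None
    right: Optional["SubPlan"] = None
    pushable: bool = False                   # only meaningful for joins


def distribution_info(plan: SubPlan) -> Optional[Tuple[str, str]]:
    """Return (table, distribution column) for hash/modulo sub-plans, else None."""
    if plan.distribution_type not in (DistributionType.HASH, DistributionType.MODULO):
        return None
    if plan.kind == "scan":
        # Single-table scan: record the distribution column of that table.
        return (plan.table, plan.dist_column)
    if plan.kind == "join" and plan.pushable and plan.left is not None:
        # Assumption: a pushable join reuses the distribution column of its left input.
        return distribution_info(plan.left)
    return None
```

A sub-plan for which the function returns None simply contributes no hash or modulo distribution column in this sketch.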
In summary, by combining the sub-plan information and determining the distribution information in different manners for the sub-plans of different distribution manners, it can be ensured that the distribution information of each sub-plan in the original execution plan is accurately counted, so as to facilitate the subsequent construction of the parameterized distributed execution plan based on the distribution information, and ensure the reusability of the parameterized distributed execution plan.
Step S2024, determining a distribution judgment condition corresponding to the sub-plan in the original execution plan, and determining the parameter information according to the parameter value recorded in the distribution judgment condition.
Specifically, the distribution determination condition is computed from the selection conditions on the queried data in the SQL statement, that is, it is derived during query optimization. For example, given a = 1 and a = c, if c is the distribution column, then c = 1 is the determination condition, and if a is the distribution column, then a = 1 is the determination condition. Correspondingly, the parameter value specifically refers to the value of the parameter recorded in the distribution determination condition.
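The derivation in the example (a = 1 and a = c implying c = 1) can be illustrated with a small equivalence-class computation. The sketch below assumes that the selection conditions are plain equalities between column names (strings) and integer constants; the function name derive_condition is hypothetical, and a real optimizer would handle far more expression forms.

```python
from typing import Dict, List, Optional, Tuple, Union

Operand = Union[str, int]  # column name (str) or constant (int)


def derive_condition(
    equalities: List[Tuple[Operand, Operand]], dist_column: str
) -> Optional[Tuple[str, int]]:
    """Return (dist_column, constant) if the equalities pin the column to a constant."""
    parent: Dict[Operand, Operand] = {}

    def find(x: Operand) -> Operand:
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for lhs, rhs in equalities:
        parent[find(lhs)] = find(rhs)        # merge the two equivalence classes

    if dist_column not in parent:
        return None
    root = find(dist_column)
    for item in list(parent):
        if isinstance(item, int) and find(item) == root:
            return (dist_column, item)
    return None
```

With the conditions from the example, derive_condition([("a", 1), ("a", "c")], "c") yields ("c", 1), and passing "a" as the distribution column yields ("a", 1).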
Based on this, in order to be able to determine the parameter information of the sub-plan, and then create the parameterized distributed execution plan, the distribution determination condition corresponding to the sub-plan in the original execution plan may be determined, so as to determine the parameter information according to the parameter value recorded in the distribution determination condition, for subsequent use.
In practical application, in consideration of different parameters corresponding to different distribution determination conditions, different sub-plans may correspond to different parameter information, and therefore, it is necessary to determine the parameter information separately for each type of sub-plan, and in this embodiment, a specific implementation manner is as follows:
under the condition that the sub plan in the original execution plan is a copy plan, reading a copy distribution judgment condition corresponding to the copy plan, and determining the parameter information according to a reference parameter value recorded in the copy distribution judgment condition; or, under the condition that the child plan in the original execution plan is a non-copy plan, reading a non-copy distribution judgment condition corresponding to the non-copy plan, traversing statement expressions in the non-copy distribution judgment condition, and determining the parameter information according to expression parameters in the statement expressions.
Specifically, the copy plan refers to the plan type when the distribution mode of the sub-plan is the replication distribution mode; correspondingly, the copy distribution determination condition refers to the distribution determination condition that applies when the sub-plan processes a replicated table, and the reference parameter value specifically refers to a default value set for the replicated table. The non-copy plan specifically refers to the plan type when the distribution mode of the sub-plan is a non-replication distribution mode.
On this basis, when a sub-plan in the original execution plan is a copy plan, its distribution mode is the replication distribution mode; the copy distribution determination condition corresponding to the copy plan can then be read, and the parameter information is determined according to the reference parameter value recorded in that condition. When a sub-plan in the original execution plan is a non-copy plan, its distribution mode is a non-replication distribution mode; the non-copy distribution determination condition corresponding to the non-copy plan can then be read, the statement expressions in that condition are traversed, and the parameter information is determined according to the expression parameters in the statement expressions.
Continuing the above example, if the distribution mode corresponding to the sub-plan is the replication distribution mode, the sub-plan processes a replicated table whose data can be acquired at any node, so the parameter number paramId = 0 may be set to indicate that data can be acquired at any node. Otherwise, the expressions in the WHERE clauses of all pushed-down statements can be traversed: if an expression has the form a = b, with a distribution column on one side and a parameterized parameter on the other (for example, a is the distribution column and b is the parameter), paramId = b->paramId may be recorded; if not, the next pushed-down expression is examined, until all expressions have been processed and the parameter number corresponding to the sub-plan has been obtained.
If no paramId has been obtained, it is judged whether the current sub-plan is a join. If so, all expressions in the join ON clause are traversed: if an expression has the form a = b, with a distribution column on one side and a parameterized parameter on the other (for example, a is the distribution column and b is the parameter), paramId = b->paramId is recorded; if not, the next expression is examined, until all expressions have been processed and the parameter number corresponding to the sub-plan has been obtained. In the query optimization stage, as each sub-plan of the distributed execution plan is generated, the distribution information and the parameter information of all sub-plans can be collected according to this process for subsequent use.
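The paramId lookup described in this example can be sketched as follows. The types Column, Param, and PushedSubPlan are hypothetical stand-ins for the planner's structures, and two points are assumptions made for illustration: the sentinel NO_PARAM = -1 for "no parameter found", and real parameters being numbered from 1 so that 0 stays reserved for replicated tables.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

NO_PARAM = -1  # assumption: sentinel meaning "no distribution parameter found"


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Param:
    param_id: int  # assumption: numbered from 1, like $1, $2, ...


Operand = Union[Column, Param]
Equality = Tuple[Operand, Operand]  # represents an expression of the form a = b


@dataclass
class PushedSubPlan:
    distribution: str                     # "replication", "hash", "modulo", ...
    dist_column: str = ""
    where: List[Equality] = field(default_factory=list)
    is_join: bool = False
    join_on: List[Equality] = field(default_factory=list)


def _match(exprs: List[Equality], dist_column: str) -> int:
    """Find an equality with the distribution column on one side and a parameter on the other."""
    for lhs, rhs in exprs:
        for col, other in ((lhs, rhs), (rhs, lhs)):
            if isinstance(col, Column) and col.name == dist_column and isinstance(other, Param):
                return other.param_id
    return NO_PARAM


def param_id_for(plan: PushedSubPlan) -> int:
    if plan.distribution == "replication":
        return 0                          # replicated table: any node can serve the data
    pid = _match(plan.where, plan.dist_column)
    if pid == NO_PARAM and plan.is_join:
        pid = _match(plan.join_on, plan.dist_column)  # fall back to the join ON expressions
    return pid
```

Checking both operand orders covers the case where the parameter appears on the left of the equality rather than the right.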
In summary, the accuracy of parameter information determination can be ensured by determining the parameter information in different ways for different sub-plans, so that the push-down judgment of the push-down distributed execution plan can be accurately completed in the subsequent process, and the reusable parameterized distributed execution plan can be generated.
And step S204, under the condition that the original execution plan meets the plan processing conditions according to the traversal result, constructing a target query statement corresponding to the original execution plan.
Specifically, after the distribution information and the parameter information corresponding to the sub-plans are determined, it is considered that different original execution plans may support different push-down strategies, and that a parameterized execution plan created for a plan that cannot be pushed down completely could not be reused. The push-down result is therefore predicted by checking a plan processing condition, so that only original execution plans that can be pushed down completely are processed and their parameterized execution plans cached. That is, after the parameter information is traversed, if the traversal result shows that the original execution plan meets the plan processing condition, the execution plan can be pushed down completely, and the target query statement corresponding to the original execution plan can be constructed so that the parameterized execution plan can subsequently be built from it and cached. The plan processing condition specifically refers to the condition for detecting whether the original execution plan can be pushed down completely; correspondingly, the target query statement specifically refers to a constructed SQL statement that can be pushed down completely and is used to construct a reusable parameterized execution plan.
Further, when determining whether the original execution plan can be pushed down, it is actually to determine the parameter value, and in this embodiment, the specific implementation manner is as follows:
determining sub-plan parameter values of the sub-plan in the original execution plan according to the traversal result; and under the condition that the parameter value of the sub-plan is larger than or equal to the parameter value threshold value, executing the step of constructing the target query statement corresponding to the original execution plan.
Specifically, the sub-plan parameter value refers to the parameter serial number. On this basis, after the parameter information of each sub-plan has been traversed, the sub-plan parameter values of the sub-plans in the original execution plan can be determined according to the traversal result and compared with the parameter value threshold. If every sub-plan parameter value is greater than or equal to the parameter value threshold, every sub-plan has a corresponding parameter serial number and the plan can be pushed down completely, so the step of constructing the target query statement corresponding to the original execution plan is executed. If any sub-plan parameter value is less than the parameter value threshold, the sub-plans do not satisfy the push-down condition, the plan is considered not completely pushable, and the query is executed according to the original execution plan.
Continuing the above example, in the query optimization stage the distributed execution plan is traversed. If the traversal result shows that the paramId recorded for every pushed-down sub-plan is greater than or equal to 0, it is determined that the whole distributed execution plan may be pushed down completely, and the subsequent creation of the distributed execution plan is performed. Otherwise, if any paramId is less than 0, the plan is considered not completely pushable; the query is executed according to the original execution plan, and no completely pushed-down execution plan is generated for it.
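The plan processing condition in this example reduces to a numerical comparison over the collected parameter numbers. A minimal sketch, in which the function name and the treatment of an empty list (no pushed-down sub-plans means nothing to push down) are assumptions:

```python
from typing import Sequence


def can_fully_push_down(param_ids: Sequence[int], threshold: int = 0) -> bool:
    """True when every pushed-down sub-plan has a paramId at or above the threshold."""
    # Assumption: a plan without any pushed-down sub-plan is not a push-down candidate.
    return len(param_ids) > 0 and all(pid >= threshold for pid in param_ids)
```

For instance, the list [0, 1, 1] passes the check, while [1, -1] fails because one sub-plan has no parameter bound to its distribution column.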
In summary, by determining all sub-plans by means of numerical comparison, smooth creation and execution of a complete push-down plan can be achieved, so as to avoid performance loss caused by interrupt generation.
Furthermore, when it is determined that the original execution plan can be pushed down completely, a target distributed execution plan that is pushed down completely needs to be generated, and before it is generated, an SQL statement that can be pushed down completely needs to be constructed. In this embodiment, the specific implementation is as follows:
acquiring a target query tree corresponding to the original execution plan, wherein the target query tree is obtained by structuralizing the initial query tree corresponding to the original execution plan; and performing inverse transformation processing on the target query tree to obtain a target query statement corresponding to the original execution plan, wherein the target query statement is associated with global query information of the target query tree.
Specifically, the target query tree is a query tree obtained by performing structuring processing on an initial query tree corresponding to the original distributed execution plan. Accordingly, the structuring specifically refers to processing of parse, analyze, and rewrite on the initial query tree.
Therefore, the target query tree corresponding to the original execution plan is obtained firstly, and the target query tree is a result obtained through the structuralization processing, so that the target query tree can be subjected to inverse transformation processing to be used for reversely deducing a query statement which can be completely pushed down, namely a target query statement, so that the target execution plan can be conveniently constructed on the basis of the target query tree.
In practical applications, once the query optimization stage has determined from the complete push-down judgment that complete push-down is possible, a target execution plan capable of complete push-down can be created on that basis. Complete push-down means that the entire statement can be pushed down to a data node for execution, so a completely pushable SQL statement needs to be generated for the plan that is executed at the data node.
Continuing the above example, the query tree produced by the parse, analyze, and rewrite processing is restored to an executable SQL statement through the deparse process. Because the query tree already contains all of the current query information, the deparsed SQL statement fully meets the push-down requirement.
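To make the deparse step concrete, the following toy sketch turns a heavily simplified query tree back into a parameterized SQL string. The QueryTree structure and its fields are hypothetical and cover only a single-relation SELECT with equality predicates; a real deparser walks the full tree produced by parse, analyze, and rewrite.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class QueryTree:
    targets: List[str]                    # output columns
    relation: str                         # single source relation (simplification)
    quals: List[Tuple[str, str]] = field(default_factory=list)  # (column, right-hand side)


def deparse(tree: QueryTree) -> str:
    """Rebuild an SQL string from the tree, keeping parameter placeholders such as $1."""
    sql = f"SELECT {', '.join(tree.targets)} FROM {tree.relation}"
    if tree.quals:
        sql += " WHERE " + " AND ".join(f"{col} = {rhs}" for col, rhs in tree.quals)
    return sql
```

Because placeholders are emitted unchanged, the resulting statement stays parameterized and can be reused for different parameter values.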
In summary, by creating the target query statement, it is possible to subsequently create a target execution plan that can be executed at the data node, so that data query can be quickly multiplexed without consuming more computing resources.
Step S206, creating a target execution plan corresponding to the target query statement according to the distribution information and the parameter information, where an execution node of the target execution plan is in a state to be updated.
Specifically, after the completely pushable target query statement is created, the creation of the target execution plan can be further realized, at this time, the target execution plan associated with the target query statement can be created in combination with the distribution information and the parameter information, and the execution node of the created target execution plan is in a state to be updated, so that in the actual application stage, the state of the execution node can be updated according to the query requirement of the user, and the parameter is substituted to complete the data query processing operation. The target execution plan is a parameterized execution plan which can be completely pushed down and reused. Correspondingly, the executing node is the data node.
Further, when creating the target execution plan, it is implemented by sequentially creating sub-plans, and in this embodiment, the specific implementation manner is as follows:
according to the distribution information and the parameter information, creating a target sub-plan corresponding to the sub-plan in the original execution plan; and composing the target execution plan based on the target sub-plan, wherein the target execution plan comprises a parameterized target query statement.
Specifically, a target sub-plan is a newly created sub-plan corresponding to a sub-plan in the original execution plan. On this basis, after the distribution information and the parameter information are obtained, a target sub-plan corresponding to each sub-plan of the original execution plan can be created. During creation, the target sub-plans incorporate the parameterized target query statement, and the target execution plan is then composed from all the generated target sub-plans.
Continuing the above example, after the completely pushable SQL statement is obtained, it can be encapsulated into a custom plan whose execution node is marked as empty. The generated custom plan is then stored in the original parameterized distributed execution plan, to be used in place of that plan when the complete push-down condition is met, thereby realizing complete push-down. That is, the custom plan is a remote-query execution plan generated for the SQL statement; it encapsulates a parameterized SQL statement that can be pushed down completely, and the executor of the coordination node obtains the result by issuing the parameterized SQL statement to the data node.
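The packaging step in this example can be sketched with a few data classes. All names here (SubPlanInfo, CustomPlan, OriginalPlan, attach_custom_plan) are hypothetical; the sketch only shows the shape of the idea, namely a parameterized SQL statement wrapped in a plan whose execution node stays empty until parameters are bound, stored alongside the original plan.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SubPlanInfo:
    dist_column: str
    param_id: int                        # 0 means a replicated table (any node)


@dataclass
class CustomPlan:
    sql: str                             # parameterized, completely pushable statement
    sub_plans: List[SubPlanInfo]
    exec_node: Optional[int] = None      # empty: to be updated when parameters are bound


@dataclass
class OriginalPlan:
    sql: str
    pushdown_plan: Optional[CustomPlan] = None   # filled in only if complete push-down is possible


def attach_custom_plan(original: OriginalPlan, pushdown_sql: str,
                       infos: List[SubPlanInfo]) -> OriginalPlan:
    """Store the custom plan inside the original plan so both are cached together."""
    original.pushdown_plan = CustomPlan(sql=pushdown_sql, sub_plans=list(infos))
    return original
```

Keeping the original plan next to the custom plan leaves a fallback available for bindings under which the sub-plans do not resolve to a single data node.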
In summary, by creating the target execution plan from the target query statement, the parameterized SQL statement is sent to the data node to acquire data in the complete push-down scenario, and operations such as pulling data from the data nodes and then computing at the execution node are no longer needed, which effectively improves the efficiency of data operations.
And step S208, replacing the original execution plan with the target execution plan and storing the original execution plan.
Specifically, after the target execution plan is obtained, in order to ensure its reusability, the target execution plan may replace the original execution plan and the original execution plan may be stored in the cache, so that the original execution plan can be replaced whenever the complete push-down condition is subsequently satisfied, which saves resource consumption.
Further, in the practical application stage, in order to ensure the availability, the following steps S2082 to S2086 may be performed.
Step S2082, receiving a user query statement, and extracting a user query parameter from the user query statement;
step S2084, updating the target execution plan based on the user query parameters;
step S2086, determining a target data node by executing the updated target execution plan, and querying data in the target data node.
Specifically, the user query statement refers to a query statement submitted by a user in an actual application stage; correspondingly, the user query parameter specifically refers to a parameter related to the current query requirement, and correspondingly, the target data node specifically refers to a data node meeting the query requirement of the user query statement, that is, corresponding data can be queried at the data node.
Based on this, receiving a user query statement indicates that the user has a query requirement. To optimize the query and return the result at minimum cost, the user query parameter can be extracted from the user query statement; the target execution plan is then updated based on the user query parameter, that is, the parameter values in the target execution plan are updated, so that the target data node is determined by executing the updated target execution plan and the data is queried at the target data node.
In summary, by processing data queries in a manner that reuses the target execution plan, resource consumption can be effectively reduced, and the query can be served without constructing a separate plan for each query during query optimization.
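Steps S2084 and S2086 can be sketched as follows, assuming the user query parameters have already been extracted (step S2082). The names `TargetPlan`, `PLAN_CACHE`, `execute_query`, the `route` callback, and the `send` callback are hypothetical and stand in for whatever the database system actually provides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class TargetPlan:
    sql: str
    # Hypothetical routing rule: maps bound parameters to a data node id.
    route: Callable[[Tuple], int]
    exec_node: Optional[int] = None   # "to be updated" until parameters arrive
    params: Tuple = ()


# Hypothetical cache of stored target execution plans, keyed by statement.
PLAN_CACHE: Dict[str, TargetPlan] = {}


def execute_query(statement_key: str, user_params: Tuple,
                  send: Callable[[int, str, Tuple], List]) -> List:
    plan = PLAN_CACHE[statement_key]           # reuse the stored plan, no re-planning
    plan.params = user_params                  # S2084: update parameter values
    plan.exec_node = plan.route(user_params)   # determine the target data node
    return send(plan.exec_node, plan.sql, plan.params)   # S2086: query that node


# Usage with a stand-in transport that records what would be sent.
calls = []

def fake_send(node: int, sql: str, params: Tuple) -> List:
    calls.append((node, sql, params))
    return [("row", node)]

PLAN_CACHE["q1"] = TargetPlan(sql="SELECT * FROM t WHERE id = $1",
                              route=lambda p: p[0] % 3)
first = execute_query("q1", (7,), fake_send)
second = execute_query("q1", (9,), fake_send)
```

The same cached plan serves both calls; only the bound parameters and the resolved node change.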
Furthermore, when the target execution plan is updated, the state of the sub-plan in the target execution plan is updated based on the user query parameter, and in this embodiment, the specific implementation manner is as follows:
updating the state of the execution node of the target sub-plan in the target execution plan according to the user query parameter; and determining the target data node according to the state updating result, and inquiring data at the target data node through executing the updated target execution plan.
That is, when the status is updated, the execution node of each target sub-plan is updated, so as to determine the target data node according to the update result, and realize data query at the target data node.
In a specific implementation, if the execution nodes of all sub-plans are unique and identical, the entire target execution plan can be pushed down completely. In this case, if the target sub-plan uses replication distribution, one node is selected at random as the target data node; if the target sub-plan uses hash or modulo distribution, the target data node is calculated from the distribution column and the parameter values.
In a first aspect, when the target sub-plan in the target execution plan corresponds to the third distribution mode, the data node corresponding to the target sub-plan with the highest execution priority in the target execution plan is selected as the target data node according to the state update result.
Specifically, the third distribution mode refers to the replication distribution mode. Based on this, when the target sub-plans in the target execution plan correspond to the third distribution mode, all target sub-plans are replication-distributed, and the data node corresponding to the target sub-plan with the highest execution priority in the target execution plan may be selected as the target data node according to the state update result. That is, if all sub-plans are replication-distributed, the data node of the first sub-plan can be used as the final target data node for the data query processing operation.
Referring to fig. 3, sub-plan a, sub-plan b, and sub-plan c are all in the replication distribution mode, and an execution node is determined for each of them by calculation. Because sub-plan a is the first sub-plan encountered during the traversal, its execution node, node 1, is determined to be the execution node.
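The all-replication case can be illustrated with a short sketch. The `SubPlan` tuple, the function name, and the node numbers 1, 2, and 3 are illustrative assumptions, not values taken from fig. 3.

```python
from typing import List, NamedTuple


class SubPlan(NamedTuple):
    name: str
    distribution: str   # "replication", "hash", or "modulo"
    exec_node: int      # node computed during the state update


def pick_node_all_replication(sub_plans: List[SubPlan]) -> int:
    # Every node holds a full copy of a replicated table, so any node would do;
    # the node of the first traversed sub-plan is taken by convention.
    if not sub_plans or any(sp.distribution != "replication" for sp in sub_plans):
        raise ValueError("expects a non-empty list of replication-distributed sub-plans")
    return sub_plans[0].exec_node


replicated = [
    SubPlan("a", "replication", 1),
    SubPlan("b", "replication", 2),
    SubPlan("c", "replication", 3),
]
chosen = pick_node_all_replication(replicated)
```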
In a second aspect, when a target sub-plan in the target execution plan corresponds to a fourth distribution mode, the same data node corresponding to the target sub-plan in the target execution plan is selected as the target data node according to the state update result.
Specifically, the fourth distribution mode refers to the hash or modulo distribution mode. Based on this, when the target sub-plans in the target execution plan correspond to the fourth distribution mode, the target sub-plans are hash- or modulo-distributed, and the data node shared by the target sub-plans in the target execution plan can be selected as the target data node according to the state update result. That is, if all sub-plans are hash- or modulo-distributed, the execution nodes of all sub-plans must be unique and identical, and any one of them may be selected as the target data node.
Referring to fig. 4, sub-plan a, sub-plan b, and sub-plan c are all in the hash distribution mode, and their corresponding parameters are the first parameter, the second parameter, and the third parameter, respectively. The execution node of each sub-plan is determined by calculation to be 1, so the execution node is directly determined to be 1.
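A minimal sketch of the hash or modulo case follows. The function names are hypothetical, and `zlib.crc32` is only a stand-in for the database's real hash function, which the specification does not identify; integer distribution-column values are also an assumption.

```python
import zlib
from typing import List, Optional, Tuple


def node_for(distribution: str, value: int, node_count: int) -> int:
    """Map a distribution-column value to a data node index."""
    if distribution == "modulo":
        return value % node_count
    if distribution == "hash":
        # crc32 is a placeholder hash; a real system uses its own function.
        return zlib.crc32(str(value).encode()) % node_count
    raise ValueError(f"unsupported distribution: {distribution}")


def pick_node_all_sharded(sub_plans: List[Tuple[str, int]],
                          node_count: int) -> Optional[int]:
    # Each entry is (distribution mode, bound parameter value for the
    # distribution column). Full pushdown needs one identical node for all.
    nodes = {node_for(dist, value, node_count) for dist, value in sub_plans}
    return nodes.pop() if len(nodes) == 1 else None


same_node = pick_node_all_sharded([("modulo", 7), ("modulo", 10)], node_count=3)
different = pick_node_all_sharded([("modulo", 7), ("modulo", 8)], node_count=3)
```

Returning `None` signals that the sub-plans land on different nodes, so the plan cannot be pushed down completely for these parameter values.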
In a third aspect, when the target sub-plan in the target execution plan corresponds to a mixed distribution mode, the data node corresponding to each target sub-plan in the target execution plan is determined according to the state update result, and the data node with the highest hit rate is selected as the target data node.
Specifically, the hybrid distribution mode specifically refers to a distribution mode including hash, modulo, or replication. Based on this, under the condition that the target sub-plan in the target execution plan corresponds to the mixed distribution mode, it is indicated that the distribution modes corresponding to the target sub-plan may be different, at this time, the data node corresponding to the target sub-plan in the target execution plan may be determined according to the state update result, and the data node with the highest hit rate may be selected as the target data node.
That is, if the sub-plans include both replication distribution and hash or modulo distribution, it is determined whether the execution nodes of all hash- or modulo-distributed sub-plans are unique and identical; the execution nodes of the replication-distributed sub-plans are ignored, and the execution node of the hash- or modulo-distributed sub-plans is taken as the finally determined target data node.
Referring to fig. 5, sub-plan a is in the hash distribution mode with the first parameter as its corresponding parameter, sub-plan b is in the modulo distribution mode with the second parameter as its corresponding parameter, and sub-plan c is in the replication distribution mode. An execution node is determined for each sub-plan by calculation, and the final execution node is determined to be 1. The processing has two steps. First, if an execution node can be determined in this way, the distributed execution plan is considered to be fully pushable. Second, the fully pushed-down custom plan is obtained from the distributed execution plan, the final execution node information is written into the custom plan, the original distributed execution plan is replaced with the custom plan, and the custom plan is executed to carry out the data query processing operation.
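The two steps for the mixed case can be sketched as below. The sketch follows the reading that replication-distributed nodes are ignored and the hash or modulo nodes must agree; the names `CustomPlan`, `resolve_mixed`, and `choose_plan`, and the fallback to the original plan when no single node is found, are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CustomPlan:
    """Hypothetical fully pushed-down plan with an empty execution node."""
    sql: str
    exec_node: Optional[int] = None


def resolve_mixed(sub_plans: List[Tuple[str, int]]) -> Optional[int]:
    # Each entry is (distribution mode, computed execution node).
    # Replicated tables exist on every node, so their nodes are ignored.
    sharded = {node for dist, node in sub_plans if dist in ("hash", "modulo")}
    return next(iter(sharded)) if len(sharded) == 1 else None


def choose_plan(original_plan: object, custom: CustomPlan,
                sub_plans: List[Tuple[str, int]]) -> object:
    node = resolve_mixed(sub_plans)      # step 1: can one node serve the query?
    if node is None:
        return original_plan             # keep the distributed plan
    custom.exec_node = node              # step 2: fill in the node and substitute
    return custom


custom = CustomPlan("SELECT * FROM a JOIN b ON a.id = b.id JOIN c ON b.k = c.k WHERE a.id = $1")
picked = choose_plan("original", custom,
                     [("hash", 1), ("modulo", 1), ("replication", 2)])
```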
In order to make the distributed query execution plan reusable, and thereby save resources, while still ensuring that the plan is pushed down completely, the execution plan updating method provided in this specification obtains the distribution information and the parameter information of the sub-plans in the original execution plan and then traverses the parameter information. When the traversal result shows that the original execution plan meets the plan processing condition, a target query statement corresponding to the original execution plan is constructed. A target execution plan corresponding to the target query statement is then created according to the distribution information and the parameter information, with its execution node in a to-be-updated state. A parameterized execution plan is thus built from the original execution plan; it can replace the original execution plan at the parameter binding stage in practical applications, achieving dynamic pushdown, and can be stored at a specified location. The execution plan therefore no longer has to be regenerated for each set of parameter values: query requests can be answered quickly by reusing the stored plan directly, which saves resources and improves query efficiency.
The following describes the execution plan updating method further by taking an application of the execution plan updating method provided in this specification in a data query scenario as an example, with reference to fig. 6. Fig. 6 is a flowchart illustrating a processing procedure of an execution plan updating method according to an embodiment of the present specification, and specifically includes the following steps.
Step S602, determining a distribution mode corresponding to the sub-plan in the original execution plan, and determining distribution information according to the distribution mode.
Specifically, the sub-plan information is determined by reading the data structure of the sub-plan in the original execution plan; under the condition that the sub plan is determined to be the single table scanning plan according to the sub plan information, determining that the distribution mode corresponding to the sub plan is a first distribution mode, and reading the single table distribution information corresponding to the single table scanning plan based on the first distribution mode to serve as the distribution information; alternatively, when the sub-plan is determined to be the connection plan based on the sub-plan information, the distribution method corresponding to the sub-plan is determined to be the second distribution method, and the connection distribution information corresponding to the connection plan is read as the distribution information based on the second distribution method.
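The branch in step S602 can be pictured as a type check on the sub-plan's data structure. The classes `ScanPlan` and `JoinPlan` and their fields are hypothetical stand-ins for the single table scanning plan and the connection plan; the specification does not define their layout.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ScanPlan:
    """Hypothetical single table scanning plan."""
    table: str
    table_distribution: dict   # e.g. {"mode": "hash", "column": "id"}


@dataclass
class JoinPlan:
    """Hypothetical connection (join) plan over two child plans."""
    left: object
    right: object
    join_distribution: dict


def distribution_info(sub_plan: object) -> Tuple[str, dict]:
    # The distribution mode follows from the kind of sub-plan, and the
    # distribution information is read from the matching field.
    if isinstance(sub_plan, ScanPlan):
        return "first", sub_plan.table_distribution
    if isinstance(sub_plan, JoinPlan):
        return "second", sub_plan.join_distribution
    raise TypeError(f"unsupported sub-plan type: {type(sub_plan).__name__}")


scan = ScanPlan("t1", {"mode": "hash", "column": "id"})
join = JoinPlan(scan, ScanPlan("t2", {"mode": "replication"}),
                {"mode": "hash", "column": "id"})
scan_info = distribution_info(scan)
join_info = distribution_info(join)
```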
Step S604, determining a distribution determination condition corresponding to the sub-plan in the original execution plan, and determining parameter information according to the parameter values recorded in the distribution determination condition.
Specifically, under the condition that the sub-plan in the original execution plan is the copy plan, reading a copy distribution judgment condition corresponding to the copy plan, and determining parameter information according to a reference parameter value recorded in the copy distribution judgment condition; or, under the condition that the sub-plan in the original execution plan is a non-copy plan, reading a non-copy distribution judgment condition corresponding to the non-copy plan, traversing statement expressions in the non-copy distribution judgment condition, and determining parameter information according to expression parameters in the statement expressions.
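For the non-copy branch, traversing the statement expressions can be sketched as a recursive walk that collects parameter references. The tuple-based expression encoding and the name `collect_params` are assumptions for illustration only.

```python
from typing import List


def collect_params(expr: object) -> List[int]:
    """Return the parameter numbers referenced in an expression tree.

    Assumed encoding: ("param", n) is a reference to parameter n,
    ("col", name) is a column, and (op, child, ...) is an operator node.
    """
    if not isinstance(expr, tuple):
        return []                      # plain literal or column name
    if expr[0] == "param":
        return [expr[1]]
    found: List[int] = []
    for child in expr[1:]:             # visit operands left to right
        found.extend(collect_params(child))
    return found


# Condition equivalent to: t1.id = $1 AND t2.id = $2
condition = ("and",
             ("=", ("col", "t1.id"), ("param", 1)),
             ("=", ("col", "t2.id"), ("param", 2)))
params = collect_params(condition)
```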
Step S606, determining sub-plan parameter values of the sub-plans in the original execution plan according to the traversal result.
Step S608, when the sub-plan parameter value is greater than or equal to the parameter value threshold, obtaining a target query tree corresponding to the original execution plan, where the target query tree is obtained by performing a structuring process on the initial query tree corresponding to the original execution plan.
Step S610, inverse transformation processing is performed on the target query tree to obtain a target query statement corresponding to the original execution plan, where the target query statement is associated with global query information of the target query tree.
Step S612, creating a target sub-plan corresponding to the sub-plan in the original execution plan according to the distribution information and the parameter information.
Step S614, forming a target execution plan based on the target sub-plan, where the target execution plan includes a parameterized target query statement, and the execution node of the target execution plan is in a state to be updated.
Step S616, replacing the original execution plan with the target execution plan and storing the target execution plan.
Step S618, receiving the user query statement, and extracting the user query parameter from the user query statement.
Step S620, updating the target execution plan based on the user query parameter.
Step S622, determining a target data node by executing the updated target execution plan, and querying data at the target data node.
Specifically, updating the state of an execution node of a target sub-plan in the target execution plan according to the user query parameter; and determining a target data node according to the state updating result, and inquiring data at the target data node through executing the updated target execution plan.
Further, under the condition that the target sub-plan in the target execution plan corresponds to the third distribution mode, selecting a data node corresponding to the target sub-plan with the highest execution priority in the target execution plan as a target data node according to the state updating result; or under the condition that the target sub-plan in the target execution plan corresponds to the fourth distribution mode, selecting the same data node corresponding to the target sub-plan in the target execution plan as a target data node according to the state updating result; or under the condition that the target sub-plan in the target execution plan corresponds to the mixed distribution mode, determining the data node corresponding to the target sub-plan in the target execution plan according to the state updating result, and selecting the data node with the highest hit rate as the target data node.
In order to make the distributed query execution plan reusable, and thereby save resources, while still ensuring that the plan is pushed down completely, the execution plan updating method provided in this specification obtains the distribution information and the parameter information of the sub-plans in the original execution plan and then traverses the parameter information. When the traversal result shows that the original execution plan meets the plan processing condition, a target query statement corresponding to the original execution plan is constructed. A target execution plan corresponding to the target query statement is then created according to the distribution information and the parameter information, with its execution node in a to-be-updated state. A parameterized execution plan is thus built from the original execution plan; it can replace the original execution plan at the parameter binding stage in practical applications, achieving dynamic pushdown, and can be stored at a specified location. The execution plan therefore no longer has to be regenerated for each set of parameter values: query requests can be answered quickly by reusing the stored plan directly, which saves resources and improves query efficiency.
Corresponding to the above method embodiment, the present specification further provides an execution plan updating apparatus embodiment, and fig. 7 illustrates a schematic structural diagram of an execution plan updating apparatus provided in an embodiment of the present specification. As shown in fig. 7, the apparatus includes:
an obtaining module 702 configured to obtain distribution information and parameter information of a sub-plan in an original execution plan, and traverse the parameter information;
a building module 704 configured to build a target query statement corresponding to the original execution plan when it is determined that the original execution plan satisfies a plan processing condition according to a traversal result;
a creating module 706 configured to create a target execution plan corresponding to the target query statement according to the distribution information and the parameter information, where an execution node of the target execution plan is in a state to be updated;
a storage module 708 configured to replace the original execution plan with the target execution plan and store the target execution plan.
In an optional embodiment, the obtaining distribution information and parameter information of the sub-plan in the original execution plan includes:
determining a distribution mode corresponding to a sub-plan in the original execution plan, and determining the distribution information according to the distribution mode; and determining a distribution judgment condition corresponding to the sub-plan in the original execution plan, and determining the parameter information according to the parameter values recorded in the distribution judgment condition.
In an optional embodiment, the determining a distribution manner corresponding to a sub-plan in the original execution plan and determining the distribution information according to the distribution manner includes:
determining sub-plan information by reading a data structure of a sub-plan in the original execution plan; under the condition that the sub plan is determined to be the single table scanning plan according to the sub plan information, determining that the distribution mode corresponding to the sub plan is a first distribution mode, and reading the single table distribution information corresponding to the single table scanning plan based on the first distribution mode to serve as the distribution information; or, when the sub-plan is determined to be the connection plan according to the sub-plan information, determining that the distribution mode corresponding to the sub-plan is the second distribution mode, and reading the connection distribution information corresponding to the connection plan based on the second distribution mode as the distribution information.
In an optional embodiment, the determining a distribution judgment condition corresponding to a sub-plan in the original execution plan, and determining the parameter information according to a parameter value recorded in the distribution judgment condition includes:
under the condition that the sub plan in the original execution plan is a copy plan, reading a copy distribution judgment condition corresponding to the copy plan, and determining the parameter information according to a reference parameter value recorded in the copy distribution judgment condition; or, under the condition that the child plan in the original execution plan is a non-copy plan, reading a non-copy distribution judgment condition corresponding to the non-copy plan, traversing statement expressions in the non-copy distribution judgment condition, and determining the parameter information according to expression parameters in the statement expressions.
In an optional embodiment, when it is determined that the original execution plan meets the plan processing condition according to the traversal result, constructing a target query statement corresponding to the original execution plan includes:
determining sub-plan parameter values of the sub-plan in the original execution plan according to the traversal result; and under the condition that the parameter value of the sub-plan is larger than or equal to the parameter value threshold value, executing the step of constructing the target query statement corresponding to the original execution plan.
In an optional embodiment, the constructing a target query statement corresponding to the original execution plan includes:
acquiring a target query tree corresponding to the original execution plan, wherein the target query tree is obtained by structuralizing the initial query tree corresponding to the original execution plan; and performing inverse transformation processing on the target query tree to obtain a target query statement corresponding to the original execution plan, wherein the target query statement is associated with the global query information of the target query tree.
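The inverse transformation from a query tree back to a parameterized statement can be illustrated with a toy deparser. The dictionary layout of the tree, the tuple encoding of expressions, and the `$n` placeholder syntax are assumptions; a real query tree carries far more information.

```python
def deparse_expr(expr: tuple) -> str:
    """Turn a small expression tree back into SQL text."""
    kind = expr[0]
    if kind == "col":
        return expr[1]
    if kind == "param":
        return f"${expr[1]}"           # keep the parameter as a placeholder
    op, left, right = expr             # anything else is a binary operator
    return f"({deparse_expr(left)} {op.upper()} {deparse_expr(right)})"


def deparse_query(tree: dict) -> str:
    """Rebuild a SELECT statement from a hypothetical structured query tree."""
    sql = f"SELECT {', '.join(tree['targets'])} FROM {', '.join(tree['tables'])}"
    if tree.get("where") is not None:
        sql += f" WHERE {deparse_expr(tree['where'])}"
    return sql


query_tree = {
    "targets": ["t1.name"],
    "tables": ["t1", "t2"],
    "where": ("and",
              ("=", ("col", "t1.id"), ("col", "t2.id")),
              ("=", ("col", "t1.id"), ("param", 1))),
}
statement = deparse_query(query_tree)
```

Because the parameter survives as a placeholder, the resulting statement can be sent to a data node with different bound values each time.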
In an optional embodiment, the creating a target execution plan corresponding to the target query statement according to the distribution information and the parameter information includes:
according to the distribution information and the parameter information, creating a target sub-plan corresponding to the sub-plan in the original execution plan; and composing the target execution plan based on the target sub-plan, wherein the target execution plan comprises a parameterized target query statement.
In an optional embodiment, the apparatus further comprises:
the updating module is configured to receive a user query statement and extract a user query parameter from the user query statement; updating the target execution plan based on the user query parameters; and determining a target data node by executing the updated target execution plan, and inquiring data at the target data node.
In an optional embodiment, the updating the target execution plan based on the user query parameter includes:
updating the state of the execution node of the target sub-plan in the target execution plan according to the user query parameter;
correspondingly, the determining a target data node by executing the updated target execution plan and querying data at the target data node includes:
and determining the target data node according to the state updating result, and inquiring data at the target data node by executing the updated target execution plan.
In an optional embodiment, the determining the target data node according to the status update result includes:
under the condition that a target sub-plan in the target execution plan corresponds to a third distribution mode, selecting a data node corresponding to the target sub-plan with the highest execution priority in the target execution plan as the target data node according to the state updating result; or under the condition that a target sub-plan in the target execution plan corresponds to a fourth distribution mode, selecting the same data node corresponding to the target sub-plan in the target execution plan as the target data node according to the state updating result; or under the condition that a target sub-plan in the target execution plan corresponds to a mixed distribution mode, determining a data node corresponding to the target sub-plan in the target execution plan according to the state updating result, and selecting the data node with the highest hit rate as the target data node.
In summary, in order to make the distributed query execution plan reusable, and thereby save resources, while still ensuring that the plan is pushed down completely, the parameter information may be traversed after the distribution information and the parameter information of the sub-plans in the original execution plan are obtained. When the traversal result shows that the original execution plan meets the plan processing condition, a target query statement corresponding to the original execution plan is constructed. A target execution plan corresponding to the target query statement is then created according to the distribution information and the parameter information, with its execution node in a to-be-updated state. A parameterized execution plan is thus built from the original execution plan; it can replace the original execution plan at the parameter binding stage in practical applications, achieving dynamic pushdown, and can be stored at a specified location. The execution plan therefore no longer has to be regenerated for each set of parameter values: query requests can be answered quickly by reusing the stored plan directly, which saves resources and improves query efficiency.
The above is an exemplary scheme of an execution plan updating apparatus according to the present embodiment. It should be noted that the technical solution of the execution plan updating apparatus and the technical solution of the execution plan updating method belong to the same concept, and for details that are not described in detail in the technical solution of the execution plan updating apparatus, reference may be made to the description of the technical solution of the execution plan updating method.
Fig. 8 is a flowchart illustrating another execution plan updating method according to an embodiment of the present disclosure, which specifically includes the following steps.
Step S802, acquiring distribution information and parameter information of a sub-plan in an original execution plan, and traversing the parameter information;
step S804, under the condition that the original execution plan meets the plan processing conditions according to the traversal result, constructing a target query statement corresponding to the original execution plan;
step S806, creating a target execution plan corresponding to the target query statement according to the distribution information and the parameter information, wherein an execution node of the target execution plan is in a state to be updated;
step S808, updating the target execution plan in response to the user query statement, and querying target data by executing the updated target execution plan.
Optionally, the acquiring distribution information and parameter information of the sub-plan in the original execution plan includes:
determining a distribution mode corresponding to a sub-plan in the original execution plan, and determining the distribution information according to the distribution mode;
and determining a distribution judgment condition corresponding to the sub-plan in the original execution plan, and determining the parameter information according to the parameter values recorded in the distribution judgment condition.
Optionally, the determining a distribution manner corresponding to the sub-plan in the original execution plan, and determining the distribution information according to the distribution manner includes:
determining sub-plan information by reading a data structure of a sub-plan in the original execution plan;
under the condition that the sub plan is determined to be the single table scanning plan according to the sub plan information, determining that the distribution mode corresponding to the sub plan is a first distribution mode, and reading the single table distribution information corresponding to the single table scanning plan based on the first distribution mode to serve as the distribution information; or,
and under the condition that the sub plan is determined to be the connection plan according to the sub plan information, determining that the distribution mode corresponding to the sub plan is a second distribution mode, and reading the connection distribution information corresponding to the connection plan based on the second distribution mode to be used as the distribution information.
Optionally, the determining a distribution judgment condition corresponding to a sub-plan in the original execution plan, and determining the parameter information according to a parameter value recorded in the distribution judgment condition includes:
under the condition that the sub plan in the original execution plan is a copy plan, reading a copy distribution judgment condition corresponding to the copy plan, and determining the parameter information according to a reference parameter value recorded in the copy distribution judgment condition; or,
under the condition that the sub-plan in the original execution plan is a non-copy plan, reading a non-copy distribution judgment condition corresponding to the non-copy plan, traversing statement expressions in the non-copy distribution judgment condition, and determining the parameter information according to expression parameters in the statement expressions.
Optionally, the constructing a target query statement corresponding to the original execution plan under the condition that it is determined according to the traversal result that the original execution plan meets the plan processing condition includes:
determining sub-plan parameter values of the sub-plan in the original execution plan according to the traversal result;
and under the condition that the parameter value of the sub-plan is larger than or equal to the parameter value threshold value, executing the step of constructing the target query statement corresponding to the original execution plan.
Optionally, the constructing a target query statement corresponding to the original execution plan includes:
acquiring a target query tree corresponding to the original execution plan, wherein the target query tree is obtained by structuralizing the initial query tree corresponding to the original execution plan;
and performing inverse transformation processing on the target query tree to obtain a target query statement corresponding to the original execution plan, wherein the target query statement is associated with global query information of the target query tree.
Optionally, the creating a target execution plan corresponding to the target query statement according to the distribution information and the parameter information includes:
according to the distribution information and the parameter information, creating a target sub-plan corresponding to the sub-plan in the original execution plan;
and composing the target execution plan based on the target sub-plan, wherein the target execution plan comprises a parameterized target query statement.
Optionally, after the step of creating the target execution plan corresponding to the target query statement according to the distribution information and the parameter information is executed, the method further includes:
receiving a user query statement, and extracting a user query parameter from the user query statement;
updating the target execution plan based on the user query parameters;
and determining a target data node by executing the updated target execution plan, and inquiring data at the target data node.
Optionally, the updating the target execution plan based on the user query parameter includes:
updating the state of the execution node of the target sub-plan in the target execution plan according to the user query parameter;
correspondingly, the determining a target data node by executing the updated target execution plan and querying data at the target data node includes:
and determining the target data node according to the state updating result, and inquiring data at the target data node by executing the updated target execution plan.
Optionally, the determining the target data node according to the status update result includes:
under the condition that a target sub-plan in the target execution plan corresponds to a third distribution mode, selecting a data node corresponding to the target sub-plan with the highest execution priority in the target execution plan as the target data node according to the state updating result; or,
under the condition that a target sub-plan in the target execution plan corresponds to a fourth distribution mode, selecting the same data node corresponding to the target sub-plan in the target execution plan as the target data node according to the state updating result; or,
under the condition that a target sub-plan in the target execution plan corresponds to a mixed distribution mode, determining the data nodes corresponding to the target sub-plans in the target execution plan according to the state updating result, and selecting the data node with the highest hit rate as the target data node.
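The three selection branches above can be illustrated with a minimal sketch. The names `TargetSubPlan` and `select_target_node`, the string tags for the distribution modes, the convention that a larger integer means a higher execution priority, and the reading of "hit rate" as the number of target sub-plans that resolve to a node are all assumptions made for illustration; the specification does not prescribe them.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class TargetSubPlan:
    name: str
    priority: int   # assumption: larger value means higher execution priority
    data_node: str  # node resolved for this sub-plan after the state update


def select_target_node(mode: str, sub_plans: List[TargetSubPlan]) -> str:
    """Pick the target data node for one of the three distribution modes."""
    if not sub_plans:
        raise ValueError("at least one target sub-plan is required")
    if mode == "third":
        # Node of the sub-plan with the highest execution priority.
        return max(sub_plans, key=lambda p: p.priority).data_node
    if mode == "fourth":
        # All sub-plans are expected to resolve to the same node.
        nodes = {p.data_node for p in sub_plans}
        if len(nodes) != 1:
            raise ValueError("sub-plans do not share a single data node")
        return nodes.pop()
    if mode == "mixed":
        # Node hit by the largest number of sub-plans.
        return Counter(p.data_node for p in sub_plans).most_common(1)[0][0]
    raise ValueError(f"unknown distribution mode: {mode}")
```

In this sketch the fourth mode raises an error when the sub-plans disagree; a real system might instead fall back to ordinary planning.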
It should be noted that the other execution plan updating method provided in this embodiment corresponds to the execution plan updating method described above; for the same or corresponding content, reference may be made to the above embodiments, which is not repeated here.
In summary, to make a distributed query execution plan reusable, thereby reducing resource consumption, while ensuring that the plan can be pushed down completely, the distribution information and the parameter information of the sub-plan in the original execution plan are obtained and the parameter information is traversed. When it is determined according to the traversal result that the original execution plan satisfies the plan processing condition, the target query statement corresponding to the original execution plan is constructed. A target execution plan corresponding to the target query statement is then created according to the distribution information and the parameter information, with the execution nodes of the constructed target execution plan in a to-be-updated state. A parameterized execution plan is thus built from the original execution plan; in practical application it can replace the original execution plan at the parameter binding stage, which achieves dynamic push-down. After a user query statement is received, the plan is updated with the parameters in that statement, so that the data query can be processed directly. The execution plan no longer needs to be regenerated for each set of parameter values; query requests can be answered quickly by reusing the plan directly, which saves resources and improves query efficiency.
Corresponding to the above method embodiment, the present specification further provides another execution plan updating apparatus embodiment, and FIG. 9 shows a schematic structural diagram of another execution plan updating apparatus provided in an embodiment of the present specification. As shown in FIG. 9, the apparatus includes:
an information obtaining module 902 configured to obtain distribution information and parameter information of a sub-plan in an original execution plan, and traverse the parameter information;
a statement construction module 904 configured to construct a target query statement corresponding to the original execution plan when it is determined according to the traversal result that the original execution plan satisfies a plan processing condition;
a plan creating module 906 configured to create a target execution plan corresponding to the target query statement according to the distribution information and the parameter information, where an execution node of the target execution plan is in a to-be-updated state;
a plan updating module 908 configured to update the target execution plan in response to a user query statement and query target data by executing the updated target execution plan.
In an optional embodiment, the obtaining distribution information and parameter information of the sub-plan in the original execution plan includes:
determining a distribution mode corresponding to a sub-plan in the original execution plan, and determining the distribution information according to the distribution mode; and determining a distribution judgment condition corresponding to the sub-plan in the original execution plan, and determining the parameter information according to the parameter values recorded in the distribution judgment condition.
In an optional embodiment, the determining a distribution mode corresponding to a sub-plan in the original execution plan and determining the distribution information according to the distribution mode includes:
determining sub-plan information by reading a data structure of a sub-plan in the original execution plan; when the sub-plan is determined to be a single-table scan plan according to the sub-plan information, determining that the distribution mode corresponding to the sub-plan is a first distribution mode, and reading, based on the first distribution mode, the single-table distribution information corresponding to the single-table scan plan as the distribution information; or, when the sub-plan is determined to be a connection plan according to the sub-plan information, determining that the distribution mode corresponding to the sub-plan is a second distribution mode, and reading, based on the second distribution mode, the connection distribution information corresponding to the connection plan as the distribution information.
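As a rough illustration of this branching, the following sketch classifies a sub-plan by a type tag and returns the matching distribution mode together with its distribution information. The `SubPlan` class, the tags `"single_table_scan"` and `"connection"`, and the dictionary layout of the distribution information are hypothetical and stand in for whatever data structure the original execution plan actually uses.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class SubPlan:
    kind: str  # hypothetical tag: "single_table_scan" or "connection"
    dist_info: Dict[str, str] = field(default_factory=dict)


def read_distribution(sub_plan: SubPlan) -> Tuple[str, Dict[str, str]]:
    """Return (distribution mode, distribution information) for a sub-plan."""
    if sub_plan.kind == "single_table_scan":
        # First distribution mode: single-table distribution information.
        return "first", sub_plan.dist_info
    if sub_plan.kind == "connection":
        # Second distribution mode: connection distribution information.
        return "second", sub_plan.dist_info
    raise ValueError(f"unsupported sub-plan kind: {sub_plan.kind}")
```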
In an optional embodiment, the determining a distribution judgment condition corresponding to a sub-plan in the original execution plan, and determining the parameter information according to a parameter value recorded in the distribution judgment condition includes:
under the condition that the sub-plan in the original execution plan is a copy plan, reading a copy distribution judgment condition corresponding to the copy plan, and determining the parameter information according to a reference parameter value recorded in the copy distribution judgment condition; or, under the condition that the sub-plan in the original execution plan is a non-copy plan, reading a non-copy distribution judgment condition corresponding to the non-copy plan, traversing statement expressions in the non-copy distribution judgment condition, and determining the parameter information according to expression parameters in the statement expressions.
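The two ways of obtaining the parameter information can be sketched as follows. The example assumes that a statement expression is modelled as a nested tuple whose first element is an operator or node tag, that a parameter appears as `("param", index)`, and that a sub-plan is a dictionary with the keys `is_copy_plan` and `dist_condition`; none of these representations comes from the specification.

```python
from typing import Any, List


def collect_params(expr: Any) -> List[int]:
    """Depth-first walk over a nested-tuple expression, returning parameter indexes in order."""
    if isinstance(expr, tuple):
        if len(expr) == 2 and expr[0] == "param":
            return [expr[1]]
        found: List[int] = []
        for child in expr[1:]:  # expr[0] is the operator or node tag
            found.extend(collect_params(child))
        return found
    return []


def parameter_info(sub_plan: dict) -> List[int]:
    """Read parameter information from a sub-plan's distribution judgment condition."""
    cond = sub_plan["dist_condition"]
    if sub_plan["is_copy_plan"]:
        # Copy plan: the reference parameter values are recorded directly.
        return list(cond["reference_params"])
    # Non-copy plan: traverse every statement expression in the condition.
    params: List[int] = []
    for expr in cond["expressions"]:
        params.extend(collect_params(expr))
    return params
```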
In an optional embodiment, the constructing a target query statement corresponding to the original execution plan when it is determined that the original execution plan satisfies a plan processing condition according to the traversal result includes:
determining sub-plan parameter values of the sub-plan in the original execution plan according to the traversal result; and under the condition that the parameter value of the sub-plan is larger than or equal to the parameter value threshold value, executing the step of constructing the target query statement corresponding to the original execution plan.
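A minimal sketch of this check is given below. It assumes, purely for illustration, that the sub-plan parameter value is the number of parameters found for a sub-plan during the traversal and that every sub-plan must reach the threshold; the specification defines neither detail in this passage, and the function name and default threshold are hypothetical.

```python
from typing import List, Sequence


def meets_plan_processing_condition(
    sub_plan_params: Sequence[List[int]], threshold: int = 1
) -> bool:
    """True when every sub-plan has at least `threshold` parameters (assumed reading)."""
    return all(len(params) >= threshold for params in sub_plan_params)
```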
In an optional embodiment, the constructing a target query statement corresponding to the original execution plan includes:
acquiring a target query tree corresponding to the original execution plan, wherein the target query tree is obtained by structuralizing an initial query tree corresponding to the original execution plan; and performing inverse transformation processing on the target query tree to obtain a target query statement corresponding to the original execution plan, wherein the target query statement is associated with global query information of the target query tree.
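The inverse transformation from a query tree back to a parameterized statement can be pictured with the sketch below. The dictionary layout of the tree (`targets`, `table`, `predicates`) and the `$n` placeholder syntax are assumptions chosen for the example; an actual query tree carries far more global query information than this.

```python
def deparse(tree: dict) -> str:
    """Turn a simplified query tree into a parameterized SELECT statement."""
    columns = ", ".join(tree["targets"])
    sql = f"SELECT {columns} FROM {tree['table']}"
    # Each predicate is (column, operator, parameter index); the index becomes
    # a placeholder so that the statement stays independent of concrete values.
    predicates = [
        f"{column} {op} ${index}" for column, op, index in tree.get("predicates", [])
    ]
    if predicates:
        sql += " WHERE " + " AND ".join(predicates)
    return sql
```

For a tree with targets `id, name`, table `orders`, and predicates on `id` and `region`, the function yields `SELECT id, name FROM orders WHERE id = $1 AND region = $2`.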
In an optional embodiment, the creating a target execution plan corresponding to the target query statement according to the distribution information and the parameter information includes:
according to the distribution information and the parameter information, a target sub-plan corresponding to the sub-plan in the original execution plan is created; and composing the target execution plan based on the target sub-plan, wherein the target execution plan comprises a parameterized target query statement.
In an optional embodiment, the apparatus further comprises:
an updating module configured to receive a user query statement and extract a user query parameter from the user query statement; update the target execution plan based on the user query parameter; and determine a target data node by executing the updated target execution plan, and query data at the target data node.
In an optional embodiment, the updating the target execution plan based on the user query parameter includes:
updating the state of the execution node of the target sub-plan in the target execution plan according to the user query parameter;
correspondingly, the determining a target data node by executing the updated target execution plan and querying data at the target data node includes:
and determining the target data node according to the state updating result, and querying data at the target data node by executing the updated target execution plan.
In an optional embodiment, the determining the target data node according to the status update result includes:
under the condition that a target sub-plan in the target execution plan corresponds to a third distribution mode, selecting a data node corresponding to the target sub-plan with the highest execution priority in the target execution plan as the target data node according to the state updating result; or under the condition that a target sub-plan in the target execution plan corresponds to a fourth distribution mode, selecting the same data node corresponding to the target sub-plan in the target execution plan as the target data node according to the state updating result; or under the condition that a target sub-plan in the target execution plan corresponds to a mixed distribution mode, determining a data node corresponding to the target sub-plan in the target execution plan according to the state updating result, and selecting the data node with the highest hit rate as the target data node.
In summary, to make a distributed query execution plan reusable, thereby reducing resource consumption, while ensuring that the plan can be pushed down completely, the distribution information and the parameter information of the sub-plan in the original execution plan are obtained and the parameter information is traversed. When it is determined according to the traversal result that the original execution plan satisfies the plan processing condition, the target query statement corresponding to the original execution plan is constructed. A target execution plan corresponding to the target query statement is then created according to the distribution information and the parameter information, with the execution nodes of the constructed target execution plan in a to-be-updated state. A parameterized execution plan is thus built from the original execution plan; in practical application it can replace the original execution plan at the parameter binding stage, which achieves dynamic push-down. After a user query statement is received, the plan is updated with the parameters in that statement, so that the data query can be processed directly. The execution plan no longer needs to be regenerated for each set of parameter values; query requests can be answered quickly by reusing the plan directly, which saves resources and improves query efficiency.
The above is an exemplary scheme of another execution plan updating apparatus of the present embodiment. It should be noted that the technical solution of the execution plan updating apparatus and the technical solution of the execution plan updating method described above belong to the same concept; for details not described in the technical solution of the execution plan updating apparatus, reference may be made to the description of the technical solution of the execution plan updating method described above.
Corresponding to the above method embodiment, the present specification further provides an embodiment of a database system, and FIG. 10 shows a schematic structural diagram of a database system provided in an embodiment of the present specification. As shown in FIG. 10, database system 1000 includes a coordinating node 1010 and a target data node 1020.
The coordinating node 1010 is configured to: obtain distribution information and parameter information of a sub-plan in the original execution plan, and traverse the parameter information; construct a target query statement corresponding to the original execution plan when it is determined according to the traversal result that the original execution plan satisfies the plan processing condition; create a target execution plan corresponding to the target query statement according to the distribution information and the parameter information; and, in response to a user query statement, update the target execution plan, determine a target data node according to the update result, and send the updated target execution plan to the target data node;
the target data node 1020 is configured to receive the updated target execution plan, determine target data by executing the updated target execution plan, and feed back the target data to the coordinating node.
It should be noted that the database system provided in this embodiment corresponds to the execution plan updating method described above; for the same or corresponding content, reference may be made to the above embodiments, which is not repeated here.
Based on this, the coordinating node specifically refers to a node in the database system for creating a pushable execution plan, and the data node specifically refers to a node in the distributed database for storing data, which may be used to execute the execution plan, and to query and feed back data to the coordinating node.
That is, after obtaining the distribution information and the parameter information of the sub-plan in the original execution plan, the coordinating node traverses the parameter information. When it determines according to the traversal result that the original execution plan satisfies the plan processing condition, it constructs the target query statement corresponding to the original execution plan, and then creates the target execution plan corresponding to the target query statement according to the distribution information and the parameter information. After creation is complete, the coordinating node may store the target execution plan in the plan cache.
When a query statement with the same pattern as the target execution plan is received, that is, when a user query statement is received, the coordinating node can read the target execution plan from the plan cache and update it based on the parameters in the user query statement, so that the parameters of the current user query are filled into the reusable target execution plan. The data node that needs to query and return the data, namely the target data node, is then determined according to the updated target execution plan, and the updated target execution plan is sent to the target data node.
After the target data node receives the updated target execution plan, because the parameters corresponding to the user query statement have already been written into the plan, the target data node can determine the locally stored target data by executing the updated target execution plan and feed the target data back to the coordinating node. For the determination of the target data node, reference may be made to the same or corresponding description in the foregoing embodiments, which is not repeated here.
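The interaction between the plan cache, the parameter update, and the routing to a target data node can be sketched as follows. Everything concrete in the example is an assumption made for illustration: the `Coordinator` class, the use of a regular expression that treats integer literals as parameters, the `dist_key_index` field, and routing by taking the distribution key modulo the number of data nodes. The specification does not fix any of these choices.

```python
import re
from typing import Dict, List, Optional, Tuple

_INT_LITERAL = re.compile(r"\b\d+\b")


def normalize(sql: str) -> Tuple[str, List[int]]:
    """Split a statement into its pattern and the integer literals it contains."""
    params = [int(m) for m in _INT_LITERAL.findall(sql)]
    return _INT_LITERAL.sub("?", sql), params


class Coordinator:
    def __init__(self, node_count: int) -> None:
        self.node_count = node_count
        self.plan_cache: Dict[str, dict] = {}

    def register(self, pattern: str, plan: dict) -> None:
        """Store a target execution plan whose execution nodes await parameters."""
        self.plan_cache[pattern] = plan

    def handle(self, sql: str) -> Optional[Tuple[str, dict]]:
        """Return (target data node, updated plan), or None on a cache miss."""
        pattern, params = normalize(sql)
        cached = self.plan_cache.get(pattern)
        if cached is None:
            return None  # caller would fall back to ordinary planning
        # Copy the cached plan so that it stays reusable for later queries.
        updated = dict(cached, params=params, state="updated")
        key = params[cached["dist_key_index"]]
        return f"dn{key % self.node_count}", updated
```

A usage example: after registering a plan under the pattern `SELECT * FROM orders WHERE id = ?` with `dist_key_index` 0 on a four-node coordinator, the statement `SELECT * FROM orders WHERE id = 42` is routed to `dn2`, while the cached plan itself remains in its to-be-updated state.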
In an optional embodiment, the coordinating node 1010 is further configured to:
determining a distribution mode corresponding to a sub-plan in the original execution plan, and determining the distribution information according to the distribution mode; and determining a distribution judgment condition corresponding to the sub-plan in the original execution plan, and determining the parameter information according to the parameter values recorded in the distribution judgment condition.
In an optional embodiment, the coordinating node 1010 is further configured to:
determining sub-plan information by reading a data structure of a sub-plan in the original execution plan; when the sub-plan is determined to be a single-table scan plan according to the sub-plan information, determining that the distribution mode corresponding to the sub-plan is a first distribution mode, and reading, based on the first distribution mode, the single-table distribution information corresponding to the single-table scan plan as the distribution information; or, when the sub-plan is determined to be a connection plan according to the sub-plan information, determining that the distribution mode corresponding to the sub-plan is a second distribution mode, and reading, based on the second distribution mode, the connection distribution information corresponding to the connection plan as the distribution information.
In an optional embodiment, the coordinating node 1010 is further configured to:
under the condition that the sub-plan in the original execution plan is a copy plan, reading a copy distribution judgment condition corresponding to the copy plan, and determining the parameter information according to a reference parameter value recorded in the copy distribution judgment condition; or, under the condition that the sub-plan in the original execution plan is a non-copy plan, reading a non-copy distribution judgment condition corresponding to the non-copy plan, traversing statement expressions in the non-copy distribution judgment condition, and determining the parameter information according to expression parameters in the statement expressions.
In an optional embodiment, the coordinating node 1010 is further configured to:
determining a sub-plan parameter value of the sub-plan in the original execution plan according to a traversal result; and under the condition that the parameter value of the sub-plan is larger than or equal to the parameter value threshold value, executing the step of constructing the target query statement corresponding to the original execution plan.
In an optional embodiment, the coordinating node 1010 is further configured to:
acquiring a target query tree corresponding to the original execution plan, wherein the target query tree is obtained by structuralizing the initial query tree corresponding to the original execution plan; and performing inverse transformation processing on the target query tree to obtain a target query statement corresponding to the original execution plan, wherein the target query statement is associated with the global query information of the target query tree.
In an optional embodiment, the coordinating node 1010 is further configured to:
according to the distribution information and the parameter information, creating a target sub-plan corresponding to the sub-plan in the original execution plan; and composing the target execution plan based on the target sub-plan, wherein the target execution plan comprises a parameterized target query statement.
In an optional embodiment, the coordinating node 1010 is further configured to:
receiving a user query statement, and extracting a user query parameter from the user query statement; updating the target execution plan based on the user query parameters.
In an optional embodiment, the coordinating node 1010 is further configured to:
and updating the state of the execution node of the target sub-plan in the target execution plan according to the user query parameter.
In an optional embodiment, the coordinating node 1010 is further configured to:
under the condition that a target sub-plan in the target execution plan corresponds to a third distribution mode, selecting a data node corresponding to the target sub-plan with the highest execution priority in the target execution plan as the target data node according to the state updating result; or under the condition that a target sub-plan in the target execution plan corresponds to a fourth distribution mode, selecting the same data node corresponding to the target sub-plan in the target execution plan as the target data node according to the state updating result; or under the condition that a target sub-plan in the target execution plan corresponds to a mixed distribution mode, determining a data node corresponding to the target sub-plan in the target execution plan according to the state updating result, and selecting the data node with the highest hit rate as the target data node.
In summary, in a distributed database scenario, to make the distributed query execution plan reusable, thereby reducing resource consumption, while ensuring that the plan can be pushed down completely, the distribution information and the parameter information of the sub-plan in the original execution plan are obtained and the parameter information is traversed. When it is determined according to the traversal result that the original execution plan satisfies the plan processing condition, the target query statement corresponding to the original execution plan is constructed. A target execution plan corresponding to the target query statement is then created according to the distribution information and the parameter information, with the execution nodes of the constructed target execution plan in a to-be-updated state. A parameterized execution plan is thus built from the original execution plan; in practical application it can replace the original execution plan at the parameter binding stage, which achieves dynamic push-down. After a user query statement is received, the plan is updated with the parameters in that statement, so that the data query can be processed directly. The execution plan no longer needs to be regenerated for each set of parameter values; query requests can be answered quickly by reusing the plan directly, which saves resources and improves query efficiency.
The foregoing is a schematic scheme of the database system of the present embodiment. It should be noted that the technical solution of the database system and the technical solution of the execution plan updating method belong to the same concept; for details not described in the technical solution of the database system, reference may be made to the description of the technical solution of the execution plan updating method.
FIG. 11 illustrates a block diagram of a computing device 1100 provided in accordance with one embodiment of the present description. The components of the computing device 1100 include, but are not limited to, memory 1110 and a processor 1120. The processor 1120 is coupled to the memory 1110 via a bus 1130 and the database 1150 is used to store data.
The computing device 1100 also includes an access device 1140, the access device 1140 enabling the computing device 1100 to communicate via one or more networks 1160. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the Internet. The access device 1140 may include one or more of any type of network interface, wired or wireless, e.g., a Network Interface Card (NIC), such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 1100, as well as other components not shown in FIG. 11, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 11 is for purposes of example only and is not limiting as to the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 1100 can be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smartphone), wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. The computing device 1100 can also be a mobile or stationary server.
The processor 1120 is configured to execute computer-executable instructions, which when executed by the processor, implement the steps of the execution plan updating method described above.
The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the execution plan updating method described above belong to the same concept; for details not described in the technical solution of the computing device, reference may be made to the description of the technical solution of the execution plan updating method described above.
An embodiment of the present specification further provides a computer-readable storage medium storing computer-executable instructions, which when executed by a processor, implement the steps of the execution plan updating method described above.
The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the execution plan updating method described above, and for details that are not described in detail in the technical solution of the storage medium, reference may be made to the description of the technical solution of the execution plan updating method described above.
An embodiment of the present specification further provides a computer program, wherein when the computer program is executed in a computer, the computer is caused to execute the steps of the execution plan updating method.
The above is an illustrative scheme of a computer program of the present embodiment. It should be noted that the technical solution of the computer program belongs to the same concept as the technical solution of the execution plan updating method; for details not described in the technical solution of the computer program, reference may be made to the description of the technical solution of the execution plan updating method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be suitably expanded or restricted according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals or telecommunication signals in accordance with legislation and patent practice.
It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of acts, but those skilled in the art should understand that the present embodiment is not limited by the described acts, because some steps may be performed in other sequences or simultaneously according to the present embodiment. Further, those skilled in the art should also appreciate that the embodiments described in this specification are preferred embodiments and that acts and modules referred to are not necessarily required for an embodiment of the specification.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, and to thereby enable others skilled in the art to best understand the specification and utilize the specification. The specification is limited only by the claims and their full scope and equivalents.