CN117555906A

CN117555906A - Data processing method, device, electronic equipment and storage medium

Info

Publication number: CN117555906A
Application number: CN202410048473.2A
Authority: CN
Inventors: 卢栋栋
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2024-01-12
Filing date: 2024-01-12
Publication date: 2024-02-13
Anticipated expiration: 2044-01-12
Also published as: CN117555906B

Abstract

The embodiments of the application provide a data processing method, a device, electronic equipment and a storage medium, relate to the technical field of data storage, and can be applied to various scenarios such as object storage, cloud technology, artificial intelligence, intelligent traffic and assisted driving. The method is applied to a database that corresponds to at least two storage nodes; a main table is stored in the database, and the main table comprises a primary key field and a non-primary key field; the main table is divided into a plurality of fragments, and each storage node stores one main table fragment. The method comprises: determining the main table fragment stored in each storage node; and, for each storage node, creating a local unique index table in the storage node based on the main table fragment stored in that node, so that when a query request for the database is received, each storage node queries the field data of the field to be queried in its own local unique index table and main table fragment.

Description

Data processing method, device, electronic equipment and storage medium

Technical Field

The application belongs to the technical field of data storage, and particularly relates to a data processing method, a data processing device, electronic equipment and a storage medium.

Background

An index is a structure that orders the values of one or more columns of a database table so that specific information in the table can be accessed quickly. Conventionally, an index is created by storing the index column values of the main table in a new physical table (an index table for short), and query efficiency is improved by querying the index table. For a globally unique index in a distributed database, the conventional scheme is to build an index table and distribute the index values across the storage nodes according to a rule (for example, a database and table sharding rule), so an index row and its corresponding main table row may be located on different storage nodes. For example, as shown in fig. 1, when a globally unique index idx2 is created on col2 of the main table test, the idx2 index table contains two columns, col2 and col1. col2 serves as the distribution key of the idx2 index table and is used to redistribute the idx2 data to the storage nodes; col1, the distribution key of the main table test, is also included in idx2. As a result, the idx2 index data on a storage node does not correspond to the main table data on that same node, i.e., the main table and the index table are in a many-to-many relationship across storage nodes.

Therefore, when a query uses the index and then goes back to the main table (an index back-to-table lookup), the prior art needs to access multiple nodes, which incurs network overhead and degrades performance. In the above example, if select col3 from test where col2 = 1; is executed, the index table idx2 is accessed first: col1=9 is obtained for col2=1 on storage node 1. The main table test is then accessed with col1=9, the value of col3 is obtained from storage node 2, and finally the value of col3 is returned to the querying party.

In the above process, storage node 1 must be accessed first and storage node 2 afterwards; this cross-node access is what makes index back-to-table performance poor.
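The two-hop lookup described above can be illustrated with a minimal Python sketch. The dictionaries, the function name and the col3 value "v" are assumptions made for illustration; only the placement of col2=1 on storage node 1 and col1=9 on storage node 2 comes from the example.

```python
# Hypothetical placement mirroring the example above; the col3 value "v" is invented.
INDEX_SHARDS = {            # idx2: col2 -> col1, distributed by col2
    "node1": {1: 9},
    "node2": {},
}
MAIN_SHARDS = {             # test: col1 -> row, distributed by col1
    "node1": {},
    "node2": {9: {"col1": 9, "col2": 1, "col3": "v"}},
}


def select_col3_by_col2(col2_value):
    """Emulate: select col3 from test where col2 = <col2_value>."""
    visited = []
    # Step 1: probe the global index table to translate col2 into col1.
    pk = None
    for node, index in INDEX_SHARDS.items():
        if col2_value in index:
            pk = index[col2_value]
            visited.append(node)
            break
    if pk is None:
        return None, visited
    # Step 2: go back to the main table, which may live on another node.
    for node, rows in MAIN_SHARDS.items():
        if pk in rows:
            if node not in visited:
                visited.append(node)
            return rows[pk]["col3"], visited
    return None, visited


value, visited = select_col3_by_col2(1)
print(value, visited)
```

The list of visited nodes contains both storage nodes, which is the cross-node access that the application sets out to avoid.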

Disclosure of Invention

The embodiments of the application aim to provide a data processing method, a device, electronic equipment and a storage medium capable of improving index back-to-table performance. In order to achieve the above object, the technical solution provided in the embodiments of the present application is as follows:

In a first aspect, a data processing method is provided and applied to a database, where the database corresponds to at least two storage nodes, a main table is stored in the database, and the main table includes a primary key field and a non-primary key field; the main table is divided into a plurality of fragments, and each storage node stores one main table fragment, the method comprising:

determining a main table fragment stored in each storage node;

executing a local unique index table creation operation for each storage node;

the local unique index table creating operation includes:

creating a local unique index table in the storage node based on the main table fragment stored in the storage node, so that when a query request for a database is received, each storage node queries field data of a field to be queried in the corresponding local unique index table and main table fragment;

wherein the query request carries a field to be queried; and, for each storage node, the primary key data of the primary key field in the local unique index table is the same as the primary key data of the primary key field in the corresponding main table fragment.
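As a rough illustration of the creation operation, the following Python sketch builds one local unique index table per storage node from that node's main table fragment. The column names col1 (primary key) and col2 (indexed field) follow the background example; the function name and the sample rows are assumptions.

```python
def create_local_unique_index(fragment, index_field="col2", pk_field="col1"):
    """Build {index value -> primary key} from one node's main table fragment."""
    index = {}
    for row in fragment:
        key = row[index_field]
        if key in index:
            raise ValueError(f"duplicate value {key!r} for {index_field}")
        index[key] = row[pk_field]
    return index


# Hypothetical fragments: one main table fragment per storage node.
fragments = {
    "node1": [{"col1": 3, "col2": 7, "col3": "a"}],
    "node2": [{"col1": 9, "col2": 1, "col3": "b"},
              {"col1": 4, "col2": 8, "col3": "c"}],
}

local_indexes = {node: create_local_unique_index(rows)
                 for node, rows in fragments.items()}
```

Each index is derived only from the fragment on the same node, so the primary keys in a local index table are exactly those of the local fragment. The sketch enforces uniqueness within a node only; uniqueness across nodes needs a separate global check.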

In one possible implementation, the method further includes:

receiving a data update request for a database; wherein, the data update request carries field data of a primary key field; the data update request includes: a data insertion request or a data modification request;

determining a first target storage node corresponding to field data of the primary key field from all storage nodes;

and based on the data updating request, performing data updating operation on the local unique index table and the main table fragment stored in the first target storage node.

In another possible implementation manner, the local unique index table includes a primary key field and a global unique index field; the data updating request comprises a data inserting request, the data inserting request comprises at least one piece of data to be inserted, and each piece of data to be inserted comprises: main key data to be inserted into a main key field and non-main key data to be inserted into a non-main key field;

the determining, from all storage nodes, of a first target storage node corresponding to the field data of the primary key field, and the performing, based on the data update request, of a data update operation on the local unique index table and the main table fragment stored in the first target storage node, include:

determining main key data in each piece of data to be inserted;

determining a second target storage node corresponding to the main key data of each piece of data to be inserted according to each piece of data to be inserted;

and for each piece of data to be inserted, inserting the data to be inserted into the main table fragment of the second target storage node, determining the index data of the globally unique index field in the piece of data to be inserted, and inserting the index data and the primary key data into the local unique index table of the second target storage node.
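The insertion steps can be sketched as follows. The StorageNode class, the modulo routing rule and the sample rows are assumptions for illustration; the application does not specify how a primary key is mapped to a storage node.

```python
class StorageNode:
    def __init__(self, name):
        self.name = name
        self.main_fragment = {}   # primary key (col1) -> full row
        self.local_index = {}     # globally unique index field (col2) -> primary key


def route(nodes, primary_key):
    # Assumed distribution rule: integer primary key modulo node count.
    return nodes[primary_key % len(nodes)]


def insert_rows(nodes, rows):
    placed = []
    for row in rows:
        target = route(nodes, row["col1"])                # node chosen by primary key
        target.main_fragment[row["col1"]] = dict(row)     # write the main table fragment
        target.local_index[row["col2"]] = row["col1"]     # write the co-located index row
        placed.append((row["col1"], target.name))
    return placed


nodes = [StorageNode("node0"), StorageNode("node1")]
placed = insert_rows(nodes, [
    {"col1": 3, "col2": 7, "col3": "a"},
    {"col1": 4, "col2": 8, "col3": "b"},
])
```

Because the index row is written to the node selected by the primary key, the main table row and its index row always land on the same node.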

In another possible implementation manner, the data update request includes a data modification request, where the data modification request carries target data of at least one field to be modified, and primary key data of a primary key field corresponding to each field to be modified;

Determining primary key data corresponding to each field to be modified;

determining a third target storage node corresponding to the primary key data for each field to be modified;

and for each field to be modified, in the third target storage node, determining the original data of the field to be modified in the main table fragment, and modifying the original data into the target data.

In yet another possible implementation manner, if the field to be modified includes a globally unique index field, the method further comprises:

determining the original data of the globally unique index field through the primary key data in the local unique index table of the third target storage node;

and modifying the original data into the target data.

In another possible implementation manner, if the field to be modified includes a globally unique index field, the method further comprises:

determining, through the primary key data, the target data row in which the primary key data is located in the local unique index table of the third target storage node;

deleting all data of the target data row from the local unique index table;

and inserting the target data of the globally unique index field and the primary key data into the local unique index table.
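A minimal sketch of the delete-then-insert variant on a single storage node is given below. The dictionary layout and the function name are assumptions, and the global uniqueness check is deliberately left out.

```python
def modify_unique_field(main_fragment, local_index, primary_key, target, field="col2"):
    """Apply the delete-then-insert variant on a single storage node."""
    # Locate the index row(s) that carry this primary key.
    stale_keys = [value for value, pk in local_index.items() if pk == primary_key]
    for value in stale_keys:
        del local_index[value]              # delete the old index row
    local_index[target] = primary_key       # insert (target data, primary key)
    main_fragment[primary_key][field] = target
    return stale_keys


main_fragment = {3: {"col1": 3, "col2": 7, "col3": "a"}}
local_index = {7: 3}
removed = modify_unique_field(main_fragment, local_index, 3, 5)
```

All three writes touch structures on the same node, so no other storage node is involved in the modification.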

In another possible implementation manner, the data update request further carries target data of a field to be updated; if the field to be updated includes a globally unique index field, the method further includes:

recording target data of the globally unique index field;

checking whether the target data is unique in a global unique index table; wherein the global unique index table is composed of local unique index tables;

submitting the data updating operation if the target data is unique in the global unique index table;

and if the target data is not unique in the globally unique index table, rolling back the data updating operation.
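The record, check, commit or roll back sequence can be sketched as below. This is a simplification under stated assumptions: the node layout and function name are hypothetical, and the check runs before any local change is applied, so rolling back simply means discarding the recorded target data rather than undoing a distributed transaction.

```python
def update_unique_field(nodes, home, primary_key, target, field="col2"):
    """Return 'committed' or 'rolled back' for one update of the unique field."""
    pending = {"node": home, "pk": primary_key, "target": target}  # recorded target data
    # Global check: the union of all local unique index tables acts as the global one.
    for node in nodes.values():
        owner = node["index"].get(pending["target"])
        if owner is not None and owner != pending["pk"]:
            return "rolled back"            # value already owned by another row
    node = nodes[pending["node"]]
    old = node["main"][primary_key][field]
    node["index"].pop(old, None)
    node["index"][target] = primary_key
    node["main"][primary_key][field] = target
    return "committed"


nodes = {
    "node1": {"main": {3: {"col1": 3, "col2": 7}}, "index": {7: 3}},
    "node2": {"main": {9: {"col1": 9, "col2": 1}}, "index": {1: 9}},
}
first = update_unique_field(nodes, "node1", 3, 5)    # 5 is unused anywhere
second = update_unique_field(nodes, "node1", 3, 1)   # 1 is owned by col1=9 on node2
```

The second update is rejected because the value is already present in another node's local unique index table, which is the situation the global check is meant to catch.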

In a second aspect, there is provided another data processing method, the method comprising:

acquiring a data query request aiming at a database, wherein the data query request comprises a target data index to be queried and a field to be queried; the database is provided with at least two storage nodes, each storage node is provided with a main table fragment of the main table and a local unique index table corresponding to the main table fragment, and the local unique index table comprises data indexes of field data of the main key field in the main table fragment;

If the field to be queried is a non-primary key field, each storage node queries field data of a primary key field corresponding to the target data index in a respective local unique index table according to the target data index, and queries field data of the field to be queried in a stored primary table fragment based on the queried field data of the primary key field.

In one possible implementation manner, if the field to be queried is a primary key field, each storage node queries the field data of the field to be queried in its local unique index table according to the target data index.
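The two query paths of the second aspect can be sketched as follows. The node layout and the function name are assumptions, and the nodes are probed one after another here, whereas in practice each storage node would presumably probe its own index independently.

```python
def query(nodes, target_index, field, pk_field="col1"):
    """Return (value, node name, touched_main_fragment)."""
    for name, node in nodes.items():
        pk = node["index"].get(target_index)
        if pk is None:
            continue                      # this node holds no matching index row
        if field == pk_field:
            return pk, name, False        # answered from the local index alone
        return node["main"][pk][field], name, True   # same-node back-to-table
    return None, None, False


nodes = {
    "node1": {"main": {3: {"col1": 3, "col2": 7, "col3": "a"}}, "index": {7: 3}},
    "node2": {"main": {9: {"col1": 9, "col2": 1, "col3": "b"}}, "index": {1: 9}},
}
non_pk_result = query(nodes, 1, "col3")
pk_result = query(nodes, 1, "col1")
```

In both paths the answer comes from a single node: a non-primary key field needs the local index and then the local main table fragment, while a primary key field is answered from the local index alone.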

in a third aspect, a data processing apparatus is provided and applied to a database, where the database corresponds to at least two storage nodes, a main table is stored in the database, and the main table includes a main key field and a non-main key field; the master table is divided into a plurality of fragments, and a storage node stores one master table fragment, the apparatus comprising:

the main table fragment determining module is used for determining the main table fragments stored in each storage node;

the creation operation determining module is used for executing a local unique index table creation operation aiming at each storage node;

The creation operation determining module is specifically configured to, when executing a local unique index table creation operation:

In one possible implementation, the apparatus further includes: a receiving module, a storage node determining module and a data updating operation module, wherein,

the receiving module is used for receiving a data updating request aiming at the database; wherein, the data update request carries field data of a primary key field; the data update request includes: a data insertion request or a data modification request;

the storage node determining module is used for determining a first target storage node corresponding to the field data of the primary key field from all the storage nodes;

The data updating operation module is configured to perform a data updating operation on the local unique index table and the master table fragment stored in the first target storage node based on the data updating request.

the storage node determining module is specifically configured to, when determining, from each storage node, a first target storage node corresponding to field data of the primary key field:

determining main key data in each piece of data to be inserted;

the data updating operation module is specifically configured to, when executing a data updating operation on the local unique index table and the master table fragment stored in the first target storage node based on the data updating request:

determining primary key data corresponding to each field to be modified;

In another possible implementation manner, if the field to be modified includes a globally unique index field; the apparatus further comprises: the data determining module and the modifying module, wherein,

the data determining module is configured to determine, in a local unique index table of the third storage node, original data of the global unique index field through the primary key data;

the modification module is used for modifying the original data into the target data.

In another possible implementation manner, if the field to be modified includes a globally unique index field, the apparatus further comprises: a data row determining module, a deleting module and an inserting module, wherein,

the data row determining module is configured to determine, through the primary key data, the target data row in which the primary key data is located in the local unique index table of the third target storage node;

the deleting module is used for deleting all data of the target data row from the local unique index table;

the inserting module is configured to insert the target data of the globally unique index field and the primary key data into the local unique index table.

In another possible implementation manner, the data update request further carries target data of a field to be updated; if the field to be updated includes a globally unique index field, the apparatus further includes: a recording module, a uniqueness checking module, an operation submitting module and an operation rollback module, wherein,

the recording module is used for recording the target data of the globally unique index field;

the uniqueness checking module is used for checking whether the target data is unique in the global unique index table; wherein the global unique index table is composed of local unique index tables;

the operation submitting module is used for submitting the data updating operation when the target data is unique in the globally unique index table;

the operation rollback module is used for rollback the data updating operation when the target data is not unique in the globally unique index table.

In a fourth aspect, there is provided a data processing apparatus, the apparatus comprising:

The request acquisition module is used for acquiring a data query request aiming at a database, wherein the data query request comprises a target data index to be queried and a field to be queried; the database is provided with at least two storage nodes, each storage node is provided with a main table fragment of the main table and a local unique index table corresponding to the main table fragment, and the local unique index table comprises data indexes of field data of the main key field in the main table fragment;

and the first query module is used for querying field data of a main key field corresponding to the target data index in respective local unique index tables by each storage node according to the target data index when the field to be queried is a non-main key field, and querying the field data of the field to be queried in the stored main table fragments based on the queried field data of the main key field.

In one possible implementation, the apparatus further includes: a second query module, used for each storage node to query the field data of the field to be queried in its local unique index table according to the target data index when the field to be queried is a primary key field.

In a fifth aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes a memory and a processor, the memory stores a computer program, and the processor executes the computer program to implement a data processing method provided by any possible implementation manner of the first aspect.

In a sixth aspect, embodiments of the present application further provide a computer readable storage medium, in which a computer program is stored, which when executed by a processor implements a data processing method provided by any one of the possible implementations of the first aspect.

In a seventh aspect, embodiments of the present application further provide a computer program product comprising a computer program which, when executed by a processor, implements a data processing method provided by any one of the possible implementations of the first aspect.

In an eighth aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes a memory and a processor, the memory stores a computer program, and the processor executes the computer program to implement a data processing method provided by any possible implementation manner of the second aspect.

In a ninth aspect, embodiments of the present application further provide a computer readable storage medium having stored therein a computer program which, when executed by a processor, implements a data processing method provided by any one of the possible implementations of the second aspect.

In a tenth aspect, embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements a data processing method provided by any one of the possible implementations of the second aspect.

The beneficial effects brought by the technical scheme provided by the embodiment of the application are as follows:

The embodiments of the application provide a data processing method, a device, electronic equipment and a storage medium applied to a database, wherein the database corresponds to at least two storage nodes, a main table is stored in the database, and the main table comprises a primary key field and a non-primary key field. In these embodiments, the main table fragment stored in each storage node is determined and a local unique index table creation operation is performed for each storage node; that is, for each storage node, a local unique index table is created in the node based on the main table fragment stored there, and the primary key data of the primary key field in the local unique index table is the same as the primary key data of the primary key field in the corresponding main table fragment. When a query request for the database is received, each storage node therefore queries the field data of the field to be queried in its own local unique index table and main table fragment, so the index query and the main table query are performed on the same node. This avoids cross-node access, reduces network overhead and improves performance.

Compared with the related art, the embodiments of the application provide another data processing method, device, electronic equipment and storage medium, in which a data query request for a database is acquired. The data query request includes a target data index to be queried and a field to be queried. The database stores a main table comprising a primary key field and a non-primary key field and corresponds to at least two storage nodes; each storage node stores a main table fragment of the main table and a local unique index table corresponding to that fragment, and the local unique index table includes a data index for each piece of field data of the primary key field in the fragment. In other words, each storage node holds a main table fragment together with its local unique index table, and the primary key data in the local unique index table is the same as the primary key data of the main table fragment. If the field to be queried is a non-primary key field, each storage node queries, according to the target data index, the field data of the primary key field corresponding to the target data index in its own local unique index table, and then queries the field data of the field to be queried in its stored main table fragment based on the queried primary key field data. The lookup therefore does not need to cross nodes, which reduces network overhead and query cost.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings that are required to be used in the description of the embodiments of the present application will be briefly described below.

FIG. 1 is a diagram illustrating the creation of a globally unique index table in the related art;

FIG. 2 is a schematic diagram of a visibility problem of an index back table;

FIG. 3 is a diagram showing the difference between the clustered index and the ordinary index in the visibility judgment;

FIG. 4 is a schematic diagram on transaction visibility determination for a globally unique index;

FIG. 5 is a schematic view of an application environment of a data processing method in an embodiment of the present application;

FIG. 6a is a flowchart of a data processing method according to an embodiment of the present application;

FIG. 6b is a flowchart illustrating another data processing method according to an embodiment of the present application;

FIG. 6c is a flowchart illustrating another data processing method according to an embodiment of the present application;

FIG. 7a is a schematic diagram of creating a locally unique index based on a master table in an embodiment of the present application;

FIG. 7b is a schematic diagram of creating a locally unique index based on a master table in an embodiment of the present application;

FIG. 8 is a schematic diagram of creating a locally unique index when creating a master table in an embodiment of the present application;

FIG. 9 is a schematic flow chart of creating a locally unique index according to an embodiment of the present application;

FIG. 10 is a schematic flow chart of an insertion transaction according to an embodiment of the present application;

FIG. 11 is a schematic flow chart of a data modification transaction according to an embodiment of the present application;

FIG. 12 is a flowchart of another data processing method according to an embodiment of the present application;

FIG. 13a is a flowchart illustrating an index-back table query in an embodiment of the present application;

FIG. 13b is a flowchart illustrating an index query according to an embodiment of the present application;

FIG. 14 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;

FIG. 15 is a schematic diagram of another data processing apparatus according to an embodiment of the present application;

FIG. 16 is a schematic structural diagram of an electronic device in an embodiment of the present application.

Detailed Description

Embodiments of the present application are described below with reference to the drawings in the present application. It should be understood that the embodiments described below with reference to the drawings are exemplary descriptions for explaining the technical solutions of the embodiments of the present application, and the technical solutions of the embodiments of the present application are not limited.

As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and "comprising," when used in this application, specify the presence of stated features, information, data, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, information, data, steps, operations, elements, components, and/or groups thereof, all of which may be included in the present application. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein indicates at least one of the items defined by the term; e.g., "A and/or B" may be implemented as "A", or as "B", or as "A and B". In describing a plurality of (two or more) items, if the relationship between the items is not explicitly defined, the plurality of items may refer to one, more or all of the items; for example, the description "the parameter A includes A1, A2, A3" may be implemented such that the parameter A includes A1 or A2 or A3, and may also be implemented such that the parameter A includes at least two of the three items A1, A2, A3.

Alternatively, the data processing according to the embodiments of the present application may be implemented based on Cloud storage (Cloud storage) in Cloud technology.

Cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software and networks within a wide area network or a local area network to realize the computation, storage, processing and sharing of data.

Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology and the like that are applied on the basis of the cloud computing business model; these can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. The background services of technical network systems, such as video websites, picture websites and portal websites, require a large amount of computing and storage resources. With the rapid development and application of the internet industry, every item may in the future have its own identification mark, which needs to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data require strong backend system support, which can only be realized through cloud computing.

Cloud storage is a new concept which extends and develops in the concept of cloud computing, and a distributed cloud storage system (hereinafter referred to as a storage system for short) refers to a storage system which integrates a large number of storage devices (storage devices are also called storage nodes) of different types in a network through application software or application interfaces to cooperatively work and jointly provides data storage and service access functions for the outside through functions such as cluster application, grid technology, a distributed storage file system and the like.

At present, the storage method of the storage system is as follows: when creating logical volumes, each logical volume is allocated a physical storage space, which may be a disk composition of a certain storage device or of several storage devices. The client stores data on a certain logical volume, that is, the data is stored on a file system, the file system divides the data into a plurality of parts, each part is an object, the object not only contains the data but also contains additional information such as a data Identification (ID) and the like, the file system writes each object into a physical storage space of the logical volume, and the file system records storage position information of each object, so that when the client requests to access the data, the file system can enable the client to access the data according to the storage position information of each object.

The process of allocating physical storage space for the logical volume by the storage system specifically includes: physical storage space is divided into stripes in advance according to a set of capacity estimates for the objects to be stored on a logical volume (these estimates tend to leave a large margin relative to the capacity of the objects actually stored) and the redundant array of independent disks (RAID, Redundant Array of Independent Disks) configuration, and a logical volume can be understood as a stripe, whereby physical storage space is allocated for the logical volume.

The Database (Database), which can be considered as an electronic filing cabinet, is a place for storing electronic files, and users can perform operations such as adding, inquiring, updating, deleting and the like on the data in the files. A "database" is a collection of data stored together in a manner that can be shared with multiple users, with as little redundancy as possible, independent of the application.

The database management system (Database Management System, abbreviated as DBMS) is a computer software system designed for managing databases, and generally has basic functions such as storage, retrieval, security and backup. Database management systems may be classified according to the database model they support, e.g., relational or XML (Extensible Markup Language); according to the type of computer supported, e.g., server cluster or mobile phone; according to the query language used, e.g., SQL (Structured Query Language) or XQuery; according to their performance emphasis, e.g., maximum scale or maximum speed; or in other ways. Regardless of the classification used, some DBMSs span categories, for example by supporting multiple query languages at the same time.

Big data (Big data) refers to a data set which cannot be captured, managed and processed by a conventional software tool within a certain time range, and is a massive, high-growth-rate and diversified information asset which needs a new processing mode to have stronger decision-making ability, insight discovery ability and flow optimization ability. With the advent of the cloud age, big data has attracted more and more attention, and special techniques are required for big data to effectively process a large amount of data within a tolerant elapsed time. Technologies applicable to big data include massively parallel processing databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the internet, and scalable storage systems.

Further, in addition to the poor back-to-table performance, the related art also suffers from write amplification: when the main table is updated, the index table must be updated at the same time, and the cross-node update takes a long time. For example, for the request statement update test set col2 = 5 where col1 = 3; the position to be updated in the main table is first determined from col1=3, that is, the matching row is located on storage node 1 and its col2 is updated to 5. Furthermore, col2 is part of the globally unique index idx2, so the index idx2 must also be updated. Because the old value of col2 is 7 and the new value is 5, the row with col2=7 must first be deleted from index idx2 on storage node 2, and a row (col2, col1) with value (5, 3) must be added on storage node 1. That is, when a main table with a globally unique index is updated, the index table must be updated across nodes at the same time, resulting in write amplification.
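To make the write fan-out concrete, the sketch below lists the operations for this update under a conventional global index table and, for contrast, under co-located index rows. The function names and the placement dictionaries are assumptions; the node placement itself follows the example.

```python
def plan_global(pk, old, new, main_home, index_home):
    """Operations for a table with a conventional global index table."""
    return [(main_home[pk], "update main row"),
            (index_home[old], "delete index row"),
            (index_home[new], "insert index row")]


def plan_local(pk, main_home):
    """Operations when index rows are co-located with the main table row."""
    node = main_home[pk]
    return [(node, "update main row"),
            (node, "delete index row"),
            (node, "insert index row")]


# Placement taken from the example: row col1=3 on node 1, col2=7 index row on
# node 2, new (5, 3) index row on node 1.
main_home = {3: "node1"}
index_home = {7: "node2", 5: "node1"}

global_nodes = {node for node, _ in plan_global(3, 7, 5, main_home, index_home)}
local_nodes = {node for node, _ in plan_local(3, main_home)}
```

The number of writes is the same in both plans; what changes is how many storage nodes must take part in the transaction.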

In addition, there is a problem in the related art that the index back table is blocked, for example, when the selection col3 from test where col 2=1 is executed, the index table idx2 of the storage node 1 is accessed first and then the main table test of the storage node 2 is accessed in the process of the back table. Firstly, locking an index table idx2, and then locking a main table test; when update test set col2 =5where col1=3 is executed, the primary table test of the storage node 1 is accessed at the time of updating, and the index table idx2 of the storage node 2 is accessed. The main table test is locked at present, and the index table idx2 is locked.

Thus the main table and the index table are locked in different orders in the two cases. Under concurrent execution, with the main table and the index table located on different nodes, distributed deadlock can easily occur.
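The lock-ordering hazard can be illustrated with a minimal sketch. It assumes, purely for illustration, that the query and the update happen to touch the same index row and the same main table row; the function name can_deadlock and the lock labels are hypothetical and do not come from the application.

```python
from itertools import combinations


def can_deadlock(order_a, order_b):
    """Return True if the two lock sequences take a shared pair of locks in opposite order."""
    pos_a = {lock: i for i, lock in enumerate(order_a)}
    pos_b = {lock: i for i, lock in enumerate(order_b)}
    shared = [lock for lock in order_a if lock in pos_b]
    for x, y in combinations(shared, 2):
        # Opposite relative order on a shared pair means each side can hold
        # one lock while waiting for the other.
        if (pos_a[x] < pos_a[y]) != (pos_b[x] < pos_b[y]):
            return True
    return False


# Hypothetical lock labels: the table-return query locks the index first,
# while the update locks the main table first.
select_order = ["idx2 row", "test row"]
update_order = ["test row", "idx2 row"]
print(can_deadlock(select_order, update_order))
```

When both statements lock in the same order, the function returns False, which is why a consistent locking order removes this class of deadlock.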

Furthermore, the related art described above may have a visibility problem with index table return. As shown in fig. 2, at the RR (repeatable read) isolation level, the following cases may occur:

(1) Transaction T1 committed before transaction T2 began, so the A record modified by transaction T1 is visible to transaction T2;

(2) Transaction T3 starts later than transaction T2, so the B record written by transaction T3 is not visible to transaction T2;

(3) When transaction T3 commits, transaction T2 has not yet committed, so the B record is also invisible to transaction T2;

(4) When transaction T2 modifies the B record, the B record becomes visible to it, because mysql applies modifications to the latest value; that is, transaction T2 modifies the B record written by transaction T3;

(5) That is, as shown in FIG. 3, in mysql, transaction T2 finds the B record modified by T3 through the undo log; an ordinary index needs to return to the main table (i.e., the clustered index of mysql) to determine transaction visibility;

(6) That is, as shown in FIG. 4, in mysql, transaction T2 finds the B record modified by T3 through the undo log; an ordinary index needs to return to the main table (i.e., the clustered index of mysql) to determine transaction visibility.

To solve the above technical problems, the embodiments of the present application implement the globally unique index by means of locally unique indexes; that is, the globally unique index function of a distributed database is provided by a scheme based on local index tables. Specifically, each storage node builds its own locally unique index internally, so that index data and main table data are located on the same node. When the globally unique index is created or updated, a distributed transaction is started and the locally unique index of each node is first created or updated; a global-level uniqueness check is then performed on the updated part of the globally unique index; if the check succeeds, the distributed transaction is committed, and otherwise it is rolled back.

The main application scenarios of the globally unique index function implemented in the embodiments of the present application include:

(1) In a business, in addition to the primary key, some other combination of columns is strictly required to be globally unique per row, and this requirement can only be met by a globally unique index;

(2) The business queries carry no conditional predicate on the partition key, and the business table is not being written with high concurrency at the same time; to avoid scanning all partitions, a globally unique index can be built on the query conditions.

In the above application scenarios, as shown in fig. 5, when the globally unique index is created in the embodiment of the present application, a separate unique index is created in each storage node. That is, the unique index in each storage node corresponds one-to-one with the data of that storage node, and index table data never corresponds to main table data on another node.

The unique index within each storage node is referred to as a locally unique index. To guarantee global uniqueness on top of local uniqueness, the updated index values must be checked for uniqueness after each data update operation.

Updating a table that has a globally unique index mainly comprises the following two steps:

(1) Execute the update operation on each storage node, and record the new index values;

(2) From the global perspective, execute a query on each node according to the new index values and check the uniqueness of the index. If a new value matches multiple rows, uniqueness is not satisfied, the transaction must be rolled back, and the update fails.
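The two steps can be sketched as follows. This is a minimal in-memory model, not the implementation of the application: each node's local unique index is assumed to be a dict mapping primary key to index value, and the name update_with_global_check is hypothetical.

```python
def update_with_global_check(local_indexes, node_id, pk, new_value):
    """Apply an index update on one node, then verify the new value is unique on all nodes.

    local_indexes: list with one dict per storage node, mapping primary key -> index value
    (an assumed layout used only for this sketch).
    """
    index = local_indexes[node_id]
    had_row = pk in index
    old_value = index.get(pk)          # remembered so the change can be undone

    # Step (1): execute the update on the owning node and record the new value.
    index[pk] = new_value

    # Step (2): query every node for the new value and count matching rows.
    matches = sum(1 for idx in local_indexes for v in idx.values() if v == new_value)
    if matches > 1:
        # Uniqueness violated: roll back to the previous state.
        if had_row:
            index[pk] = old_value
        else:
            del index[pk]
        return False
    return True


nodes = [{1: 9, 3: 7}, {2: 5}]
print(update_with_global_check(nodes, 0, 3, 5))  # new value 5 already exists on the other node
print(update_with_global_check(nodes, 0, 3, 6))  # new value 6 is unique
```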

The technical solutions provided in the present application and technical effects produced by the technical solutions of the present application are described below by describing several alternative embodiments. It should be noted that the following embodiments may be referred to, or combined with each other, and the description will not be repeated for the same terms, similar features, similar implementation steps, and the like in different embodiments.

Fig. 5 is a schematic diagram of an application environment of a data processing method according to an embodiment of the present application. The application environment may include a server 101 and a storage system 102. The server 101 sends data to be stored to the storage system 102, and the storage system 102 stores the data in a main table of a database. The data may be stored in at least one fragment of the main table, with each fragment placed on a different storage node; for example, fragment 1 is placed on storage node 1 and fragment 2 on storage node 2. Local unique index table 1 is created from the data stored in fragment 1, and local unique index table 2 from the data stored in fragment 2. The server 101 may subsequently send various requests, such as data query, data insertion and data update requests, and the storage system 102 can then serve each request within a single storage node, avoiding operations that span storage nodes.

The application environment shown in the embodiments of the present application is only one possible example, and is not intended to limit the embodiments of the present application.

Those skilled in the art will appreciate that the server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server or server cluster that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and basic cloud computing services such as big data and artificial intelligence platforms. The embodiments of the present application can be applied to various scenarios, including but not limited to cloud technology, artificial intelligence, intelligent traffic, audio and video, assisted driving and the like. The specific scenario may be determined based on actual application requirements and is not limited herein.

The embodiments of the present application may be applied to various scenarios including, but not limited to, cloud technology, artificial intelligence, intelligent transportation, assisted driving, and the like.

It should be noted that, in alternative embodiments of the present application, when related data such as object data is involved and the embodiments are applied to a specific product or technology, the permission or consent of the object must be obtained, and the collection, use and processing of the related data must comply with the relevant laws, regulations and standards of the relevant countries and regions.

Fig. 6a shows a flow chart of a data processing method provided in an embodiment of the present application. The method may be applied to a database that corresponds to at least two storage nodes. A main table is stored in the database; the main table includes a primary key field and a non-primary key field and is divided into a plurality of fragments, with one storage node storing one main table fragment. For example, as shown in fig. 7a and 7b, the database corresponds to two storage nodes, storage node 1 and storage node 2, and the main table is divided into main table fragment 1, stored in storage node 1, and main table fragment 2, stored in storage node 2.

On this basis, the data processing method may be executed by an electronic device, which may be a terminal device or the server shown in fig. 5. In this embodiment of the present application, the method is described taking the server as the executing entity, and may specifically include the following steps:

Step S601, determining the main table fragment stored in each storage node.

For the embodiments of the present application, a storage node is a component of a database cluster that is used to actually store business data. The primary table is typically a table storing data, and contains primary keys that are intended to be queried, which uniquely identify the data records in the table.

Specifically, the database corresponds to at least two storage nodes, and the main table is divided into at least two fragments by a preset sharding rule. The preset sharding rules may include: time-range sharding, which requires the sharding key to be a time-type field and supports sharding data by ranges such as year, month, day and hour; and numerical-range sharding, which is similar to time-range sharding, except that the data volume of each fragment is relatively balanced and there is relatively little hot-spot data. Other sharding rules may also be used, which is not limited in the embodiments of the present application.
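A range-based sharding rule of this kind can be sketched as a lookup over sorted range boundaries. The function name shard_for and the boundary values are assumptions made for illustration; the application does not specify how ranges are represented.

```python
from bisect import bisect_right
from datetime import date


def shard_for(value, upper_bounds):
    """Return the index of the fragment whose range contains value.

    upper_bounds is a sorted list of exclusive upper bounds; values at or
    above the last bound fall into one extra, final fragment.
    """
    return bisect_right(upper_bounds, value)


# Numerical-range sharding: fragment 0 holds values below 100, fragment 1 holds 100..199, and so on.
numeric_bounds = [100, 200]
print(shard_for(50, numeric_bounds), shard_for(100, numeric_bounds), shard_for(250, numeric_bounds))

# Time-range sharding by month on a date-typed sharding key.
month_bounds = [date(2024, 2, 1), date(2024, 3, 1)]
print(shard_for(date(2024, 1, 15), month_bounds), shard_for(date(2024, 2, 1), month_bounds))
```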

Specifically, the main table fragments stored in each storage node are typically allocated at table creation time. If a local unique index table is to be built in each storage node for its main table fragment, the main table fragment stored in each storage node must first be determined; for example, as shown in fig. 7a and fig. 7b, the main table fragment stored in storage node 1 is test1, and the main table fragment stored in storage node 2 is test2.

Step S602, for each storage node, a local unique index table creation operation is performed.

For the embodiment of the application, after the main table is created, the creation operation of the local unique index table can be executed for each storage node according to the main table fragments stored in the storage node; the local unique index table creation operation may also be performed based on an index creation request for a storage node in the database upon receipt of the index creation request.

It should be noted that the index creation request for storage nodes in the database may be a local index creation request for the main table fragment in every storage node, a local index creation request for the main table fragment in one particular storage node, or a local index creation request for the main table fragments in several storage nodes. When the request targets the main table fragment in every storage node, a local unique index table is created from the main table fragment in each storage node and stored in that node. When the request targets the main table fragment of one particular storage node, the corresponding local unique index table is created from the main table fragment in that node; for example, on receiving a local index creation request for main table fragment test1 of storage node 1, the local unique index table new_idx3 is created from main table fragment test1 in storage node 1.

Specifically, as shown in fig. 6b, the local unique index table creation operation in step S602 may include step S6021:

Step S6021, creating, based on the main table fragment stored in the storage node, a local unique index table in the storage node, so that when a query request for the database is received, each storage node queries the field data of the field to be queried in its own local unique index table and main table fragment.

Wherein, the query request carries a field to be queried; for each storage node, the primary key data of the primary key field in the local unique index table is the same as the primary key data of the primary key field in the corresponding primary table fragment.

For the embodiment of the present application, the local unique index table is created for each storage node in the above manner, that is, each storage node stores at least one main table fragment and the local unique index table corresponding to the main table fragment.

Further, as can be seen from the foregoing embodiments, a local unique index may be created when the main table is created. That is, when the main table test is created, the corresponding local unique index new_idx3 is created at each storage node by the following creation statement. As shown in fig. 8, the creation statement includes:

"create table test (
col1 int key,
col2 int,
col3 int,
unique index new_idx3 (col2) global)
distributed by hash (col1);"

Further, in the embodiment of the present application, the local index may also be created separately, that is, created on its own when the main table already exists, as shown in fig. 7a and 7b. In fig. 7a, col3 is set as the local index field of the local index table new_idx3 on the existing main table; in fig. 7b, col2 is used as the local index field of new_idx3. Taking fig. 7b as an example, the local index table is created by executing "create index new_idx3 on test (col2);". As shown in fig. 9, a creation transaction is started, all storage nodes then create their respective locally unique indexes new_idx3, and the uniqueness of the globally unique index is then checked by a statement of the form "select count(*) from test group by col2 for update;". If the globally unique index is unique, the creation transaction is committed; otherwise, the creation transaction is rolled back.
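The creation flow can be sketched with one in-memory SQLite database standing in for each storage node. This is an illustration under stated assumptions, not the distributed database of the application: SQLite has no "global" or "for update" syntax, the helper names make_node and create_global_unique_index are hypothetical, and dropping the index on failure stands in for rolling back the distributed creation transaction.

```python
import sqlite3


def make_node(rows):
    """Create one in-memory 'storage node' holding a fragment of table test (sketch only)."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE test (col1 INTEGER PRIMARY KEY, col2 INTEGER, col3 INTEGER)")
    conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
    return conn


def create_global_unique_index(nodes):
    """Create the local unique index on every node, then check global uniqueness of col2."""
    created = []
    try:
        for conn in nodes:
            # Fails on this node if its own fragment already contains duplicates.
            conn.execute("CREATE UNIQUE INDEX new_idx3 ON test (col2)")
            created.append(conn)
        # Global check: merge the per-node counts of each col2 value.
        counts = {}
        for conn in nodes:
            for value, n in conn.execute("SELECT col2, count(*) FROM test GROUP BY col2"):
                counts[value] = counts.get(value, 0) + n
        if any(n > 1 for n in counts.values()):
            raise ValueError("col2 is not globally unique")
    except (sqlite3.DatabaseError, ValueError):
        # Compensation standing in for the rollback of the creation transaction.
        for conn in created:
            conn.execute("DROP INDEX new_idx3")
        return False
    return True


unique_nodes = [make_node([(1, 9, 9), (3, 7, 2)]), make_node([(2, 4, 1)])]
print(create_global_unique_index(unique_nodes))
```

If two nodes each hold a row with the same col2 value, every local index is created successfully but the global check fails, and the indexes are dropped again.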

After the local unique index table is created, that is, once each storage node holds a main table fragment and the local unique index table corresponding to it, each storage node, on receiving a query request, queries the field data of the field to be queried in its own local unique index table and main table fragment. Specifically, the query request may be a first query request or a second query request. The field to be queried (query target field) in the first query request contains the primary key field and no non-primary key field; the field to be queried in the second query request contains a non-primary key field. That is, the first query request can be answered from the local index table alone, whereas the second query request must query both the local index table and the main table fragment. For example, let col1 be the primary key field, col2 the index field (local index field), and col3 a non-primary key field. If the first query request is "select col1 from test where col2=9;", the request is issued to every storage node, and each storage node looks up col2=9 in its own local index table. As shown in fig. 7b, storage node 1 finds col1=1 and returns (col2, col1) = (9, 1), while storage node 2 returns no result.

Specifically, the query is issued to all storage nodes, each node performs an index scan using its local unique index, and the results are then returned. Because each node executes independently, the final execution time equals the execution time of the node that actually holds the matching data (storage node 1 in the example).

For example, let col1 be the primary key field, col2 the index field (local index field), and col3 a non-primary key field. If the second query request is "select col3 from test where col2=9;", the request is issued to every storage node, and each storage node executes "select col3 from test where col2=9;". In each storage node, a fast lookup is performed using the local unique index new_idx3 (that is, in the local unique index table shown in fig. 7b). The index scan result (col2, col1) on storage node 1 is (9, 1); the table return is then performed within storage node 1 according to col1=1, that is, main table fragment 1 is queried, giving the query result (col1, col3) = (1, 9). The other nodes execute the query in the same way, but their query results are empty.

Further, in the embodiment of the present application, a query that requires an index table return is issued to all storage nodes; each node performs an index scan using its local unique index and returns the result. Because each node executes independently, the final execution time equals the execution time of the node that actually holds the matching data (storage node 1 in the example). In the conventional scheme, by contrast, an index scan must first be performed on the storage node holding the relevant fragment of the global index table to obtain the distribution key value of the main table, and the main table must then be scanned on the storage node holding the corresponding main table fragment according to that distribution key value to obtain the final result. Compared with the conventional scheme, the index table-return query of the embodiment of the present application therefore requires no cross-node table return, which reduces network overhead and query time.
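The node-local index scan followed by a node-local table return can be sketched as follows. The data layout is an assumption made for illustration: each node is a dict with an "index" mapping col2 to col1 and a "shard" mapping col1 to the full row. The row on node 1 follows the example values in the text; the row on node 2 is hypothetical.

```python
def query_col3_by_col2(nodes, col2_value):
    """Answer 'select col3 from test where col2 = ?' without leaving any node."""
    results = []
    for node in nodes:                          # the request is issued to every node
        col1 = node["index"].get(col2_value)    # local unique index scan: col2 -> col1
        if col1 is None:
            continue                            # this node returns an empty result
        row = node["shard"][col1]               # table return inside the same node
        results.append((col1, row["col3"]))
    return results


nodes = [
    {"index": {9: 1}, "shard": {1: {"col2": 9, "col3": 9}}},   # storage node 1
    {"index": {4: 2}, "shard": {2: {"col2": 4, "col3": 1}}},   # storage node 2 (hypothetical row)
]
print(query_col3_by_col2(nodes, 9))
```

The loop is sequential for simplicity; in the scheme described above the nodes execute independently, so the per-node work could run in parallel.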

Further, building on the foregoing embodiment, creating the local index table in each storage node from the main table fragment stored in that node not only avoids cross-node table-return queries, which would increase network overhead and latency, but also allows data update operations to be performed based on a data update request. As shown in fig. 6c, the data processing method may further include step Sa, step Sb and step Sc:

Step Sa, receiving a data update request for a database.

Wherein, the data update request carries field data of the primary key field; the data update request includes: a data insertion request or a data modification request.

Step Sb, determining a first target storage node corresponding to the field data of the primary key field from all the storage nodes.

Specifically, a mapping relationship between field data of the primary key field and storage nodes is stored in advance; that is, different field data of the primary key field may correspond to different storage nodes.

In this embodiment of the present application, if field data of a primary key field is carried in a data update request for a database, at this time, a target storage node may be determined according to the carried field data of a primary key field and a mapping relationship (mapping relationship between field data of a primary key field and a storage node) stored in advance. That is, a target storage node corresponding to the update operation corresponding to the current data update request is determined.

Further, if one data update request carries at least two pieces of primary key data, this indicates that at least two data records may need to be updated. In that case, a corresponding target storage node is determined for each piece of primary key data; that is, the at least two data update operations may target different records in the same storage node or records in different storage nodes. The embodiments of the present application are not limited thereto.
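Determining the target storage node for each piece of primary key data can be sketched as a grouping step. The routing rule used here, integer primary key modulo node count, is only an assumed stand-in for the pre-stored mapping relationship; the function name route is hypothetical.

```python
def route(primary_keys, node_count):
    """Group primary key values by target storage node.

    The modulo rule is a placeholder for the stored mapping between
    primary key data and storage nodes (an assumption of this sketch).
    """
    targets = {}
    for pk in primary_keys:
        targets.setdefault(pk % node_count, []).append(pk)
    return targets


# One request carrying several primary keys may touch one node or several.
print(route([1, 2, 3, 5], 2))
print(route([2, 4], 2))
```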

Step Sc, performing, based on the data update request, a data update operation on the local unique index table and the main table fragment stored in the first target storage node.

Further, the data update request also carries target data of the field to be updated; if the field to be updated includes a globally unique index field, the method further includes: recording target data of the globally unique index field; checking whether the target data is unique in the global unique index table; wherein the global unique index table is composed of each local unique index table; if the target data is unique in the global unique index table, submitting data updating operation; and if the target data is not unique in the globally unique index table, rolling back the data updating operation.

It should be noted that the data update request also carries the target data of the field to be updated. If the field to be updated includes a globally unique index field, that is, if the data update inserts or modifies field data of the globally unique index field, the updated data of the globally unique index field (the field data of the newly inserted globally unique index field, or of the modified globally unique index field) must be recorded first. Step Sb and step Sc are then executed. Finally, a uniqueness check is performed based on the recorded data, that is, it is determined whether the updated data is unique in the global unique index table, in other words, whether it corresponds to exactly one record.

If the updated data is unique, the data update operation is committed, which may also be referred to as committing the data update transaction: all updates made to the database in the transaction are written back to the physical database on disk, and the transaction ends normally. If the updated data is not unique, the data update operation is rolled back, which may also be referred to as rolling back the data update transaction: the system cancels all update operations already completed in the transaction, so that the database returns to its state at the start of the transaction.

Further, in the foregoing embodiment, each storage node may be accessed to determine whether the updated data of the globally unique index field is unique. To avoid the long access time caused by accessing every storage node, the uniqueness check may look up the updated data in the local unique index table of each storage node, so that the check is performed inside the storage nodes. In particular, the storage nodes in which the new index value may exist can be filtered out by a filtering algorithm, such as a Bloom filter algorithm, so that only these candidate storage nodes are examined.
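A Bloom-filter-based pre-filter of this kind can be sketched as follows. The application names the algorithm but gives no parameters, so the filter size, the number of hash functions, the use of SHA-256 and the names BloomFilter and candidate_nodes are all assumptions of this sketch. A Bloom filter can report false positives but never false negatives, so a node that really holds the value is never skipped.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter over the index values stored on one node (illustrative parameters)."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0                     # bit array packed into one integer

    def _positions(self, value):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def candidate_nodes(filters, new_value):
    """Return the ids of the nodes whose filter says the new index value may be present."""
    return [node_id for node_id, f in enumerate(filters) if f.might_contain(new_value)]


filters = [BloomFilter(), BloomFilter()]
for value in (9, 7):
    filters[0].add(value)                 # index values held by the first node
filters[1].add(4)                         # index value held by the second node
print(candidate_nodes(filters, 9))        # always includes node 0
```

Only the candidate nodes then need to run the exact lookup in their local unique index table.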

Specifically, the data update request is a data insertion request or a data modification request. The following specific embodiments describe how data insertion and its uniqueness check are performed based on a data insertion request, and how data modification and its uniqueness check are performed based on a data modification request.

Specifically, the local unique index table comprises a main key field and a global unique index field; the data update request comprises a data insertion request, wherein the data insertion request comprises at least one piece of data to be inserted, and each piece of data to be inserted comprises: main key data to be inserted into a main key field and non-main key data to be inserted into a non-main key field.

On the basis, determining a first target storage node corresponding to field data of a main key field from all storage nodes; based on the data updating request, performing a data updating operation on the local unique index table and the main table fragment stored in the first target storage node, specifically including: determining main key data in each piece of data to be inserted; determining a second target storage node corresponding to the main key data of each piece of data to be inserted according to each piece of data to be inserted; and for each piece of data to be inserted, inserting each piece of data to be inserted into the main table fragment of the second target storage node, determining the index data of the globally unique index field in each piece of data to be inserted, and inserting the index data and the main key data into the locally unique index table of the second target storage node.

Specifically, in the embodiment of the present application, when a data insertion request carrying at least one piece of data to be inserted is received, the index data of the globally unique index field in each piece of data to be inserted is recorded first.

Further, as can be seen from the above embodiment, a mapping relationship exists between the primary key data and the storage nodes, so the storage node holding the main table fragment into which each piece of data is to be inserted can be determined from the primary key data carried in that piece of data. For example, the main table fragment for data to be inserted 1 is located at storage node 1, and the main table fragment for data to be inserted 2 is located at storage node 2.

After the storage node corresponding to each piece of data to be inserted is determined, each piece of data is inserted into the main table fragment in that storage node, and the primary key data and index data of each piece of data are inserted into the local unique index table of that storage node. A uniqueness check is then performed based on the recorded index data: for data to be inserted whose index data is unique, the corresponding insert operation is committed; for data to be inserted whose index data is not unique, the corresponding insert operation is rolled back.

Based on the above embodiment, data insertion is described taking the insertion of one piece of data as an example. As shown in fig. 10, the data insertion request may be "insert into test (col1, col2, col3) values (5, 6);". A data insertion transaction is started, and the new globally unique index value of the inserted row, col2=6, is recorded. The insert operation is then performed, that is, storage node 1 performs the insert on the main table test and the local unique index new_idx3. The uniqueness of the globally unique index is then checked ("select count(*) from test group by col2=6 for update"); if the globally unique index is unique, the insert transaction is committed; otherwise, the insert transaction is rolled back.

It should be noted that, if the data insertion were performed by the prior art, an insert operation would be executed on the main table at storage node 1; the storage node holding col2=6 of the index table, storage node 2, would be determined from the new value 6 of col2, and storage node 2 would execute an insert operation on fragment 2 of the index table. Data insertion in this way causes a cross-node update, which is time-consuming. By contrast, the insertion scheme of the embodiment of the present application performs the insert within a single node based on the local unique index, avoiding the cross-node update and reducing time consumption.
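The insertion flow can be sketched with in-memory SQLite databases standing in for storage nodes. Several points are assumptions of this sketch rather than details from the application: the routing rule (col1 modulo node count), the helper names make_node and insert_row, the complete example row (5, 6, 7), and the use of a plain count query in place of a "for update" check, which SQLite does not support.

```python
import sqlite3


def make_node(rows):
    """One in-memory 'storage node' with a main table fragment and its local unique index."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # transactions are controlled explicitly
    conn.execute("CREATE TABLE test (col1 INTEGER PRIMARY KEY, col2 INTEGER, col3 INTEGER)")
    conn.execute("CREATE UNIQUE INDEX new_idx3 ON test (col2)")
    conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
    return conn


def insert_row(nodes, row):
    """Insert on the node owning the primary key, then commit only if col2 is globally unique."""
    col1, col2, _ = row
    target = nodes[col1 % len(nodes)]       # assumed routing rule
    target.execute("BEGIN")                 # start the insert transaction
    try:
        # A single insert maintains both the fragment and its local unique index.
        target.execute("INSERT INTO test VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        target.execute("ROLLBACK")          # local uniqueness already violated
        return False
    # Global check: count rows carrying the recorded new index value on every node.
    total = sum(
        conn.execute("SELECT count(*) FROM test WHERE col2 = ?", (col2,)).fetchone()[0]
        for conn in nodes
    )
    if total == 1:
        target.execute("COMMIT")
        return True
    target.execute("ROLLBACK")
    return False


nodes = [make_node([(2, 4, 1)]), make_node([(1, 9, 9), (3, 7, 2)])]
print(insert_row(nodes, (5, 6, 7)))   # hypothetical complete row; col2=6 is new everywhere
print(insert_row(nodes, (7, 4, 0)))   # col2=4 already exists on the other node
```

All writes happen on the single target node; only the uniqueness check reads from the other nodes.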

Further, the data update request may be a data modification request, which carries target data of at least one field to be modified and the primary key data of the primary key field corresponding to each field to be modified. On this basis, determining a first target storage node corresponding to field data of the primary key field from all storage nodes, and performing, based on the data update request, a data update operation on the local unique index table and the main table fragment stored in the first target storage node, may specifically include: determining the primary key data corresponding to each field to be modified; determining, for each field to be modified, a third target storage node corresponding to the primary key data; and, for each field to be modified, determining in the third target storage node the original data of the field in the main table fragment and modifying the original data to the target data.

Specifically, the data modification request carries at least one target data of a field to be modified and main key data of a main key field corresponding to each field to be modified; based on the above embodiments, it is known that a mapping relationship exists between the primary key data and the storage node, and at this time, the storage node corresponding to each field to be modified may be determined, so as to perform the modification operation in the storage node.

Specifically, if there are at least two field data of the field to be modified, it may include: modifying at least two data in the same field, or modifying field data of different fields, if modifying at least two data in the same field, the corresponding primary key data must be different, the modifying operation may be located in the same storage node or may be located in different storage nodes; if the field data of different fields are modified, the corresponding primary key data may be the same or different, if the corresponding primary key data is the same, the modifying operation is located in one storage node, and if the corresponding primary key data is different, the modifying operation may be located in the same storage node or in different storage nodes.

Further, for each field to be modified, the original data of the field is modified to target data in the primary table partition of the corresponding storage node.

Further, if the field to be modified includes a non-primary key field of the main table and does not include the globally unique index field, the modification is performed directly in the main table fragment.

Specifically, if the field to be modified includes a globally unique index field, the method further includes: determining the original data of the globally unique index field through the primary key data in the local unique index table of the third target storage node, and modifying the original data to the target data. That is, if the field to be modified includes the globally unique index field, the target data of the globally unique index field, i.e., its modified value, is recorded first, and the data in both the local unique index table and the main table fragment in the third target storage node is then modified.

Further, a uniqueness check must also be performed according to the recorded target data of the globally unique index field: if the target data is unique, the modification operation is committed; if it is not unique, the modification operation is rolled back. The method for performing the uniqueness check according to the target data of the globally unique index field is described in the above embodiment and is not repeated here.

Specifically, the data modification process is described by way of a specific example. As shown in fig. 11, the data modification request may be "update test set col2=5 where col1=3". First, the new value of the update is recorded: col2=5. Then the update operation is performed: according to col1=3, the target storage node of the update is determined to be storage node 1 (taking the tables shown in fig. 7b as an example), and the update operation is performed on the main table fragment and the index table on storage node 1. Next, a select for update operation is performed to check uniqueness according to the new value col2=5. Finally, whether to commit the transaction is decided according to the uniqueness check result: if the new value is unique, the data modification transaction is committed; otherwise, the data modification transaction is rolled back.
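The modification flow in this example can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation of the application: the in-memory dictionaries stand in for storage nodes, the routing rule in `node_for` is hypothetical, and only the rows (col1=1, col2=9, col3=9) and (col1=3, col2=7) come from the text, the other values being placeholders.

```python
import copy

# Assumed sample data; values not mentioned in the text are placeholders.
nodes = {
    1: {"table": {1: {"col2": 9, "col3": 9}, 3: {"col2": 7, "col3": 4}},
        "index": {(9, 1), (7, 3)}},          # local index rows: (col2, col1)
    2: {"table": {2: {"col2": 8, "col3": 6}},
        "index": {(8, 2)}},
}


def node_for(col1):
    """Hypothetical routing rule: odd primary keys -> node 1, even -> node 2."""
    return 1 if col1 % 2 == 1 else 2


def update_col2(nodes, col1, new_col2):
    """Run 'update test set col2=<new_col2> where col1=<col1>' as one transaction."""
    recorded = new_col2                               # record the new value first
    node_id = node_for(col1)                          # single target node
    node = nodes[node_id]
    snapshot = copy.deepcopy(node)                    # kept only for rollback
    old_col2 = node["table"][col1]["col2"]
    node["table"][col1]["col2"] = recorded            # main table fragment
    node["index"].discard((old_col2, col1))           # local unique index,
    node["index"].add((recorded, col1))               # same node as the table
    # Uniqueness check over all local index tables (the "select for update" step).
    hits = sum(1 for n in nodes.values()
               for col2, _ in n["index"] if col2 == recorded)
    if hits == 1:
        return True                                   # commit
    nodes[node_id] = snapshot                         # roll back
    return False


print(update_col2(nodes, 3, 5))   # 5 is not used by any other row
print(update_col2(nodes, 3, 8))   # 8 already exists on node 2
```

Both writes of a successful update land on the node that owns the row; the other nodes are only read during the uniqueness check.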

Furthermore, it should be noted that if the data modification transaction were executed through the existing scheme, an update operation would need to be executed on the main table at storage node 1, then a delete operation would be executed at storage node 2 on fragment 2 of the index new_idx3 according to the old value col2=7 before the update, and then an insert operation would be executed at storage node 1 on fragment 1 of the index new_idx3 according to the new value col2=5 after the update. Therefore, compared with the existing scheme, the modification operation of the embodiment of the application avoids cross-node write operations and also avoids write amplification.
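The comparison can be made concrete by listing the writes each scheme issues for this one update. The sketch below only counts operations; the function names are hypothetical and the node placements are the ones stated in the example.

```python
def writes_global_index(main_node, old_index_node, new_index_node):
    """Writes needed when the index is distributed separately from the main table."""
    return [
        ("update main table row", main_node),
        ("delete old index row", old_index_node),
        ("insert new index row", new_index_node),
    ]


def writes_local_index(main_node):
    """Writes needed when each node indexes only its own main table fragment."""
    return [
        ("update main table row", main_node),
        ("update local index row", main_node),
    ]


# Placements taken from the example: the row col1=3 lives on node 1, the old
# index value col2=7 on node 2 and the new index value col2=5 on node 1.
existing = writes_global_index(main_node=1, old_index_node=2, new_index_node=1)
proposed = writes_local_index(main_node=1)

for name, writes in (("existing", existing), ("proposed", proposed)):
    touched = sorted({node for _, node in writes})
    print(name, "writes:", len(writes), "nodes touched:", touched)
```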

In the above embodiment, it is described that if the field to be modified includes the globally unique index field, the field data is directly modified in the locally unique index table of the third storage node. In another possible implementation, if the field to be modified includes a globally unique index field; the method further comprises the steps of: determining a target data bar where the primary key data is located through the primary key data in a local unique index table of the third storage node; deleting all data in the target data bar in the local unique index table; and inserting target data of the globally unique index field and primary key data in the locally unique index table. That is, it can be determined which piece of data the data to be modified belongs to in the local unique index table of the third storage node through the primary key data, the piece of data can be directly deleted, and then the target data of the global unique index field and the primary key data are inserted. Of course, in this implementation, it is also necessary to record the target data of the globally unique index field first, and perform a uniqueness check based on the recorded target data of the globally unique index field, submit the modification operation if it is unique, and roll back the modification operation if it is not unique. The method for performing the uniqueness check according to the target data of the global unique index field may be specifically described in the above embodiment, and will not be described herein again.
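A minimal sketch of this delete-then-insert variant follows. The list of (col2, col1) tuples is an assumed stand-in for the local unique index table of the third storage node, and `replace_index_row` is a hypothetical name; the recording of the target data and the uniqueness check are omitted here.

```python
def replace_index_row(index_rows, col1, new_col2):
    """Delete the index row that carries primary key col1, then insert the new row.

    index_rows is a list of (col2, col1) tuples standing in for the local
    unique index table of the target storage node.
    """
    kept = [row for row in index_rows if row[1] != col1]   # delete the target row
    kept.append((new_col2, col1))                          # insert target data + primary key
    return kept


local_index = [(9, 1), (7, 3)]          # assumed contents of new_idx3 on node 1
local_index = replace_index_row(local_index, col1=3, new_col2=5)
print(local_index)
```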

Further, as shown in fig. 12, the method may be applied to a database, where the database corresponds to at least two storage nodes, and a primary table is stored in the database, and includes a primary key field and a non-primary key field, where the primary table is divided into a plurality of segments, and one storage node stores one primary table segment, and the data processing method may be performed by an electronic device, where the electronic device may be a terminal device, or may be a server as shown in fig. 5, and in this embodiment of the present application, a server is described as an example, and the data processing method may specifically include the following steps:

Step S1201, a data query request for a database is acquired.

The data query request comprises a target data index to be queried and a field to be queried; the database is provided with at least two storage nodes, each storage node is provided with a main table fragment of the main table and a local unique index table corresponding to the main table fragment, and the local unique index table comprises data indexes of field data of the main key field in the main table fragment.

Step S1202, if the field to be queried is a non-primary key field, querying, by each storage node, field data of a primary key field corresponding to the target data index in each local unique index table according to the target data index, and querying, in the stored primary table fragment, field data of the field to be queried based on the queried field data of the primary key field.

For example, if the data query request for the database is "select col3 from test where col2=9;", all storage nodes execute "select col3 from test where col2=9;". In each storage node, a fast lookup is performed according to the local unique index new_idx3. The lookup result on storage node 1 is (col2, col1) = (9, 1), and the table return is then performed within storage node 1 according to col1=1, where the query result on the main table test fragment is (col1, col3) = (1, 9). The query results of the other storage nodes are empty, and the result col3=9 is then returned, as shown in fig. 13a. The index table on which the data query depends is shown in fig. 7b.

Further, in the existing related technical scheme, an index scan must first be performed on the storage node holding the corresponding fragment of the global index table to obtain the distribution key value of the main table; the main table is then scanned on the storage node holding the corresponding main table fragment according to that distribution key value to obtain the final result. In the present application, each storage node stores one main table fragment of the main table and a local unique index table corresponding to that main table fragment, and the local unique index table includes the data indexes of all field data of the primary key field in the main table fragment. On this basis, each storage node queries the field data of the primary key field corresponding to the target data index in its own local unique index table, and queries the field data of the field to be queried in its stored main table fragment based on the queried field data of the primary key field. Cross-node table return is thereby avoided, which reduces network overhead and query time.

Further, if the field to be queried is the primary key field, each storage node queries field data of the field to be queried in a corresponding local index table according to the target data index.

For example, if the data query request for the database is "select col1 from test where col2=9;", all storage nodes execute "select col1 from test where col2=9;". In each storage node, a fast lookup is performed according to the local unique index new_idx3. The lookup result on storage node 1 is (col2, col1) = (9, 1), and the query results of the other storage nodes are empty. The result col1=1 is then returned, as shown in fig. 13b. The index table relied upon is shown in fig. 7b.
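The two query paths described above (a non-primary key field, which needs a table return, and the primary key field, which is answered from the index alone) can be sketched as follows. The in-memory layout and the `table_reads` log are assumptions made for illustration; only the rows (col1=1, col2=9, col3=9) and (col1=3, col2=7) come from the text, the remaining values are placeholders.

```python
# Assumed sample data; values not mentioned in the text are placeholders.
nodes = {
    1: {"table": {1: {"col2": 9, "col3": 9}, 3: {"col2": 7, "col3": 4}},
        "index": {9: 1, 7: 3}},      # new_idx3 on node 1: col2 -> col1
    2: {"table": {2: {"col2": 8, "col3": 6}},
        "index": {8: 2}},            # new_idx3 on node 2: col2 -> col1
}
table_reads = []   # records which node read its main table fragment


def query_on_node(node_id, col2_value, field):
    node = nodes[node_id]
    col1 = node["index"].get(col2_value)     # lookup in the local unique index
    if col1 is None:
        return None                          # this node has no matching row
    if field == "col1":
        return col1                          # primary key is already in the index
    table_reads.append(node_id)              # table return, on the same node
    return node["table"][col1][field]


def query(col2_value, field):
    """Send the query to every node and keep the non-empty results."""
    results = [query_on_node(node_id, col2_value, field) for node_id in nodes]
    return [r for r in results if r is not None]


print(query(9, "col3"), table_reads)   # non-primary key field
table_reads.clear()
print(query(9, "col1"), table_reads)   # primary key field
```

In both paths the node that finds the index entry is also the node that holds the row, so no request is forwarded to another node.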

In the following embodiment, the data processing method of the embodiments of the present application is described by way of a specific example. As is known from the foregoing embodiments, the database corresponds to at least two storage nodes, a main table is stored in the database, and the main table is divided into at least two main table fragments. First, a local unique index may be created in each storage node based on the main table fragment in that storage node. Specifically, the local unique index may be created in each storage node when the main table is created, or it may be created in each storage node when the main table already exists.

Specifically, the local unique index creation may be performed by creating a local unique index new_idx3 in the storage node 1 and the storage node 2, respectively, as shown in fig. 8;

wherein the index creation statement comprises:

"table test(
col1 int key,
col2 int,
col3 int,
unique index new_idx3(col3) global)
distributed by hash(col1);"

Further, the index creation may also be performed when the main table already exists, that is, "create index new_idx3 on test (col2);". As shown in fig. 9, the creation transaction is started, and all storage nodes then create their respective local unique indexes new_idx3. The uniqueness of the globally unique index is then checked by the statement "select count(x) from test group by col2 for update;". If the globally unique index is unique, the creation transaction is committed; otherwise, the creation transaction is rolled back. The local unique indexes new_idx3 created in storage node 1 and storage node 2, respectively, are shown in fig. 7b.
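The creation flow on an existing main table can be sketched as below: every node builds an index from its own fragment, and the transaction commits only if no indexed value occurs more than once across all nodes. The function name, the dictionary layout and the sample rows are assumptions for illustration.

```python
from collections import Counter


def create_local_indexes(fragments):
    """Build one local index per node, then commit only if col2 is globally unique.

    fragments maps a node id to its main table fragment, given as {col1: col2}.
    Returns the per-node index rows (col2, col1) on commit, or None on rollback.
    """
    indexes = {
        node_id: sorted((col2, col1) for col1, col2 in rows.items())
        for node_id, rows in fragments.items()
    }
    # Global check: count every col2 value across all local indexes.
    counts = Counter(col2 for rows in indexes.values() for col2, _ in rows)
    if any(count > 1 for count in counts.values()):
        return None          # roll back the creation transaction
    return indexes           # commit


ok = create_local_indexes({1: {1: 9, 3: 7}, 2: {2: 8}})
clash = create_local_indexes({1: {1: 9, 3: 7}, 2: {2: 9}})
print(ok)
print(clash)
```

The second call shows why a per-node unique constraint is not sufficient on its own: each fragment is unique locally, yet the value 9 appears on both nodes.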

Further, when there is a local unique index in each storage node, data insertion may also be performed. As shown in fig. 10, the data insertion request may be "insert into test (col1, col2, col3) values (5, 6);". A data insertion transaction is then started, and the new value of the globally unique index to be inserted is recorded: col2=6. The insert operation is then performed, that is, storage node 1 performs the insert on the main table test and on the local unique index new_idx3. The uniqueness of the globally unique index is then checked ("select count(x) from test group by col2=6 for update"). If the globally unique index is unique, the insert transaction is committed; otherwise, the insert transaction is rolled back.
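The insertion flow can be sketched in the same style. The routing rule, the in-memory layout and the existing rows are assumptions; the col3 value passed below is a placeholder, since the example statement does not show it.

```python
# Assumed sample data; values not mentioned in the text are placeholders.
nodes = {
    1: {"table": {1: {"col2": 9, "col3": 9}}, "index": [(9, 1)]},   # (col2, col1)
    2: {"table": {2: {"col2": 8, "col3": 6}}, "index": [(8, 2)]},
}


def node_for(col1):
    """Hypothetical routing rule: odd primary keys -> node 1, even -> node 2."""
    return 1 if col1 % 2 == 1 else 2


def insert_row(nodes, col1, col2, col3):
    """Insert one row as a transaction; returns True on commit, False on rollback."""
    recorded = col2                                   # record the new index value
    node = nodes[node_for(col1)]                      # node chosen by primary key
    if col1 in node["table"]:
        raise ValueError("duplicate primary key")
    node["table"][col1] = {"col2": col2, "col3": col3}   # main table fragment
    node["index"].append((col2, col1))                   # local unique index
    # Uniqueness check over all local index tables.
    hits = sum(1 for n in nodes.values()
               for value, _ in n["index"] if value == recorded)
    if hits == 1:
        return True                                   # commit
    del node["table"][col1]                           # roll back both writes
    node["index"].remove((col2, col1))
    return False


print(insert_row(nodes, 5, 6, 0))   # col2=6 is new; col3=0 is a placeholder
print(insert_row(nodes, 7, 8, 0))   # col2=8 already exists on node 2
```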

Further, when there is a local unique index in each storage node, data modification may also be performed. As shown in fig. 11, the data modification request may be "update test set col2=5 where col1=3". First, the new value of the update is recorded: col2=5. Then the update operation is performed: according to col1=3, the target storage node of the update is determined to be storage node 1, and the update operation is performed on the main table and the index table on storage node 1. Next, a select for update operation is performed to check uniqueness according to the new value col2=5. Finally, whether to commit the transaction is decided according to the uniqueness check result: if the new value is unique, the data modification transaction is committed; otherwise, the data modification transaction is rolled back.

Further, when there is a local unique index in each storage node, a data query may also be performed. If the data query request for the database is "select col1 from test where col2=9;", all storage nodes execute "select col1 from test where col2=9;". In each storage node, a fast lookup is performed according to the local unique index new_idx3; the lookup result on storage node 1 is (col2, col1) = (9, 1), and the query results of the other storage nodes are empty. The result col1=1 is then returned, as shown in fig. 13b.

If the data query request for the database is "select col3 from test where col2=9;", all storage nodes execute "select col3 from test where col2=9;". In each storage node, a fast lookup is performed according to the local unique index new_idx3. The lookup result on storage node 1 is (col2, col1) = (9, 1), and the table return is then performed within storage node 1 according to col1=1, where the query result on the main table test fragment is (col1, col3) = (1, 9). The query results of the other storage nodes are empty, and the result col3=9 is then returned, as shown in fig. 13a.

Further, in the above embodiment, for both update operations and query operations, the table return from the index is performed on the same node, so an inconsistent locking order between concurrent updates does not occur, that is, no distributed deadlock occurs. Furthermore, since the index is a local unique index, the index and the main table are located on the same node, and therefore the visibility judgment for the table return from the index poses no problem.

Based on the same principle as the data processing method provided by the embodiment of the application, the embodiment of the application also provides a data processing device which is applied to a database, wherein the database corresponds to at least two storage nodes, a main table is stored in the database, and the main table comprises a main key field and a non-main key field; the master table is divided into a plurality of fragments and a storage node stores one master table fragment, as shown in fig. 14, the apparatus 1400 may include: a main table fragment determination module 1401, and a create operation determination module 1402, wherein,

A main table fragment determining module 1401, configured to determine a main table fragment stored in each storage node;

a creating operation determining module 1402, configured to perform, for each storage node, a local unique index table creating operation;

the creating operation determining module 1402 is specifically configured to, when executing the local unique index table creating operation:

creating a local unique index table in the storage node based on the main table fragment stored in the storage node, so that when a query request for a database is received, each storage node queries field data of a field to be queried in the corresponding local unique index table and the main table fragment.

In one possible implementation manner of the embodiment of the present application, the apparatus 1400 further includes: a receiving module, a storage node determining module and a data updating operation module, wherein,

the receiving module is used for receiving a data updating request aiming at the database; wherein, the data update request carries field data of the primary key field; the data update request includes: a data insertion request or a data modification request;

The storage node determining module is used for determining a first target storage node corresponding to field data of the main key field from all the storage nodes;

and the data updating operation module is used for executing data updating operation on the local unique index table and the main table fragment stored in the first target storage node based on the data updating request.

Another possible implementation manner of the embodiment of the present application, the local unique index table includes a primary key field and a global unique index field; the data update request comprises a data insertion request, wherein the data insertion request comprises at least one piece of data to be inserted, and each piece of data to be inserted comprises: main key data to be inserted into a main key field and non-main key data to be inserted into a non-main key field;

determining main key data in each piece of data to be inserted;

the data updating operation module is specifically configured to, when executing a data updating operation on the local unique index table and the main table fragment stored in the first target storage node based on the data updating request:

And for each piece of data to be inserted, inserting each piece of data to be inserted into the main table fragment of the second target storage node, determining the index data of the globally unique index field in each piece of data to be inserted, and inserting the index data and the main key data into the locally unique index table of the second target storage node.

In another possible implementation manner of the embodiment of the present application, the data update request includes a data modification request, where the data modification request carries target data of at least one field to be modified, and primary key data of a primary key field corresponding to each field to be modified;

determining primary key data corresponding to each field to be modified;

for each field to be modified, in the third storage node, the original data of each field to be modified in the main table fragment is determined, and the original data is modified into target data.

In another possible implementation manner of the embodiment of the present application, if the field to be modified includes a globally unique index field; the apparatus 1400 further comprises: the data determining module and the modifying module, wherein,

the data determining module is used for determining the original data of the global unique index field through the primary key data in the local unique index table of the third storage node;

and the modification module is used for modifying the original data into target data.

In another possible implementation manner of the embodiment of the present application, if the field to be modified includes a globally unique index field; the apparatus 1400 further comprises: a data bar determining module, a deleting module and an inserting module, wherein,

the data strip determining module is used for determining a target data strip where the primary key data is located through the primary key data in the local unique index table of the third storage node;

and the inserting module is used for inserting the target data of the globally unique index field and the primary key data into the locally unique index table.

In another possible implementation manner of the embodiment of the present application, the data update request further carries target data of a field to be updated; if the field to be updated includes a globally unique index field, the apparatus 1400 further includes: a recording module, a uniqueness checking module, an operation submitting module and an operation rollback module, wherein,

the uniqueness checking module is used for checking whether the target data is unique in the global unique index table; wherein the global unique index table is composed of each local unique index table;

the operation submitting module is used for submitting data updating operation when the target data is unique in the global unique index table;

and the operation rollback module is used for rollback data updating operation when the target data is not unique in the globally unique index table.

The embodiment of the application provides a data processing device which is applied to a database, wherein the database corresponds to at least two storage nodes, a main table is stored in the database, and the main table comprises a main key field and a non-main key field; in this embodiment, by determining the main table fragment stored in each storage node and performing a local unique index table creation operation for each storage node, that is, for each storage node, based on the main table fragment stored in the storage node, a local unique index table is created in the storage node, and for each storage node, the main key data of the main key field in the local unique index table is the same as the main key data of the main key field in the corresponding main table fragment, so that when a query request for a database is received, the field data of the field to be queried is queried in the corresponding local unique index table and main table fragment by each storage node, thereby performing index query and main table query in the same node, avoiding the condition of cross-node access, further reducing network overhead and improving network performance.

The embodiment of the present application further provides another data processing apparatus, as shown in fig. 15, an apparatus 1500 may include: a request acquisition module 1501, and a first query module 1502, wherein,

a request obtaining module 1501, configured to obtain a data query request for a database, where the data query request includes a target data index to be queried and a field to be queried; the method comprises the steps that a main table is stored in a database, the main table comprises a main key field and a non-main key field, the database is correspondingly provided with at least two storage nodes, each storage node is stored with a main table fragment of the main table and a local unique index table corresponding to the main table fragment, and the local unique index table comprises data indexes of field data of the main key field in the main table fragment;

the first query module 1502 is configured to, when the field to be queried is a non-primary key field, query, by each storage node, field data of a primary key field corresponding to the target data index in a respective local unique index table according to the target data index, and query, in the stored primary table partition, field data of the field to be queried based on the queried field data of the primary key field.

In another possible implementation manner of the embodiment of the present application, the apparatus 1500 may further include: a second query module, configured to, when the field to be queried is a primary key field, query, by each storage node, field data of the field to be queried in the corresponding local index table according to the target data index.

It should be noted that, the first query module 1502 and the second query module may be the same query module, or may be different query modules, which is not limited in the embodiment of the present application.

Compared with the related art, the embodiment of the application provides another data processing method. A data query request for a database is acquired, where the data query request includes a target data index to be queried and a field to be queried. The database stores a main table that includes a primary key field and a non-primary key field, the database corresponds to at least two storage nodes, and each storage node stores one main table fragment of the main table and a local unique index table corresponding to that main table fragment. The local unique index table includes the data index of each piece of field data of the primary key field in the main table fragment, so the primary key data in the local unique index table is the same as the primary key data of the main table fragment. On this basis, if the field to be queried is a non-primary key field, each storage node queries, in its own local unique index table, the field data of the primary key field corresponding to the target data index, and then queries the field data of the field to be queried in its stored main table fragment based on the queried field data of the primary key field. The index query and the main table query are thus performed in the same node, which avoids cross-node table return and reduces network overhead and query time.

The apparatus of the embodiments of the present application may perform the method provided by the embodiments of the present application, and implementation principles of the method are similar, and actions performed by each module in the apparatus of each embodiment of the present application correspond to steps in the method of each embodiment of the present application, and detailed functional descriptions of each module of the apparatus may be referred to in the corresponding method shown in the foregoing, which is not repeated herein.

Fig. 16 shows a schematic structural diagram of an electronic device to which the embodiment of the present application is applicable. As shown in fig. 16, the electronic device may be a server or a user terminal, and may be used to implement the method provided in any embodiment of the present application.

As shown in fig. 16, the electronic device 2000 may mainly include at least one processor 2001 (one is shown in fig. 16), a memory 2002, a communication module 2003, an input/output interface 2004, and the like; optionally, the components may communicate with each other through a bus 2005. It should be noted that the structure of the electronic device 2000 shown in fig. 16 is merely schematic, and does not limit the electronic device to which the method provided in the embodiment of the present application is applicable.

The memory 2002 may be used to store an operating system, application programs, and the like, which may include computer programs that implement the methods of the embodiments of the present application when called by the processor 2001, and may also include programs for implementing other functions or services. The memory 2002 may be, but is not limited to, a ROM (Read Only Memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or other type of dynamic storage device that can store information and computer programs, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.

The processor 2001 is connected to the memory 2002 via the bus 2005, and executes a corresponding function by calling an application program stored in the memory 2002. The processor 2001 may be a CPU (Central Processing Unit), a general purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and may implement or perform the various exemplary logic blocks, modules, and circuits described in connection with the present disclosure. The processor 2001 may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, a combination of a DSP and a microprocessor, etc.

The electronic device 2000 may be connected to a network through the communication module 2003 (which may include, but is not limited to, components such as a network interface), so as to communicate with other devices (such as user terminals or servers) through the network and exchange data with them, for example by sending data to or receiving data from the other devices. The communication module 2003 may include a wired network interface and/or a wireless network interface, that is, the communication module may include at least one of a wired communication module or a wireless communication module.

The electronic device 2000 may be connected to a desired input/output device, such as a keyboard, a display device, etc., through an input/output interface 2004, and the electronic device 2000 itself may have a display device, or may be externally connected to other display devices through the interface 2004. Optionally, a storage device, such as a hard disk, may be connected to the interface 2004, so that data in the electronic device 2000 may be stored in the storage device, or data in the storage device may be read, and data in the storage device may be stored in the memory 2002. It will be appreciated that the input/output interface 2004 may be a wired interface or a wireless interface. The device connected to the input/output interface 2004 may be a component of the electronic device 2000 or may be an external device connected to the electronic device 2000 when necessary, depending on the actual application scenario.

The bus 2005, which is used to connect the various components, may include a path to transfer information between the components. The bus 2005 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 2005 can be classified into an address bus, a data bus, a control bus, and the like according to function.

Optionally, for the solutions provided in the embodiments of the present application, the memory 2002 may be used to store a computer program for executing the solutions of the present application, and the processor 2001 executes the computer program to implement the actions of the methods or apparatuses provided in the embodiments of the present application.

Based on the same principle as the method provided by the embodiment of the present application, the embodiment of the present application provides a computer readable storage medium, where a computer program is stored, where the computer program can implement the corresponding content of the foregoing method embodiment when executed by a processor.

Embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements the respective aspects of the foregoing method embodiments.

It should be noted that the terms "first," "second," "third," "fourth," "1," "2," and the like in the description and claims of this application and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the present application described herein may be implemented in other sequences than those illustrated or otherwise described.

It should be understood that, although the flowcharts of the embodiments of the present application indicate the respective operation steps by arrows, the order of implementation of these steps is not limited to the order indicated by the arrows. In some implementations of embodiments of the present application, the implementation steps in the flowcharts may be performed in other orders as desired, unless explicitly stated herein. Furthermore, some or all of the steps in the flowcharts may include multiple sub-steps or multiple stages based on the actual implementation scenario. Some or all of these sub-steps or phases may be performed at the same time, or each of these sub-steps or phases may be performed at different times, respectively. In the case of different execution time, the execution sequence of the sub-steps or stages may be flexibly configured according to the requirement, which is not limited in the embodiment of the present application.

In the present embodiment, the term "module" or "unit" refers to a computer program or a part of a computer program having a predetermined function, and works together with other relevant parts to achieve a predetermined object, and may be implemented in whole or in part by using software, hardware (such as a processing circuit or a memory), or a combination thereof. Also, a processor (or multiple processors or memories) may be used to implement one or more modules or units. Furthermore, each module or unit may be part of an overall module or unit that incorporates the functionality of the module or unit.

The foregoing is merely an optional implementation of some implementation scenarios of the application. It should be noted that, for those skilled in the art, other similar implementations adopted on the basis of the technical ideas of the application, without departing from those technical ideas, also fall within the protection scope of the embodiments of the application.

Claims

1. A data processing method, characterized in that the method is applied to a database, wherein the database corresponds to at least two storage nodes, a main table is stored in the database, and the main table comprises a main key field and a non-main key field; the main table is divided into a plurality of fragments, and one storage node stores one main table fragment, the method comprising:

determining a main table fragment stored in each storage node;

executing a local unique index table creation operation for each storage node;

the local unique index table creation operation comprises:

2. The method according to claim 1, wherein the method further comprises:

3. The method of claim 2, wherein the local unique index table includes a primary key field and a globally unique index field; the data update request comprises a data insertion request, the data insertion request comprises at least one piece of data to be inserted, and each piece of data to be inserted comprises: primary key data to be inserted into the primary key field and non-primary key data to be inserted into the non-primary key field;

determining the primary key data in each piece of data to be inserted;

4. The method according to claim 2, wherein the data update request includes a data modification request, the data modification request carrying target data of at least one field to be modified, and primary key data of a primary key field corresponding to each field to be modified;

determining the primary key data corresponding to each field to be modified;

5. The method of claim 4, wherein, if the field to be modified comprises a globally unique index field, the method further comprises:

and modifying the original data into the target data.

6. The method of claim 4, wherein, if the field to be modified comprises a globally unique index field, the method further comprises:

deleting all data of the target data record in the local unique index table;

7. The method according to claim 2, wherein the data update request further carries target data of a field to be updated; if the field to be updated includes a globally unique index field, the method further includes:

recording target data of the globally unique index field;

8. A method of data processing, the method comprising:

9. The method of claim 8, wherein the method further comprises:

and if the field to be queried is a primary key field, querying field data of the field to be queried in a corresponding local index table by each storage node according to the target data index.

10. A data processing device, characterized in that the device is applied to a database, wherein the database corresponds to at least two storage nodes, a main table is stored in the database, and the main table comprises a primary key field and a non-primary key field; the main table is divided into a plurality of fragments, and a storage node stores one main table fragment, the device comprising:

11. A data processing device, the device comprising:

12. An electronic device comprising a memory in which a computer program is stored and a processor which, when running the computer program, performs the data processing method of any one of claims 1 to 9.

13. A computer-readable storage medium, characterized in that the storage medium has stored therein a computer program which, when executed by a processor, implements the data processing method of any one of claims 1 to 9.