Detailed Description
First, some terms appearing in the embodiments of the present application will be described:
key-value pairs consist of keys and values. In the Key-Value storing device 20, data is stored in the form of a Key-Value pair, with a Key as an index of Value, which is the stored data. The keys may be divided into big keys (big keys) and base keys (base keys) according to key lengths. The key length refers to the size of the key.
The large key means a key having a key length exceeding a prescribed length. In one implementation, the prescribed length is the maximum key length allowed by the key-value storage device, such as 1 Kilobyte (KB), 2KB, or 4KB. It should be understood that the specific value of the specified length may be set according to practical situations, and may be greater than 1KB or less than 1KB. The length unit of the big key may be a unit of storage in KB, megabytes (MB), gigabytes (GB) or more. Since the large key does not meet the storage requirement of the key-value storage device 20, the large key cannot be directly stored into the key-value storage device 20.
The basic key means a key having a key length not exceeding a prescribed length.
The basic key value pair is composed of a basic key and a value, wherein the basic key is used for searching a value corresponding to the basic key in the key value storage device.
The slice identifier is used to identify which slice, for example, the slice identifier may be a number, for example, a slice sequence number. The tile identifier may also be a combination of one or more characters and numbers or other types of identifiers.
The global identity is used to index the shards. Each tile has a unique global identity in the key-value database. The global identification may be a universally unique identifier (universally unique identifier, UUID). Specifically, the key of the first base key value may not include a global identification, and the value of the last base key value may not include a global identification. If there are n base key value pairs in total, then the base key value pairs between the first base key value pair and the last base key value pair include the 2 nd base key value pair through the n-1 th base key value pair. The keys and values of the ith basic key value pair respectively comprise a global identifier, the global identifier of the ith basic key value pair is the same as the global identifier of the (i-1) th basic key value pair, the global identifier of the ith basic key value pair is the same as the global identifier of the (i+1) th basic key value pair, and i is an integer and 1< i < n. Thus, the value of the basic key value pair can be found according to the key of each basic key value pair, the global identification included by the key of the next basic key value pair can be obtained from the value of each basic key value pair, and thus the key of the next basic key value pair is obtained, and all basic key value pairs can be found by sequentially executing the steps.
The following describes an application scenario of the present application, where the key-value pair processing method of the present application may be applied to a data processing system. Referring to fig. 1A and 1B, a data processing system may include a data processing apparatus 10 and a key value storage device 20.
The data processing apparatus 10 may be a host, a server or an electronic device with computing capabilities. The key value store 20 may be a stand-alone storage server or a distributed storage server. The key-value storage device 20 may include one or more magnetic disks, or one or more Solid State Disks (SSD) STATE DISK, among others. The data processing apparatus 10 and the key value storage device 20 may be directly connected via a communication cable, or may be connected via a wired network or a wireless network.
An application program is deployed on the data processing apparatus 10, and can communicate with the database program. The user may input a key or a key value pair to the application program, and after the data processing apparatus 10 acquires the key or the key value pair, the data processing apparatus 10 may transmit a read operation and a write operation to the key value storage device 20. Or the user inputs a key or a key value pair in the database program, the data processing apparatus 10 may send a read operation and a write operation to the key value storage device 20 after the data processing apparatus 10 acquires the key or the key value pair. The user input mode may be a mouse input, a voice input, a touch input, etc.
Referring to fig. 1A, to perform a write operation, the data processing apparatus 10 supplies keys and values to the key-value storing device 20 to write key-value pairs to the key-value storing device 20, the keys being indexes of the values.
Referring to fig. 1B, to perform a read operation, the data processing apparatus 10 supplies keys to the key-value storing device 20, the key-value storing device 20 finds a value according to the keys, and supplies the value to the data processing apparatus 10. Thus in a key-value store system, a key is an index used to access a value, and a value is the data that is accessed. In general, the length of the keys and values may be of fixed or variable length.
A method by which the data processing apparatus 10 is able to implement the key-value pairs of the present application is described below. Fig. 2 is a schematic diagram of a data processing apparatus 10. Referring to fig. 2, the data processing apparatus 10 includes an input module 201, a processor 202, a communication interface 203, a memory 204, and a display 205. The input module 201, processor 202, communication interface 203, memory 204, and display 205 are connected by bus 206.
The input module 201 is operable to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the data processing apparatus 10. In particular, the input module 201 may include, but is not limited to, a physical keyboard, a mouse, a touch screen, function keys (such as volume control keys, switch keys, etc.).
It should be appreciated that the processor 202 referred to in this disclosure may be a central processing unit (central processing unit, CPU), but may also be other general purpose processors, digital signal processors (DIGITAL SIGNAL processors, DSPs), application Specific Integrated Circuits (ASICs), off-the-shelf programmable gate arrays (field programmable GATE ARRAY, FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
It should also be appreciated that the memory 204 referred to in embodiments of the present application may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a Programmable ROM (PROM), an erasable programmable ROM (erasable PROM), an electrically erasable programmable EPROM (EEPROM), or a flash memory. The volatile memory may be random access memory (random access memory, RAM) which acts as external cache memory. By way of example, and not limitation, many forms of RAM are available, such as static random access memory (STATIC RAM, SRAM), dynamic random access memory (DYNAMIC RAM, DRAM), synchronous Dynamic Random Access Memory (SDRAM), double data rate synchronous dynamic random access memory (double DATA RATE SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (ENHANCED SDRAM, ESDRAM), synchronous link dynamic random access memory (SYNCH LINK DRAM, SLDRAM), and direct memory bus random access memory (direct rambus RAM, DRRAM).
It is noted that when the processor 202 is a general purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, the memory 204 may be integrated into the processor 202. It should be noted that the memory 204 described herein is intended to comprise, without being limited to, these and any other suitable types of memory. By calling a program stored in the memory 204, the processor 202 can execute the key-value pair processing method described in the embodiment of the present application.
The communication interface 203 is used to communicate with the key value storage device 20. The processor 202 sends instructions, such as write instructions or read instructions, to the key-value store 20 via the communication interface 203. Or processor 202 retrieves data from key-value store 20 via communication interface 203.
The display 205 may be used to display information entered by a user or provided to a user as well as various menus of the data processing apparatus. The display 205 may include a display panel, which may be configured with a Liquid Crystal Display (LCD) CRYSTAL DISPLAY, an organic light-emitting diode (OLED), or the like.
It will be appreciated that although not shown in fig. 2, the data processing apparatus may also include a power supply, an audio module, a communication module, etc.
Aiming at the problem of large key persistence, the application provides a novel key value pair processing method, and the large key is stored in a persistence manner through a plurality of basic key value pairs for storing fragments of the large key. Turning now to FIG. 3, one embodiment of the key-value pair processing method of the present application includes:
step 301, a first key value pair is received.
The first key-value pair may be any key-value pair input by the user in the application program or the database program, or any key-value pair acquired by the data processing apparatus from other devices (such as a terminal or a server). After the user inputs a key value pair by the application program or the database program, the data processing apparatus may receive the key value pair, and the reception of the key value pair may be regarded as the key value pair that the data processing apparatus detects the user input. It will be appreciated that the data processing apparatus may also obtain key value pairs from other terminals or servers.
And 302, when the first key value pair comprises a big key, performing slicing processing on the big key.
After the first key value pair is received, the keys of the first key value pair are compared with the preset length, if the keys of the first key value pair are larger than the preset length, the first key value pair is determined to comprise large keys, and if the keys of the first key value pair are smaller than or equal to the preset length, the first key value pair is determined to not comprise large keys. The preset length may be a difference between a maximum key length allowed by the key value storage device and a fragment identification length. The length of each piece of label can be set according to actual requirements, for example, the length of each piece of label is the same.
The size of each slice can be set according to a preset rule.
In one implementation, the total number of slices is 2.
The first fragment length L 1 satisfies the following formula that L 1=Lbase-Ls.Lbase is the maximum length of the base key and L s is the fragment identification length.
The second segment length L 2 satisfies the following formula that L 2=Lbig-L1.Lbig is the length of the large key.
In another implementation, the total number of slices is n, n being greater than 2.
The length of the first segment, L 1, satisfies the following equation, L 1=Lbase-Ls.
The length of the ith tile L i satisfies the following formula that L i=Lbase-Ls-Lu.Lu is the length of the global identity and the lengths of the global identities of different tiles may be the same. i is a positive integer greater than 1 and less than n.
The length L n of the nth fragment satisfies the following formula:
It should be appreciated that the above method is in order to generate a minimum number of slices. It should be understood that the slice length is not limited to the above example, and the length thereof may be set shorter according to the actual situation. For example, the first slice length may be set according to the length formula of the ith slice.
Step 303, generating a global identifier for each slice.
And 304, generating a basic key value pair according to each fragment and the global identification of the fragment according to the sequence of the fragments.
Each basic key value pair comprises a basic key and a value corresponding to the basic key, and the global identification included in the value of each basic key value pair is the same as the global identification included in the key of the following basic key value pair. It should be noted that, the value of the last basic key value pair may not include the global identifier of the last tile, and in step 303, the global identifier of the last tile may not be generated. And finally, the last slice is the last slice.
Step 305, writing all generated basic key value pairs into a key value storage device.
In some key value databases, the write operation may be implemented through a set interface.
In this embodiment, after the large key is subjected to the slicing process, a base key value pair is generated according to each slice, and then all the base key value pairs are stored in the key value storage device. Even if the data processing device is powered off, a program crashes or other reasons cause the large key stored in the memory to be lost, the large key is still stored in the key value storage device, so that the problem that the large key is easy to lose when stored in the memory is solved.
And setting global identifiers in the values of the previous basic key value pair and keys of the next basic key value pair, so that the next basic key value pair can be searched according to the global identifiers until all basic key value pairs are searched, and the large key can be restored according to all basic key value pairs. It will be appreciated that the value corresponding to the large key may also be read based on all of the base key value pairs, thereby recovering the key value pairs entered by the user.
And the large keys can be split and stored according to the method, and the data processing device can receive keys with any size, so that the limitation on the key size can be removed in an application program or a database program.
The process of creating the base key-value pair is described in detail below, and in an alternative embodiment, step 304 includes:
when the slice is a first slice, generating a first slice mark; generating a key of a first basic key value pair according to the first fragment identifier and the first fragment;
generating a mark of a fragment when the fragment is not the first fragment or the last fragment, generating a key of a basic key value pair according to the mark of the fragment, the global mark of the previous fragment and the fragment;
When the fragment is the last fragment, a last fragment mark is generated, a key of the last basic key value pair is generated according to the last fragment, the last fragment mark and the global mark of the last fragment, and a value of the last basic key value pair is generated according to the data mark and the value of the first key value pair.
In this embodiment, each slice corresponds to a base key value pair one by one, and the slice sequence and the base key value pair sequence are the same. The tile identifier may be a number, a character, or a combination of a number and a character. For example, 1,2, the tile identification of N tiles is 1,2, respectively. Or 1, 2..the tile identity of the N tiles is shard, shard, shardN, respectively. The above is a schematic example of a tile identifier, and the specific form of the tile identifier is not limited to the above example.
Specifically, the first segment identifier and the first segment form a key of a first basic key value pair; the index identifier and the global identifier of the first tile are combined to form a value of the first base key-value pair.
If the second slice is the last slice, the second slice identifier, the global identifier of the first slice, and the second slice are combined into a key of a second base key-value pair, and the data identifier and the value of the first key-value pair are combined into a value of the second base key-value pair.
If the second slice is not the last slice, the second slice mark, the global mark of the first slice and the second slice are combined into a key of a second basic key value pair, the index mark and the global mark of the second slice are combined into a value of the second basic key value pair, then other basic key value pairs are sequentially generated until the last but one basic key value pair, the last slice mark, the global mark of the last but one slice and the last slice are combined into a key of the last basic key value pair, and the data mark and the value of the first key value pair are combined into a value of the last basic key value pair. Thus, all the basic key value pairs corresponding to the fragments can be generated according to the sequence of the fragments.
It should be appreciated that in the keys of each base key-value pair, the order of the tile identifier, global identifier, and tile is not limited to the above examples, but may be arranged in other orders, not limited thereto. In the value of each base key value pair, the order of the index flag and the global flag is not limited to the above example, and the order of the data flag and the global flag is not limited to the above example, but may be set in other orders, which is not limited herein.
It should be appreciated that all of the tiles may be generated at once, and that one base key value pair may be generated from each tile in turn. Or generating a fragment and a basic key value pair corresponding to the fragment, and then generating a next fragment and a basic key value pair corresponding to the next fragment until a last fragment and all basic key value pairs are generated.
In another alternative embodiment, when the first key-value pair does not include a large key, a base key-value pair is created from the first key-value pair and written to the key-value storage device.
Specifically, if the key of the first key value pair is smaller than or equal to the preset length, it is determined that the first key value pair does not include a large key. And generating a key of the basic key value pair according to the fragment identifier and the key of the first key value pair, and generating a value of the basic key value pair according to the data identifier and the value of the first key value. In this way, the key value pairs received by the data processing apparatus may also be stored when they do not include a large key.
When reading data, after inputting the key of the basic key value pair, reading the value of the basic key value pair, and then resolving the value of the first key value pair from the value.
The above describes the process of storing the base key-value pairs. Based on the above embodiments, the procedure of reading the base key value pair will be described as follows:
referring to fig. 4, one embodiment of the key-value pair processing method of the present application includes:
step 401, receiving a large key of a first key-value pair.
And 402, performing slicing processing on the big key.
And 403, acquiring a basic key value pair according to each fragment according to the sequence of the fragments.
Wherein the value of each base key-value pair comprises the same global identity as the key of the following base key-value pair.
Step 404, parsing the value of the first key value pair from the value of the last basic key value pair.
In some key value databases, obtaining a value corresponding to a basic key according to the basic key may be implemented through a get interface.
In this embodiment, the next basic key value pair is a child node of the previous basic key value pair in the key value storage system, so that each segment can be acquired according to the node order, and then all segments are formed into a big key. In the key index of the key value storage system, each basic key value pair is used as one node of a forest, and all basic key value pairs form the forest. All keys received by the data processing device can be read in turn by traversing the forest.
In an alternative embodiment, step 403 includes generating a first base key according to the first slice identifier and the first slice when the slice is the first slice, reading a value corresponding to the first base key from the key value storage device, parsing a global identifier of a previous slice from a value corresponding to a previous base key when the slice is not the first slice, and reading a value corresponding to the base key from the key value storage device according to the slice identifier, the global identifier of the previous slice, and the base key corresponding to the slice. The previous base key refers to the base key corresponding to the previous tile. Taking the second segment as an example, the previous segment of the second segment refers to the first segment, and the base key corresponding to the previous segment refers to the first base key. The value of each key-value pair may thus be read from the key-value storage device.
In the following description of the process of storing other key-value pairs, in another alternative embodiment, the key-value pair processing method further includes:
The method comprises the steps of receiving a second key value pair, carrying out slicing processing on a large key of the second key value pair when the second key value pair comprises the large key, generating global identification for each slice from the i+1th slice of the large key of the second key value pair when the first i slices of the large key of the second key value pair are identical to the first i slices of the large key of the first key value pair, generating basic key value pairs according to each slice and the global identification of the slices in sequence, and writing all generated basic key value pairs into a key value storage device.
In this embodiment, the second key pair refers to any one of the key pairs received after the first key pair is received.
The large key of the first key-value pair includes n total number of fragments, and the large key of the second key-value pair includes m total number of fragments. When the first i slices of the large key of the second key value pair are identical to the first i slices of the large key of the first key value pair, the basic key value pair corresponding to the first i slices of the second large key is also identical to the basic key value pair corresponding to the first i slices of the first large key, wherein the global identification of the first i slices of the second large key is identical to the global identification of the first i slices of the first large key. For the (i+1) th through m-th slices of the second big key, the global identity of each slice is different from the global identity of the slice of the first big key.
Starting from the (i+1) th shard of the large key of the second key-value pair, generating a global identifier for each shard, generating a base key-value pair in turn from each shard of the second large key and the global identifier of the shard, and generating a base key-value pair from each shard in a manner similar to step 304 in the embodiment or alternative embodiment shown in fig. 3. i is a positive integer. For example, for the (i+1) th tile of the second big key, a key of the (i+1) th base key value pair is generated from the tile identifier of the (i+1) th tile, the global identifier of the i th tile of the first big key, and the (i+1) th tile, and a value of the (i+1) th base key value pair is generated from the index identifier and the global identifier of the (i+1) th tile.
It should be appreciated that the first i shards of the large key of the second key-value pair are stored in the first i base key-value pair of the first key-value pair, and the (i+1) th shard to the last shard of the large key of the second key-value pair are stored in the (i+1) th base key-value pair to the last base key-value pair of the second key-value pair.
The process of reading the second key value pair will be described, after the user inputs the big key of the second key value pair, the data processing device segments the big key, obtains the basic key value pair according to each segment according to the segment sequence, and analyzes the value of the second key value pair from the value of the last basic key value pair.
The step of obtaining the base key value pair according to the slicing sequence is similar to step 403, and specific reference may be made to the corresponding description hereinabove. For example, the ith tile of the second large key is identical to the ith tile of the first large key, and the global identity of the ith tile of the second large key is the global identity of the ith tile of the first large key. And for the (i+1) th piece of the second large key, generating the (i+1) th basic key of the second large key according to the piece identification of the (i+1) th piece, the global identification of the i th piece of the first large key and the (i+1) th piece, and reading the corresponding value according to the (i+1) th basic key. Thus, the complete second key value pair can be restored according to the basic key value pair of the same part and the basic key value pair of different parts.
It should be understood that, according to the above key-value pair processing method, the next basic key-value pair is a child node of the previous basic key-value pair in the key-value storage system, so that each large key can be acquired in the order of nodes.
Taking the key value pairs input by the user as an example, one or more basic key value pairs can be generated according to the input key value pairs, and each basic key value pair can be used as a node.
The input key-value pair may or may not include a large key. For input key value pairs that do not include a large key, the base key value pair generated from the input key value pair constitutes a single node, i.e., a tree without child nodes. For an input key value pair including a big key, the big key may include the same first i slices, or may include different first i slices, where i is a positive integer.
For a plurality of input key-value pairs having the same first i tiles, all base key-value pairs generated from the plurality of input key-value pairs constitute a tree. The first basic key value pair is used as a root node, the basic key value pair corresponding to the last fragment of the big key in each input key value pair is used as a leaf node, other basic key value pairs are used as intermediate nodes, and the root node and the intermediate node can have one or more child nodes. Wherein the first i basic key value pairs are the same, and the i+1th basic key value pair of different input key value pairs is used as different child nodes of the i basic key value pair.
For input key value pairs for which the same patch does not exist as other input key value pairs, all base key value pairs generated from the input key value pairs form a tree that does not diverge.
For all the 2 different input key value pairs of the shards, all the basic key value pairs generated according to the 2 input key value pairs form 2 trees. And so on, in the key value storage system, all basic key values generated according to all input key value pairs form a forest. In the key index, for a plurality of keys having a sequential relationship, an orderly traversal of the plurality of keys can be achieved by the shard index value and the global identification of each key.
The key value pair processing method of the present application is described in detail below with reference to a specific application scenario in which the maximum key length allowed by the key value storage device is exemplified by 1 KB.
When the user needs to store data, the user inputs a key value pair in the application program, wherein the key of the key value pair is denoted as user_key1, and the value of the first key value pair is denoted as user_value1. The key length of user_key1 is exemplified by 2.5KB and the data processing apparatus compares the key length of user_key1 with the maximum key length. Since 2.5KB is greater than 1KB, it can be determined that user_key1 belongs to a large key. And slicing the user_key1. The lengths of the fragments are respectively as follows 10200B, 956B, 284B. The user_ke1 is divided into 3 slices according to the slice length, namely user_ke1_0, user_ke1_1 and user_ke1_2.
For user_ke1_0, generating a slice identifier 0 and a global index identifier uuid10;
Referring to FIG. 5A, a 0 and user_ke1_0 are combined into a key 5011 of the first base key-value pair 501, which key 5011 may be represented by 0-user_ke1_0. The index identifier is exemplified by 1, and the data identifier is exemplified by 0. Values 5012 for the first base pair of key values are formed by 1 and uuid10, e.g., 1-uuid 10.
Referring to FIG. 5B, for user_ken1_1, a tile identifier 1 and a global index identifier uuid11 are generated, and 1, uuid10 and user_ken1_1 are combined into a key 5021 of a second base key value pair 502, such as 1-uuid 10-user_ken1_1. Index tag 1 and uuid11 are combined into a value 5022, e.g., 1-uuid 11, of the second base key value pair 502.
Referring to FIG. 5C, since user_ken1_2 is the last tile, tile identifier 2 is generated, and 2, uuid11 and user_ken1_2 are combined into a key 5031 of a third base key value pair 503, such as 2-uuid 11-user_ken1_2. The value1 of the first key value pair and the data identifier 0 form a value 5032, e.g., 0-user_value 1, of the third base key value pair 503.
All the generated key value pairs can be stored respectively, or all the generated key value pairs can be stored in the key value storage device together.
Referring to fig. 6, when the user needs to read the user_value1, the user inputs the user_ke1, the data processing device determines that the user_ke1 is greater than the preset length, and then slices the user_ke1 into the user_ke1_0, the user_ke1_1 and the user_ke1_2. The tile identifier 0 and user_ke1_0 are combined into a key 5011 (i.e., 0-user_ke1_0) of the first base key 501, and a value 5022 (i.e., 1-uuid 10) is found from the key 5011. For the next base key value pair, uuid10 is fetched from value 5022, then tile identifier 1, uuid10 and user_key1_1 are combined into key 5021 of the second base key value (i.e., 1-uuid 10-user_key1_1), and value 5022 (i.e., 1-uuid 11) is found from key 5021. For the last base value pair, uuid11 is taken out of value 5022, then fragment id 2, uuid11 and user_ke1_2 are combined into a key 5031 of a third base key value (namely 2-uuid 11-user_ke1_2), value 5032 (namely 0-user_value 1) is found according to key 5031, and user_value1 is analyzed from the key 5031.
When the user wants to store another key-value pair, the key-value pair is denoted as (user_ke2, user_value 2). If user_key2 belongs to a large key, then user_key2 is sliced. Assume that 4 slices are obtained by slicing, and the 4 slices are respectively user_key2_0, user_key2_1, user_key2_2 and user_key2_3.
Comparing the fragments of the user_key2 with the fragments of the user_key1, starting from the first fragments of the two big keys, if the first fragments of the two big keys are different, generating a basic key value pair according to the fragments of the user_key2 according to the data storage method, and then storing.
Referring to fig. 7A and 7B, if the first and second slices of the two large keys are identical, the third slices are compared, and when the third slices are different, it is determined that the user_key2_0, the user_key2_1 are identical to the user_key1_0, and the user_key1_1_1. Thus, starting from user_key2_2, a base key pair 703 and a base key pair 704 are generated according to the global identifier of each slice and each slice, and the base key pair 703 and the base key pair 704 are respectively stored in the key pair storage device by calling the set interface by taking (2-uuid 11-user_key2_2, 1-uuid 22), (3-uuid 22-user_key2_3, 0-user_value 2) as an example. Wherein, the key 7031 of the base key value pair 703 is (2-uuid 11-user_key2_2), the value 7032 of the base key value pair 703 is (1-uuid 22), the key 7041 of the base key value pair 704 is (3-uuid 22-user_key2_3), and the value 7042 of the base key value pair 704 is (0-user_value 2).
When the user wants to read the user_value2, the user inputs the user_key2, and the data processing device slices the user_key2 into user_key2_0, user_key2_1, user_key2_2, and user_key2_3, namely user_key1_0, user_key1_1, user_key2_2, and user_key2_3.
Referring to FIG. 8, a key 5011 (i.e., 0-user_ke1_0) is generated from the user_ke1_0, and a value 5012 (i.e., 1-uuid 10) is found from the key 5011. The key 5021 (i.e., 1-uuid 10-user_ke1_1) is composed according to the slice identifier 1, uuid10 and user_ke1_1, and the value 5022 (i.e., 1-uuid 11) is found according to the key 5021. Based on the tile identifier 2, uuid11, and user_ke2_2, key 7031 (i.e., 2-uuid 11-user_ke2_2) is composed, and based on key 7031, value 7032 (i.e., 1-uuid 22) is found. From 3, uuid22, and user_ke2_3, a key 7041 (i.e., 3-uuid 22-user_ke2_3) is composed, from which a value 7042 (i.e., 0-user_value 2) is found from the key 7041, from which user_value2 is parsed.
In the key value storage system, the storage form of the basic key value pairs 501, 502, 503, 703, and 704 is shown in fig. 9. It can be seen that the above method only stores a partial fragment of the user_key2, and when data is read, the partial basic key value pair corresponding to the user_key1 and other basic key value pairs corresponding to the user_key2 can be respectively read, so that the user_value2 can be read. Thus, the storage space can be saved, and the orderly traversal can be realized.
In the present application, A-B represents one complete data of A and B. It should be understood that a or B may be a number, symbol or string, with symbol-separation being employed for ease of reading and understanding. The symbols in A-B-C have the same meaning.
The present application provides a data processing apparatus capable of implementing the method in the above embodiments. Referring to fig. 10, in an alternative embodiment, a data processing apparatus 1000 includes an input module 1001, a processing module 1002, and a communication interface 1003;
An input module 1001 for receiving a first key value pair;
a processing module 1002, configured to perform a slicing process on the big key when the first key value pair includes the big key;
the processing module 1002 is further configured to generate a base key value pair according to each slice and the global identifier of the slice according to the slice sequence;
the processing module 1002 is further configured to write all generated base key value pairs to the key value storage device through the communication interface 1003.
In an alternative embodiment, the processing module 1002 is specifically configured to perform the following steps:
when the slice is a first slice, generating a first slice mark; generating a key of a first basic key value pair according to the first fragment identifier and the first fragment;
generating a mark of a fragment when the fragment is not the first fragment or the last fragment, generating a key of a basic key value pair according to the mark of the fragment, the global mark of the previous fragment and the fragment;
When the fragment is the last fragment, generating an identifier of the last fragment, generating a key of a last basic key value pair according to the identifier of the last fragment, the global identifier of the last fragment and the last fragment, and generating a value of the last basic key value pair according to the data identifier and the value of the first key value pair.
In a further alternative embodiment of the present invention,
The input module 1001 is further configured to receive a large key of the first key value pair;
the processing module 1002 is further configured to perform a slicing process on the big key;
The processing module 1002 is further configured to obtain a base key value pair according to each slice according to the slice sequence;
The processing module 1002 is further configured to parse the value of the first key-value pair from the value of the last base key pair.
In another alternative embodiment, the processing module 1002 is specifically configured to perform the steps of:
when the fragment is a first fragment, generating a first basic key according to the first fragment identifier and the first fragment;
When the fragment is not the first fragment, resolving the global identification of the previous fragment from the value of the previous basic key value pair; and generating a basic key corresponding to the fragment according to the fragment, the fragment identifier and the global identifier of the previous fragment, and reading a value corresponding to the basic key from the key value storage device.
In a further alternative embodiment of the present invention,
The input module 1001 is further configured to receive a second key value pair;
The processing module 1002 is further configured to perform a sharding process on the big key of the second key value pair when the second key value pair includes the big key, generate a global identifier for each shard from the i+1st shard when the first i shard of the second key value pair is the same as the first i shard of the first key value pair, generate a base key value pair according to each shard and the global identifier of the shard in turn, i is a positive integer, and write all generated base key value pairs into the key value storage device through the communication interface 1003.
It is further noted that the above described embodiments of the apparatus are schematic, wherein the units described as separate components may or may not be physically separate. Some or all of the modules may be selected according to actual needs to achieve the purposes of the solution in the present application. In addition, in the drawings of the apparatus embodiments of the present application, some connection relations between modules or units indicate that they have communication connections therebetween, and may be implemented as one or more communication buses or signal lines. Other connections between modules or units represent electrical connections therebetween and may be embodied as one or more circuits.
The present application provides a computer storage medium comprising instructions which, when run on a computer, cause the computer to perform the steps of the above embodiments performed by the data processing apparatus.
The data processing device in the application can be a chip in the data processing device, and the chip comprises a processing unit and a communication unit. The processing unit may be a processor and the communication unit may be, for example, an input/output interface, pins or circuitry, etc. The processing unit may execute the computer instructions stored by the storage unit, so that the data processing apparatus performs the data processing method in the above embodiment or the alternative embodiment. Alternatively, the storage unit is a storage unit in the chip, such as a register, a cache, or the like, and the storage unit may also be a storage unit located outside the chip in the data processing apparatus, such as a ROM or a static storage device that can store static information and other types of static storage devices, such as a RAM, or the like. The processor referred to in any of the foregoing may be a general purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits for performing the program execution of the key-value pair processing method described above.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by means of software plus necessary general purpose hardware, or of course by means of special purpose hardware including application specific integrated circuits, special purpose CPUs, special purpose memories, special purpose components, etc. Generally, functions performed by computer programs can be easily implemented by corresponding hardware, and specific hardware structures for implementing the same functions can be varied, such as analog circuits, digital circuits, or dedicated circuits. But a software program implementation is a preferred embodiment for many more of the cases of the present application. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a readable storage medium, such as a floppy disk, a usb disk, a removable hard disk, a ROM, a RAM, a magnetic disk or an optical disk of a computer, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the method described in the embodiments of the present application.
In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When loaded and executed on a computer, produces a flow or function in accordance with embodiments of the present application, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by a wired (e.g., coaxial cable, fiber optic, digital subscriber line (digital subscriber line, DSL), or wireless (e.g., infrared, wireless, microwave, etc.) means, the computer-readable storage medium may be any available medium that can be stored by the computer or a data storage device such as a server, data center, etc., that contains an integration of one or more available media, the available media may be magnetic media, (e.g., floppy disk, hard disk, tape), optical media (e.g., DVD), or semiconductor media, etc.
The foregoing embodiments are merely for illustrating the technical solution of the present application, but not for limiting the same, and although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that modifications may be made to the technical solution described in the foregoing embodiments or equivalents may be substituted for parts of the technical features thereof, and such modifications or substitutions do not depart from the spirit of the corresponding technical solution from the scope of the technical solution of the embodiments of the present application.