Disclosure of Invention
The present invention addresses the following problems: the chained-addressing method handles Hash conflicts only to a limited extent; a single fixed storage path cannot satisfy high performance and large capacity at the same time; a KV storage mode implemented in software degrades system performance; and fragmented space is not used effectively. To this end, the invention provides a block type KV storage method and system based on a configurable MMU, which combines hardware with MMU technology to improve system performance and optimize the use of storage space.
In order to achieve the above object, the present invention is realized by the following technical scheme:
A partitioned KV storage method based on a configurable MMU, the method comprising:
selecting and configuring the NSID types required in the MMU according to the system's read-write performance and memory capacity requirements;
determining the section size and the Hash type corresponding to each NSID type;
determining the size of the Hash table according to the Hash type, and calculating the number of sections from the size of the Hash table and the section size;
in an MMU configuration interface, allocating a pointer to each section and storing the pointers of the different sections in a memory block, wherein each pointer points to the actual position in the physical address space where the data of its section is stored;
after the configuration is completed, submitting all parameters through the MMU configuration interface, whereupon the hardware automatically manages the mapping relation between the Hash table and the physical address space according to the configured parameters;
while the system is running, the hardware dynamically adjusts and optimizes according to real-time read-write operation requirements.
A further improvement of the present invention is that the pointers of all sections corresponding to the same NSID are stored in a continuous address interval of the same memory block; the system first configures the memory address of the pointer of the first section corresponding to the NSID, and then automatically calculates and configures the memory addresses of the pointers of the remaining sections from that address.
A further improvement of the present invention is that the hardware automatically managing the mapping relation between the Hash table and the physical address space according to the configured parameters specifically comprises: performing a Hash calculation on the key of a key-value pair to determine the position of the key in the Hash table; the hardware then finding, through the MMU, the pointer of the corresponding section, locating the specific position in the physical address space according to the pointer, and executing the corresponding insert, delete, update, or lookup operation.
A further improvement of the present invention is that the MMU can configure a plurality of NSIDs, each NSID configuring a plurality of sections, each section corresponding to a pointer; within the same NSID, each section occupies a continuous range of the physical address space, while different sections may be located at discrete, non-contiguous positions.
A further improvement of the present invention is that the dynamic adjustment and optimization comprises:
the MMU supports dynamic configuration of multiple NSIDs, including changing the size of the sections corresponding to an NSID, reassigning pointers to the adjusted sections, and dynamically modifying the corresponding Hash table through hardware acceleration.
A further improvement of the present invention is that the dynamic adjustment and optimization also comprises: the MMU can adjust the physical address space by dynamically changing the pointers corresponding to the sections, including adjusting the position of a section within the same storage medium or migrating a section from one storage medium to another, and performing the conversion from the old Hash table to the new Hash table through hardware acceleration, wherein the conversion is triggered by configuring the Rehash command enable.
A further improvement of the present invention is that the conversion triggered by the Rehash command enable is implemented by the steps of:
locking the current Hash table and prohibiting new read-write operations;
generating a new Hash table according to the new section distribution and, after the address reassignment of all sections is completed, releasing the lock and enabling the new Hash table for subsequent key-value pair management.
A partitioned KV storage system based on a configurable MMU, the system comprising:
an MMU unit, configured to automatically manage the mapping relation between the Hash table and the physical address space according to the configured parameters, and to dynamically adjust and optimize according to real-time read-write operation requirements;
an MMU configuration interface, configured to select and configure the required NSID type according to the system's read-write performance and memory requirements, determine the section size and the Hash type based on the NSID type, configure the size of the Hash table, allocate the pointers of the sections, and map the pointers to actual positions in the physical address space;
storage units, configured to store the block type KV data, wherein each storage unit comprises a plurality of sections;
a Hash table, configured to determine the position of a key in the Hash table according to the key of a key-value pair.
The beneficial effects of the invention are as follows: based on the configurable MMU mapping relation, the Hash table is discretized and mapped to different areas, so that the fragmented space can be fully utilized, the new storage space is expanded, and meanwhile, the dynamic change of the MMU mapping table is realized through hardware acceleration, and the host pressure is reduced. The Hash table is managed through the partition (section), so that the overhead of a mapping table is reduced, continuous address intervals are fragmented, discrete space is utilized more effectively, and the space utilization rate is improved. Through dynamic configuration and optimization of NSID and corresponding section, flexible management of physical address space is realized while system read-write performance is improved, mapping of Hash table and physical address is efficiently managed, advanced operations such as Rehash are supported, and flexibility and expansibility of the system are remarkably enhanced.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which are obtained by a person skilled in the art based on the described embodiments of the invention, fall within the scope of protection of the invention.
Example 1: as shown in fig. 1, this embodiment provides a block KV storage method based on a configurable MMU, including the following steps:
S1: selecting and configuring NSID types required in MMU according to the requirements of system read-write performance and memory capacity;
MMU: Memory Management Unit, the computer hardware responsible for handling memory access requests from the central processing unit (CPU);
NSID: Namespace Identifier, an identifier used to distinguish different storage spaces in the memory mapping process; by configuring the NSID type, the storage space can be divided into different sections (a section being one of the region blocks into which a Hash table is to be divided), each section being mapped to a certain portion of the actual physical storage medium;
Step S1 involves analyzing the current and anticipated workload of the system to determine the optimal NSID configuration. For example, in a high performance database system, if a large number of concurrent read and write requests are expected, then it may be necessary to select an NSID type that supports high concurrent access.
S2: determining the section size and the Hash type corresponding to the NSID type according to the NSID type;
The section size determines how much data each section can hold, and the Hash type (i.e., hash algorithm, used to map keys to storage locations for fast searching and storing data, including MD5, SHA-1, SHA-256, CRC32, etc.) affects the distribution and access efficiency of the data. For a KV storage system needing quick searching, a Hash function with a low conflict rate can be selected;
The NSID type determines the size and allocation manner of the Hash table, and each NSID type corresponds to one or more sections, and the size of each section needs to be determined according to the requirement. If the large Hash table needs to be divided into smaller, easily managed blocks, a smaller section size can be selected; for example: for high performance applications, CRC_32 is selected as the Hash type and the section size is set to 2M.
S3: determining the size of a Hash table according to the Hash type, and calculating the number of sections based on the size of the Hash table and the section size; for example, if the size of the Hash table is 16M and each section is 4M, 4 sections are required;
through block management (section), the storage space can be mapped flexibly, and the resource utilization rate is improved.
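As a non-limiting illustration, the calculation in step S3 can be sketched as follows. The function name `section_count` is hypothetical, and rounding up when the Hash table size is not an exact multiple of the section size is an assumption that the description does not state.

```python
def section_count(table_size: int, section_size: int) -> int:
    """Number of sections needed to cover a Hash table.

    Rounds up (ceiling division) so that a partially filled last
    section is still counted; this rounding rule is an assumption.
    """
    if table_size <= 0 or section_size <= 0:
        raise ValueError("table_size and section_size must be positive")
    return -(-table_size // section_size)


M = 1 << 20  # "M" is read here as 2**20 entries

# The example from step S3: a 16M Hash table with 4M sections.
print(section_count(16 * M, 4 * M))  # 4
```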
S4: in the MMU configuration interface, a pointer is allocated to each section, and a memory block (memory) is used to store the pointers of the different sections, wherein each pointer points to the actual position in the physical address space where the data of its section is stored;
Specifically, pointers of all sections corresponding to the same NSID are stored in a continuous address section of the same memory block, the system firstly automatically configures a memory address where a pointer of a first section corresponding to the NSID is located, and then automatically calculates and configures memory addresses of pointers of other sections according to the memory address.
For example: an NSID is divided into 4 sections, and the pointers of these sections are stored in the memory in the area with addresses 0x0 to 0xc. If the pointer of the first section is stored at 0x0, the system only needs to configure the address 0x0, and the pointer addresses of the other sections follow in sequence as 0x4, 0x8 and 0xc. By configuring continuous memory addresses, each section can be managed effectively and accessed quickly, the efficiency of storage operations is improved, and the configuration is simplified.
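The address derivation in this example can be sketched as below, purely for illustration. The function name `pointer_slot_addresses` is hypothetical; the 4-byte pointer width is inferred from the 0x0, 0x4, 0x8, 0xc spacing in the example.

```python
POINTER_BYTES = 4  # 32-bit pointers, inferred from the 0x0/0x4/0x8/0xc spacing


def pointer_slot_addresses(base: int, num_sections: int,
                           pointer_bytes: int = POINTER_BYTES) -> list:
    """Memory addresses holding the section pointers of one NSID.

    Only the address of the first pointer (base) is configured;
    the remaining addresses follow contiguously.
    """
    return [base + i * pointer_bytes for i in range(num_sections)]


print([hex(a) for a in pointer_slot_addresses(0x0, 4)])
# ['0x0', '0x4', '0x8', '0xc']
```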
S5: after configuration is completed, submitting all parameters including NSID, section size, hash type, section number, pointer, physical address space, memory address and the like through an MMU configuration interface, gradually adding key value data into a Hash table by hardware according to the configured parameters, and automatically managing the mapping relation between the Hash table and the physical address space;
The step S5 specifically comprises the following steps: performing Hash calculation according to keys of the key value pair to determine the positions of the keys in a Hash table, then finding pointers corresponding to the sections by hardware through an MMU, positioning the pointers to specific positions of a physical address space according to the pointers, and executing corresponding operations of adding, deleting, changing and searching;
The MMU can configure a plurality of NSIDs, each NSID is configured with a plurality of sections, and each section corresponds to one pointer. Within the same NSID, each section occupies a continuous range of the physical address space, while different sections may be located at discrete, non-contiguous positions. For example, suppose NSID0 is configured with a Hash table of size 1000, that is, 1000 consecutive chains numbered 0 to 999. If the section size configured for NSID0 is 100, there are 10 sections in total, i.e., 10 pointers pointing to different locations. The 1000 consecutive chains are thus divided into 10 discrete blocks, each holding 100 consecutive chains.
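For illustration only, the lookup path for the NSID0 example (1000 chains, section size 100) can be modeled in software as follows. In the invention this path is carried out by hardware; the pointer values, the 8-byte entry size `ENTRY_BYTES`, the use of CRC32 reduced modulo the table size, and the function names are all assumptions made for this sketch.

```python
import zlib

TABLE_SIZE = 1000    # chains 0..999, as in the NSID0 example
SECTION_SIZE = 100   # chains per section, giving 10 sections
ENTRY_BYTES = 8      # hypothetical size of one chain head in bytes

# Hypothetical section start addresses, deliberately non-contiguous.
SECTION_POINTERS = [0x9000, 0x1000, 0x5000, 0x3000, 0x7000,
                    0xB000, 0xD000, 0x2000, 0xF000, 0x4000]


def chain_index(key: bytes) -> int:
    """Hash the key to a chain number (CRC32 modulo table size is an assumption)."""
    return zlib.crc32(key) % TABLE_SIZE


def physical_address(index: int) -> int:
    """Translate a chain number to a physical address via its section pointer."""
    if not 0 <= index < TABLE_SIZE:
        raise IndexError("chain index out of range")
    section, offset = divmod(index, SECTION_SIZE)
    return SECTION_POINTERS[section] + offset * ENTRY_BYTES


idx = chain_index(b"example-key")
print(idx // SECTION_SIZE, hex(physical_address(idx)))
```

Chains 0 to 99 resolve to one continuous range starting at the first pointer, while chain 100 jumps to the start of a different, unrelated range, which is the discretization described above.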
As shown in fig. 2, NSID is the namespace identifier, a section is one of the region blocks into which a Hash table is divided, a pointer is the start position of the actual storage medium region to which a section is mapped, and the physical address space is the actual storage medium region.
NSIDx (x=0, 1, …, p-1): the different NSIDs that can be supported;
sectiony (y=0, 1, …, n-1): the different sections divided in each NSID;
pointerz (z=0, 1, …, n-1): the different pointers configured for the different sections;
200K space: the size of one section, i.e., 200K.
S6: in the running process of the system, the hardware dynamically adjusts and optimizes according to the real-time read-write operation requirement;
The system monitors the change of the storage requirement in real time, such as full load of section caused by data increase, or system performance reduction, and the like; when detecting a condition (such as insufficient section size or storage efficiency optimization) to be adjusted, the system triggers a dynamic adjustment process; according to the current demand and load, the system calculates the new size of the corresponding section under each NSID, and according to the new section size, the system plans how to redistribute the pointer of each section so as to ensure the effective utilization of the storage space.
The system modifies the section size parameter of the corresponding NSID by the MMU configuration interface, and the hardware automatically adjusts the section layout in the physical address space according to the new configuration, reassigns pointers to each section after adjustment, and the pointers point to the new physical address space to accommodate the adjusted section.
According to the new section size, the system may need to recalculate and update the mapping relation (i.e. Rehash) in the Hash table, which ensures that the mapping relation between the Hash table and the physical address space is still accurate, and the hardware quickly completes the conversion from the old Hash table to the new Hash table through the built-in acceleration function, so as to reduce the influence on the system performance. After confirming that all the adjustment operations are completed, the system resumes normal read-write operation on the relevant section, and writes the data buffered during adjustment into the new section position, ensuring data consistency.
Further, step S6 includes:
The MMU supports dynamic configuration of a plurality of NSIDs, including changing the size of the sections corresponding to an NSID, reassigning pointers to the adjusted sections, and dynamically changing the corresponding Hash table through hardware acceleration. For example, for the Hash table of size 1000 configured under NSID0 above, the section size can be changed from 100 to 200 by configuring the section size of NSID1 to 200 and configuring a pointer for each of its sections. After this configuration, the hardware carries out the change to the actual Hash table.
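As a non-limiting sketch of the re-sectioning in this example, the chain ranges covered by each section before and after the change can be listed as follows. The function name `section_ranges` is hypothetical, and the sketch only models the bookkeeping, not the hardware-accelerated change itself.

```python
def section_ranges(table_size: int, section_size: int) -> list:
    """Inclusive (first_chain, last_chain) range covered by each section."""
    return [(start, min(start + section_size, table_size) - 1)
            for start in range(0, table_size, section_size)]


old_layout = section_ranges(1000, 100)  # 10 sections of 100 chains
new_layout = section_ranges(1000, 200)  # 5 sections of 200 chains

print(len(old_layout), old_layout[0], old_layout[-1])
print(len(new_layout), new_layout[0], new_layout[-1])
```

Each new section then needs one pointer, so the number of pointers to configure drops from 10 to 5 in this example.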
The MMU can further adjust the physical address space by dynamically changing the pointers corresponding to the sections, including adjusting the position of a section within the same storage medium (for example, moving it to a different position in DRAM) or migrating a section from one storage medium to another (for example, from DRAM to SLC or MLC flash, etc.), and performing the conversion from the old Hash table to the new Hash table through hardware acceleration, wherein the conversion is triggered by configuring the Rehash command enable.
Example 2
S10: configuring the section size of NSID0 in the MMU to be 4M, and the Hash type to be CRC_24, namely, the size of a Hash table to be 16M, wherein the number of sections corresponding to NSID0 is 4;
S20: a memory with a width of 32 bits and a depth of 20 is selected, and four 32-bit pointers are written in sequence from the start position, as follows:
Table 1 NSID0 configured memory address and pointer mapping table
memory address | pointer
0x0 | 0x0
0x4 | 0x1000_0000
0x8 | 0x2000_0000
0xc | 0x3000_0000
S30: continuously sending KV commands to a hardware module, and constructing a Hash table by the hardware according to parameters configured in the steps S10 and S20;
S40: because the storage space changes, the actual storage space of NSID0 needs to be changed, that is, the mapping relation in the MMU needs to be changed. The section size of NSID1 in the MMU is configured to be 4M and its Hash type to be CRC_24, and four new pointers are written in sequence to consecutive memory positions that do not overlap those of NSID0, as follows:
Table 2 NSID1 configured memory address and pointer mapping table
memory address | pointer
0x100 | 0x5000_0000
0x104 | 0x6000_0000
0x108 | 0x7000_0000
0x10c | 0x8000_0000
S50: after the configuration in step S40, the Rehash command enable is sent to the hardware, and the hardware moves the Hash table of NSID0 in the actual storage space to the location specified by NSID1, according to the configurations of NSID0 and NSID1 in the above steps.
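For illustration, the address translation before and after the Rehash in this example can be modeled with the pointer values of Table 1 and Table 2. This is a software sketch under stated assumptions, not the hardware procedure: the section size is read as 4M entries, the 16-byte entry size `ENTRY_BYTES` is hypothetical (the text does not give one), and the names `locate` and `rehash_moves` are introduced only for this sketch.

```python
M = 1 << 20
SECTION_SIZE = 4 * M   # entries per section (step S10), read as entry count
TABLE_SIZE = 1 << 24   # 24-bit Hash index gives 16M entries
ENTRY_BYTES = 16       # hypothetical entry size; not given in the text

NSID0_POINTERS = [0x0000_0000, 0x1000_0000, 0x2000_0000, 0x3000_0000]  # Table 1
NSID1_POINTERS = [0x5000_0000, 0x6000_0000, 0x7000_0000, 0x8000_0000]  # Table 2


def locate(index: int, pointers: list) -> int:
    """Physical address of a Hash table entry under a given pointer set."""
    if not 0 <= index < TABLE_SIZE:
        raise IndexError("Hash index out of range")
    section, offset = divmod(index, SECTION_SIZE)
    return pointers[section] + offset * ENTRY_BYTES


def rehash_moves(indices) -> list:
    """(old_address, new_address) pairs for moving entries from NSID0 to NSID1."""
    return [(locate(i, NSID0_POINTERS), locate(i, NSID1_POINTERS))
            for i in indices]


for old, new in rehash_moves([0, 4 * M, TABLE_SIZE - 1]):
    print(hex(old), "->", hex(new))
```

Under these assumptions each entry keeps its section number and offset, and only the section start address changes, which is why the move can be driven entirely by the two pointer tables.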
Example 3: as shown in FIG. 3, this embodiment provides a configurable MMU-based partitioned KV storage system comprising:
an MMU unit, configured to automatically manage the mapping relation between the Hash table and the physical address space according to the configured parameters, and to dynamically adjust and optimize according to real-time read-write operation requirements;
an MMU configuration interface, configured to select and configure the required NSID type according to the system's read-write performance and memory requirements, determine the section size and the Hash type based on the NSID type, configure the size of the Hash table, allocate the pointers of the sections, and map the pointers to actual positions in the physical address space;
storage units, configured to store the block type KV data, wherein each storage unit comprises a plurality of sections;
a Hash table, configured to determine the position of a key in the Hash table according to the key of a key-value pair.
In summary, the method and the device of the invention discretize and map the Hash table to different areas based on the configurable MMU mapping relation, can fully utilize fragmented space and expand new storage space, and simultaneously realize dynamic change of the MMU mapping table through hardware acceleration, thereby reducing host pressure. The Hash table is managed through the partition (section), so that the overhead of a mapping table is reduced, continuous address intervals are fragmented, discrete space is utilized more effectively, and the space utilization rate is improved. Through dynamic configuration and optimization of NSID and corresponding section, flexible management of physical address space is realized while system read-write performance is improved, mapping of Hash table and physical address is efficiently managed, advanced operations such as Rehash are supported, and flexibility and expansibility of the system are remarkably enhanced.
In the description of the present specification, a description referring to the terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, the different embodiments or examples described in this specification, and the features of the different embodiments or examples, may be combined by those skilled in the art provided they do not contradict each other.
The terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that various changes and substitutions are possible within the scope of the present application. Therefore, the protection scope of the application is subject to the protection scope of the claims.