Disclosure of Invention
Embodiments of the present application provide a computing system, a data processing method, and related devices, which address the slow response of existing data access.
In a first aspect, an embodiment of the present application provides a computing system, including:
a processor;
a main memory, configured to store data;
an extended memory, wherein the extended memory comprises N levels of first file layers, N being greater than or equal to 1; the first-level first file layer among the N levels of first file layers is configured to store data transferred from the main memory; and, where N is greater than 1, the i-th level first file layer among the N levels of first file layers is configured to store data transferred from the (i-1)-th level first file layer, where 1 < i ≤ N; and
a hard disk, wherein the hard disk comprises M levels of second file layers, M being greater than 1; the first-level second file layer among the M levels of second file layers is configured to store data transferred from the N-th level first file layer; and the j-th level second file layer among the M levels of second file layers is configured to store data transferred from the (j-1)-th level second file layer, where 1 < j ≤ M.
The extended memory supports the creation of N levels of first file layers. Data is transferred between the first-level first file layer and the main memory, and between the N-th level first file layer and the hard disk.
N may be set to 1, in which case the extended memory contains only a single first file layer, and the first-level first file layer is also the N-th level first file layer.
N may be set to be greater than 1, in which case the extended memory contains multiple levels of first file layers, and the first-level first file layer and the N-th level first file layer are different file layers.
Where N is greater than 1, the first-level first file layer is configured to store data transferred from the main memory, the i-th level first file layer is configured to store data transferred from the (i-1)-th level first file layer, and the N-th level first file layer is configured to transfer its stored data to the hard disk.
The hard disk supports the creation of M levels of second file layers, and data is transferred between the first-level second file layer and the extended memory.
The first-level second file layer among the M levels of second file layers is configured to store data transferred from the N-th level first file layer, and the j-th level second file layer is configured to store data transferred from the (j-1)-th level second file layer.
In this embodiment, data may be transferred between the main memory and the extended memory, and between the extended memory and the hard disk. Because the extended memory is added, the path by which data in the main memory reaches the hard disk changes: the data is no longer written directly to the hard disk, but is first transferred down to the extended memory, and the extended memory then transfers its stored data further down to the hard disk. The extended memory and the hard disk are different storage media. Dividing the extended memory into file layers extends the main memory and, by matching the layered organization of the hard disk, connects the extended memory to the data storage of a hard disk that is itself provided with multiple file layers.
Based on the divided file layers, data in the extended memory can be transferred to and from both the main memory and the hard disk, so the extended memory serves as a transitional storage space between them and can cache data. Data can therefore be read and written in memory rather than on the hard disk, which reduces the number of hard disk accesses, limits the impact of the hard disk's low I/O performance on data access efficiency, shortens data processing time, improves data access response speed, and improves read and write performance across the whole flow.
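The downward transfer path described above can be illustrated with a minimal sketch. The class and function names (`FileLayer`, `build_layers`, `transfer_down`), the entry-count capacity unit, and the choice of N = 2 and M = 3 are assumptions made for illustration only and are not taken from the application.

```python
from dataclasses import dataclass, field


@dataclass
class FileLayer:
    """One file layer; `capacity` counts entries (an illustrative unit)."""
    name: str
    capacity: int
    data: dict = field(default_factory=dict)

    def remaining(self) -> int:
        return self.capacity - len(self.data)


def build_layers(prefix: str, levels: int, capacity: int) -> list:
    """Create `levels` file layers named prefix-1 .. prefix-levels."""
    return [FileLayer(f"{prefix}-{k}", capacity) for k in range(1, levels + 1)]


def transfer_down(src: dict, dst: FileLayer) -> None:
    """Move every entry of `src` into the next lower layer `dst`."""
    dst.data.update(src)
    src.clear()


# Assumed sizes: N = 2 first file layers, M = 3 second file layers.
extended = build_layers("ext", levels=2, capacity=4)
disk = build_layers("disk", levels=3, capacity=16)

main_memory = {"k1": "v1", "k2": "v2"}
transfer_down(main_memory, extended[0])        # main memory -> level 1 (extended memory)
transfer_down(extended[0].data, extended[1])   # level 1 -> level N (extended memory)
transfer_down(extended[1].data, disk[0])       # level N -> level 1 (hard disk)
```

In this sketch the data never goes from the main memory straight to the hard disk; each step moves it exactly one layer down, which is the ordering the embodiment describes.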
In some alternative embodiments, where N is greater than 1, the N-th level first file layer among the N levels of first file layers in the extended memory is configured to store data transferred from the hard disk.
Optionally, the i-th level first file layer is configured to store data transferred from the (i+1)-th level first file layer, and the first-level first file layer is configured to transfer data to the main memory.
Alternatively, the (j-1)-th level second file layer is configured to store data transferred from the j-th level second file layer, and the data in the first-level second file layer is transferred to the N-th level first file layer.
These cases correspond to certain data reading processes: the data to be read on the hard disk can be transferred layer by layer to the extended memory for the processor to read, or the data to be read can be transferred layer by layer from the extended memory to the main memory for the processor to read, which accommodates data reading requirements in different situations.
With reference to the first aspect, in some implementations of the first aspect, the extended memory includes a first group of the N levels of first file layers and a second group of the N levels of first file layers.
The first-level first file layer in the first group of N levels of first file layers is configured to store a first target key transferred from the main memory, and the first-level first file layer in the second group of N levels of first file layers is configured to store a first target value, corresponding to the first target key, transferred from the main memory; and/or the hard disk includes a first group of the M levels of second file layers and a second group of the M levels of second file layers, wherein the first-level second file layer in the first group of M levels of second file layers is configured to store a second target key transferred from the N-th level first file layer in the first group of N levels of first file layers in the extended memory, and the first-level second file layer in the second group of M levels of second file layers is configured to store a second target value, corresponding to the second target key, transferred from the N-th level first file layer in the second group of N levels of first file layers in the extended memory.
Storing the key and the value of each key-value pair separately reduces the metadata that must be stored, improves matching efficiency when data is looked up by key, and improves data transfer efficiency.
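As a rough illustration of key-value separation, the sketch below keeps keys in one group and values in another. The integer handle that links a key to its value, and the names `split_key_value` and `lookup`, are hypothetical; the application does not specify how a key locates its value.

```python
def split_key_value(entries: dict) -> tuple:
    """Split entries into a key group and a value group.

    Each key is paired with a hypothetical integer handle that locates
    its value in the value group.
    """
    key_group = {}
    value_group = {}
    for handle, (key, value) in enumerate(sorted(entries.items())):
        key_group[key] = handle
        value_group[handle] = value
    return key_group, value_group


def lookup(key_group: dict, value_group: dict, key: str):
    """Match on the compact key group first, then fetch the value."""
    handle = key_group.get(key)
    return None if handle is None else value_group[handle]


keys, values = split_key_value({"user:2": "bob", "user:1": "alice"})
```

Because matching touches only the key group, large values need not be read or moved while a key is being searched for.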
With reference to the first aspect, in some implementations of the first aspect, the extended memory is a memory that uses the Compute Express Link (CXL) protocol, which provides an efficient, low-latency interconnect with the server, enables high-performance, low-latency data transmission and memory sharing, and ensures efficient access to the data in the extended memory.
With reference to the first aspect, in some implementations of the first aspect, a variable memory table for storing data is kept in the main memory; the variable memory table is converted into an invariable memory table when the amount of data written reaches a preset condition, and the first-level first file layer is configured to store data transferred from the invariable memory table.
The invariable memory table provides an intermediate state of data storage. Once the variable memory table has been converted, write operations can continue on a newly created variable memory table, while a background thread can safely write the data in the invariable memory table to the extended memory, which reduces the impact on database performance.
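The conversion from a variable memory table to an invariable memory table can be sketched as follows. The class `MemoryTables`, the entry-count limit, and the synchronous `drain` step (standing in for the background thread) are assumptions for illustration.

```python
class MemoryTables:
    """Holds one variable table plus a queue of invariable tables."""

    def __init__(self, table_limit: int):
        self.table_limit = table_limit   # max entries per variable table (assumed unit)
        self.variable = {}
        self.invariable = []             # frozen tables awaiting transfer

    def put(self, key, value) -> None:
        # A full variable table is frozen before a new key is accepted.
        if len(self.variable) >= self.table_limit and key not in self.variable:
            self.freeze()
        self.variable[key] = value

    def freeze(self) -> None:
        """Convert the variable table into an invariable one and start a new one."""
        self.invariable.append(tuple(sorted(self.variable.items())))
        self.variable = {}

    def drain(self, first_layer: dict) -> None:
        """Move invariable tables into the level-1 first file layer."""
        while self.invariable:
            first_layer.update(self.invariable.pop(0))


tables = MemoryTables(table_limit=2)
for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    tables.put(k, v)

layer1 = {}
tables.drain(layer1)
```

Writes to `"c"` proceed on the fresh variable table while the frozen table holding `"a"` and `"b"` is the only one handed to the extended memory.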
With reference to the first aspect, in some implementations of the first aspect, a pre-write log (write-ahead log) is stored in the extended memory, and/or a manifest file for storing metadata is stored in the extended memory.
Manifest files are typically used to record and manage metadata such as the structure, content, configuration, and permissions of a database. This metadata supports correct operation of the database, data management, user access, and similar functions.
Under the pre-write log mechanism, every write operation first records its data in the pre-write log and only then performs the write, so that after a system crash or other failure the data can be restored to a consistent state from the log. In this way, the log can be used to recover incomplete write operations even if a failure occurs while data is being written.
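The log-first ordering can be shown with a small sketch. Here a temporary file stands in for the pre-write log kept in the extended memory, and the JSON-lines record format and function names are assumptions, not details from the application.

```python
import json
import os
import tempfile


def wal_append(path: str, key: str, value: str) -> None:
    """Append one record to the log and force it out before the write proceeds."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps({"key": key, "value": value}) + "\n")
        log.flush()
        os.fsync(log.fileno())


def wal_replay(path: str) -> dict:
    """Rebuild the in-memory table from the log after a failure."""
    table = {}
    with open(path, encoding="utf-8") as log:
        for line in log:
            record = json.loads(line)
            table[record["key"]] = record["value"]
    return table


wal_path = os.path.join(tempfile.mkdtemp(), "wal.log")
memory_table = {}
for k, v in [("a", "1"), ("b", "2"), ("a", "3")]:
    wal_append(wal_path, k, v)   # log first
    memory_table[k] = v          # then apply the write

recovered = wal_replay(wal_path)  # simulates recovery after memory_table is lost
```

Replaying the records in order reproduces the latest value for each key, so a write that reached the log but not the memory table is not lost.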
With reference to the first aspect, the computing system includes a computing device and an extended memory device, wherein the computing device includes the processor and the main memory, and the extended memory device includes the extended memory; or the computing system includes a computing device, and the computing device includes the processor, the main memory, and the extended memory.
In a second aspect, an embodiment of the present application provides a data processing method applied to a computing system, wherein the computing system includes a main memory, an extended memory, and a hard disk, the extended memory includes N levels of first file layers, N being greater than or equal to 1, and the hard disk includes M levels of second file layers, M being greater than 1. The data processing method includes:
writing first target data in the main memory into the first-level first file layer of the extended memory when a first preset condition is met, wherein, where N is greater than 1, the i-th level first file layer among the N levels of first file layers is configured to store data transferred from the (i-1)-th level first file layer, where 1 < i ≤ N; and
writing second target data stored in the N-th level first file layer of the extended memory into the first-level second file layer of the hard disk when a second preset condition is met, wherein the j-th level second file layer among the M levels of second file layers is configured to store data transferred from the (j-1)-th level second file layer, where 1 < j ≤ M.
The first target data may be some or all of the data stored in the main memory.
The second target data may be some or all of the data stored in the N-th level first file layer of the extended memory.
In this process, the path by which data in the main memory reaches the hard disk changes: the data is no longer written directly to the hard disk, but is first transferred down to the extended memory, and the data stored in the extended memory is then transferred further down to the hard disk.
Dividing the extended memory into file layers extends the main memory and, by matching the layered organization of the hard disk, connects the extended memory to the data storage of a hard disk provided with multiple file layers.
With the extended memory introduced between the main memory and the hard disk, data can be transferred between the first-level first file layer and the main memory, and between the N-th level first file layer and the hard disk. The file layers in the extended memory thus cache data layer by layer, which reduces hard disk accesses, limits the impact of the hard disk's low I/O performance on data access efficiency, shortens data processing time, and improves data access response speed in the computing system.
With reference to the second aspect, in some implementations of the second aspect, the first preset condition is determined to be met if the remaining storage space of the main memory is smaller than a first threshold, or the first preset condition is determined to be met if a first specific user operation is detected.
The main memory, the extended memory and the hard disk respectively have set storage space sizes.
When data is stored in the main memory, the extended memory and the hard disk, the total storage space, the used amount and the residual available amount of the main memory, the extended memory and the hard disk can be obtained to determine whether the residual storage space in the corresponding storage medium is sufficient.
The timing of transferring data from the main memory to the extended memory is thus determined from the remaining storage space of the main memory or from a user operation.
With reference to the second aspect, in some implementations of the second aspect, the determining that the first preset condition is met in a case where a remaining storage space of the main memory is smaller than a first threshold includes:
if the remaining storage space of the variable memory table in the main memory is smaller than the amount of data to be stored, converting the variable memory table into an invariable memory table, and, if the remaining storage space in the main memory is insufficient to create a new variable memory table, determining that the remaining storage space of the main memory is smaller than the first threshold.
This process uses the amount of data held in the different memory tables in the main memory to decide whether data in the main memory needs to be transferred to the extended memory.
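The two-step check above can be written as a small predicate. The function name, the byte-count parameters, and the treatment of the case where the free space exactly equals the incoming data size are assumptions for illustration.

```python
def first_condition_met(table_free: int, incoming: int,
                        main_free: int, new_table_size: int) -> bool:
    """Return True when main-memory data should move to the extended memory.

    table_free:     free space left in the variable memory table
    incoming:       size of the data to be stored
    main_free:      free space left in the main memory
    new_table_size: space needed to create a new variable memory table
    """
    if table_free >= incoming:
        # The data still fits; equality is treated as fitting (an assumption).
        return False
    # At this point the variable table would be converted to an invariable table.
    # The condition is met only if no new variable table can be created.
    return main_free < new_table_size
```

For example, `first_condition_met(2, 5, 32, 64)` is true: the table cannot hold the data, and the main memory cannot host a new table, so a transfer to the extended memory is due.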
With reference to the second aspect, in certain implementation manners of the second aspect, the data processing method further includes:
storing third target data stored in the (i-1)-th level first file layer among the N levels of first file layers into the i-th level first file layer when a third preset condition is met, where 1 < i ≤ N.
The third preset condition may be that the user clicks a specific instruction key; that is, the user can trigger, through a specific operation, migration of data in the extended memory from the (i-1)-th level first file layer to the i-th level first file layer.
Alternatively, the third preset condition may be that the remaining storage space in the (i-1)-th level first file layer in the extended memory is smaller than a threshold.
This determines when data needs to be transferred between different file layers in the extended memory.
With reference to the second aspect, in some implementations of the second aspect, storing the third target data stored in the (i-1)-th level first file layer among the N levels of first file layers into the i-th level first file layer when the third preset condition is met includes:
when the remaining storage space in the (i-1)-th level first file layer is smaller than a second threshold, starting from the (i-1)-th level first file layer, writing the third target data stored in the current-level first file layer into the i-th level first file layer, until the remaining storage space in the (i-1)-th level first file layer is greater than the second threshold.
The third target data is some or all of the data stored in the (i-1)-th level first file layer of the extended memory.
In this process, data is transferred between the file layers of the extended memory, so that an upper file layer has sufficient remaining storage space after transferring data to the layer below it, which ensures that data is stored effectively across the file layers of the extended memory.
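The threshold-driven transfer between adjacent layers can be sketched as a loop. The dictionary layout of a layer, the entry-count unit, the oldest-first choice of which entries to move, and the assumption that the lower layer has room are all illustrative choices not stated in the application.

```python
def free_slots(layer: dict) -> int:
    return layer["capacity"] - len(layer["data"])


def relieve_layer(layers: list, i: int, threshold: int) -> int:
    """Move entries from layer i-1 to layer i (1-based levels).

    Triggered only when layer i-1 has fewer than `threshold` free slots, and
    stops once it has more than `threshold`. Returns the number of entries moved.
    """
    upper, lower = layers[i - 2], layers[i - 1]
    if free_slots(upper) >= threshold:
        return 0
    moved = 0
    while free_slots(upper) <= threshold and upper["data"]:
        key = next(iter(upper["data"]))            # oldest entry (insertion order)
        lower["data"][key] = upper["data"].pop(key)
        moved += 1
    return moved


layers = [
    {"capacity": 4, "data": {"a": 1, "b": 2, "c": 3, "d": 4}},  # level 1, full
    {"capacity": 8, "data": {}},                                 # level 2
]
moved = relieve_layer(layers, i=2, threshold=1)
```

The same loop shape would apply to the second file layers on the hard disk, with the fifth threshold in place of the second.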
With reference to the second aspect, in certain implementation manners of the second aspect, the data processing method further includes:
if the remaining storage space of the variable memory table in the main memory is larger than the amount of data to be stored, writing the data to be stored into the variable memory table, provided that the remaining storage space of the pre-write log in the extended memory is larger than the amount of backup data for the data to be stored.
With this processing, even if a failure occurs while data is being written, the incomplete write operation can be recovered from the pre-write log, which ensures that data is written to the database effectively.
With reference to the second aspect, in some implementations of the second aspect, the second preset condition is determined to be satisfied if the remaining storage space of the nth level first file layer is less than a fourth threshold, or the second preset condition is determined to be satisfied if a second specific user operation is detected.
In some scenarios, for example when sufficient remaining storage space must be guaranteed in the extended memory, a user may trigger, through a specific operation, migration of data in the extended memory from the N-th level first file layer to the hard disk.
The extended memory has a set storage space size. When the remaining storage space of the N-th level first file layer of the extended memory is smaller than the fourth threshold, migration of data from the N-th level first file layer to the hard disk is triggered.
This determines when data in the extended memory needs to be transferred to the hard disk.
With reference to the second aspect, in certain implementation manners of the second aspect, the data processing method further includes:
storing fourth target data stored in the (j-1)-th level second file layer among the M levels of second file layers into the j-th level second file layer when a fourth preset condition is met, where 1 < j ≤ M.
The fourth preset condition may be that the user clicks a specific instruction key; that is, the user can trigger, through a specific operation, migration of data in the hard disk from the (j-1)-th level second file layer to the j-th level second file layer. Alternatively, the fourth preset condition may be that the remaining storage space in the (j-1)-th level second file layer in the hard disk is smaller than a threshold.
This determines when data needs to be transferred between different file layers in the hard disk.
With reference to the second aspect, in some implementations of the second aspect, storing the fourth target data stored in the (j-1)-th level second file layer among the M levels of second file layers into the j-th level second file layer when the fourth preset condition is met includes:
when the remaining storage space in the (j-1)-th level second file layer is smaller than a fifth threshold, starting from the (j-1)-th level second file layer, writing the fourth target data stored in the current-level second file layer into the j-th level second file layer, until the remaining storage space in the (j-1)-th level second file layer is greater than the fifth threshold.
The fourth target data is some or all of the data stored in the (j-1)-th level second file layer of the hard disk.
In this process, data is transferred between the file layers of the hard disk, so that an upper file layer has sufficient remaining storage space after transferring data to the layer below it, which ensures that data is stored effectively across the file layers of the hard disk.
With reference to the second aspect, in some implementations of the second aspect, the extended memory includes a first group of the N levels of first file layers and a second group of the N levels of first file layers;
writing the first target data in the main memory into the first-level first file layer of the extended memory when the first preset condition is met includes:
when the first preset condition is met, writing a first target key in the main memory into the first-level first file layer in the first group of N levels of first file layers, and/or writing a first target value corresponding to the first target key in the main memory into the first-level first file layer in the second group of N levels of first file layers;
writing the second target data stored in the N-th level first file layer of the extended memory into the first-level second file layer of the hard disk when the second preset condition is met includes:
when the second preset condition is met, writing a second target key stored in the N-th level first file layer in the first group of N levels of first file layers in the extended memory into the hard disk, and/or writing a second target value, corresponding to the second target key, stored in the N-th level first file layer in the second group of N levels of first file layers in the extended memory into the hard disk.
In this process, the key and the value of each key-value pair are stored separately, which reduces the metadata that must be stored, improves matching efficiency when data is looked up by key, and improves data transfer efficiency.
With reference to the second aspect, in some implementations of the second aspect, before writing the first target data in the main memory into the first level first file layer of the extended memory when the first preset condition is met, the method further includes:
formatting the extended memory with a memory file system;
configuring initialization parameters, wherein the initialization parameters include at least the file directory of the extended memory, the storage directory of the pre-write log in the extended memory, the size of the N levels of first file layers in the extended memory, and the number of file layers; and
if the remaining storage space in the extended memory is detected to meet the capacity required for layering the N levels of first file layers, initializing the N levels of first file layers in the extended memory and initializing the pre-write log based on the initialization parameters.
In this way, the N levels of first file layers are effectively initialized in the newly added extended memory.
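The initialization parameters and the capacity check can be sketched as follows. The field names, the example paths, the uniform per-layer size, and the `L1..LN` directory naming are hypothetical; the sketch only plans the layout and does not format or touch any device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InitParams:
    ext_dir: str       # file directory of the extended memory (hypothetical path)
    wal_dir: str       # storage directory of the pre-write log (hypothetical path)
    layer_size: int    # size of each first file layer in bytes (assumed uniform)
    layer_count: int   # number of first file layers, N


def plan_initialization(params: InitParams, free_bytes: int) -> dict:
    """Check the capacity needed for layering, then return the planned layout."""
    required = params.layer_size * params.layer_count
    if free_bytes < required:
        raise ValueError(f"need {required} bytes for layering, only {free_bytes} free")
    return {
        "layers": [f"{params.ext_dir}/L{k}" for k in range(1, params.layer_count + 1)],
        "wal": f"{params.wal_dir}/wal.log",
    }


params = InitParams(ext_dir="/mnt/ext/db", wal_dir="/mnt/ext/wal",
                    layer_size=1 << 20, layer_count=3)
plan = plan_initialization(params, free_bytes=8 << 20)
```

Rejecting the configuration before any layer is created keeps a partially layered extended memory from being left behind when space is short.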
With reference to the second aspect, in certain implementation manners of the second aspect, the data processing method further includes:
reading fifth target data matching the data to be read from the main memory when a fifth preset condition is met; or
reading the fifth target data from the extended memory when a sixth preset condition is met.
In this way, in the improved storage architecture, the processor can read the target data from the main memory in the conventional manner, which minimizes changes to the original system's data reading function after the extended memory is added and eases implementation; or the processor can read the target data directly from the newly added extended memory, so that the data is obtained directly in memory and the data reading response speed is improved. Providing different data reading implementations accommodates data reading requirements in different scenarios.
With reference to the second aspect, in some implementations of the second aspect, reading the fifth target data matching the data to be read from the main memory when the fifth preset condition is met includes:
reading the fifth target data from the main memory when the fifth target data matching the data to be read is stored in the main memory; or
when the fifth target data matching the data to be read is stored in the extended memory, writing the fifth target data from the first-level first file layer of the extended memory into the main memory, and reading the fifth target data from the main memory; or
when the fifth target data matching the data to be read is stored in the hard disk, writing the fifth target data from the hard disk into the N-th level first file layer of the extended memory, writing the fifth target data from the first-level first file layer of the extended memory into the main memory, and reading the fifth target data from the main memory.
In this implementation, the read is handled according to whether the target data is stored in the main memory, the extended memory, or the hard disk; the file layers in the extended memory are used to transfer the target data among the hard disk, the extended memory, and the main memory, and the target data is finally read from the main memory.
With reference to the second aspect, in some implementations of the second aspect, writing the fifth target data from the first-level first file layer of the extended memory into the main memory includes:
determining the target-level first file layer in which the fifth target data is located in the extended memory; and
when N is greater than 1 and the target-level first file layer is not the first-level first file layer, starting from the target-level first file layer of the extended memory, writing the fifth target data stored in the current-level first file layer into the adjacent upper-level first file layer until the fifth target data is written into the first-level first file layer, and then writing the fifth target data from the first-level first file layer of the extended memory into the main memory.
In this way, the multiple file layers in the extended memory are used to transfer the target data between levels within the extended memory, so that data moves among the hard disk, the extended memory, and the main memory, and the target data is finally read from the main memory.
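The layer-by-layer promotion toward the main memory can be sketched as below. Representing each layer as a plain dictionary and moving (rather than copying) the entry are assumptions; the hard disk branch is left out of this sketch.

```python
def read_via_main_memory(key, main_memory: dict, ext_layers: list):
    """Read `key` from main memory, promoting it from the extended memory if needed.

    ext_layers[0] is the level-1 first file layer, ext_layers[-1] is level N.
    """
    if key in main_memory:
        return main_memory[key]
    # Determine the target-level first file layer that holds the key.
    target = next((idx for idx, layer in enumerate(ext_layers) if key in layer), None)
    if target is None:
        return None  # the hard disk path would start here; not modelled
    # Write the entry into the adjacent upper layer, one level at a time.
    for idx in range(target, 0, -1):
        ext_layers[idx - 1][key] = ext_layers[idx].pop(key)
    # Finally write from level 1 into the main memory and read it there.
    main_memory[key] = ext_layers[0].pop(key)
    return main_memory[key]


main = {}
ext = [{}, {}, {"k": "v"}]   # the key starts in level 3 of N = 3
result = read_via_main_memory("k", main, ext)
```

A second read of `"k"` is then served from the main memory without touching the extended memory.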
With reference to the second aspect, in some implementations of the second aspect, reading the fifth target data from the extended memory when the sixth preset condition is met includes:
reading the fifth target data from the extended memory when the fifth target data is stored in the extended memory; or, when the fifth target data is stored in the hard disk, transferring the fifth target data from the hard disk to the N-th level first file layer of the extended memory, and reading the fifth target data from the extended memory.
The target data can therefore be read directly from the extended memory, which improves data reading efficiency and speed.
With reference to the second aspect, in some implementations of the second aspect, the hard disk includes M levels of second file layers, wherein the first-level second file layer is configured to transfer stored data to the N-th level first file layer, and M is greater than 1;
transferring the fifth target data from the hard disk to the N-th level first file layer of the extended memory includes:
determining the target-level second file layer in which the fifth target data is located in the hard disk; and
when the target-level second file layer is the first-level second file layer in the hard disk, writing the fifth target data from the first-level second file layer into the N-th level first file layer of the extended memory; or
when the target-level second file layer is not the first-level second file layer, starting from the target-level second file layer of the hard disk, writing the fifth target data stored in the current-level second file layer into the adjacent upper-level second file layer until the fifth target data is written into the first-level second file layer, and then writing the fifth target data from the first-level second file layer into the N-th level first file layer of the extended memory.
In this process, the (j-1)-th level second file layer in the hard disk is configured to store data transferred from the j-th level second file layer.
In this way, the multiple file layers in the hard disk are used to transfer the target data between the file layers of the hard disk and from the hard disk to the extended memory, which keeps data access orderly.
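Reading directly from the extended memory, with a fallback that lifts data up through the second file layers of the hard disk, can be sketched as follows. The dictionary-per-layer model, the function name, and moving rather than copying entries are assumptions for illustration.

```python
def read_from_extended(key, ext_layers: list, disk_layers: list):
    """Read `key` from the extended memory, pulling it up from the hard disk if needed.

    ext_layers[-1] is the level-N first file layer; disk_layers[0] is the
    level-1 second file layer.
    """
    for layer in ext_layers:
        if key in layer:
            return layer[key]          # served directly from the extended memory
    # Determine the target-level second file layer on the hard disk.
    target = next((idx for idx, layer in enumerate(disk_layers) if key in layer), None)
    if target is None:
        return None
    # Write the entry into the adjacent upper second file layer until level 1.
    for idx in range(target, 0, -1):
        disk_layers[idx - 1][key] = disk_layers[idx].pop(key)
    # Level-1 second file layer -> level-N first file layer, then read it there.
    ext_layers[-1][key] = disk_layers[0].pop(key)
    return ext_layers[-1][key]


ext = [{"hot": 1}, {}]
disk = [{}, {}, {"cold": 2}]
hot = read_from_extended("hot", ext, disk)
cold = read_from_extended("cold", ext, disk)
```

Unlike the path through the main memory, the entry here stops at the N-th level first file layer and is read from the extended memory without further promotion.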
A third aspect of an embodiment of the present application provides a computing device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method according to the second aspect when the computer program is executed.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method according to the second aspect.
A fifth aspect of the embodiments of the present application provides a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which when run in an electronic device, causes a processor in the electronic device to perform the steps of the method as described in the second aspect above.
Detailed Description
Embodiments of the technical scheme of the present application will be described in detail below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present application, and thus are merely examples, and are not intended to limit the scope of the present application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terms used herein are for the purpose of describing particular embodiments only and are not intended to limit the application. The terms "comprising" and "having" and any variations thereof in the description, the claims, and the above description of the drawings are intended to cover non-exclusive inclusions.
In the description of embodiments of the present application, the technical terms "first," "second," and the like are used merely to distinguish between different objects and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated, a particular order or a primary or secondary relationship. In the description of the embodiments of the present application, the meaning of "plurality" is two or more unless explicitly defined otherwise.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art will understand, both explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
In the description of the embodiments of the present application, the term "and/or" merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate that A exists alone, that A and B both exist, or that B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
Before explaining the scheme in the embodiment of the application, related concepts are introduced and explained:
Main memory, also known as memory (RAM), is the temporary storage for programs and data in a server and is typically located close to the processor.
Extended memory is memory outside the main memory that serves as an extension of the main memory.
The file layer is a data storage layer divided in the storage space based on the file system. The data storage hierarchy is the structure in which the file system organizes and manages data. Such a hierarchy facilitates efficient access and management of data on the storage device.
In order to illustrate the technical solution according to the embodiments of the present application, the following description is made by specific embodiments.
In some embodiments, as shown in connection with fig. 1, a computing system is presented, comprising:
A processor;
the main memory is used for storing data;
an extended memory, wherein the extended memory comprises N-level first file layers, where N is greater than or equal to 1; a first-level first file layer among the N-level first file layers is used for storing data transferred from the main memory; and, in the case where N is greater than 1, an i-th-level first file layer among the N-level first file layers is used for storing data transferred from the (i-1)-th-level first file layer, where 1 < i ≤ N;
a hard disk, wherein the hard disk comprises M-level second file layers, where M is greater than 1; a first-level second file layer among the M-level second file layers is used for storing data transferred from the Nth-level first file layer; and a j-th-level second file layer among the M-level second file layers is used for storing data transferred from the (j-1)-th-level second file layer, where 1 < j ≤ M.
The above-described computing system corresponds to a specific hardware architecture.
In some alternative embodiments, the hardware architecture corresponding to the computing system includes a computing device (e.g., a server device, a terminal device with computing power, etc.) and a separate extended memory device connected to the computing device. Optionally, the extended memory included in the extended memory device is a memory adopting the CXL (Compute Express Link) protocol, so as to provide an efficient, low-latency interconnection with the server, enable high-performance, low-latency data transmission and memory sharing, and ensure the access efficiency of the data in the extended memory. In this architecture, the processor and the main memory are located in the computing device. In this case, the computing system is formed as a system that includes a computing device and an extended memory device that is set independently of the computing device.
In alternative embodiments, the extended memory may be formed as a memory pool for data storage using the CXL protocol. A memory storage device based on the CXL protocol may be referred to as a CXL memory component; such components effectively extend the memory devices and increase the storage capacity of a server.
In some implementations, a plurality of CXL memory components may be integrated as resources based on a pooling technology to obtain a remote memory device including the plurality of CXL memory components, where the CXL memory components are formed as storage modules in the remote memory device. An interconnection network based on a cache coherence protocol may be established between different servers and the remote memory device, so that a processor in a server can directly access the storage modules in the remote memory device. This increases memory capacity as far as possible and satisfies the memory requirements of data-intensive applications.
In other alternative embodiments, the hardware architecture corresponding to the computing system includes a computing device, and the processor, the main memory and the extended memory are all arranged in the computing device. In this case, the computing system is formed as a system that includes the computing device.
Alternatively, in the above two hardware architectures, the hard disk may be built in the computing device or external to the computing device, and may be set according to requirements.
In the above computing system, a data storage architecture is formed among the main memory, the expansion memory and the hard disk.
In the data storage architecture, as shown in fig. 2, the extended memory and the hard disk are different storage media. File layers are divided in the extended memory, so that the extended memory provided with file layers extends the main memory and, by matching the layering of the hard disk, connects to the data storage of the hard disk, which is provided with a plurality of file layers.
The data can be transferred between the main memory and the extended memory, and can be transferred between the extended memory and the hard disk, and the processor can directly read the data in the main memory or the extended memory.
Data in the extended memory can be transferred to and from the main memory and the hard disk based on the divided file layers, so that the extended memory serves as transitional storage between the main memory and the hard disk. Because data is cached in the extended memory, it can be read and written in memory rather than on the hard disk, which improves read and write performance across the whole flow.
The extended memory supports creation of an N-level first file layer.
The extended memory transfers data to and from the main memory through the first-level first file layer among the N-level first file layers, and transfers data to and from the hard disk through the Nth-level first file layer.
In some alternative embodiments, as shown in fig. 2, the main memory stores a variable memory table for storing data, and the variable memory table is converted into an immutable memory table when the data writing amount reaches a preset condition. Correspondingly, a new variable memory table is created in the main memory, and new data continues to be written into the new variable memory table. The immutable memory table formed by the conversion no longer accepts new write operations, but can still be read.
The variable memory table is a data structure in the memory and is used for temporarily storing data in the database. Unlike the contents of the immutable memory table, the contents of the variable memory table may be modified or updated at runtime.
The immutable memory table is another data structure in memory, as opposed to the variable memory table. Once data is written to the immutable memory table, the data cannot be modified or deleted (unless the entire table is replaced or deleted).
The immutable memory table provides an intermediate state of data storage. With the immutable memory table in place, write operations can continue on the newly created variable memory table, and converting the variable memory table into the immutable memory table ensures that the data in the immutable memory table can be safely written into the extended memory by a background thread, which reduces the impact on database performance.
The first-level first file layer among the N-level first file layers in the extended memory is used for storing the data transferred from the immutable memory table.
The preset condition is, for example, that the data writing amount of the variable memory table reaches a quantity threshold, or that the remaining storage space in the variable memory table is insufficient for writing the data to be stored, etc. Thus, the data in the immutable memory table is transferred to the expansion memory, and the remaining memory space in the main memory increases.
After the variable memory table is converted into the non-variable memory table, the data table is converted from readable and writable to readable and non-writable. And a variable memory table can be newly built under the condition that the residual memory space in the main memory is sufficient. By means of conversion of the data table in the main memory, orderly transfer of data from the main memory to the newly added expansion memory is realized.
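The conversion between the two kinds of memory table can be illustrated with a minimal Python sketch. The class name `MemTableSet`, the entry-count threshold `max_entries`, and the use of a read-only mapping view are assumptions made for illustration only; the embodiments do not prescribe a concrete data structure or preset condition.

```python
from types import MappingProxyType


class MemTableSet:
    """Hypothetical holder for one variable table plus frozen immutable tables."""

    def __init__(self, max_entries=2):
        self.max_entries = max_entries   # assumed preset condition: entry count
        self.mutable = {}                # readable and writable
        self.immutable = []              # readable, not writable

    def put(self, key, value):
        # When the preset condition is reached, freeze the current table
        # and create a new variable table for subsequent writes.
        if len(self.mutable) >= self.max_entries and key not in self.mutable:
            self.immutable.append(MappingProxyType(self.mutable))
            self.mutable = {}
        self.mutable[key] = value

    def get(self, key):
        # Check the variable table first, then the newest frozen table first.
        if key in self.mutable:
            return self.mutable[key]
        for table in reversed(self.immutable):
            if key in table:
                return table[key]
        return None


tables = MemTableSet(max_entries=2)
for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    tables.put(k, v)
```

In this sketch the third write triggers the conversion: the table holding `a` and `b` becomes read-only, and `c` lands in a newly created variable table.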
The number of layers N of the first file layer of the extended memory may be configured during configuration. N is greater than or equal to 1.
N may be set to 1, in which case the extended memory contains only one level of the first file layer, and the first-level first file layer is also the Nth-level first file layer.
N may be set to be greater than 1, in which case the extended memory contains multiple levels of first file layers, and the first-level first file layer and the Nth-level first file layer are different file layers, as shown in fig. 2. Data can be transferred among the multiple levels of file layers of the extended memory and stored in the extended memory in a layered manner, which accommodates the multi-level storage requirements of different data in the extended memory.
In the case where N is greater than 1, on the one hand, the first-level first file layer among the N-level first file layers is used for storing data transferred from the main memory, the i-th-level first file layer is used for storing data transferred from the (i-1)-th-level first file layer, and the Nth-level first file layer is used for transferring its stored data to the hard disk.
On the other hand, the Nth-level first file layer among the N-level first file layers is used for storing data transferred from the hard disk.
Optionally, the i-th-level first file layer is used for storing data transferred from the (i+1)-th-level first file layer, and the first-level first file layer is used for transferring data to the main memory.
The latter case corresponds to some implementations of reading data: data in the hard disk can be transferred to the extended memory for the processor to read, or the data to be read can be transferred layer by layer from the extended memory to the main memory for the processor to read.
In the above cases, by choosing different values of N, data can be stored in the extended memory with a suitable number of file levels, which accommodates the data access requirements of different scenarios.
The hard disk supports the creation of an M-level second file layer, as shown in connection with fig. 2.
The hard disk transfers data to and from the extended memory through the first-level second file layer among the M-level second file layers.
The first-level second file layer among the M-level second file layers is used for storing data transferred from the Nth-level first file layer, and the j-th-level second file layer is used for storing data transferred from the (j-1)-th-level second file layer.
Alternatively, the (j-1)-th-level second file layer is used for storing data transferred from the j-th-level second file layer, and the data in the first-level second file layer is transferred to the Nth-level first file layer so as to be read by the processor through the extended memory.
The hard disk contains multiple levels of second file layers, which are used for hierarchical storage of data that needs to be stored for a long time. When data stored in the hard disk needs to be read, the relevant target data is transferred layer by layer into the extended memory, so that the processor can read the target data directly from the extended memory, or the target data can be further transferred from the extended memory into the main memory and read by the processor directly from the main memory.
The M-level second file layer in the hard disk and the N-level first file layer in the extended memory may be file layers managed separately, in which case there is not necessarily some sequential relationship between the level numbers of the N-level first file layer and the M-level second file layer.
Or both are file layers that are managed in unison, in which case the file layer numbers of the M-level second file layer and the N-level first file layer may be in sequential relationship.
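The layer structure described above can be sketched in Python as follows. The names `FileLayer`, `build_layers` and `unified_numbers` are hypothetical, and the unified numbering scheme shown here (hard disk levels continuing after the N extended memory levels) is only one assumed way of putting the level numbers in a sequential relationship.

```python
from dataclasses import dataclass, field


@dataclass
class FileLayer:
    medium: str          # "extended_memory" or "hard_disk"
    level: int           # level number within its own medium (1-based)
    files: list = field(default_factory=list)


def build_layers(n, m):
    """Create N first file layers and M second file layers."""
    if n < 1:
        raise ValueError("N must be >= 1")
    if m <= 1:
        raise ValueError("M must be > 1")
    first = [FileLayer("extended_memory", i) for i in range(1, n + 1)]
    second = [FileLayer("hard_disk", j) for j in range(1, m + 1)]
    return first, second


def unified_numbers(first, second):
    """Assumed unified management: disk levels continue after the N memory levels."""
    n = len(first)
    return [layer.level for layer in first] + [n + layer.level for layer in second]


first_layers, second_layers = build_layers(n=2, m=3)
```

With separate management, each list simply keeps its own 1-based numbering and `unified_numbers` is not needed.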
In an optional embodiment, when data is transferred among the main memory, the extended memory and the hard disk, the transfer is based on the heat of the data in the database: the data in the main memory is gradually transferred downward and stored in each level of file layer in the extended memory, and the data in the extended memory is gradually transferred downward and stored in each level of file layer in the hard disk. As a result, the hot data and part of the warm data are stored in memory (including the main memory and the extended memory), and the rest of the warm data and the cold data sink to the hard disk.
Optionally, a plurality of SST files may be included in the N-level first file layer and the M-level second file layer for storing data.
Each file layer contains a series of SST (Sorted String Table, ordered string table) files, and redundant data from layer to layer is dynamically compressed and combined.
In the data storage process, the data to be transferred in the current file layer can be combined and then compressed and transferred to the SST file of the next file layer. Conversely, in the data reading process, the target data to be read in the SST file of the current level file layer may be decompressed and restored to be stored in the previous level file layer.
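The compress-on-write and decompress-on-read behaviour can be sketched as follows. The function names are hypothetical, and the choice of JSON serialization with `zlib` compression is an assumption made for illustration; the embodiments do not specify a serialization or compression format.

```python
import json
import zlib


def pack_for_next_layer(records):
    """Sort already-merged records by key and compress them for the next-level SST file."""
    ordered = dict(sorted(records.items()))          # SST keeps keys sorted
    return zlib.compress(json.dumps(ordered).encode("utf-8"))


def unpack_for_previous_layer(blob):
    """Decompress an SST payload and restore its records for the previous level."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


blob = pack_for_next_layer({"k2": "v2", "k1": "v1"})
restored = unpack_for_previous_layer(blob)
```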
In some optional implementations, when data is transferred between first file layers of the extended memory, between second file layers of the hard disk, or between the Nth-level first file layer of the extended memory and the first-level second file layer of the hard disk, the data to be transferred in one file layer may be merged, then compressed, and then transferred to the other file layer.
Hot data is typically stored at a lower level (a low-numbered file layer, e.g., the first-level file layer), and as data becomes colder it is gradually transferred down to a higher level (a high-numbered file layer, e.g., the Nth-level file layer). When data is read, the target data needs to be decompressed layer by layer, starting from the file layer where it is located, and transferred to the memory for reading.
The merging of data may screen the data content of the files stored in the same file layer and merge data content with the same heat into one file. It may also de-duplicate and merge identical or similar data stored repeatedly in the file layer; for example, the same key-value data may exist in different SST files of the same file layer, and during merging and compression or during decompression, the newer version of the data is kept and the older version is removed. It may also include both of the above, or be another data merging method. The merging method is not particularly limited here.
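The de-duplicating merge can be sketched as follows. The representation of an SST file as a list of `(key, version, value)` tuples and the rule that a larger version number is newer are assumptions for illustration; heat-based screening is not shown.

```python
def merge_sst_files(files):
    """Merge several SST-like files, keeping the newest version of each key.

    Each file is a list of (key, version, value) tuples; a larger version
    number is assumed to be newer.
    """
    newest = {}
    for sst in files:
        for key, version, value in sst:
            if key not in newest or version > newest[key][0]:
                newest[key] = (version, value)
    # Emit entries in key order, as a sorted string table would store them.
    return [(k, ver, val) for k, (ver, val) in sorted(newest.items())]


sst_a = [("apple", 1, "red"), ("pear", 1, "green")]
sst_b = [("apple", 2, "yellow"), ("plum", 1, "purple")]
merged = merge_sst_files([sst_a, sst_b])
```

Here the older `apple` entry from `sst_a` is dropped in favour of version 2 from `sst_b`, and the output remains sorted by key.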
The data transfer specifically requires that the original data be stored in a new storage area and deleted from the original storage area.
Optionally, data transfer between memories may be performed using a DSA (Data Streaming Accelerator). The DSA is a high-performance data copy and transformation accelerator, mainly applied in scenarios that require high-performance data copying and transformation. It supports high-performance data movement, can improve the efficiency of data movement and transformation operations, and frees CPU cycles for higher-level functions, thereby improving overall system performance.
The process may include transferring data within the main memory, transferring data between the main memory and the expansion memory, and transferring data between various storage tiers in the expansion memory. The performance of data transfer by DSA is improved by 3-5 times compared with that of data transfer by CPU or hard disk controller.
Optionally, as shown in connection with fig. 2, data is transferred between the M-level second file layers in the hard disk using a hard disk controller, i.e. the controller shown in the figure.
Therefore, the data transfer performance can be improved, the CPU computing power can be unloaded, the data reading and writing performance can be further improved, and the data processing performance is improved.
In some alternative embodiments, the data type stored in the database is key-value pair data. Optionally, in typical key-value pair data, the key occupies 128 to 1024 bytes, and the value occupies more than 2048 bytes.
In order to realize separate storage of keys and values in key value pairs, as shown in fig. 3, the extended memory includes a first set of N-level first file layers and a second set of N-level first file layers.
The first-level first file layer in the first group of N-level first file layers is used for storing a first target key transferred from the main memory, and the first-level first file layer in the second group of N-level first file layers is used for storing a first target value corresponding to the first target key transferred from the main memory.
Optionally, the variable memory table in the main memory comprises a first variable memory table for storing the target key and a second variable memory table for storing the target value corresponding to the target key, wherein the first group of N-level first file layers corresponds to the first variable memory table, and the second group of N-level first file layers corresponds to the second variable memory table.
Optionally, the hard disk may include a first group of M-level second file layers corresponding to the first group of N-level first file layers, and a second group of M-level second file layers corresponding to the second group of N-level first file layers.
With this arrangement, two groups of N-level first file layers are divided in the extended memory, two groups of M-level second file layers are divided in the hard disk, and different variable memory tables can be set in the main memory.
The first-level first file layers in the first group of N-level first file layers are used for storing first target keys transferred from the main memory, and the first-level first file layers in the second group of N-level first file layers are used for storing first target values corresponding to the first target keys transferred from the main memory.
The first-level second file layer in the first group of M-level second file layers is used for storing a second target key transferred from the Nth-level first file layer in the first group of N-level first file layers in the extended memory, and the first-level second file layer in the second group of M-level second file layers is used for storing a second target value, corresponding to the second target key, transferred from the Nth-level first file layer in the second group of N-level first file layers in the extended memory.
Because the keys and the values of the key-value pairs are stored separately, the metadata that needs to be stored is reduced, the matching efficiency when searching data by key is improved, and the data transfer efficiency is improved.
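The separate storage of keys and values can be sketched as follows. The class `SeparatedStore` and the integer handle that links a key to its value are assumptions for illustration; how a key is associated with the location of its value is not specified here.

```python
class SeparatedStore:
    """Hypothetical pair of layer groups: one for keys, one for values."""

    def __init__(self):
        self.key_group = {}     # first group: key -> handle of its value
        self.value_group = {}   # second group: handle -> value
        self._next_handle = 0

    def put(self, key, value):
        handle = self._next_handle
        self._next_handle += 1
        self.value_group[handle] = value
        self.key_group[key] = handle

    def get(self, key):
        # Matching touches only the small key entries; the large value
        # is fetched once, after the key has been located.
        handle = self.key_group.get(key)
        return None if handle is None else self.value_group[handle]


store = SeparatedStore()
store.put("user:1", "x" * 2048)
```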
In some alternative embodiments, as shown in connection with fig. 2 and 3, the extended memory also supports the creation of a write-ahead log (WAL), a manifest file (Manifest) for storing metadata, and the like.
The manifest file is typically used to record and manage metadata about the structure, content, configuration, and permissions of the database, including the version of the database, column family information, SST file lists, option configurations, etc. This metadata supports the correct operation of the database, data management, user access, and similar functions.
In the write-ahead log mechanism, every write operation first records the data in the write-ahead log and then performs the write, which ensures that after a system crash or other failure the data can be restored to a consistent state based on the write-ahead log. In this way, the log file can be used to recover unfinished write operations even if a failure occurs during the data write process.
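The log-then-write ordering and the replay-based recovery can be sketched as follows. The class `WalStore` is hypothetical and uses an in-memory list in place of a persistent log, so it only illustrates the ordering of operations, not durability.

```python
class WalStore:
    """Hypothetical store that logs each write before applying it."""

    def __init__(self):
        self.log = []      # stands in for the write-ahead log
        self.table = {}    # stands in for the variable memory table

    def put(self, key, value):
        self.log.append((key, value))   # 1. record the write in the log first
        self.table[key] = value         # 2. then perform the write

    def recover(self):
        # Rebuild the table by replaying the log in write order.
        self.table = {}
        for key, value in self.log:
            self.table[key] = value


db = WalStore()
db.put("k", "v1")
db.put("k", "v2")
db.table.clear()      # simulate losing the in-memory state in a crash
db.recover()
```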
In the data storage architecture described above, and as shown in connection with FIG. 2, an optional data writing process includes:
The client writes data into the main memory by calling the database interface (DB API), and the data is also written into the write-ahead log in the hard disk. When the total number of immutable memory tables in the main memory exceeds a set threshold, the data in the immutable memory tables is written into the first-level first file layer of the extended memory, and SST files are created in the first-level first file layer of the extended memory to store the newly written data from the immutable memory tables.
Data transferred down from the main memory to the extended memory can be stored in layers according to the different file layers in the extended memory, and data transferred down from the extended memory to the hard disk is then stored in layers according to the different file layers in the hard disk.
Multiple SST files may be included in a file layer, and since the data in SST files may be from different immutable memory tables, redundant data may exist between the multiple SST files in a file layer.
For redundant data within the N-level first file layers in the extended memory or the M-level second file layers in the hard disk, data merging and compression may be performed, and an SST file is newly created in the next-level file layer to store the merged and compressed data set. For redundant data across multiple levels of file layers, data merging and compression are performed layer by layer, and SST files are correspondingly created in the next-level file layer to store the merged data set, so that data is stored in an orderly manner in the main memory, the extended memory and the hard disk.
In this process, the CPU executes the processing logic that merges and compresses the data and writes it to the new SST file, and updates the corresponding metadata in the manifest file.
Wherein the manifest file is updated whenever the database storage structure or organization changes, such as creating a new SST file or deleting an old SST file. The manifest file may be stored in an extended memory.
In the data storage architecture described above, and as shown in connection with FIG. 2, an optional data read process includes:
When the client calls the DB API to read data, the requested key is first searched for in the variable memory table in the main memory. If the key is not found in the variable memory table, the search continues in the immutable memory table. If the key is not found in the immutable memory table either, that is, the requested key is not found in the main memory, the search continues downward through the indexes according to the hierarchical structure of the file layers in the extended memory.
Starting from the first-level first file layer, the indexes are used to judge whether any of the SST files in each layer contains the requested key. If not, the search continues in the next-level first file layer until the key is found in an SST file or until the last-level first file layer, namely the Nth-level first file layer, has been searched. If the requested key is not found in any level of the first file layers in the extended memory, the search continues downward through the indexes according to the hierarchical structure of the file layers in the hard disk.
Starting from the first-level second file layer, the indexes are used to judge whether any of the SST files in each layer contains the requested key. If not, the search continues in the next-level second file layer until the key is found in an SST file or until the last-level second file layer, namely the Mth-level second file layer, has been searched.
The above-described process of locating the requested key may use a bloom filter to quickly exclude SST files that do not contain the requested key. For SST files containing a requested key, the index of the SST file is used to quickly locate a storage location where the requested key may exist.
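A Bloom filter of the kind mentioned above can be sketched as follows. The bit-array size, the number of hash functions, and the use of SHA-256 with a seed prefix are arbitrary assumptions for illustration; a production filter would size these from the expected number of keys and the acceptable false-positive rate.

```python
import hashlib


class BloomFilter:
    """Small illustrative Bloom filter; sizes here are arbitrary assumptions."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # integer used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


bloom = BloomFilter()
for k in ("alpha", "beta"):
    bloom.add(k)
```

An SST file whose filter returns `False` for the requested key can be skipped without consulting its index; a `True` result still requires the index lookup, because false positives are possible.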
When the requested key is found in the main memory, the value corresponding to the requested key is read directly from the main memory and returned to the client. When the requested key is found in the extended memory, the value corresponding to the requested key is read directly from the extended memory and returned to the client, or the value is transferred layer by layer to a variable memory table of the main memory, read there, and returned to the client. When the requested key is found in the hard disk, the value corresponding to the requested key is transferred layer by layer to the extended memory, read there, and returned to the client, or the value is transferred layer by layer to the extended memory and then layer by layer to the variable memory table of the main memory, read there, and returned to the client.
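The search order of the read process can be sketched as follows. The function `lookup` and the representation of each SST file as a plain dictionary are assumptions for illustration; index lookups, Bloom filters, and the upward transfer of the value after a hit are omitted.

```python
def lookup(key, mutable, immutables, first_layers, second_layers):
    """Return (value, location) for the first tier that holds the key.

    Layers are lists of SST-like dicts, ordered from level 1 downward.
    """
    if key in mutable:
        return mutable[key], "main memory (variable table)"
    for table in immutables:
        if key in table:
            return table[key], "main memory (immutable table)"
    for medium, layers in (("extended memory", first_layers),
                           ("hard disk", second_layers)):
        for level, sst_files in enumerate(layers, start=1):
            for sst in sst_files:
                if key in sst:
                    return sst[key], f"{medium} level {level}"
    return None, "not found"


first = [[{"b": 2}], [{"c": 3}]]          # N = 2
second = [[{"d": 4}], [{"e": 5}]]         # M = 2
hit = lookup("e", {"a": 1}, [], first, second)
```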
In the computing system, as the expansion memory is added, the transfer process of the data in the main memory to the hard disk is changed, the data is not directly written into the hard disk any more, but the data is required to be transferred downwards to the expansion memory, and then the data stored in the expansion memory is further transferred downwards to the hard disk by the expansion memory.
In the embodiment of the application, the extended memory is added between the main memory and the hard disk and is used as a transitional storage space between the main memory and the hard disk, so that the access operation of data operation to the hard disk is reduced, the interference of low I/O performance of the hard disk on the data access efficiency is reduced, the time consumption of data processing is reduced, and the data access response speed is improved.
In some embodiments, the embodiment of the application provides a data processing method, which is applied to a computing system, wherein the computing system comprises a main memory, an extended memory and a hard disk, the extended memory comprises N-level first file layers, N is more than or equal to 1, the hard disk comprises M-level second file layers, and M is more than 1. The method, when executed, may be implemented by a processor in a computing system.
In one implementation, corresponding to a data storage process, the data processing method includes:
In the case where a first preset condition is met, first target data in the main memory is written into the first-level first file layer of the extended memory.
In the case where N is greater than 1, the i-th-level first file layer among the N-level first file layers is used for storing data transferred from the (i-1)-th-level first file layer, where 1 < i ≤ N.
In the case where a second preset condition is met, second target data stored in the Nth-level first file layer of the extended memory is written into the first-level second file layer in the hard disk.
The j-th-level second file layer among the M-level second file layers is used for storing data transferred from the (j-1)-th-level second file layer, where 1 < j ≤ M.
The first target data may be part or all of the data stored in the main memory. Optionally, the first target data may be part or all of the data stored in the variable memory table and the immutable memory table in the main memory. Alternatively, it may be part or all of the data stored in the immutable memory tables, for example, data formed by combining data that has the same, relatively low heat across multiple immutable memory tables, or data determined in other manners.
The second target data may be part or all of the data stored in the nth stage first file layer of the extended memory. The partial data is, for example, data formed by combining data with the same heat and relatively low heat in a plurality of SST files in the nth-level first file layer, or data determined in other manners.
In the process, an extended memory is introduced between the main memory and the hard disk, data transfer can be performed between a first-stage file layer in N-stage file layers and the main memory, data transfer is performed between an N-stage file layer and the hard disk, data caching is realized by means of the extended memory provided with a storage level of a data file, access of data operation to the hard disk is reduced, interference of low I/O performance of the hard disk on data access efficiency is reduced, time consumption of data processing is reduced, and data access response speed in a computing system is improved.
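The level-by-level downward transfer of the storage process can be sketched as follows. The function `write_down`, the per-tier capacities used as the preset conditions, and the choice to move the oldest entries first are all assumptions for illustration; merging and compression during transfer are omitted.

```python
def write_down(tiers, capacities, data):
    """Store `data` in tier 0, spilling the oldest entries downward on overflow.

    tiers[0] stands for main memory, tiers[1..N] for the first file layers,
    and the remaining tiers for the second file layers on the hard disk.
    """
    tiers[0].extend(data)
    for idx in range(len(tiers) - 1):
        overflow = len(tiers[idx]) - capacities[idx]
        if overflow > 0:                       # assumed preset condition
            moved = tiers[idx][:overflow]      # oldest entries first
            del tiers[idx][:overflow]          # transfer = store, then delete
            tiers[idx + 1].extend(moved)
    return tiers


# One main-memory tier, N = 1 first file layer, M = 2 second file layers.
tiers = [[], [], [], []]
capacities = [2, 2, 2, 2]
write_down(tiers, capacities, ["r1", "r2", "r3", "r4", "r5"])
```

In this sketch data never moves from the main memory directly to the hard disk: every record that reaches a hard disk tier has first passed through the extended memory tier.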
In some embodiments, the first preset condition is determined to be met if the remaining storage space of the main memory is less than a first threshold. Or in case a first specific user operation is detected, determining that the first preset condition is satisfied.
The main memory, the extended memory and the hard disk respectively have set storage space sizes.
When data is stored in the main memory, the extended memory and the hard disk, the total storage space, the used amount and the residual available amount of the main memory, the extended memory and the hard disk can be obtained to determine whether the residual storage space in the corresponding storage medium is sufficient.
The first threshold may be a fixed value for determining whether the remaining storage space is sufficient. And under the condition that the residual storage space of the main memory is smaller than the fixed value, determining that the first preset condition is met.
Or the first threshold may be a dynamic value that determines whether the remaining storage space is sufficient based on the current data size of the data to be stored.
In this way, the timing at which data in the main memory needs to be transferred to the extended memory is determined based on the size of the remaining storage space of the main memory.
In one embodiment, if the remaining storage space of the variable memory table in the main memory is smaller than the data amount of the data to be stored, the variable memory table is converted into the non-variable memory table, and if the remaining storage space in the main memory is not enough to create the variable memory table, it is determined that the remaining storage space of the main memory is smaller than the first threshold, and it is determined that the first preset condition is met. And the time discrimination of whether the data in the main memory needs to be transferred to the expansion memory is realized by combining the data quantity of different memory tables in the main memory.
The first specific user operation is, for example, a click trigger operation of a specific instruction key by a user. In some scenarios, for example, when it is required to ensure that there is enough remaining storage space in the main memory, a user may trigger, through a specific operation, migration of data in the main memory into the extended memory, and write first target data in the main memory into a first file layer of a first level of the extended memory.
In this way, the timing at which data in the main memory needs to be transferred to the extended memory is determined based on the user's operation.
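The alternatives for the first preset condition can be summarized in a short sketch. The function `first_condition_met` and its parameter names are hypothetical, and measuring space in bytes is an assumption for illustration.

```python
def first_condition_met(remaining_bytes, fixed_threshold=None,
                        pending_bytes=None, user_triggered=False):
    """Decide whether main-memory data should move to the extended memory.

    The sketch accepts either a fixed threshold or the size of the data
    waiting to be stored (a dynamic threshold), plus a user trigger.
    """
    if user_triggered:                       # first specific user operation
        return True
    if fixed_threshold is not None and remaining_bytes < fixed_threshold:
        return True                          # fixed threshold
    if pending_bytes is not None and remaining_bytes < pending_bytes:
        return True                          # dynamic threshold
    return False
```

For example, `first_condition_met(100, pending_bytes=256)` evaluates to `True` because the pending data does not fit in the remaining space, while `first_condition_met(4096, fixed_threshold=1024)` evaluates to `False`.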
In some embodiments, the second preset condition is determined to be met if the remaining storage space of the nth-level first file layer of the extended memory is smaller than a fourth threshold, or the second preset condition is determined to be met if a second specific user operation is detected.
The extended memory has a set storage space size. When the N-level first file layers in the extended memory are initialized, the storage space of each level of first file layer can be set. When data is to be stored in a given first file layer of the extended memory, the storage space size, the used amount, and the remaining available amount of that first file layer can be obtained to determine whether its remaining storage space is sufficient.
The fourth threshold may be a fixed value for determining whether the remaining storage space of the nth stage first file layer of the extended memory is sufficient. And determining that the second preset condition is met under the condition that the remaining storage space of the Nth-level first file layer of the extended memory is smaller than the fixed value.
Or the fourth threshold may be a dynamic value that determines whether the remaining storage space is sufficient based on the amount of data in the nth level first file layer currently to be transferred to the extended memory. In one embodiment, if the remaining storage space of the nth stage first file layer of the extended memory is smaller than the data amount to be transferred and stored, it is determined that the remaining storage space of the nth stage first file layer of the extended memory is smaller than the fourth threshold, and it is determined that the second preset condition is satisfied.
The second specific user operation is, for example, a click trigger operation of a specific instruction key by a user. In some scenarios, for example, when it is required to ensure that there is enough remaining storage space in the extension, a user may trigger, through a specific operation, migration of data in the extension memory from the nth-level first file layer to the hard disk.
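As an illustrative, non-limiting sketch, the judgment of the second preset condition may be expressed as follows. The function names `fourth_threshold` and `second_condition_met` and their parameters are hypothetical and are introduced only for this example; sizes are assumed to be counted in bytes.

```python
from typing import Optional


def fourth_threshold(fixed_bytes: Optional[int] = None,
                     pending_bytes: Optional[int] = None) -> int:
    """Pick the fourth threshold: a dynamic value when the amount of data
    waiting to enter the Nth-level first file layer is known, otherwise
    the fixed value (both parameter names are assumptions)."""
    if pending_bytes is not None:
        return pending_bytes
    if fixed_bytes is None:
        raise ValueError("either a fixed or a dynamic threshold is required")
    return fixed_bytes


def second_condition_met(remaining_bytes: int,
                         threshold_bytes: int,
                         user_triggered: bool = False) -> bool:
    """True when data should move from the Nth-level first file layer to
    the hard disk: either the user asked for it, or the remaining space
    of the Nth-level first file layer is smaller than the threshold."""
    return user_triggered or remaining_bytes < threshold_bytes
```

For example, with 100 bytes remaining, a fixed threshold of 64 bytes does not trigger the transfer, whereas 128 bytes of pending data (dynamic threshold) or a detected user operation does.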
In some embodiments, in the case where N is greater than 1, the data processing method includes:
in the case where a third preset condition is met, storing third target data stored in the (i-1)-th level first file layer of the N-level first file layers of the extended memory into the i-th level first file layer, where 1 < i ≤ N.
The third preset condition may be an operation in which the user clicks a specific instruction key. In some scenarios, for example, when it is required to ensure that there is enough remaining storage space in the (i-1)-th level first file layer of the extended memory, the user may trigger, through the specific operation, migration of data in the extended memory from the (i-1)-th level first file layer to the i-th level first file layer. Alternatively, the third preset condition may be that the remaining storage space in the (i-1)-th level first file layer in the extended memory is smaller than a threshold value. In this way, the timing for transferring data between different file layers within the extended memory is decided.
Correspondingly, storing, in the case where the third preset condition is met, the third target data stored in the (i-1)-th level first file layer of the N-level first file layers of the extended memory into the i-th level first file layer includes:
in the case where the remaining storage space in the (i-1)-th level first file layer in the extended memory is smaller than a second threshold, starting from the (i-1)-th level first file layer in the extended memory, writing the third target data stored in the current-level first file layer into the i-th level first file layer until the remaining storage space in the (i-1)-th level first file layer in the extended memory is larger than the second threshold.
The second threshold may be a fixed value for determining whether the remaining storage space of the (i-1)-th level first file layer of the extended memory is sufficient. The third preset condition is determined to be met when the remaining storage space of the (i-1)-th level first file layer of the extended memory is smaller than the fixed value.
Alternatively, the second threshold may be a dynamic value, in which case whether the remaining storage space is sufficient is determined based on the amount of data currently to be transferred into the (i-1)-th level first file layer of the extended memory. In one embodiment, if the remaining storage space of the (i-1)-th level first file layer of the extended memory is smaller than the amount of data to be transferred and stored, it is determined that the remaining storage space of the (i-1)-th level first file layer of the extended memory is smaller than the second threshold, and it is determined that the third preset condition is met.
The third target data is part or all of the data stored in the (i-1)-th level first file layer of the extended memory.
In the above process, data is transferred between the file layers, so that an upper file layer has enough remaining storage space after transferring data to the lower file layer, which ensures effective storage of data in the different file layers of the extended memory.
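As an illustrative sketch of the transfer between two adjacent first file layers, the following example moves data from the (i-1)-th level to the i-th level until the remaining storage space of the (i-1)-th level exceeds the second threshold. The class `FileLayer`, the function `demote_until_enough`, and the choice of moving the oldest file first are assumptions made only for this example; the embodiment does not prescribe which part of the stored data forms the third target data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileLayer:
    capacity: int                                   # total bytes of this layer
    files: List[int] = field(default_factory=list)  # sizes of stored files

    @property
    def remaining(self) -> int:
        return self.capacity - sum(self.files)


def demote_until_enough(upper: FileLayer, lower: FileLayer,
                        second_threshold: int) -> int:
    """Move files from level i-1 (upper) to level i (lower).

    Nothing happens unless upper.remaining < second_threshold (the third
    preset condition). Otherwise files are moved, oldest first, until
    upper.remaining > second_threshold. Returns the bytes moved.
    """
    if upper.remaining >= second_threshold:
        return 0                       # third preset condition not met
    moved = 0
    while upper.files and upper.remaining <= second_threshold:
        size = upper.files[0]
        if lower.remaining < size:
            break                      # level i has no room for this file
        lower.files.append(upper.files.pop(0))
        moved += size
    return moved
```

With an upper layer of capacity 100 holding files of 40, 40 and 15 bytes and a second threshold of 30, a single 40-byte file is moved, after which the upper layer has 45 bytes remaining.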
In an alternative embodiment, the hard disk includes M-level second file layers, and the first-level second file layer of the M-level second file layers is used for storing data transferred from the Nth-level first file layer, where M > 1, and the data processing method further includes:
in the case where a fourth preset condition is met, storing fourth target data stored in the (j-1)-th level second file layer of the M-level second file layers of the hard disk into the j-th level second file layer, where 1 < j ≤ M.
The fourth preset condition may be an operation in which the user clicks a specific instruction key. In some scenarios, for example, when it is required to ensure that there is enough remaining storage space in the (j-1)-th level second file layer of the hard disk, the user may trigger, through the specific operation, migration of data in the hard disk from the (j-1)-th level second file layer to the j-th level second file layer. Alternatively, the fourth preset condition may be that the remaining storage space in the (j-1)-th level second file layer in the hard disk is smaller than a threshold value. In this way, the timing for transferring data between different file layers within the hard disk is decided.
Correspondingly, storing, in the case where the fourth preset condition is met, the fourth target data stored in the (j-1)-th level second file layer of the M-level second file layers of the hard disk into the j-th level second file layer includes:
in the case where the remaining storage space in the (j-1)-th level second file layer in the hard disk is smaller than a fifth threshold, starting from the (j-1)-th level second file layer in the hard disk, writing the fourth target data stored in the current-level second file layer into the j-th level second file layer until the remaining storage space in the (j-1)-th level second file layer in the hard disk is larger than the fifth threshold.
The fifth threshold may be a fixed value for determining whether the remaining storage space of the (j-1)-th level second file layer in the hard disk is sufficient. The fourth preset condition is determined to be met when the remaining storage space of the (j-1)-th level second file layer in the hard disk is smaller than the fixed value.
Alternatively, the fifth threshold may be a dynamic value, in which case whether the remaining storage space is sufficient is determined based on the amount of data currently to be transferred into the (j-1)-th level second file layer of the hard disk. In one embodiment, if the remaining storage space of the (j-1)-th level second file layer of the hard disk is smaller than the amount of data to be transferred and stored, it is determined that the remaining storage space of the (j-1)-th level second file layer of the hard disk is smaller than the fifth threshold, and it is determined that the fourth preset condition is met.
The fourth target data is part or all of the data stored in the (j-1)-th level second file layer of the hard disk.
In the above process, data is transferred between the file layers of the hard disk, so that an upper file layer has enough remaining storage space after transferring data to the lower file layer, which ensures effective storage of data in the different file layers of the hard disk.
In an alternative embodiment, to ensure that the database data can be restored to a consistent state in the event of a system crash or other failure, a judgment on the remaining storage space of the pre-write log is introduced into the data storage process.
Correspondingly, if the remaining storage space of the variable memory table in the main memory is larger than the data amount of the data to be stored, the data to be stored is written into the variable memory table on the condition that the remaining storage space of the pre-write log in the extended memory is larger than the backup data amount of the data to be stored.
Optionally, when the remaining storage space of the pre-write log in the extended memory is insufficient to write the backup data of the data to be stored, a new pre-write log file with sufficient remaining storage space may be created in the extended memory, the backup data of the data to be stored is written into the new pre-write log file, and the old pre-write log file is deleted.
With the above processing procedure, even if a fault occurs during data writing, an incomplete write operation can be recovered by using the pre-write log, which ensures effective writing of data in the database.
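As an illustrative sketch of the pre-write log judgment, the following example writes a record into the variable memory table only after its backup has been placed in a pre-write log with enough remaining space, creating a new log when the current one is too small. The class `Region`, the function `put`, and the assumption that the backup data amount equals the record length are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    """A bounded storage area; used here for both the variable memory
    table and the pre-write log (an assumption of this sketch)."""
    capacity: int
    records: List[bytes] = field(default_factory=list)

    @property
    def remaining(self) -> int:
        return self.capacity - sum(len(r) for r in self.records)


def put(record: bytes, memtable: Region, log: Region,
        new_log_capacity: int) -> Region:
    """Write one record and return the pre-write log in use afterwards."""
    if memtable.remaining <= len(record):
        # Handled elsewhere in the text: the table is converted and the
        # main-memory space judgment applies.
        raise MemoryError("variable memory table has no room")
    if log.remaining <= len(record):
        # Current log too small: create a new one; the old one is dropped.
        log = Region(capacity=max(new_log_capacity, len(record) + 1))
    log.records.append(record)       # backup first
    memtable.records.append(record)  # then the memory table
    return log
```

The caller keeps the returned log object for subsequent writes, since it may differ from the one passed in.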
In some optional implementations, the data to be stored is key-value pair data, and the extended memory includes a first set of the N-level first file layers, and a second set of the N-level first file layers.
On this basis, writing, in the case where the first preset condition is met, the first target data in the main memory into the first-level first file layer of the extended memory specifically includes:
in the case where the first preset condition is met, writing a first target key in the main memory into the first-level first file layer of the first group of N-level first file layers, and/or writing a first target value corresponding to the first target key in the main memory into the first-level first file layer of the second group of N-level first file layers.
Correspondingly, writing, in the case where the second preset condition is met, the second target data stored in the Nth-level first file layer of the extended memory into the first-level second file layer of the hard disk specifically includes:
in the case where the second preset condition is met, writing a second target key stored in the Nth-level first file layer of the first group of N-level first file layers in the extended memory into the hard disk, and/or writing a second target value, corresponding to the second target key, stored in the Nth-level first file layer of the second group of N-level first file layers in the extended memory into the hard disk.
In this case, the keys and the values of the key-value pair data are stored separately in the extended memory and the hard disk.
For the keys, an independent group of N-level first file layers exists in the extended memory for transferring keys, so that the storage and reading of key data in the extended memory, the hard disk and the main memory are realized. Optionally, a separate group of M-level second file layers may exist in the hard disk for transferring keys. Optionally, a separate variable memory table and a corresponding invariable memory table may exist in the main memory, so as to implement storage and reading of the keys in the main memory. On this basis, for the specific implementation of transferring, storing and reading the keys in the extended memory, the main memory and the hard disk, reference may be made to the description of other parts of the embodiments of the present application, which will not be repeated here.
Similarly, for the values, another independent group of N-level first file layers exists in the extended memory for transferring values, so that the storage and reading of value data in the extended memory, the hard disk and the main memory are realized. Optionally, a separate group of M-level second file layers may exist in the hard disk for transferring values. Optionally, a separate variable memory table and a corresponding invariable memory table may exist in the main memory, so as to implement storage and reading of the values in the main memory. On this basis, for the specific implementation of transferring, storing and reading the values in the extended memory, the main memory and the hard disk, reference may be made to the description of other parts of the embodiments of the present application, which will not be repeated here.
With the above process, the keys and the values are stored separately, which reduces the metadata that needs to be stored, improves the matching efficiency when data is looked up by key, and improves the data transfer efficiency.
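As an illustrative sketch of the separate storage of keys and values, the following example keeps two groups of file layers of equal depth and writes keys and values into the first level of their respective groups. The class `SeparatedStore` and, in particular, the pairing of a key with its value by position within the same level are assumptions made only for this example; the embodiment does not specify how a key locates its value.

```python
from typing import Dict, List, Optional


class SeparatedStore:
    def __init__(self, levels: int) -> None:
        # First group of N-level first file layers: keys only.
        self.key_layers: List[List[str]] = [[] for _ in range(levels)]
        # Second group of N-level first file layers: values only.
        self.value_layers: List[List[bytes]] = [[] for _ in range(levels)]

    def flush_from_main_memory(self, memtable: Dict[str, bytes]) -> None:
        """Write keys and their values into level 1 of each group."""
        for key in sorted(memtable):
            self.key_layers[0].append(key)
            self.value_layers[0].append(memtable[key])

    def get(self, key: str) -> Optional[bytes]:
        """Match on the key layers only, then fetch the paired value."""
        for keys, values in zip(self.key_layers, self.value_layers):
            if key in keys:
                # Use the most recently written occurrence in this level.
                idx = len(keys) - 1 - keys[::-1].index(key)
                return values[idx]
        return None
```

Because the lookup scans only the key layers, the values do not need to be read during matching, which is the effect described above.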
For the above data storage process, an exemplary implementation is described below with reference to fig. 4 for the case in which the extended memory includes multi-level first file layers and the hard disk includes multi-level second file layers. The method specifically includes the following steps:
Step 401, based on the data to be stored, determining whether the remaining storage space in the main memory is sufficient.
Judging whether the remaining storage space in the main memory is sufficient may be judging whether the remaining storage space of the variable memory table in the main memory is sufficient to write the data to be stored, where the remaining storage space is considered sufficient if it is larger than the data amount of the data to be stored. Alternatively, it may be judging whether the remaining storage space in the main memory is sufficient to create a new variable memory table, so that the data to be stored can be written into the newly created variable memory table.
In some optional implementations, the determining whether the remaining storage space in the main memory is sufficient based on the data to be stored includes:
judging whether the residual storage space of the variable memory table in the main memory is enough to write the data to be stored;
If the remaining storage space of the variable memory table is enough to write the data to be stored, determining that the remaining storage space in the main memory is sufficient;
if the residual storage space of the variable memory table is insufficient to write the data to be stored, converting the variable memory table into an invariable memory table, and determining that the residual storage space in the main memory is insufficient under the condition that the residual storage space in the main memory is insufficient to newly create the variable memory table;
If the remaining storage space of the variable memory table is insufficient to write the data to be stored, but the remaining storage space in the main memory is sufficient to create the variable memory table, determining that the remaining storage space in the main memory is sufficient.
Here, the newly created variable memory table needs to satisfy the condition that the storage space is sufficient for writing the data to be stored.
The above judging process first judges whether the remaining storage space of the variable memory table in the main memory is sufficient, and then judges whether the remaining storage space in the main memory is sufficient to create a new variable memory table. The different situations regarding whether the remaining storage space in the main memory is sufficient are thereby effectively distinguished, which ensures an effective storage judgment for the data in the database.
Further, in the case that the remaining storage space in the main memory is insufficient, step 402 is performed. In the case where there is sufficient memory remaining in the main memory, step 405 is performed.
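As an illustrative sketch of the judgment in step 401, the following example returns both the judgment result and whether the current variable memory table is converted into an invariable memory table. The names `MainMemoryState` and `judge_main_memory`, the parameter `new_memtable_size`, and the treatment of exact-fit boundaries are assumptions made only for this example.

```python
from enum import Enum
from typing import Tuple


class MainMemoryState(Enum):
    SUFFICIENT = "sufficient"      # continue with step 405
    INSUFFICIENT = "insufficient"  # continue with step 402


def judge_main_memory(memtable_remaining: int,
                      main_memory_free: int,
                      data_size: int,
                      new_memtable_size: int) -> Tuple[MainMemoryState, bool]:
    """Return (state, converted), where `converted` tells whether the
    current variable memory table was turned into an invariable one."""
    if memtable_remaining > data_size:
        return MainMemoryState.SUFFICIENT, False
    # The current table cannot hold the data, so it is converted.
    can_create = (main_memory_free >= new_memtable_size
                  and new_memtable_size > data_size)
    if can_create:
        return MainMemoryState.SUFFICIENT, True
    return MainMemoryState.INSUFFICIENT, True
```

The second return value makes explicit that the conversion happens in both branches in which the current variable memory table is too small, regardless of whether a new table can be created.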
Step 402, determining whether the remaining storage space in the N-level first file layer in the extended memory is sufficient.
Judging whether the remaining storage space in the N-level first file layers in the extended memory is sufficient may be judging whether the remaining storage space of the first-level first file layer of the N-level first file layers is sufficient to receive the data stored in the invariable memory table in the main memory. Alternatively, it may be judging, in sequence according to the order of the N-level first file layers, whether the remaining storage space of the i-th level first file layer is sufficient to receive the data stored in the (i-1)-th level first file layer.
In some optional implementations, the determining whether the remaining storage space in the N-level first file layer in the extended memory is sufficient includes:
judging whether the remaining storage space of the first-level first file layer of the N-level first file layers is sufficient to receive the data stored in the invariable memory table in the main memory;
if it is sufficient, determining that the remaining storage space in the N-level first file layers in the extended memory is sufficient;
if it is insufficient, judging in sequence whether the remaining storage space of the i-th level first file layer of the N-level first file layers is sufficient to receive the data stored in the (i-1)-th level first file layer;
if any such judgment finds the space sufficient, determining that the remaining storage space in the N-level first file layers in the extended memory is sufficient;
if every such judgment finds the space insufficient, determining that the remaining storage space in the N-level first file layers in the extended memory is insufficient.
In the process of sequentially judging whether the remaining storage space of the i-th level first file layer of the N-level first file layers is sufficient to receive the data stored in the (i-1)-th level first file layer, if the remaining storage space of any first file layer is found to be sufficient to receive the data stored in its previous-level first file layer, the remaining storage space in the N-level first file layers in the extended memory can be considered sufficient, and the judgment need not be continued for the subsequent levels.
In the same sequential judgment, if, from the second-level first file layer through the Nth-level first file layer, no first file layer has remaining storage space sufficient to receive the data stored in its previous-level first file layer, the remaining storage space in the N-level first file layers in the extended memory can be considered insufficient.
The above process first judges whether the remaining storage space of the first-level first file layer in the extended memory is sufficient to receive the data stored in the invariable memory table in the main memory, and then judges whether the remaining storage space of the i-th level first file layer is sufficient to receive the data stored in the (i-1)-th level first file layer. By distinguishing these different situations, whether the remaining storage space of the extended memory is sufficient is effectively judged, which ensures an effective storage judgment for the data in the database.
Further, if the remaining storage space in the N-level first file layer in the extended memory is sufficient, step 403 is performed. If the remaining storage space in the N-level first file layer in the extended memory is insufficient, step 404 is performed.
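As an illustrative sketch of the judgment in step 402, the following example checks the first-level first file layer against the invariable memory table and then checks each subsequent level against its previous level, stopping at the first level that has room. The function name `extended_memory_has_room` and the list-based representation of the levels are assumptions made only for this example.

```python
from typing import List


def extended_memory_has_room(remaining: List[int],
                             stored: List[int],
                             immutable_size: int) -> bool:
    """remaining[k] / stored[k] describe level k+1 of the N-level first
    file layers; immutable_size is the amount of data held in the
    invariable memory table. True leads to step 403, False to step 404."""
    # Level 1 is judged against the invariable memory table.
    if remaining[0] >= immutable_size:
        return True
    # Levels 2..N are judged in order against their previous level.
    for i in range(1, len(remaining)):
        if remaining[i] >= stored[i - 1]:
            return True   # later levels need not be judged
    return False
```

For example, with two levels whose remaining spaces are 10 and 100, stored amounts 90 and 0, and 50 units in the invariable memory table, the first level is too small but the second level can receive the first level's data, so the result is sufficient.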
Step 403, transferring the stored target data in the main memory into the N-level first file layer.
In some alternative implementations, the transferring the target data stored in the main memory into the N-level first file layer includes:
And if the residual storage space of the first-stage first file layer in the N-stage first file layer is enough to transfer the data stored in the non-variable memory table in the main memory, transferring the data stored in the non-variable memory table in the main memory to the first-stage first file layer in the N-stage first file layer.
In an optional implementation process, when the remaining storage space of the first-level first file layer is sufficient, a background thread can be started, and the invariable memory table in the main memory is flushed down into the first-level first file layer in the extended memory through DSA.
The above processing transfers the data stored in the invariable memory table in the main memory into the first-level first file layer of the N-level first file layers and frees available data storage space in the main memory, so that the remaining storage space in the main memory can meet the writing requirement of the data to be stored, which ensures effective storage of the data in the database.
Alternatively, in some optional implementations, the transferring the target data stored in the main memory into the N-level first file layer includes:
if the remaining storage space of the first-level first file layer of the N-level first file layers is insufficient to receive the data stored in the invariable memory table in the main memory, and there exists, after the first-level first file layer, a target-level first file layer whose remaining storage space is sufficient to receive the data stored in its previous-level first file layer, storing the data in that previous-level first file layer into the target-level first file layer;
repeating the above until the remaining storage space of the first-level first file layer of the N-level first file layers is sufficient to receive the data stored in the invariable memory table in the main memory, and then transferring the data stored in the invariable memory table in the main memory into the first-level first file layer.
In an optional implementation process, when the remaining storage space of the first-level first file layer is insufficient to receive the data stored in the invariable memory table in the main memory, it is detected whether the remaining storage space of the second-level first file layer in the extended memory is sufficient to receive the data stored in the first-level first file layer. If so, a background thread is started, a new SST file is created in the second-level first file layer, and the data in the first-level first file layer is flushed down to the second-level first file layer through DSA and stored in the newly created SST file. It is then judged whether the remaining storage space of the previous-level first file layer (namely, the first-level first file layer) of the current first file layer (namely, the second-level first file layer) is sufficient, specifically, whether it is sufficient to receive the data stored in the invariable memory table in the main memory, and the above judging process is repeated based on the judging result.
If the remaining storage space of the second-level first file layer is insufficient to receive the data stored in the first-level first file layer, it is judged whether the remaining storage space of the third-level first file layer is sufficient to receive the data stored in the second-level first file layer. If so, a background thread is started, a new SST file is created in the third-level first file layer, and the data in the second-level first file layer is flushed down to the third-level first file layer through DSA and stored in the newly created SST file. It is then judged whether the remaining storage space of the previous-level first file layer (namely, the second-level first file layer) of the current first file layer (namely, the third-level first file layer) is sufficient, specifically, whether it is sufficient to receive the data stored in its own previous-level first file layer (namely, the first-level first file layer), and the above judging process is repeated based on the judging result.
If the remaining storage space of the third-level first file layer is insufficient to receive the data stored in the second-level first file layer, it is judged whether the remaining storage space of the fourth-level first file layer is sufficient to receive the data stored in the third-level first file layer. If so, a background thread is started, a new SST file is created in the fourth-level first file layer, and the data in the third-level first file layer is flushed down to the fourth-level first file layer through DSA and stored in the newly created SST file. It is then judged whether the remaining storage space of the previous-level first file layer (namely, the third-level first file layer) of the current first file layer (namely, the fourth-level first file layer) is sufficient, specifically, whether it is sufficient to receive the data stored in its own previous-level first file layer (namely, the second-level first file layer), and the above judging process is repeated based on the judging result.
And so on, according to the order of the N-level first file layers: when it is judged that the remaining storage space of the (N-1)-th level first file layer is insufficient to receive the data stored in the (N-2)-th level first file layer, it is judged whether the remaining storage space of the Nth-level first file layer is sufficient to receive the data stored in the (N-1)-th level first file layer. If so, a background thread is started, a new SST file is created in the Nth-level first file layer, and the data in the (N-1)-th level first file layer is flushed down to the Nth-level first file layer through DSA and stored in the newly created SST file. It is then judged whether the remaining storage space of the previous-level first file layer (namely, the (N-1)-th level first file layer) of the current first file layer (namely, the Nth-level first file layer) is sufficient, specifically, whether it is sufficient to receive the data stored in its own previous-level first file layer (namely, the (N-2)-th level first file layer), and the above judging process is repeated based on the judging result.
This continues until it is determined that the remaining storage space of the first-level first file layer is sufficient to receive the data stored in the invariable memory table in the main memory, whereupon the data stored in the invariable memory table in the main memory is transferred to the first-level first file layer through DSA (data storage architecture).
With the above processing procedure, data is sunk, compressed and transferred level by level, so that available data storage space is freed, level by level, for the first-level first file layer of the N-level first file layers. The first-level first file layer can then receive the data stored in the invariable memory table in the main memory, which frees available data storage space in the main memory, so that the remaining storage space in the main memory can meet the writing requirement of the data to be stored, and effective storage of the data in the database is ensured.
Further, after step 403, the process returns to step 401 to determine whether the remaining storage space in the main memory is sufficient.
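As an illustrative sketch of the level-by-level transfer in step 403, the following example repeatedly finds the first level that can receive all the data of its previous level, moves that data down, and finally places the invariable memory table into the first level. The class `Layer`, the function `flush_immutable`, and the simplification that a level's data is always moved as a whole are assumptions made only for this example; SST file creation, the background thread and DSA are not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    capacity: int
    used: int = 0

    @property
    def remaining(self) -> int:
        return self.capacity - self.used


def flush_immutable(layers: List[Layer], immutable_size: int) -> bool:
    """Return True once the invariable memory table sits in level 1, or
    False when no level can make room (the case handled by step 404)."""
    while layers[0].remaining < immutable_size:
        target = next(
            (k for k in range(1, len(layers))
             if layers[k - 1].used > 0
             and layers[k].remaining >= layers[k - 1].used),
            None,
        )
        if target is None:
            return False
        # Sink the whole previous level into the target level.
        layers[target].used += layers[target - 1].used
        layers[target - 1].used = 0
    layers[0].used += immutable_size
    return True
```

In a three-level example with capacities 100, 200 and 400 and used amounts 80, 150 and 0, flushing 50 units first sinks level 2 into level 3, then level 1 into level 2, and finally writes the 50 units into level 1.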
Step 404, if the remaining storage space of the M-level second file layer in the hard disk is sufficient, transferring the target data stored in the extended memory to the M-level second file layer.
In some optional implementations, the transferring the target data stored in the extended memory into the M-level second file layer includes:
if the remaining storage space of the first-level second file layer of the M-level second file layers is sufficient to receive the compressed data of the data stored in the Nth-level first file layer, transferring the data stored in the Nth-level first file layer into the first-level second file layer of the M-level second file layers.
The above processing transfers the data stored in the Nth-level first file layer in the extended memory to the first-level second file layer in the hard disk, so that available data storage space is freed for the extended memory as far as possible. Data in the main memory can then be transferred into the extended memory, so that the remaining storage space of the main memory can meet the writing requirement of the data to be stored, which ensures effective storage of the data in the database.
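As an illustrative sketch of transferring compressed data to the hard disk, the following example compresses the data of the Nth-level first file layer and accepts it only if the first-level second file layer has room for the compressed result. The function name `spill_to_disk` is hypothetical, and `zlib` is used merely as a stand-in, since the embodiment does not name a compression algorithm.

```python
import zlib
from typing import Optional


def spill_to_disk(nth_layer_data: bytes,
                  disk_level1_remaining: int) -> Optional[bytes]:
    """Return the compressed payload to be stored in the first-level
    second file layer, or None when that layer lacks the space."""
    compressed = zlib.compress(nth_layer_data)
    if len(compressed) <= disk_level1_remaining:
        return compressed
    return None
```

The space judgment is made on the compressed size rather than on the original size, which corresponds to the wording "compressed data of the data stored in the Nth-level first file layer" above.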
Alternatively, in some optional implementations, the transferring the target data stored in the extended memory into the M-level second file layer includes:
if the remaining storage space of the first-level second file layer of the M-level second file layers is insufficient to receive the data stored in the Nth-level first file layer in the extended memory, and there exists, after the first-level second file layer, a target-level second file layer whose remaining storage space is sufficient to receive the data stored in its previous-level second file layer, storing the data in that previous-level second file layer into the target-level second file layer;
repeating the above until the remaining storage space of the first-level second file layer of the M-level second file layers is sufficient to receive the data stored in the Nth-level first file layer, and then transferring the data stored in the Nth-level first file layer into the first-level second file layer.
In an optional implementation process, when the residual storage space of the first-stage second file layer in the hard disk is insufficient to transfer and store the storage data in the nth-stage first file layer in the extended memory, detecting whether the residual storage space of the second-stage second file layer in the hard disk is sufficient to transfer and store the storage data in the first-stage second file layer, if so, starting a background thread, creating a new SST file in the second-stage second file layer, brushing the storage data in the first-stage second file layer down to the second-stage second file layer through a controller, storing the storage data in the newly created SST file, judging whether the residual storage space of the previous-stage second file layer (namely, the first-stage second file layer) of the current second file layer (namely, the second-stage second file layer) is sufficient, specifically, judging whether the first-stage second file layer is sufficient to transfer and store the storage data in the previous-stage second file layer (namely, the nth-stage first file layer), and repeatedly executing the judging process based on a judging result.
If the residual storage space of the second-level second file layer is insufficient to transfer the storage data stored in the first-level second file layer, judging whether the residual storage space of the third-level second file layer is enough to transfer the storage data stored in the second-level second file layer, if so, starting a background thread, creating a new SST file in the third-level second file layer, brushing the storage data in the second-level second file layer to the third-level second file layer through a controller, storing the new SST file in the newly created SST file, judging whether the residual storage space of the previous-level file layer (namely, the second-level second file layer) of the current second file layer (namely, the third-level second file layer) is enough or not, and particularly, whether the residual storage space of the previous-level second file layer (namely, the second-level second file layer) is enough to store the storage data in the previous-level second file layer or not, and repeatedly executing the judging process based on the judging result.
If the residual storage space of the third-level second file layer is insufficient to transfer the storage data stored in the second-level second file layer, judging whether the residual storage space of the fourth-level second file layer is sufficient to transfer the storage data stored in the third-level second file layer, if so, starting a background thread, creating a new SST file in the fourth-level second file layer, brushing the storage data in the third-level second file layer to the fourth-level second file layer by a controller, storing the storage data in the newly created SST file, judging whether the residual storage space of the previous-level file layer (namely the third-level second file layer) of the current second file layer (namely the fourth-level second file layer) is sufficient, and particularly, whether the storage data stored in the previous-level file layer is sufficient or not, and repeatedly executing the judging process based on a judging result.
Following the order of the M-level second file layers, and so on: once it is determined that the remaining storage space of the (M-1)-th-level second file layer is insufficient to receive the stored data of the (M-2)-th-level second file layer, it is determined whether the remaining storage space of the last storage level (the M-th-level second file layer) is sufficient to receive the stored data of the (M-1)-th-level second file layer. If so, a background thread is started, a new SST file is created in the M-th-level second file layer, and the controller flushes the stored data in the (M-1)-th-level second file layer to the M-th-level second file layer, where it is stored in the newly created SST file. It is then determined whether the layer above the current second file layer (the current layer being the M-th-level second file layer and the layer above it being the (M-1)-th-level second file layer) now has sufficient remaining storage space, specifically whether the (M-1)-th-level second file layer can store the data held in the layer above it, and the determination process is repeated based on the result.
This continues until the remaining storage space of the first-level second file layer is determined to be sufficient to receive the stored data of the Nth-level first file layer, at which point the stored data in the Nth-level first file layer is transferred to the first-level second file layer.
Through the above process, the layered storage of the M-level second file layers is used to free available storage space in the first-level second file layer, so that the first-level second file layer can receive the stored data from the last-level first file layer in the extended memory. The extended memory thereby frees available storage space and can in turn receive stored data from the main memory, which frees storage space in the main memory so that its remaining storage space can satisfy the write requirement of the data to be stored, ensuring that data is stored effectively in the database.
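The cascading check across the second file layers can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the `Layer` class, its `capacity` and `used` fields, and the `make_room` function are hypothetical names, sizes are counted in abstract units, and SST file creation, background threads, and compression are not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    """Hypothetical model of one second file layer on the hard disk."""
    capacity: int  # total capacity, in abstract units
    used: int = 0  # units currently occupied

    @property
    def free(self) -> int:
        return self.capacity - self.used


def make_room(layers: List[Layer], incoming: int) -> bool:
    """Free space in layers[0] so it can accept `incoming` units.

    layers[0] stands for the first-level second file layer and layers[-1]
    for the M-th level. Returns False when no further flush is possible.
    """
    while layers[0].free < incoming:
        target = None
        # Walk down the levels and find the shallowest layer that can
        # absorb everything held by the layer directly above it.
        for j in range(1, len(layers)):
            upper, lower = layers[j - 1], layers[j]
            if upper.used > 0 and lower.free >= upper.used:
                target = j
                break
        if target is None:
            return False  # even the deepest layers cannot take more data
        # Flush the upper layer into the chosen layer, then re-check from the top.
        layers[target].used += layers[target - 1].used
        layers[target - 1].used = 0
    return True
```

With three layers holding 8/10, 15/20 and 10/40 units and 5 incoming units, the sketch first flushes the second layer into the third and then the first into the second, mirroring the repeated determination described above.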
Further, after step 404, the process returns to step 402 to determine whether the remaining storage space in the N-level first file layer in the extended memory is sufficient.
The above process is repeated until it is determined that the remaining storage space in the main memory is sufficient, whereupon step 405 is performed, or until it is determined that the remaining storage space of the M-level second file layer is insufficient, whereupon step 406 is performed.
In step 405, the data to be stored is stored in the main memory.
When sufficient storage space remains in the main memory, the data to be stored may be stored into the main memory.
Step 406, if the remaining storage space of the M-level second file layer is insufficient, a response message is returned.
The response message may be an error message or a prompt message indicating that storage space in the database is insufficient.
Through the above process, data is sunk, compressed, and transferred among the main memory, the extended memory, and the hard disk. By determining whether the remaining storage space in each of these storage spaces is sufficient, and by sinking and compressing data from the main memory into the extended memory and from the extended memory into the hard disk as needed, available storage space is kept free in the main memory, so that the remaining storage space in the main memory can satisfy the write requirement of the data to be stored and data is stored effectively in the database.
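The overall write loop can be sketched at the level of whole storage tiers. This is only an illustration: the `Tier` class and the `sink` and `store` functions are hypothetical, each tier is reduced to a single capacity counter, compression is not modeled, and the mapping of branches to step numbers in the comments is inferred from the description above rather than taken from the flowchart.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    """Hypothetical model of one tier: main memory, extended memory, or hard disk."""
    capacity: int
    used: int = 0

    @property
    def free(self) -> int:
        return self.capacity - self.used


def sink(src: Tier, dst: Tier) -> None:
    """Move everything held in src down into dst (compression not modeled)."""
    dst.used += src.used
    src.used = 0


def store(main: Tier, ext: Tier, disk: Tier, size: int) -> str:
    """Write `size` units into main memory, sinking data downward as needed."""
    while main.free < size:
        if main.used > 0 and ext.free >= main.used:
            sink(main, ext)   # main memory -> extended memory
        elif ext.used > 0 and disk.free >= ext.used:
            sink(ext, disk)   # extended memory -> hard disk, then re-check
        else:
            # Roughly the step 406 case: nothing more can be sunk.
            return "error: insufficient storage space"
    main.used += size         # roughly the step 405 case
    return "stored"
```

In this simplified model a write that cannot be satisfied leaves all three tiers unchanged and returns the error string, which stands in for the response message of step 406.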
The embodiment of the application also provides an initialization processing process of the N-level first file layer in the extended memory.
In some alternative implementations, the data processing method further includes:
formatting the extended memory by adopting a memory file system;
configuring initialization parameters, wherein the initialization parameters at least comprise the file directory of the extended memory, the storage directory of the pre-written log in the extended memory, the size of the N-level first file layer in the extended memory, and the number of file layers.
The memory file system is adopted to format the extended memory, so that the data access program can access the data stored in the extended memory in the form of a file, the data access performance is improved, and the modification adaptation to the database system caused by adding the extended memory is reduced.
Optionally, the memory file system is, for example, FAMFS (Fabric-Attached Memory File System).
FAMFS is a file system designed specifically for fabric-attached memory. It is a DAX (Direct Access) file system that allows users to access the memory of a DAX device as files without page cache overhead.
The memory file system can be installed in storage media such as a hard disk and an extended memory, and realizes hierarchical initialization and management of storage space in the extended memory.
Optionally, when configuring the initialization parameters, the parameter content includes, but is not limited to:
the file directory of the extended memory, the file directory of the hard disk, the storage directory of the pre-written log in the extended memory, the write cache size, the file size of the variable memory table, the total number of files of the variable memory table and the immutable memory table, the flush threshold for files of the immutable memory table, the capacity of the first-level first file layer in the extended memory, the value of the layer number N, the capacity amplification coefficient of each layer among the N-level first file layers, the number of background threads, and the like.
If it is detected that the remaining storage space in the extended memory meets the capacity required for layering the N-level first file layers, the N-level first file layers are initialized in the extended memory based on the initialization parameters (that is, layers Level 0 to Level N-1 are initialized), the pre-written log used for backing up data to be stored is initialized, management files are created, and so on.
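The capacity check before initialization can be sketched as follows. The `ExtMemConfig` class, its field names, and the example paths are hypothetical, and the sketch assumes that each layer's capacity is the previous layer's capacity multiplied by the amplification coefficient, which is one plausible reading of the parameters listed above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExtMemConfig:
    """Hypothetical subset of the initialization parameters."""
    ext_dir: str          # file directory of the extended memory
    wal_dir: str          # storage directory of the pre-written log
    level1_capacity: int  # capacity of the first-level first file layer, in bytes
    num_levels: int       # the layer number N
    amplification: int    # capacity amplification coefficient between layers

    def level_capacities(self) -> List[int]:
        # Assumption: geometric growth, layer k = level1_capacity * amplification**k.
        return [self.level1_capacity * self.amplification ** k
                for k in range(self.num_levels)]

    def required_capacity(self) -> int:
        return sum(self.level_capacities())


def can_initialize(cfg: ExtMemConfig, free_bytes: int) -> bool:
    """True when the extended memory has room for all N layers."""
    return free_bytes >= cfg.required_capacity()


# Example with made-up values: three layers, each ten times the previous one.
cfg = ExtMemConfig("/mnt/extmem/db", "/mnt/extmem/db/wal", 256, 3, 10)
```

Only when `can_initialize` succeeds would the layers Level 0 to Level N-1, the pre-written log, and the management files be created.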
Initialization of the M-level second file layer of the hard disk may be implemented using other file systems, such as NTFS (New Technology File System).
The other file systems can be installed in storage media such as a hard disk and an extended memory, and the hierarchical initialization and management of the storage space in the hard disk are realized.
The initialization parameters at least comprise the file directory of the hard disk, the size of the M-level second file layer in the hard disk and the layer number of the file layers.
In an alternative embodiment, when configuring initialization parameters, the parameter content includes, but is not limited to:
the capacity of the first-level second file layer in the hard disk, the value of the layer number M, the capacity amplification coefficient of each layer among the M-level second file layers and the like.
Based on the initialization parameters, the M-level second file layers, namely the first-level second file layer to the M-th-level second file layer, are initialized in the hard disk, management files are created, and so on.
Therefore, the data access program can access the data stored in the extended memory in the form of a file, and the data access performance is improved.
Optionally, the initialization of the extended memory and the initialization of the hard disk may be performed in the same initialization process, or the initialization of the extended memory and the hard disk may be performed separately in different initialization processes, so as to meet the initialization requirements under different situations.
Optionally, the initialization process further includes initializing the read and write cache areas of the database, the flush policy for files of the immutable memory table in the main memory, the processing policy (such as merging and compression) for the SST files in each file layer, and the like.
The process realizes the initialization processing of the database system based on the main memory, the extended memory and the hard disk storage device, and realizes the effective initialization construction of the N-level first file layer in the newly added extended memory.
The data processing method provided by the embodiment of the application also relates to a data reading process, and specifically comprises the following steps:
Reading fifth target data matched with the data to be read from the main memory when a fifth preset condition is met, or reading the fifth target data from the extended memory when a sixth preset condition is met.
In this way, in the improved storage architecture, the processor can read the target data from the main memory in the conventional manner, which minimizes the changes needed to the data reading function of the original system after the extended memory is added and eases implementation. Alternatively, the processor can read the target data directly from the newly added extended memory, so that data is obtained directly from memory and the data reading response speed is improved. Different data reading implementations are provided to meet the data reading requirements of different scenarios.
Optionally, the fifth preset condition may be any of the following: target data matched with the data to be read is stored in the main memory; or the extended memory stores target data matched with the data to be read, and the target data is written into the main memory from the first-level first file layer of the extended memory; or the hard disk stores target data matched with the data to be read, the target data is written from the hard disk into the Nth-level first file layer of the extended memory, and the target data is further written into the main memory through the first-level first file layer of the extended memory.
That is, when the fifth preset condition is satisfied, the reading of the fifth target data matching the data to be read from the main memory specifically includes:
When the fifth target data matched with the data to be read is stored in the main memory, reading the fifth target data from the main memory; or, when the fifth target data is stored in the extended memory, writing the fifth target data into the main memory from the first-level first file layer of the extended memory and reading the fifth target data from the main memory; or, when the fifth target data is stored in the hard disk, writing the fifth target data from the hard disk into the Nth-level first file layer of the extended memory, writing the fifth target data into the main memory from the first-level first file layer of the extended memory, and reading the fifth target data from the main memory.
When N is 1, the Nth-level first file layer and the first-level first file layer of the extended memory are the same layer. When N is greater than 1, the Nth-level first file layer and the first-level first file layer of the extended memory are different layers, and the (i-1)-th-level first file layer of the extended memory is used for storing the data transferred from the i-th-level first file layer.
According to the implementation process, corresponding data reading processing is implemented respectively according to different storage conditions of the target data in the main memory, the extended memory and the hard disk, the target data is transferred among the hard disk, the extended memory and the main memory by utilizing the storage levels of the data files arranged in the extended memory, and final reading of the target data is achieved from the main memory.
Optionally, in the above processing, writing the fifth target data into the main memory from the first file layer of the first stage of the extended memory includes:
determining a target-level first file layer where fifth target data are located in the extended memory;
when N is greater than 1 and the target-level first file layer is not the first-level first file layer, starting from the target-level first file layer of the extended memory, writing the fifth target data stored in the current-level first file layer into the adjacent upper-level first file layer until the fifth target data is written into the first-level first file layer, and then writing the fifth target data into the main memory from the first-level first file layer of the extended memory.
The target-level first file layer may be the Nth-level first file layer or any one of the second-level to (N-1)-th-level first file layers.
If the fifth target data is stored in the extended memory and the target level first file layer of the fifth target data in the extended memory is the nth level first file layer, when N is 1, the nth level first file layer is also the first level first file layer, and the fifth target data is directly written into the main memory through the file layer.
If the fifth target data is stored in the extended memory, when N is greater than 1 and the target first file layer is any one of the nth first file layer to the second first file layer (i.e., not the first file layer), starting from the target first file layer, transferring the fifth target data to the adjacent upper first file layer in sequence until the fifth target data is transferred to the first file layer, and finally writing the fifth target data from the first file layer to the main memory.
If the fifth target data is stored in the hard disk, when N is 1, after the fifth target data is written from the hard disk into the expansion memory, the target level first file layer where the fifth target data is in the expansion memory is the nth level first file layer. At this time, the nth level first file layer is also the first level first file layer, and the fifth target data is directly written into the main memory through the first level first file layer.
If the fifth target data is stored in the hard disk, when N is greater than 1, after the fifth target data is written from the hard disk into the nth stage first file layer of the extended memory, starting from the nth stage first file layer, transferring the fifth target data to the adjacent upper stage first file layer in sequence until the fifth target data is transferred to the first stage first file layer, and finally writing the fifth target data into the main memory from the first stage first file layer.
In this way, the data is transferred among the hard disk, the extended memory and the main memory by utilizing the multiple storage levels of the data file arranged in the extended memory and transferring the target data among the multiple storage levels in the extended memory, and the final reading of the target data is realized from the main memory.
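The layer-by-layer transfer toward the main memory can be sketched as follows. This is a simplified model: each first file layer is represented by a plain dictionary, `promote_to_main` is a hypothetical name, and the sketch moves (rather than copies) the entry on each hop, which the description does not specify.

```python
from typing import Dict, List, Optional


def promote_to_main(main: Dict[str, str],
                    ext_layers: List[Dict[str, str]],
                    key: str) -> Optional[str]:
    """Move `key` from its first file layer up to main memory and return its value.

    ext_layers[0] stands for the first-level first file layer and
    ext_layers[-1] for the Nth level. Returns None if the key is absent.
    """
    # Determine the target-level first file layer holding the key.
    level = next((i for i, layer in enumerate(ext_layers) if key in layer), None)
    if level is None:
        return None
    # Write the data into the adjacent upper layer until it reaches level 1.
    while level > 0:
        ext_layers[level - 1][key] = ext_layers[level].pop(key)
        level -= 1
    # Finally write it from the first-level first file layer into main memory.
    main[key] = ext_layers[0].pop(key)
    return main[key]
```

When N is 1 the loop body never runs and the data goes straight from the single layer into the main memory, matching the N = 1 case described above.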
Optionally, the sixth preset condition may be that target data matched with the data to be read is stored in the extended memory, or that the hard disk stores target data matched with the data to be read and the target data is transferred from the hard disk to the Nth-level first file layer of the extended memory.
That is, when the sixth preset condition is satisfied, reading the fifth target data from the extended memory includes:
Reading the fifth target data from the extended memory when the fifth target data is stored in the extended memory; or, when the fifth target data is stored in the hard disk, transferring the fifth target data from the hard disk to the Nth-level first file layer of the extended memory and reading the fifth target data from the extended memory.
Therefore, the target data can be directly read from the extended memory, and the data reading efficiency and the data reading speed are improved.
In an alternative embodiment, the hard disk includes M-level second file layers, and the first-level second file layer is used for transferring the stored data to the nth-level first file layer, where M is greater than 1.
Correspondingly, transferring fifth target data from the hard disk to the nth stage first file layer of the expansion memory includes:
Determining a target-level second file layer of fifth target data in the hard disk;
Writing the fifth target data from the first-level second file layer into the Nth-level first file layer of the extended memory when the target-level second file layer is the first-level second file layer in the hard disk; or
when the target-level second file layer is not the first-level second file layer, starting from the target-level second file layer of the hard disk, writing the fifth target data stored in the current-level second file layer into the adjacent upper-level second file layer until the fifth target data is written into the first-level second file layer, and then writing the fifth target data from the first-level second file layer into the Nth-level first file layer of the extended memory.
In this process, the (j-1)-th-level second file layer in the hard disk is used to store data transferred from the j-th-level second file layer.
In this way, by utilizing a plurality of storage levels of the data files arranged in the hard disk, the transfer of the data from the hard disk to the extended memory is realized by means of the transfer of the target data among the file levels of the hard disk, and the data access order is ensured.
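To make the order of hops concrete, the following sketch enumerates the layers that target data passes through on its way to the reader. The function name `promotion_path`, the `"ext"`/`"disk"` tier labels, and the `"disk L3"` style of hop names are all illustrative assumptions; levels are numbered from 1 as in the description.

```python
from typing import List


def promotion_path(tier: str, level: int, n: int,
                   read_from_main: bool = True) -> List[str]:
    """List the layers visited by data found at `level` of `tier`.

    tier is "ext" (extended memory) or "disk" (hard disk); n is the number
    of first file layers N. With read_from_main=False the data is read
    directly from the extended memory (the sixth preset condition).
    """
    if tier not in ("ext", "disk"):
        raise ValueError("tier must be 'ext' or 'disk'")
    hops: List[str] = []
    if tier == "disk":
        # Climb the second file layers from the target level to level 1.
        hops += [f"disk L{j}" for j in range(level, 0, -1)]
        level = n  # data enters the extended memory at the Nth level
    if read_from_main:
        # Climb the first file layers to level 1, then enter main memory.
        hops += [f"ext L{i}" for i in range(level, 0, -1)]
        hops.append("main")
    else:
        hops.append(f"ext L{level}")
    return hops
```

For data found in the third-level second file layer with N = 2, the path is the three disk layers in descending level order, then the second and first levels of the extended memory, then the main memory.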
For the above data reading process, an exemplary implementation process is described for the case that the extended memory includes multiple levels of first file layers and the hard disk includes multiple levels of second file layers, and the implementation process specifically includes the following steps, as shown in fig. 5:
Step 501, based on the data to be read, it is determined whether the main memory, the extended memory, and the hard disk include target storage data matching the data to be read.
In an alternative implementation, the main memory includes files of a variable memory table and a non-variable memory table, the extended memory includes a plurality of SST files stored in each of N-level first file layers, and the hard disk includes a plurality of SST files stored in each of M-level second file layers.
In this way, the main memory, the extended memory and the hard disk can respectively construct corresponding file indexes, and the data to be read can be matched with the file indexes to determine where to store the target storage data matched with the data to be read.
Or in an optional implementation process, the corresponding data index can be directly constructed based on the data stored in the main memory, the extended memory and the hard disk, the data to be read can be matched with the data index, and the storage position of the target storage data matched with the data to be read can be determined.
The target storage data can be storage data with the same content as the data to be read, or storage data containing the data to be read, or other set matching relations.
When index matching is performed, a Bloom filter can be used to detect which files or data items may contain the data to be read. For example, if the data to be read is a key entered by a user, the filter detects which files or data items may contain that key. These files or data items form the target storage data matched with the data to be read, which determines whether the data to be read may be stored in the main memory, the extended memory, or the hard disk and improves query efficiency.
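A minimal Bloom filter sketch illustrates how candidate files can be narrowed down before any file is opened. The `BloomFilter` class, the `candidate_files` helper, the bit-array size, and the use of SHA-256 as the hash function are illustrative choices, not details given in the description.

```python
import hashlib
from typing import Dict, Iterator, List


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive several bit positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))


def candidate_files(filters: Dict[str, BloomFilter], key: str) -> List[str]:
    """Return the names of files whose filter says the key may be present."""
    return [name for name, bf in filters.items() if bf.might_contain(key)]
```

A file whose filter rejects the key can be skipped outright, while a file whose filter accepts it still has to be searched, because a Bloom filter can return false positives.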
Step 502, if the extended memory includes target storage data, determining a target level first file layer where the target storage data is located.
When the target storage data matched with the data to be read is determined to be positioned in the expansion memory, determining which file level of the N-level first file layers the target storage data is positioned in, namely determining the target-level first file layer in which the target storage data is positioned, and marking the first file layer as the X-level first file layer.
Subsequently, after step 502, there may be two ways of reading the data.
One way is to execute step 503, which requires larger modifications to the original key-value database to accommodate the newly added extended memory. The other way is to execute steps 504 to 506, which requires only minor modifications to the original key-value database to accommodate the newly added extended memory.
In step 503, the target storage data is read from the target level first file layer.
In this step, after determining which file level of the N-level first file layer the target storage data is stored in, since the file level is a storage area in the memory space, the CPU may choose to directly read the data to be read from the memory address of the target storage data, thereby improving the data reading efficiency and ensuring the data reading response speed.
Step 504, starting from the target-level first file layer, the target storage data is decompressed layer by layer and transferred into the first file layer one level up, until the target storage data is stored in the first-level first file layer of the N-level first file layers.
Step 505, the target storage data in the first-level first file layer of the N-level first file layers is decompressed and then transferred into the main memory.
In step 506, the target storage data is read from the main memory.
In the process from step 504 to step 506, the data of the X-th-level first file layer is decompressed and transferred layer by layer to the first-level first file layer, using the data access rule among the N-level first file layers under the file hierarchy adopted by the extended memory. The data of the first-level first file layer is then decompressed and transferred into the main memory, specifically into a variable memory table in the main memory, and the CPU reads the data to be read from that variable memory table. Data reading is thus carried out in memory space with few modifications to the original key-value database, which improves data reading efficiency and ensures the data reading response speed.
The data transfer in the process can be realized by adopting DSA, so that the data read-write performance in the database is further improved, and the overall data processing performance of the database is improved.
In some optional implementations, after determining whether the main memory, the extended memory, and the hard disk include target storage data matching the data to be read in step 501 based on the data to be read, the method further includes:
if the main memory contains the target storage data, the target storage data is directly read from the main memory.
When the target storage data matched with the data to be read is determined to be located in the main memory, the CPU directly reads the data to be read from the main memory, data reading is achieved in a memory space, data reading efficiency is improved, and data reading response speed is ensured.
In some optional implementations, after determining whether the main memory, the extended memory, and the hard disk include target storage data matching the data to be read in step 501 based on the data to be read, the method further includes:
if the hard disk contains target storage data, determining a target-level second file layer where the target storage data is located;
From the target-level second file layer, decompressing the target storage data layer by layer and transferring the target storage data into the upper-level second file layer until the target storage data is transferred and stored into the first-level second file layer in the M-level second file layer in the hard disk;
Decompressing target storage data in a first-level second file layer in the M-level second file layer, and then transferring the decompressed target storage data into an Nth-level first file layer in an extended memory;
reading target storage data from the Nth-level first file layer in the extended memory, or
From the Nth first file layer in the expansion memory, decompressing the target storage data layer by layer and transferring the target storage data into the first file layer at the upper level until the target storage data is transferred into the first file layer in the expansion memory;
decompressing the target storage data in the first file layer of the first stage and then transferring the decompressed target storage data into a main memory;
And reading the target storage data from the main memory.
And when the target storage data matched with the data to be read is determined to be positioned in the hard disk, marking the target-level second file layer where the target storage data is positioned as an X-level second file layer.
The data of the X-th-level second file layer is decompressed and transferred layer by layer to the first-level second file layer, the data of the first-level second file layer is decompressed and transferred into the Nth-level first file layer in the extended memory, and the CPU reads the data to be read from the Nth-level first file layer based on the memory address of the target storage data, which ensures that data to be read that resides in the hard disk is read effectively.
Alternatively, after the data of the X-th-level second file layer has been decompressed and transferred layer by layer to the first-level second file layer, and the data of the first-level second file layer has been decompressed and transferred into the Nth-level first file layer in the extended memory, the data of the Nth-level first file layer continues to be decompressed and transferred layer by layer to the first-level first file layer. The data of the first-level first file layer is then decompressed and transferred into the main memory, specifically into a variable memory table in the main memory, and the CPU reads the data to be read from that variable memory table, which ensures that data to be read that resides in the hard disk is read effectively with few modifications to the key-value database.
In some optional implementations, after determining whether the main memory, the extended memory, and the hard disk include target storage data matching the data to be read in step 501 based on the data to be read, the method further includes:
and if the main memory, the extended memory and the hard disk do not contain target storage data matched with the data to be read, returning response information.
The response information may be information indicating that the query returned no results, or feedback information indicating that the query operation failed, so as to effectively inform the user that the query produced no result.
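The dispatch across the three storage locations, including the not-found response, can be sketched as follows. The sketch follows the variant in which data is read directly from the extended memory; the `lookup` function, the dictionary-per-layer model, and the returned location strings are hypothetical, and decompression is not modeled.

```python
from typing import Dict, List, Optional, Tuple

Layer = Dict[str, str]


def lookup(key: str, main: Layer, ext_layers: List[Layer],
           disk_layers: List[Layer]) -> Tuple[Optional[str], str]:
    """Return (value, where it was found); value is None when nothing matches."""
    if key in main:
        return main[key], "main memory"
    for layer in ext_layers:
        if key in layer:
            # Read directly from the first file layer that holds the data.
            return layer[key], "extended memory"
    for j, layer in enumerate(disk_layers):
        if key in layer:
            # Move the data up one second file layer at a time to level 1.
            for up in range(j, 0, -1):
                disk_layers[up - 1][key] = disk_layers[up].pop(key)
            # Then transfer it into the Nth-level first file layer and read it there.
            ext_layers[-1][key] = disk_layers[0].pop(key)
            return ext_layers[-1][key], "hard disk"
    # No tier holds matching data: this stands in for the response information.
    return None, "not found"
```

A caller can turn the `"not found"` result into whichever response the system uses, for example an empty query result or an error prompt.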
It should be understood that, although the steps in the flowcharts of the embodiments described above are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution of these steps is not strictly limited to the order shown, and the steps may be performed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the application also provides a data processing device. The data processing device provided in the embodiments of the present application can implement the processes of the embodiments of the data processing method and achieve the same technical effects. For the specific limitations in one or more embodiments of the data processing device provided below, reference may therefore be made to the limitations of the data processing method above; to avoid repetition, details are not repeated here.
The embodiment can divide the functional modules on the computing side according to the method. For example, each function may be divided into each functional module, or two or more functions may be integrated into one processing module.
Referring to fig. 6, fig. 6 is a block diagram of a data processing apparatus according to an embodiment of the present application, and for convenience of explanation, only a portion related to the embodiment of the present application is shown.
The data processing device 600 comprises a main memory, an extended memory and a hard disk, wherein the extended memory comprises an N-level first file layer, N is more than or equal to 1, the hard disk comprises an M-level second file layer, M is more than 1, and the data processing device 600 further comprises:
A data storage module 601 for:
Writing first target data in the main memory into a first-level first file layer of the extended memory when a first preset condition is met, wherein, when N is greater than 1, an i-th-level first file layer in the N-level first file layers is used for storing data transferred from the (i-1)-th-level first file layer, and 1<i≤N;
Writing second target data stored in an Nth-level first file layer of the extended memory into a first-level second file layer in the hard disk when a second preset condition is met, wherein a j-th-level second file layer in the M-level second file layers is used for storing data transferred from the (j-1)-th-level second file layer, and 1<j≤M.
Optionally, the data processing apparatus 600 further includes:
A data reading module 602, configured to:
Reading fifth target data matched with the data to be read from the main memory under the condition that a fifth preset condition is met, or
And under the condition that a sixth preset condition is met, the fifth target data is read from the extended memory.
The integrated modules described above may be implemented in hardware. It should be noted that the division of modules in this embodiment is schematic and is merely a division by logical function; other division manners are possible in actual implementation.
It should be noted that, for the functional description of each functional module, reference may be made to the relevant content of the corresponding steps in the above method embodiments, which is not repeated here.
In one embodiment, as shown in FIG. 7, a computer device is provided. The computer device 7 of this embodiment comprises at least one processor 700 (only one shown in fig. 7), a memory 701 and a computer program 702 stored in said memory 701 and executable on said at least one processor 700, said processor 700 implementing the steps of any of the respective method embodiments described above when said computer program 702 is executed.
The computer device 7 may be a desktop computer, a notebook, a palm computer or the like. The computer device 7 may include, but is not limited to, a processor 700, a memory 701. It will be appreciated by those skilled in the art that fig. 7 is merely an example of the computer device 7 and is not limiting of the computer device 7, and may include more or fewer components than shown, or may combine certain components, or different components, e.g., the computer device may also include input and output devices, network access devices, buses, etc.
The processor 700 may be a central processing unit (Central Processing Unit, CPU), or may be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 701 may be an internal storage unit of the computer device 7, such as a hard disk or a memory of the computer device 7. The memory 701 may also be an external storage device of the computer device 7, such as a plug-in hard disk provided on the computer device 7, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), or the like. Further, the memory 701 may include both an internal storage unit and an external storage device of the computer device 7. The memory 701 is used to store the computer program and other programs and data required by the computer device. The memory 701 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, the above division of functional units and modules is used only as an example. In practical applications, the above functions may be assigned to different functional units and modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in the form of hardware or in the form of software functional units. In addition, the specific names of the functional units and modules are only for distinguishing them from each other and are not used to limit the protection scope of the present application. For the specific working process of the units and modules in the above system, reference may be made to the corresponding process in the foregoing method embodiments, which is not repeated here.
In the foregoing embodiments, each embodiment is described with its own emphasis. For parts that are not detailed or illustrated in a particular embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided by the present application, it should be understood that the disclosed apparatus/computer device and method may be implemented in other manners. For example, the apparatus/computer device embodiments described above are merely illustrative: the division of the modules or units is merely a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the present application may implement all or part of the flow of the methods of the above embodiments by means of a computer program instructing related hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, may implement the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
All or part of the procedures in the methods of the embodiments described above may also be implemented by a computer program product which, when run on a computer device, causes the computer device to implement the steps of the method embodiments described above.
The foregoing embodiments are merely illustrative of the technical solutions of the present application, and not restrictive, and although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those skilled in the art that modifications may still be made to the technical solutions described in the foregoing embodiments or equivalent substitutions of some technical features thereof, and that such modifications or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.