Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the application.
Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In embodiments of the present application, the terms "first," "second," and the like are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the term "include" and any variations thereof are intended to cover a non-exclusive inclusion.
Fig. 1 is a schematic diagram of a storage system according to an embodiment of the present application. As shown in fig. 1, the storage system 110 includes a recycle bin management server 111 and a metadata server 112. The recycle bin management server 111 may be configured to generate an instruction that instructs the metadata server 112 to perform a cleaning action on metadata in the recycle bin. The metadata server 112 includes a key-value (KV) database. The KV database stores data as key-value pairs, in which each key corresponds to a unique value, and is suited to querying data by primary key.
In the embodiment of the application, the KV database stores data of each tenant, for example, including, but not limited to, metadata of a file system and metadata of a recycle bin. Different tenants lease different storage space, use different file systems and recycle bins. Of course, the same tenant may use multiple file systems, which is not limited in this regard by the present application. In the following, for convenience of distinguishing and explanation, the file system and the recycle bin are distinguished by taking the tenant as granularity.
It should be appreciated that the recycle bin management server 111 and the metadata server 112 are shown separately in fig. 1 for ease of distinction only. In a practical scenario, the recycle bin management server 111 and the metadata server 112 may be two independent physical devices, or may be two different logical partitions on the same physical device. The embodiment of the present application is not limited thereto.
In order to prevent the data stored in the storage system by the tenant from being deleted by mistake and lost, a recycle bin function is designed in the storage system so that the tenant can recover the deleted data. The recycle bin also has an automatic cleaning function: when the retention time of data in the recycle bin exceeds the preset duration, the recycle bin management server sends an instruction to the metadata server to instruct the metadata server to clean out the expired data.
At present, the storage system charges for the storage space occupied by data in the recycle bin at the normal storage price. The storage space is released, and charging stops, only after the data is cleaned from the recycle bin. Therefore, if the recycle bin does not clean expired data in time, the expired data continues to occupy the billing space of the tenant and adds extra cost for the tenant. The recycle bin therefore needs to clean expired data in real time. However, cleaning recycle bin data requires modifying metadata of the file system and also triggers deletion of both metadata and data, which incurs a certain overhead.
The cleaning of the recycle bin data includes two processes: deleting metadata and freeing space. The flow of deleting metadata is shown in fig. 2, and the flow of freeing space is shown in fig. 3.
Specifically, the process of deleting metadata may include:
Step 201, deleting the file object metadata;
Step 202, deleting the file object index record;
Step 203, modifying the modification time and the statistical information of the parent directory;
Step 204, modifying the file system statistics;
Step 205, recording a file deletion event.
The process of freeing space may include:
Step 301, acquiring a file deletion event;
Step 302, determining the storage location of the data according to the file layout record;
Step 303, deleting the data;
Step 304, modifying the storage management record of the system;
Step 305, deleting the layout record of the file.
Current disks widely use a log-structured merge-tree (LSM-Tree) structure to store data. The structure of the LSM-Tree is shown in fig. 4, which shows random access memory (RAM) and a disk. The disk may be further divided into multiple levels, shown as levels 0, 1 and 2. The total size of the files in each level is predefined, and each level is larger than the level above it. When the memory is full, the files in the memory are written to the disk. The files on the disk are merged downward level by level: once the total size of the files in one level exceeds a predefined threshold, one file may be selected and merged with the files in the next level (compaction).
In the above two processes, the modifying and deleting operations, i.e. steps 201, 202, 203, 204, 303, 304 and 305, are first written to the memory table (memtable) in the RAM, and the data is actually modified or deleted only after multiple merges. A query request, meanwhile, needs to traverse the memtable in the RAM, the level 0 sorted string tables (SSTables) on the disk, and the SSTables of lower levels until the queried data is found. Therefore, if the background is cleaning up recycle bin data while users are accessing the system, the overhead is large and system performance is affected; for example, the processing of create, delete, and read/write requests on the data link is affected.
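The read path described above can be illustrated with a minimal sketch. It assumes each table is a plain dictionary and that a delete is recorded as a tombstone marker in the memtable; the names `TOMBSTONE` and `lsm_get` are hypothetical and are not taken from the application or from any particular LSM-Tree implementation.

```python
# Marker written by a delete. The old value still exists in a lower level
# until later merges physically remove it.
TOMBSTONE = object()

def lsm_get(key, memtable, levels):
    """Look up key from the newest table to the oldest.

    Returns (value, tables_probed). The newest entry wins, so a tombstone
    in the memtable hides an older value stored in a lower level.
    """
    probed = 0
    for table in [memtable, *levels]:
        probed += 1
        if key in table:
            value = table[key]
            return (None if value is TOMBSTONE else value), probed
    return None, probed

# A delete of "file:27" sits in the memtable; the old record is still in level 1.
memtable = {"file:27": TOMBSTONE}
levels = [
    {"file:30": "meta-30"},                        # level 0
    {"file:27": "meta-27", "file:31": "meta-31"},  # level 1
]

print(lsm_get("file:31", memtable, levels))  # ('meta-31', 3)
```

The second element of the result shows how many tables a lookup had to probe, which is the cost that grows when many unmerged modifications and deletions accumulate.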
In view of this, the present application provides a data processing method. When data in the recycle bin expires, on the one hand, the expired data is identified and is not counted into the billing space of the tenant; on the other hand, the expired data is not cleaned until the storage system enters a valley period. In this way, the strategy by which the recycle bin cleans expired data is optimized, the impact on system performance is reduced, the billed occupancy of storage space by the tenant is not extended, and unnecessary expense is avoided for the tenant.
The data processing method and the storage system provided by the embodiment of the application are described in detail below with reference to the accompanying drawings.
It should be noted that the data processing method provided by the embodiment of the application can be applied to a storage system. The storage system may be used to provide storage services for one or more tenants, each of which may use the storage space provided by the storage system to store respective data. The storage system may include a recycle bin management server and a metadata server, and the following embodiments will describe the method provided by the embodiments of the present application by taking interaction between the two as an example.
For convenience of distinction and explanation, the method provided by the embodiment of the present application is described below using the first data of the first tenant as an example. The first data is data which is stored in the recycle bin for a preset time period and is not cleaned. Briefly, the first data is the data that expires without being cleaned. It will be appreciated that the first data may be data from any one tenant, which is defined herein as data of the first tenant for ease of distinction only.
Fig. 5 is a schematic flow chart of a data processing method according to an embodiment of the present application. As shown in fig. 5, the method may include steps 501 to 504. The steps in fig. 5 are described in detail below.
In step 501, the recycle bin management server generates a first instruction according to a time index of the first data.
As described above, the first data is data which is stored in the recycle bin for a predetermined period of time but is not cleared. The preset duration may be set by the tenant, or may be default of the system, which is not limited in the present application. The recycle bin management server can judge whether the data is out of date or not according to the time when the data is put into the recycle bin and the preset time.
Here, putting data into the recycle bin may mean that, when the user deletes the data from the file system, the system automatically moves the data into the recycle bin and keeps it there temporarily for a period of time in order to guard against accidental deletion. In an embodiment of the present application, the operation of deleting data from the file system may be recorded. For example, data 1 (i.e., one example of the first data) is placed in the recycle bin at hour 1 on June 15, 2021, recorded as 2021061501, and data 2 (i.e., another example of the first data) is placed in the recycle bin at hour 2 on June 15, 2021, recorded as 2021061502.
Since the tenant may delete unnecessary data at any time, new data may be stored in the recycle bin at any time. Thus, the recycle bin management server may periodically check the data in the recycle bin to determine newly expired, uncleaned data.
In an embodiment of the present application, the time index of the first data may be related to the time when the first data is put into the recycle bin. Each time index may correspond to a time node. In this way, the recycle bin management server may generate the first instruction based on the time index of the first data, so as to instruct the metadata server to identify the first data as abnormal state through the first instruction.
Here, the time index may be created with different time lengths as granularity, for example, with granularity of hours, and then one time index is created for each hour, corresponding to the data put into the recycle bin in that hour.
In one example, data 1 is placed in the recycle bin at hour 1 on June 15, 2021, with a corresponding time index of 1024, and data 2 is placed in the recycle bin at hour 2 on June 15, 2021, with a corresponding time index of 1025.
In the embodiment of the application, the granularity of the time index can be the same as that of the charging mode. For example, the charging for the tenant to use the storage space is charged in units of hours, and the granularity of the time index may be hours. Of course, the use of the storage space by the tenant may be charged in units of other time lengths, and the time index may be differentiated in units of other time lengths, which is not limited by the present application.
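As an illustration of an hour-granularity time index, the following sketch assigns one index to each hour in which data is placed into the recycle bin. The class name `TimeIndexAllocator`, the sequential numbering, and the starting value 1024 are assumptions chosen only to match the example above; the application does not specify how index values are allocated.

```python
from datetime import datetime

class TimeIndexAllocator:
    """Hands out one time index per hour bucket (hypothetical helper)."""

    def __init__(self, first_index=1024):
        self._next = first_index
        self._by_hour = {}  # "YYYYMMDDHH" -> time index

    def index_for(self, placed_at: datetime) -> int:
        # All data placed within the same hour shares one time index.
        hour_key = placed_at.strftime("%Y%m%d%H")
        if hour_key not in self._by_hour:
            self._by_hour[hour_key] = self._next
            self._next += 1
        return self._by_hour[hour_key]

allocator = TimeIndexAllocator()
print(allocator.index_for(datetime(2021, 6, 15, 1, 20)))  # 1024
print(allocator.index_for(datetime(2021, 6, 15, 2, 5)))   # 1025
```

If billing used a different unit of time, only the format string that builds the bucket key would need to change.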
Optionally, step 501 includes determining, by the recycle bin management server, a time index of the first data according to a preset duration and a first mapping relationship, where the first mapping relationship includes a correspondence between at least one time index and at least one time, and the time in the at least one time represents a time when the data under the corresponding time index is placed into the recycle bin, and generating, by the recycle bin management server, a first instruction according to the time index of the first data.
Table 1 shows an example of the first mapping relation.
TABLE 1
Time index | Time
1024 | 2021061501
1025 | 2021061502
It should be understood that the first mapping relation shown in table 1 is given based on data 1 and data 2 in the above example, but this should not constitute any limitation of the present application. The present application is not limited to the specific content in the first mapping relation.
Taking the mapping relation between the time index 1024 and the time 2021061501 as an example, assume that the preset duration is 1 day. When data 1, whose time is 2021061501, has been stored in the recycle bin for the preset duration, that is, at hour 1 on June 16, 2021, the recycle bin management server determines the time index 1024 of data 1 from the time 2021061501 according to the first mapping relation, and then generates the first instruction according to the time index 1024 of data 1.
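The expiry check in this example can be sketched as follows. The function name `expired_time_indexes` and the dictionary representation of the first mapping relation are hypothetical; the sketch also assumes that each recorded time is treated as the start of its hour.

```python
from datetime import datetime, timedelta

def expired_time_indexes(first_mapping, now, preset_duration):
    """Return the time indexes whose data has reached the preset duration.

    first_mapping: {time index: "YYYYMMDDHH"}, as in table 1.
    """
    expired = []
    for time_index, stamp in sorted(first_mapping.items()):
        # Assumption: the stamp marks the start of the hour bucket.
        placed_at = datetime.strptime(stamp, "%Y%m%d%H")
        if now - placed_at >= preset_duration:
            expired.append(time_index)
    return expired

first_mapping = {1024: "2021061501", 1025: "2021061502"}
one_day = timedelta(days=1)

# At hour 1 on June 16, 2021, only time index 1024 has reached one day.
print(expired_time_indexes(first_mapping, datetime(2021, 6, 16, 1), one_day))  # [1024]
```

A first instruction would then be generated for each returned time index.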
Note that the first mapping relationship may be recorded in the metadata server, for example, in the metadata of the recycle bin. The recycle bin management server can read the related data in the metadata server through an access interface provided by the metadata server, determine the time index of the first data, and then generate the first instruction according to the time index of the first data.
The storage space occupied by data identified as abnormal in the recycle bin is not counted into the billing space of the first tenant. That is, the storage space occupied by expired, uncleaned data is not counted into the tenant's billing space, and the tenant does not have to pay extra for such data. In contrast, the storage space occupied by data identified as normal in the recycle bin is counted into the billing space of the first tenant, who needs to pay for it. Here, data in the normal state is data in the recycle bin that has not expired, that is, data that has been stored in the recycle bin for less than the preset duration.
In step 502, the recycle bin management server sends a first instruction to the metadata server. Accordingly, the metadata server receives a first instruction from the recycle bin management server.
In step 503, the metadata server identifies the first data as abnormal based on the first instruction.
The recycle bin management server may trigger the metadata server to identify the first data as abnormal by sending a first instruction to the metadata server.
Optionally, step 503 includes the metadata server identifying, based on the first instruction, a state of the first data in the second mapping relationship as abnormal, where the second mapping relationship includes a correspondence between at least one time index and at least one state, where the at least one state includes normal and abnormal states, and a storage space occupied by the data identified as normal in the recycle bin is counted into a billing space of the first tenant.
In one example, the second mapping relationship is a mapping relationship between the time index 1024 and the state. Before the metadata server receives the first instruction, the second mapping relation of the data 1 is a time index 1024 and a state is normal, and after the metadata server receives the first instruction, the second mapping relation of the data 1 is changed into a time index 1024 and a state is abnormal.
Table 2 shows an example of the second mapping relationship.
TABLE 2
Time index | State
1024 | Abnormal state
1025 | Normal state
It should be understood that the first mapping relationship and the second mapping relationship may be stored as two independent mapping relationships, or may be stored as one mapping relationship by synthesis. For example, table 3 shows another example of the first mapping relationship and the second mapping relationship.
TABLE 3
Time index | State | Time
1024 | Abnormal state | 2021061501
1025 | Normal state | 2021061502
Optionally, the first instruction further indicates an occupation amount of the storage space by the first data, and after step 501, the method further includes that the metadata server excludes the occupation amount of the storage space by the data to be cleaned from the occupation amount of the storage space by the file system and the recycle bin of the first tenant, so as to obtain a billing space of the first tenant.
Here, the data to be cleaned may include the first data that is determined to be expired in step 501 described above, as well as data that was previously determined to be expired and has not yet been cleaned. Since these data have been stored in the recycle bin for a period equal to or longer than the preset duration, they should be cleaned. However, under the mechanism introduced by the present application, the metadata server cleans the data only when the system overhead meets the preset condition, so the time for which the data is stored in the recycle bin may exceed the preset duration. In order to avoid unnecessary expense for the tenant, the storage space occupied by the data to be cleaned is excluded from the billing space of the first tenant. That is, billing space of the tenant = storage space occupied by the tenant's file system + storage space occupied by the recycle bin - storage space occupied by the data to be cleaned.
For example, the unit of storage space is the gigabyte (GB). Suppose the first instruction indicates that the storage space occupied by data 1 is 1GB, the storage space occupied by the file system of the first tenant is 10GB, the storage space occupied by the recycle bin is 5GB, and the storage space occupied by the data to be cleaned, as counted by the metadata server before the first instruction is received, is 2GB. After receiving the first instruction, the metadata server may add the 1GB occupied by data 1 to the 2GB previously counted for the data to be cleaned, to obtain 3GB. The billing space of the first tenant is then 10GB + 5GB - 3GB = 12GB.
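The arithmetic of this example can be written out as a short sketch. The function name `billing_space_gb` and its parameter names are hypothetical.

```python
def billing_space_gb(file_system_gb, recycle_bin_gb, to_be_cleaned_gb):
    """Billing space = file system + recycle bin - data to be cleaned (all in GB)."""
    return file_system_gb + recycle_bin_gb - to_be_cleaned_gb

# Values from the example: 2 GB was already awaiting cleaning, and the
# first instruction reports 1 GB more for data 1.
to_be_cleaned_gb = 2 + 1
billing_gb = billing_space_gb(file_system_gb=10, recycle_bin_gb=5,
                              to_be_cleaned_gb=to_be_cleaned_gb)
print(billing_gb)  # 12
```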
Optionally, the abnormal state includes a frozen state and a clean state.
It should be appreciated that both the frozen state and the clean state are abnormal states, and that the storage space occupied by data identified as being in either the frozen state or the clean state is not counted into the billing space. In the embodiment of the application, the frozen state and the clean state are used to distinguish the states before and after metering.
Illustratively, step 503 includes: the metadata server identifies the state of the first data as the frozen state before excluding the storage space occupied by the first data from the storage space occupied by the file system and recycle bin of the first tenant, and identifies the state of the first data as the clean state after excluding it.
For example, after receiving the first instruction and before performing the act of cleaning data 1, the metadata server identifies the time index 1024 of data 1 as frozen in the second mapping relationship. The metadata server then adds the storage space occupied by data 1 to the storage space occupied by the data to be cleaned and, after updating the billing space, identifies the time index 1024 of data 1 in the second mapping relationship as the clean state.
In this way, the atomicity of the metadata server's processing of the first data can be ensured, and repeated accumulation of the storage space occupied by the first data when a request is replayed, which would make the metering result of the billing space wrong, can be avoided.
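A minimal sketch of how the two states could guard against double counting on replay is given below. The class `MeteringLedger` and its fields are hypothetical, and the sketch assumes that the accumulation and the transition to the clean state are committed together as one atomic update, which a real metadata server would have to guarantee by its own means.

```python
class MeteringLedger:
    """Tracks per-time-index state and the space awaiting cleaning (hypothetical)."""

    def __init__(self, to_be_cleaned_gb=0):
        self.state = {}  # time index -> "normal" | "frozen" | "clean"
        self.to_be_cleaned_gb = to_be_cleaned_gb

    def apply_first_instruction(self, time_index, size_gb):
        if self.state.get(time_index, "normal") == "normal":
            # Expired but not yet metered.
            self.state[time_index] = "frozen"
        if self.state[time_index] == "frozen":
            # Assumption: these two updates are committed atomically.
            self.to_be_cleaned_gb += size_gb
            self.state[time_index] = "clean"
        # State "clean": the request is a replay and nothing is added.

ledger = MeteringLedger(to_be_cleaned_gb=2)
ledger.apply_first_instruction(1024, 1)
ledger.apply_first_instruction(1024, 1)  # replayed request
print(ledger.to_be_cleaned_gb)  # 3
```

Because the replayed request finds the time index already in the clean state, the 1 GB of data 1 is accumulated only once.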
It should be understood that the frozen state and the clean state are only two states introduced for distinguishing before and after charging, and should not be construed as limiting the application in any way. The application does not exclude the possibility of introducing other states to distinguish between the two states before and after charging.
In step 504, the first data identified as abnormal is cleaned from the recycle bin if the overhead satisfies a predetermined condition.
Because cleaning data from the recycle bin incurs a certain system overhead and thereby affects the create, delete, and read/write functions of the data link in the storage system, data cleaning is preferably performed during a valley period of the storage system.
Whether the storage system is in the valley period or not can be judged according to the system overhead. If the overhead is low, it is considered to be in the valley period. Therefore, the metadata server can monitor the current system overhead in real time, and the data marked as abnormal state is cleaned from the recycle bin under the condition of low system overhead. Here, the data identified as abnormal state includes first data.
As previously described, in order to distinguish between the states before and after metering, a frozen state and a clean state are introduced, respectively. It will be appreciated that since the data requires metering of the billing space before being cleaned, the cleaned data is typically data in the cleaned state with the frozen state and cleaned state introduced.
Here, the overhead includes the overhead of the foreground and the overhead of the background. The system overhead of the foreground specifically includes system overhead caused by user access, such as system overhead caused by operations of writing, deleting, modifying, inquiring and the like, and the system overhead of the background specifically may include system overhead caused by cleaning up the expired data in the recycle bin.
The metadata server can clean the data to be cleaned from the recycle bin under the condition of low system overhead of the foreground and/or low system overhead of the background.
Illustratively, the overhead of the foreground may be measured by the input output per second (input output per second, IOPS) of the foreground, the Query Per Second (QPS) of the foreground, and so on. For example, a threshold is set for the foreground IOPS or QPS, e.g., the foreground IOPS corresponds to a first preset threshold and the foreground QPS corresponds to a second preset threshold.
If the foreground IOPS is taken as a judgment basis, the metadata server can determine that the system overhead of the foreground is smaller under the condition that the foreground IOPS is lower than a first preset threshold, and if the foreground QPS is taken as a judgment basis, the metadata server can determine that the system overhead of the foreground is smaller under the condition that the foreground QPS is lower than a second preset threshold.
It should be understood that foreground IOPS and foreground QPS are just two examples of overhead for characterizing a foreground, and the present application is not limited thereto.
The overhead of the background can be measured by the amount of unmerged data. For example, a third preset threshold is set for the amount of unmerged data. If the amount of unmerged data is lower than the third preset threshold, the metadata server determines that the system overhead of the background is small.
The metadata server can clean the data to be cleaned from the recycle bin under the condition that any one of the foreground and the background has small system overhead, or can clean the data to be cleaned from the recycle bin under the condition that the foreground and the background have small system overhead.
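The threshold checks described above can be sketched as follows. The names `Load`, `Thresholds` and `meets_preset_condition`, the choice of which checks are enabled, and the numeric values are all illustrative assumptions; the application leaves the concrete selection to the deployment.

```python
from dataclasses import dataclass

@dataclass
class Load:
    foreground_iops: float
    foreground_qps: float
    unmerged_data: int  # amount of data not yet merged, in arbitrary units

@dataclass
class Thresholds:
    iops: float         # first preset threshold
    qps: float          # second preset threshold
    unmerged_data: int  # third preset threshold

def meets_preset_condition(load, thresholds,
                           enabled=("iops", "qps", "unmerged"),
                           require_all=True):
    """Check the enabled indicators; each one passes when it is below its threshold."""
    checks = {
        "iops": load.foreground_iops < thresholds.iops,
        "qps": load.foreground_qps < thresholds.qps,
        "unmerged": load.unmerged_data < thresholds.unmerged_data,
    }
    results = [checks[name] for name in enabled]
    return all(results) if require_all else any(results)

load = Load(foreground_iops=800, foreground_qps=50, unmerged_data=10)
limits = Thresholds(iops=1000, qps=100, unmerged_data=5)

print(meets_preset_condition(load, limits))                     # False
print(meets_preset_condition(load, limits, require_all=False))  # True
```

With `require_all=True` cleaning starts only when both foreground and background overhead are small; with `require_all=False` either one being small is enough.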
The preset conditions include one or more of the following: the input/output operations per second of the foreground being lower than a first preset threshold, the queries per second of the foreground being lower than a second preset threshold, and the amount of unmerged data in the memory being lower than a third preset threshold. In practical application, which item or items are selected, and whether more preset conditions are added, can be set by staff, and the application is not limited in this respect. The metadata server may execute step 504 on its own according to the state of each data in the second mapping relationship, or may execute step 504 in response to a call from the recycle bin management server.
Optionally, before step 504, the method further includes the recycle bin management server generating a second instruction based on the N data to be cleaned, the second instruction being used for indicating to clean the N data to be cleaned, where the N data to be cleaned includes the first data and N is a preset value, and the recycle bin management server sending the second instruction to the metadata server.
Because the recycle bin management server may determine one or more pieces of expired, uncleaned data each time, and each piece becomes data to be cleaned once it is identified as abnormal, sending a separate instruction for each piece of data may cause a large signaling overhead. The cleaning requests for N pieces of data to be cleaned may therefore be aggregated into one instruction and sent to the metadata server. That is, the recycle bin management server may generate a second instruction based on the N pieces of data to be cleaned to instruct cleaning of them. N may be a system default or may be set by the tenant, which is not limited by the present application.
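The aggregation can be sketched as grouping pending identifiers into batches of N. The function name `build_second_instructions` and the dictionary layout of an instruction are hypothetical, and sending a final batch with fewer than N entries is an assumption that the application does not address.

```python
def build_second_instructions(pending_ids, n):
    """Group data identifiers into instructions carrying at most n entries each."""
    if n <= 0:
        raise ValueError("n must be a positive preset value")
    return [
        {"op": "clean", "data_ids": pending_ids[start:start + n]}
        for start in range(0, len(pending_ids), n)
    ]

# Hypothetical identifiers awaiting cleaning.
pending = ["id-a", "id-b", "id-c", "id-d", "id-e"]
instructions = build_second_instructions(pending, n=2)
print(len(instructions))  # 3
```

Five pending items with N = 2 result in three instructions instead of five, which is the signaling saving the aggregation is meant to provide.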
It should be noted that the data may include data in a file and data in a directory. A data may specifically refer to data in a file (or data under a file directory) or data in a directory.
It should be further noted that, the recycle bin management server marks the first data as abnormal, and the first data is cleaned by calling the metadata server with the recycle bin management server, which can be regarded as two independent processes, and the two independent processes are not coupled with each other.
Fig. 6 is a flow diagram of a recycle bin management service and a metadata service, as an example. Wherein the recycle bin management service may be provided by a recycle bin management server, the metadata service may be provided by a metadata server, and the metadata server may provide the metadata service in response to a call from the recycle bin management server.
In the recycle bin management service flow, a file system is first selected for cleaning, then the hour directories are obtained and those in the clean state are extracted, then the files and subdirectories under the hour directory to be cleaned are obtained, and finally the metadata service is called to execute the cleaning. In the metadata service flow, a file/subdirectory cleaning request is first received, then whether cleaning is allowed is judged according to the state of the hour directory, next whether the cleaning request can be executed is judged according to indicators such as the QPS of current foreground requests and the number of level 0 files, and finally the file/subdirectory is cleaned and the cleaning result is returned to the recycle bin management server. If the cleaning succeeds, the metadata service may be called again to continue cleaning; if the metadata service is busy, the flow returns to selecting a file system for cleaning.
In the above steps, the hour directory may be the contents shown in table 1, table 2 or table 3.
Optionally, the second instruction carries the data identifiers of the N pieces of data to be cleaned. In an embodiment of the application, a data identifier may include a time inode, a parent directory inode, a parent directory entry identifier (dentryid), and the file name of the data to be cleaned.
The data identifier takes the form <hour directory inode>#<original parent directory inode>_<dentryid>_<original file name>, where the hour directory inode is an example of a time inode with a granularity of hours.
For example, the data identifier of the data with the file name "file2" is "1024#23_27_file2". For another example, the data identifier of the data with the file name "dir4" is "1025#20_23_dir4".
It should be appreciated that each data identifier may be used to uniquely identify one piece of data and to determine the location of that data in the storage space. Thus, based on the data identifiers of the N pieces of data to be cleaned, the metadata server can determine which data need to be deleted and where they are located.
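A sketch of building and parsing identifiers in this form is shown below. The names `DataId`, `parse_data_id` and `format_data_id` are hypothetical, and treating the three leading fields as integers is an assumption based on the two examples. Splitting the tail at most twice keeps underscores inside a file name intact.

```python
from typing import NamedTuple

class DataId(NamedTuple):
    hour_dir_inode: int
    parent_inode: int
    dentry_id: int
    file_name: str

def format_data_id(data_id: DataId) -> str:
    return (f"{data_id.hour_dir_inode}#{data_id.parent_inode}"
            f"_{data_id.dentry_id}_{data_id.file_name}")

def parse_data_id(text: str) -> DataId:
    hour_part, rest = text.split("#", 1)
    # maxsplit=2 so that a file name containing "_" is not broken apart.
    parent, dentry, name = rest.split("_", 2)
    return DataId(int(hour_part), int(parent), int(dentry), name)

print(parse_data_id("1024#23_27_file2"))
# DataId(hour_dir_inode=1024, parent_inode=23, dentry_id=27, file_name='file2')
```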
The metadata server counts the input/output requests and the query requests of the foreground and the amount of data not yet merged in the background. The foreground input/output requests and query requests can be used to judge whether the storage system is currently busy, and the amount of unmerged data in the background can be used to predict the performance impact that subsequent foreground input/output requests may have on the storage system. Whether the storage system is in a valley period can therefore be judged from the three preset conditions.
Data identified as abnormal cannot be accessed, whereas data identified as normal can be accessed.
"Not accessible" and "accessible" here are from the tenant's point of view. If the duration for which data 1 has been in the recycle bin has not reached the preset duration, data 1 is still stored in the recycle bin and the tenant can access it. If the duration has reached the preset duration, data 1 is in a cleared state from the tenant's point of view, and the tenant cannot access data 1 even though it has not yet been cleaned from the metadata server.
Optionally, the method further includes the following. The metadata server receives a query request, where the query request carries a directory structure index of second data and is used for requesting to query the second data. The metadata server determines a time index of the second data according to the directory structure index of the second data and a third mapping relation, where the third mapping relation includes a correspondence between at least one directory structure index and at least one time index. The metadata server determines the state of the second data according to the second mapping relation. The metadata server rejects the query request if the state of the second data is abnormal, or searches for the second data based on the directory structure index and the time index if the state of the second data is normal.
Table 4 below shows an example of the third mapping relationship. The third mapping relationship shown in table 4 may be referred to as a directory structure table.
TABLE 4
Directory structure index | Time index
2 | 1025
20 | 1025
22 | 1025
23 | 1024, 1025
26 | 1024
The flow of the recycle bin directory structure query is shown in fig. 7. Specifically, the flow may include:
Step 701, obtaining the hour directories that store the deleted files/subdirectories of the directory;
Step 702, filtering out the hour directories in the abnormal state;
Step 703, querying the files and directories under the hour directories in the normal state.
As an example, assume that the directory structure indexes of data 1 and data 2 are both "23". If the tenant queries "23" at hour 1 on June 16, 2021, the metadata server may determine through the third mapping relationship that the time indexes are 1024 and 1025, and then query the second mapping relationship, in which the state corresponding to the time index 1024 is abnormal and the state corresponding to the time index 1025 is normal. Data 1, with time index 1024, is filtered out, and data 2, with time index 1025, is returned to the tenant.
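The filtering in this example can be sketched with two dictionaries standing in for the third and second mapping relationships. The function name `visible_time_indexes` and the dictionary representation are hypothetical; the contents follow the directory structure table and the state table given above.

```python
# Directory structure index -> time indexes (third mapping relationship).
third_mapping = {2: [1025], 20: [1025], 22: [1025], 23: [1024, 1025], 26: [1024]}
# Time index -> state (second mapping relationship).
second_mapping = {1024: "abnormal", 1025: "normal"}

def visible_time_indexes(directory_index, third_mapping, second_mapping):
    """Return the time indexes under a directory that the tenant may still query."""
    candidates = third_mapping.get(directory_index, [])
    # Hour directories in the abnormal state are filtered out.
    return [idx for idx in candidates if second_mapping.get(idx) == "normal"]

print(visible_time_indexes(23, third_mapping, second_mapping))  # [1025]
```

For directory structure index 26, which maps only to time index 1024, the result is an empty list, corresponding to a rejected query.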
Based on the method, the first data that has expired but has not been cleaned is identified as abnormal by the metadata server and is not counted into the billing space of the tenant, and the data identified as abnormal is cleaned when the system overhead meets the preset condition. In this way, the strategy by which the recycle bin cleans expired data is optimized, the impact on system performance is reduced, the billed occupancy of storage space by the tenant is not extended, and unnecessary expense is avoided for the tenant.
Fig. 8 is a schematic block diagram of a recycle bin management server according to an embodiment of the present application. The recycle bin management server 800 may be the recycle bin management server 111 in fig. 1, or a server having the same function. As shown in fig. 8, the recycle bin management server 800 may include a processing module 810 and a sending module 820. The processing module 810 may be configured to generate a first instruction according to a time index of first data, where the first data is data that has been stored in a recycle bin of a first tenant for a preset duration but has not been cleaned, the first instruction is used to instruct that the first data be identified as being in an abnormal state, and the storage space occupied by data identified as abnormal in the recycle bin is not counted toward the billing space of the first tenant. The sending module 820 may be configured to send the first instruction to the metadata server.
Optionally, the processing module 810 is further configured to determine the time index of the first data according to the preset duration and a first mapping relationship, where the first mapping relationship includes a correspondence between at least one time index and at least one time, and each time represents the time at which the data under the corresponding time index was stored in the recycle bin.
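As a rough sketch of how a time index could be derived from the preset duration and the first mapping relationship, the following assumes that the first mapping is held as a dictionary from time index to storage time. The function name `expired_time_indexes`, the dictionary layout, and the example dates are illustrative assumptions and are not taken from the application.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def expired_time_indexes(
    first_mapping: Dict[int, datetime],  # assumed layout: time index -> storage time
    preset_duration: timedelta,
    now: datetime,
) -> List[int]:
    """Return the time indexes whose data has been stored for at least the preset duration."""
    return sorted(
        index
        for index, stored_at in first_mapping.items()
        if now - stored_at >= preset_duration
    )


# Illustrative values only: two hour directories and a 7-day retention period.
mapping = {
    1024: datetime(2021, 6, 9, 0),
    1025: datetime(2021, 6, 9, 1),
}
expired = expired_time_indexes(mapping, timedelta(days=7), datetime(2021, 6, 16, 0, 30))
```

The resulting indexes are the ones for which a first instruction would be generated under this reading.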
Optionally, the processing module 810 may be further configured to generate a second instruction based on the N data to be cleaned, where the second instruction is used to instruct cleaning of the N data to be cleaned, and the N data to be cleaned includes the first data, and N is a preset value, and the sending module 820 may be further configured to send the second instruction to the metadata server.
Fig. 9 is a schematic block diagram of a metadata server provided in an embodiment of the present application. The metadata server 900 may be the metadata server 112 in fig. 1, or a server having the same function. As shown in fig. 9, the metadata server 900 may include a receiving module 910 and a processing module 920. The receiving module 910 may be configured to receive a first instruction from the recycle bin management server, where the first instruction is used to instruct that first data in a recycle bin of a first tenant be identified as being in an abnormal state, the first data is data that has been stored in the recycle bin for a preset duration but has not been cleaned, and the storage space occupied by data identified as abnormal in the recycle bin is not counted toward the billing space of the first tenant. The processing module 920 may be configured to identify the first data as abnormal based on the first instruction, and to clean the first data identified as abnormal from the recycle bin if the system overhead meets a preset condition.
Optionally, the processing module 920 may be further configured to exclude the occupation amount of the storage space by the data to be cleaned from the occupation amount of the storage space by the file system and the recycle bin of the first tenant, so as to obtain a billing space of the first tenant, where the data to be cleaned includes the first data.
Optionally, the processing module 920 may be further configured to identify the state of the first data as a frozen state before excluding the storage space occupied by the first data from the storage space occupied by the file system and the recycle bin of the first tenant, and to identify the state of the first data as a clean state after the exclusion.
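The billing-space adjustment and the two state transitions can be illustrated with the sketch below. It is one possible reading under simple assumptions: the class `TenantUsage`, its fields, and the string state names are hypothetical, and a real metadata server would persist these values in the KV database rather than in memory.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TenantUsage:
    # Storage occupied by the tenant's file system and recycle bin, in bytes.
    occupied_bytes: int
    # Assumed layout: time index -> state of the data under that index.
    states: Dict[int, str] = field(default_factory=dict)


def exclude_from_billing(usage: TenantUsage, time_index: int, size_bytes: int) -> int:
    """Exclude expired data from the occupancy and return the resulting billing space."""
    # Mark the data as frozen before its size is subtracted.
    usage.states[time_index] = "frozen"
    usage.occupied_bytes -= size_bytes
    # Mark the data as clean once the exclusion has been applied.
    usage.states[time_index] = "clean"
    return usage.occupied_bytes


usage = TenantUsage(occupied_bytes=1000, states={1024: "normal"})
billing_space = exclude_from_billing(usage, 1024, 300)
```

In this example the billing space drops from 1000 to 700 bytes even though the data under time index 1024 has not yet been physically removed.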
Optionally, the processing module 920 may be further configured to clean the N data to be cleaned from the recycle bin based on the second instruction if the overhead meets a preset condition.
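A deferred batch cleanup of this kind might look like the following sketch. The application does not specify how the system overhead is measured or what the preset condition is, so the load fraction `overhead`, the `threshold`, and the `delete` callback are all assumptions made for illustration, as is the choice to wait until N items have accumulated.

```python
from typing import Callable, List


def clean_batch(
    pending: List[int],             # time indexes of data waiting to be cleaned
    n: int,                         # preset batch size N
    overhead: float,                # assumed metric: current load as a fraction
    threshold: float,               # assumed preset condition: overhead <= threshold
    delete: Callable[[int], None],  # hypothetical callback that removes one item
) -> List[int]:
    """Clean N pending items if the overhead condition holds; return what was cleaned."""
    if overhead > threshold or len(pending) < n:
        return []  # defer cleaning; the data stays marked as abnormal
    batch = pending[:n]
    for time_index in batch:
        delete(time_index)
    del pending[:n]  # drop the cleaned items from the pending list
    return batch


pending = [1020, 1021, 1022]
deleted: List[int] = []
skipped = clean_batch(pending, 2, 0.9, 0.5, deleted.append)
cleaned = clean_batch(pending, 2, 0.2, 0.5, deleted.append)
```

Because the data was already excluded from the billing space when it was marked abnormal, deferring the physical deletion in this way does not affect what the tenant is charged.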
Optionally, the receiving module 910 is further configured to receive a query request, where the query request carries a directory structure index of second data and is used to request a query of the second data. The processing module 920 is further configured to: determine a time index of the second data according to the directory structure index of the second data and a third mapping relationship, where the third mapping relationship includes a one-to-one correspondence between at least one directory structure index and at least one time index; determine the state of the second data according to the second mapping relationship; and reject the query request if the state of the second data is abnormal, or search for the second data based on the directory structure index and the time index if the state of the second data is normal.
Fig. 10 is a schematic structural diagram of a data processing apparatus according to the present application. In one embodiment, the data processing apparatus may be a recycle bin management server; in another embodiment, the data processing apparatus may be a metadata server. As shown in fig. 10, the data processing apparatus 1000 may include at least one processor 1010 for implementing the data processing functions of the method provided by the present application. For details, refer to the description in the method embodiments, which is not repeated here.
The data processing apparatus 1000 may also include a memory 1020 for storing program instructions and/or data. Memory 1020 is coupled to processor 1010. The coupling in the present application is an indirect coupling or communication connection between devices, units or modules, which may be in electrical, mechanical or other form for the exchange of information between the devices, units or modules. The processor 1010 may operate in conjunction with the memory 1020. The processor 1010 may execute program instructions stored in the memory 1020. At least one of the at least one memory may be included in the processor.
The data processing apparatus 1000 may also include a communication interface 1030 for communicating with other devices over a transmission medium, so that the data processing apparatus 1000 can communicate with other devices. The communication interface 1030 may be, for example, a transceiver, an interface, a bus, a circuit, or a device capable of implementing a transceiver function. The processor 1010 may use the communication interface 1030 to send and receive data and/or information, and may be used to implement the data processing method in the embodiment corresponding to fig. 5.
The specific connection medium between the processor 1010, the memory 1020, and the communication interface 1030 is not limited in the present application. In fig. 10, the processor 1010, the memory 1020, and the communication interface 1030 are connected via a bus 1040. The bus 1040 is shown in fig. 10 with a bold line; the connections between other components are merely illustrative and are not limiting. The bus may be classified as an address bus, a data bus, a control bus, and so on. For ease of illustration, only one bold line is shown in fig. 10, but this does not mean that there is only one bus or only one type of bus.
In the embodiment of the present application, the processor may be a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps and logic blocks disclosed in the present application. The general purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with the present application may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in a processor for execution.
According to the method provided by the application, the application further provides an electronic device, which includes a processor configured to call a computer program to cause the electronic device to execute the method executed by the recycle bin management server and/or the method executed by the metadata server in the embodiments shown in fig. 5 to 7. Optionally, the electronic device further includes a memory for storing the computer program.
According to the method provided by the present application, the present application further provides a computer readable storage medium storing a computer program, which when executed on a computer, causes the computer to perform the method performed by the recycle bin management server or the method performed by the metadata server in the embodiments shown in fig. 5 to 7.
According to the method provided by the application, the application also provides a computer program product comprising computer program code. The computer program code, when run on a computer, causes the computer to perform the method performed by the recycle bin management server or the method performed by the metadata server in the embodiments shown in fig. 5 to 7.
The technical solution provided by the application can be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions in accordance with the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, a network device, a terminal device, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, or microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a digital video disc (DVD)), a semiconductor medium, or the like.
The foregoing is merely illustrative of specific embodiments of the present application, and the present application is not limited thereto. Any variation or substitution that a person skilled in the art can readily conceive of shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.