CN105677652B

CN105677652B - Data management method and device

Info

Publication number: CN105677652B
Application number: CN201410659318.0A
Authority: CN
Inventors: 李炉阳
Original assignee: Alibaba Group Holding Ltd
Current assignee: Zhejiang Tmall Technology Co Ltd
Priority date: 2014-11-19
Filing date: 2014-11-19
Publication date: 2019-01-04
Anticipated expiration: 2034-11-19
Also published as: CN105677652A

Abstract

The embodiments of the present application disclose a data management method and device. The method includes: obtaining table attribute information of data tables, matching the table attribute information against a first information set to obtain a matching result, and determining a first numerical value for each data table whose matching result is successful; for each data table whose matching result is unsuccessful, determining a first period of the data table according to the table attribute information, and calculating the first numerical value of the data table according to the first period of the data table and the life cycle in the table attribute information of the data table; and determining the data tables to be processed according to the first numerical values of the data tables, and processing the data tables to be processed according to preset rules. The data management method and device provided by the present application can achieve long-term effective data management.

Description

Data management method and device

Technical Field

The present application relates to the field of computer data processing technologies, and in particular, to a data management method and apparatus.

Background

Data has now penetrated every industry and business function and has become an important factor of production. With the advent of the big data era, the data volume and business complexity of every enterprise are growing rapidly, storage requirements keep increasing, and data management is becoming more difficult.

Existing data management methods generally work as follows: when data storage reaches a bottleneck, a new storage machine is added; if a new storage machine cannot be added, the server initiates a data cleanup. The data cleanup process may include: finding the n largest data tables by data volume, for example tables whose data volume exceeds 10 TB; determining whether each such large data table can be deleted and, if so, deleting it; or shortening the life cycle of the large data tables found.

In the process of implementing the present application, the inventor found that the prior art has at least the following problems. Finding the large data tables and confirming whether each can be deleted or have its life cycle shortened takes a great deal of time and labor. Moreover, the existing data management method relieves the shortage of storage space only for a short time; after a while, the storage bottleneck recurs. The conventional data management method therefore cannot manage data efficiently over the long term.

Disclosure of Invention

The embodiment of the application aims to provide a data management method and a data management device so as to realize long-term effective data management.

To solve the foregoing technical problem, embodiments of the present application provide a data management method and apparatus, which are implemented as follows:

a method of data management, comprising:

acquiring table attribute information of data tables, matching the table attribute information against a first information set to obtain a matching result, and determining a first numerical value for each data table whose matching result is successful;

for each data table whose matching result is unsuccessful, determining a first period of the data table according to the table attribute information, and calculating the first numerical value of the data table according to the first period of the data table and the life cycle in the table attribute information of the data table;

and determining a data table to be processed in the data table according to the first numerical value of the data table, and processing the data table to be processed according to a preset rule.

In a preferred embodiment, the data management method further includes: and determining a second numerical value of the first unit according to the first numerical value of the data table and the table attribute information of the data table.

A data management apparatus, comprising: a matching module, a first numerical value calculating module and a processing module; wherein,

the matching module is used for acquiring the table attribute information of data tables, matching the table attribute information against the first information set to obtain a matching result, and determining a first numerical value for each data table whose matching result is successful;

the first numerical value calculation module is used for determining a first period of the data table according to the table attribute information for the data table with unsuccessful matching result in the matching module, and calculating a first numerical value of the data table according to the first period of the data table and the life cycle in the table attribute information of the data table;

the processing module is used for determining a data table to be processed in the data table according to the first numerical value of the data table determined in the matching module and the first numerical value calculating module, and processing the data table to be processed according to a preset rule.

In a preferred embodiment, the data management apparatus further includes: a first unit module; the first unit module is used for determining a second numerical value of the first unit according to the first numerical value of the data table and the table attribute information of the data table determined in the matching module and the first numerical value calculating module.

According to the technical scheme provided by the embodiment of the application, the data management method and the data management device disclosed by the embodiment of the application determine the first numerical value of the data table by analyzing the table attribute information of the data table, the first numerical value can intuitively reflect the data effectiveness of the data table, and long-term effective data management can be realized by timely processing the data table with lower data effectiveness. Further, in a preferred embodiment, a second value of the first unit corresponding to the individual, application or business unit may also be determined, and the second value may visually reflect the data validity of the data table associated with the first unit, which is beneficial for managing the data table associated with the first unit.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only some embodiments described in the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without any creative effort.

FIG. 1 is a flow chart of one embodiment of a data management method of the present application;

FIG. 2 is a flow chart of another embodiment of a data management method of the present application;

FIG. 3 is a block diagram of one embodiment of a data management device of the present application;

FIG. 4 is a block diagram of another embodiment of a data management device according to the present application.

Detailed Description

The embodiment of the application provides a data management method and device.

In order to make those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

FIG. 1 is a flow chart of one embodiment of a data management method of the present application. As shown in FIG. 1, the data management method may include:

S101: Obtain table attribute information of data tables, match the table attribute information against a first information set to obtain a matching result, and determine a first numerical value for each data table whose matching result is successful.

The table attribute information of a data table is obtained. The table attribute information may include at least one of the following: table type information, manual review information, first-time-interval access information, production time information, life cycle information, span information, byte number information, and responsible person information.

The table type information can be used to describe the type of the data table. The type of the table can be a source head table, a partition table, a non-partition table, and the like. A source head table may be a table that ultimately needs to be retained in a link. Typically, a link contains a source head table.

The manual review information may be used to describe whether the data table has been manually reviewed. If the data table has been manually reviewed, the data table may need to be retained.

The first time interval access information may be used to describe whether the data table was accessed in a first time interval. The data table may be retained if the data table was accessed within a first time interval. The first time interval may be predetermined, for example, may be set to one month.

The production time information may be used to describe how much time has elapsed since the data table was produced.

The life cycle information may be used to describe the maximum time for which data is stored in the data table. For example, if the life cycle of a data table is 7 days and data "XXX" is stored in the data table, the data "XXX" is automatically deleted from the data table on the 8th day after it is stored.

The span information may be used to describe accesses to the data table after the data table is produced, and may include a span value and time information corresponding to the span value.

The limit storage information may record whether the data table is subjected to limit storage. The limit storage means that the records repeatedly stored in the data table are reduced, so that the data stored in the table cannot cause the waste of resource space.

The byte number information may be used to indicate the size of the data amount in the data table.

The responsible person information may be used to represent the responsible person information of the data table, and may include, for example, the name, contact address, or affiliated department of the responsible person.

The first information set includes: the table type is a source head table; the table has been manually reviewed; the table was accessed within the first time interval; the table type is a non-partition table and the table was accessed within a second time interval; and the production time is less than a first preset time period.

The second time interval may have the same value as the first time interval. The first preset time period may be preset. The first preset time period may be generally less than 20 days, and may be set to 10 days, for example.

Matching the table attribute information with the first information set may specifically include: computing the intersection of the table attribute information and the first information set. If the intersection is not an empty set, part or all of the table attribute information of the data table is the same as information in the first information set, and the matching result may be determined to be successful.

If the matching result of the table attribute information of the data table and the first information set is successful, a first numerical value of the data table may be determined, including: setting a first value of the data table equal to a maximum value within a first preset range.

The first preset range may be a value range of the first numerical value. The first predetermined range may be predetermined, for example, the first predetermined range may be 0 to 100 or 0 to 1. Assuming that the first preset range is 0-100, when the matching result of the table attribute information of the data table and the first information set is successful, it may be determined that the first value of the data table is 100.

A data table that matches any part of the information in the first information set can be regarded as a data table whose data needs to be retained. Matching the table attribute information against the first information set thus screens out the data tables with stronger data validity.
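The matching in S101 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the `TableAttributes` fields, the condition labels and the function names are hypothetical, and the 10-day threshold and the 0 to 100 range are taken from the examples in the text.

```python
from typing import NamedTuple, Optional, Set

FIRST_VALUE_MAX = 100  # maximum of the first preset range (0-100 in the example)


class TableAttributes(NamedTuple):
    """Hypothetical container for the attributes used by the first information set."""
    table_type: str                    # "source_head", "partition" or "non_partition"
    manually_reviewed: bool
    accessed_in_first_interval: bool
    accessed_in_second_interval: bool
    production_days: int               # days elapsed since the table was produced


def matched_conditions(attrs: TableAttributes, first_preset_days: int = 10) -> Set[str]:
    """Return the entries of the first information set that the table satisfies."""
    hits: Set[str] = set()
    if attrs.table_type == "source_head":
        hits.add("source_head_table")
    if attrs.manually_reviewed:
        hits.add("manually_reviewed")
    if attrs.accessed_in_first_interval:
        hits.add("accessed_in_first_interval")
    if attrs.table_type == "non_partition" and attrs.accessed_in_second_interval:
        hits.add("non_partition_accessed_in_second_interval")
    if attrs.production_days < first_preset_days:
        hits.add("recently_produced")
    return hits


def first_value_if_matched(attrs: TableAttributes) -> Optional[int]:
    """A non-empty intersection means success: return the range maximum.

    None signals an unsuccessful match, to be scored in S102.
    """
    return FIRST_VALUE_MAX if matched_conditions(attrs) else None
```

Returning `None` for an unsuccessful match keeps the two scoring paths separate, so that only unmatched tables proceed to the period-based calculation.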

S102: For each data table whose matching result is unsuccessful, determine a first period of the data table according to the table attribute information, and calculate the first numerical value of the data table according to the first period of the data table and the life cycle in the table attribute information of the data table.

For a data table with an unsuccessful matching result, a first period of the data table may be determined according to the table attribute information. The first period may be used to represent a reasonable storage period for the data in the data table. Specifically, the first period of the data table is determined according to the span information in the table attribute information of the data table.

The span information may be used to indicate the time interval from the generation time of the data table to a time at which it was accessed. The span information may include a span value and time information corresponding to the span value. The span information of a data table may include one or more span values, each with its corresponding time information. For example, if a data table is generated on October 1, 2014 and accessed on October 10, 2014 and on October 20, 2014, the data table has a span value of 10 days on October 10, 2014 and a span value of 20 days on October 20, 2014.

Determining the first period of the data table from the span information comprises: obtaining the maximum span value among those whose time information falls within a second time interval before the current time, determining the interval range to which the maximum span value belongs, and determining the first period of the data table according to a correspondence between interval ranges and first periods.

The value of the second time interval may be preset, for example the second time interval may be 90 days. The larger the maximum span value of the data table in the second time interval is, the longer the corresponding first period of the data table may be.

The correspondence relationship between the interval range and the first period may be set in advance. For example, the correspondence relationship between the interval range and the first period may be as shown in table 1.

TABLE 1

Range of interval	First period
0 to 4 days	7 days
5 to 12 days	15 days
13 to 30 days	33 days
31 to 90 days	93 days
91 to 180 days	183 days
181 to 365 days	368 days
366 to 730 days	1095 days
More than 730 days	First period + 365 days
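The lookup described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the function names and the representation of span information as (span value, age of the access in days) pairs are hypothetical. Each range is treated as ending at its listed upper bound. The handling of the "more than 730 days" row is an assumption, because the table does not say which period the extra 365 days is added to; here it is read as the last explicit period plus 365 days.

```python
from typing import Iterable, Optional, Tuple

# (upper bound of the interval range in days, first period in days), per Table 1.
PERIOD_BY_SPAN = [(4, 7), (12, 15), (30, 33), (90, 93), (180, 183), (365, 368), (730, 1095)]


def max_recent_span(spans: Iterable[Tuple[int, int]], window_days: int = 90) -> Optional[int]:
    """Pick the largest span value whose access falls inside the second time interval.

    spans holds (span value in days, days between that access and now).
    Returns None when no access falls inside the window.
    """
    recent = [span for span, age in spans if age <= window_days]
    return max(recent) if recent else None


def first_period(max_span: int) -> int:
    """Map the maximum span value to a first period using Table 1."""
    for upper, period in PERIOD_BY_SPAN:
        if max_span <= upper:
            return period
    # ASSUMPTION: "more than 730 days" is read as the last explicit period + 365 days.
    return PERIOD_BY_SPAN[-1][1] + 365
```

For instance, a table whose only recent accesses have span values of 10 and 20 days has a maximum span of 20 days, which falls in the 13 to 30 day range and maps to a first period of 33 days.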

A first value of the data table may be calculated based on the first period of the data table and the life cycle of the data table.

The life cycle of the data table may be obtained from the table attribute information of the data table. Calculating the first numerical value according to the first period and the life cycle of the data table may specifically include: calculating the difference between the life cycle and the first period, and subtracting that difference from the maximum value of the first preset range to obtain a first candidate result; then comparing the first candidate result with a first preset value. If the first candidate result is smaller than the first preset value, the first numerical value of the data table is set equal to the first preset value; if the first candidate result is greater than or equal to the first preset value, the first numerical value of the data table is set equal to the first candidate result. The first preset value lies within the first preset range; for example, if the first preset range is 0 to 100, the first preset value may be 20.

In the step, a first numerical value of the data table is obtained by analyzing and processing the table attribute information of the data table, the first numerical value can reflect the data validity in the data table, and the larger the first numerical value is, the stronger the data validity of the corresponding data table is.
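The calculation of the first numerical value in S102 can be sketched as follows. This is a minimal illustration: the function and parameter names are hypothetical, and the defaults (range maximum 100, first preset value 20) come from the examples in the text. The text does not describe an upper clamp, so none is applied here.

```python
def first_value(life_cycle_days: int, first_period_days: int,
                range_max: int = 100, first_preset_value: int = 20) -> int:
    """First value = range maximum minus (life cycle - first period), floored at the preset value."""
    candidate = range_max - (life_cycle_days - first_period_days)
    # The first preset value acts as a lower bound on the result.
    return candidate if candidate >= first_preset_value else first_preset_value
```

With a life cycle of 30 days and a first period of 7 days, the difference is 23 and the first value is 100 - 23 = 77. With a life cycle of 365 days and the same first period, the candidate result is negative, so the first value is set to the first preset value, 20.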

S103: and determining a data table to be processed in the data table according to the first numerical value of the data table, and processing the data table to be processed according to a preset rule.

The data table to be processed in the data table can be determined according to the first numerical value of the data table, and specifically, the data table to be processed can be determined by adopting any one or a combination of several methods:

sorting the data tables by first numerical value, and selecting the m data tables with the smallest first numerical values as data tables to be processed; wherein m is a positive integer, and m is less than or equal to the total number of the data tables;

sorting the data tables by first numerical value, and selecting the p% of data tables with the smallest first numerical values as data tables to be processed; wherein the value of p is between 0 and 100;

comparing a first numerical value of a data table with a preset reference value, wherein the data table with the first numerical value smaller than the preset reference value can be used as a data table to be processed; wherein the preset reference value is greater than the first preset value.
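The three selection methods above can be sketched as follows. This is a minimal illustration: the function names and the mapping from table name to first value are hypothetical, and rounding the p% count up to a whole table is an assumption, since the text does not specify how a fractional count is handled.

```python
import math
from typing import Dict, List


def smallest_m(first_values: Dict[str, float], m: int) -> List[str]:
    """Select the m tables with the smallest first values."""
    ranked = sorted(first_values, key=lambda name: first_values[name])
    return ranked[:m]


def smallest_percent(first_values: Dict[str, float], p: float) -> List[str]:
    """Select the p% of tables with the smallest first values."""
    ranked = sorted(first_values, key=lambda name: first_values[name])
    count = math.ceil(len(ranked) * p / 100)  # ASSUMPTION: round up to a whole table
    return ranked[:count]


def below_reference(first_values: Dict[str, float], reference: float) -> List[str]:
    """Select every table whose first value is below the preset reference value."""
    return [name for name, value in first_values.items() if value < reference]
```

The methods can be combined, for example by intersecting the result of `below_reference` with the result of `smallest_m`, which matches the statement that any one method or a combination of several may be used.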

Processing the to-be-processed data table according to the preset rule may include: deleting the data table to be processed; or deleting part of data in the data table to be processed; or changing the life cycle of the data table to be processed.

Changing the life cycle of the data table to be processed may mean setting the life cycle of the data table to be processed to the first period.

Because the larger the first numerical value is, the stronger the data validity of the corresponding data table is, the processing of the data table with the smaller first numerical value can effectively reduce the waste of storage resources caused by the data table with the lower data validity.

In this embodiment of the data management method, the first numerical value of a data table is determined by analyzing the table attribute information of the data table, and the first numerical value intuitively reflects the data validity of the data table. By processing data tables with lower data validity in time, the method can effectively handle not only large data tables but also long-tail tables, which are numerous but individually small in data volume, thereby achieving long-term effective data management.

FIG. 2 is a flow chart of another embodiment of the data management method of the present application. As shown in FIG. 2, the present embodiment differs from the first embodiment of the data management method in that the method may further include:

S104: Determine a second numerical value of the first unit according to the first numerical value of the data table and the table attribute information of the data table.

The first unit may include: a person, an application, or a business unit. The second numerical value may be used to describe the data validity of all data tables associated with the first unit. A higher second numerical value may indicate higher data validity of the data tables associated with the first unit.

When the first unit is an individual, the method for determining the second numerical value may include: computing the weighted sum of the first numerical values of the data tables associated with the first unit, averaging it, and multiplying the average by the management coverage rate of the data tables; the result is the second numerical value.

Wherein,

the weight of a data table can be calculated as follows: add 1 to the number of bytes of the data table and take the cube root of the sum.

The management coverage rate of the data tables can be calculated as follows: divide the first data amount by the total data amount. The total data amount may be the data amount of the data tables associated with the first unit.

The first data amount may be the sum of the data amounts of the data tables that conform to a first rule. The first rule may be that the table attribute information of the data table satisfies at least one of the following: it includes life cycle information; the table type is a source head table; the table has been manually reviewed; the table has undergone limit storage.
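The second numerical value for an individual can be sketched as follows. This is a minimal illustration, not the implementation of the application: the `Table` fields and function names are hypothetical, the byte count is used as the data amount, and the "weighted sum, then average" step is read as a weighted average (weighted sum divided by the sum of the weights), which is an assumption.

```python
from typing import List, NamedTuple


class Table(NamedTuple):
    """Hypothetical record of the attributes needed for an individual's second value."""
    first_value: float
    num_bytes: int
    has_life_cycle: bool
    is_source_head: bool
    manually_reviewed: bool
    limit_stored: bool


def weight(num_bytes: int) -> float:
    """Weight of a table: cube root of (number of bytes + 1)."""
    return (num_bytes + 1) ** (1 / 3)


def second_value_individual(tables: List[Table]) -> float:
    """Weighted average of first values multiplied by the management coverage rate."""
    total_bytes = sum(t.num_bytes for t in tables)
    if not tables or total_bytes == 0:
        return 0.0  # ASSUMPTION: nothing to score when there is no data
    total_weight = sum(weight(t.num_bytes) for t in tables)
    # ASSUMPTION: "averaging" means dividing the weighted sum by the sum of the weights.
    weighted_avg = sum(weight(t.num_bytes) * t.first_value for t in tables) / total_weight
    # First rule: at least one of the four conditions holds.
    covered_bytes = sum(
        t.num_bytes for t in tables
        if t.has_life_cycle or t.is_source_head or t.manually_reviewed or t.limit_stored
    )
    return weighted_avg * (covered_bytes / total_bytes)
```

The cube root dampens the influence of very large tables, so a single huge table does not completely dominate the weighted average.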

When the first unit is an application or a business unit, the method for determining the second numerical value may include: computing the weighted sum of the first numerical values of the data tables associated with the first unit, averaging it, multiplying the average by the management coverage rate of the data tables, and then multiplying the product by the responsible-person integrity rate; the result is the second numerical value.

Wherein,

The management coverage rate of the data tables can be calculated as follows: divide the second data amount by the total data amount. The total data amount may be the data amount of the data tables associated with the first unit.

The second data amount may be the sum of the data amounts of the data tables that conform to a second rule. The second rule may be that the table attribute information of the data table satisfies at least one of the following: it includes life cycle information; the table has been manually reviewed; the table has undergone limit storage.

The responsible-person integrity rate can be calculated as follows: divide the third data amount by the total data amount. The total data amount may be the data amount of the data tables associated with the first unit.

The third data amount may be the sum of the data amounts of those data tables associated with the first unit whose table attribute information contains responsible person information.
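The second numerical value for an application or business unit can be sketched as follows. This is a minimal illustration: the `UnitTable` fields and function names are hypothetical, the byte count is used as the data amount, and the averaging step is again read as a weighted average, which is an assumption.

```python
from typing import Callable, List, NamedTuple, Optional


class UnitTable(NamedTuple):
    """Hypothetical record of the attributes needed for a unit's second value."""
    first_value: float
    num_bytes: int
    has_life_cycle: bool
    manually_reviewed: bool
    limit_stored: bool
    responsible_person: Optional[str]  # None when no responsible person is recorded


def byte_share(tables: List[UnitTable], predicate: Callable[[UnitTable], bool]) -> float:
    """Fraction of the total data amount held by tables that satisfy the predicate."""
    total = sum(t.num_bytes for t in tables)
    if total == 0:
        return 0.0
    return sum(t.num_bytes for t in tables if predicate(t)) / total


def second_value_unit(tables: List[UnitTable]) -> float:
    """Weighted average of first values x management coverage x responsible-person integrity."""
    if not tables:
        return 0.0
    weights = [(t.num_bytes + 1) ** (1 / 3) for t in tables]
    # ASSUMPTION: "averaging" means dividing the weighted sum by the sum of the weights.
    weighted_avg = sum(w * t.first_value for w, t in zip(weights, tables)) / sum(weights)
    # Second rule: life cycle information, manual review, or limit storage.
    coverage = byte_share(
        tables, lambda t: t.has_life_cycle or t.manually_reviewed or t.limit_stored
    )
    integrity = byte_share(tables, lambda t: t.responsible_person is not None)
    return weighted_avg * coverage * integrity
```

Compared with the individual case, the coverage rule omits the source head table condition and the result is further scaled by the share of data that has a recorded responsible person.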

Other parts of this embodiment are the same as those of the first embodiment of the data management method of the present application, and are not described herein again.

S104 may be executed before S103 or after S103, which is not limited in the present application.

According to the embodiment of the data management method disclosed by the application, on the basis of the first embodiment of the method, a second numerical value of the first unit corresponding to the individual, application or business unit can be determined, the second numerical value can intuitively reflect the data validity of the data table associated with the first unit, and the management of the data table associated with the first unit is facilitated.

The following describes embodiments of the data management apparatus of the present application.

FIG. 3 is a block diagram of an embodiment of a data management device of the present application. As shown in fig. 3, the data management apparatus may include: a matching module 301, a first numerical value calculation module 302 and a processing module 303. Wherein,

the matching module 301 may be configured to obtain table attribute information of data tables, match the table attribute information against a first information set to obtain a matching result, and determine a first numerical value for each data table whose matching result is successful;

the first numerical value calculating module 302 may be configured to, for each data table whose matching result in the matching module 301 is unsuccessful, determine a first period of the data table according to the table attribute information, and calculate the first numerical value of the data table according to the first period of the data table and the life cycle in the table attribute information of the data table;

the processing module 303 may be configured to determine a to-be-processed data table in the data table according to the first numerical value of the data table determined in the matching module 301 and the first numerical value calculating module 302, and process the to-be-processed data table according to a preset rule.

FIG. 4 is a block diagram of another embodiment of the data management device of the present application. As shown in fig. 4, the present embodiment is different from the first embodiment of the data management apparatus of the present application in that the data management apparatus may further include: a first unit module 304.

The first unit module 304 may be configured to determine a second value of the first unit according to the first value of the data table and the table attribute information of the data table determined in the matching module 301 and the first value calculating module 302.

Other parts of this embodiment are the same as those of the first embodiment of the data management device of the present application, and are not described herein again.

The data management device disclosed in the above embodiment corresponds to the data management method embodiment of the present application, and can achieve the technical effects of the data management method embodiment of the present application.

In the 1990s, an improvement to a technology could be clearly classified as either a hardware improvement (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or a software improvement (an improvement to a method flow). As technology has advanced, however, many of today's method flow improvements can be regarded as direct improvements to hardware circuit structure. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be realized by a hardware module. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. A designer "integrates" a digital system onto a single PLD by programming it, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Furthermore, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compilers used in program development; the source code to be compiled must likewise be written in a specific programming language, called a Hardware Description Language (HDL). There is not one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present. It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can readily be obtained merely by lightly programming the method flow into an integrated circuit using one of the hardware description languages described above.

The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory.

Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer readable program code, the same functionality can be implemented by logically programming method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may thus be considered a hardware component, and the means included therein for performing the various functions may also be considered as a structure within the hardware component. Or even means for performing the functions may be regarded as being both a software module for performing the method and a structure within a hardware component.

The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions.

For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.

From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus a necessary general hardware platform. With this understanding in mind, the present solution, or the portions thereof that contribute to the prior art, may be embodied in the form of a software product. In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory. The computer software product may include instructions for causing a computing device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the various embodiments or portions of embodiments of the present application. The computer software product may be stored in a memory, which may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium. Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

The embodiments in this specification are described in a progressive manner: identical or similar parts of the embodiments may be cross-referenced, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, its description is brief; for relevant details, refer to the corresponding parts of the description of the method embodiment.

The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.

The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

While the present application has been described by way of embodiments, those of ordinary skill in the art will appreciate that many variations and modifications are possible without departing from its spirit, and it is intended that the appended claims cover such variations and modifications.

Claims

1. A method for managing data, comprising:

2. The data management method of claim 1, wherein matching the table attribute information with the first information set comprises: computing the intersection of the table attribute information and the first information set.

3. The data management method of claim 2, wherein the matching result being successful comprises: the intersection of the table attribute information and the first information set is not an empty set.

4. The data management method of claim 3, wherein determining the first numerical value of a data table whose matching result is successful comprises: setting the first numerical value of the data table equal to the maximum value of a first preset range; wherein the first preset range is the value range of the first numerical value.

5. The data management method of claim 1, wherein the first information set comprises: the type of the table is a source table; the table has been manually audited; the table is accessed in a first time interval; the type of the table is a non-partition table and the table is accessed in a second time interval; and the output time is less than a first preset time length.

6. The data management method of claim 1, wherein the table attribute information includes at least one of: table type information, manual audit information, access information in the first time interval, output time information, life cycle information, span information, byte number information, and responsible-person information.

7. The data management method of claim 6, wherein the span information of the data table includes: one or more span values and time information corresponding to the span values.

8. The data management method of claim 7, wherein determining the first period of the data table according to the table attribute information comprises: acquiring the maximum span value whose time information falls within a second time interval before the current time, determining the interval range to which the maximum span value belongs, and determining the first period of the data table according to a correspondence between interval ranges and first periods.

9. The data management method of claim 8, wherein the larger the maximum span value of the data table in the second time interval, the longer the corresponding first period of the data table.
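For illustration only, claims 7 to 9 can be sketched as a lookup from the largest recent span value to a period. The interval bounds in `SPAN_TO_PERIOD`, the 30-day window, the use of days as the unit, and the `None` result for a table with no recent span are all assumptions of this sketch and are not stated in the source.

```python
from datetime import datetime, timedelta

# Hypothetical correspondence: (upper bound of max span in days, first period in days).
# Larger spans map to longer periods, as claim 9 requires.
SPAN_TO_PERIOD = [(7, 30), (30, 90), (90, 180)]
DEFAULT_PERIOD = 365  # assumed period for spans above the last bound


def first_period(spans, now, window_days=30):
    """spans is a list of (span_value_days, recorded_at) pairs."""
    window_start = now - timedelta(days=window_days)
    recent = [value for value, recorded_at in spans
              if window_start <= recorded_at <= now]
    if not recent:
        return None  # the source does not say what happens in this case
    max_span = max(recent)
    for upper_bound, period in SPAN_TO_PERIOD:
        if max_span <= upper_bound:
            return period
    return DEFAULT_PERIOD
```

Only span values recorded inside the window count, so an old, very large span does not lengthen the period.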

10. The data management method of claim 1, wherein calculating the first numerical value of the data table according to the first period of the data table and the life cycle in the table attribute information of the data table comprises:

calculating the difference between the life cycle and the first period, and subtracting the difference from the maximum value of a first preset range to obtain a first candidate result;

determining whether the first candidate result is greater than or equal to a first preset value; if the first candidate result is smaller than the first preset value, setting the first numerical value of the data table equal to the first preset value; and if the first candidate result is greater than or equal to the first preset value, setting the first numerical value of the data table equal to the first candidate result;

the first preset range is a value range of the first numerical value, and the first preset value belongs to the first preset range.
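For illustration only, the calculation in claim 10 can be sketched as a subtraction followed by a lower clamp. The constants `SCORE_MAX = 100` and `SCORE_FLOOR = 0`, and the use of days as the unit, are assumptions; the source gives no concrete values.

```python
SCORE_MAX = 100   # assumed maximum of the first preset range
SCORE_FLOOR = 0   # assumed first preset value, inside that range


def first_value(life_cycle_days, first_period_days):
    """Score a table whose match against the first information set failed."""
    difference = life_cycle_days - first_period_days
    candidate = SCORE_MAX - difference
    # Clamp from below only: the claim names a floor, not a ceiling.
    return max(candidate, SCORE_FLOOR)
```

A table whose life cycle exceeds its first period by 30 days would score 70 under these assumed constants. The claim does not say how a life cycle shorter than the first period is handled, so the sketch leaves that case uncapped.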

11. The data management method of claim 10, wherein determining the data table to be processed among the data tables according to the first numerical value of the data table uses any one or a combination of the following:

comparing the first numerical value of a data table with a preset reference value, and taking any data table whose first numerical value is smaller than the preset reference value as a data table to be processed; wherein the preset reference value is greater than the first preset value.

12. The data management method of claim 1, wherein processing the to-be-processed data table according to a preset rule comprises:

deleting the data table to be processed; or,

deleting part of data in the data table to be processed; or,

changing the life cycle of the data table to be processed.

13. The data management method of claim 12, wherein changing the life cycle of the data table to be processed comprises: changing the life cycle of the data table to be processed to the first period.
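For illustration only, the selection of claim 11 and the life cycle change of claim 13 can be sketched together. The record type `TableRecord`, its field names, and the reference value of 60 are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class TableRecord:  # hypothetical record type for this sketch
    name: str
    first_value: float
    life_cycle_days: int
    first_period_days: int


REFERENCE_VALUE = 60  # assumed preset reference value, above the floor


def shorten_life_cycles(tables, reference=REFERENCE_VALUE):
    """Set the life cycle of each low-scoring table to its first period."""
    changed = []
    for table in tables:
        if table.first_value < reference:
            table.life_cycle_days = table.first_period_days
            changed.append(table.name)
    return changed
```

Deleting the table or part of its data, the other two options of claim 12, would replace the assignment inside the loop.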

14. A data management method according to claim 1, further comprising:

determining a second numerical value of a first unit according to the first numerical value of the data table and the table attribute information of the data table.

15. A data management method according to claim 14, wherein said first unit comprises: a person, an application, or a business unit.

16. The data management method of claim 15, wherein, when the first unit is a person, determining the second numerical value of the first unit comprises:

computing a weighted sum of the first numerical values of the data tables associated with the first unit, taking the average, and multiplying the average by the data table management coverage rate, the result being the second numerical value.

17. The data management method of claim 16, wherein the data table management coverage rate is calculated by dividing a first data amount by a total data amount;

wherein the total data amount is the data amount of the data tables associated with the first unit;

and the first data amount is the sum of the data amounts of the data tables that conform to a first rule.

18. A data management method according to claim 17, wherein the first rule comprises:

the table attribute information of the data table satisfies at least one of: containing life cycle information, the table type being a source table, having been manually audited, and having undergone limit storage.
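For illustration only, the coverage rate of claims 17 and 18 can be sketched as a ratio of data amounts. The dictionary keys (`bytes`, `life_cycle_days`, `table_type`, `manually_audited`, `limit_storage`) and the zero result for an empty set of tables are assumptions of this sketch.

```python
def conforms_to_first_rule(table):
    """True when at least one condition of the first rule holds."""
    return any((
        table.get("life_cycle_days") is not None,
        table.get("table_type") == "source",
        table.get("manually_audited", False),
        table.get("limit_storage", False),
    ))


def management_coverage(tables, conforms=conforms_to_first_rule):
    """Data amount of conforming tables divided by the total data amount."""
    total = sum(t["bytes"] for t in tables)
    if total == 0:
        return 0.0  # assumed; the source does not cover this case
    covered = sum(t["bytes"] for t in tables if conforms(t))
    return covered / total
```

Because the ratio is taken over data amounts rather than table counts, one large unmanaged table lowers the coverage more than many small ones.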

19. The data management method of claim 15, wherein, when the first unit is an application or a business unit, determining the second numerical value of the first unit comprises:

computing a weighted sum of the first numerical values of the data tables associated with the first unit, taking the average, multiplying the average by the data table management coverage rate, and multiplying the product by the responsible-person integrity rate, the result being the second numerical value.

20. The data management method of claim 16 or 19, wherein the weight of a data table is calculated by: adding 1 to the number of bytes of the data table and taking the cube root of the sum.
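For illustration only, the second numerical value of claims 16, 19, and 20 can be sketched as a weighted mean scaled by the two rates. The claims do not state whether the average divides by the sum of the weights or by the number of tables; this sketch assumes the sum of the weights. The dictionary keys and the zero result for an empty list are also assumptions.

```python
def table_weight(num_bytes):
    """Weight from claim 20: cube root of (number of bytes + 1)."""
    return (num_bytes + 1) ** (1.0 / 3.0)


def second_value(tables, coverage, integrity=1.0):
    """Weighted mean of first values, scaled by coverage and integrity.

    Leave integrity at 1.0 for a person (claim 16); pass the
    responsible-person integrity rate for an application or
    business unit (claim 19).
    """
    if not tables:
        return 0.0  # assumed; the source does not cover this case
    weights = [table_weight(t["bytes"]) for t in tables]
    weighted_sum = sum(w * t["first_value"] for w, t in zip(weights, tables))
    weighted_mean = weighted_sum / sum(weights)  # assumed normalization
    return weighted_mean * coverage * integrity
```

The cube root compresses the effect of table size, so a table a thousand times larger carries only about ten times the weight.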

21. The data management method of claim 19, wherein the data table management coverage rate is calculated by dividing a second data amount by the total data amount;

wherein the second data amount is the sum of the data amounts of the data tables that conform to a second rule.

22. A method of data management according to claim 21, wherein the second rule comprises:

the table attribute information of the data table satisfies at least one of: containing life cycle information, having been manually audited, and having undergone limit storage.

23. The data management method of claim 19, wherein the responsible-person integrity rate is calculated by dividing a third data amount by the total data amount;

wherein the third data amount is the sum of the data amounts of those data tables associated with the first unit whose table attribute information contains responsible-person information.
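For illustration only, the responsible-person integrity rate of claim 23 can be sketched as the share of data held in tables that name a responsible person. The dictionary keys `bytes` and `owner`, and the zero result for an empty set of tables, are assumptions of this sketch.

```python
def owner_integrity_rate(tables):
    """Data amount of tables with a responsible person over the total."""
    total = sum(t["bytes"] for t in tables)
    if total == 0:
        return 0.0  # assumed; the source does not cover this case
    with_owner = sum(t["bytes"] for t in tables if t.get("owner"))
    return with_owner / total
```

An empty or missing `owner` value counts as having no responsible-person information.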

24. A data management apparatus, comprising: a matching module, a first numerical value calculating module, and a processing module; wherein,

25. A data management apparatus according to claim 24, further comprising: a first unit module;

the first unit module is configured to determine a second numerical value of a first unit according to the table attribute information of the data table and the first numerical value of the data table determined by the matching module and the first numerical value calculating module.