CN118445330B - A table dimension statistical caliber calculation method and system - Google Patents
A table dimension statistical caliber calculation method and system
- Publication number: CN118445330B (CN202410526203.8A)
- Authority: CN (China)
- Prior art keywords: data, tables, dimension, score, caliber
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2462—Approximate or statistical queries
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of data management and provides a method and a system for calculating the dimension statistical calibers of a table. The method comprises: periodically collecting basic statistical information of the table, and obtaining, according to the collected basic statistical information, lineage (blood-edge) relationship information of the table and information on the number of task executions in which the table participates; obtaining basic statistical caliber data and constructing a first data attribute table from it; calculating pre-preparation data for the table dimension statistical calibers from the attribute data in the first data attribute table, and constructing a second data attribute table from the calculated pre-preparation data; cyclically traversing the first data attribute table and the second data attribute table and calculating, for each table, a score for each of its dimension statistical calibers; and adding a label to each dimension statistical caliber of the table, storing the labeled table dimension statistical calibers in a database, and displaying them on a page. The invention allows the value of a table to be analyzed from each caliber dimension, which helps improve the efficiency and accuracy of data management.
Description
Technical Field
The invention relates to the technical field of data management, and in particular to a method and a system for calculating table dimension statistical calibers.
Background
As data applications continue to deepen, large traditional organizations are undergoing digital transformation, in which data processing such as data extraction, governance, and management plays a vital role. Data management generates a large number of data tables. Calculating, scoring, and labeling the dimension statistical calibers of these tables, such as access heat, growth, dependency concentration, benefit cost, influence, attention, and data timeliness, is of great help to data management and to evaluating its effect.
At present, no calculation rule for table dimension statistical calibers exists, and providing an efficient, accurate, and widely applicable method for calculating them has become a technical problem to be solved urgently.
Disclosure of Invention
In view of the above, the present invention aims to overcome the shortcomings of the prior art, and to provide a method and a system for calculating a statistical caliber of a table dimension.
According to a first aspect of the present invention, there is provided a method for calculating table dimension statistical calibers, the method comprising:
Periodically collecting basic statistical information of the table, and obtaining, according to the collected basic statistical information, lineage (blood-edge) relationship information of the table and information on the number of task executions in which the table participates;
Obtaining basic statistical caliber data of the table according to the basic statistical information, the lineage relationship information, and the task execution count information of the table, and constructing a first data attribute table from the obtained basic statistical caliber data;
Calculating pre-preparation data for the table dimension statistical calibers according to the attribute data in the first data attribute table, and constructing a second data attribute table from the calculated pre-preparation data;
Cyclically traversing the first data attribute table and the second data attribute table, and calculating, for each table, the score of each dimension statistical caliber of the table;
Adding a label to each dimension statistical caliber of the table, storing the labeled table dimension statistical calibers in a database, and displaying them on a page.
Preferably, in the table dimension statistical caliber calculation method of the present invention, the basic statistical information of the table comprises the storage size occupied by the table, the number of data records in the table, the last update time of the table data, the number of times the table is browsed in the system, and the number of times the table is collected (bookmarked).
Preferably, in the table dimension statistical caliber calculation method of the present invention, the attributes of the first data attribute table include the last-week collection count, the calculation engine Id, the data partition Id, the table Id, the total table storage, the total number of table data records, the last update time of the table data, the last-week storage increment, the last-week data record increment, the last-week execution count, the number of upstream directly connected lineage tables, the number of downstream lineage tables, and the last-week table browsing record count.
Preferably, in the table dimension statistical caliber calculation method of the present invention, the pre-preparation data of the table dimension statistical calibers comprises, over all tables, the maximum and the minimum of each of the following: the number of task executions of a table in the last week, the last-week storage increment, the storage increment proportion, the last-week data record increment, the data record increment proportion, the dependency concentration, the storage occupancy, the number of downstream lineage tables, the last-week browsing count, and the last-week collection count.
Preferably, in the table dimension statistical caliber calculation method of the present invention, cyclically traversing the first data attribute table and the second data attribute table and calculating, for each table, the score of each dimension statistical caliber comprises:
Calculating the access heat score of the table according to the maximum number of last-week task executions over all tables, the minimum number of last-week task executions over all tables, and the last-week execution count of the table;
Calculating the dependency concentration score of the table according to the maximum dependency concentration, the minimum dependency concentration, the number of upstream directly connected lineage tables, and the number of downstream directly connected lineage tables;
Calculating a table storage score according to the total storage of the table, the maximum storage occupancy of all tables, and the minimum storage occupancy of all tables, and calculating a benefit cost score of the table according to the access heat score and the table storage score;
Calculating an influence score of the table according to the number of downstream lineage tables of the table, the maximum number of downstream lineage tables over all tables, and the minimum number of downstream lineage tables over all tables;
Calculating the data timeliness score of the table according to the last update time of the table data and the current time.
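The claims above name the inputs of each score (a table's own value plus the maximum and minimum across all tables), but this passage does not state the formula. As a minimal sketch, assuming min-max normalization onto a 0 to 100 scale, the access heat and influence scores could be computed as follows. The function name, the scale, and the sample figures are illustrative assumptions and are not taken from the patent.

```python
def min_max_score(value: float, lo: float, hi: float, scale: float = 100.0) -> float:
    """Map value from [lo, hi] onto [0, scale] (assumed formula, not from the patent)."""
    if hi <= lo:
        # All tables share the same value, so there is no spread to normalize.
        return 0.0
    clamped = min(max(value, lo), hi)
    return (clamped - lo) / (hi - lo) * scale


# Hypothetical figures: last-week task executions across all tables range from 2 to 52,
# and the number of downstream lineage tables ranges from 0 to 12.
access_heat = min_max_score(27, lo=2, hi=52)
influence = min_max_score(3, lo=0, hi=12)
print(access_heat, influence)
```

The guard for `hi <= lo` is a design choice of this sketch: it avoids a division by zero when every table has the same value.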
Preferably, in the table dimension statistical caliber calculation method of the present invention, cyclically traversing the first data attribute table and the second data attribute table and calculating, for each table, the score of each dimension statistical caliber comprises:
Calculating a storage growth score of the table according to the last-week storage increment of the table, the maximum last-week storage increment of all tables, and the minimum last-week storage increment of all tables;
Calculating a storage growth proportion score of the table according to the last-week storage increment proportion of the table, the maximum storage increment proportion, and the minimum storage increment proportion;
Calculating the data score of the table according to the last-week data record increment of the table, the maximum last-week data record increment of all tables, and the minimum last-week data record increment of all tables;
Calculating the data proportion score of the table according to the total number of data records of the table in the last week, the maximum data record increment proportion, and the minimum data record increment proportion;
Obtaining a weighted sum of the storage growth score, the storage growth proportion score, the data score, and the data proportion score of the table, and taking the weighted sum as the growth score of the table.
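The final growth score is a weighted sum of four sub-scores. The weights are not given in this passage, so the sketch below assumes equal weights of 0.25; the dictionary keys and the sub-score values are hypothetical names and numbers chosen for illustration.

```python
# Assumed equal weights; the patent text here does not specify them.
GROWTH_WEIGHTS = {
    "storage_growth": 0.25,
    "storage_growth_proportion": 0.25,
    "data": 0.25,
    "data_proportion": 0.25,
}


def weighted_sum(sub_scores: dict, weights: dict) -> float:
    """Combine named sub-scores using the matching named weights."""
    return sum(weights[name] * sub_scores[name] for name in weights)


# Hypothetical sub-scores for one table, each already on a 0 to 100 scale.
sub_scores = {
    "storage_growth": 80.0,
    "storage_growth_proportion": 40.0,
    "data": 60.0,
    "data_proportion": 20.0,
}
growth_score = weighted_sum(sub_scores, GROWTH_WEIGHTS)
print(growth_score)
```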
Preferably, in the table dimension statistical caliber calculation method of the present invention, cyclically traversing the first data attribute table and the second data attribute table and calculating, for each table, the score of each dimension statistical caliber comprises:
Calculating the collection score of the table according to the last-week collection count of the table, the maximum last-week collection count of all tables, and the minimum last-week collection count of all tables;
Calculating the browsing score of the table according to the last-week browsing record count of the table, the maximum last-week browsing record count of all tables, and the minimum last-week browsing record count of all tables;
Obtaining a weighted sum of the collection score and the browsing score of the table, and taking the weighted sum as the attention score of the table.
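To show how the traversal fits together, the sketch below loops over a small per-table list (standing in for the first data attribute table), reuses shared minima and maxima (standing in for the second data attribute table), and combines the collection and browsing scores into an attention score. The min-max formula, the 0.5 weights, and all field names are assumptions made for illustration.

```python
# Hypothetical per-table rows (first data attribute table).
tables = [
    {"table_id": "t1", "collections": 0, "views": 10},
    {"table_id": "t2", "collections": 4, "views": 30},
    {"table_id": "t3", "collections": 8, "views": 50},
]


def normalize(value, lo, hi):
    """Min-max normalization onto 0..100 (assumed formula)."""
    return 0.0 if hi <= lo else (value - lo) / (hi - lo) * 100.0


# Shared extremes across all tables (second data attribute table).
extremes = {
    key: (min(t[key] for t in tables), max(t[key] for t in tables))
    for key in ("collections", "views")
}

COLLECTION_WEIGHT, BROWSE_WEIGHT = 0.5, 0.5  # assumed weights

attention = {}
for t in tables:
    collection_score = normalize(t["collections"], *extremes["collections"])
    browse_score = normalize(t["views"], *extremes["views"])
    attention[t["table_id"]] = (
        COLLECTION_WEIGHT * collection_score + BROWSE_WEIGHT * browse_score
    )
print(attention)
```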
Preferably, in the table dimension statistical caliber calculation method of the present invention, adding a label to each dimension statistical caliber of the table, storing the labeled table dimension statistical calibers in a database, and displaying them on a page comprises:
Storing the scores of the dimension statistical calibers of the table in a database according to the table Id and the data partition Id, and displaying them on a page;
Sorting all tables by their score for a given dimension statistical caliber to obtain the bisection value of that dimension statistical caliber;
Comparing the score of each dimension statistical caliber of the table with the corresponding bisection value, adding a high-index label to that dimension statistical caliber when the score is not smaller than the bisection value, and adding a low-index label to it when the score is smaller than the bisection value;
Storing the labeled table dimension statistical calibers in a database and displaying them on a page.
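The labeling rule compares each score with a value obtained by sorting all scores for one dimension. How that value is derived from the sorted scores is not spelled out here; the sketch below assumes it is the median. The table names and scores are hypothetical.

```python
from statistics import median

# Hypothetical scores of four tables for a single dimension statistical caliber.
scores = {"t1": 20.0, "t2": 50.0, "t3": 80.0, "t4": 50.0}

# Assumption: the split value is the median of the sorted scores.
bisection_value = median(scores.values())

# Not smaller than the split value -> "high"; smaller -> "low".
labels = {
    table_id: ("high" if score >= bisection_value else "low")
    for table_id, score in scores.items()
}
print(bisection_value, labels)
```

Under this assumption a table whose score equals the split value receives the high-index label, matching the "not smaller than" wording above.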
According to a second aspect of the present invention, there is provided a table dimension statistical caliber calculation system, the system comprising a table dimension statistical caliber calculation server for:
Periodically collecting basic statistical information of the table, and obtaining, according to the collected basic statistical information, lineage relationship information of the table and information on the number of task executions in which the table participates;
Obtaining basic statistical caliber data of the table according to the basic statistical information, the lineage relationship information, and the task execution count information of the table, and constructing a first data attribute table from the obtained basic statistical caliber data;
Calculating pre-preparation data for the table dimension statistical calibers according to the attribute data in the first data attribute table, and constructing a second data attribute table from the calculated pre-preparation data;
Cyclically traversing the first data attribute table and the second data attribute table, and calculating, for each table, the score of each dimension statistical caliber of the table;
Adding a label to each dimension statistical caliber of the table, storing the labeled table dimension statistical calibers in a database, and displaying them on a page.
Preferably, in the table dimension statistical caliber calculation system of the present invention, the table dimension statistical caliber calculation server includes:
An information acquisition module, used for periodically collecting basic statistical information of the table, and obtaining, according to the collected basic statistical information, lineage relationship information of the table and information on the number of task executions in which the table participates;
A first data attribute table construction module, used for obtaining basic statistical caliber data of the table according to the basic statistical information, the lineage relationship information, and the task execution count information of the table, and constructing a first data attribute table from the obtained basic statistical caliber data;
A second data attribute table construction module, used for calculating pre-preparation data for the table dimension statistical calibers according to the attribute data in the first data attribute table, and constructing a second data attribute table from the calculated pre-preparation data;
A dimension statistical caliber score calculation module, used for cyclically traversing the first data attribute table and the second data attribute table and calculating, for each table, the score of each dimension statistical caliber;
A dimension statistical caliber labeling and storage module, used for adding a label to each dimension statistical caliber of the table, storing the labeled table dimension statistical calibers in a database, and displaying them on a page.
According to a third aspect of the present invention there is provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of the first aspect of the present invention when executing the program.
According to the method and system for calculating table dimension statistical calibers of the present invention, scoring and labeling the statistical calibers of each table allows the score and the level of each statistical caliber to be displayed directly on a page, so that data management personnel can analyze the value of a table from each caliber dimension and evaluate how well the table is managed, which helps improve the efficiency and accuracy of data management.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a system for a method for calculating statistical caliber of a table dimension according to an embodiment of the application;
FIG. 2 is a diagram illustrating an exemplary architecture of the table dimension statistical caliber calculation server in a table dimension statistical caliber calculation system according to an embodiment of the invention;
FIG. 3 is a flowchart illustrating a method for calculating a statistical caliber of a table dimension according to an embodiment of the invention;
FIG. 4 is a schematic structural diagram of the apparatus provided by the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
It is noted that the following embodiments and features of the embodiments may be combined with each other without conflict, and that all other embodiments obtained by persons of ordinary skill in the art without creative efforts based on the embodiments in the present disclosure are within the scope of protection of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the following claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the present disclosure, one skilled in the art will appreciate that one aspect described herein may be implemented independently of any other aspect, and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. In addition, such apparatus may be implemented and/or such methods practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
FIG. 1 illustrates an exemplary system for a table dimension statistical caliber calculation method suitable for use with embodiments of the application. As shown in FIG. 1, the system may include a table dimension statistical caliber calculation server 101, a communication network 102, and/or one or more table dimension statistical caliber calculation clients 103; FIG. 1 shows a plurality of such clients 103 as an example.
The table dimension statistical caliber calculation server 101 may be any suitable server for storing information, data, programs, and/or any other suitable type of content. In some embodiments, the table dimension statistical caliber calculation server 101 may perform appropriate functions. For example, in some embodiments, the table dimension statistical caliber calculation server 101 may be used for table dimension statistical caliber calculation. As an optional example, in some embodiments, the table dimension statistical caliber calculation server 101 may implement the calculation by constructing a first data attribute table and a second data attribute table. For example, the table dimension statistical caliber calculation server 101 may be configured to:
Periodically collect basic statistical information of the table, and obtain, according to the collected basic statistical information, lineage relationship information of the table and information on the number of task executions in which the table participates;
Obtain basic statistical caliber data of the table according to the basic statistical information, the lineage relationship information, and the task execution count information of the table, and construct a first data attribute table from the obtained basic statistical caliber data;
Calculate pre-preparation data for the table dimension statistical calibers according to the attribute data in the first data attribute table, and construct a second data attribute table from the calculated pre-preparation data;
Cyclically traverse the first data attribute table and the second data attribute table, and calculate, for each table, the score of each dimension statistical caliber of the table;
Add a label to each dimension statistical caliber of the table, store the labeled table dimension statistical calibers in a database, and display them on a page.
FIG. 2 is a diagram illustrating an exemplary architecture of the table dimension statistical caliber calculation server in a table dimension statistical caliber calculation system according to an embodiment of the present invention. As shown in FIG. 2, the table dimension statistical caliber calculation server in this embodiment includes:
An information acquisition module, used for periodically collecting basic statistical information of the table, and obtaining, according to the collected basic statistical information, lineage relationship information of the table and information on the number of task executions in which the table participates;
A first data attribute table construction module, used for obtaining basic statistical caliber data of the table according to the basic statistical information, the lineage relationship information, and the task execution count information of the table, and constructing a first data attribute table from the obtained basic statistical caliber data;
A second data attribute table construction module, used for calculating pre-preparation data for the table dimension statistical calibers according to the attribute data in the first data attribute table, and constructing a second data attribute table from the calculated pre-preparation data;
A dimension statistical caliber score calculation module, used for cyclically traversing the first data attribute table and the second data attribute table and calculating, for each table, the score of each dimension statistical caliber;
A dimension statistical caliber labeling and storage module, used for adding a label to each dimension statistical caliber of the table, storing the labeled table dimension statistical calibers in a database, and displaying them on a page.
As another example, in some embodiments, the table dimension statistical caliber calculation server 101 may send the table dimension statistical caliber calculation method to the table dimension statistical caliber calculation client 103 for use by the user according to the request of the table dimension statistical caliber calculation client 103.
As an optional example, in some embodiments, the table dimension statistical caliber calculation client 103 is configured to provide a visual calculation interface. The visual calculation interface receives a user's selection input operation for table dimension statistical caliber calculation and, in response to that operation, obtains from the table dimension statistical caliber calculation server 101 the calculation interface corresponding to the selected option and displays it. The calculation interface displays at least the table dimension statistical caliber calculation information and the operation options for that information.
In some embodiments, communication network 102 may be any suitable combination of one or more wired and/or wireless networks. For example, the communication network 102 can include any one or more of the Internet, an intranet, a Wide Area Network (WAN), a Local Area Network (LAN), a wireless network, a Digital Subscriber Line (DSL) network, a frame relay network, an Asynchronous Transfer Mode (ATM) network, a Virtual Private Network (VPN), and/or any other suitable communication network. The table dimension statistical caliber calculation client 103 can be connected to the communication network 102 via one or more communication links (e.g., communication link 104), which communication network 102 can be linked to the table dimension statistical caliber calculation server 101 via one or more communication links (e.g., communication link 105). The communication link may be any communication link suitable for transferring data between the table dimension statistical caliber computation client 103 and the table dimension statistical caliber computation server 101, such as a network link, a dial-up link, a wireless link, a hardwired link, any other suitable communication link, or any suitable combination of such links.
The table dimension statistical caliber calculation client 103 may include any one or more clients that present interfaces related to the table dimension statistical caliber calculation in a suitable form for use and operation by a user. In some embodiments, the table dimension statistical caliber calculation client 103 may comprise any suitable type of device. For example, in some embodiments, the table dimension statistical caliber computing client 103 may comprise a mobile device, a tablet computer, a laptop computer, a desktop computer, and/or any other suitable type of client device.
Although the table dimension statistical caliber calculation server 101 is illustrated as one device, in some embodiments any suitable number of devices may be used to perform its functions. For example, in some embodiments, multiple devices may be used to implement the functions performed by the table dimension statistical caliber calculation server 101. Alternatively, a cloud service may be used to implement the functions of the table dimension statistical caliber calculation server 101.
Based on the above system, the embodiment of the application provides a method for calculating a statistical caliber of a table dimension, which is described by the following embodiment.
FIG. 3 is a flowchart illustrating a method for calculating a statistical caliber of a table dimension according to an embodiment of the invention. The method for calculating the statistical caliber of the table dimension in the embodiment can be executed at a server for calculating the statistical caliber of the table dimension, as shown in fig. 3, and comprises the following steps:
Step S201, periodically collecting basic statistical information of the table, and obtaining, according to the collected basic statistical information, lineage (blood-edge) relationship information of the table and information on the number of task executions in which the table participates.
As an optional example, in this embodiment, the basic statistical information of the table includes the storage size occupied by the table, the number of data records in the table, the last update time of the table data, the number of times the table is browsed in the system, and the number of times the table is collected (bookmarked). For example, in this embodiment this basic statistical information is managed in a unified manner and gathered on a timed schedule. The lineage relationships of the table and the task execution counts of the table are analyzed and then stored.
In this embodiment, a lineage relationship indicates that the table structure or table data of one table is derived, in whole or in part, from another table, for example through an INSERT OVERWRITE statement.
In this embodiment, the task execution count indicates the number of times a given table is used in data management tasks.
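The lineage relationship described above can be viewed as a directed edge from a source table to the table derived from it. As a minimal sketch under that assumption, the per-table counts of directly connected upstream and downstream tables could be derived from an edge list as follows; the table names are invented, and only direct (one-hop) relationships are counted here.

```python
from collections import Counter

# Hypothetical lineage edges: (source table, derived table).
edges = [
    ("ods_orders", "dwd_orders"),
    ("dwd_orders", "dws_sales"),
    ("dwd_orders", "ads_report"),
    ("dim_user", "dws_sales"),
]

# A table's upstream count is the number of edges pointing into it;
# its downstream count is the number of edges leaving it.
upstream_count = Counter(target for _, target in edges)
downstream_count = Counter(source for source, _ in edges)

print(upstream_count["dws_sales"], downstream_count["dwd_orders"])
```

A `Counter` returns 0 for tables that never appear on the relevant side of an edge, so tables without upstream or downstream lineage need no special handling.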
Step S202, obtaining the basic data of the statistical caliber of the table according to the basic statistical information of the table, the blood relationship information and the task execution times information of the table participation, and constructing a first data attribute table according to the obtained basic data of the statistical caliber.
As an optional example, in this embodiment, the attributes of the first data attribute table include the last-week collection number, the computing engine Id, the data partition Id, the table Id, the total table storage amount, the total table data number, the last update time of the table data, the last-week storage increment, the last-week table data number increment, the last-week execution count, the number of upstream direct-connection blood-edge relationship tables, the number of downstream blood-edge tables, and the last-week table browsing record number.
For example, in practical application, the present embodiment obtains the basic statistical caliber data of all tables based on the basic statistical information, the blood relationship information, and the task execution count information of each table, and assembles the data into a List<TableMetadataBO>. The attributes of the TableMetadataBO entity class are shown in Table 1 below.
TABLE 1
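As an illustration of the attribute list given for the first data attribute table, the entity could be modelled as a Python dataclass. This is a sketch only: the class name `TableMetadata`, every field name, and the units are assumptions standing in for the TableMetadataBO entity, whose actual definition is not reproduced in this text.

```python
from dataclasses import dataclass, fields
from datetime import datetime


@dataclass
class TableMetadata:
    """Python stand-in for TableMetadataBO; all field names are assumed."""
    table_id: int
    engine_id: int               # computing engine Id
    partition_id: int            # data partition Id
    total_storage: int           # total table storage, e.g. in bytes
    total_rows: int              # total table data number
    last_update_time: datetime   # last update time of the table data
    week_storage_growth: int     # last-week storage increment
    week_row_growth: int         # last-week table data number increment
    week_exec_count: int         # last-week execution count
    upstream_direct_tables: int  # upstream direct-connection blood-edge tables
    downstream_tables: int       # downstream blood-edge tables
    week_view_count: int         # last-week table browsing record number
    week_collect_count: int      # last-week collection number


example = TableMetadata(1, 1, 1, 1000, 500, datetime(2024, 4, 29),
                        200, 50, 4, 2, 1, 30, 5)
field_count = len(fields(TableMetadata))
```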
Step S203, calculating the pre-preparation data of the table dimension statistical caliber according to the attribute data in the first data attribute table, and constructing a second data attribute table from the calculated pre-preparation data.
As an alternative example, the pre-preparation data of the table dimension statistical caliber in this embodiment comprises, over all tables: the maximum and minimum numbers of task executions in the last week, the maximum and minimum storage increments in the last week, the maximum and minimum storage increment ratios in the last week, the maximum and minimum data record increments in the last week, the maximum and minimum data record increment ratios in the last week, the maximum and minimum dependency concentrations, the maximum and minimum storage occupation amounts, the maximum and minimum numbers of downstream blood-edge tables, the maximum and minimum browsing counts in the last week, and the maximum and minimum collection counts in the last week.
For example, in practical application, the present embodiment calculates the pre-preparation data of the statistical caliber of the table dimension according to the attribute data in the first data attribute table, and constructs the second data attribute table according to the calculated pre-preparation data of the statistical caliber of the table dimension as shown in the following table 2.
TABLE 2
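The pre-preparation data amounts to a set of minimum and maximum values taken over all tables. A minimal sketch of that step follows, covering only three of the value pairs. The dictionary keys on the input rows are hypothetical, and skipping tables with zero total storage when computing the ratio bounds is an assumption made here to avoid division by zero; the symbols rdmin, rdmax, ccmin, ccmax, cczbmin and cczbmax are taken from the score formulas of this embodiment.

```python
# Hypothetical rows of the first data attribute table (key names are assumed).
tables = [
    {"table_id": 1, "week_exec_count": 4, "week_storage_growth": 200, "total_storage": 1000},
    {"table_id": 2, "week_exec_count": 10, "week_storage_growth": 50, "total_storage": 500},
    {"table_id": 3, "week_exec_count": 0, "week_storage_growth": 0, "total_storage": 0},
]


def build_bounds(tables):
    """Compute a few of the all-table minimum/maximum values in one pass."""
    execs = [t["week_exec_count"] for t in tables]
    growth = [t["week_storage_growth"] for t in tables]
    # Assumption: tables with zero total storage are left out of the ratio
    # bounds so that the ratio is always defined.
    ratios = [t["week_storage_growth"] / t["total_storage"]
              for t in tables if t["total_storage"] > 0]
    return {
        "rdmin": min(execs), "rdmax": max(execs),
        "ccmin": min(growth), "ccmax": max(growth),
        "cczbmin": min(ratios), "cczbmax": max(ratios),
    }


bounds = build_bounds(tables)
print(bounds["rdmin"], bounds["rdmax"])  # 0 10
```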
Step S204, iterating over the first data attribute table and the second data attribute table, and calculating, for each table, the score of each dimension statistical caliber of the table.
The embodiment calculates the access hotness score of the table according to the maximum number of the task execution of the latest week table, the minimum number of the task execution of the latest week table and the latest week execution number. As an alternative example, in the method of this embodiment, the access hotness score of the table is calculated as follows:
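rddf (access hotness score), where rdmax and rdmin are the maximum and minimum numbers of last-week task executions over all tables:

$$
\text{rddf} =
\begin{cases}
1, & \text{rdmax} - \text{rdmin} = 0 \ \text{or}\ \text{last-week execution count} - \text{rdmin} = 0 \\
1 + 99 \times \dfrac{\text{last-week execution count} - \text{rdmin}}{\text{rdmax} - \text{rdmin}}, & \text{otherwise}
\end{cases}
$$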
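The access hotness formula can be sketched as a small function. The function and parameter names below are assumptions; only the symbols rdmin and rdmax and the arithmetic come from the embodiment.

```python
def access_hotness_score(week_exec_count, rdmin, rdmax):
    """rddf: scale the last-week execution count into the range 1..100."""
    if rdmax - rdmin == 0 or week_exec_count - rdmin == 0:
        return 1.0
    return 1 + 99 * (week_exec_count - rdmin) / (rdmax - rdmin)


print(access_hotness_score(5, 0, 10))  # 50.5
```

A table at the minimum scores 1 and a table at the maximum scores 100, so the score never reaches 0 for this dimension.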
In the embodiment, the dependency concentration score of the table is calculated according to the maximum dependency concentration, the minimum dependency concentration, the number of the upstream direct connection blood edge relation tables and the number of the downstream direct connection blood edge relation tables. As an alternative example, in the method of the present embodiment, the dependency concentration score of the table is calculated as follows:
yljzddf (dependency concentration score of the table), where yljzdmax and yljzdmin are the maximum and minimum dependency concentrations:

$$
\text{yljzddf} =
\begin{cases}
0, & \text{yljzdmax} - \text{yljzdmin} = 0 \\
1 + 99 \times \dfrac{\text{upstream direct tables} + \text{downstream direct tables} - \text{yljzdmin}}{\text{yljzdmax} - \text{yljzdmin}}, & \text{otherwise}
\end{cases}
$$
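A minimal sketch of the dependency concentration score follows, with hypothetical function and parameter names. Note that, as stated in the text, this score is 0 rather than 1 when the maximum and minimum coincide.

```python
def dependency_concentration_score(upstream_direct, downstream_direct,
                                   yljzdmin, yljzdmax):
    """yljzddf: score the sum of direct upstream and downstream tables."""
    if yljzdmax - yljzdmin == 0:
        return 0.0
    concentration = upstream_direct + downstream_direct
    return 1 + 99 * (concentration - yljzdmin) / (yljzdmax - yljzdmin)


print(dependency_concentration_score(2, 1, 0, 6))  # 50.5
```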
In this embodiment, a table storage score is calculated from the total table storage amount, the maximum storage occupation amount of all tables, and the minimum storage occupation amount of all tables, and the benefit cost score of the table is calculated from the access hotness score of the table and the table storage score. As an alternative example, in the method of the present embodiment, the benefit cost score of the table is calculated as follows:
bccdf (table storage score), where sycbmax and sycbmin are the maximum and minimum storage occupation amounts of all tables:

$$
\text{bccdf} =
\begin{cases}
1, & \text{total table storage} = 0 \ \text{or}\ \text{sycbmax} - \text{sycbmin} = 0 \\
1 + 99 \times \dfrac{\text{total table storage} - \text{sycbmin}}{\text{sycbmax} - \text{sycbmin}}, & \text{otherwise}
\end{cases}
$$

sydf (benefit cost score of the table):

$$
\text{sydf} = \text{rddf} \times 0.5 + (100 - \text{bccdf}) \times 0.5
$$
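The benefit cost score rewards high access hotness and penalizes large storage. A sketch under the same naming assumptions (function and parameter names are hypothetical) is:

```python
def storage_score(total_storage, sycbmin, sycbmax):
    """bccdf: larger tables score closer to 100."""
    if total_storage == 0 or sycbmax - sycbmin == 0:
        return 1.0
    return 1 + 99 * (total_storage - sycbmin) / (sycbmax - sycbmin)


def benefit_cost_score(rddf, bccdf):
    """sydf: equal-weight blend of hotness and inverted storage score."""
    return rddf * 0.5 + (100 - bccdf) * 0.5


# A frequently used, small table scores high; a cold, large table scores low.
print(benefit_cost_score(100.0, 1.0))  # 99.5
print(benefit_cost_score(1.0, storage_score(1000, 0, 1000)))  # 0.5
```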
In this embodiment, the impact score for a table is calculated based on the number of downstream tables, the maximum number of downstream tables in all tables, and the minimum number of downstream tables in all tables. As an alternative example, in the method of the present embodiment, the influence score of the table is calculated as follows:
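yxldf (influence score), where yxlmax and yxlmin are the maximum and minimum numbers of downstream blood-edge tables among all tables:

$$
\text{yxldf} =
\begin{cases}
1, & \text{downstream blood-edge tables} = 0 \ \text{or}\ \text{yxlmax} - \text{yxlmin} = 0 \\
1 + 99 \times \dfrac{\text{downstream blood-edge tables} - \text{yxlmin}}{\text{yxlmax} - \text{yxlmin}}, & \text{otherwise}
\end{cases}
$$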
In this embodiment, the data timeliness score of the table is calculated according to the last change time of the table data and the current time. As an alternative example, in the method of this embodiment, the data timeliness score of the table is calculated as follows:
sjjsxdf (data timeliness score):

$$
\text{sjjsxdf} = 100 - \text{number of days between the last change time of the table data and the current time}
$$
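A minimal sketch of the timeliness calculation using Python's `datetime` follows; the function name is hypothetical. The text does not specify a lower bound, so the sketch applies none and the score becomes negative once a table has gone unchanged for more than 100 days.

```python
from datetime import datetime


def data_timeliness_score(last_change_time, now):
    """sjjsxdf: 100 minus the whole days elapsed since the last change."""
    return 100 - (now - last_change_time).days


score = data_timeliness_score(datetime(2024, 4, 19), datetime(2024, 4, 29))
print(score)  # 90
```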
In this embodiment, four sub-scores are calculated. The table growth score is calculated from the last-week table storage growth amount and the maximum and minimum last-week storage growth amounts of all tables. The table growth ratio score is calculated from the last-week table storage occupation amount and the maximum and minimum last-week storage growth ratios of all tables. The table data score is calculated from the last-week data record growth amount and the maximum and minimum last-week data record growth amounts of all tables. The table data ratio score is calculated from the last-week total number of data records of the table and the maximum and minimum data record growth ratios. A weighted sum of the table growth score, the table growth ratio score, the table data score, and the table data ratio score is then taken as the growth score of the table. As an alternative example, in the method of the present embodiment, the growth score of the table is calculated as follows:
Before zzxdf (the growth score of the table) is calculated, the following sub-scores are calculated first:
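zzdf (growth score), where ccmax and ccmin are the maximum and minimum last-week storage growth amounts of all tables:

$$
\text{zzdf} =
\begin{cases}
1, & \text{last-week table storage growth} = 0 \ \text{or}\ \text{ccmax} - \text{ccmin} = 0 \\
1 + 99 \times \dfrac{\text{last-week table storage growth} - \text{ccmin}}{\text{ccmax} - \text{ccmin}}, & \text{otherwise}
\end{cases}
$$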
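zzzbdf (growth ratio score), where cczbmax and cczbmin are the maximum and minimum storage growth ratios:

$$
\text{zzzbdf} =
\begin{cases}
1, & \text{last-week table storage occupation} = 0 \ \text{or}\ \text{cczbmax} - \text{cczbmin} = 0 \\
1 + \dfrac{99}{\text{cczbmax} - \text{cczbmin}} \times \left( \dfrac{\text{last-week storage growth}}{\text{total table storage}} - \text{cczbmin} \right), & \text{otherwise}
\end{cases}
$$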
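sjdf (data score), where sjmax and sjmin are the maximum and minimum last-week data record growth amounts of all tables:

$$
\text{sjdf} =
\begin{cases}
1, & \text{last-week data record growth} = 0 \\
1 + \dfrac{99}{\text{sjmax} - \text{sjmin}} \times \left( \text{last-week data record growth} - \text{sjmin} \right), & \text{otherwise}
\end{cases}
$$

sjzbdf (data ratio score), where sjzbmax and sjzbmin are the maximum and minimum data record growth ratios:

$$
\text{sjzbdf} =
\begin{cases}
1, & \text{total table data records} = 0 \ \text{or}\ \text{sjzbmax} - \text{sjzbmin} = 0 \\
1 + \dfrac{99}{\text{sjzbmax} - \text{sjzbmin}} \times \left( \dfrac{\text{last-week data record growth}}{\text{total table data records}} - \text{sjzbmin} \right), & \text{otherwise}
\end{cases}
$$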
After zzdf, zzzbdf, sjdf and sjzbdf have been obtained, zzxdf is calculated from them:

$$
\text{zzxdf} = \text{zzdf} \times 0.15 + \text{sjdf} \times 0.15 + \text{zzzbdf} \times 0.35 + \text{sjzbdf} \times 0.35
$$
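The four sub-scores and their weighted sum can be sketched together. The dictionary keys and helper names are hypothetical. Two points are assumptions rather than statements of the embodiment: the growth ratio sub-score follows the same "ratio minus minimum" form as the data ratio sub-score, and the sjdf branch adds an equal-bounds guard that the text does not state, purely to avoid division by zero.

```python
def scaled(value, lo, hi):
    """Shared scaling of the sub-scores: 1 + 99 * (value - lo) / (hi - lo)."""
    return 1 + 99 * (value - lo) / (hi - lo)


def growth_score(t, b):
    """zzxdf for one table t, given the all-table bounds b."""
    if t["week_storage_growth"] == 0 or b["ccmax"] == b["ccmin"]:
        zzdf = 1.0
    else:
        zzdf = scaled(t["week_storage_growth"], b["ccmin"], b["ccmax"])

    if t["total_storage"] == 0 or b["cczbmax"] == b["cczbmin"]:
        zzzbdf = 1.0
    else:
        ratio = t["week_storage_growth"] / t["total_storage"]
        zzzbdf = scaled(ratio, b["cczbmin"], b["cczbmax"])

    # The equal-bounds check here is an added safeguard, not part of the text.
    if t["week_row_growth"] == 0 or b["sjmax"] == b["sjmin"]:
        sjdf = 1.0
    else:
        sjdf = scaled(t["week_row_growth"], b["sjmin"], b["sjmax"])

    if t["total_rows"] == 0 or b["sjzbmax"] == b["sjzbmin"]:
        sjzbdf = 1.0
    else:
        ratio = t["week_row_growth"] / t["total_rows"]
        sjzbdf = scaled(ratio, b["sjzbmin"], b["sjzbmax"])

    return zzdf * 0.15 + sjdf * 0.15 + zzzbdf * 0.35 + sjzbdf * 0.35


table = {"week_storage_growth": 200, "total_storage": 1000,
         "week_row_growth": 50, "total_rows": 500}
bounds = {"ccmin": 0, "ccmax": 200, "cczbmin": 0.0, "cczbmax": 0.2,
          "sjmin": 0, "sjmax": 100, "sjzbmin": 0.0, "sjzbmax": 0.2}
zzxdf = growth_score(table, bounds)
```

The ratio-based sub-scores carry 70% of the weight, so a small table that doubles in size outranks a large table with the same absolute growth.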
In this embodiment, the collection score of the table is calculated from the last-week collection number of the table and the maximum and minimum last-week collection numbers of all tables. The browsing score of the table is calculated from the last-week browsing record number of the table and the maximum and minimum last-week browsing record numbers of all tables. A weighted sum of the collection score and the browsing score is taken as the attention score of the table. As an alternative example, in the method of the present embodiment, the attention score of the table is calculated as follows:
When calculating the attention score of a table, the collection score and the browsing score of the table are calculated first.
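scdf (collection score), where bscmax and bscmin are the maximum and minimum last-week collection numbers of all tables:

$$
\text{scdf} =
\begin{cases}
1, & \text{last-week table collections} = 0 \ \text{or}\ \text{bscmax} - \text{bscmin} = 0 \\
1 + 99 \times \dfrac{\text{last-week table collections} - \text{bscmin}}{\text{bscmax} - \text{bscmin}}, & \text{otherwise}
\end{cases}
$$

blldf (table browsing score), where bllmax and bllmin are the maximum and minimum last-week browsing record numbers of all tables:

$$
\text{blldf} =
\begin{cases}
1, & \text{last-week table browsing records} = 0 \ \text{or}\ \text{bllmax} - \text{bllmin} = 0 \\
1 + 99 \times \dfrac{\text{last-week table browsing records} - \text{bllmin}}{\text{bllmax} - \text{bllmin}}, & \text{otherwise}
\end{cases}
$$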
gzddf (attention score) is calculated from scdf and blldf:

$$
\text{gzddf} = \text{scdf} \times 0.5 + \text{blldf} \times 0.5
$$
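The attention score can be sketched as follows, with hypothetical function and parameter names; the symbols bscmin, bscmax, bllmin and bllmax come from the embodiment.

```python
def collection_score(week_collections, bscmin, bscmax):
    """scdf: score the last-week collection number of a table."""
    if week_collections == 0 or bscmax - bscmin == 0:
        return 1.0
    return 1 + 99 * (week_collections - bscmin) / (bscmax - bscmin)


def browsing_score(week_views, bllmin, bllmax):
    """blldf: score the last-week browsing record number of a table."""
    if week_views == 0 or bllmax - bllmin == 0:
        return 1.0
    return 1 + 99 * (week_views - bllmin) / (bllmax - bllmin)


def attention_score(scdf, blldf):
    """gzddf: equal-weight sum of the collection and browsing scores."""
    return scdf * 0.5 + blldf * 0.5


gzddf = attention_score(collection_score(5, 0, 10), browsing_score(0, 0, 40))
print(gzddf)  # 25.75
```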
Step S205, adding labels to the dimension statistical calibers of each table, storing the labeled dimension statistical calibers of each table in a database, and displaying them on a page.
As an alternative example, the embodiment stores the scores of all dimension statistical calibers of each table in a database, keyed by the table Id and the data partition Id, and displays them on a page. For each dimension statistical caliber, the scores of all tables are sorted to obtain the bipartite value of that caliber (the value at the midpoint of the sorted scores, that is, the median), and the score of each table is compared with that bipartite value. When the score of a table for a given caliber is not smaller than the corresponding bipartite value, a high-index label is added to that caliber of the table; when the score is smaller than the bipartite value, a low-index label is added. The labeled dimension statistical calibers of each table are then stored in the database and displayed on the page.
For example, the embodiment stores the calculated statistical caliber results of each table, together with information such as the table Id and the data partition Id, in a database so that the dimension statistical caliber indexes of each table can be displayed on a page. All tables are then marked to distinguish the high and low index labels of each statistical caliber. Taking hotness as an example, the marking process is as follows: the hotness scores of all tables are sorted in ascending order and the bipartite value is taken; if the hotness of the current table is larger than the bipartite value, the table is given a high-hotness label, otherwise it is given a low-hotness label. After marking, all indexes of all tables are stored in the database and the index labels of the tables are displayed on the page.
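The labeling step can be sketched with the standard-library `statistics.median`. This is an illustration under assumptions: the function name and label strings are hypothetical, the "not smaller than" comparison of claim 5 is used (the hotness example above uses "larger than"), and `median` averages the two middle values for an even number of tables, a detail the text does not specify.

```python
from statistics import median


def label_scores(scores):
    """Map {table_id: score} to {table_id: 'high' | 'low'} for one caliber."""
    cutoff = median(scores.values())
    return {table_id: ("high" if score >= cutoff else "low")
            for table_id, score in scores.items()}


hotness = {1: 10.0, 2: 50.0, 3: 90.0}
labels = label_scores(hotness)
print(labels)  # {1: 'low', 2: 'high', 3: 'high'}
```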
According to the table dimension statistical caliber calculation method and system of the present invention, scores are calculated and labels are assigned for each statistical caliber of a table, and the score and level of each caliber can be displayed directly on a page. Data management personnel can thus analyze the value of a table from each caliber dimension and assess how well the table is being managed, which helps improve the efficiency and accuracy of data management.
As shown in FIG. 4, the present invention also provides an apparatus comprising a processor 310, a communication interface 320, a memory 330 for storing a processor executable computer program, and a communication bus 340. Wherein the processor 310, the communication interface 320 and the memory 330 perform communication with each other through the communication bus 340. The processor 310 implements the table dimension statistical caliber calculation method described above by running an executable computer program.
The computer program in the memory 330 may be implemented in the form of software functional units and, when sold or used as a separate product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art or in part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method of the embodiments of the present application. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
The system embodiments described above are merely illustrative, in which elements illustrated as separate elements may or may not be physically separate, and elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected based on actual needs to achieve the purpose of the embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on such understanding, the foregoing technical solutions may be embodied essentially or in part in the form of a software product, which may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the various embodiments or methods of some parts of the embodiments.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.
Claims (7)
1. A table dimension statistical caliber calculation method, the method comprising:
Acquiring basic statistical information of the table at regular time, and acquiring blood-edge relationship information of the table and information on the number of task executions in which the table participates according to the acquired basic statistical information of the table;
Acquiring basic statistical caliber data of the table according to basic statistical information of the table, blood relationship information and task execution times information participated in by the table, and constructing a first data attribute table according to the acquired basic statistical caliber data;
Calculating the front preparation data of the table dimension statistical caliber according to the attribute data in the first data attribute table, and constructing a second data attribute table according to the calculated front preparation data of the table dimension statistical caliber;
circularly traversing the first data attribute table and the second data attribute table, and calculating the score of each dimension statistical caliber of the table for each table;
Adding labels to the dimension statistical calibers of the table, storing the labeled dimension statistical calibers of the table in a database, and displaying them on a page;
basic statistical information of the table comprises the storage size occupied by the table, the number of data of the table, the last update time of the table data, the browsed times of the table in the system and the collected times of the table;
The attributes of the first data attribute table comprise the last-week collection number, the computing engine Id, the data partition Id, the table Id, the total table storage amount, the total table data number, the last update time of the table data, the last-week storage increment, the last-week table data number increment, the last-week execution count, the number of upstream direct-connection blood-edge relationship tables, the number of downstream blood-edge tables, and the last-week table browsing record number;
The pre-preparation data of the table dimension statistical caliber comprises the maximum number of task executions of a table in the last week, the minimum number of task executions of a table in the last week, the maximum storage increment of all tables in the last week, the minimum storage increment of all tables in the last week, the maximum storage increment ratio, the minimum storage increment ratio, the maximum data record increment of all tables in the last week, the minimum data record increment ratio, the maximum dependency concentration, the minimum dependency concentration, the maximum storage occupation amount of all tables, the minimum storage occupation amount of all tables, the maximum number of downstream blood-edge tables in all tables, the minimum number of downstream blood-edge tables in all tables, the maximum browsing amount of all tables in the last week, the minimum browsing amount of all tables in the last week, the maximum collection amount of all tables in the last week, and the minimum collection amount of all tables in the last week.
2. The table dimension statistical caliber calculation method according to claim 1, wherein cyclically traversing the first data attribute table and the second data attribute table and calculating the score of each dimension statistical caliber of the table for each table comprises:
Calculating the access hotness score of the table according to the maximum number of the task execution of the last week table, the minimum number of the task execution of the last week table and the last week execution number;
Calculating the dependency concentration score of the table according to the maximum dependency concentration, the minimum dependency concentration, the number of the upstream direct connection blood edge relation tables and the number of the downstream direct connection blood edge relation tables;
Calculating a table storage score according to the total storage amount of the tables, the maximum storage occupation amount of all the tables and the minimum storage occupation amount of all the tables, and calculating a benefit cost score of the tables according to the access hotness score of the tables and the table storage score;
Calculating an influence score of the table according to the number of the downstream blood-edge tables, the maximum number of the downstream blood-edge tables in all tables and the minimum number of the downstream blood-edge tables in all tables;
and calculating the data timeliness score of the table according to the last change time of the table data and the current time.
3. The table dimension statistical caliber calculation method according to claim 1, wherein cyclically traversing the first data attribute table and the second data attribute table and calculating the score of each dimension statistical caliber of the table for each table comprises:
Calculating a growth score of the table based on the last-week table storage growth, the maximum last-week storage growth of all tables, and the minimum last-week storage growth of all tables;
Calculating a growth ratio score of the table according to the last-week table storage occupation, the maximum storage growth ratio, and the minimum storage growth ratio;
Calculating the data score of the table according to the last-week data record increment, the maximum last-week data record increment of all tables, and the minimum last-week data record increment of all tables;
Calculating the data ratio score of the table according to the last-week total table data count, the maximum data count increment ratio, and the minimum data count increment ratio;
Obtaining a weighted sum value by weighted summation of the table growth score, the table growth ratio score, the table data score, and the table data ratio score, and taking the weighted sum value as the growth score of the table.
4. The table dimension statistical caliber calculation method according to claim 1, wherein cyclically traversing the first data attribute table and the second data attribute table and calculating the score of each dimension statistical caliber of the table for each table comprises:
Calculating the collection scores of the tables according to the collection number of the latest week tables, the maximum collection number of all the tables of the latest week and the minimum collection number of all the tables of the latest week;
Calculating the browsing score of the table according to the browsing record number of the latest week table, the maximum browsing record number of all tables of the latest week and the minimum browsing record number of all tables of the latest week;
And obtaining a weighted sum value by carrying out weighted summation on the collection score of the table and the browse score of the table, and taking the weighted sum value as the attention score of the table.
5. The table dimension statistical caliber calculation method according to claim 1, wherein adding labels to the dimension statistical calibers of the table, storing the labeled dimension statistical calibers of the table in a database, and displaying them on a page comprises:
According to the table Id and the data partition Id, storing the scores of the dimension statistical calibers of the table in a database and displaying them on a page;
Sorting the relevant dimension statistical calibers of all tables according to the scores of the dimension statistical calibers to obtain the bipartite values of the relevant dimension statistical calibers;
Comparing the score of the corresponding dimension statistical caliber of the table with the corresponding bipartite value, adding a high-index label to the corresponding dimension statistical caliber of the table when the score is not smaller than the corresponding bipartite value, and adding a low-index label to the corresponding dimension statistical caliber of the table when the score is smaller than the corresponding bipartite value;
And storing the labeled dimension statistical calibers of the table in the database and displaying them on the page.
6. A table dimension statistical caliber calculation system, characterized by comprising a table dimension statistical caliber calculation server, wherein the table dimension statistical caliber calculation server is used for:
Acquiring basic statistical information of the table at regular time, and acquiring blood-edge relationship information of the table and information on the number of task executions in which the table participates according to the acquired basic statistical information of the table;
Acquiring basic statistical caliber data of the table according to basic statistical information of the table, blood relationship information and task execution times information participated in by the table, and constructing a first data attribute table according to the acquired basic statistical caliber data;
Calculating the front preparation data of the table dimension statistical caliber according to the attribute data in the first data attribute table, and constructing a second data attribute table according to the calculated front preparation data of the table dimension statistical caliber;
circularly traversing the first data attribute table and the second data attribute table, and calculating the score of each dimension statistical caliber of the table for each table;
Adding labels to the dimension statistical calibers of the table, storing the labeled dimension statistical calibers of the table in a database, and displaying them on a page;
basic statistical information of the table comprises the storage size occupied by the table, the number of data of the table, the last update time of the table data, the browsed times of the table in the system and the collected times of the table;
The attributes of the first data attribute table comprise the last-week collection number, the computing engine Id, the data partition Id, the table Id, the total table storage amount, the total table data number, the last update time of the table data, the last-week storage increment, the last-week table data number increment, the last-week execution count, the number of upstream direct-connection blood-edge relationship tables, the number of downstream blood-edge tables, and the last-week table browsing record number;
The pre-preparation data of the table dimension statistical caliber comprises the maximum number of task executions of a table in the last week, the minimum number of task executions of a table in the last week, the maximum storage increment of all tables in the last week, the minimum storage increment of all tables in the last week, the maximum storage increment ratio, the minimum storage increment ratio, the maximum data record increment of all tables in the last week, the minimum data record increment ratio, the maximum dependency concentration, the minimum dependency concentration, the maximum storage occupation amount of all tables, the minimum storage occupation amount of all tables, the maximum number of downstream blood-edge tables in all tables, the minimum number of downstream blood-edge tables in all tables, the maximum browsing amount of all tables in the last week, the minimum browsing amount of all tables in the last week, the maximum collection amount of all tables in the last week, and the minimum collection amount of all tables in the last week.
7. The table dimension statistical caliber calculation system according to claim 6, wherein the table dimension statistical caliber calculation server comprises:
The information acquisition module is used for acquiring basic statistical information of the table at regular time, and acquiring blood edge relation information of the table and task execution times information participated in the table according to the acquired basic statistical information of the table;
the first data attribute table construction module is used for acquiring the basic data of the statistical caliber of the table according to the basic statistical information, the blood relationship information and the task execution times information participated in by the table, and constructing a first data attribute table according to the acquired basic data of the statistical caliber;
The second data attribute table construction module is used for calculating the prepositive preparation data of the table dimension statistical caliber according to the attribute data in the first data attribute table and constructing a second data attribute table according to the prepositive preparation data of the table dimension statistical caliber obtained by calculation;
the dimension statistical caliber score calculation module is used for circularly traversing the first data attribute table and the second data attribute table and calculating the score of each dimension statistical caliber of each table;
the dimension statistical caliber marking and storing module is used for adding labels for all dimension statistical calibers of the tables, storing the corresponding table dimension statistical calibers of the labeled tables into a database and displaying pages.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410526203.8A CN118445330B (en) | 2024-04-29 | 2024-04-29 | A table dimension statistical caliber calculation method and system |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN118445330A (en) | 2024-08-06 |
| CN118445330B (en) | 2025-03-04 |
Family
ID=92318943
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410526203.8A Active CN118445330B (en) | 2024-04-29 | 2024-04-29 | A table dimension statistical caliber calculation method and system |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118445330B (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111611248A (en) * | 2020-05-25 | 2020-09-01 | 山东浪潮商用系统有限公司 | Method, system and device for automatically analyzing index caliber |
| CN113360496A (en) * | 2021-05-26 | 2021-09-07 | 国网能源研究院有限公司 | Method and device for constructing metadata tag library |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20180322435A1 (en) * | 2017-05-03 | 2018-11-08 | Aurora Predictions, LLC | Performance & predictive dimensions for business intelligence data |
| CN113792084A (en) * | 2021-08-12 | 2021-12-14 | 北京中交兴路信息科技有限公司 | Data heat analysis method, device, equipment and storage medium |
| CN115098671B (en) * | 2022-08-25 | 2023-02-03 | 深圳市城市交通规划设计研究中心股份有限公司 | Government affair data processing method based on artificial intelligence, electronic equipment and storage medium |
| CN117540927A (en) * | 2023-11-24 | 2024-02-09 | 南威软件股份有限公司 | Label comprehensive evaluation method, system and storage medium based on multiple weights |
Also Published As
| Publication number | Publication date |
|---|---|
| CN118445330A (en) | 2024-08-06 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |