CN117271469A - Energy storage data distributed storage method of energy storage power station - Google Patents
- Publication number
- CN117271469A (application number CN202311541200.3A)
- Authority
- CN
- China
- Prior art keywords
- energy storage
- term
- data
- dissimilarity
- length
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/1727—Details of free space management performed by the file system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
- G06F16/1744—Redundancy elimination performed by the file system using compression, e.g. sparse files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1805—Append-only file systems, e.g. using logs or journals to store data
- G06F16/1815—Journaling file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0608—Saving storage space on storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0683—Plurality of storage devices
- G06F3/0689—Disk arrays, e.g. RAID, JBOD
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The invention relates to the technical field of data compression, in particular to a distributed storage method for energy storage data of an energy storage power station. The method comprises the following steps: acquiring energy storage log data and disk-related data; dividing the energy storage log data into short-term groups and long-term groups according to two different preset values, and acquiring the short-term dissimilarity and the long-term dissimilarity of the energy storage log data respectively; acquiring the first to-be-encoded region length, the adjustment length and the adjustment parameter according to the short-term dissimilarity and the long-term dissimilarity, acquiring the target to-be-encoded region length and the dictionary region length according to the adjustment parameter, the adjustment length and the first to-be-encoded region length, and compressing the energy storage log data based on these lengths to obtain energy storage log compressed data; acquiring the utilization rate of the last disk cell according to the size of the energy storage log compressed data and the format unit size of the disk; and acquiring the disk suitability of each disk, thereby completing the storage. By placing different compressed energy storage data on different disks, the invention greatly reduces the waste of disk storage space.
Description
Technical Field
The invention relates to the technical field of data compression, in particular to an energy storage data distributed storage method of an energy storage power station.
Background
An energy storage power station is a system that stores, converts and releases electric energy, using electrochemical cells, electromagnetic energy or other means as the storage medium. It can store scheduled electric power and also regulate the peak-valley electricity consumption of a city: when urban electricity consumption falls, electric energy is stored as chemical energy or in another form, and when urban electricity consumption rises, the stored energy is released. Common energy storage power station data include station capacity, power, charge-discharge efficiency, cycle life, operation and maintenance data, and the like. Among these, the operation and maintenance data is particularly important in modern information-based management: it is used to maintain, improve and optimize system equipment, and it helps with troubleshooting and solving problems. Distributed storage combines scattered storage space resources into one virtual storage device and spreads the data over a plurality of nodes, which improves the scalability, reliability and performance of the system. It also protects the data: even if a disk in the storage space is damaged, not all of the data is lost, so fault tolerance is high.
In previous applications of compression algorithms, data files were typically compressed with the default parameters of the algorithm, and similarities that may exist within a data file were ignored. Because default parameters may not fully exploit the repetition or similarity of the data within a file, the file volume is not minimized, disk space is occupied unnecessarily, and disk space utilization is low.
Disclosure of Invention
In order to solve the technical problem of low disk space utilization, the invention provides a distributed storage method for energy storage data of an energy storage power station, which adopts the following technical scheme:
The invention provides a distributed storage method for energy storage data of an energy storage power station, which comprises the following steps:
acquiring energy storage log data, the number of disks and the format unit size of the disks;
grouping each energy storage log data according to a first preset value and marking the groups as short-term groups, and acquiring the short-term dissimilarity of the energy storage log data according to the Hamming distance between adjacent short-term groups; grouping each energy storage log data according to a second preset value and marking the groups as long-term groups, and acquiring the long-term dissimilarity of the energy storage log data according to the DTW distance between adjacent long-term groups;
acquiring the first to-be-encoded region length and the adjustment length according to the short-term dissimilarity and the long-term dissimilarity; acquiring the adjustment parameter according to the short-term dissimilarity and the long-term dissimilarity, acquiring the target to-be-encoded region length according to the adjustment parameter, the adjustment length and the first to-be-encoded region length, taking twice the target to-be-encoded region length as the dictionary region length, and compressing the energy storage log data according to the target to-be-encoded region length and the dictionary region length to obtain energy storage log compressed data;
acquiring the utilization rate of the last disk cell according to the size of the energy storage log compressed data and the format unit size of the disk; acquiring the disk suitability of each energy storage log compressed data on each disk according to the size of the energy storage log compressed data, the format unit size of the disk and the utilization rate of the last disk cell; and storing each energy storage log compressed data on the disk with the greatest disk suitability.
Preferably, the method for acquiring the energy storage log data comprises the following steps:
Energy storage data of the energy storage power station is collected by sensors, encoded with UTF-8 and stored; the code obtained from the energy storage data collected on each day is recorded as one energy storage log data.
Preferably, the method for acquiring the short-term dissimilarity of the energy storage log data according to the Hamming distance between adjacent short-term groups comprises the following steps:
All binary codes in each short-term group form a short-term sequence. Any short-term sequence is taken as a standard short-term sequence, and the Hamming distance between the standard short-term sequence and the short-term sequence immediately following it is calculated as the first distance of the standard short-term sequence. The first distances of all short-term sequences except the last one are averaged to obtain the short-term dissimilarity of the energy storage log data.
Preferably, the method for acquiring the long-term dissimilarity of the energy storage log data according to the DTW distance between adjacent long-term groups comprises the following steps:
All binary codes in each long-term group form a long-term sequence. Any long-term sequence is taken as a standard long-term sequence, and the DTW distance between the standard long-term sequence and the long-term sequence immediately following it is calculated as the second distance of the standard long-term sequence. The second distances of all long-term sequences except the last one are averaged to obtain the long-term dissimilarity of the energy storage log data.
Preferably, the method for obtaining the first to-be-encoded region length and the adjustment length according to the short-term dissimilarity and the long-term dissimilarity comprises the following steps:
The normalized value of the short-term dissimilarity is compared with the normalized value of the long-term dissimilarity. If the normalized short-term dissimilarity is greater than or equal to the normalized long-term dissimilarity, the initial to-be-encoded region length takes one preset value; otherwise it takes another preset value. The initial to-be-encoded region length is taken as the first to-be-encoded region length, and the adjustment length is obtained from the difference between the normalized short-term dissimilarity and the normalized long-term dissimilarity together with the first to-be-encoded region length.
Preferably, the method for obtaining the adjustment length according to the difference between the normalized value of the short-term dissimilarity and the normalized value of the long-term dissimilarity and the first to-be-encoded region length comprises the following steps:
The adjustment length is computed from the normalized value of the short-term dissimilarity of the j-th energy storage log data, the normalized value of the long-term dissimilarity of the j-th energy storage log data, and the first to-be-encoded region length, using a minimum function and an exponential function with the natural constant as its base.
Preferably, the method for obtaining the adjustment parameters according to the short-term dissimilarity and the long-term dissimilarity comprises the following steps:
comparing the normalized value of the short-term dissimilarity with the normalized value of the long-term dissimilarity, and if the normalized value of the short-term dissimilarity is greater than or equal to the normalized value of the long-term dissimilarity, setting the adjustment parameter to be 1; otherwise, let the adjustment parameter be-1.
Preferably, the method for obtaining the target to-be-encoded region length according to the adjustment parameter, the adjustment length and the first to-be-encoded region length comprises the following steps:
The product of the adjustment parameter and the adjustment length is recorded as a first product, and the rounded value of the sum of the first product and the first to-be-encoded region length is taken as the target to-be-encoded region length of the energy storage log data.
Preferably, the method for obtaining the utilization rate of the last disk cell according to the size of the energy storage log compressed data and the format unit size of the disk comprises the following steps:
The remainder of the size of the energy storage log compressed data divided by the format unit size of the disk is computed, and the ratio of this remainder to the format unit size of the disk is taken as the utilization rate of the last disk cell.
Preferably, the method for obtaining the disk suitability of each energy storage log compressed data on each disk according to the size of the energy storage log compressed data, the format unit size of the disk and the utilization rate of the last disk cell comprises the following steps:
The disk suitability of the v-th disk is computed from the size of the j-th energy storage log data after compression, the format unit size of the v-th disk, and the utilization rate of the last disk cell occupied on that disk.
The invention has the following beneficial effects: by grouping the energy storage data, the invention analyzes the similarity of each file over both the long term and the short term. Based on this analysis, different LZ77 compression parameters are used for different energy storage data files, which effectively reduces the volume of the compressed files and saves disk space. In addition, according to the number of available disks and the format unit size of each disk, reading speed is weighed against disk space utilization, and different compressed energy storage data are placed on different disks, which greatly reduces the waste of disk storage space.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a distributed storage method for energy storage data of an energy storage power station according to an embodiment of the present invention;
FIG. 2 shows the grouping result of the energy storage log data.
Detailed Description
In order to further describe the technical means and effects adopted by the invention to achieve the preset aim, the following detailed description refers to specific implementation, structure, characteristics and effects of an energy storage data distributed storage method of an energy storage power station according to the invention by combining the accompanying drawings and preferred embodiments. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
An embodiment of an energy storage data distributed storage method of an energy storage power station comprises the following steps:
the invention provides a specific scheme of an energy storage data distributed storage method of an energy storage power station, which is specifically described below with reference to the accompanying drawings.
Referring to fig. 1, a flowchart of a distributed storage method for energy storage data of an energy storage power station according to an embodiment of the present invention is shown, and the method includes the following steps:
Step S001, collecting energy storage data of the energy storage power station through sensors, and acquiring the energy storage log data, the number of disks and the format unit size of the disks.
Energy storage data of the energy storage power station is collected by sensors, encoded with UTF-8 and stored as energy storage log data. The data collected in one day is taken as one energy storage log data, and there are N energy storage log data in total. The number M of disks available in the energy storage power station is obtained, and the size of the formatting allocation unit of each disk is set by the implementer according to actual conditions.
So far, the energy storage log data and the disk characteristics are obtained.
Step S002, grouping each energy storage log data according to a first preset value and marking the groups as short-term groups, and acquiring the short-term dissimilarity of the energy storage log data according to the Hamming distance between adjacent short-term groups; grouping each energy storage log data according to a second preset value and marking the groups as long-term groups, and acquiring the long-term dissimilarity of the energy storage log data according to the DTW distance between adjacent long-term groups.
Each energy storage log data is divided into groups of K bits, each group corresponding to one data sequence. In this embodiment K is set to 32. The number of groups is $n_j = \lfloor L_j / K \rfloor$, where $L_j$ is the length of the j-th energy storage log data; the trailing part shorter than K bits is discarded. The data sequence of the i-th group is denoted $X_i$, as shown in FIG. 2. In FIG. 2, K is the length of each group, $L_j$ is the length of the j-th energy storage log data, and $X_{i-1}$, $X_i$ and $X_{i+1}$ are the data of groups i-1, i and i+1.
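The grouping step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `to_bits` and `split_groups` and the sample log line are hypothetical, and the bit string is assumed to be the big-endian bits of the UTF-8 bytes.

```python
def to_bits(text: str) -> str:
    """Encode text as UTF-8 and return its bits as a string of '0' and '1'."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def split_groups(bits: str, size: int) -> list[str]:
    """Split a bit string into consecutive groups of `size` bits.

    The trailing part shorter than `size` is discarded, as described in the text.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    count = len(bits) // size
    return [bits[i * size:(i + 1) * size] for i in range(count)]


# Hypothetical log line; K = 32 as in this embodiment.
sample_bits = to_bits("SOC=87%;T=25C")
sample_groups = split_groups(sample_bits, 32)
```

The same `split_groups` helper would serve for the long-term grouping by passing the second preset value as `size`.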
The short-term dissimilarity of the energy storage log data is obtained from the Hamming distances between adjacent groups, according to the following formula:

$$A_j = \frac{1}{n_j - 1} \sum_{i=1}^{n_j - 1} H\left(X_i, X_{i+1}\right), \qquad n_j = \left\lfloor \frac{L_j}{K} \right\rfloor$$

where $X_i$ is the data sequence of the i-th group, $X_{i+1}$ is the data sequence of the (i+1)-th group, K is the length of each group, $L_j$ is the length of the j-th energy storage log data, $n_j$ is the number of groups of the energy storage log data, $H(\cdot,\cdot)$ is the Hamming distance, and $A_j$ is the short-term dissimilarity of the j-th energy storage log data.
$A_j$ is the average Hamming distance between adjacent groups of energy storage log data j. The smaller $A_j$ is, the smaller the difference between adjacent groups of data. In the limiting case where groups of all 1s and groups of all 0s alternate, $A_j$ reaches its maximum value K.
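A minimal sketch of the short-term dissimilarity, averaging the Hamming distance over adjacent K-bit groups. The function names are hypothetical, and returning 0.0 when fewer than two complete groups exist is an assumption; the source does not cover that case.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def short_term_dissimilarity(bits: str, k: int = 32) -> float:
    """Average Hamming distance between adjacent k-bit groups of `bits`."""
    # Complete groups only; the trailing part shorter than k is dropped.
    groups = [bits[i:i + k] for i in range(0, len(bits) - k + 1, k)]
    if len(groups) < 2:
        return 0.0  # assumption: no adjacent pair means no measurable dissimilarity
    distances = [hamming(groups[i], groups[i + 1]) for i in range(len(groups) - 1)]
    return sum(distances) / len(distances)
```

With alternating all-0 and all-1 groups every adjacent pair differs in all k positions, so the function returns k, which matches the limiting case described above.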
Each energy storage log data is grouped again with group length T, where T = 2K. Each group is marked as a long-term group, and each long-term group corresponds to one data sequence. The number of long-term groups is $m_j = \lfloor L_j / T \rfloor$, where $L_j$ is the length of the j-th energy storage log data; as before, the trailing part shorter than T is discarded. The long-term dissimilarity of the j-th energy storage log data is obtained from the DTW distances between adjacent long-term groups, according to the following formula:

$$B_j = \frac{1}{m_j - 1} \sum_{u=1}^{m_j - 1} \mathrm{DTW}\left(Y_u, Y_{u+1}\right)$$

where $Y_u$ is the data sequence of the u-th long-term group, $Y_{u+1}$ is the data sequence of the (u+1)-th long-term group, $\mathrm{DTW}(\cdot,\cdot)$ is the DTW distance, $m_j$ is the number of long-term groups of the energy storage log data, and $B_j$ is the long-term dissimilarity of the j-th energy storage log data.
$B_j$ is the average DTW distance between adjacent long-term groups. The more similar two sequences are, the smaller their DTW distance; in the limiting case where adjacent sequences are completely different, $B_j$ reaches its maximum value. $B_j$ reflects the similarity of the file over a longer span: the smaller it is, the more similar the file.
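The long-term dissimilarity can be sketched with a textbook dynamic time warping (DTW) recurrence. The source does not state the local cost or step pattern, so the absolute difference of bit values and the three standard moves used here are assumptions, as are the function names. The default `t = 64` corresponds to twice K with K = 32.

```python
def dtw(a: list[int], b: list[int]) -> float:
    """Classic DTW distance with |x - y| as the local cost (an assumption)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def long_term_dissimilarity(bits: str, t: int = 64) -> float:
    """Average DTW distance between adjacent t-bit groups of `bits`."""
    groups = [[int(c) for c in bits[i:i + t]] for i in range(0, len(bits) - t + 1, t)]
    if len(groups) < 2:
        return 0.0  # assumption: not specified by the source
    distances = [dtw(groups[u], groups[u + 1]) for u in range(len(groups) - 1)]
    return sum(distances) / len(distances)
```

Unlike the Hamming distance, DTW tolerates small shifts: `dtw([0, 0, 1, 1], [0, 1, 1, 1])` is 0 because the warping path can stretch one sequence to match the other.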
The short-term dissimilarity $A_j$ reflects the similarity of the j-th energy storage log data on the K-bit scale, and the long-term dissimilarity $B_j$ reflects its similarity on the 2K-bit scale. Data that is more similar on the longer scale should use a longer sliding window, which improves the compression effect and saves disk space. Data that is more similar on the shorter scale should use a shorter sliding window: lengthening the window in this case does not noticeably improve the compression effect, whereas a shorter window improves the running speed of the compression algorithm.
Thus, short-term dissimilarity and long-term dissimilarity of each of the stored energy log data are obtained.
Step S003, acquiring the first to-be-encoded region length and the adjustment length according to the short-term dissimilarity and the long-term dissimilarity; acquiring the adjustment parameter according to the short-term dissimilarity and the long-term dissimilarity, acquiring the target to-be-encoded region length according to the adjustment parameter, the adjustment length and the first to-be-encoded region length, taking twice the target to-be-encoded region length as the dictionary region length, and compressing the energy storage log data according to the target to-be-encoded region length and the dictionary region length to obtain the energy storage log compressed data.
The sizes of the to-be-encoded region and the dictionary region are adjusted adaptively according to the short-term dissimilarity and the long-term dissimilarity calculated above. Because the two dissimilarities have different ranges, each is normalized by linear normalization; the normalized short-term dissimilarity is denoted $a_j$ and the normalized long-term dissimilarity is denoted $b_j$.
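A sketch of the normalization step, assuming that "linear normalization" means min-max scaling over the values of all N logs; the source does not spell out the formula, so this reading, the function name and the handling of identical values are assumptions.

```python
def min_max_normalize(values: list[float]) -> list[float]:
    """Scale values linearly to [0, 1] over the whole list (min-max scaling)."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: when every log has the same dissimilarity, map all to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

The short-term and long-term dissimilarities would each be normalized separately, for example `min_max_normalize(short_values)` and `min_max_normalize(long_values)`, so that the two become comparable.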
For both dissimilarities, a smaller value means more similar data. If the normalized short-term dissimilarity $a_j$ is the smaller of the two, shorter to-be-encoded and dictionary regions give a better result with the LZ77 compression algorithm. If the normalized long-term dissimilarity $b_j$ is the smaller of the two, longer to-be-encoded and dictionary regions give a better compression effect; however, an excessive length increases the amount of computation and slows down encoding, so a moderate length must be found.
The initial to-be-encoded region length is a preset value. The first to-be-encoded region length is obtained from it by comparing $a_j$, the normalized value of the short-term dissimilarity of the j-th energy storage log data, with $b_j$, the normalized value of the long-term dissimilarity of the j-th energy storage log data.
The adjustment length is then obtained from the first to-be-encoded region length together with $a_j$ and $b_j$, using a minimum function and an exponential function with the natural constant as its base.
The larger the difference between the long-term dissimilarity and the short-term dissimilarity, the more strongly the data leans toward the side with the smaller dissimilarity, and the smaller the subsequent adjustment range.
The adjustment length acts differently on different first to-be-encoded region lengths. An adjustment parameter is therefore obtained from the normalized dissimilarities, and the target to-be-encoded region length is obtained from the adjustment parameter, the first to-be-encoded region length and the adjustment length, according to the following formula:

$$p_j = \begin{cases} 1, & a_j \ge b_j \\ -1, & a_j < b_j \end{cases} \qquad W_j = \left[\, W_j^{(1)} + p_j \cdot \Delta W_j \,\right]$$

where $a_j$ is the normalized value of the short-term dissimilarity of the j-th energy storage log data, $b_j$ is the normalized value of the long-term dissimilarity of the j-th energy storage log data, $p_j$ is the adjustment parameter of the j-th energy storage log data, $W_j^{(1)}$ is the first to-be-encoded region length, $\Delta W_j$ is the adjustment length, $[\cdot]$ denotes rounding the result in brackets, and $W_j$ is the to-be-encoded region length of the j-th energy storage log data.
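The sign rule and the rounding step can be sketched as below. The first to-be-encoded region length and the adjustment length are taken as given inputs. The function names are hypothetical, and round-half-up is an assumption, since the source only says the result is rounded.

```python
import math


def adjustment_parameter(short_norm: float, long_norm: float) -> int:
    """Return 1 when the normalized short-term dissimilarity is >= the long-term one, else -1."""
    return 1 if short_norm >= long_norm else -1


def target_length(first_length: int, adjust_length: float,
                  short_norm: float, long_norm: float) -> int:
    """Rounded sum of the first length and the signed adjustment length."""
    p = adjustment_parameter(short_norm, long_norm)
    # Round half up (assumption); Python's built-in round() rounds halves to even.
    return int(math.floor(first_length + p * adjust_length + 0.5))
```

For example, with a first length of 16 and an adjustment length of 3.4, a log that is more similar in the long term (`short_norm >= long_norm`) gets a longer region, and one that is more similar in the short term gets a shorter region.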
As an illustration, when the energy storage log data is clearly more similar in the short term, the to-be-encoded region length should stay close to the length preset for that case. When the energy storage log data is more similar in the long term but the two dissimilarities differ only slightly, the adjustment of the to-be-encoded region length is larger.
Note that when the long-term and short-term dissimilarities of the energy storage log data are equal, the smaller to-be-encoded region length is selected by default. If the energy storage log data is entirely identical, its short-term dissimilarity and long-term dissimilarity are both 0; such a log compresses well without any adjustment, so it is compressed with the default to-be-encoded region length.
After the to-be-encoded region length is obtained, the dictionary region length is set to twice the to-be-encoded region length, which fixes both the dictionary region and the to-be-encoded region. The energy storage log data is then compressed with the LZ77 algorithm based on the dictionary region and the to-be-encoded region to obtain the compressed data.
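To show how the two lengths enter LZ77, here is a small byte-level sketch in which the dictionary region (search buffer) is twice the to-be-encoded region (lookahead buffer). The token layout `(offset, length, next_byte)` is one common textbook formulation and operating on bytes rather than bits is a simplification; both are assumptions, as are the function names.

```python
def lz77_compress(data: bytes, look_len: int) -> list[tuple[int, int, int]]:
    """Greedy LZ77 with lookahead `look_len` and search buffer 2 * look_len."""
    if look_len < 1:
        raise ValueError("look_len must be at least 1")
    dict_len = 2 * look_len  # dictionary region is twice the to-be-encoded region
    tokens = []
    pos = 0
    while pos < len(data):
        start = max(0, pos - dict_len)
        best_off, best_len = 0, 0
        # Keep one byte in reserve so every token carries a literal.
        max_len = min(look_len - 1, len(data) - pos - 1)
        for cand in range(start, pos):
            length = 0
            while length < max_len and data[cand + length] == data[pos + length]:
                length += 1
            if length > best_len:
                best_off, best_len = pos - cand, length
        tokens.append((best_off, best_len, data[pos + best_len]))
        pos += best_len + 1
    return tokens


def lz77_decompress(tokens: list[tuple[int, int, int]]) -> bytes:
    """Rebuild the original bytes from (offset, length, next_byte) tokens."""
    out = bytearray()
    for off, length, nxt in tokens:
        for _ in range(length):
            out.append(out[-off])  # byte-by-byte copy handles overlapping matches
        out.append(nxt)
    return bytes(out)
```

A larger `look_len` lets the encoder find longer matches in repetitive data at the price of a larger search per position, which is the trade-off the adaptive length is meant to balance.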
So far, compressed data is acquired.
Step S004, acquiring the utilization rate of the last disk cell according to the size of the energy storage log compressed data and the format unit size of the disk; acquiring the disk suitability of each energy storage log compressed data on each disk according to the size of the energy storage log compressed data, the format unit size of the disk and the utilization rate of the last disk cell; opening up storage spaces on a plurality of independent devices, and storing each energy storage log compressed data in the storage space indicated by its disk suitability, thereby completing the distributed storage.
Before a disk is used it is formatted. Formatting divides the storage space into small cells, and the size of each cell is the format unit; a file can only be stored in blank cells. If the format unit of a disk is 32 KB and a file is only 1 KB, the file still occupies a whole 32 KB cell: only 1 KB holds data, and the remaining 31 KB is wasted and cannot be used to store other files. To maximize the utilization of storage space and reduce the proportion of wasted space, files of different sizes are stored on different disks.
For each energy storage log data, the compressed size is recorded as $S_j$, and the number of disks is M. The utilization of the last disk grid occupied on a disk is obtained from the compressed size and the format unit as follows:

$$U_{j,v} = \frac{S_j \bmod F_v}{F_v}$$

where $S_j$ denotes the size of the j-th energy storage log after data compression, $F_v$ denotes the format unit size of the v-th disk, and $U_{j,v}$ denotes the utilization of the last disk grid of the disk.
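The utilization of the last disk grid can be sketched in Python as the remainder of the compressed size divided by the format unit, expressed as a fraction of the format unit. The function and parameter names are illustrative assumptions, and sizes are taken to be in bytes.

```python
def last_grid_utilization(compressed_size: int, format_unit: int) -> float:
    """Fraction of the last occupied grid that holds data.

    compressed_size: size of the compressed log in bytes.
    format_unit: grid (format unit) size of the disk in bytes.
    """
    return (compressed_size % format_unit) / format_unit


# A 1 KB file on a disk with a 32 KB format unit fills 1/32 of its grid.
print(last_grid_utilization(1 * 1024, 32 * 1024))   # 0.03125
```

Taken literally, the remainder is 0 when the compressed size is an exact multiple of the format unit, so the sketch returns 0 in that case even though the last grid is full; the text does not say how this case is treated.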
The smaller the format unit of a disk, the higher its space utilization, but the more disk grids a file occupies, and occupying more grids slows down reading. A trade-off is therefore made between the number of disk grids occupied by the file and the space utilization of the last disk grid, and the disk suitability is calculated according to the following formula:
In the formula, the inputs are the size of the j-th energy storage log after data compression and the format unit size of the v-th disk; the formula combines the utilization of the last disk grid of the disk with the ratio of compressed size to format unit, rounded down, and the result is the disk suitability of the v-th disk. The rounded-down ratio is the number of disk grids occupied, and the smaller it is the better: the table of pointers to grid addresses is then small and the file is read quickly. The utilization of the last occupied disk grid should be as high as possible, so that the disk space is fully used. In this embodiment, because of the particular way energy storage data is stored, disk storage efficiency is emphasized and the requirement on reading speed is lower, so an ln function is applied in the denominator to reduce the influence of the number of occupied grids on the overall suitability. The larger the disk suitability, the more suitable the disk is for storing the compressed energy storage log data.
The disk suitability is calculated for each independent device according to its format unit. Each independent device stores the energy storage log data of each day, and a storage area is opened up on a disk of each independent device for this purpose. Because different devices have different storage capacities and different amounts of remaining available space, space is allocated on each independent device in proportion to its available space, which gives each device a space for storing energy storage log data. The disk suitability of the energy storage log data is then calculated for each space, and the compressed energy storage log data is assigned to the space with the largest disk suitability. If two or more energy storage log data correspond to the same space at the same time, they are handled in time order, and the later ones are placed in the remaining space with the largest disk suitability.
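The assignment described above might be sketched in Python as a greedy loop. Several points are assumptions rather than statements of the embodiment: the suitability function is passed in as a parameter, conflicts are resolved by letting earlier logs choose first, a space is skipped once it cannot hold the whole grids a log needs, and the names `Space`, `assign_logs` and `stand_in_suitability` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Space:
    name: str          # hypothetical label for a device's storage space
    format_unit: int   # grid size in bytes
    free_bytes: int    # remaining capacity of the space


def grid_bytes(size: int, unit: int) -> int:
    """Bytes consumed on disk: whole grids only (ceiling division)."""
    return -(-size // unit) * unit


def assign_logs(logs, spaces, suitability):
    """logs: iterable of (timestamp, log_id, compressed_size)."""
    placement = {}
    for _, log_id, size in sorted(logs):          # earlier logs choose first
        fits = [s for s in spaces
                if s.free_bytes >= grid_bytes(size, s.format_unit)]
        if not fits:
            placement[log_id] = None              # no space can hold this log
            continue
        best = max(fits, key=lambda s: suitability(size, s.format_unit))
        best.free_bytes -= grid_bytes(size, best.format_unit)
        placement[log_id] = best.name
    return placement


def stand_in_suitability(size, unit):
    # NOT the suitability formula of the embodiment: only the last-grid
    # utilization, used here as a placeholder so the sketch can run.
    return (size % unit) / unit
```

Passing the suitability as a function keeps the allocation logic separate from the scoring formula, so the placeholder can be swapped for the actual formula without touching the loop.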
It should be noted that: the sequence of the embodiments of the present invention is only for description, and does not represent the advantages and disadvantages of the embodiments. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.
Claims (10)
1. The distributed storage method for the energy storage data of the energy storage power station is characterized by comprising the following steps of:
acquiring energy storage data of an energy storage power station through a sensor, and acquiring energy storage log data, the number of magnetic disks and the format unit size of the magnetic disks according to the energy storage data;
grouping and marking each energy storage log data into short-term groups according to a first preset value, and acquiring short-term dissimilarity of the energy storage log data according to the Hamming distance of the adjacent short-term groups; grouping and marking each energy storage log data into a long-term group according to a second preset value, and acquiring long-term dissimilarity of the energy storage log data according to the DTW distance of the adjacent long-term group;
acquiring the length of a first region to be encoded and an adjustment length according to the short-term dissimilarity and the long-term dissimilarity; acquiring an adjustment parameter according to the short-term dissimilarity and the long-term dissimilarity, acquiring the length of a target region to be encoded according to the adjustment parameter, the adjustment length and the length of the first region to be encoded, taking twice the length of the target region to be encoded as the dictionary region length, and compressing the energy storage log data according to the length of the target region to be encoded and the dictionary region length to acquire energy storage log compressed data;
acquiring the utilization rate of the disk lattice according to the size of the compressed data of the energy storage log and the size of the format unit of the disk; acquiring the disk suitability of each energy storage log compressed data on each disk according to the size of the energy storage log compressed data, the format unit size of the disk and the utilization rate of the disk grid; and respectively opening up storage spaces by a plurality of independent devices, and storing the compressed data of the energy storage logs in the storage spaces according to the disk suitability of the compressed data of each energy storage log and the storage spaces to finish distributed storage.
2. The method for storing energy storage data of an energy storage power station in a distributed manner according to claim 1, wherein the method for obtaining the energy storage log data according to the energy storage data is as follows:
encoding the energy storage data in UTF-8 and storing it, and recording the encoding obtained from each day's energy storage data in the acquired energy storage data as energy storage log data.
3. The method for storing energy storage data in a distributed manner in an energy storage power station according to claim 1, wherein the method for obtaining short-term dissimilarity of energy storage log data according to hamming distances of adjacent short-term groups comprises:
forming a short-term sequence from all binary codes in each short-term group, recording any short-term sequence as a standard short-term sequence, calculating the Hamming distance between the standard short-term sequence and the adjacent following short-term sequence as a first distance of the standard short-term sequence, and averaging the first distances of all short-term sequences except the last one to obtain the short-term dissimilarity of the energy storage log data.
4. The method for storing energy storage data of an energy storage power station in a distributed manner according to claim 1, wherein the method for obtaining long-term dissimilarity of energy storage log data according to DTW distances of adjacent long-term groups is as follows:
forming a long-term sequence from all binary codes in each long-term group, recording any long-term sequence as a standard long-term sequence, calculating the DTW distance between the standard long-term sequence and the adjacent following long-term sequence as a second distance of the standard long-term sequence, and averaging the second distances of all long-term sequences except the last one to obtain the long-term dissimilarity of the energy storage log data.
5. The method for storing energy storage data of an energy storage power station according to claim 1, wherein the method for obtaining the length of the first region to be encoded and the adjustment length according to the short-term dissimilarity and the long-term dissimilarity comprises the following steps:
comparing the normalized value of the short-term dissimilarity with the normalized value of the long-term dissimilarity; if the normalized value of the short-term dissimilarity is greater than or equal to the normalized value of the long-term dissimilarity, the length of the initial region to be encoded is a preset value […]; otherwise the length of the initial region to be encoded is a preset value […]; the length of the initial region to be encoded is taken as the length of the first region to be encoded, and the adjustment length is obtained according to the difference between the normalized value of the short-term dissimilarity and the normalized value of the long-term dissimilarity and the length of the first region to be encoded.
6. The method for storing energy storage data in an energy storage power station according to claim 5, wherein the method for obtaining the adjustment length according to the difference between the normalized value of short-term dissimilarity and the normalized value of long-term dissimilarity and the length of the first region to be encoded is as follows:
In the formula, the quantities are the normalized value of the short-term dissimilarity of the j-th energy storage log data, the normalized value of the long-term dissimilarity of the j-th energy storage log data, and the length of the first region to be encoded; min(·) denotes the minimum function, exp(·) denotes the exponential function with the natural constant as its base, and the result is the adjustment length.
7. The method for distributed storage of energy storage data in an energy storage power station according to claim 1, wherein the method for obtaining the adjustment parameters according to short-term dissimilarity and long-term dissimilarity comprises the following steps:
comparing the normalized value of the short-term dissimilarity with the normalized value of the long-term dissimilarity, and if the normalized value of the short-term dissimilarity is greater than or equal to the normalized value of the long-term dissimilarity, setting the adjustment parameter to be 1; otherwise, let the adjustment parameter be-1.
8. The distributed storage method of energy storage data of an energy storage power station as claimed in claim 1, wherein the method for obtaining the target length of the region to be encoded according to the adjustment parameter, the adjustment length and the first length of the region to be encoded is as follows:
and marking the product of the adjustment parameter and the adjustment length as a first product, and taking the rounded value of the sum of the first product and the first region length to be encoded as the target region length to be encoded of the energy storage log data.
9. The distributed storage method of energy storage data of an energy storage power station as claimed in claim 1, wherein the method for obtaining the utilization rate of the disk cells according to the size of the compressed data of the energy storage log and the size of the format unit of the disk is as follows:
taking the remainder of the size of the energy storage log compressed data divided by the format unit size of the disk, and taking the ratio of the remainder to the format unit size of the disk as the utilization rate of the disk grid.
10. The method for storing the energy storage data of the energy storage power station in a distributed manner according to claim 1, wherein the method for obtaining the disk suitability of each energy storage log compressed data on each disk according to the size of the energy storage log compressed data, the format unit size of the disk and the use rate of the disk grid is as follows:
In the formula, the quantities are the size of the j-th energy storage log after data compression, the format unit size of the v-th disk, and the utilization of the last disk grid of the disk; the result is the disk suitability of the v-th disk.
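The dissimilarity measures recited in claims 3 and 4 can be illustrated with a short Python sketch: the log is turned into a bit sequence, cut into equal groups, and the distance between each group and the next is averaged. The helper names, the choice to drop an incomplete trailing group, and the use of absolute difference as the local DTW cost are assumptions; the group sizes stand in for the first and second preset values.

```python
def to_bits(data: bytes):
    """Expand bytes into a list of 0/1 values, most significant bit first."""
    return [(b >> k) & 1 for b in data for k in range(7, -1, -1)]


def groups(bits, size):
    """Cut the bit list into groups of `size`; an incomplete tail is dropped."""
    return [bits[i:i + size] for i in range(0, len(bits) - size + 1, size)]


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def dtw(a, b):
    """Classic dynamic time warping distance with |x - y| as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def dissimilarity(bits, size, distance):
    """Average distance between each group and the group that follows it."""
    g = groups(bits, size)
    if len(g) < 2:
        return 0.0
    return sum(distance(g[i], g[i + 1]) for i in range(len(g) - 1)) / (len(g) - 1)
```

Calling `dissimilarity(bits, short_size, hamming)` with a small group size and `dissimilarity(bits, long_size, dtw)` with a larger one gives the two values that are then normalized and compared. Unlike the Hamming distance, DTW tolerates small shifts: `[0, 1, 1]` and `[0, 0, 1]` differ in one position but have a DTW distance of 0.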
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311541200.3A CN117271469B (en) | 2023-11-20 | 2023-11-20 | Energy storage data distributed storage method of energy storage power station |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311541200.3A CN117271469B (en) | 2023-11-20 | 2023-11-20 | Energy storage data distributed storage method of energy storage power station |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117271469A true CN117271469A (en) | 2023-12-22 |
CN117271469B CN117271469B (en) | 2024-02-02 |
Family
ID=89202883
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311541200.3A Active CN117271469B (en) | 2023-11-20 | 2023-11-20 | Energy storage data distributed storage method of energy storage power station |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117271469B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN117271469B (en) | 2024-02-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||