A kind of lossless compress of general information and decompression method
Technical field
The present invention relates to a kind of lossless compress and decompression method of data, be specifically related to a kind of lossless compress and decompression method of general information.
Background technology
Lossless compress is to utilize the statistical redundancy of data to compress, and can reply initial data fully and does not cause any distortion.Now, the data volume of various information systems is increasing, how sooner, more, transmission and harmless storage data become the matter of utmost importance of processing data information better, the data lossless compress technique then is the important method that addresses this problem.
The data lossless compress technique is applied to every field already, but compression ratio is the one theory that receives the data statistics redundancy, and the information after the compression of softwares such as at present popular Winzip, Winrar all can not be carried out lossless compress once more.
Summary of the invention
The technical problem that the present invention will solve provide a kind of not only can lossless compress to common file format data, and the compression and the decompression method that also can further recompress the data message that compressed through compressed software commonly used; The shared space of file by compression method compression according to the invention obtains is littler.
For solving the problems of the technologies described above, the present invention adopts following technical scheme:
A kind of lossless compression method of general information, its step is following:
1) read source file with binary mode, the gained binary sequence deposits among the newly-built array A, and first binary code among the array A deposited among the variable a;
2) with all " 0 " that occurred in the length replacement of " continue 0 be 1 until next bit " among the array A " continue 0 be 1 until next bit "; With all " 1 " that occurred in the length replacement of " continue 1 be 0 until next bit " among the array A " continue 1 be 0 until next bit ", institute's calling sequence deposits among the newly-built array B;
3) the different element that occurs among the statistics array B, the element that these are different deposits among the newly-built array C by the order that occurs;
4) element among the array C is pressed ascending sort, the sequence after the ordering deposits among the newly-built array D;
5) use digital n to replace n element value among the array D, institute's calling sequence deposits among the newly-built array E;
6) with element among element among the array D and the array E according to one to one the relation, elements all among the array B is replaced, institute's calling sequence deposits among the newly-built array F;
7) all elements among the array F is done the above-mentioned the 2nd) inverse transformation of step, wherein first code among the array F is the code of being deposited among the variable a, the binary code sequence after the inverse transformation deposits among the newly-built array G;
8) element among the array D being carried out " consequent deduct adjacent preceding paragraph " handles; The result deposits among the newly-built array H; Seek among the array H first more than or equal to 2 element position; If the position of this element in array H is n, preceding n element among the array D all to be deleted, remaining order of elements deposits among the newly-built array I;
9) preserving array I and G is binary file, and this binary file is the file after the source file lossless compress.
Wherein:
In the step 5), digital n begins counting with 1.
In the step 8), element position n begins counting with 1.
In the step 1), said source file can be the data message that uncompressed software compressed, and promptly common file format data can also be through PAQ, winrk, Winzip, the data message that compression such as winrar software compressed.
The decompression method of general information of the present invention, its step is following:
1) reads binary sequence among the array G in the file after the compression, first binary code among the array G deposited among the variable g;
2) with all " 0 " that occurred in the length replacement of " continue 0 be 1 until next bit " among the array G " continue 0 be 1 until next bit "; With all " 1 " that occurred in the length replacement of " continue 1 be 0 until next bit " among the array G " continue 1 be 0 until next bit ", institute's calling sequence deposits among the newly-built array F;
3) the different element that occurs among the statistics array F, the element that these are different deposits among the newly-built array E by the order that occurs;
4) element among the array E is pressed ascending sort, the sequence after the ordering deposits among the newly-built array D;
5) read array I in the file after the compression, establish total i element among the array I, the element among i the element use in back among the array D array I is replaced, the sequence after the replacement deposits among the newly-built array C;
6) with element among element among the array D and the array C according to one to one the relation, elements all among the array F is replaced, institute's calling sequence deposits among the newly-built array B;
7) all elements among the array B is done the above-mentioned the 2nd) inverse transformation of step; Wherein first code among the array B is the code of being deposited among the variable g; Binary code sequence after the inverse transformation deposits among the newly-built array A, and A saves as file with array, and this document is source file.
Compared with prior art; Compression method according to the invention not only can carry out lossless compress to common file format data; And also can further recompress the data message that compress through compressed software commonly used, the shared space of file that compression is obtained is littler, can on the basis of original compression ratio, remove 1~80% redundancy (the amount of redundancy situation of removal is looked the redundant difference of file internal data and difference) again; And compression method algorithm according to the invention is simple, is easy to realize.
Description of drawings
Fig. 1 is for carrying out the flow chart of lossless compress to source file;
The flow chart of Fig. 2 for compressed file is decompressed.
Label is among the figure:
101, read source file with binary mode, the gained binary sequence deposits among the newly-built array A, and first binary code among the array A deposited among the variable a;
102, with all " 0 " that occurred in the length replacement of " continue 0 be 1 until next bit " among the array A " continue 0 be 1 until next bit "; With all " 1 " that occurred in the length replacement of " continue 1 be 0 until next bit " among the array A " continue 1 be 0 until next bit ", institute's calling sequence deposits among the newly-built array B;
103, the different element that occurs among the statistics array B, the element that these are different deposits among the newly-built array C by the order that occurs;
104, the element among the array C is pressed ascending sort, the sequence after the ordering deposits among the newly-built array D;
105, use digital n to replace n element value among the array D, institute's calling sequence deposits among the newly-built array E;
106, with element among element among the array D and the array E according to one to one the relation, elements all among the array B is replaced, institute's calling sequence deposits among the newly-built array F;
107, all elements among the array F is done the above-mentioned the 2nd) inverse transformation of step, wherein first code among the array F is the code of being deposited among the variable a, the binary code sequence after the inverse transformation deposits among the newly-built array G;
108, element among the array D being carried out " consequent deduct adjacent preceding paragraph " handles; The result deposits among the newly-built array H; Seek among the array H first more than or equal to 2 element position; If the position of this element in array H is n, preceding n element among the array D all to be deleted, remaining order of elements deposits among the newly-built array I;
109, preserving array I and G is binary file, and this binary file is the file after the source file lossless compress.
201, read binary sequence among the array G in the file after the compression, first binary code among the array G deposited among the variable g;
202, with all " 0 " that occurred in the length replacement of " continue 0 be 1 until next bit " among the array G " continue 0 be 1 until next bit "; With all " 1 " that occurred in the length replacement of " continue 1 be 0 until next bit " among the array G " continue 1 be 0 until next bit ", institute's calling sequence deposits among the newly-built array F;
203, the different element that occurs among the statistics array F, the element that these are different deposits among the newly-built array E by the order that occurs;
204, the element among the array E is pressed ascending sort, the sequence after the ordering deposits among the newly-built array D;
205, read array I in the file after the compression, establish total i element among the array I, the element among i the element use in back among the array D array I is replaced, the sequence after the replacement deposits among the newly-built array C;
206, with element among element among the array D and the array C according to one to one the relation, elements all among the array F is replaced, institute's calling sequence deposits among the newly-built array B;
207, all elements among the array B is done the above-mentioned the 2nd) inverse transformation of step; Wherein first code among the array B is the code of being deposited among the variable g; Binary code sequence after the inverse transformation deposits among the newly-built array A, and A saves as file with array, and this document is source file.
Embodiment
Fig. 1 carries out the flow chart of lossless compress to source file for compression method according to the invention.As shown in Figure 1, the step of compression is: the file to compressing is read with binary mode, and the gained binary sequence deposits among the newly-built array A, and first binary code among the array A deposited among the variable a (step 101); With all " 0 " that occurred in the length replacement of " continue 0 be 1 until next bit " among the array A " continue 0 be 1 until next bit "; With all " 1 " that occurred in the length replacement of " continue 1 be 0 until next bit " among the array A " continue 1 be 0 until next bit ", institute's calling sequence deposits among the newly-built array B (step 102); Add up the different element that occurs among the array B then, the element that these are different deposits among the newly-built array C (step 103) by the order that occurs; By ascending sort, the sequence after the ordering deposits among the newly-built array D (step 104) with the element among the array C; Use digital n to replace n element value among the array D again, institute's calling sequence deposits among the newly-built array E, and wherein digital n begins counting (step 105) with 1; Afterwards, with element among element among the array D and the array E according to one to one the relation, elements all among the array B is replaced, institute's calling sequence deposits among the newly-built array F (step 106); All elements among the array F is done the above-mentioned the 2nd) inverse transformation of step, wherein first code among the array F is the code of being deposited among the variable a, the binary code sequence after the inverse transformation deposits among the newly-built array G (step 107); At last; Element among the array D is carried out " consequent deduct adjacent preceding paragraph " and handle, the result deposits among the newly-built array H, seeks among the array H first more than or equal to 2 element position; If the position of this element in array H is n; Preceding n element among the array D all deleted, and remaining order of elements deposits among the newly-built array I, and said element position n begins counting (step 108) with 1; Preserve array I and G, array I and G are the file (step 109) after the source file lossless compress.
Fig. 2 is for to the flow chart that decompresses of file through compression, and as shown in Figure 2, the step of decompression is: read the binary sequence among the array G in the file after the compression, first binary code among the array G deposited among the variable g (step 201); With all " 0 " that occurred in the length replacement of " continue 0 be 1 until next bit " among the array G " continue 0 be 1 until next bit "; With all " 1 " that occurred in the length replacement of " continue 1 be 0 until next bit " among the array G " continue 1 be 0 until next bit ", institute's calling sequence deposits among the newly-built array F (step 202); Then add up the different element that occurs among the array F, the element that these are different deposits among the newly-built array E (step 203) by the order that occurs; Then, by ascending sort, the sequence after the ordering deposits among the newly-built array D (step 204) with the element among the array E; Read the array I in the file after the compression, establish total i element among the array I, the element among i the element use in back among the array D array I is replaced, the sequence after the replacement deposits among the newly-built array C (step 205); Afterwards, with element among element among the array D and the array C according to one to one the relation, elements all among the array F is replaced, institute's calling sequence deposits among the newly-built array B (step 206); At last all elements among the array B is done the above-mentioned the 2nd) inverse transformation of step; Wherein first code among the array B is the code of being deposited among the variable g; Binary code sequence after the inverse transformation deposits among the newly-built array A, and A saves as file with array, and this document is source file (step 207).
The present invention will be described with instance below.
Suppose the file W that certain will compress,
One, the lossless compress process of above-mentioned W file is following:
1) binary code that reads out with binary mode:
A={001001011101001100000111010111110},a=0;
2) with all " 0 " that occurred in the length replacement of " continue 0 be 1 until next bit " among the array A " continue 0 be 1 until next bit "; With all " 1 " that occurred in the length replacement of " continue 1 be 0 until next bit " among the array A " continue 1 be 0 until next bit ", institute's calling sequence deposits among the newly-built array B;
B={21211311225311151};
3) the different element that occurs among the statistics array B, the element that these are different deposits among the newly-built array C by the order that occurs;
C={2135},
4) element among the array C is pressed ascending sort, the sequence after the ordering deposits among the newly-built array D;
D={1235};
5) use digital n to replace n element value among the array D, institute's calling sequence deposits among the newly-built array E;
E={1234};
6) with element among element among the array D and the array E according to one to one the relation, elements all among the array B is replaced, institute's calling sequence deposits among the newly-built array F;
F={21211311224311141};
7) all elements among the array F is done the above-mentioned the 2nd) inverse transformation of step, wherein first code among the array F is the code of being deposited among the variable a, the binary code sequence after the inverse transformation deposits among the newly-built array G;
G={0010010111010011000011101011110};
8) element among the array D being carried out " consequent deduct adjacent preceding paragraph " handles; The result deposits among the newly-built array H; Seek among the array H first more than or equal to 2 element position; If the position of this element in array H is n, preceding n element among the array D all to be deleted, remaining order of elements deposits among the newly-built array I;
H={112},I={5};
9) preserving array I and G is binary file, and this binary file is the file after the source file lossless compress.
Analyze: relatively array A and array G can find out that array G is than having lacked 2 binary codes among the array A, and promptly array G is than lacking the space that has accounted for 2 bits among the array A.The length of source file sequence A is 33 bits in this example, the length of compression back file sequence G is 31 bits, and meaning is that this algorithm has been removed among the A 6% redundant data, has saved 6% memory space.
Two, the decompression procedure of the W file after the compression:
1) read the binary sequence among the array G in the file after the compression:
G={0010010111010011000011101011110},g=0;
2) with all " 0 " that occurred in the length replacement of " continue 0 be 1 until next bit " among the array G " continue 0 be 1 until next bit "; With all " 1 " that occurred in the length replacement of " continue 1 be 0 until next bit " among the array G " continue 1 be 0 until next bit ", institute's calling sequence deposits among the newly-built array F;
F={21211311224311141};
3) the different element that occurs among the statistics array F, the element that these are different deposits among the newly-built array E by the order that occurs;
E={2134};
4) element among the array E is pressed ascending sort, the sequence after the ordering deposits among the newly-built array D;
D={1234};
5) read array I in the file after the compression, establish total i element among the array I, the element among i the element use in back among the array D array I is replaced, the sequence after the replacement deposits among the newly-built array C;
I={5},i=1,C={1235};
6) with element among element among the array D and the array C according to one to one the relation, elements all among the array F is replaced, institute's calling sequence deposits among the newly-built array B;
B={21211311225311151};
7) all elements among the array B is done the above-mentioned the 2nd) inverse transformation of step; Wherein first code among the array B is the code of being deposited among the variable g; Binary code sequence after the inverse transformation deposits among the newly-built array A, and A saves as file with array, and this document is source file;
A={001001011101001100000111010111110}。