KR101174598B1

KR101174598B1 - Apparatus for generating compression file using lossless data compression algorithm and method thereof

Info

Publication number: KR101174598B1
Application number: KR1020110067302A
Authority: KR
Inventors: 박근홍
Original assignee: 박근홍
Priority date: 2011-07-07
Filing date: 2011-07-07
Publication date: 2012-08-16
Anticipated expiration: 2031-07-07

Abstract

The present invention relates to an apparatus and a method for minimizing compression time in the time-consuming lossless data compression process. The apparatus, which generates a compressed file using a lossless data compression algorithm, comprises: a compression module that repeatedly compresses a file received from an input module with a compression algorithm, generates compressed data, and stores it as an output file in an output module; a determination module that determines the size of the input file fed to the compression module and the compression ratio of the output file written from the compression module to the output module; and a non-compression module that, according to the determination of the determination module, stores the file of the input module in the output module without passing it through the compression module. The method of generating a compressed file with this apparatus comprises the steps of: (a) the compression module reading data of a predetermined size from the input file to be compressed in the input module; (b) the compression module compressing the input data with the compression algorithm and storing the compressed data in the output file of the output module; (c) during the compression operation, after at least a predetermined amount of data has been processed by the compression module, the determination module comparing the data input to the compression module with the data output from it to determine whether the compression ratio is less than or equal to a set ratio; and (d) when the compression ratio is less than or equal to the set ratio, reading the remaining data from the input file of the input module and storing it, uncompressed, in the output file of the output module via the non-compression module. By recognizing incompressible data in real time, so that the compression process can be continued or abandoned partway through, the invention minimizes the time wasted on compression and allows archiving to be processed quickly.

Description

Apparatus for Generating Compression File using Lossless Data Compression Algorithm and Method Thereof

The present invention relates to an apparatus and a method for generating a compressed file, and more particularly to an apparatus and a method for minimizing compression time in the time-consuming lossless data compression process.

In general, data compression is a method of increasing the amount of data that can be stored in a given space, or of reducing the space needed to store a given amount of data. It reduces the amount of data by removing or encoding redundant blanks when data is transmitted or stored. In other words, it is a technique that saves storage space by removing unnecessary data such as gaps, blanks, and fields in order to shorten records or blocks. Data compression also means encoding data with fewer bits in order to reduce the size of a data set, such as a file or a communication message, or to shorten transmission time. Redundant bit strings or patterns in the data are removed and encoded with fewer bits or in a summary form, and decoding the summary form restores the original data. Data compression methods are divided into lossless and lossy methods. Lossless compression must be used for text, code data, and numerical data, whereas lossy compression may also be used for video and audio.

Compression consists of an encoding process, which converts large data into a smaller size, and a decoding process, which reads the stored data back and restores it to its original form. The ratio of the data size before encoding to the data size after encoding is called the compression ratio. Depending on the technique, there is lossless compression, which decodes the data exactly as it was without changing its content, and lossy compression, which achieves a higher compression ratio at the expense of some fine detail in the decoded data. Typical lossless compression algorithms include run-length encoding and Huffman coding. Lossy compression algorithms raise the compression ratio by exploiting the characteristics of human perception, so different algorithms are used for different kinds of data such as audio, still images, and video. The MPEG standard compression technologies are widely used.

Text files with extensions such as '*.htm' and '*.txt' are easy to compress and often achieve a high compression ratio. Files that are already compressed, for example known archive files such as '*.rar' and '*.zip' or multimedia files such as '*.mpg', '*.mp3', and '*.jpg', leave little room for further compression, so their size does not shrink much when they are compressed again. Their compression ratio is therefore low. Because header data is added, the file may even become larger.
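The contrast between compressible text and already-compressed data can be illustrated with a minimal Python sketch. It is not part of the patent: it assumes Python's standard `zlib` module as the lossless compressor, uses random bytes as a stand-in for already-compressed content, and `space_savings` is a hypothetical helper that measures the fraction of bytes saved.

```python
import os
import zlib


def space_savings(data: bytes) -> float:
    """Fraction of bytes saved by zlib compression (negative if the data grows)."""
    compressed = zlib.compress(data)
    return 1.0 - len(compressed) / len(data)


# Highly repetitive markup, similar in spirit to a *.htm or *.txt file.
text_like = b"<p>hello world</p>\n" * 5000
# Random bytes behave much like data that has already been compressed.
random_like = os.urandom(len(text_like))

text_savings = space_savings(text_like)
random_savings = space_savings(random_like)

print(f"text-like data:   {text_savings:.2%} saved")
print(f"random-like data: {random_savings:.2%} saved")
```

The random input typically comes out slightly larger than it went in, which matches the remark above that added header data can make a file grow.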

Moreover, recompressing a multimedia file with the AVI extension, or an already compressed data file such as a ZIP file, achieves almost no compression. However, depending on their internal algorithms, even these files sometimes compress well, so it is impossible to determine accurately from the file extension or the file header alone whether the data will compress.

In view of this situation, the present invention starts from the observation that when the front part of the data compresses poorly, the compression ratio is likely to remain low through to the end of the file. Its object is to drastically reduce the time taken by the time-consuming lossless data compression process.

To achieve the above object, the apparatus of the present invention for generating a compressed file using a lossless data compression algorithm comprises: a compression module that repeatedly compresses a file received from an input module with a compression algorithm, generates compressed data, and stores it as an output file in an output module; a determination module that determines the size of the input file fed to the compression module and the compression ratio of the output file written from the compression module to the output module; and a non-compression module that, according to the determination of the determination module, stores the file of the input module in the output module without passing it through the compression module.

In addition, in the present invention, the determination module may be included in the compression module.

In addition, in the present invention, the non-compression module may include a fast compression algorithm.

In addition, the method of the present invention for generating a compressed file using a lossless data compression algorithm comprises the steps of: (a) the compression module reading data of a predetermined size from the input file to be compressed in the input module; (b) the compression module compressing the input data with the compression algorithm and storing the compressed data in the output file of the output module; (c) during the compression operation, after at least a predetermined amount of data has been processed by the compression module, the determination module comparing the data input to the compression module with the data output from it to determine whether the compression ratio is less than or equal to a set ratio; and (d) when the compression ratio is less than or equal to the set ratio, the determination module causing the remaining data to be read from the input file of the input module and stored, uncompressed, in the output file of the output module via the non-compression module.

In addition, in the present invention, the determination module may set a flag by determining, according to the compression ratio of the compression module, whether to continue compression through the compression module or to move the input file to the output file through the non-compression module and store it.

In addition, in the present invention, the determination module may select the operation of the compression module or the non-compression module by analyzing the header of the original input file of the input module.

By the above means, the present invention recognizes incompressible data in real time so that the compression process can be continued or abandoned partway through, thereby minimizing the time wasted on compression and allowing archiving to be processed quickly.

FIG. 1 is a block diagram showing an apparatus for generating a compressed file using a lossless data compression algorithm according to the present invention.
FIG. 2 is a flowchart showing a method of generating a compressed file using a lossless data compression algorithm according to the present invention.

Hereinafter, the apparatus for generating a compressed file using a lossless data compression algorithm according to the present invention will be described in detail with reference to the accompanying drawings.

Referring to FIG. 1, the present invention is used when data is compressed into a file on a computer or similar device running an operating system (OS). In particular, it is an apparatus that generates a compressed file using a lossless data compression algorithm.

The input module (10) stores a plurality of input files to be compressed, for example multimedia files such as AVI, JPG, and MP3 files, already compressed files such as ZIP and RAR files, and various program or document files. The output module (14) stores the plurality of compressed output files.

The compression module (12) repeatedly performs the process of compressing a file received from the input module (10) with a compression algorithm to generate compressed data and storing the generated compressed data as an output file in the output module (14).

The determination module (20) determines the size of the input file fed to the compression module (12) and the compression ratio of the output file written from the compression module (12) to the output module (14). The determination module (20) may be included in the compression module (12). The determination module (20) preferably includes a function for analyzing the header of the original input file in the input module (10), including the compression information for the front part of the input file data contained in the header as well as the compression information for the rear part of the data. The determination module (20) computes and compares the sizes of the input file fed from the input module (10) to the compression module (12) and the data stored from the compression module (12) to the output module (14) to obtain the compression ratio. Based on this compression ratio and the set compression size, it selects or controls the operation of the compression module (12) and the non-compression module (30).
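The decision made by the determination module can be sketched as a small Python function. This is an illustrative assumption rather than the patent's implementation: the function name `should_abandon`, both default thresholds, and the choice to express the compression ratio as the fraction of bytes saved are all hypothetical.

```python
def should_abandon(bytes_in: int, bytes_out: int,
                   min_processed: int = 1 << 20,
                   min_savings: float = 0.02) -> bool:
    """Return True when compression should be given up.

    bytes_in      -- bytes fed to the compressor so far
    bytes_out     -- bytes the compressor has produced so far
    min_processed -- do not judge before this much input was processed (assumed value)
    min_savings   -- the "set ratio": minimum fraction of bytes saved (assumed value)
    """
    if bytes_in < min_processed:
        return False  # too little data processed to judge yet
    savings = 1.0 - bytes_out / bytes_in
    return savings <= min_savings


# 2 MiB in, 2 MiB out: nothing was saved, so compression is abandoned.
print(should_abandon(2 << 20, 2 << 20))
# 2 MiB in, 1 MiB out: half the bytes were saved, so compression continues.
print(should_abandon(2 << 20, 1 << 20))
```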

According to the determination of the determination module (20), the non-compression module (30) stores the file of the input module (10) in the output module (14) as it is, without passing it through the compression module (12). The non-compression module (30) may include a fast compression algorithm that reduces compression time or has a low compression ratio.
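The two behaviors of the non-compression module described above, plain pass-through and an optional fast algorithm, might look like the following Python sketch. The helper name `store_remaining` is hypothetical, and using `zlib` at its fastest level (level 1) as the "fast compression algorithm" is an assumption; the patent does not name a specific algorithm.

```python
import zlib


def store_remaining(chunk: bytes, use_fast: bool = False) -> bytes:
    """Return the bytes to write to the output file for one chunk.

    By default the chunk is passed through untouched. With use_fast=True
    it is compressed with zlib level 1, trading compression ratio for speed.
    """
    if use_fast:
        return zlib.compress(chunk, 1)
    return chunk


raw = store_remaining(b"abc")                         # stored as is
fast = store_remaining(b"a" * 1000, use_fast=True)    # quick, light compression
print(len(raw), len(fast))
```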

The compression module (12) and the non-compression module (30) may also be configured as a single module. That is, they may be configured as one module when the determination module (20) includes a conditional or decision statement that sets a do-not-compress flag.

The method of the present invention for generating a compressed file using a lossless data compression algorithm will now be described with reference to the flowchart of FIG. 2.

First, the compression module (12) reads data of a predetermined size from the input file to be compressed in the input module (10) (S1). The compression module (12) then compresses the input data using the compression algorithm it contains (S2). The compression algorithm is a lossless data compression algorithm. The compression module (12) stores the data compressed by the compression algorithm in the output file of the output module (14) (S3). Steps (S1) to (S3) are repeated until the compression operation of the compression module (12) is finished (S4).

While the compression module (12) is compressing, the determination module (20) determines whether at least a predetermined amount of data has been processed by the compression module (12) (S5). It then compares the data input to the compression module (12) with the data output from it to determine whether the compression ratio is less than or equal to the set ratio (S6). That is, once the amount of data processed by the compression module (12) reaches the predetermined size, the determination module (20) computes and compares the input and output data sizes to determine the compression ratio. The determination module (20) may judge that enough data has been processed as a fixed percentage of the total data size, for example 5% or 10%, or as a fixed fraction of the estimated total compression time. Steps (S5) and (S6) determine the compression ratio of the data in real time: the compression ratio is computed from the size of the input data and the size of the output data, and according to that ratio it is decided whether to continue compressing or to abandon the compression process. Accordingly, when the compression ratio is higher than the set ratio, the compression operation of the compression module (12) continues.
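As a small worked example of the percentage-based checkpoint, the following Python sketch computes how many bytes must be processed before the compression ratio is judged. The helper name `checkpoint_bytes` is hypothetical, and the file size used is simply a sample size for a file of about 700 MB.

```python
def checkpoint_bytes(total_size: int, percent: int) -> int:
    """Bytes to process before judging the compression ratio (integer arithmetic)."""
    return total_size * percent // 100


total = 734_337_976  # sample size in bytes for a file of roughly 700 MB
print(checkpoint_bytes(total, 5))   # judge after 5% of the file
print(checkpoint_bytes(total, 10))  # judge after 10% of the file
```

A smaller percentage gives up sooner on incompressible files but bases the decision on less evidence.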

Furthermore, according to the compression ratio of the compression module (12), the determination module (20) determines whether to continue compression through the compression module (12) or to move the input file to the output file as it is through the non-compression module (30), and sets a flag accordingly.

However, when the determination module (20) determines that the compression ratio is less than or equal to the set ratio, the remaining data is read from the input file of the input module (10) (S7), and the data read is stored as it is, without compression, in the output file of the output module (14) via the non-compression module (30) (S8). In steps (S7) and (S8), the determination module (20) abandons the compression operation of the compression module (12) and has the original file data stored unchanged in the output file. The determination module (20) also analyzes the header of the original input file of the input module (10) to select the operation of the compression module (12) or the non-compression module (30).

Because the lossless compression algorithm of the compression module (12) is then no longer applied, the remaining data does not go through steps (S1) to (S4), and very fast processing becomes possible.
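Steps (S1) to (S8) can be put together in a minimal Python sketch. This is one possible reading under stated assumptions, not the patented implementation: `zlib` stands in for the lossless algorithm, each chunk is compressed independently, the chunk size and both thresholds are illustrative, and the record framing (a 1-byte flag plus a 4-byte length before each chunk, so the output can be decoded again) is an addition that the patent does not describe. The names `pack` and `unpack` are hypothetical.

```python
import io
import os
import struct
import zlib

CHUNK = 64 * 1024            # bytes read per step (S1); illustrative value
MIN_PROCESSED = 256 * 1024   # input to process before judging (S5); illustrative
MIN_SAVINGS = 0.02           # the "set ratio" as fraction of bytes saved (S6); illustrative


def pack(src, dst) -> bool:
    """Copy src to dst, compressing chunks until compression looks futile.

    Returns True if compression was abandoned partway through.
    """
    bytes_in = bytes_out = 0
    give_up = False                                  # the do-not-compress flag
    while True:
        chunk = src.read(CHUNK)                      # S1, or S7 after giving up
        if not chunk:
            break
        if give_up:
            payload, flag = chunk, 0                 # S8: store the data as is
        else:
            payload, flag = zlib.compress(chunk), 1  # S2: lossless compression
            bytes_in += len(chunk)
            bytes_out += len(payload)
            # S5 and S6: judge only after enough data has been processed.
            if bytes_in >= MIN_PROCESSED and 1 - bytes_out / bytes_in <= MIN_SAVINGS:
                give_up = True
        dst.write(struct.pack("<BI", flag, len(payload)))  # S3: write the record
        dst.write(payload)
    return give_up


def unpack(src, dst) -> None:
    """Reverse pack(): decompress flagged records, copy raw ones unchanged."""
    while True:
        header = src.read(5)
        if not header:
            break
        flag, size = struct.unpack("<BI", header)
        payload = src.read(size)
        dst.write(zlib.decompress(payload) if flag else payload)


# Usage: random bytes stand in for an already-compressed file.
random_data = os.urandom(1024 * 1024)
archive = io.BytesIO()
gave_up = pack(io.BytesIO(random_data), archive)
archive.seek(0)
restored = io.BytesIO()
unpack(archive, restored)
print("abandoned compression:", gave_up)
print("round trip intact:", restored.getvalue() == random_data)
```

With these settings the random input is compressed only for its first 256 KiB; every later chunk is copied without calling the compressor, which is where the time saving comes from.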

As an example of the present invention, multimedia video files such as those with the *.AVI extension often contain data that is already compressed, so compressing them into the ZIP format frequently yields almost no reduction. The results of compressing an arbitrary 700 MB AVI file with the present invention applied are shown in the following table.

Table 1

| Category | Original file size (AVI) | Compressed file size (ZIP) | Compression ratio | Compression time |
| --- | --- | --- | --- | --- |
| General compression | 734,337,976 | 731,063,935 | 0.00444% | 47 seconds |
| Compression applying the present invention | 734,337,976 | 734,295,873 | 0.00005% | 15 seconds |

Table 1 shows that the compression ratios do not differ much, whereas the compression times differ considerably: 47 seconds for general compression versus 15 seconds with the present invention applied.

As described above, the present invention recognizes incompressible data in real time and abandons compression partway through the compression operation, which has the advantage of speeding up archiving (bundling several files into a single file while compressing the data at the same time).

Although the present invention has been shown and described above with reference to specific embodiments, anyone with ordinary skill in the art will readily appreciate that various modifications and changes are possible without departing from the spirit and scope of the invention as defined by the claims.

10: input module 12: compression module
14: output module 20: determination module
30: non-compression module

Claims

1. An apparatus for generating a compressed file using a lossless data compression algorithm, comprising:
a compression module that repeatedly performs a process of compressing a file input from an input module with a compression algorithm, generating compressed data, and storing the compressed data as an output file in an output module;
a determination module that determines a size of an input file input to the compression module and a compression ratio of the output file output from the compression module to the output module; and
a non-compression module that stores the file of the input module in the output module without passing through the compression module, according to the determination of the determination module.

2. The apparatus of claim 1, wherein the determination module is included in the compression module.

3. The apparatus of claim 1, wherein the non-compression module includes a fast compression algorithm.

4. A method for generating a compressed file using a lossless data compression algorithm, comprising the steps of:
(a) a compression module reading data of a predetermined size from an input file to be compressed in an input module;
(b) the compression module compressing the input data using a compression algorithm and storing the compressed data in an output file of an output module;
(c) during the compression operation of the compression module, after data of at least a predetermined size has been processed in the compression module, a determination module comparing the data input to the compression module with the data output from the compression module to determine whether the compression ratio is less than or equal to a set ratio; and
(d) when the compression ratio is less than or equal to the set ratio, the determination module causing the remaining data to be read from the input file of the input module and stored, without compression, in the output file of the output module via a non-compression module.

5. The method of claim 4, wherein the determination module sets a flag by determining, according to the compression ratio of the compression module, whether to continue compression through the compression module or to move the input file to the output file through the non-compression module and store it.

6. The method of claim 4, wherein the determination module analyzes the header of the original input file of the input module and selects the operation of the compression module or the non-compression module.