JPH0621827A

JPH0621827A - Data compressor and its method

Info

Publication number: JPH0621827A
Application number: JP19646492A
Authority: JP
Inventors: Yoshihisa Aotani; 嘉久青谷
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1992-06-30
Filing date: 1992-06-30
Publication date: 1994-01-28

Abstract

PURPOSE: To efficiently compress data in which identical characters occur consecutively, such as runs of characters in Japanese data encoded as 2-byte codes. CONSTITUTION: As compression preprocessing, a compression pre-processing section 12 converts data containing consecutive identical characters into the modular sum or difference of adjacent data, or into their exclusive OR, thereby producing data that repeats at the byte level. A compression post-processing section 13 then treats each one-byte character of the output of the compression pre-processing section 12 as one unit and, where several such characters are consecutive, produces compressed data 14 consisting of a special character indicating compression, the repeated character data, and the number of repetitions.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a data compression apparatus and method for compressing repeated characters in data in which identical characters occur consecutively, such as Japanese data composed of 2-byte codes.

[0002]

[Prior Art] A conventional data compression method of this type treats a one-byte character as one unit. Where several such characters are consecutive, it reduces the data by converting them into a special character indicating compression, the repeated character data, and the number of repetitions.

[0003] FIG. 5 shows the configuration of a conventional data compression apparatus. In FIG. 5, data to be compressed 1 is supplied to a compression processing section 2, which compresses it to produce compressed data 3. FIG. 6 shows the conversion procedure used in FIG. 5. In FIG. 6, a character counter Cc and a repeat counter Cr are first set to 0 (steps 60 and 61; steps are denoted by S in the figure). One character is read from the original data (step 62), and the character counter Cc is incremented by 1 (step 63). The value of the character counter Cc is then checked (step 64). In the first cycle this check always succeeds, and the character just read is stored in a buffer so that it can be determined whether the original data contains a run of four or more repeated characters (step 65). In the second and subsequent cycles, the character read from the original data is compared with the character stored in the buffer (step 66). If the current character equals the stored character, the repeat counter Cr is incremented by 1 (step 67) and the next character is read from the original data; compression is performed when four or more identical characters are repeated in this way. If the current character does not equal the stored character, the repeat counter Cr is compared with 4 (step 68). If it is smaller, no more than three identical characters were repeated, so no compression is performed (step 69). When the repeat counter is 4 or more, the compressed form is created, yielding the compressed data string shown in FIG. 7.
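The conventional procedure described above can be sketched as a byte-wise run-length encoder. This is only an illustrative sketch: the marker value `MARKER`, the one-byte count field, and the function name `rle_encode` are assumptions, since the text does not reproduce the layout of FIG. 7. The sketch also ignores the case where the marker byte itself occurs in the input.

```python
MARKER = 0x1B  # hypothetical special character signalling a compressed run
MIN_RUN = 4    # runs shorter than this are copied through unchanged

def rle_encode(data: bytes) -> bytes:
    """Replace runs of MIN_RUN or more identical bytes with (MARKER, byte, count)."""
    out = bytearray()
    i = 0
    while i < len(data):
        j = i
        # Extend the run while bytes match; cap at 255 so the count fits in one byte.
        while j < len(data) and data[j] == data[i] and j - i < 255:
            j += 1
        run = j - i
        if run >= MIN_RUN:
            out += bytes([MARKER, data[i], run])
        else:
            out += data[i:j]  # too short to be worth compressing
        i = j
    return bytes(out)

print(rle_encode(b"AAAAAAB").hex(" "))  # 1b 41 06 42
```

A run of three bytes such as `b"AAAB"` passes through unchanged, matching the threshold of four described for steps 68 and 69.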

[0004]

[Problem to be Solved by the Invention] With this conventional data compression method, the data size can be reduced when the same character appears consecutively in a data string made up of 1-byte code characters such as ASCII. In a data string made up of 2-byte code characters such as Shift JIS, however, adjacent bytes are not equal even when the same character appears consecutively, so the data size cannot be reduced.
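The problem can be seen directly by encoding a run of one 2-byte character and counting equal adjacent bytes. The sample text is an arbitrary choice for illustration and uses Python's standard `shift_jis` codec.

```python
text = "ああああ"  # the same 2-byte character repeated four times
raw = text.encode("shift_jis")

# Each character occupies two different bytes, so no two neighbours match.
adjacent_equal = sum(1 for a, b in zip(raw, raw[1:]) if a == b)

print(raw.hex(" "))    # 82 a0 82 a0 82 a0 82 a0
print(adjacent_equal)  # 0
```

Because no two neighbouring bytes are equal, a byte-wise run-length encoder finds no run to compress even though the text is highly repetitive.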

[0005] The present invention solves this problem in the prior art. Its object is to provide a data compression apparatus and method capable of efficiently compressing data in which identical characters occur consecutively, for example runs of characters in Japanese data encoded as 2-byte codes.

[0006]

[Means for Solving the Problem] To achieve the above object, the data compression apparatus of claim 1 comprises: pre-compression processing means that, as compression preprocessing for data in which identical characters occur consecutively, converts the data into the modular sum or difference of adjacent data, or into their exclusive OR, and thereby into data that repeats at the byte level; and post-compression processing means that treats each one-byte character of the output of the pre-compression processing means as one unit and, where several such characters are consecutive, produces compressed data converted into a special character indicating compression, the repeated character data, and the number of repetitions.

[0007] The data compression method of claim 2 reduces the data size relative to repeating the same character by converting data in which identical characters occur consecutively into a special character indicating compression, repeated character data, and a number of repetitions. As preprocessing for the compression, the modular sum or difference, or the exclusive OR, of adjacent bytes is taken, so that repeated characters in the data are identified as characters repeating at the byte level.

[0008] In claim 1 or 2, the repeated characters of the data in which identical characters occur consecutively are runs of characters in Japanese data encoded as 2-byte codes.

[0009]

[Operation] In the data compression apparatus and method of the present invention configured as above, data in which identical characters occur consecutively is first converted, as compression preprocessing, into the modular sum or difference of adjacent data, or into their exclusive OR, and thereby into data that repeats at the byte level. As a result, data in which identical characters occur consecutively, for example runs of characters in Japanese data encoded as 2-byte codes, is compressed efficiently.
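The embodiment below works through the exclusive OR case. As a complement, the modular-sum alternative named here might be sketched as follows. The function names, the initial value 0 for the previous byte, and the modulus 256 are assumptions made for this illustration; the source does not spell them out.

```python
def add_preprocess(data: bytes) -> bytes:
    """Replace each byte with the sum (mod 256) of itself and the previous input byte."""
    prev = 0  # assumed initial value
    out = bytearray()
    for cur in data:
        out.append((prev + cur) % 256)
        prev = cur
    return bytes(out)

def add_restore(data: bytes) -> bytes:
    """Invert add_preprocess by subtracting the previously restored byte (mod 256)."""
    prev = 0
    out = bytearray()
    for cur in data:
        val = (cur - prev) % 256
        out.append(val)
        prev = val
    return bytes(out)

raw = "ああああ".encode("shift_jis")  # 82 a0 repeated four times
print(add_preprocess(raw).hex(" "))   # 82 22 22 22 22 22 22 22
```

Since addition is commutative, every pair of neighbouring bytes in a repeated 2-byte character gives the same sum, which is what produces the byte-level run.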

[0010]

[Embodiment] An embodiment of the data compression apparatus and method of the present invention will now be described with reference to the drawings. FIG. 1 shows the configuration of the embodiment. The apparatus of FIG. 1 has a compression pre-processing section 12, which receives data to be compressed 10 consisting of Japanese data in a 2-byte code such as Shift JIS, and a compression post-processing section 13, which then treats each one-byte character as one unit and, where several characters are consecutive, converts them into a special character indicating compression, the repeated character data, and the number of repetitions. The compression post-processing section 13 outputs compressed data 14 converted into the special character, the repeated character data, and the number of repetitions.

[0011] The operation of this embodiment will now be described. The data to be compressed 10 is Japanese data in a 2-byte code such as Shift JIS. This Japanese data is converted by the compression pre-processing section 12 and then converted by the compression post-processing section 13 to obtain the compressed data 14.

[0012] FIG. 2 shows the detailed procedure in the compression pre-processing section 12. In FIG. 2, two memories for storing character data are first initialized (step 20). One byte is read from the data to be compressed and stored in memory 2 (step 21). The read is then checked (step 22): if it has not completed after a predetermined time has elapsed, the processing ends. If it has completed (Yes), the exclusive OR of memory 1 and memory 2 is computed and the result is output (steps 23 and 24). Memory 1 is then updated with the contents of memory 2, and the loop returns to read the next byte of the data to be compressed (step 25). This conversion turns data that did not repeat at the byte level into data that does.
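The loop of FIG. 2 can be sketched as follows. The function name `xor_preprocess` and the initial value 0 for memory 1 are assumptions; the text says only that the two memories are initialized. End of input stands in for the read check of step 22.

```python
def xor_preprocess(data: bytes) -> bytes:
    """XOR each input byte with the previous input byte (steps 20 to 25)."""
    mem1 = 0               # memory 1: previous input byte, assumed to start at 0
    out = bytearray()
    for mem2 in data:      # memory 2: byte just read (step 21)
        out.append(mem1 ^ mem2)  # compute and output the XOR (steps 23, 24)
        mem1 = mem2              # update memory 1 from memory 2 (step 25)
    return bytes(out)

raw = "あああ".encode("shift_jis")   # 82 a0 82 a0 82 a0
print(xor_preprocess(raw).hex(" "))  # 82 22 22 22 22 22
```

After the first byte, every output byte is 0x82 XOR 0xa0 = 0x22, so the alternating input becomes a single run that a byte-wise run-length encoder can compress.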

[0013] Next, when compressed data is to be restored, post-processing is performed that follows on from the processing of FIG. 2. FIG. 3 shows the procedure for restoring the compressed data. In FIG. 3, two memories for storing character data are first initialized (step 30). One byte is read from the input data and stored in memory 2 (step 31). The read is then checked (step 32): if it has not completed after a predetermined time has elapsed, the processing ends. If it has completed (Yes), the exclusive OR of memory 1 and memory 2 is computed and the result is output (steps 33 and 34). The computed result is then moved to memory 1 (step 35).
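The restoring loop of FIG. 3 differs from the forward loop only in what is kept in memory 1: the computed result rather than the byte just read. A minimal sketch, again assuming a function name (`xor_restore`) and an initial value of 0 that the source does not state:

```python
def xor_restore(data: bytes) -> bytes:
    """Undo adjacent-byte XOR preprocessing (steps 30 to 35)."""
    mem1 = 0               # memory 1: previously restored byte, assumed to start at 0
    out = bytearray()
    for mem2 in data:      # memory 2: byte just read (step 31)
        result = mem1 ^ mem2     # compute the XOR (step 33)
        out.append(result)       # output it (step 34)
        mem1 = result            # move the result to memory 1 (step 35)
    return bytes(out)

preprocessed = bytes([0x82, 0x22, 0x22, 0x22])
print(xor_restore(preprocessed).hex(" "))  # 82 a0 82 a0
```

This works because XOR is its own inverse: XORing a preprocessed byte with the previously restored byte recovers the original byte.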

[0014] With the above conversion, data that did not repeat at the byte level can be converted into data that does, as shown in FIG. 4. FIG. 4(a) shows the data to be compressed 10 input to the compression pre-processing section 12, and FIG. 4(b) shows the data string output from the compression pre-processing section 12 after the compression preprocessing. FIG. 4(c) shows the target compressed data 14 output from the compression post-processing section 13, which has been converted into a special character indicating compression, the consecutive repeated character data, and the number of repetitions.
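The three stages (a) to (c) can be illustrated end to end with a small sketch. The contents of FIG. 4 are not reproduced in the text, so the sample input, the marker value, the run threshold of 4, and all function names are assumptions made for this illustration only.

```python
MARKER = 0x1B  # hypothetical special character; its real value is not given
MIN_RUN = 4    # assumed threshold, borrowed from the prior-art description

def xor_preprocess(data: bytes) -> bytes:
    """Stage (a) to (b): XOR each byte with the previous input byte."""
    prev, out = 0, bytearray()
    for cur in data:
        out.append(prev ^ cur)
        prev = cur
    return bytes(out)

def rle_encode(data: bytes) -> bytes:
    """Stage (b) to (c): runs of MIN_RUN or more bytes become (MARKER, byte, count)."""
    out, i = bytearray(), 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i] and j - i < 255:
            j += 1
        if j - i >= MIN_RUN:
            out += bytes([MARKER, data[i], j - i])
        else:
            out += data[i:j]
        i = j
    return bytes(out)

stage_a = "ああああああ".encode("shift_jis")  # six 2-byte characters, 12 bytes
stage_b = xor_preprocess(stage_a)
stage_c = rle_encode(stage_b)

print(len(stage_a), len(stage_b), len(stage_c))  # 12 12 4
print(stage_c.hex(" "))                          # 82 1b 22 0b
```

In this example the preprocessing leaves the length unchanged but exposes an 11-byte run, which the run-length stage reduces to three bytes.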

[0015]

[Effect of the Invention] As described above, the data compression apparatus and method of the present invention first convert data in which identical characters occur consecutively, as compression preprocessing, into the modular sum or difference of adjacent data, or into their exclusive OR, and thereby into data that repeats at the byte level. This has the effect that data in which identical characters occur consecutively, for example runs of characters in Japanese data encoded as 2-byte codes, can be compressed efficiently.

[Brief description of drawings]

[FIG. 1] A block diagram showing the configuration of an embodiment of the data compression apparatus and method of the present invention.

[FIG. 2] A flowchart showing the procedure in the compression pre-processing section shown in FIG. 1.

[FIG. 3] A flowchart showing the procedure for restoring compressed data in the embodiment.

[FIG. 4] A diagram used to explain the operation of the embodiment, showing the contents of the target compressed data.

[FIG. 5] A block diagram showing the configuration of a conventional data compression apparatus.

[FIG. 6] A flowchart showing the conversion procedure in the configuration of FIG. 5.

[FIG. 7] A diagram showing a compressed data string produced by the conventional processing.

[Explanation of symbols]

10: data to be compressed
12: compression pre-processing section
13: compression post-processing section
14: compressed data

Claims

1. A data compression apparatus comprising: pre-compression processing means for converting, as compression preprocessing for data in which identical characters occur consecutively, the data into the modular sum or difference of adjacent data, or into their exclusive OR, and thereby into data that repeats at the byte level; and post-compression processing means for treating each one-byte character of the output of said pre-compression processing means as one unit and, where several such characters are consecutive, obtaining compressed data converted into a special character indicating compression, repeated character data, and a number of repetitions.

2. A data compression method that reduces the data size relative to repeating the same character by converting data in which identical characters occur consecutively into a special character indicating compression, repeated character data, and a number of repetitions, wherein, as preprocessing for the data compression, the modular sum or difference, or the exclusive OR, of adjacent bytes is taken, so that repeated characters of the data in which identical characters occur consecutively are identified as characters repeating at the byte level.

3. The data compression apparatus according to claim 1 or the data compression method according to claim 2, wherein the repeated characters of the data in which identical characters occur consecutively are runs of characters in Japanese data encoded as 2-byte codes.