JPH03240173A

JPH03240173A - Image data compressing system

Info

Publication number: JPH03240173A
Application number: JP2035583A
Authority: JP
Inventors: Yasuhiko Nakano; 泰彦中野; Shigeru Yoshida; 茂吉田; Yoshiyuki Okada; 佳之岡田; Hirotaka Chiba; 広隆千葉
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1990-02-16
Filing date: 1990-02-16
Publication date: 1991-10-25

Abstract

(57)【要約】本公報は電子出願前の出願データであるた
め要約のデータは記録されません。(57) [Summary] This bulletin contains application data before electronic filing, so abstract data is not recorded.

Description

【発明の詳細な説明】［概要］読取ライン走査で得られた２値画像データを圧縮する画
像データ圧縮方式に関し、統計的性質の異なる画像データが混在しても効率的に圧
縮してデータ量を低減することを目的と腰２値画像データを圧縮する時に、画像データの縦の複数
ラインを一括して、そのビットパタンとパタンか続く長
さの組で成る固定長の中間符号に変換する前処理を行っ
た後、中間符号列の隣接する符号間でのパタン遷移の出
現頻度（順位）のヒストグラムを作成し、出やすい遷移
パタン（例えば最高頻度のもの）をもつ中間符号だけ（
１つとは限らない）特定のパタン（例えば出現頻度の最
も低い中間符号のパタン）に置き換え、出にくいパタン
遷移をもつ中間符号はそのまま出力し、最終的に現時点
で符号化すべき変換符号列を、既に符号化済みの変換符
号列からの複製として圧縮符号化（ユニバーサル符号化
）するように構成する。[Detailed Description of the Invention] [Summary] Regarding an image data compression method that compresses binary image data obtained by scanning a reading line, the present invention relates to an image data compression method that compresses binary image data obtained by scanning a reading line. When compressing binary image data with the aim of reducing image data, multiple vertical lines of the image data are converted at once into a fixed-length intermediate code consisting of a pair of the bit pattern and the length of the pattern. After performing preprocessing, a histogram of the appearance frequency (rank) of pattern transitions between adjacent codes of the intermediate code string is created, and only intermediate codes with transition patterns that are likely to appear (for example, the one with the highest frequency) are
(not limited to one pattern) (for example, the pattern of the intermediate code with the lowest frequency of appearance), intermediate codes with pattern transitions that are difficult to appear are output as they are, and finally the converted code string to be encoded at the current time is The configuration is such that compression encoding (universal encoding) is performed as a copy of an already encoded transform code string.

［産業上の利用分野］本発明は、読取ライン走査で得られた２値画像データを
圧縮する画像データ圧縮方式に関する。[Industrial Application Field] The present invention relates to an image data compression method for compressing binary image data obtained by reading line scanning.

近年、ＯＡが発展し、文書中の文字コードや白黒２値画
像などの混在情報がレイアウト情報と共に６４フアクシ
ミリや光デイスクファイル・システムなどで扱われるよ
うになってきている。文書情報をディジタルデータとし
て利用するとき、画像情報のデータ量は、文字コード情
報に比べ非常に大きく１０数〜数１０倍となるので、蓄
積や伝送等で画像情報を効率良く扱うため、データ圧縮
を加えることでデータ量を減らすことが必須となる。In recent years, with the development of OA, mixed information such as character codes and black-and-white binary images in documents has come to be handled together with layout information in 64 facsimiles, optical disk file systems, and the like. When document information is used as digital data, the amount of image information is extremely large compared to character code information, several tens to several tens of times larger. Therefore, in order to efficiently handle image information during storage and transmission, data compression is required. It is essential to reduce the amount of data by adding .

［従来の技術］従来、データ圧縮方式は、画像データの圧縮と文字コー
ドの圧縮とに大別できる。[Prior Art] Conventionally, data compression methods can be roughly divided into image data compression and character code compression.

まず従来の画像データの圧縮方式としては、白黒２値画
像の代表的な圧縮方式としてＭＨ方式、ＭＭＲ方式、及
び予測符号化方式とが知られている。First, as conventional image data compression methods, the MH method, the MMR method, and the predictive encoding method are known as typical compression methods for black-and-white binary images.

■ＭＨ方式（Ｍｏｄｉｆｉｅｄ　Ｈｕｌｆｍａｎ　Ｃｏ
ｄｉｎｇ）２値画像の国際標準の１次元圧縮方式として
ＭＨ方式がある。この方式は、主走査方向に沿って、白
または黒の連続する長さ（Ｒｕｎ　Ｌｅｎｇｔｈ）をハ
フマン符号で可変長符号化して、データ圧縮するもので
ある。ハフマン符号は符号語数を減らすため、６４以下
の長さを表すターミネイティング符号と６４の倍数を表
すメイクアップ符号とで構成される。このＭＨ方式によ
り、通常の文書画像は数分の１に圧縮できる。■MH method (Modified Hulfman Co.
ding) The MH method is an international standard one-dimensional compression method for binary images. This method compresses data by variable-length encoding a continuous length of white or black (Run Length) along the main scanning direction using a Huffman code. In order to reduce the number of code words, the Huffman code is composed of a terminating code representing a length of 64 or less and a make-up code representing a multiple of 64. With this MH method, a normal document image can be compressed to a fraction of the original size.

■ＭＭＲ（Ｍｏｄｉｆｉｅｄ　Ｍｏｄｉｆｉｅｄ　ＲＥ
ＡＤ（ＲＥＩａｔｉｖｅ＾ｄｄｒｅｓｓ　Ｄｅｓｉｇｎ
ａｔｅ　ｃｏｄｉｎｇ））方式２値画像の国際標準の２
次元圧縮方式としてＭＭＲ方式がある。この方式は、主
走査方向にみていって白から黒、又は黒から白に変化す
る画素を変化画素と呼び、隣接する走査線間で変化画素
の表す白黒パターン境界ずれ（変化画素相対アドレス）
が小さいという変化画素の接続関係に着目してデータ圧
縮するものである。このＭＭＲ方式により、通常の文書
画像は数分の１から１０数分の１に圧縮できる。■MMR (Modified Modified RE
AD(REIative＾ddress Design
ate coding)) method 2 of the international standard for binary images
There is an MMR method as a dimension compression method. In this method, a pixel that changes from white to black or from black to white in the main scanning direction is called a changed pixel, and the black and white pattern boundary shift represented by the changed pixel between adjacent scanning lines (changed pixel relative address)
Data compression is performed by focusing on the connection relationship of pixels with a small change. By using this MMR method, a normal document image can be compressed to one-tenth to one-tenth.

一方、文字コードの効率の良いデータ圧縮方式としては
、２ｉｖ−Ｌｅｍｐｅｌ符号化に代表されるユニバーサ
ル符号化方式がある。On the other hand, as an efficient data compression method for character codes, there is a universal encoding method represented by 2iv-Lempel encoding.

２ｉｖ−Ｌｅｍｐｅｌ符号化（以下、ｒＺＬ符号化」と
略す）について簡単に説明する（詳しくは、例えば、宗
像「ｚｉｙ−Ｌｅｍｐｅｌのデータ圧縮法」、情報処理
。2iv-Lempel encoding (hereinafter abbreviated as rZL encoding) will be briefly explained (for details, see, for example, Munakata's "ziy-Lempel data compression method", information processing.

ｖａｔ、　２６．　Ｎｏ、　１．１９８５年を参照のこ
と）。vat, 26. No. 1.1985).

ｚｉｖ−Ｌｅｍｐｅｌ符号では、 ■ユニバーサル型と、 ■増分分解型（Ｉｎｃｒｅｍｅｎｔａｌ　ｐａｒｓｉｎ
ｇ）の２つのアルゴリズムが提案されている。The ziv-Lempel code has two types: ■Universal type and ■Incremental parsin type.
Two algorithms have been proposed: g).

■ユニバーサル型のアルゴリズムこのアルゴリズムは、演算量は多いが、高圧縮率が得ら
れる。符号化データを、過去のデータ系列の任意の位置
から一致する最大長の系列に区切リ（部分列）、過去の
系列の複製として符号化する方法である。■Universal algorithm This algorithm requires a large amount of calculations, but provides a high compression ratio. This is a method in which encoded data is segmented (subsequences) of the maximum length from an arbitrary position in a past data sequence and encoded as a copy of the past sequence.

第９図にユニバーサル型ＺＬ符号の符号器の原理図を示
す。まずＰバッファ３０には符号化済みの入力データが
格納されており、Ｑバッファ３２にはこれから符号化す
るデータが入力されている。FIG. 9 shows a principle diagram of a universal ZL code encoder. First, encoded input data is stored in the P buffer 30, and data to be encoded is input to the Q buffer 32.

Ｐバッファ３０の系列をＱバッファ３２の系列でサーチ
し、Ｐバッファ３０中で一致する最大長の部分列をもと
める。そして、Ｐバッファ３０中でこの最大長部分列を
指定するため次の情報の組を符号化する。The sequence in the P buffer 30 is searched with the sequence in the Q buffer 32 to find a matching partial sequence in the P buffer 30 that has the maximum length. The next set of information is then encoded in P-buffer 30 to specify this maximum length subsequence.

次にＱバッファ３２内の符号化した系列をＰバッファ３
０に移して新たなデータを得る。以下、同様の操作を繰
り返し、データを部分列に分解して符号化する。Next, the encoded sequence in the Q buffer 32 is transferred to the P buffer 3.
0 to obtain new data. Thereafter, similar operations are repeated to decompose the data into subsequences and encode them.

■増分分解型アルゴリズムこのアルゴリズムは、圧縮率はユニバーサル型より劣る
が、シンプルで、計算も容易であることが知られている
。増分分解型ＺＬ符号では、入力シンボルの系列をＸ＝ａａｂａｂａｂａａ　＠　・＠とすると、成分系列ｘ＝ＸＯＸＩ　Ｘ２　・＃− への増分分解は次のようにする。■Incremental decomposition algorithm This algorithm has a lower compression rate than the universal type, but it is known to be simple and easy to calculate. In the incrementally decomposed ZL code, if the input symbol sequence is X=aabababaa@*@, then the incremental decomposition into the component series x=XOXI X2*#- is performed as follows.

Ｘｉを既威分の右端に１シンボルを付は加えた最長の列
とし、Ｘ＝ａ１１ａｂＩＩａｂａＩＩｂａａａ・・・となる。Let Xi be the longest sequence with one symbol added to the right end of the existing part, and then X = a11abIIabaIIbaaa...

従って、ＸＯ＝λ（全列）　、　ＸＩ　＝ＸＯａ。Therefore, XO=λ (all columns), XI=XOa.

Ｘ２　＝ＸＩ　ｂ　　　、　Ｘ３　＝Ｘ２　ａ。X2 = XI b, X3 = X2 a.

Ｘ４＝ＸＯｂ　　　Ｘ５＝Ｘ１ａ　・・・と分解できる
。It can be decomposed as X4=XOb X5=X1a...

増分分解した各成分系列は既成分系列を用いて次のよう
な組で符号化する。Each incrementally decomposed component sequence is encoded as the following set using the existing component sequence.

増分分解型アルゴリズムは、符号化パターンについて、
過去に分解した部分列の内、最大長一致するものを求め
、過去に分解した部分列の複製として符号化するもので
ある。The incremental decomposition algorithm uses the encoding pattern to
Among the subsequences decomposed in the past, the one with the maximum length matching is found and encoded as a copy of the subsequence decomposed in the past.

即ち、ＺＬ符号では現在の文字コードの系列を、符号化
済みの過去の系列からの複製として符号化するものであ
る。このＺＬ符号を用いた場合、文字コードの文書情報
は、１／２程度に圧縮できる。That is, in the ZL code, the current character code sequence is encoded as a copy of the encoded past sequence. When this ZL code is used, character code document information can be compressed to about 1/2.

Ｅ発明が解決しようとする課題］従来の文書のディジタルデータのうち、画像情報は、同
一の色の画素が連続する長さを用いるＭＨ方式や、ＭＭ
Ｒ方式に代表される変化画素相対アドレスを用いる方式
によって圧縮されている。[Problems to be solved by the invention] Among the conventional digital data of documents, image information is processed using the MH method, which uses the length of consecutive pixels of the same color, and the MM method.
Compression is performed by a method using a relative address of a changed pixel, typified by the R method.

また、文字コード情報はＺＬ符号化に代表されるユニバ
ーサル符号化方式で圧縮することができる。Further, character code information can be compressed using a universal encoding method such as ZL encoding.

ユニバーサル符号化方式の名称は「万能の符号化方式」
を意味する。しかし、ＺＬ符号化方式を画像データに適
用した場合、１／２程度には圧縮できるが、ＭＭＲ方式
や予測符号化方式に比べると圧縮率は非常に低いものと
なる。The name of the universal encoding method is "universal encoding method"
means. However, when the ZL encoding method is applied to image data, it can be compressed to about 1/2, but the compression rate is very low compared to the MMR method or the predictive encoding method.

このため現状では、文字コードと２値画像データとは統
計的性質が大きく異なるため、一つの符号化方式で両者
をカバーするのは不利であり、文字コードや画像データ
といったデータの種類に合わせて符号化方式を切り換え
る方式が採用されている。For this reason, at present, character codes and binary image data have very different statistical properties, so it is disadvantageous to cover both with a single encoding method. A method of switching the encoding method is adopted.

本発明は、係る状況に鑑みて成されたもので、網点や線
画、文字画像、その画像などの統計的性質の異なる画像
データが混在しても効率的に圧縮してデータ量を低減で
きる画像データ圧縮方式を提供することを目的とする。The present invention was developed in view of the above situation, and is capable of efficiently compressing and reducing the amount of data even when image data with different statistical properties such as halftone dots, line drawings, character images, and their images are mixed. The purpose is to provide an image data compression method.

［課題を解決するための手段］第１図は本発明の原理説明図である。[Means to solve the problem] FIG. 1 is a diagram explaining the principle of the present invention.

まず本発明は、読取ライン走査で得られた２値画像を圧
縮する画像データ圧縮方式を対象とする。First, the present invention is directed to an image data compression method for compressing a binary image obtained by scanning a reading line.

このような画像データ圧縮方式につき本発明にあっては
、まず画像データを前処理手段１０に入力し、入力画像
データの縦の複数ライン（Ｎライン）を−括して、その
ビットパタン（２Ｎ種類）と該パタンが続く長さ（ＲＬ
）の組で成る固定長の中間符号に変換する。In the present invention for such an image data compression method, first, image data is input to the preprocessing means 10, a plurality of vertical lines (N lines) of the input image data are grouped, and the bit pattern (2N lines) is grouped. type) and the length of the pattern (RL
) into a fixed-length intermediate code.

次に変換手段１２において、前処理手段■０から得られ
た中間符号列に対し、隣接する中間符号間でのパタン遷
移の出現頻度を示すヒストグラム１６を作成し、このヒ
ストグラム１６に基づき現在処理中の中間符号２０に対
する１つ前の中間符号１８からのパタン遷移がでやすい
場合には該中間符号２０のパタンを他の特定のパタン例
えばＡに置き換え、出現しにくい場合には該中間符号２
０をそのまま出力する。Next, in the conversion means 12, a histogram 16 indicating the appearance frequency of pattern transitions between adjacent intermediate codes is created for the intermediate code string obtained from the preprocessing means 0, and based on this histogram 16, the current processing is If a pattern transition from the previous intermediate code 18 to the intermediate code 20 is likely to occur, the pattern of the intermediate code 20 is replaced with another specific pattern, such as A;
Outputs 0 as is.

そして最終的に、圧縮符号化手段１４において、変換手
段１２で得られた変換符号列を、既に圧縮符号化済みの
変換符号列からの複製として圧縮符号化手段（ユニバー
サル符号化手段）１４で符号化するように構成する。Finally, in the compression encoding means 14, the conversion code string obtained by the conversion means 12 is encoded by the compression encoding means (universal encoding means) 14 as a copy of the conversion code string that has already been compressed and encoded. Configure it so that it becomes

ここで変換手段１２は、パタン遷移の出現頻度が高い中
間符号のパタンを、パタン遷移の出現頻度が低い中間符
号のパタンに置き換える。Here, the converting means 12 replaces the intermediate code pattern in which pattern transitions occur frequently with an intermediate code pattern in which pattern transitions occur less frequently.

また圧縮符号化手段１４としては、現時点で符号化すべ
き変換符号列を、既に符号化済みの変換符号列の複製位
置及び複製の長さの組に変化するユニバーサル型ＺＬ符
号化方式、或いは、現時点で符号化統べき変換符号列を
、既に符号化済みの変換符号列を異なる部分列に分けた
時の該部分列の番号で表わす増分分解型ＺＬ符号化方式
を使用する。The compression encoding means 14 may be a universal ZL encoding method that changes the transform code string to be encoded at the present time into a set of a copy position and a copy length of a transform code string that has already been encoded; An incremental decomposition type ZL encoding method is used in which the commonly encoded transform code string is expressed by the number of the subsequence when an already encoded transform code string is divided into different subsequences.

［作用コこのような構成を備えた本発明の画像データ圧縮方式に
よれば、統計的性質の異なる画像が混在した画像データ
であっても、画像特有の統計的性質を取り込むような前
処理を施し、統計的性質の差異を整合させた後、ユニバ
ーサル符号化するようになり、一つの符号化方式で統計
的性質の大きく異なるデータに有効に対処することがで
きる。[Operations] According to the image data compression method of the present invention having such a configuration, even if image data contains a mixture of images with different statistical properties, it is possible to perform preprocessing that incorporates the statistical properties unique to the images. Universal encoding is performed after the differences in statistical properties are matched, and one encoding method can effectively handle data with widely different statistical properties.

更に詳細に説明すると、元来、ユニバーサル符号化は、
情報保存型のデータ圧縮方法であり、データ圧縮時に情
報源の統計的な性質を予め仮定しないため、種々のタイ
プ（文字コード、オブジェクトコードなど）のデータに
適用することができる。To explain in more detail, universal encoding was originally
This is an information preservation type data compression method that does not pre-assume the statistical properties of the information source when compressing data, so it can be applied to various types of data (character code, object code, etc.).

しかし、文字コードやシンボルが８ビツトや１６ビツト
単位と境界があるのに対して、画像ブタには文字やシン
ボルのような境界はない。境界があるとしても、画像そ
のものとは関係のないライン単位である。However, while character codes and symbols have boundaries in 8-bit and 16-bit units, image pigs do not have boundaries like characters and symbols. Even if there is a boundary, it is a line unit that has nothing to do with the image itself.

このため、例えば、文字コードをバイト単位に処理する
ユニバーサル符号化方式で画像データを圧縮しても、バ
イトとしてみたときに画像の統計的性質から掛は離れた
種々のパターンが均等に出現することになり、有効な圧
縮率が得られない。For this reason, for example, even if image data is compressed using a universal encoding method that processes character codes in bytes, various patterns that are far apart from each other may appear evenly when viewed as bytes due to the statistical properties of the image. , and an effective compression ratio cannot be obtained.

また、ＭＭＲ方式が画像の２次元相関を利用して圧縮す
るのに対して、既存のユニバーサル符号化方式では１次
元の相関を利用するだけであり、２次元画像符号化方式
に比べて圧縮性能が劣っている。In addition, while the MMR method compresses images by using two-dimensional correlation, existing universal encoding methods only use one-dimensional correlation, and their compression performance is lower than that of two-dimensional image encoding methods. is inferior.

そこで画像データを有効に圧縮するためビット単位に処
理するユニバーサル符号化も考えられるが、この場合は
、バイト単位の文字コードの圧縮で、ビット処理により
時間がかかるようになり、また、ビット単位に異なる系
列に分解するため学習の効率が悪くなる欠点がある。In order to effectively compress image data, universal encoding, which processes it bit by bit, may be considered, but in this case, compressing the character code in bytes would take more time due to bit processing, and The disadvantage is that learning efficiency is reduced because it is decomposed into different sequences.

これに対し本発明にあっては、画像データを符号化する
際の前処理として、２次元相関を利用して画像の複数ラ
インを一括して、例えば８ビット単位の固定長の中間符
号に直す処理を施す。次に前処理で得られた中間符号列
につき、隣接する中間符号間でのパタン遷移の出現頻度
を示すヒストグラムを作成し、このヒストグラムに従っ
て更に中間符号を変換する。On the other hand, in the present invention, as preprocessing when encoding image data, two-dimensional correlation is used to collectively convert multiple lines of an image into a fixed-length intermediate code of, for example, 8 bits. Apply processing. Next, for the intermediate code string obtained in the preprocessing, a histogram showing the frequency of appearance of pattern transitions between adjacent intermediate codes is created, and the intermediate code is further converted according to this histogram.

ヒストグラムを使用する中間符号の変換方式として本願
発明者は、中間符号の全てについて、１つ前の中間符号
からのパタン遷移の出現頻度（順位）をヒストグラムか
ら求めて置き換える方式を提案している。この方式によ
れば、中間符号の頻度分布の偏りを大きくすることで、
最終的に行なうユニバーサル符号化で圧縮がかかり易い
ようにしている。As an intermediate code conversion method using a histogram, the inventor of the present invention has proposed a method for calculating and replacing the appearance frequency (rank) of pattern transitions from the previous intermediate code from the histogram for all intermediate codes. According to this method, by increasing the bias of the frequency distribution of intermediate codes,
This makes it easier to compress the data in the final universal encoding.

しかし、中間符号自体にある程度の統計的な偏りがない
パタンでは、中間コード列のパタンを出現頻度に置き換
えても、頻度分布が滑かとなって中間符号列の規則性が
逆に失われ、却ってユニバーサル符号化での圧縮率が低
下する問題を招いてしまう。However, for patterns in which the intermediate code itself does not have a certain degree of statistical bias, even if the pattern of the intermediate code sequence is replaced by the frequency of occurrence, the frequency distribution becomes smooth and the regularity of the intermediate code sequence is lost, and on the contrary, This causes a problem in which the compression rate in universal encoding decreases.

そこで本発明では、作成したヒストグラムに基づき、出
現頻度の偏りの大きいパタンを持つ中間符号、即ち出現
頻度の高いパタン遷移をもつ中間符号についてのみ特定
のパタン、例えばパタン遷移の低い中間符号のパタンに
置き換えて表わす。Therefore, in the present invention, based on the created histogram, only a specific pattern, for example, a pattern of an intermediate code with a low pattern transition, is applied to an intermediate code having a pattern with a large bias in appearance frequency, that is, an intermediate code with a pattern transition with a high frequency of appearance. Replaced and expressed.

この結果、平均的な偏りの少ない系列であっても、いく
つかのピーク分布をもつ系列に変換されることとなり、
そのピーク値を利用して状態数を減らし、ユニバーサル
符号化をかかり易くすることができる。As a result, even a series with a small average bias will be converted into a series with several peak distributions,
By using the peak value, the number of states can be reduced, making it easier to perform universal encoding.

［実施例］第２図に本発明のデータ圧縮方式の実施例構成図である
。[Embodiment] FIG. 2 is a block diagram of an embodiment of the data compression method of the present invention.

第２図において、読取装置等による読取ライン走査で得
られた１ビット単位の画像データは端子２２から前処理
回路１０に入力される。前処理回路１０より入力された
画像データは、２値画像から統計的特徴を抽出する前処
理を施された後に、中間コードとしての８ビット単位の
固定長符号に変換される。In FIG. 2, image data in 1-bit units obtained by reading line scanning by a reading device or the like is inputted to a preprocessing circuit 10 from a terminal 22. The image data input from the preprocessing circuit 10 is subjected to preprocessing to extract statistical features from the binary image, and then converted into a fixed length code in 8-bit units as an intermediate code.

前処理回路１０に続いて設けられた中間コード変換器１
２は、前処理回路１０から得られた中間コードから統計
的性質をとって状態数の少ない符号系列に変換し、変換
結果をユニバーサル符号器１４に出力する。ユニバーサ
ル符号器１４は入力された変換符号の統計的性質を学習
しながら符号化し、圧縮符号を端子２４に出力する。Intermediate code converter 1 provided following preprocessing circuit 10
2 converts the intermediate code obtained from the preprocessing circuit 10 into a code sequence with a small number of states by taking statistical properties, and outputs the conversion result to the universal encoder 14. The universal encoder 14 encodes the input transform code while learning the statistical properties thereof, and outputs the compressed code to the terminal 24.

ユニバーサル符号器１４は、前述のユニバーサル型又は
増分分解型の圧縮アルゴリズムを実行するものであり、
その構成は公知である。そこで以下に第２図の実施例に
おける前処理回路１０と中間コード変換器１２の処理を
詳細に説明する。The universal encoder 14 executes the aforementioned universal type or incremental decomposition type compression algorithm,
Its configuration is known. Therefore, the processing of the preprocessing circuit 10 and intermediate code converter 12 in the embodiment shown in FIG. 2 will be explained in detail below.

第３図は、画像の複数ラインを一括して処理する前処理
回路１０の処理内容を（ａ）（ｂ）（ｃ）に分けて示す
。FIG. 3 shows the processing contents of the preprocessing circuit 10, which processes multiple lines of an image at once, divided into (a), (b), and (c).

まず第３図（ａ）に示すように、入力した２値画像を例
えば縦方向となる４ラインごとに区切る。First, as shown in FIG. 3(a), the input binary image is divided into, for example, every four lines in the vertical direction.

次に、４ラインのビットパタンとそのパタンの続く長さ
ＲＬを、左端から順に１ビツトずつ右にシフトしながら
コード化する。このコード化により第３図（ｂ）に示す
パタンと、同じパタンが続く長さＲＬとの組で成る中間
コードが得られる。Next, the 4-line bit pattern and the continuous length RL of the pattern are encoded while being shifted one bit at a time to the right from the left end. By this encoding, an intermediate code consisting of a pattern shown in FIG. 3(b) and a length RL in which the same pattern continues is obtained.

ここで４ラインのビットパターンの種類は、２’＝１６
種類となり、パタンの種別は４ビツトコードで表わすことが
できる。一般的には一括するライン数Ｎに対し、パタン
の種類は２Ｎとなり、従って８ビツト中のパタンコード
の占める割合はＮビットとなる。Here, the type of bit pattern of 4 lines is 2'=16
The type of pattern can be expressed by a 4-bit code. Generally speaking, the number of types of patterns is 2N for the number of lines to be grouped N, and therefore the proportion of the pattern code in 8 bits is N bits.

第３図（ｂ）の中間コードのデータ構造は、同図（ｃ）
のように８ビット単位の固定長符号に直される。即ち、
上位４ビツトは、４ライン単位のビットパタンを示すコ
ードがそのまま格納され、下位４ビツトには、同じビッ
トパタンの続く長さＲＬが格納される。但し、長さの最
大長は、４ビツトで表せる１６個までで、それ以上は新
たなコードが必要となる。The data structure of the intermediate code in Figure 3(b) is as shown in Figure 3(c).
It is converted into a fixed length code in 8-bit units as shown in the following. That is,
The upper 4 bits store a code indicating the bit pattern in units of 4 lines as is, and the lower 4 bits store the length RL of the same bit pattern. However, the maximum length is up to 16 that can be represented by 4 bits, and any longer than that requires a new code.

このような処理を４ライン毎に、縦の画素が無くなるま
で繰り返す。This process is repeated every four lines until there are no vertical pixels left.

第４図は第２，３図に示した前処理回路１０の処理フロ
ーである。FIG. 4 is a processing flow of the preprocessing circuit 10 shown in FIGS. 2 and 3.

まずステップ８１（以下「ステップ」は省略」）から８
２に進んで画像の縦の４ライン分のビットをバッファに
読み込む。次に８３で最初の４ラインのパタンをｏｌｄ
−ｐａｔとして記憶する。続いてＳ４で１画素右にシフ
トし、Ｓ５で次の４ラインの画素パタンをＢｙ−ｐａｔ
として記憶する。First, step 81 (hereinafter "step" is omitted) to 8
Proceed to step 2 and read bits for four vertical lines of the image into the buffer. Next, change the pattern of the first 4 lines to 83.
-Store as pat. Next, shift one pixel to the right in S4, and By-pat the pixel pattern of the next four lines in S5.
be memorized as

次Ｓ６でｏｌｄ−ｐａｔとｎｅｗ−ｐａｔと比較し、同
じであればＳ７でＲＬ＜１６を満足することを条件に８８に進んでＲＬを１つ増やし
、再びＳ４に戻って次の縦の４画素を読込む。Next, in S6, compare the old-pat and new-pat, and if they are the same, in S7, proceed to 88 on the condition that RL<16 is satisfied, increase RL by 1, return to S4 again, and add the next vertical 4. Read pixels.

Ｓ６でｏｌｄ−ｐａｔ　とｎｅｗ−ｐａｌが一致しても
Ｓ７でＲＬが１６を越えたことが判別されたり、Ｓ６で
ｏｌｄ−ｐａｌ　とｎｅｗ−ｐａｌの不一致を判別した
時にはＳ９に進み、その時のｏｌｄ−ｐａｌとＲＬを組
にした結果コード（中間コード）を作成して出力する。Even if old-pat and new-pal match in S6, if it is determined in S7 that RL exceeds 16, or if it is determined in S6 that old-pal and new-pal do not match, the process advances to S9 and the current old - Create and output a result code (intermediate code) that combines pal and RL.

続いてＳ１０でｎｅｗ−ｐａｔをｏｌｄ−ｐａｌに代入
し、Ｓｌｌで横画素が続いていればＳ３に戻って新たに
パタンの続き具合をコード化する処理を引き続き行う。Next, in S10, new-pat is substituted into old-pal, and if horizontal pixels continue in Sll, the process returns to S3 and continues the process of newly encoding the continuation of the pattern.

横の画素数が無くなると、Ｓｌｌから３１を介してＳ２
に戻り、次の４ラインをバッファに読み込んで同様の処
理を行う。縦のラインすべてについて以上の処理を終了
すると、ＳＬで残り縦ライン数がゼロになったことが判
別され、一連の前処理を終了する。When the number of horizontal pixels runs out, it is transferred from Sll to S2 via 31.
Return to , read the next four lines into the buffer, and perform the same process. When the above processing is completed for all vertical lines, it is determined at SL that the number of remaining vertical lines has become zero, and the series of preprocessing is completed.

この実施例では、−括するライン数を４ラインとしたが
、これに限るものではない。例えば−括するライン数を
３ラインとすれば、前処理による結果コードの上位３ビ
ツトがパタンに、そして下位５ビツトがＲＬに割り当て
られることになる。In this embodiment, the number of lines bracketed by - is set to four, but the number is not limited to this. For example, if the number of lines to be grouped is three, the upper three bits of the preprocessing result code will be allocated to the pattern, and the lower five bits will be allocated to the RL.

次に、前処理回路１０に続いて設けられた中間コード変
換器１２の構成及び作用を説明する。Next, the configuration and operation of the intermediate code converter 12 provided subsequent to the preprocessing circuit 10 will be explained.

まず中間コード変換器工２は、前処理回路１０から得ら
れた中間コードのパタンの遷移状態の頻度分布（ヒスト
グラム）を第５図のように求める。First, the intermediate code converter 2 obtains the frequency distribution (histogram) of transition states of the intermediate code pattern obtained from the preprocessing circuit 10, as shown in FIG.

第５図において、横軸は現在パタン、縦軸は１つ前の前
置パタンを示し、ある前置パタンからある現在パタンへ
遷移する頻度、即ち出現頻度が数値で示されており、数
値が大きい程、出現頻度（順位）が高い。In Figure 5, the horizontal axis shows the current pattern, the vertical axis shows the previous prefix pattern, and the frequency of transition from a certain prefix pattern to a certain current pattern, that is, the appearance frequency, is shown numerically. The larger the value, the higher the frequency of appearance (rank).

次に第５図の遷移状態の頻度分布に基づき、前置パタン
０−Ｆ毎に、現在パターン０〜Ｆに遷移する順位０−Ｆ
を割り当てた状態を第６図に示す。Next, based on the frequency distribution of transition states in FIG.
FIG. 6 shows the state in which .

本願発明者が提案しているヒストグラムを使用した他の
中間コードの変換方式では、前処理回路１０から得られ
た中間コードのパタンを、全て第６図に従った出現頻度
の順位に置き換える再変換を行い、その後にユニバーサ
ル符号化している。In another intermediate code conversion method using a histogram proposed by the inventor of the present application, all intermediate code patterns obtained from the preprocessing circuit 10 are re-converted to the order of appearance frequency according to FIG. and then universal encoding.

しかし中間コードの偏りが小さいときは、必ずしも圧縮
率が上がらない。However, when the bias of the intermediate code is small, the compression ratio does not necessarily increase.

そこで本発明の中間コード変換器１２では、第５図に示
すパタン遷移状態の頻度の中でパタン遷移の出やすいも
の、即ち出現頻度の高い部分のみをまとめるようにする
。Therefore, in the intermediate code converter 12 of the present invention, among the frequencies of pattern transition states shown in FIG. 5, only those in which pattern transitions are likely to occur, that is, portions with high frequency of appearance are summarized.

具体的には第５図の前置パタン１〜Ｆの各々に対し最も
出現頻度の高い丸で囲った部分の現在パタンを、長方形
で囲った■の一番でにくい現在パタン列のパタンｒＡＪ
そのものと置き換える。Specifically, for each of the prefix patterns 1 to F in Fig. 5, the current pattern in the circled part that appears most frequently is the pattern rAJ of the current pattern row that is the least likely to appear in the rectangle.
Replace it with that.

また２番目にでやすい三角で囲った部分の現在パタンを
、長方形で囲った■の２番目に出にくい現在パタン列の
パタン「５」と置き換える。Also, the current pattern in the part surrounded by a triangle that is the second most likely to appear is replaced with pattern "5" in the current pattern row that is the second least likely to appear in the part surrounded by a rectangle.

その結果、パタンｒＡＪとパタン「５」への置き換えの
ために中間コード変換器１２で使用されるヒストグラム
は、第７図の丸で囲ったパタンｒＡＪと三角で囲ったパ
タン「５」のみとなる。As a result, the histogram used by the intermediate code converter 12 for replacing pattern rAJ and pattern "5" is only the pattern rAJ enclosed in a circle and the pattern "5" enclosed in a triangle in FIG. .

次に中間コード変換器１２によるコード変換を第８図を
参照して具体的に説明する。Next, code conversion by the intermediate code converter 12 will be specifically explained with reference to FIG.

第８図において、いま同図（ａ）に示すように前処理回
路１０からの６つの結果コードが得られていたとすると
、同図（ｂ）のように変換される。In FIG. 8, if six result codes are obtained from the preprocessing circuit 10 as shown in FIG. 8(a), they are converted as shown in FIG. 8(b).

まず第８図（ａ）の最初の２つの結果コードに着目する
と、第１コード；パタン１．　ＲＬ＝２第２コード；パタンＯ，ＲＬ＝３となる。ここで処理対象となっている現在コードは、第
２コードである。このため前置パタン＝１現在パタン−〇であるから、第７図より変換パタンｒＡＪが求まり、第
８図（ｂ）に示すように、現在コードのパタンは、それ
までのパタン種類を示すコード＝：０から一番出にくい
パタンを示すコード＝Ａに置換する変換が行われる。First, focusing on the first two result codes in FIG. 8(a), 1st code; pattern 1. RL=2 Second code; pattern O, RL=3. The current code to be processed here is the second code. Therefore, since the prefix pattern = 1 and the current pattern - 〇, the conversion pattern rAJ is found from Fig. 7, and as shown in Fig. 8 (b), the current code pattern is the code indicating the previous pattern type. =: Conversion is performed to replace 0 with code =A, which indicates the pattern that is least likely to appear.

次の２番目のパタン遷移に着目すると、前置パタン＝０現在パタン＝１であるから、第７図より変換パタン「５」が求まり、第
８図（ｂ）に示すように、現在コードのパタンは、それ
までのパタン種類を示すコード＝１から２番目に出にく
いパタンを示すコード＝「５」に置換する変換が行なわ
れる。Focusing on the next second pattern transition, since the prefix pattern = 0 and the current pattern = 1, the conversion pattern "5" is found from Figure 7, and as shown in Figure 8 (b), the current code is The pattern is converted so that the code = 1 indicating the pattern type up to that point is replaced with the code = "5" indicating the pattern that is the second least likely to appear.

次の３番目以降のパタン遷移は１−８．　８−Ｅ。The next 3rd and subsequent pattern transitions are 1-8. 8-E.

Ｅ−３，であり、第７図の丸で囲った出現頻度が最も高
い場合の置き換えパタンと２番目に出現頻度の高い置き
換えパタンに該当しないことから、置き換え操作を行わ
ずに入力パタンをそのまま出力する。E-3, and it does not correspond to the replacement pattern with the highest frequency of occurrence and the replacement pattern with the second highest frequency of occurrence circled in Figure 7, so the input pattern is left as is without performing the replacement operation. Output.

このように中間段階で得られるコードの上位４ビツトに
ビットパタンの遷移確率を考慮したコードに変換するこ
とにより、前処理による中間符号列での偏りが小さくと
も、統計的性質に十分な偏りを持った圧縮しやすいコー
ド列に変換することができ、特に最終的に行なうユニバ
ーサル符号化での圧縮率が大幅に向上される。ここでは
、置き換えるパタンを２つのパタンの時について示した
が、２つに限るものではない。By converting the upper 4 bits of the code obtained in the intermediate stage into a code that takes into account the transition probability of the bit pattern, even if the bias in the intermediate code string due to preprocessing is small, it is possible to maintain sufficient bias in the statistical properties. It can be converted into a code string that is easy to compress, and the compression rate in the final universal encoding is greatly improved. Although the case where there are two patterns to be replaced is shown here, the number of patterns to be replaced is not limited to two.

［発明の効果］以上説明したように本発明によれば、種々の統計的性質
の異なった画像データに前処理を施すことにより、文字
や符号と同様な統計的性質の揃った所定ビット単位の固
定長データに変換することができる。[Effects of the Invention] As explained above, according to the present invention, by performing preprocessing on image data with different statistical properties, predetermined bit units with uniform statistical properties similar to characters and codes can be obtained. Can be converted to fixed length data.

また前処理後の中間データの遷移確率を基に、偏りを持
った圧縮しやすいデータ列に変換することで、ユニバー
サル符号化後のデータは、ベクトル符号化に近い高い圧
縮率を得ることができる。In addition, by converting the data into a biased data sequence that is easy to compress based on the transition probability of the intermediate data after preprocessing, the data after universal encoding can achieve a high compression rate close to that of vector encoding. .

[Brief explanation of drawings]

第１図は本発明の原理説明図；第２図は本発明の実施例構成図；第３図は本発明の前処理回路の処理説明図；第４図は本
発明の前処理回路の処理フロー図；第５図は本発明で用
いるパタン遷移状態に頻度説明図；第６図は第５図から得られたパタン遷移順位の説明図：第７図は本発明のパタンコードから出現頻度コードへの
変換処理の説明図；第８図は本発明の中間コードの再変換に使用するパタン
遷移状態説明図；第９図は本発明で用いるユニバーサル符号化の説明図で
ある。図中、１０：１２：１４：１６：２２゜３０：３２：前処理手段（前処理回路）変換手段（中間コード変換器）圧縮符号化手段（ユニバーサル符号器）ヒストグラム２４：端子ＰバツアファＱバッファ特許出願願人　富士通株式会社FIG. 1 is a diagram explaining the principle of the present invention; FIG. 2 is a configuration diagram of an embodiment of the present invention; FIG. 3 is a diagram explaining the processing of the preprocessing circuit of the present invention; FIG. 4 is a diagram explaining the processing of the preprocessing circuit of the present invention Flow diagram; Fig. 5 is a frequency explanatory diagram of pattern transition states used in the present invention; Fig. 6 is an explanatory diagram of the pattern transition order obtained from Fig. 5; Fig. 7 is an appearance frequency code from the pattern code of the present invention. FIG. 8 is an explanatory diagram of the pattern transition state used for reconversion of the intermediate code of the present invention; FIG. 9 is an explanatory diagram of universal encoding used in the present invention. In the figure, 10: 12: 14: 16: 22° 30: 32: Preprocessing means (preprocessing circuit) Conversion means (intermediate code converter) Compression encoding means (universal encoder) Histogram 24: Terminal P buffer Q buffer Patent applicant Fujitsu Limited

Claims

[Claims]

(1) In an image data compression method that compresses binary image data obtained by reading line scanning, multiple vertical lines of input image data are collectively compressed, and a set of the bit pattern and the length of the pattern is preprocessing means (10) for converting into a fixed length intermediate code consisting of;
For the intermediate code string obtained from 10), a histogram (1
6) and calculates the previous intermediate code (1) for the intermediate code (20) currently being processed based on the histogram (16).
8) converting means that replaces the pattern of the intermediate code (20) with another specific pattern (A) when the pattern transition from 8) is likely to appear, and outputs the intermediate code (20) as is when it is unlikely to appear; (12); Compression encoding means (14) for encoding the transform code string obtained from the transform means (12) as a copy of a transform code string that has already been compressed and encoded; Characteristic image data compression method.

(2) The image data compression according to claim 1, wherein the converting means (12) replaces an intermediate code pattern with a high frequency of pattern transitions with an intermediate code pattern with a low frequency of pattern transitions. method.

(3) The compression encoding means (14) converts the transform code string to be encoded at the present time into a set of a copy position and a copy length of the already encoded transform code string. Image data compression method according to item 1.

(4) The compression encoding means (14) is characterized in that the conversion code string to be encoded at the present time is represented by the number of a subsequence when an already encoded conversion code string is divided into different subsequences. The image data compression method according to claim 1.