JP2009081589A

JP2009081589A - Image decoding device, method and program

Info

Publication number: JP2009081589A
Application number: JP2007248325A
Authority: JP
Inventors: Masakazu Ebihara; 正和海老原
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2007-09-26
Filing date: 2007-09-26
Publication date: 2009-04-16

Abstract

PROBLEM TO BE SOLVED: To decode compressed image data encoded by a progressive method in a short time by parallel processing. SOLUTION: A data analysis part 21 rearranges the coded data groups, one for each prescribed division coding unit constituting the compressed image data, so that they are processed component by component. A decoding control part 22 causes a CPU to sequentially apply variable length decoding and inverse quantization to the coded data groups rearranged by the data analysis part 21 and to output coefficient data of frequency components. When the CPU has processed the coded data groups corresponding to one component, the decoding control part 22 supplies the output coefficient data to an accelerator (inverse DCT part 13) and causes the accelerator to transform the data into time components, while causing the CPU to execute, in parallel, the variable length decoding and the inverse quantization on the coded data groups corresponding to the next component. COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to an image decoding apparatus, method, and program for decoding compressed image data encoded by a progressive method, and in particular to an image decoding apparatus, method, and program that perform the decoding processing using a first arithmetic processing unit and a second arithmetic processing unit that can operate in parallel with each other.

In recent years, the JPEG (Joint Photographic Experts Group) method has become widespread as an international standard for compressing and decompressing still image data. In addition to the JPEG standard system (Baseline System), the JPEG method defines a JPEG extended system (Extended System).

FIG. 13 is a block diagram schematically showing the configuration of a decoding device in the JPEG standard system.
As shown in FIG. 13, the decoding device in the JPEG standard system includes a variable length decoding (VLD: Variable Length Decode) unit 11, an inverse quantization (IQ: Inverse Quantize) unit 12, an inverse DCT (IDCT: Inverse Discrete Cosine Transform) unit 13, and a color conversion unit 14.

The variable length decoding unit 11 performs variable length decoding on the JPEG data encoded by the encoding device and outputs quantized data. The inverse quantization unit 12 inversely quantizes the quantized data obtained by the variable length decoding unit 11 based on a quantization table and outputs DCT coefficients. The inverse DCT unit 13 decompresses the DCT coefficients obtained by the inverse quantization unit 12 by inverse DCT processing and outputs pixel data (YUV data). The color conversion unit 14 performs color conversion on the YUV data obtained by the inverse DCT unit 13 and outputs color information data (RGB image data).
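The stages after variable length decoding can be sketched as follows. This is a minimal illustration rather than the decoder described here: the function names are hypothetical, the inverse DCT is the naive textbook formula (a real decoder would use a fast algorithm and add the JPEG level shift), and the color conversion assumes the JFIF YCbCr coefficients, which the text does not specify.

```python
import math

def dequantize(quantized, qtable):
    """Inverse quantization: multiply each quantized value by its table entry."""
    return [[quantized[r][c] * qtable[r][c] for c in range(8)] for r in range(8)]

def idct_8x8(coeffs):
    """Naive 2-D inverse DCT of one 8x8 block (illustration only, no level shift)."""
    def alpha(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for y in range(8):
        for x in range(8):
            total = 0.0
            for v in range(8):
                for u in range(8):
                    total += (alpha(u) * alpha(v) * coeffs[v][u]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[y][x] = total / 4
    return out

def ycbcr_to_rgb(y, cb, cr):
    """Color conversion using the JFIF coefficients (an assumption of this sketch)."""
    def clamp(value):
        return max(0, min(255, round(value)))

    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp(r), clamp(g), clamp(b)
```

A block that contains only a DC coefficient decodes to a flat block, which is why the first band of a progressive stream already yields a coarse image.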

FIG. 14 is a diagram for explaining the block configuration and scan order in JPEG data.
In the encoding device of the JPEG standard system, input image data is divided into blocks of 8 pixels × 8 pixels, and the DCT is computed block by block. The resulting DCT coefficients are then quantized independently for the DC (Direct Current) component and the AC (Alternating Current) components. As shown in FIG. 14, each block in JPEG data consists of 64 elements, and each element contains a multi-bit (8-bit or 12-bit) string that represents the image data of that element as a frequency component.

In FIG. 14, the number shown for each element in the block indicates its position in the scan order within the block; these numbers are referred to below as scan numbers. In each block, the DC component is encoded as the first element (the element with scan number 0), and the AC components, which are the remaining elements (scan numbers 1 to 63), are reordered by zigzag scanning and then encoded.
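The mapping from scan number to block position can be generated programmatically. The sketch below assumes the standard JPEG zigzag pattern (FIG. 14 itself is not reproduced here), and `zigzag_order` is a hypothetical helper name.

```python
def zigzag_order(n=8):
    """Return (row, column) positions of an n x n block indexed by scan number."""
    order = []
    for s in range(2 * n - 1):
        # All positions on the anti-diagonal row + column == s, top-right first.
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked from bottom-left to top-right, odd ones the other way.
        order.extend(diagonal[::-1] if s % 2 == 0 else diagonal)
    return order
```

Here `zigzag_order()[0]` is the DC position `(0, 0)`, and entries 1 to 63 are the AC positions in increasing spatial frequency.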

The JPEG extended system, on the other hand, defines functions added to the JPEG standard system, such as a system to which a sequential encoding mode is applied (Sequential system) and a system to which a progressive encoding mode is applied (Progressive system). With sequentially encoded JPEG data, the image is decoded and displayed in order from top to bottom. With progressively encoded JPEG data, a rough low-resolution image is decoded and displayed first, and progressively finer, higher-resolution images are then decoded and restored.

For the latter, the progressive encoding mode, two encoding methods have been proposed. One is a frequency-division method in which the DCT coefficients are divided into multiple groups, starting from the components of low spatial frequency, and each division unit is encoded in turn; this is called the spectral selection method. The other is an approximation-refinement method in which the bit string of each DCT coefficient is divided into multiple parts from the MSB (Most Significant Bit) toward the LSB (Least Significant Bit), and each division unit is encoded in turn; this is called the successive approximation method. The two methods can also be combined to increase the number of division units to be encoded. The encoded data group for each division unit obtained by such multi-stage division is called a band.

In the JPEG standard system, the 64 elements in a block described above are decoded one after another. When progressively encoded JPEG data is decoded, by contrast, decoding is performed band by band, where the bands result from multi-stage division by the spectral selection method and the successive approximation method. The flow of decoding progressively encoded JPEG data is described concretely below using an example of band division.

FIG. 15 is a diagram showing an example of a band configuration, and FIG. 16 is a diagram showing the data processed for each band.
In this example, as shown in FIG. 15, the DCT coefficients are divided into four bands. Progressively encoded JPEG data is normally decoded in the order in which the bands are stored in the stream. That is, in this example, decoding is performed in the order of the first band, the second band, the third band, and the fourth band. The DCT coefficient of each element in a block is assumed to be represented by 8 bits of data.

The first band encodes all bits (bit 0 to bit 7) of the element with scan number 0. This first band contains only the DC component of the DCT coefficients. In the decoding device, all 8 bits of the element marked "1st" at the upper left of FIG. 16 are first variable-length decoded by the variable length decoding unit 11; then, as in the JPEG standard system, the resulting data is processed by the inverse quantization unit 12 and the inverse DCT unit 13 to generate YUV data.

The second band encodes bit 1 to bit 7 of the elements with scan numbers 1 to 5 among the AC components of the DCT coefficients. After processing the first band, the decoding device variable-length decodes the upper 7 bits of the elements marked "2nd" at the upper right of FIG. 16 with the variable length decoding unit 11, and the result is further processed by the inverse quantization unit 12 and the inverse DCT unit 13.

The third band encodes bit 1 to bit 7 of the elements with scan numbers 6 to 63 among the AC components of the DCT coefficients. After processing the second band, the decoding device variable-length decodes the upper 7 bits of the elements marked "3rd" at the lower left of FIG. 16 with the variable length decoding unit 11, and the result is further processed by the inverse quantization unit 12 and the inverse DCT unit 13.

The fourth band encodes only bit 0 of the elements with scan numbers 1 to 63 among the AC components of the DCT coefficients. After processing the third band, the decoding device variable-length decodes the lowest 1 bit of the elements marked "4th" at the lower right of FIG. 16 with the variable length decoding unit 11, and the result is further processed by the inverse quantization unit 12 and the inverse DCT unit 13.
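The four-band example can be written down as a small table and checked for consistency. The sketch below is illustrative only; the tuple layout and the names `BANDS` and `coverage` are assumptions, while the scan and bit ranges are those given in the text.

```python
# (name, first_scan, last_scan, low_bit, high_bit), all ranges inclusive
BANDS = [
    ("band1", 0, 0, 0, 7),    # DC component, all 8 bits
    ("band2", 1, 5, 1, 7),    # low-frequency AC, upper 7 bits
    ("band3", 6, 63, 1, 7),   # remaining AC, upper 7 bits
    ("band4", 1, 63, 0, 0),   # all AC, lowest bit only
]

def coverage(bands, num_scans=64, num_bits=8):
    """Count how many bands carry each (scan number, bit position) pair."""
    counts = [[0] * num_bits for _ in range(num_scans)]
    for _name, first_scan, last_scan, low_bit, high_bit in bands:
        for scan in range(first_scan, last_scan + 1):
            for bit in range(low_bit, high_bit + 1):
                counts[scan][bit] += 1
    return counts
```

Every count in `coverage(BANDS)` equals 1, so the four bands together carry each bit of each of the 64 coefficients exactly once. Bands 2 and 3 split the coefficients by frequency (spectral selection), and band 4 splits them by bit position (successive approximation).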

As described above, for progressively encoded JPEG data, variable length decoding, inverse quantization, and inverse DCT are executed in sequence for each band produced at encoding time, and the image is restored gradually. However, the inverse DCT operation in particular imposes a very high processing load, so performing it for every band lengthens the processing time. A method has therefore been proposed in which variable length decoding is first performed on all bands and, once all variable length decoding has finished, the inverse DCT operation is performed once for all the data and YUV data is output. With this method the image can no longer be restored and displayed gradually starting from a low-resolution picture, but the computation time of the decoding process as a whole can be shortened. This method is already implemented in, for example, the decoder software called djpeg from the Independent JPEG Group (IJG).

Furthermore, as the resolution of digital cameras has increased in recent years, there has been growing demand for JPEG CODECs (Coder Decoders) that can display high-resolution images at higher speed. Even if the computation time is shortened using the above method implemented in djpeg, the demand for higher speed cannot always be met. As one solution, a method has been proposed that uses a CPU (Central Processing Unit) together with an accelerator, has the accelerator execute the high-load processing (that is, the inverse DCT), and runs the processing on the CPU and the processing on the accelerator in parallel to increase speed. Here, an accelerator is hardware that takes over processing otherwise handled by the CPU in order to improve a specific function or processing capability, and is realized, for example, as an LSI (Large Scale Integration) circuit.

FIG. 17 is a block diagram schematically showing the configuration of a djpeg-type decoding device. In FIG. 17, functions equivalent to those in FIG. 13 are denoted by the same reference numerals, and their description is omitted.

In djpeg, the variable length decoding unit 11 and the inverse quantization unit 12 perform processing band by band, and the resulting DCT coefficients are temporarily stored in a DCT coefficient buffer 15. The inverse DCT unit 13 then reads the DCT coefficients from the DCT coefficient buffer 15, performs the inverse DCT, and outputs the generated pixel data to the color conversion unit 14.

When such decoding processing is executed in parallel by a CPU and an accelerator, the processing of the variable length decoding unit 11 and the inverse quantization unit 12 is executed by the CPU, and the processing of the inverse DCT unit 13 is executed by the accelerator. In this case, once the DCT coefficient buffer 15 holds as many DCT coefficients as the inverse DCT operation requires, the inverse DCT unit 13 can read those coefficients and start the inverse DCT processing. Specifically, the inverse DCT cannot be executed until all DCT coefficients for a component such as Y, Cb, or Cr are available, so the inverse DCT processing for a given component can start at the point when all DCT coefficients corresponding to that component are present in the DCT coefficient buffer 15.

FIG. 18 is a flowchart showing the processing procedure of the CPU when the djpeg-type decoding process is parallelized.
First, the CPU selects a band according to the storage order in the stream (step S101), performs variable length decoding on the data of that band (step S102), performs inverse quantization (step S103), and stores the resulting DCT coefficients in the DCT coefficient buffer 15. The CPU then determines whether processing has finished for all bands corresponding to a given component (step S104). If it has not, the CPU returns to step S101, selects the next band, and performs variable length decoding and inverse quantization.

If it is determined in step S104 that processing has finished for all bands corresponding to a given component, the CPU reads the DCT coefficients corresponding to that component from the DCT coefficient buffer 15, outputs them to the accelerator, and requests execution of the inverse DCT (step S105). The CPU then determines whether processing has finished for all bands in the stream (step S106). If it has not, the CPU returns to step S101, selects the next band, and performs variable length decoding and inverse quantization. If it has, the decoding process ends.

In the above processing, after the accelerator starts executing the inverse DCT in response to the request in step S105, the CPU returns to step S101 and performs variable length decoding and inverse quantization on the next band. The variable length decoding and inverse quantization by the CPU and the inverse DCT by the accelerator are thus executed in parallel, which shortens the processing time.
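The control flow of steps S101 to S106 can be sketched as a short loop. This is a simplified model under stated assumptions: each band is represented as a hypothetical `(component, band_id)` pair, the actual decoding work is omitted, and the function only records at which step an inverse DCT request would be issued.

```python
def decode_in_stream_order(bands):
    """Walk the bands in stream order and return [(step, component), ...],
    the steps at which an inverse DCT request (S105) would be issued."""
    # Number of bands still to be processed for each component.
    remaining = {}
    for component, _band_id in bands:
        remaining[component] = remaining.get(component, 0) + 1

    idct_requests = []
    for step, (component, _band_id) in enumerate(bands):  # S101: select next band
        # S102/S103: variable length decoding and inverse quantization would run
        # here, with the coefficients stored in the DCT coefficient buffer.
        remaining[component] -= 1
        if remaining[component] == 0:                     # S104: component complete?
            idct_requests.append((step, component))       # S105: request inverse DCT
    return idct_requests                                  # S106: all bands processed
```

For a hypothetical stream in which the bands of Y, Cb, and Cr alternate, such as `[("Y", 1), ("Cb", 1), ("Cr", 1), ("Y", 2), ("Cb", 2), ("Cr", 2)]`, the first request is issued only at step 3, so the accelerator sits idle for most of the run.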

As a conventional technique for decoding progressively encoded JPEG data and displaying an image, there is an image processing method that, while decoding hierarchical data progressively layer by layer, stops decoding the remaining hierarchical data once the displayed image satisfies the required resolution specified by the user, thereby shortening the time required for decoding (see, for example, Patent Document 1).
Patent Document 1: JP 2000-78576 A (paragraphs [0055] to [0059], FIG. 3)

As described above, when progressively encoded JPEG data is decoded using the time-reduction method implemented in djpeg, the processing time can be shortened by running the variable length decoding and inverse quantization on the CPU in parallel with the inverse DCT on the accelerator. In this processing, however, the accelerator does not execute the inverse DCT until variable length decoding and inverse quantization have produced all DCT coefficients corresponding to some one component and the inverse DCT for that component becomes executable. As a result, it takes a long time before parallel processing by the CPU and the accelerator starts, the period during which parallel processing runs is short, and the processing efficiency is low.

The present invention has been made in view of these points, and an object thereof is to provide an image decoding apparatus, method, and program capable of decoding compressed image data encoded by a progressive method in a shorter time by parallel processing.

To solve the above problem, the present invention provides an image decoding apparatus that decodes compressed image data encoded by a progressive method, the apparatus comprising: a processing order determination unit that rearranges the encoded data groups, one for each predetermined divided encoding unit constituting the compressed image data, so that they are processed component by component; a first arithmetic processing unit that sequentially receives the encoded data groups rearranged by the processing order determination unit, applies variable length decoding and inverse quantization to them, and outputs coefficient data of frequency components; and a second arithmetic processing unit that is configured to be operable in parallel with the first arithmetic processing unit and converts the coefficient data from the first arithmetic processing unit into time component data. When the first arithmetic processing unit has processed the encoded data groups corresponding to one component, the second arithmetic processing unit executes the conversion of the output coefficient data into time component data while the first arithmetic processing unit executes, in parallel, the processing of the encoded data groups corresponding to the next component.

In such an image decoding apparatus, before the compressed image data is decoded, the processing order determination unit first rearranges the encoded data groups, one for each predetermined divided encoding unit constituting the compressed image data, so that they are processed component by component. The encoded data groups rearranged by the processing order determination unit are then supplied in sequence to the first arithmetic processing unit, where variable length decoding and inverse quantization are performed, and coefficient data of frequency components is output as a result. The second arithmetic processing unit is configured to be operable in parallel with the first arithmetic processing unit and converts the coefficient data from the first arithmetic processing unit into time component data. When the first arithmetic processing unit has processed the encoded data groups corresponding to one component, the second arithmetic processing unit executes the conversion of the output coefficient data into time component data, while the first arithmetic processing unit executes, in parallel, the processing of the encoded data groups corresponding to the next component.

According to the image decoding apparatus of the present invention, before the compressed image data is decoded, the encoded data groups constituting the compressed image data are rearranged so that they are processed component by component, and they are supplied to the first arithmetic processing unit in the rearranged order and subjected to variable length decoding and inverse quantization. When the first arithmetic processing unit has processed the encoded data groups corresponding to one component, the second arithmetic processing unit executes the conversion of the output coefficient data into time component data, while the first arithmetic processing unit executes, in parallel, the processing of the encoded data groups corresponding to the next component. This brings forward the time at which the second arithmetic processing unit starts its data conversion processing and lengthens the period during which the first and second arithmetic processing units can operate in parallel. As a result, the time required to decode the entire compressed image data is shortened.
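The effect of rearranging the bands component by component can be illustrated with a small sketch. It assumes hypothetical `(component, band_id)` pairs and, purely for simplicity, keeps the components in the order in which they first appear in the stream; how the component order is actually chosen is a separate matter handled by the data analysis described later.

```python
def reorder_by_component(bands):
    """Group bands so that all bands of one component are processed consecutively.
    The stable sort preserves the original band order within each component."""
    first_seen = {}
    for component, _band_id in bands:
        first_seen.setdefault(component, len(first_seen))
    return sorted(bands, key=lambda band: first_seen[band[0]])

def first_complete_step(bands):
    """Return the step at which the first component has all of its bands processed,
    i.e. the earliest point at which the second processing unit can start."""
    remaining = {}
    for component, _band_id in bands:
        remaining[component] = remaining.get(component, 0) + 1
    for step, (component, _band_id) in enumerate(bands):
        remaining[component] -= 1
        if remaining[component] == 0:
            return step
    return None
```

With the hypothetical stream `[("Y", 1), ("Cb", 1), ("Cr", 1), ("Y", 2), ("Cb", 2), ("Cr", 2)]`, the first component completes at step 3 in stream order but at step 1 after reordering, which is the earlier start of parallel operation that the paragraph above describes.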

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
FIG. 1 is a block diagram schematically showing the hardware configuration of an image decoding apparatus according to an embodiment.

The image decoding apparatus shown in FIG. 1 has a configuration in which a CPU 110, a main memory 120, an accelerator 130, and an external storage device 140 are connected to one another through a bus 150.

The CPU 110 executes programs stored in the external storage device 140 or in a ROM (Read Only Memory), not shown, and thereby performs predetermined processing such as the variable length decoding and inverse quantization described later. The main memory 120 is a RAM (Random Access Memory) used as a work area for the CPU 110.

The accelerator 130 is a hardware circuit such as an LSI and processes data exchanged with the CPU 110 through the bus 150. The accelerator 130 is provided with, among other things, an arithmetic unit 131 composed of an ALU (Arithmetic and Logic Unit), a MAC (Multiplier and Accumulator), or the like, and a local memory 132, which is a RAM dedicated to the accelerator 130.

The accelerator 130 can operate independently of the CPU 110. That is, while the CPU 110 is executing arithmetic processing, the accelerator 130 can read and write data in the local memory 132 through the bus 150, and the arithmetic unit 131 can execute processing different from that of the CPU 110 using the data in the local memory 132. Parallel processing is realized in this way.

In the present embodiment, as described later, when JPEG data is decoded, the variable length decoding and inverse quantization are executed by the CPU 110, and the inverse DCT operation is executed by the accelerator 130.

The external storage device 140 is composed of an HDD (Hard Disk Drive) or the like and stores, for example, the programs executed by the CPU 110, the data needed to execute them, and the JPEG data to be decoded. The JPEG data to be decoded may instead be supplied from outside, for example through a communication interface (not shown).

FIG. 2 is a block diagram showing the functions of the image decoding apparatus.
As shown in FIG. 2, the image decoding apparatus of the present embodiment includes a variable length decoding unit 11, an inverse quantization unit 12, an inverse DCT unit 13, a color conversion unit 14, a DCT coefficient buffer 15, a data analysis unit 21, and a decoding control unit 22. Of these functions, the inverse DCT unit 13 is realized by hardware processing in the accelerator 130, and the other functions, apart from the DCT coefficient buffer 15, are realized by software processing on the CPU 110. The processing in the variable length decoding unit 11 and the inverse quantization unit 12 can therefore be executed in parallel with the processing in the inverse DCT unit 13. The data analysis unit 21 receives progressively encoded JPEG data, analyzes its header, data amounts, and so on, determines from the analysis result the band processing order that allows parallel processing to be executed most efficiently, compiles that order into a list, and notifies the decoding control unit 22. The data analysis unit 21 includes a processing amount prediction unit 31 and a processing order conversion unit 32.

The processing amount prediction unit 31 predicts the data processing amount of each component from the data contained in the stream and assigns a rank to each component in order of processing amount. The processing order conversion unit 32 converts the ranks assigned by the processing amount prediction unit 31 based on a priority order determined by the specifications (processing capability) of the decoding processing functions, and determines the final band processing order used for decoding. The information on the specifications of the decoding processing functions referred to by the processing order conversion unit 32 is set in advance, for example in the external storage device 140.

The decoding control unit 22 controls the processing of the variable length decoding unit 11, the inverse quantization unit 12, and the inverse DCT unit 13 according to the band processing order notified by the data analysis unit 21. For example, following the notified band processing order, it supplies the data of the bands in the stream to the variable length decoding unit 11 and has it execute variable length decoding. When the inverse quantization unit 12 has completed inverse quantization for all bands corresponding to a given component, the decoding control unit 22 requests the inverse DCT unit 13 to execute the inverse DCT operation on the resulting DCT coefficients. If the inverse DCT unit 13 is still processing at that time (Busy state), the decoding control unit 22 has the variable length decoding unit 11 process the next band while polling the inverse DCT unit 13 until the Busy state is released, so that the inverse DCT unit 13 executes the pending operations one after another.
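The polling behavior of the decoding control unit can be modeled with a deterministic, tick-based simulation. Everything in this sketch is hypothetical: `FakeAccelerator` stands in for the inverse DCT unit, each band is assumed to cost the CPU one tick, each inverse DCT job costs a fixed number of ticks, and bands are `(component, band_id)` pairs already in processing order.

```python
from collections import deque

class FakeAccelerator:
    """Hypothetical stand-in for the inverse DCT unit: busy for a fixed number of ticks per job."""
    def __init__(self, ticks_per_job):
        self.ticks_per_job = ticks_per_job
        self.ticks_left = 0
        self.current = None
        self.done = []

    @property
    def busy(self):
        return self.ticks_left > 0

    def submit(self, component):
        assert not self.busy, "submit() must only be called when the unit is idle"
        self.current = component
        self.ticks_left = self.ticks_per_job

    def tick(self):
        """Advance the accelerator by one time step."""
        if self.busy:
            self.ticks_left -= 1
            if self.ticks_left == 0:
                self.done.append(self.current)
                self.current = None

def run_decoder(bands, accel):
    """Process bands on the 'CPU' while polling the accelerator.
    Returns [(step, component), ...] for each inverse DCT submission."""
    remaining = {}
    for component, _band_id in bands:
        remaining[component] = remaining.get(component, 0) + 1

    queue = deque(bands)
    pending = deque()       # components whose coefficients are complete
    submissions = []
    step = 0
    while queue or pending or accel.busy:
        if queue:
            # CPU side: variable length decoding and inverse quantization of one band.
            component, _band_id = queue.popleft()
            remaining[component] -= 1
            if remaining[component] == 0:
                pending.append(component)
        # Poll: hand over the next complete component only when the unit is not busy.
        if pending and not accel.busy:
            nxt = pending.popleft()
            accel.submit(nxt)
            submissions.append((step, nxt))
        accel.tick()
        step += 1
    return submissions
```

The CPU side never blocks on the accelerator: while the unit is busy, completed components simply wait in `pending` and band processing continues.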

The variable length decoding unit 11 sequentially applies variable length decoding to the bands read under the control of the decoding control unit 22 and outputs quantized data. The inverse quantization unit 12 inversely quantizes the quantized data obtained for each band by the variable length decoding unit 11 on the basis of a quantization table and outputs DCT coefficients.

The DCT coefficient buffer 15 temporarily stores the DCT coefficients output from the inverse quantization unit 12 and is realized as part of the storage area of the main memory 120. Under the control of the decoding control unit 22, once all DCT coefficients corresponding to a given component have been stored in the DCT coefficient buffer 15, they are output to the inverse DCT unit 13.

In response to a request from the decoding control unit 22, the inverse DCT unit 13 reads the DCT coefficients needed to process a given component from the DCT coefficient buffer 15, decompresses them by inverse DCT processing, and outputs pixel data (YUV data). The color conversion unit 14 performs color space conversion on the YUV data obtained by the inverse DCT unit 13 and outputs color information data (RGB image data).

Next, the processing procedure for decoding progressively encoded JPEG data in the present embodiment is described concretely using an example of band division. FIG. 3 is a diagram showing an example of band division applied in the embodiment.

In FIG. 3, the circled numbers indicate the order in which the bands are stored in the JPEG data stream. That is, each block of the JPEG data in this example is composed of twelve bands, the first through the twelfth.

As in the conventional case, the JPEG data processed here is encoded in units of 8 × 8 pixel blocks, and the scan order (scan numbers) of the elements within a block is as shown in FIG. 14. The original image data that was encoded is assumed to be data (YUV data) in which the sampling ratio of the luminance component (Y) to the color difference components (Cb, Cr) is expressed as "4:2:2", and the encoded data of each element in a block consists of an 8-bit string, bit 0 through bit 7, for each of Y, Cb, and Cr.

In the first band, bits 1 through 7 of the bit strings of the Y, Cb, and Cr DCT coefficients (DC components) contained in the element with scan number 0 are encoded.
In the second band, bits 2 through 7 of the bit strings of the Y DCT coefficients (AC components) contained in the elements with scan numbers 1 to 5 are encoded.

In the third and fourth bands, bits 1 through 7 of the bit strings of the DCT coefficients (AC components) contained in the elements with scan numbers 1 to 5 are encoded. The third band carries the Cb DCT coefficients, and the fourth band carries the Cr DCT coefficients.

In the fifth and sixth bands, bits 1 through 7 of the bit strings of the DCT coefficients (AC components) contained in the elements with scan numbers 6 to 63 are encoded. The fifth band carries the Cb DCT coefficients, and the sixth band carries the Cr DCT coefficients.

In the seventh band, bits 2 through 7 of the bit strings of the Y DCT coefficients (AC components) contained in the elements with scan numbers 6 to 63 are encoded.
In the eighth band, only bit 1 of the bit strings of the Y DCT coefficients (AC components) contained in the elements with scan numbers 1 to 63 is encoded.

In the ninth band, only bit 0 of the bit strings of the Y, Cb, and Cr DCT coefficients (DC components) contained in the element with scan number 0 is encoded.
In the tenth to twelfth bands, only bit 0 of the bit strings of the DCT coefficients (AC components) contained in the elements with scan numbers 1 to 63 is encoded. The tenth band carries the Y DCT coefficients, the eleventh band the Cb DCT coefficients, and the twelfth band the Cr DCT coefficients.
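The twelve-band division of FIG. 3 can be written down as a small table for checking. The following Python sketch is illustrative only: the names `Band`, `BANDS`, `bands_for`, and `covers_block` are hypothetical and do not appear in the embodiment, and scan and bit ranges are expressed with half-open Python `range` objects. It lists the bands that carry data for each component and checks that, for each component, every scan number 0 to 63 and every bit 0 to 7 is covered exactly once.

```python
from typing import NamedTuple


class Band(NamedTuple):
    index: int          # storage order in the stream (1-based, as in FIG. 3)
    components: tuple   # components whose coefficients the band carries
    scans: range        # scan numbers covered by the band
    bits: range         # bit positions covered by the band


BANDS = [
    Band(1, ("Y", "Cb", "Cr"), range(0, 1), range(1, 8)),
    Band(2, ("Y",), range(1, 6), range(2, 8)),
    Band(3, ("Cb",), range(1, 6), range(1, 8)),
    Band(4, ("Cr",), range(1, 6), range(1, 8)),
    Band(5, ("Cb",), range(6, 64), range(1, 8)),
    Band(6, ("Cr",), range(6, 64), range(1, 8)),
    Band(7, ("Y",), range(6, 64), range(2, 8)),
    Band(8, ("Y",), range(1, 64), range(1, 2)),
    Band(9, ("Y", "Cb", "Cr"), range(0, 1), range(0, 1)),
    Band(10, ("Y",), range(1, 64), range(0, 1)),
    Band(11, ("Cb",), range(1, 64), range(0, 1)),
    Band(12, ("Cr",), range(1, 64), range(0, 1)),
]


def bands_for(component):
    """Return the storage-order indices of the bands that carry `component`."""
    return [b.index for b in BANDS if component in b.components]


def covers_block(component):
    """True if each (scan, bit) pair of an 8 x 8 block is covered exactly once."""
    seen = set()
    for band in BANDS:
        if component not in band.components:
            continue
        for scan in band.scans:
            for bit in band.bits:
                if (scan, bit) in seen:
                    return False
                seen.add((scan, bit))
    return len(seen) == 64 * 8
```

For example, `bands_for("Y")` yields the first, second, and seventh to tenth bands, which is the set of bands that must be decoded before an inverse DCT for Y can run.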

Next, the decoding process in the present embodiment is described concretely.
FIG. 4 is a flowchart showing the processing procedure of the CPU during the decoding process.
First, on receiving progressively encoded JPEG data, the data analysis unit 21 analyzes the header, data amounts, and so on of the entire stream and determines the band processing order (step S11). At this time, the data analysis unit 21 rearranges the bands so that they can be processed component by component and compiles their processing order into a list. This processing order determination is described in detail later with reference to FIG. 7.

Next, the decoding control unit 22 starts the decoding process while referring to the band processing order list created by the data analysis unit 21. First, it selects the component to be processed next according to the component processing order determined by the data analysis unit 21 (step S12). It then selects the band to be processed next according to the processing order of the bands corresponding to the selected component (step S13).

The variable length decoding unit 11 reads the encoded data of the band selected in step S13 and executes variable length decoding (step S14). The inverse quantization unit 12 then inversely quantizes the quantized data obtained by the variable length decoding and stores the resulting DCT coefficients in the DCT coefficient buffer 15 (step S15).

Here, the decoding control unit 22 determines whether all bands corresponding to the component selected in step S12 have been processed (step S16). If not, the process returns to step S13 and the band to be processed next is selected. The encoded data of the selected band is then processed by the variable length decoding unit 11 and the inverse quantization unit 12.

When all bands corresponding to the selected component have been processed and the resulting DCT coefficients have been stored in the DCT coefficient buffer 15, the decoding control unit 22 performs processing to request the accelerator 130 (inverse DCT unit 13) to execute the inverse DCT on those DCT coefficients (step S17). In this processing, the decoding control unit 22 first asks whether the inverse DCT unit 13 is currently processing. If it is in the Busy state, the decoding control unit 22 repeatedly polls the inverse DCT unit 13 until the Busy state is released, and then requests execution of the inverse DCT for the next component.

After performing the inverse DCT execution request processing, the decoding control unit 22 determines whether all bands in the stream have been processed (step S18). If the inverse DCT unit 13 was in the Busy state in step S17, step S18 is executed in parallel with the subsequent polling.

If not all bands in the stream have been processed, the process returns to step S12 and the component to be processed next is selected. The bands corresponding to the selected component are then processed in turn by the variable length decoding unit 11 and the inverse quantization unit 12. When all bands in the stream have been processed, the decoding process ends.
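The control flow of steps S12 to S18 can be sketched in Python as follows. This is a simplified model under stated assumptions, not the embodiment's implementation: `FakeAccelerator` and `decode` are hypothetical names, variable length decoding and inverse quantization are reduced to a trace entry, and the accelerator is modelled as staying busy for a fixed number of status queries. The CPU keeps decoding bands and polls after each one, submitting the next inverse DCT request as soon as the accelerator reports that it is free.

```python
from collections import deque


class FakeAccelerator:
    """Stand-in for the inverse DCT unit; each status query advances its job."""

    def __init__(self, polls_per_job):
        self.polls_per_job = polls_per_job
        self.remaining = 0
        self.started = []   # components in the order their IDCT was started

    def busy(self):
        if self.remaining > 0:
            self.remaining -= 1
        return self.remaining > 0

    def start(self, component):
        self.remaining = self.polls_per_job
        self.started.append(component)


def decode(plan, accel):
    """plan: list of (component, [band numbers]) in processing order."""
    waiting = deque()   # components whose DCT coefficients are all buffered
    trace = []

    def try_submit():
        # Ask whether the accelerator is busy; if not, request the next IDCT.
        if waiting and not accel.busy():
            accel.start(waiting.popleft())

    for component, bands in plan:            # step S12
        for band in bands:                   # step S13
            trace.append(band)               # steps S14 and S15 (stubbed)
            try_submit()                     # poll while the CPU keeps working
        waiting.append(component)            # step S16: component complete
        try_submit()                         # step S17
    while waiting:                           # all bands done: flush requests
        try_submit()
    return trace
```

With `plan = [("Cb", [1, 3, 5, 9, 11]), ("Y", [2, 7, 8, 10]), ("Cr", [4, 6, 12])]`, the inverse DCT requests are issued in the order Cb, Y, Cr while the bands are decoded in the listed order.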

FIG. 5 is a diagram showing the processing sequence between the CPU and the accelerator during the decoding process.
First, at timing T21, the CPU 110 performs the data analysis processing of the data analysis unit 21. This processing corresponds to step S11 in FIG. 4 and, as described above, determines the band processing order so that the bands are processed component by component.

In the present embodiment, three components, Y, Cb, and Cr, exist as shown in FIG. 3, so the CPU 110 repeats steps S12 to S18 of FIG. 4 three times. The variable length decoding and inverse quantization performed by the CPU 110 after data analysis can therefore be divided into a first VLD&IQ period (timing T22 to T23), a second VLD&IQ period (timing T23 to T24), and a third VLD&IQ period (timing T24 to T27), one for each component. Likewise, the inverse DCT processing by the accelerator 130 can be divided into a first IDCT period (from timing T23 to immediately before timing T25), a second IDCT period (from timing T26 to immediately before timing T27), and a third IDCT period (timing T28 to T29), one for each component.

At timing T22, after the data analysis processing, the CPU 110 selects the component to be processed first and executes variable length decoding and inverse quantization on the bands corresponding to that component.

Next, at timing T23, when this processing is complete and all DCT coefficients needed for the inverse DCT of that component have been stored in the DCT coefficient buffer 15, the CPU 110 asks the accelerator 130 whether an inverse DCT is in progress. Because the accelerator 130 is idle at this point, the CPU 110 requests the accelerator 130 to execute the inverse DCT after receiving its response, and the accelerator 130, in response, reads the DCT coefficients from the DCT coefficient buffer 15 and executes the inverse DCT operation. Since the accelerator 130 has not executed any inverse DCT before the first IDCT period, the inverse DCT may instead be requested at timing T23 without first making the inquiry, as shown in FIG. 5.

In addition to issuing the inverse DCT execution request at timing T23, the CPU 110 selects the next component to be processed and executes variable length decoding and inverse quantization on the bands corresponding to that component. In this period, therefore, variable length decoding and inverse quantization in the CPU 110 and inverse DCT processing in the accelerator 130 are executed in parallel.

When that processing ends at timing T24, the CPU 110 asks the accelerator 130 whether an inverse DCT is in progress. In the example of FIG. 5, the accelerator 130 has not yet finished the inverse DCT for the first selected component at timing T24, so it returns a Busy signal to the CPU 110. On receiving the Busy signal, the CPU 110 selects the next component and starts its variable length decoding and inverse quantization, while repeatedly polling the accelerator 130 at regular intervals to ask whether it is still processing.

In the example of FIG. 5, the inverse DCT in the accelerator 130 has finished by the time of the poll at timing T25, so at timing T26, after receiving the response from the accelerator 130, the CPU 110 requests the accelerator 130 to execute the inverse DCT for the next component. From this point as well, variable length decoding and inverse quantization in the CPU 110 and inverse DCT processing in the accelerator 130 are executed in parallel.

Next, at timing T27, when the CPU 110 has finished variable length decoding and inverse quantization of all bands corresponding to the last component, it asks the accelerator 130 whether an inverse DCT is in progress. In the example of FIG. 5, the inverse DCT for the previous component has already finished in the accelerator 130 at timing T27, so at timing T28, after receiving the response from the accelerator 130, the CPU 110 requests the accelerator 130 to execute the inverse DCT for the next component. In response to this request, the accelerator 130 executes the inverse DCT for the last component and finishes it at timing T29.

In the accelerator 130, the inverse DCT operation is performed for one component in each of the first, second, and third IDCT periods. Each of the first, second, and third VLD&IQ periods in the CPU 110 can therefore be regarded as a period for extracting exactly the DCT coefficients needed to execute the inverse DCT for one component.

In the input JPEG data stream, on the other hand, the bands are basically stored in an order unrelated to the components. If variable length decoding and inverse quantization are executed in the band storage order, as in the conventional djpeg method, bands belonging to components that are not yet needed may also be processed in the first VLD&IQ period. As a result, the first VLD&IQ period becomes longer, and the processing periods of the CPU 110 from the second VLD&IQ period onward become shorter.

FIG. 6 is a diagram showing the band processing order when decoding is performed by the conventional djpeg method.
In the djpeg method, processing follows the storage order of the bands in the stream, so the CPU 110 executes variable length decoding and inverse quantization in the band order shown in FIG. 3. Assuming that the accelerator 130 executes the inverse DCT in the component order Y, Cb, Cr, all DCT coefficients needed for the inverse DCT of Y are obtained only when the CPU 110 has finished variable length decoding and inverse quantization of the tenth band, and only then can the accelerator 130 start the inverse DCT operation.

Accordingly, the period for variable length decoding and inverse quantization of the first to tenth bands corresponds to the first VLD&IQ period described above. Because bands not needed for the inverse DCT of Y (the third to sixth bands) are also processed in this period, it becomes longer than necessary, and the start of the first IDCT period is delayed. Furthermore, after the inverse DCT in the accelerator 130 has started, the CPU 110 processes the remaining bands, but some of the bands corresponding to Cb and Cr have already been processed in the first VLD&IQ period. Both the period for obtaining the DCT coefficients for Cb (corresponding to the second VLD&IQ period) and the period for obtaining the DCT coefficients for Cr (corresponding to the third VLD&IQ period) therefore become short. As a result, the period in which the CPU 110 and the accelerator 130 operate in parallel is shortened, and the period after the second VLD&IQ period in which no parallel processing takes place is lengthened, so the overall processing time also increases.
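The effect of storage-order decoding on the first VLD&IQ period can be checked with a short Python sketch. The names `BAND_COMPONENTS`, `bands_until_ready`, and `unneeded` are hypothetical helpers introduced only for illustration; the band-to-component mapping is the one given for FIG. 3.

```python
# Components carried by each band in the FIG. 3 example, keyed by storage order.
BAND_COMPONENTS = {
    1: ("Y", "Cb", "Cr"), 2: ("Y",), 3: ("Cb",), 4: ("Cr",),
    5: ("Cb",), 6: ("Cr",), 7: ("Y",), 8: ("Y",),
    9: ("Y", "Cb", "Cr"), 10: ("Y",), 11: ("Cb",), 12: ("Cr",),
}


def bands_until_ready(order, component):
    """Return the prefix of `order` that is decoded before `component` is complete."""
    needed = {b for b, comps in BAND_COMPONENTS.items() if component in comps}
    for i, band in enumerate(order):
        needed.discard(band)
        if not needed:
            return order[: i + 1]
    raise ValueError("order does not contain every band of the component")


def unneeded(prefix, component):
    """Bands in `prefix` that contribute nothing to `component`."""
    return [b for b in prefix if component not in BAND_COMPONENTS[b]]


storage_order = list(range(1, 13))
first_period = bands_until_ready(storage_order, "Y")   # bands 1 to 10
wasted = unneeded(first_period, "Y")                   # bands 3 to 6
```

In storage order, ten bands are decoded before the inverse DCT of Y can start, and four of them (the third to sixth) carry no Y data, which matches the description above.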

In the present embodiment, by contrast, the data analysis performed before variable length decoding and inverse quantization rearranges the band processing order so that variable length decoding and inverse quantization are also carried out component by component. This keeps the first VLD&IQ period as short as possible and lengthens the period in which the CPU 110 and the accelerator 130 operate in parallel, thereby shortening the overall processing time.
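The benefit of a short first VLD&IQ period can be illustrated with a simple two-stage pipeline model in Python. This is only a sketch under assumptions: `total_time` is a hypothetical helper, polling and request latency are ignored, and the durations are arbitrary illustrative numbers, not measurements from the embodiment.

```python
def total_time(vld_iq, idct):
    """Finish time of a CPU stage (VLD and IQ) feeding an accelerator stage (IDCT).

    vld_iq[i] and idct[i] are the durations for the i-th component processed.
    """
    cpu_done = 0
    acc_done = 0
    for cpu_cost, acc_cost in zip(vld_iq, idct):
        cpu_done += cpu_cost
        # The IDCT starts once its coefficients are ready and the accelerator is free.
        acc_done = max(acc_done, cpu_done) + acc_cost
    return acc_done


# Same total CPU work (12 units) split differently across the three periods.
storage_order_time = total_time([10, 1, 1], [6, 6, 6])   # long first period
per_component_time = total_time([5, 4, 3], [6, 6, 6])    # balanced periods
```

In this model the long first period yields a finish time of 28 units, against 23 units when the same CPU work is spread across the three periods, because the accelerator starts earlier and overlaps more of the CPU work.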

FIG. 7 is a flowchart showing the procedure of the data analysis processing in step S11 of FIG. 4.
In the data analysis unit 21, the processing amount prediction unit 31 first analyzes the stream header, identifies the color space of the encoded data, and predicts the processing amount of each component on that basis (step S31). From the stream header it is possible to recognize the types of components constituting the encoded data, the block size of each component in the MCU (Minimum Coded Unit), and so on, and these values differ from one color space to another. The processing amount prediction unit 31 identifies the component types from this information and predicts the data amount of each component; that is, it estimates that the larger the block size, the larger the amount of data to be processed. Color space types include, for example, those expressed by sampling ratios such as "4:2:2", "4:1:1", and "4:2:0".

The processing amount prediction in step S31 is not limited to the above. For example, properties or characteristics (that is, attributes) specific to a component or to the bands of a component, such as the number of bands per component, the number of divided bits per band, the state of coefficient division, and the mode setting, may be obtained by analyzing the stream header, and the prediction may be performed on the basis of that information. The ranking of predicted processing amounts may also be determined from a combination of several such factors. For example, the larger the number of DCT coefficient bits allocated to a band, the larger its data amount can be predicted to be.

Next, the processing order conversion unit 32 determines whether the prediction by the processing amount prediction unit 31 makes it possible to order all components by processing amount (step S32). If the predicted processing amounts (that is, the block sizes of the components) all differ, all components can be ranked, and the process proceeds to step S34. If, on the other hand, some components have the same predicted processing amount, the stream is analyzed and the actual data amount of each band is detected from the positions of the band start pointers, from which the data amount of each component is calculated (step S33).

In the example of FIG. 3, when the actual data amount is detected in step S33, the data amount for the Y component is obtained from the first, second, and seventh to tenth bands, the data amount for the Cb component from the first, third, fifth, ninth, and eleventh bands, and the data amount for the Cr component from the first, fourth, sixth, ninth, and twelfth bands.
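Steps S31 to S33, together with the descending ranking that follows them, can be sketched in Python as below. The function name `rank_components` and its parameters are hypothetical, and the byte counts are made-up illustrative values. The sketch also assumes that, when a tie is found, the measured amounts replace the predicted ones for all components; the description does not spell out this detail.

```python
def rank_components(predicted, measure_actual):
    """Return component names ordered from largest to smallest processing amount.

    predicted: dict of component -> predicted amount (for example, blocks per MCU).
    measure_actual: callable returning a dict of component -> measured data amount.
    """
    amounts = predicted                                  # result of step S31
    if len(set(amounts.values())) < len(amounts):        # step S32: any tie?
        amounts = measure_actual()                       # step S33: measured sizes
    return sorted(amounts, key=amounts.get, reverse=True)


# 4:2:2 sampling: two Y blocks and one Cb and one Cr block per MCU.
blocks_per_mcu = {"Y": 2, "Cb": 1, "Cr": 1}
# Made-up byte counts standing in for sizes derived from the band start pointers.
example_sizes = {"Y": 9000, "Cb": 2500, "Cr": 3100}
ranking = rank_components(blocks_per_mcu, lambda: example_sizes)
```

With these example values, Cb and Cr tie on block count, so the measured sizes decide the ranking, giving Y first, Cr second, and Cb third.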

Next, the processing order conversion unit 32 ranks the components in descending order of the data amounts obtained in steps S31 and S33 (step S34).
Next, the processing order conversion unit 32 makes a determination regarding the processing performance of the CPU 110 and the accelerator 130. Specifically, it determines which of the two is the bottleneck (that is, which is slower): variable length decoding and inverse quantization in the CPU 110, or inverse DCT in the accelerator 130 (step S35). Setting information indicating which is the bottleneck is recorded in advance in the external storage device 140 or the like, for example at product shipment, and the processing order conversion unit 32 makes the determination on the basis of this setting information.

When variable length decoding and inverse quantization in the CPU 110 are the bottleneck, the processing order conversion unit 32 converts the processing amount ranking assigned in step S34 so that the component with the smallest processing amount comes first. In this example, the ranking is converted to the order third, first, second. The converted order is used as the component processing order, and the band processing order is determined according to it and compiled into a list (step S36).

When, on the other hand, the inverse DCT in the accelerator 130 is the bottleneck, the processing order conversion unit 32 converts the processing amount ranking assigned in step S34 so that the component with the smallest processing amount comes last and the remaining components are arranged in ascending order of processing amount. In this example, the ranking is converted to the order second, first, third. The converted order is used as the component processing order, and the band processing order is determined according to it and compiled into a list (step S37).
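The two conversions of steps S36 and S37 can be sketched in Python as follows. `convert_order` and the bottleneck labels `"cpu"` and `"accelerator"` are hypothetical names. The description gives the result only for three components; how the components other than the smallest are ordered in the CPU-bottleneck case for more than three components is an assumption of this sketch (they keep their ranked order).

```python
def convert_order(ranked, bottleneck):
    """Convert a ranking (largest processing amount first) into a processing order."""
    if bottleneck == "cpu":
        # Smallest first: ranks 3, 1, 2 for three components (step S36).
        return [ranked[-1]] + list(ranked[:-1])
    if bottleneck == "accelerator":
        # Smallest last, the rest in ascending order: ranks 2, 1, 3 (step S37).
        return list(reversed(ranked[:-1])) + [ranked[-1]]
    raise ValueError(f"unknown bottleneck: {bottleneck!r}")
```

For the ranking Y, Cr, Cb, the result is Cb, Y, Cr when the CPU is the bottleneck and Cr, Y, Cb when the accelerator is the bottleneck.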

FIG. 8 is a diagram showing the band processing order when variable length decoding and inverse quantization in the CPU are the bottleneck.
In the present embodiment, it is assumed as an example that the data amount prediction in steps S31 and S33 has assigned the ranking Y first, Cr second, and Cb third. In this case, when variable length decoding and inverse quantization in the CPU 110 are the bottleneck, the processing of FIG. 7 determines the final component processing order to be Cb, Y, Cr.

First, in the first VLD&IQ period, the first, third, fifth, ninth, and eleventh bands, which contain Cb component data, are processed in order by the variable length decoding unit 11 and the inverse quantization unit 12, and DCT coefficients are output. Using the DCT coefficients obtained from these bands, the accelerator 130 can execute the inverse DCT for the Cb component.

When the accelerator 130 starts the inverse DCT for the Cb component, the second VLD&IQ period also starts. In this period, the second, seventh, eighth, and tenth bands, which contain Y component data, are processed in order by the variable length decoding unit 11 and the inverse quantization unit 12. Once it has finished processing the Cb component, the accelerator 130 can execute the inverse DCT for the Y component using the DCT coefficients obtained from these bands together with the DCT coefficients already obtained from the first and ninth bands in the first VLD&IQ period.

When the accelerator 130 starts the inverse DCT for the Y component, the third VLD&IQ period also starts. In this period, the fourth, sixth, and twelfth bands, which contain Cr component data, are processed in order by the variable length decoding unit 11 and the inverse quantization unit 12. Once it has finished processing the Y component, the accelerator 130 can execute the inverse DCT for the Cr component using the DCT coefficients obtained from these bands together with the DCT coefficients already obtained from the first and ninth bands in the first VLD&IQ period.

FIG. 9 is a diagram showing the band processing order when the inverse DCT in the accelerator is the bottleneck.
As described above, the predicted processing amounts of the components in the present embodiment are ranked in the order Y, Cr, Cb. When the inverse DCT in the accelerator 130 is the bottleneck, the processing of FIG. 7 therefore determines the final component processing order to be Cr, Y, Cb.

First, in the first VLD&IQ period, the first, fourth, sixth, ninth, and twelfth bands, which contain Cr component data, are processed in order by the variable length decoding unit 11 and the inverse quantization unit 12, and DCT coefficients are output. Using the DCT coefficients obtained from these bands, the accelerator 130 can execute the inverse DCT for the Cr component.

When the accelerator 130 starts the inverse DCT for the Cr component, the second VLD & IQ period also starts. In this period, the second, seventh, eighth, and tenth bands, which contain Y component data, are processed in order by the variable length decoding unit 11 and the inverse quantization unit 12. Once it has finished processing the Cr component, the accelerator 130 can execute the inverse DCT for the Y component using the DCT coefficients obtained from these bands together with the DCT coefficients already obtained from the first and ninth bands in the first VLD & IQ period.

When the accelerator 130 starts the inverse DCT for the Y component, the third VLD & IQ period also starts. In this period, the third, fifth, and eleventh bands, which contain Cb component data, are processed in order by the variable length decoding unit 11 and the inverse quantization unit 12. Once it has finished processing the Y component, the accelerator 130 can execute the inverse DCT for the Cb component using the DCT coefficients obtained from these bands together with the DCT coefficients already obtained from the first and ninth bands in the first VLD & IQ period.

With the above processing, the band processing order is rearranged so that each of the first, second, and third VLD & IQ periods processes only the minimum set of bands needed to perform the inverse DCT for one component. This advances the time at which the accelerator 130 starts inverse DCT processing and lengthens the period in which the processing of the CPU 110 and the accelerator 130 can run in parallel (that is, the period from the start of the second VLD & IQ period to the end of the third VLD & IQ period). As a result, the time required for the entire process is shortened.
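The grouping of bands into per-component VLD & IQ periods can be sketched as follows. This is only an illustration: the function name `reorder_bands` and the table `BAND_COMPONENTS` are hypothetical, and the band-to-component mapping is the one implied by the band numbers quoted in this description (bands 1 and 9 shared by all components).

```python
# Band number -> components whose data the band carries.
# Assumed from the band lists quoted in the description.
BAND_COMPONENTS = {
    1: {"Y", "Cb", "Cr"}, 2: {"Y"}, 3: {"Cb"}, 4: {"Cr"},
    5: {"Cb"}, 6: {"Cr"}, 7: {"Y"}, 8: {"Y"},
    9: {"Y", "Cb", "Cr"}, 10: {"Y"}, 11: {"Cb"}, 12: {"Cr"},
}


def reorder_bands(component_order, band_components=BAND_COMPONENTS):
    """Return one list of bands per VLD & IQ period.

    Each period holds, in stream order, only the bands that its component
    needs and that no earlier period has already decoded.
    """
    done = set()
    periods = []
    for comp in component_order:
        period = [b for b in sorted(band_components)
                  if comp in band_components[b] and b not in done]
        done.update(period)
        periods.append(period)
    return periods


# Accelerator-bound case of FIG. 9 (component order Cr, Y, Cb).
print(reorder_bands(["Cr", "Y", "Cb"]))
# [[1, 4, 6, 9, 12], [2, 7, 8, 10], [3, 5, 11]]
```

Because shared bands are decoded only in the first period that needs them, the later periods shrink to the component-specific bands, which is what lets the inverse DCT start early.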

When the variable length decoding and inverse quantization in the CPU 110 are the bottleneck, the band processing order is determined so that the component with the smallest predicted processing amount is processed first. This reduces the processing amount in the first VLD & IQ period, in which only the relatively slow CPU 110 operates, and increases the processing amount in the second and third VLD & IQ periods, which can be parallelized. The first VLD & IQ period therefore becomes shorter, so the inverse DCT can start even earlier and the parallelizable period becomes longer. As a result, the time required for the entire process can be shortened further.

In the above example, the final band processing order was determined by permuting the ranks of the components' predicted processing amounts into the order third, first, second. Alternatively, the ranks may be permuted into, for example, the order third, second, first. Processing in the order third, first, second can shorten the third IDCT period, in which the accelerator 130 operates alone. On the other hand, because the processing amount of the second VLD & IQ period becomes larger, the first IDCT period is more likely to end before the second VLD & IQ period does, leaving the accelerator 130 in a standby state. Processing in the order third, second, first makes it less likely that the accelerator 130 enters a standby state after the first IDCT period, so the overall processing time may be shorter in some cases. On the other hand, the third IDCT period, in which the accelerator 130 operates alone, becomes longer.

When the inverse DCT processing in the accelerator 130 is the bottleneck, the band processing order is determined so that the component with the smallest predicted processing amount is processed last. This reduces the processing amount in the third IDCT period, in which the relatively slow accelerator 130 operates alone, and so shortens that period. In addition, processing the components other than the last one in ascending order of processing amount reduces the processing amount of the first VLD & IQ period and advances the start of inverse DCT processing by the accelerator 130. As a result, the time required for the entire process can be shortened further.
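The two rank permutations described above can be written as a small helper. This is a sketch under stated assumptions: the function name `final_component_order` and the string labels `"cpu"` and `"accelerator"` are hypothetical, and the input is assumed to be already ranked in descending order of predicted processing amount.

```python
def final_component_order(ranked, bottleneck):
    """Permute components ranked by descending predicted processing amount.

    ranked     -- e.g. ["Y", "Cr", "Cb"] (first = largest, last = smallest)
    bottleneck -- "cpu" (VLD & IQ is slower) or "accelerator" (inverse DCT is slower)
    """
    smallest, rest = ranked[-1], ranked[:-1]
    if bottleneck == "cpu":
        # Smallest component first, the others in their ranked order:
        # third, first, second.
        return [smallest] + rest
    if bottleneck == "accelerator":
        # Smallest component last, the others in ascending order of amount.
        return rest[::-1] + [smallest]
    raise ValueError("bottleneck must be 'cpu' or 'accelerator'")


print(final_component_order(["Y", "Cr", "Cb"], "cpu"))          # ['Cb', 'Y', 'Cr']
print(final_component_order(["Y", "Cr", "Cb"], "accelerator"))  # ['Cr', 'Y', 'Cb']
```

The alternative order third, second, first mentioned for the CPU-bound case would correspond to returning `[smallest] + rest[::-1]` instead.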

Next, the time savings achieved by the above processing are described more concretely. As an example, the following description uses a processing unit time d and assumes that the data analysis by the data analysis unit 21 takes 1d, that the variable length decoding unit 11 and the inverse quantization unit 12 take 1d to process one band, and that the inverse DCT unit 13 takes 4d to process one component. The JPEG data stream to be processed is assumed to consist of the three components Y, Cb, and Cr, with the bands stored in the stream in the order shown in FIG. 3.

FIG. 10 is a diagram for explaining the processing time when parallel processing is performed using the djpeg method.
In the djpeg method, the bands are processed in the order in which they are stored in the stream, so the CPU 110 selects the bands one after another in the same order as in FIG. 3 and executes variable length decoding and inverse quantization. Consequently, obtaining the DCT coefficients needed for the inverse DCT of one component (here, Y) requires variable length decoding and inverse quantization of ten bands, the first through the tenth, and 10d elapses before the accelerator 130 starts the inverse DCT. Furthermore, once the inverse DCT has started, the variable length decoding and inverse quantization in the CPU 110 and the inverse DCT in the accelerator 130 run in parallel for only 2d, so parallelization is inefficient. In this case, decoding the entire stream takes 22d.

FIG. 11 is a diagram for explaining the processing time when the band processing order is rearranged based on the predicted processing amount of each component.
The example of FIG. 11 shows the processing when the band processing order is rearranged by steps S31 to S34 in FIG. 7. Here, the predicted processing amount is assumed to decrease in the order Y, Cr, Cb, and the components are assumed to be ranked in this order by the above processing.

In this case, after the data analysis processing in steps S31 to S34, the CPU 110 applies variable length decoding and inverse quantization in sequence to the first, second, seventh, eighth, ninth, and tenth bands, as shown in FIG. 11, which yields all the DCT coefficients needed to execute the inverse DCT of the Y component. The time from the start of decoding to the start of the inverse DCT in the accelerator 130 is therefore shortened to 7d, even including the data analysis time. In addition, the period in which the CPU 110 and the accelerator 130 run in parallel grows to 6d. With the processing made more efficient in this way, the overall processing time is shortened to 19d.

FIG. 12 is a diagram for explaining the processing time when the band processing order is rearranged based on the predicted processing amount and the processing performance.
The example of FIG. 12 shows the processing when the ranks of predicted processing amount assigned in steps S31 to S34 in FIG. 7 are further permuted by steps S35 to S37, taking into account the processing performance of the CPU 110 and the accelerator 130. Here, the variable length decoding and inverse quantization in the CPU 110 are assumed to be the bottleneck, and the ranks are permuted into the order third, first, second. That is, the component processing order is Cb, Y, Cr, and the bands are processed in the order shown in FIG. 8.

In this case, as described above, fewer bands undergo variable length decoding and inverse quantization before the accelerator 130 starts the inverse DCT, so less data is processed in that interval. The time from the start of decoding to the start of the inverse DCT is therefore shortened to 6d. In addition, the period in which the CPU 110 and the accelerator 130 run in parallel grows to 7d. As a result, the overall processing time becomes 18d, which is 4d shorter than the conventional processing example of FIG. 10 and 1d shorter than the processing example of FIG. 11.
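The figures of 22d, 19d, and 18d can be reproduced with a small two-stage pipeline model. This is a minimal sketch, not the patented implementation: `pipeline_times` and `BAND_COMPONENTS` are hypothetical names, the band-to-component mapping is the one implied by the band lists in this description, and the costs (1d analysis, 1d per band, 4d per component) are the assumptions stated above.

```python
# Band number -> components whose data the band carries (assumed mapping).
BAND_COMPONENTS = {
    1: {"Y", "Cb", "Cr"}, 2: {"Y"}, 3: {"Cb"}, 4: {"Cr"},
    5: {"Cb"}, 6: {"Cr"}, 7: {"Y"}, 8: {"Y"},
    9: {"Y", "Cb", "Cr"}, 10: {"Y"}, 11: {"Cb"}, 12: {"Cr"},
}


def pipeline_times(band_order, idct_order, analysis=0, band_cost=1, idct_cost=4):
    """Return (time the first inverse DCT starts, total time), in units of d."""
    # Stage 1 (CPU): VLD & IQ on one band at a time, after data analysis.
    t = analysis
    band_done = {}
    for band in band_order:
        t += band_cost
        band_done[band] = t
    # Stage 2 (accelerator): inverse DCT on one component at a time, as soon
    # as all of its bands are decoded and the accelerator is free.
    acc_free = 0
    first_start = None
    for comp in idct_order:
        ready = max(done for band, done in band_done.items()
                    if comp in BAND_COMPONENTS[band])
        start = max(ready, acc_free)
        if first_start is None:
            first_start = start
        acc_free = start + idct_cost
    return first_start, max(t, acc_free)


# FIG. 10: stream order, no data analysis step.
print(pipeline_times(list(range(1, 13)), ["Y", "Cb", "Cr"]))          # (10, 22)
# FIG. 11: component order Y, Cr, Cb, with 1d of data analysis.
print(pipeline_times([1, 2, 7, 8, 9, 10, 4, 6, 12, 3, 5, 11],
                     ["Y", "Cr", "Cb"], analysis=1))                  # (7, 19)
# FIG. 12: component order Cb, Y, Cr, with 1d of data analysis.
print(pipeline_times([1, 3, 5, 9, 11, 2, 7, 8, 10, 4, 6, 12],
                     ["Cb", "Y", "Cr"], analysis=1))                  # (6, 18)
```

In the FIG. 12 case the accelerator is never idle between components once it has started, which is where the additional 1d over FIG. 11 comes from.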

In the above description, the time required for the data analysis processing (that is, the band processing order determination based on the stream analysis result, shown in FIG. 7) was taken to be one processing unit time d. The above embodiment shortens the processing time whenever this data analysis time is shorter than the time saved over the whole decoding process by reordering the bands (that is, the difference between the total processing time in FIG. 10 and the time from the start of variable length decoding for the first band to the end of the inverse DCT for the last component after reordering).
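Stated as an inequality, with symbols introduced here purely for illustration (they do not appear in the original): let T_a be the data analysis time, T_djpeg the total time of FIG. 10, and T_reord the time from the first variable length decoding to the last inverse DCT after reordering, excluding the analysis. Using the FIG. 12 values as a worked instance:

```latex
T_{\mathrm{a}} < T_{\mathrm{djpeg}} - T_{\mathrm{reord}},
\qquad
T_{\mathrm{a}} = 1d,\quad
T_{\mathrm{djpeg}} = 22d,\quad
T_{\mathrm{reord}} = 18d - 1d = 17d
\;\Rightarrow\;
1d < 5d .
```

The net gain is then T_djpeg - (T_a + T_reord) = 22d - 18d = 4d, matching the 4d reduction stated for FIG. 12.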

In the above embodiment, the DCT is used in encoding and decoding to convert between the time components and the frequency components of the image data. However, the present invention is not limited to this and can also be applied to compression coding systems that use other transforms, such as the wavelet transform.

Furthermore, in the above embodiment, encoding and decoding are performed using a so-called YUV color space. However, the present invention is not limited to this and can also be applied when other color spaces, such as CMYK or YCCK, are used.

The data analysis and decoding control functions of the above embodiment, as well as the variable length decoding and inverse quantization functions, can be realized by a computer. In that case, a program describing the processing of these functions is provided, and the functions are realized on the computer by executing the program. The program describing the processing can be recorded on a computer-readable recording medium. Computer-readable recording media include magnetic recording media such as magnetic tapes and hard disks, optical disks, magneto-optical recording media, and semiconductor memories.

To distribute the program, for example, portable recording media such as optical disks on which the program is recorded are sold. The program can also be stored in a storage device of a server computer and transferred from the server computer to other computers over a network.

A computer that executes the program stores, for example, the program recorded on the portable recording medium or the program transferred from the server computer in a storage device connected to the computer. The computer then reads the program from that storage device and executes processing according to the program. The computer can also read the program directly from the portable recording medium and execute processing according to it. In addition, the computer can execute processing according to the received program each time a program is transferred from the server computer.

Furthermore, in the above embodiment, the inverse DCT is executed by dedicated hardware (an accelerator) separate from the CPU. However, the above processing procedure can also be applied to, for example, a configuration with a plurality of CPUs operable in parallel, in which one CPU performs variable length decoding and inverse quantization and another CPU performs the inverse DCT. In this case, the functions of the decoding apparatus, including the inverse DCT function, can also be realized by software processing on a computer.
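The software-only variant can be sketched with two threads and a queue standing in for the two processing units. Everything here is an assumption made for illustration: `vld_iq` and `idct` are placeholder stubs rather than real decoders, and `decode` is a hypothetical name. Note also that CPython threads do not run CPU-bound code truly in parallel, so the sketch shows only the hand-off structure, not the speed-up.

```python
import queue
import threading


def vld_iq(band):
    """Placeholder for variable length decoding + inverse quantization of one band."""
    return f"coef(band {band})"


def idct(component, coefficients):
    """Placeholder for the inverse DCT of one component."""
    return f"{component}: {len(coefficients)} bands"


def decode(periods):
    """periods: list of (component, bands it needs), already in processing order."""
    handoff = queue.Queue()   # coefficient data passed from unit 1 to unit 2
    results = []

    def second_unit():
        # Second processing unit: inverse DCT, one component at a time.
        while True:
            item = handoff.get()
            if item is None:          # sentinel: no more components
                break
            component, coefficients = item
            results.append(idct(component, coefficients))

    worker = threading.Thread(target=second_unit)
    worker.start()

    # First processing unit: decode each band once, then hand a component
    # over as soon as all of its bands are available.
    decoded = {}
    for component, bands in periods:
        for band in bands:
            if band not in decoded:
                decoded[band] = vld_iq(band)
        handoff.put((component, [decoded[b] for b in bands]))
    handoff.put(None)
    worker.join()
    return results


print(decode([("Cb", [1, 3, 5, 9, 11]),
              ("Y", [1, 2, 7, 8, 9, 10]),
              ("Cr", [1, 4, 6, 9, 12])]))
# ['Cb: 5 bands', 'Y: 6 bands', 'Cr: 5 bands']
```

The first loop keeps decoding bands for the next component while the worker is busy with the previous one, mirroring the overlap between the VLD & IQ periods and the IDCT periods.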

A block diagram schematically showing the hardware configuration of the image decoding apparatus according to the embodiment.
A block diagram showing the functions of the image decoding apparatus.
A diagram showing an example of the band division applied in the embodiment.
A flowchart showing the processing procedure of the CPU during decoding.
A diagram showing the processing sequence between the CPU and the accelerator during decoding.
A diagram showing the band processing order when decoding is performed by the conventional djpeg method.
A flowchart showing the procedure of the data analysis processing of step S11 in FIG. 4.
A diagram showing the band processing order when the variable length decoding and inverse quantization in the CPU are the bottleneck.
A diagram showing the band processing order when the inverse DCT processing in the accelerator is the bottleneck.
A diagram for explaining the processing time when parallel processing is performed using the djpeg method.
A diagram for explaining the processing time when the band processing order is rearranged based on the predicted processing amount of each component.
A diagram for explaining the processing time when the band processing order is rearranged based on the predicted processing amount and the processing performance.
A block diagram schematically showing the configuration of a decoding apparatus in the JPEG standard system.
A diagram for explaining the block structure and scan order in JPEG data.
A diagram showing a configuration example of the bands.
A diagram showing the data processed for each band.
A block diagram schematically showing the configuration of a djpeg decoding apparatus.
A flowchart showing the processing procedure of the CPU when djpeg decoding is parallelized.

Explanation of symbols

11 ... variable length decoding unit, 12 ... inverse quantization unit, 13 ... inverse DCT unit, 14 ... color conversion unit, 15 ... DCT coefficient buffer, 21 ... data analysis unit, 22 ... decoding control unit, 31 ... processing amount prediction unit, 32 ... processing order conversion unit, 110 ... CPU, 120 ... main memory, 130 ... accelerator, 131 ... arithmetic unit, 132 ... local memory, 140 ... external storage device, 150 ... bus

Claims

An image decoding apparatus for decoding compressed image data encoded by a progressive method, comprising:
a processing order determination unit that rearranges the encoded data groups, one for each predetermined divided encoding unit constituting the compressed image data, so that they are processed component by component;
a first arithmetic processing unit that sequentially receives the encoded data groups rearranged by the processing order determination unit, performs variable length decoding and inverse quantization on them, and outputs coefficient data of frequency components; and
a second arithmetic processing unit that is operable in parallel with the first arithmetic processing unit and converts the coefficient data from the first arithmetic processing unit into time component data,
wherein, when the first arithmetic processing unit has processed the encoded data groups corresponding to one component, the second arithmetic processing unit converts the output coefficient data into time component data while, in parallel, the first arithmetic processing unit processes the encoded data groups corresponding to the next component.

The image decoding apparatus according to claim 1, wherein the processing order determination unit rearranges the encoded data group based on a data amount of the encoded data group for each component.

The image decoding apparatus according to claim 2, wherein the processing order determination unit predicts the data amount for each component by analyzing a header of the compressed image data.

The image decoding apparatus, wherein the processing order determination unit predicts the data amount for each component based on the block size of each component in the minimum coded unit, obtained from the analysis result of the header.

The image decoding apparatus according to claim 3, wherein the processing order determination unit predicts the data amount for each component based on an attribute of the encoded data group corresponding to each component, obtained from the analysis result of the header.

The image decoding apparatus, wherein, for components whose difference in data amount cannot be determined from the analysis result of the header, the processing order determination unit detects the actual data amount of the encoded data groups of each such component.

The image decoding apparatus according to claim 2, wherein the processing order determination unit assigns a processing order to the components in descending order of the data amount, further permutes the processing order based on the processing performance of the first arithmetic processing unit and the second arithmetic processing unit, and rearranges the encoded data groups so that the components are processed in the permuted processing order.

The image decoding apparatus according to claim 7, wherein the processing order determination unit:
when the processing of the first arithmetic processing unit is a bottleneck relative to the processing of the second arithmetic processing unit, places the component with the smallest data amount first in the processing order; and
when the processing of the second arithmetic processing unit is a bottleneck relative to the processing of the first arithmetic processing unit, places the component with the smallest data amount last in the processing order and orders the remaining components in ascending order of the data amount.

The image decoding apparatus, wherein the encoded data groups are data groups obtained by encoding the coefficient data using, as the divided encoding unit, at least one of a spatial frequency component and a component obtained by dividing a bit string.

The image decoding apparatus according to claim 1, wherein the coefficient data is a DCT coefficient, and the second arithmetic processing unit performs inverse DCT on the coefficient data.

The image decoding apparatus according to claim 1, wherein the compressed image data is image data compressed and encoded by a JPEG method.

An image decoding method for decoding compressed image data encoded by a progressive method using a first arithmetic processing unit and a second arithmetic processing unit operable in parallel with each other, wherein:
a processing order determination unit rearranges the encoded data groups, one for each predetermined divided encoding unit constituting the compressed image data, so that they are processed component by component;
a decoding control unit sequentially supplies the encoded data groups rearranged by the processing order determination unit to the first arithmetic processing unit and causes it to execute variable length decoding and inverse quantization and to output coefficient data of frequency components; and
when the encoded data groups corresponding to one component have been processed by the first arithmetic processing unit, the decoding control unit supplies the output coefficient data to the second arithmetic processing unit and causes it to execute conversion into time components while, in parallel, causing the first arithmetic processing unit to execute variable length decoding and inverse quantization on the encoded data groups corresponding to the next component.

An image decoding program for decoding compressed image data encoded by a progressive method using a first arithmetic processing unit and a second arithmetic processing unit operable in parallel with each other, the program causing a computer to function as:
a processing order determination unit that rearranges the encoded data groups, one for each predetermined divided encoding unit constituting the compressed image data, so that they are processed component by component; and
a decoding control unit that sequentially supplies the encoded data groups rearranged by the processing order determination unit to the first arithmetic processing unit, causes it to execute variable length decoding and inverse quantization and to output coefficient data of frequency components, and, when the encoded data groups corresponding to one component have been processed by the first arithmetic processing unit, supplies the output coefficient data to the second arithmetic processing unit and causes it to execute conversion into time components while, in parallel, causing the first arithmetic processing unit to execute variable length decoding and inverse quantization on the encoded data groups corresponding to the next component.