JP4565393B2 - Video signal hierarchical encoding apparatus, video signal hierarchical encoding method, and video signal hierarchical encoding program

Info

Publication number: JP4565393B2
Application number: JP2005369547A
Authority: JP
Inventors: 和博嶋内; 智坂爪; 徹熊倉; 基晴上田
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 2005-12-22
Filing date: 2005-12-22
Publication date: 2010-10-20
Anticipated expiration: 2025-12-22
Also published as: JP2007174285A

Description

The present invention relates to the coding of video signals, and in particular to hierarchical coding.

Many coding schemes that provide scalability in spatial resolution, temporal resolution, and SNR have been proposed for video coding, and they have been put to practical use in various fields. Spatial resolution scalability in particular has a wide range of application, including still image coding.

Patent Document 1 describes a conventional technique for realizing spatial resolution scalability of video. FIG. 11 shows a configuration example of the encoding unit 1101 and the decoding unit 1103 of Patent Document 1. The original video signal is input to the encoding unit 1101, and the bitstream generated by the encoding unit 1101 is transmitted to the decoding unit 1103 via a communication line, media, or the like 1102. The decoding unit 1103 extracts the necessary information from the supplied bitstream and outputs a decoded video signal whose spatial resolution suits the capability of the display or other output device.

The encoding unit 1101 comprises a spatial decimation unit (spatial reduction unit) 1104, a base layer encoding unit 1105, a spatial interpolation unit (spatial enlargement unit) 1106, an enhancement layer encoding unit 1107, and a multiplexing unit 1108.

The spatial decimation unit 1104 receives the original video signal as input and spatially decimates it to a desired spatial resolution (that is, lowers its resolution). It outputs the decimated signal to the base layer encoding unit 1105.

The base layer encoding unit 1105 receives the output of the spatial decimation unit 1104 as input, encodes it to generate a bitstream, and outputs the bitstream to the multiplexing unit 1108. MPEG-2 or a similar scheme is used as the encoding method. The unit also outputs the locally decoded signal, obtained by the local decoding performed in MPEG-2 or the like, to the spatial interpolation unit 1106.

The spatial interpolation unit 1106 receives the locally decoded signal output from the base layer encoding unit 1105 as input and spatially interpolates it to the resolution of the enhancement layer signal. It outputs the interpolated signal to the enhancement layer encoding unit 1107.

The enhancement layer encoding unit 1107 receives the original video signal and the signal output from the spatial interpolation unit 1106 as inputs. Using these signals, it performs prediction that exploits the correlation between spatial resolutions and the correlation in time, and encodes the resulting prediction error signal. It outputs the bitstream generated by this encoding to the multiplexing unit 1108.

The multiplexing unit 1108 receives the bitstreams output from the base layer encoding unit 1105 and the enhancement layer encoding unit 1107 as inputs, multiplexes them into a single bitstream, and outputs that bitstream outside the encoding unit 1101, for example to the communication line, media, or the like 1102.

The decoding unit 1103 comprises an extraction unit 1109, a base layer decoding unit 1110, a spatial interpolation unit 1111, and an enhancement layer decoding unit 1112.
The extraction unit 1109 receives the bitstream as input. According to the capability of the decoding unit 1103 or of the display or other output device, it cuts out from the whole bitstream the portions needed for decoding, divides them, and outputs them to the base layer decoding unit 1110 and the enhancement layer decoding unit 1112 respectively.

The base layer decoding unit 1110 receives the base layer bitstream cut out by the extraction unit 1109 as input. It decodes this bitstream and outputs the decoded video signal to the spatial interpolation unit 1111 and, when required, to a display or the like. An MPEG-2 decoder or similar is used for the decoding.

The spatial interpolation unit 1111 receives the base layer decoded signal output from the base layer decoding unit 1110 as input and spatially interpolates it to the resolution of the enhancement layer signal. It outputs the interpolated signal to the enhancement layer decoding unit 1112.

The enhancement layer decoding unit 1112 receives the bitstream obtained from the extraction unit 1109 and the signal output from the spatial interpolation unit 1111 as inputs. Using these inputs, it decodes a signal at the spatial resolution of the original video signal. The decoded video signal is output to a display or the like.

FIG. 12 shows the procedure for spatially scalable encoding of a video signal using the configuration example of the encoding unit 1101 shown in FIG. 11.
First, the spatial decimation unit 1104 decimates the spatial resolution of the original video signal [step S1201]. The base layer encoding unit 1105 encodes the decimated signal and generates a bitstream [step S1202]. The generated bitstream is sent to the multiplexing unit 1108, and the base layer locally decoded signal obtained during encoding is sent to the spatial interpolation unit 1106. The spatial interpolation unit 1106 interpolates the spatial resolution of the base layer locally decoded signal obtained from the base layer encoding unit 1105 [step S1203]. The spatially interpolated signal is then sent to the enhancement layer encoding unit 1107.

Using the original video signal and the output signal of the spatial interpolation unit 1106, the enhancement layer encoding unit 1107 performs prediction that exploits the correlation between spatial resolutions and the correlation in time, and encodes the resulting prediction error signal [step S1204]. The bitstream generated by this encoding is sent to the multiplexing unit 1108. The multiplexing unit 1108 multiplexes the bitstreams obtained from the base layer encoding unit 1105 and the enhancement layer encoding unit 1107 into a single bitstream [step S1205].
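Steps S1201 to S1205 can be sketched on a one-dimensional signal as follows. This is only an illustrative sketch under simplifying assumptions: pair averaging stands in for the decimation filter, uniform quantization stands in for the MPEG-2 base layer codec, sample repetition stands in for the interpolation filter, and the enhancement layer uses inter-resolution prediction only (no temporal prediction). All function names, step sizes, and the dictionary used as a "multiplexed" stream are hypothetical.

```python
def decimate(signal, factor=2):
    """S1201: lower the resolution by averaging each group of `factor` samples."""
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal) - factor + 1, factor)]


def base_layer_encode(signal, step=8):
    """S1202: toy base-layer codec (uniform quantization, NOT MPEG-2).

    Returns the quantized symbols and the locally decoded signal.
    """
    symbols = [round(s / step) for s in signal]
    local_decode = [q * step for q in symbols]
    return symbols, local_decode


def interpolate(signal, factor=2):
    """S1203: bring the local decode back to enhancement-layer resolution."""
    return [s for s in signal for _ in range(factor)]


def enhancement_layer_encode(original, prediction, step=2):
    """S1204: quantize the prediction error (original minus prediction)."""
    return [round((o - p) / step) for o, p in zip(original, prediction)]


def encode(original):
    low = decimate(original)
    base_symbols, local_decode = base_layer_encode(low)
    prediction = interpolate(local_decode)
    enh_symbols = enhancement_layer_encode(original, prediction)
    # S1205: "multiplex" the two layers into one container.
    return {"base": base_symbols, "enhancement": enh_symbols}


stream = encode([10, 12, 40, 44, 88, 92, 20, 22])
```

The closer the interpolated local decode is to the original, the smaller the values the enhancement layer has to code.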

FIG. 13 shows the procedure for decoding a spatially scalable bitstream to obtain a decoded video signal using the configuration example of the decoding unit 1103 shown in FIG. 11.
The extraction unit 1109 receives the bitstream from the communication line, media, or the like 1102. It analyzes the bitstream and extracts the coded data needed according to the capability of the decoding unit 1103 and of the display or other output device. It then divides the data into the portions corresponding to the base layer decoding unit 1110 and the enhancement layer decoding unit 1112 and outputs them [step S1301].

The base layer decoding unit 1110 decodes the data corresponding to the base layer divided out by the extraction unit 1109 [step S1302]. The decoded base layer video signal is output to the spatial interpolation unit 1111 and, if necessary, to a display or the like. The spatial interpolation unit 1111 interpolates the spatial resolution of the base layer decoded video signal obtained from the base layer decoding unit 1110 [step S1303]. The spatially interpolated signal is sent to the enhancement layer decoding unit 1112. The enhancement layer decoding unit 1112 decodes the data corresponding to the enhancement layer divided out by the extraction unit 1109, using the signal spatially interpolated by the spatial interpolation unit 1111 [step S1304]. The decoded video signal is then output to a display or the like.
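The decoding steps S1301 to S1304 can be sketched in the same toy setting. This is a sketch under assumptions, not the decoder of Patent Document 1: the bitstream is modeled as a dictionary with hypothetical keys `base` and `enhancement`, dequantization by a fixed step stands in for the MPEG-2 decoder, and sample repetition stands in for the interpolation filter.

```python
def extract(bitstream, want_enhancement=True):
    """S1301: keep only the layers the decoder or display can use."""
    base = bitstream["base"]
    enhancement = bitstream["enhancement"] if want_enhancement else None
    return base, enhancement


def base_layer_decode(symbols, step=8):
    """S1302: toy base-layer decoder (inverse of uniform quantization)."""
    return [q * step for q in symbols]


def interpolate(signal, factor=2):
    """S1303: bring the base-layer decode up to enhancement-layer resolution."""
    return [s for s in signal for _ in range(factor)]


def enhancement_layer_decode(symbols, prediction, step=2):
    """S1304: add the dequantized prediction error to the prediction."""
    return [p + q * step for q, p in zip(symbols, prediction)]


def decode(bitstream, want_enhancement=True):
    base_symbols, enh_symbols = extract(bitstream, want_enhancement)
    low = base_layer_decode(base_symbols)
    if enh_symbols is None:
        return low  # a low-resolution display stops at the base layer
    return enhancement_layer_decode(enh_symbols, interpolate(low))


stream = {"base": [1, 5, 11, 3], "enhancement": [1, 2, 0, 2, 0, 2, -2, -1]}
full = decode(stream)                            # full-resolution output
small = decode(stream, want_enhancement=False)   # base-layer output only
```

The same stream serves both outputs, which is the point of the spatially scalable layout.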

Meanwhile, in the field of image enlargement, Non-Patent Document 1 describes a technique that, when enlarging an image, estimates and adds high-frequency components appropriate to the enlarged resolution. Non-Patent Document 1 applies the Laplacian pyramid concept from hierarchical coding to image enlargement. Exploiting the strong correlation of Laplacian components between layers, the method estimates the Laplacian component of the layer one step higher in spatial resolution from the signal of the layer of interest alone.

FIG. 14 shows a configuration example of the image enlargement unit 1401 with high-frequency component estimation according to Non-Patent Document 1. The image enlargement unit 1401 comprises a first high-pass filtering unit 1402, a first interpolation unit 1403, an amplitude limiting and constant multiplication unit 1404, a second high-pass filtering unit 1405, a second interpolation unit 1406, and a signal synthesis unit 1407.

The first high-pass filtering unit 1402 receives the original signal to be enlarged as input and extracts its Laplacian component. The Laplacian component is extracted as follows. For simplicity, consider a one-dimensional signal model in which the input signal is G₀(x) and the Laplacian component extracted from it is L₀(x).

[Equation not reproduced in the source text.]

Here, ρ is a parameter for adjusting the band of the Gaussian filter. The first high-pass filtering unit 1402 outputs the Laplacian component signal extracted from the input signal to the first interpolation unit 1403.
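Because the equations are not reproduced in this text, the following is only a sketch of one common way to obtain a Laplacian component: subtract a Gaussian-smoothed copy of the signal from the signal itself. The kernel form `exp(-i²/(2ρ²))`, the kernel radius, the clamped edge handling, and all function names are assumptions made for illustration and may differ from the filter W(i) of Non-Patent Document 1.

```python
import math


def gaussian_kernel(rho=1.0, radius=2):
    """Normalized Gaussian weights; `rho` controls the pass band (assumed form)."""
    weights = [math.exp(-(i * i) / (2.0 * rho * rho))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def low_pass(signal, kernel):
    """Convolve with the kernel, clamping indices at the signal edges."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for x in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(x + k - radius, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out


def laplacian_component(signal, rho=1.0, radius=2):
    """L0(x) = G0(x) minus the Gaussian-smoothed G0(x)."""
    smooth = low_pass(signal, gaussian_kernel(rho, radius))
    return [g - s for g, s in zip(signal, smooth)]


# A step edge: the component is zero in flat regions and changes sign at the edge.
edge = [0.0] * 6 + [10.0] * 6
lap = laplacian_component(edge)
```

A larger `rho` widens the smoothing kernel, so more of the signal ends up in the extracted component.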

The first interpolation unit 1403 receives the Laplacian component signal output from the first high-pass filtering unit 1402 as input and interpolates it at an arbitrary magnification so that it has the desired resolution. Interpolation at an arbitrary magnification is performed as follows. With the input Laplacian component signal denoted L₀(x), the signal (EXPAND)_r L₀(x) interpolated at an arbitrary magnification r is given by

[Equation not reproduced in the source text.]

Here, int(·) denotes the operation of taking the integer part. The first interpolation unit 1403 outputs the interpolated signal to the amplitude limiting and constant multiplication unit 1404.

The amplitude limiting and constant multiplication unit 1404 receives the signal output from the first interpolation unit 1403 as input and carries out the first step of estimating the unknown high-frequency component. This first step is realized by applying amplitude limiting and constant multiplication to the input signal. With the input signal denoted (EXPAND)_r L₀(x), the generated signal L̄_r(x) is given by

[Equation not reproduced in the source text.]

Here, the parameter T for amplitude limiting and the parameter α_r for constant multiplication are determined experimentally in Non-Patent Document 1. The parameter α_r varies with the magnification. The amplitude limiting and constant multiplication unit 1404 outputs the processed signal to the second high-pass filtering unit 1405.
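The limiting equation itself is not reproduced in this text. A plausible reading, shown here purely as an assumption, is a symmetric clip of the expanded Laplacian component to the range [-T, T] followed by multiplication by α_r. The function and argument names are hypothetical.

```python
def limit_and_scale(expanded_laplacian, T, alpha):
    """Clip each sample to [-T, T], then multiply by the gain alpha.

    Assumed form: the source only states that amplitude limiting with
    parameter T and constant multiplication with parameter alpha_r are applied.
    """
    return [alpha * max(-T, min(T, v)) for v in expanded_laplacian]


shaped = limit_and_scale([-20, -3, 0, 4, 50], T=10, alpha=2)
# Samples beyond +/-10 are held at the limit before the gain is applied.
```

Clipping is a nonlinear operation, so it introduces frequency content that the interpolated signal did not contain; the subsequent high-pass filtering then keeps only the high-frequency part of that result.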

The second high-pass filtering unit 1405 receives the signal output from the amplitude limiting and constant multiplication unit 1404 as input and carries out the second step of estimating the unknown high-frequency component. This second step removes the low-frequency components from the signal output by the amplitude limiting and constant multiplication processing, leaving only the high-frequency component that is actually sought. It is realized by high-pass filtering the input signal. With the input signal denoted L̄_r(x), the high-pass filtered signal, that is, the estimated unknown high-frequency component L̂_r(x), is given by

[Equation not reproduced in the source text.]

Here, W(i) is as given in Equation (2). The second high-pass filtering unit 1405 outputs the estimated high-frequency component to the signal synthesis unit 1407.

The second interpolation unit 1406 receives the original signal to be enlarged as input and interpolates it at an arbitrary magnification so that it has the desired resolution. Interpolation at an arbitrary magnification is performed as follows. With the input signal denoted G₀(x), the signal (EXPAND)_r G₀(x) interpolated at an arbitrary magnification r is given by

[Equation not reproduced in the source text.]

Here, W_r(i) is as given in Equations (4) and (5). The second interpolation unit 1406 outputs the interpolated signal to the signal synthesis unit 1407.

The signal synthesis unit 1407 receives the signal output from the second high-pass filtering unit 1405 and the signal output from the second interpolation unit 1406 as inputs. It adds the two signals and outputs the sum outside the image enlargement unit 1401 with high-frequency component estimation.

FIG. 15 shows the procedure for enlarging an image signal using the configuration example of the image enlargement unit 1401 with high-frequency component estimation shown in FIG. 14.
First, the second interpolation unit 1406 interpolates the input signal to be enlarged to the desired resolution [step S1501].

Next, the first high-pass filtering unit 1402 extracts the Laplacian component signal from the input signal to be enlarged [step S1502]. The first interpolation unit 1403 interpolates the extracted Laplacian component signal to the desired resolution [step S1503]. The amplitude limiting and constant multiplication unit 1404 applies amplitude limiting and constant multiplication to the interpolated signal [step S1504]. The second high-pass filtering unit 1405 high-pass filters the resulting signal to obtain the estimated high-frequency component signal [step S1505].

Finally, the signal synthesis unit 1407 adds the interpolated input signal and the estimated high-frequency component signal to obtain the enlarged signal with high-frequency component estimation [step S1506].

Patent Document 1: Japanese Unexamined Patent Application Publication No. H7-162870
Non-Patent Document 1: Yasumasa Takahashi (高橋靖正) and Akira Taguchi (田口亮), "An image enlargement method with arbitrary magnification accompanied by high-frequency component estimation," IEICE Transactions (A) (in Japanese), vol. J84-A, no. 9, pp. 1192-1201, Sep. 2001.
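Steps S1501 to S1506 can be put together in a short one-dimensional sketch. It is an illustration under several assumptions rather than the method of Non-Patent Document 1: linear interpolation is swapped in for the EXPAND operation, a Gaussian-smoothed copy subtracted from the signal serves as the high-pass filter, and the limiter is a symmetric clip to [-T, T] followed by a gain α. All names and default values are hypothetical.

```python
import math


def expand(signal, r):
    """Resample to int(len(signal) * r) samples by linear interpolation."""
    out = []
    for x in range(int(len(signal) * r)):
        pos = x / r
        i = int(pos)                      # integer part of the source position
        j = min(i + 1, len(signal) - 1)   # right neighbor, clamped at the edge
        frac = pos - i
        out.append((1.0 - frac) * signal[i] + frac * signal[j])
    return out


def high_pass(signal, rho=1.0, radius=2):
    """Signal minus its Gaussian-smoothed copy (edges are clamped)."""
    weights = [math.exp(-(i * i) / (2.0 * rho * rho))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    n = len(signal)
    out = []
    for x in range(n):
        smooth = sum(w / total * signal[min(max(x + k - radius, 0), n - 1)]
                     for k, w in enumerate(weights))
        out.append(signal[x] - smooth)
    return out


def enlarge(signal, r, T, alpha, rho=1.0):
    interpolated = expand(signal, r)                              # S1501
    laplacian = high_pass(signal, rho)                            # S1502
    laplacian_up = expand(laplacian, r)                           # S1503
    shaped = [alpha * max(-T, min(T, v)) for v in laplacian_up]   # S1504
    estimated = high_pass(shaped, rho)                            # S1505
    return [g + h for g, h in zip(interpolated, estimated)]       # S1506


edge = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]
plain = expand(edge, 2)                        # interpolation only
sharp = enlarge(edge, 2, T=1.0, alpha=1.0)     # with estimated high frequencies
```

With `alpha` set to zero the estimated component vanishes and the output reduces to plain interpolation, which makes the contribution of T and α easy to isolate.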

Typical conventional techniques for spatial resolution scalability of video, of which Patent Document 1 is one example, interpolate the base layer local decode and use the result as the prediction signal in enhancement layer coding. This exploits the fact that the original video signal input to the enhancement layer and the base layer signal are correlated to some degree; in other words, the base layer signal carries part of the frequency components of the original video signal. Consequently, the higher the correlation between the base layer locally decoded signal and the original video signal input to the enhancement layer, the higher the coding efficiency. To achieve more efficient coding, it is therefore considered necessary to obtain the prediction signal not by simply interpolating the base layer local decode, but by performing an estimation process (high-resolution processing) that brings it closer to the original video signal.

However, applying Non-Patent Document 1 as it stands to the estimation process in hierarchical coding raises several problems. First, Non-Patent Document 1 was designed for enlarging natural images. The base layer locally decoded signal is a degraded signal and lacks its original high-frequency components. When quantization is coarse, its correlation with the original signal is also reduced. Therefore, simply applying Non-Patent Document 1, which is tuned for natural images, to the estimation process described above does not necessarily yield the expected gain in coding efficiency. Second, because Non-Patent Document 1 is an enlargement method, it must estimate the unknown high-frequency components from the low-resolution input signal alone, whereas a hierarchical encoder also has the original high-resolution signal available.

An object of the present invention is to realize more efficient hierarchical video coding by performing appropriate high-resolution processing of the prediction signal.

Therefore, in order to solve the above problems, the present invention provides the following apparatus, method, and program.
(1) A video signal hierarchical encoding apparatus that encodes a video signal of lower resolution than an input video signal, obtained by decomposing the input video signal into layers of different resolutions, generates a prediction signal from the lower-resolution video signal, and uses that prediction signal to encode the higher-resolution input video signal by inter-spatial-resolution prediction, thereby obtaining encoded data for video signals of different resolutions, the apparatus comprising:
spatial reduction means for spatially reducing the input video signal to obtain a first video signal having a lower resolution than the input video signal;
first encoding means for obtaining first encoded data by encoding the first video signal using an encoding process that includes a local decoding process;
spatial enlargement means for performing high-resolution processing that generates a second video signal, which is a high-resolution enlarged video signal, by adding (a) a second high-frequency component signal, obtained by enlarging a first high-frequency component signal extracted from a first local decoded signal obtained in the local decoding process to the resolution of the input video signal and then applying amplitude limiting and constant multiplication, and (b) a second local decoded signal obtained by enlarging the first local decoded signal to the resolution of the input video signal, wherein the spatial enlargement means compares the generated second video signal with the input video signal and repeats the generation and comparison of the second video signal while changing the parameters for the amplitude limiting and the constant multiplication until the comparison result satisfies a predetermined condition, thereby obtaining the second video signal that satisfies the predetermined condition and the parameters used to generate it;
second encoding means for obtaining second encoded data, which is encoded data of the higher-resolution video signal, by taking the second video signal finally obtained by the spatial enlargement means as the prediction signal generated from the first video signal, generating a prediction signal for inter-spatial-resolution prediction at the resolution of the input video signal by selecting among, or combining after applying a predetermined weight to each of, three prediction signals at the resolution of the input video signal, namely a spatial direction prediction signal resulting from spatial direction prediction, a temporal direction prediction signal resulting from temporal direction prediction, and the prediction signal generated from the first video signal, and subtracting the prediction signal for inter-spatial-resolution prediction from the input video signal and encoding the result;
third encoding means for obtaining third encoded data by encoding the parameters for the amplitude limiting and the constant multiplication finally obtained by the spatial enlargement means; and
multiplexing means for multiplexing the first to third encoded data.
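As an illustration only, the repeated generate-and-compare operation of the spatial enlargement means in (1) might look like the following sketch. The comparison measure (sum of squared errors), the stopping threshold `max_error`, the candidate list, and the `enhance` callable are all assumptions; the text above only requires that the parameters be changed until the comparison result satisfies a predetermined condition.

```python
def search_parameters(original, local_decode, enhance, candidates, max_error):
    """Try (T, alpha) pairs until the prediction is close enough to the original.

    `enhance(local_decode, T, alpha)` is a caller-supplied function that
    returns the high-resolution prediction for one parameter pair.
    Returns (prediction, (T, alpha), error) for the accepted pair, or for the
    best pair seen if no candidate meets `max_error`.
    """
    best = None
    for T, alpha in candidates:
        predicted = enhance(local_decode, T, alpha)
        error = sum((o - p) ** 2 for o, p in zip(original, predicted))
        if best is None or error < best[0]:
            best = (error, T, alpha, predicted)
        if error <= max_error:   # the "predetermined condition" (assumed form)
            break
    error, T, alpha, predicted = best
    return predicted, (T, alpha), error


# Toy stand-in for the enlargement process, used only to exercise the loop.
def toy_enhance(signal, T, alpha):
    return [alpha * min(T, s) for s in signal]


prediction, params, err = search_parameters(
    original=[2, 4, 6],
    local_decode=[1, 2, 3],
    enhance=toy_enhance,
    candidates=[(10, 1.0), (10, 2.0), (10, 3.0)],
    max_error=0.0,
)
```

The accepted `(T, alpha)` pair is what the third encoding means would encode, so that a decoder can regenerate the same prediction without access to the original signal.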
(2) A video signal hierarchical encoding method that encodes a video signal of lower resolution than an input video signal, obtained by decomposing the input video signal into layers of different resolutions, generates a prediction signal from the lower-resolution video signal, and uses that prediction signal to encode the higher-resolution input video signal by inter-spatial-resolution prediction, thereby obtaining encoded data for video signals of different resolutions, the method comprising:
a spatial reduction step of spatially reducing the input video signal to obtain a first video signal having a lower resolution than the input video signal;
a first encoding step of obtaining first encoded data by encoding the first video signal using an encoding process that includes a local decoding process;
a spatial enlargement step of performing high-resolution processing that generates a second video signal, which is a high-resolution enlarged video signal, by adding (a) a second high-frequency component signal, obtained by enlarging a first high-frequency component signal extracted from a first local decoded signal obtained in the local decoding process to the resolution of the input video signal and then applying amplitude limiting and constant multiplication, and (b) a second local decoded signal obtained by enlarging the first local decoded signal to the resolution of the input video signal, wherein the generated second video signal is compared with the input video signal and the generation and comparison of the second video signal are repeated while changing the parameters for the amplitude limiting and the constant multiplication until the comparison result satisfies a predetermined condition, thereby obtaining the second video signal that satisfies the predetermined condition and the parameters used to generate it;
a second encoding step of obtaining second encoded data, which is encoded data of the higher-resolution video signal, by taking the second video signal finally obtained in the spatial enlargement step as the prediction signal generated from the first video signal, generating a prediction signal for inter-spatial-resolution prediction at the resolution of the input video signal by selecting among, or combining after applying a predetermined weight to each of, three prediction signals at the resolution of the input video signal, namely a spatial direction prediction signal resulting from spatial direction prediction, a temporal direction prediction signal resulting from temporal direction prediction, and the prediction signal generated from the first video signal, and subtracting the prediction signal for inter-spatial-resolution prediction from the input video signal and encoding the result;
a third encoding step of obtaining third encoded data by encoding the parameters for the amplitude limiting and the constant multiplication finally obtained in the spatial enlargement step; and
a multiplexing step of multiplexing the first to third encoded data.
(3) A video signal hierarchical encoding program for causing a computer to execute an operation of encoding a video signal having a lower resolution than an input video signal, obtained by decomposing the input video signal into layers of different resolutions, generating a prediction signal from the lower-resolution video signal, and encoding the higher-resolution input video signal by spatial prediction using the generated prediction signal, thereby obtaining encoded data of video signals of different resolutions,
Spatial reduction means for spatially reducing the input video signal to obtain a first video signal having a resolution lower than that of the input video signal;
First encoding means for obtaining first encoded data obtained by encoding the first video signal using an encoding process including a local decoding process;
Spatial expansion means in which a first high-frequency component signal, extracted from a first local decoded signal obtained by the local decoding process, is enlarged to the resolution of the input video signal and then subjected to amplitude limitation and constant multiplication; the second high-frequency component signal obtained by this processing is added to a second local decoded signal, obtained by enlarging the first local decoded signal to the resolution of the input video signal, so as to perform high-resolution processing that generates a second video signal, which is a high-resolution enlarged video signal; the generated second video signal is compared with the input video signal; and generation and comparison of the second video signal are repeated while changing the parameters for the amplitude limitation and constant multiplication until the comparison result satisfies a predetermined condition, thereby obtaining the second video signal that satisfies the predetermined condition and the parameters used to generate that second video signal;
Second encoding means in which the second video signal finally obtained by the spatial expansion means is used as a prediction signal generated from the first video signal; a prediction signal for inter-spatial-resolution prediction at the resolution of the input video signal is generated by selecting one of, or by weighting and then combining, a spatial-direction prediction signal that is the result of spatial-direction prediction at the resolution of the input video signal, a temporal-direction prediction signal that is the result of temporal-direction prediction, and the prediction signal generated from the first video signal; and the signal obtained by subtracting the inter-spatial-resolution prediction signal from the input video signal is encoded to obtain second encoded data, which is encoded data of the higher-resolution video signal;
Third encoding means for obtaining third encoded data by encoding the parameters for the amplitude limitation and constant multiplication that are finally obtained by the spatial expansion means;
Multiplexing means for multiplexing each of the first to third encoded data;
A video signal hierarchical encoding program that causes the computer to function as each of the above means.

According to the present invention, accurate high-resolution processing of the prediction signal is newly introduced into the interpolation, replacing the simple interpolation (spatial expansion) used for inter-layer prediction in conventional hierarchical video coding. This makes the inter-layer prediction error smaller and enables efficient, higher-quality hierarchical coding of video signals.

Furthermore, because the encoding unit can be configured to refer to the input video signal (high-resolution signal) and to generate, from the low-resolution signal, a prediction signal closer to the input video signal (high-resolution signal), it becomes possible to realize efficient hierarchical video coding in which the high-resolution processing of the prediction signal is further strengthened.

The present invention rests on two new concepts. The first is to introduce, into conventional hierarchical coding, an estimation process that raises the prediction efficiency between layers. The second is to use the input video signal as teacher data and to bring the local decoded signal (base layer local decoded signal) closer to it, where the local decoded signal is obtained in the process of encoding a video signal of lower resolution than the input video signal, produced by decomposing the input video signal into layers of different resolutions. Examples of configurations, methods, and programs for realizing these concepts are described below. To simplify the description, the examples below use two-layer hierarchical encoding and decoding, but the invention may also be realized with more layers.
[Example 1]
FIG. 1 shows a configuration example of a hierarchical encoding and decoding apparatus that realizes spatial resolution scalability and to which Example 1 of the present invention is applied. The original video signal is input to the encoding unit 101, and the bit stream generated by the encoding unit 101 is transmitted to the decoding unit 103 via a communication line, medium, or the like 102. The decoding unit 103 extracts the necessary information from the supplied bit stream and outputs a decoded video signal with a spatial resolution suited to the performance of the display or other output device.

The encoding unit 101 comprises a spatial decimation unit (spatial reduction means) 104, a base layer encoding unit (first encoding means) 105, a high-resolution estimated signal generation unit (spatial expansion means, third encoding means) 106, an enhancement layer encoding unit (second encoding means) 107, and a multiplexing unit 108.

The spatial decimation unit 104 receives the original video signal as an input and has a function of spatially decimating the input signal to a desired spatial resolution (that is, lowering its resolution). Several spatial decimation methods are conceivable, but in order to exploit the same relationship as in a Laplacian pyramid, it is desirable to use a method that corresponds to the filter used in the high-resolution estimated signal generation unit 106 described later. It is also desirable that the method support arbitrary reduction ratios. The spatial decimation unit 104 further has a function of outputting the signal decimated to the desired spatial resolution to the base layer encoding unit 105.

The base layer encoding unit 105 receives the output of the spatial decimation unit 104 as an input and has a function of encoding the input signal to generate a bit stream and outputting it to the multiplexing unit 108. Several encoding methods are conceivable; for example, a closed-loop encoder such as MPEG-2 or H.264 is used. The encoder may include functions such as temporal scalability and SNR scalability. When an open-loop encoder is used, that encoder is assumed to include a local decoding (reconstruction) function. The base layer encoding unit 105 also has a function of outputting the locally decoded signal produced within it to the high-resolution estimated signal generation unit 106, which has a spatial interpolation (spatial expansion) function.

The high-resolution estimated signal generation unit 106 receives as inputs the local decoded signal output from the base layer encoding unit 105 and the original video signal output from the enhancement layer encoding unit 107, and has a function of estimating a video signal of the original resolution from the base layer local decoded signal. Details are described later. It also has a function of outputting the signal estimated from the base layer local decoded signal (the high-resolution estimated signal) to the enhancement layer encoding unit 107, and of encoding the parameters used for the estimation and outputting them to the multiplexing unit 108.

The enhancement layer encoding unit 107 has a function of receiving as inputs the original video signal and the signal output from the high-resolution estimated signal generation unit 106. Using these input signals, it performs prediction that exploits the correlation between spatial resolutions and the correlation in time, and encodes the resulting prediction error signal. Details are described later. It also has a function of outputting the bit stream generated by the encoding to the multiplexing unit 108 and outputting the original video signal to the high-resolution estimated signal generation unit 106.

The multiplexing unit 108 receives as inputs the bit streams output from the base layer encoding unit 105, the high-resolution estimated signal generation unit 106, and the enhancement layer encoding unit 107, multiplexes them into a single bit stream, and outputs that bit stream to the outside of the encoding unit 101, for example to the communication line, medium, or the like 102.

The decoding unit 103 comprises an extraction unit (separation means) 109, a base layer decoding unit (first decoding means) 110, a high-resolution estimated signal restoration unit (restoration means) 111, and an enhancement layer decoding unit (second decoding means) 112.

The extraction unit 109 has a function of receiving the bit stream as an input. In accordance with the performance of the decoding unit 103 or of the display or other output device, it cuts out from the whole bit stream the portions needed for decoding, divides them, and outputs them to the base layer decoding unit 110, the high-resolution estimated signal restoration unit 111, and the enhancement layer decoding unit 112, respectively.

The base layer decoding unit 110 has a function of receiving as an input the base layer bit stream cut out by the extraction unit 109. It decodes the input bit stream and outputs the decoded video signal to the high-resolution estimated signal restoration unit 111 and, if necessary, to a display or the like. For decoding, MPEG-2 or H.264 is used, for example. The decoder may also include functions such as temporal scalability and SNR scalability.

The high-resolution estimated signal restoration unit 111 has a function of receiving as inputs the base layer decoded signal output from the base layer decoding unit 110 and the bit stream output from the extraction unit 109. It decodes the bit stream to obtain the parameters for restoring the high-resolution estimated signal. Using the decoded parameters, it restores the high-resolution estimated signal from the base layer decoded signal and outputs that signal to the enhancement layer decoding unit 112. Details are described later.

The enhancement layer decoding unit 112 has a function of receiving as inputs the bit stream obtained from the extraction unit 109 and the high-resolution estimated signal output from the high-resolution estimated signal restoration unit 111. It decodes the bit stream and, using the signal so obtained together with the high-resolution estimated signal, decodes a signal at the spatial resolution of the original video signal. The decoded video signal is output to a display or the like.

FIG. 2 shows the procedure for spatially scalable encoding of a video signal using the configuration example of the encoding unit 101 shown in FIG. 1.
First, the spatial decimation unit 104 decimates the spatial resolution of the original video signal [step S201]. The decimated signal is encoded by the base layer encoding unit 105 to generate a bit stream [step S202]. The generated bit stream is sent to the multiplexing unit 108, and the base layer local decoded signal obtained in the encoding process is sent to the high-resolution estimated signal generation unit 106. The original video signal is then estimated using the high-resolution estimated signal generation unit 106 and the enhancement layer encoding unit 107 [step S203]. Details are described later. The high-resolution estimated signal generated here is sent to the enhancement layer encoding unit 107, and the parameters used in the estimation are encoded and sent to the multiplexing unit 108. Using the original video signal and the high-resolution estimated signal, the enhancement layer encoding unit 107 performs prediction that exploits the correlation between spatial resolutions and the correlation in time, and encodes the resulting prediction error signal [step S204]. The bit stream generated by this encoding is sent to the multiplexing unit 108. The multiplexing unit 108 multiplexes the bit streams obtained from the base layer encoding unit 105, the high-resolution estimated signal generation unit 106, and the enhancement layer encoding unit 107 into a single bit stream [step S205].
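The order of steps S201 to S205 can be summarized in a short Python sketch. This is only an illustration of the data flow, not the implementation of the units: the function name `encode_two_layers` and all five callables passed to it are hypothetical placeholders for units 104 to 108, and their return shapes are assumptions made for the example.

```python
def encode_two_layers(original, decimate, encode_base, estimate_high_res,
                      encode_enhancement, multiplex):
    """Run the two-layer encoding flow with caller-supplied stand-ins.

    Assumed (hypothetical) interfaces:
      decimate(original) -> low-resolution signal
      encode_base(low_res) -> (base bit stream, local decoded signal)
      estimate_high_res(local_decoded, original) -> (estimate, parameter bits)
      encode_enhancement(original, estimate) -> enhancement bit stream
      multiplex(base, params, enhancement) -> single bit stream
    """
    low_res = decimate(original)                          # step S201
    base_bits, local_decoded = encode_base(low_res)       # step S202
    estimate, param_bits = estimate_high_res(local_decoded, original)  # step S203
    enhancement_bits = encode_enhancement(original, estimate)          # step S204
    return multiplex(base_bits, param_bits, enhancement_bits)          # step S205
```

Passing the units in as callables keeps the sketch independent of any particular codec; a real base layer encoder such as H.264 would replace `encode_base`.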

FIG. 3 shows the procedure for decoding a spatially scalable bit stream to obtain a decoded video signal using the configuration example of the decoding unit 103 shown in FIG. 1.
The extraction unit 109 receives the bit stream from the communication line, medium, or the like 102. It analyzes the bit stream and extracts the necessary coded data in accordance with the performance of the decoding unit 103 and of the display or other output device. It then divides the data into the portions corresponding to the base layer decoding unit 110, the high-resolution estimated signal restoration unit 111, and the enhancement layer decoding unit 112, and outputs them [step S301].

The data corresponding to the base layer, separated by the extraction unit 109, is decoded by the base layer decoding unit 110 [step S302]. The decoded base layer video signal is output to the high-resolution estimated signal restoration unit 111 and, if necessary, also to a display or the like. The parameters for restoring the high-resolution estimated signal, separated by the extraction unit 109, are decoded by the high-resolution estimated signal restoration unit 111, which restores the high-resolution estimated signal from the decoded parameters and the base layer decoded video signal obtained from the base layer decoding unit 110 [step S303]. The restored high-resolution estimated signal is sent to the enhancement layer decoding unit 112. The enhancement layer decoding unit 112 decodes the data corresponding to the enhancement layer obtained from the extraction unit 109 and, using the signal so obtained together with the high-resolution estimated signal, decodes reproduced video at the resolution of the original video signal [step S304]. The decoded video signal is then output to a display or the like.
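The decoding flow of steps S301 to S304 can be sketched in the same way. Again this is a hedged illustration: `decode_layers`, the four callables, and the `want_high_resolution` flag are hypothetical names introduced here, and the flag stands in for the extraction unit's decision to pass on only what the decoder or display needs.

```python
def decode_layers(bitstream, extract, decode_base, restore_estimate,
                  decode_enhancement, want_high_resolution=True):
    """Run the two-layer decoding flow with caller-supplied stand-ins.

    Assumed (hypothetical) interfaces:
      extract(bitstream) -> (base bits, parameter bits, enhancement bits)
      decode_base(base_bits) -> base layer decoded video
      restore_estimate(base_video, param_bits) -> high-resolution estimate
      decode_enhancement(enhancement_bits, estimate) -> full-resolution video
    """
    base_bits, param_bits, enhancement_bits = extract(bitstream)  # step S301
    base_video = decode_base(base_bits)                           # step S302
    if not want_high_resolution:
        # A low-resolution display only needs the base layer output.
        return base_video
    estimate = restore_estimate(base_video, param_bits)           # step S303
    return decode_enhancement(enhancement_bits, estimate)         # step S304
```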

FIG. 4 shows a detailed configuration example of the high-resolution estimated signal generation unit 106 and the enhancement layer encoding unit 107.
The high-resolution estimated signal generation unit 106 comprises a first high-pass filtering unit 403, a first interpolation unit 404, an amplitude limiting and constant multiplication processing unit 405, a second high-pass filtering unit 406, a second interpolation unit 407, a signal synthesis unit 408, an estimation degree determination unit 409, and an entropy encoding unit 410.

The first high-pass filtering unit 403 receives the base layer (local) decoded signal as an input and has a function of extracting the high-frequency component from it. The high-frequency component is obtained by equations (1) and (2) given above. Equations (1) and (2) extract the high-frequency component using a Gaussian function, but another method may be substituted. However, it is desirable that the filters, interpolation functions, and the like used here, together with those used in the spatial decimation unit 104, the first interpolation unit 404, the second high-pass filtering unit 406, and the second interpolation unit 407, satisfy a pyramid structure. For example, when a sinc function is used in the spatial decimation unit 104, using a sinc function in the first interpolation unit 404, the second high-pass filtering unit 406, and the second interpolation unit 407 as well establishes a pyramid structure based on the sinc function. The first high-pass filtering unit 403 also has a function of outputting the extracted high-frequency component to the first interpolation unit 404.

The first interpolation unit 404 receives as an input the high-frequency component signal output from the first high-pass filtering unit 403 and has a function of interpolating that signal to the resolution of the original video signal input to the enhancement layer. The interpolation can be realized by equations (3), (4), and (5) given above. Here too, an interpolation method (filter coefficients, interpolation function, and so on) other than equations (3), (4), and (5) may be used. The first interpolation unit 404 also has a function of outputting the interpolated signal to the amplitude limiting and constant multiplication processing unit 405.

The amplitude limiting and constant multiplication processing unit 405 receives as inputs the parameters and the signal output from the first interpolation unit 404, and has a function of carrying out the first step of estimating the unknown high-frequency component. This first step is given by equation (6). The parameters α_r and T may be the same as those in Non-Patent Document 1. In this example, however, the estimation accuracy depends not only on the enlargement ratio but also on the degree of quantization of the base layer, so the parameters can be supplied from outside the amplitude limiting and constant multiplication processing unit 405 to allow trials for finding the optimum values. The amplitude limiting and constant multiplication processing unit 405 also has a function of outputting the processed signal to the second high-pass filtering unit 406.

The second high-pass filtering unit 406 receives as an input the signal output from the amplitude limiting and constant multiplication processing unit 405 and has a function of carrying out the second step of estimating the unknown high-frequency component. This second step is given by equation (7). Here too, a method of extracting the high-frequency component other than equation (7) may be used. The second high-pass filtering unit 406 also has a function of outputting the estimated high-frequency component to the signal synthesis unit 408.

The second interpolation unit 407 receives the base layer (local) decoded signal as an input and has a function of interpolating that signal to the resolution of the original video signal input to the enhancement layer. The interpolation can be realized by equation (8) given above. Here too, an interpolation method (filter coefficients, interpolation function, and so on) other than equation (8) may be used. The second interpolation unit 407 also has a function of outputting the interpolated signal to the signal synthesis unit 408.

The signal synthesis unit 408 has a function of receiving as inputs the signal output from the second high-pass filtering unit 406 and the signal output from the second interpolation unit 407. It also has a function of adding the two input signals together and outputting the sum.
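The chain formed by units 403 to 408 can be illustrated with a minimal one-dimensional Python sketch. Equations (1) to (8) are not reproduced in this part of the description, so every filter here is an assumed stand-in: a 3-tap [1, 2, 1]/4 kernel replaces the Gaussian, linear interpolation by a fixed factor of 2 replaces the interpolation equations, and the first estimation step is assumed to clip to ±T and then multiply by α_r. All function names are hypothetical.

```python
def smooth(x):
    """3-tap [1, 2, 1]/4 low-pass with edge replication (stand-in for the Gaussian)."""
    n = len(x)
    return [(x[max(i - 1, 0)] + 2 * x[i] + x[min(i + 1, n - 1)]) / 4
            for i in range(n)]


def high_pass(x):
    """High-frequency component: the signal minus its low-passed version."""
    return [a - b for a, b in zip(x, smooth(x))]


def interpolate2(x):
    """Double the sample count by linear interpolation (last sample repeated)."""
    out = []
    for i, v in enumerate(x):
        nxt = x[min(i + 1, len(x) - 1)]
        out += [v, (v + nxt) / 2]
    return out


def limit_and_scale(x, alpha_r, threshold):
    """Assumed form of the first estimation step: clip to +/-threshold, then scale."""
    return [alpha_r * max(-threshold, min(threshold, v)) for v in x]


def estimate_high_resolution(base, alpha_r, threshold):
    """Build a high-resolution estimate from a base layer local decoded signal."""
    detail = interpolate2(high_pass(base))                # units 403 and 404
    detail = limit_and_scale(detail, alpha_r, threshold)  # unit 405
    detail = high_pass(detail)                            # unit 406
    enlarged = interpolate2(base)                         # unit 407
    return [e + d for e, d in zip(enlarged, detail)]      # unit 408
```

With `alpha_r` set to 0 the estimated detail vanishes and the result reduces to plain interpolation of the base layer, which corresponds to the conventional inter-layer prediction that this processing replaces.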

The estimation degree determination unit 409 has a function of receiving as inputs the signal output from the signal synthesis unit 408 and the signal output from the frame memory 1 (411). The signal output from the signal synthesis unit 408 is the high-resolution estimated signal obtained when a given set of parameters is used in the amplitude limiting and constant multiplication processing unit 405. The estimation degree determination unit 409 quantifies how strongly this signal correlates with the original video signal output from the frame memory 1 (411) and records the result. The correlation between the two signals may be quantified, for example, by calculating the cross-correlation or by taking the mean square of their difference. Because the purpose of the estimation degree determination unit 409 is to find the parameters α_r and T (or α_r alone) that bring the two signals closest together, it also has a function of successively updating the parameters within an arbitrary range and outputting them to the amplitude limiting and constant multiplication processing unit 405. From the recorded correlation values between the original video signal and the high-resolution estimated signals generated with the successively updated parameters, it determines the case in which the two signals are closest, outputs the parameters for that case to the entropy encoding unit 410, and outputs the corresponding high-resolution estimated signal to the prediction signal selection unit 416.
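The parameter search performed by the estimation degree determination unit 409 can be sketched as an exhaustive trial over candidate parameter pairs. This is a sketch under stated assumptions: the closeness measure is the mean square of the difference (one of the two options named above), the candidate list is supplied by the caller, and `estimate` is a hypothetical callable standing in for the chain of units 403 to 408.

```python
def mean_squared_error(a, b):
    """Mean square of the sample-wise difference between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def search_parameters(original, estimate, candidates):
    """Try each (alpha_r, threshold) pair and keep the closest estimate.

    estimate(alpha_r, threshold) must return a signal as long as `original`.
    Returns ((alpha_r, threshold), estimated signal, error) for the best pair.
    """
    best = None
    for alpha_r, threshold in candidates:
        signal = estimate(alpha_r, threshold)
        error = mean_squared_error(original, signal)  # recorded closeness value
        if best is None or error < best[2]:
            best = ((alpha_r, threshold), signal, error)
    return best
```

The winning parameter pair is what would be handed to the entropy encoding unit 410, and the winning signal is what would go to the prediction signal selection unit 416.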

The entropy encoding unit 410 has a function of receiving as an input the parameters output from the estimation degree determination unit 409. It entropy-encodes the input parameters to generate a bit stream and outputs that bit stream to the outside of the high-resolution estimated signal generation unit 106.

The enhancement layer encoding unit 107 comprises a frame memory 1 (411), a frame memory 2 (412), a motion estimation unit 413, a motion compensation unit 414, an intra prediction unit 415, a prediction signal selection unit 416, a prediction error signal generation unit 417, an orthogonal transform and quantization unit 418, an entropy encoding unit 419, an inverse quantization and inverse orthogonal transform unit 420, a signal synthesis unit 421, and a deblocking filter unit 422. This configuration example is a partially modified H.264 encoder, and most of its parts can be realized with conventional technology.

The frame memory 1 (411) receives the original video signal as an input and has a function of storing at least one GOP (Group Of Pictures) of the signal. It also has a function of outputting the stored signal of the corresponding frame to the prediction error signal generation unit 417, the motion estimation unit 413, and the estimation degree determination unit 409, in such a way that the processing of the enhancement layer encoding unit 107 and that of the high-resolution estimated signal generation unit 106 stay synchronized.

The frame memory 2 (412) receives as an input the signal output from the deblocking filter unit 422 and has a function of storing at least one frame of it. It also has a function of outputting the frame signal needed for motion estimation to the motion estimation unit 413 and the frame signal needed for motion compensation to the motion compensation unit 414.

The motion estimation unit 413 receives as inputs the signals output from the frame memory 1 (411) and the frame memory 2 (412) and has a function of performing motion estimation, for example as in H.264. It also has a function of outputting the motion information obtained by the motion estimation to the motion compensation unit 414 and the entropy encoding unit 419.

The motion compensation unit 414 receives the signal output from frame memory 2 (412) and the motion information as input and performs motion compensation, for example as in H.264. It outputs the signal obtained by motion compensation to the prediction signal selection unit 416.

The intra prediction unit 415 receives the signal output from the signal synthesis unit 421 as input and performs intra prediction, for example as in H.264. It outputs the signal obtained by intra prediction to the prediction signal selection unit 416.

The prediction signal selection unit 416 receives the signals output from the motion compensation unit 414 and the intra prediction unit 415, together with the high-resolution estimated signal, and either selects one of these inputs or combines them with weights. The criterion for selecting or combining is arbitrary. For example, when coding efficiency is the priority, the signals are selected or combined so that the mean square of the prediction error signal becomes small. The prediction signal selection unit 416 outputs the selected or combined signal to the prediction error signal generation unit 417 and the signal synthesis unit 421.
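The selection rule described for the prediction signal selection unit 416 can be sketched as follows. This is only an illustration under stated assumptions: blocks are flat lists of samples, the function names and the candidate labels (`intra`, `inter`, `upsampled`) are hypothetical, and the mean-square criterion is the example criterion mentioned in the text, not a required one.

```python
def mean_square_error(original, prediction):
    """Mean square of the prediction error for one block."""
    return sum((o - p) ** 2 for o, p in zip(original, prediction)) / len(original)


def select_prediction(original, candidates):
    """Pick the candidate whose prediction error has the smallest mean square.

    candidates maps a label (hypothetical names) to a list of predicted samples.
    """
    best = min(candidates, key=lambda name: mean_square_error(original, candidates[name]))
    return best, candidates[best]


def blend_predictions(candidates, weights):
    """Alternative to selection: weighted combination of all candidates."""
    names = list(candidates)
    length = len(candidates[names[0]])
    return [sum(weights[n] * candidates[n][i] for n in names) for i in range(length)]


original = [10, 12, 14, 16]
candidates = {
    "intra": [10, 10, 10, 10],      # intra prediction (unit 415)
    "inter": [10, 12, 14, 15],      # motion-compensated prediction (unit 414)
    "upsampled": [9, 11, 13, 15],   # high-resolution estimated signal (unit 106)
}
chosen, signal = select_prediction(original, candidates)
```

Whichever mode (or set of weights) is chosen has to be recoverable by the decoder, which is why the decoding procedure later reads back which prediction was selected or combined.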

The prediction error signal generation unit 417 receives the signal output from frame memory 1 (411) and the prediction signal output from the prediction signal selection unit 416 as input. It generates a prediction error signal by subtracting the prediction signal from the signal output from frame memory 1 (411) and outputs it to the orthogonal transform / quantization unit 418.

The orthogonal transform / quantization unit 418 receives the signal output from the prediction error signal generation unit 417 as input and applies an orthogonal transform and quantization to it. A DCT, a wavelet transform, or the like is used for the orthogonal transform. A scheme that merges the orthogonal transform and quantization, as in H.264, may also be adopted. The unit outputs the transformed and quantized signal to the entropy encoding unit 419 and the inverse quantization / inverse orthogonal transform unit 420.

The entropy encoding unit 419 receives the signal output from the orthogonal transform / quantization unit 418 and the motion information output from the motion estimation unit 413 as input and entropy-encodes them. It outputs the bitstream produced by the entropy encoding to the outside of the enhancement layer encoding unit 107.

The inverse quantization / inverse orthogonal transform unit 420 receives the orthogonally transformed and quantized signal as input and applies inverse quantization and an inverse orthogonal transform to it. It outputs the resulting signal to the signal synthesis unit 421.

The signal synthesis unit 421 receives the signal output from the prediction signal selection unit 416 and the signal output from the inverse quantization / inverse orthogonal transform unit 420 as input and combines the two. It outputs the combined signal to the intra prediction unit 415 and the deblocking filter unit 422.

The deblocking filter unit 422 receives the signal output from the signal synthesis unit 421 as input and applies deblocking filtering to it. The deblocking filter may be, for example, the one used in H.264. The unit outputs the deblocking-filtered signal to frame memory 2 (412).

FIG. 5 shows the procedure for generating the high-resolution estimated signal using the configuration example of the high-resolution estimated signal generation unit 106 shown in FIG. 4.
First, the input signal is interpolated by the second interpolation unit 407 [step S501].

Next, the first high-pass filtering unit 403 extracts a high-frequency component signal from the input [step S502]. The extracted high-frequency component signal is interpolated by the first interpolation unit 404 [step S503]. The amplitude limiting / constant multiplication unit 405 applies amplitude limiting and constant multiplication to the interpolated signal [step S504]. The parameters for the amplitude limiting and constant multiplication are those supplied by the estimation degree determination unit 409. The second high-pass filtering unit 406 extracts the estimated high-frequency component from the signal that has undergone amplitude limiting and constant multiplication [step S505]. The signal synthesis unit 408 adds the interpolated input signal and the estimated high-frequency component to obtain the high-resolution estimated signal [step S506]. The estimation degree determination unit 409 takes the difference between the original video signal and the high-resolution estimated signal and records it, together with the parameters used at that time. It then updates the parameters [step S507]. To find by trial the high-resolution estimated signal that comes closest to the original video signal, steps S504 through S507 are repeated for the parameters within a specified range [step S508]. Among the differences between the original video signal and the high-resolution estimated signals generated with all parameters in the specified range, the one with the smallest mean square is selected. The corresponding high-resolution estimated signal is sent to the prediction signal selection unit 416 in the enhancement layer encoding unit 107, and the corresponding parameters are entropy-encoded by the entropy encoding unit 410 [step S509].

The parameters may be encoded for each block. Alternatively, for example, the average of the parameters over one GOP may be adopted, the high-resolution estimated signal generated with those parameters held constant within the GOP, and only one set of parameters encoded per GOP. No restriction is placed on the number of parameters encoded, their timing, and so on.
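The trial-based parameter search of steps S501 to S509 can be sketched in one dimension as below. Everything concrete here is an assumption made for illustration: the equations the patent uses for interpolation and high-pass filtering are not reproduced in this section, so a linear 2x interpolator and a 3-tap high-pass filter stand in for them, and the function names and the two parameters (`limit`, `gain`) are hypothetical.

```python
def interpolate2x(x):
    """Linear 2x interpolation (assumed stand-in for the patent's interpolator)."""
    out = []
    for i, v in enumerate(x):
        nxt = x[i + 1] if i + 1 < len(x) else v
        out += [v, (v + nxt) / 2]
    return out


def highpass(x):
    """3-tap high-pass [-1, 2, -1] / 4 with edge replication (assumed filter)."""
    n = len(x)
    return [(2 * x[i] - x[max(i - 1, 0)] - x[min(i + 1, n - 1)]) / 4 for i in range(n)]


def limit_and_scale(x, limit, gain):
    """Clip each sample to [-limit, limit], then multiply by a constant."""
    return [gain * max(-limit, min(limit, v)) for v in x]


def estimate_high_res(base, limit, gain):
    upsampled = interpolate2x(base)                           # S501
    detail = interpolate2x(highpass(base))                    # S502, S503
    detail = highpass(limit_and_scale(detail, limit, gain))   # S504, S505
    return [u + d for u, d in zip(upsampled, detail)]         # S506


def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def search_parameters(base, original, limits, gains):
    """Try every parameter pair and keep the closest estimate (S507 to S509)."""
    best = None
    for limit in limits:
        for gain in gains:
            candidate = estimate_high_res(base, limit, gain)
            err = mse(original, candidate)
            if best is None or err < best[0]:
                best = (err, limit, gain, candidate)
    return best
```

Only `limit` and `gain` of the winning trial would need to be transmitted, since the decoder can rerun `estimate_high_res` on its own decoded base layer signal.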

To reduce computational cost, the iterative search for the optimum parameters may be omitted by using parameters specified in advance. It is also possible to fix the parameters on both the encoding side and the decoding side beforehand and not encode them at all.

FIG. 6 shows the procedure for encoding the signal at the resolution of the original video signal (the enhancement layer) using the configuration example of the enhancement layer encoding unit 107 shown in FIG. 4.
Intra prediction is performed by the intra prediction unit 415 [step S601]. The intra-predicted signal is sent to the prediction signal selection unit 416.

Meanwhile, motion estimation and motion compensation (motion-compensated prediction) are performed by the motion estimation unit 413 and the motion compensation unit 414 [step S602]. The motion-compensated prediction signal is sent to the prediction signal selection unit 416.
In addition, the high-resolution estimated signal is generated by the high-resolution estimated signal generation unit 106 [step S603]. The details are as described above. The generated high-resolution estimated signal is sent to the prediction signal selection unit 416.

The prediction signal selection unit 416 selects one of the intra-predicted signal, the motion-compensated prediction signal, and the high-resolution estimated signal, or combines them with weights [step S604]. The prediction signal produced by this selection or combination is subtracted from the signal output from frame memory 1 (411) to generate the prediction error signal [step S605]. The orthogonal transform / quantization unit 418 applies the orthogonal transform and quantization to the prediction error signal [step S606]. The entropy encoding unit 419 entropy-encodes the transformed and quantized signal together with the motion information [step S607].

If all signals to be encoded have been encoded, processing ends here. Otherwise, local decoding and deblocking are performed by the following procedure so that the signal currently being encoded can be referenced when other signals are encoded [step S608].

The signal that was orthogonally transformed and quantized in step S606 is inversely quantized and inversely transformed by the inverse quantization / inverse orthogonal transform unit 420 [step S609]. The signal synthesis unit 421 combines the resulting signal with the prediction signal to obtain the local decoded signal [step S610]. The local decoded signal is sent to the intra prediction unit 415 and the deblocking filter unit 422. The deblocking filter unit 422 then applies deblocking filtering to the local decoded signal [step S611]. The deblocking-filtered signal is stored in frame memory 2 (412) [step S612].
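The reason for the local decoding loop is that the encoder must predict from the same reconstructed samples the decoder will have, not from the original. A minimal sketch of steps S605, S606, S609, and S610 follows. It is heavily simplified and its details are assumptions: the orthogonal transform, entropy coding, and deblocking are omitted, quantization is plain uniform rounding with a hypothetical `step` size, and the function names are illustrative.

```python
def quantize(residual, step):
    """Uniform quantization of the prediction error (transform omitted)."""
    return [round(r / step) for r in residual]


def dequantize(levels, step):
    """Inverse quantization: map levels back to approximate residual values."""
    return [level * step for level in levels]


def reconstruct(levels, prediction, step):
    """What the decoder computes; the encoder runs the same code locally."""
    residual = dequantize(levels, step)                      # S609
    return [p + r for p, r in zip(prediction, residual)]     # S610


def encode_block(original, prediction, step):
    residual = [o - p for o, p in zip(original, prediction)]  # S605
    levels = quantize(residual, step)                          # S606
    local_decoded = reconstruct(levels, prediction, step)      # S609, S610
    return levels, local_decoded


original = [101, 104, 98, 90]
prediction = [98, 100, 99, 95]
levels, local_decoded = encode_block(original, prediction, step=4)
```

Because `local_decoded` differs from `original` whenever quantization discards information, storing `local_decoded` (after deblocking) in the reference memory keeps encoder and decoder predictions identical.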

FIG. 7 shows a detailed configuration example of the high-resolution estimated signal restoration unit 111 and the enhancement layer decoding unit 112.
The high-resolution estimated signal restoration unit 701 (corresponding to 111) comprises the first high-pass filtering unit 403, the first interpolation unit 404, the amplitude limiting / constant multiplication unit 405, the second high-pass filtering unit 406, the second interpolation unit 407, the signal synthesis unit 408, and an entropy decoding unit 709. Because the functions of every part other than the entropy decoding unit 709 can be realized by the same parts as in FIG. 4, they are denoted by the same reference numbers.

The entropy decoding unit 709 receives, from the bitstream output by the extraction unit 109, the portion corresponding to the parameters and decodes it. It outputs the decoded parameters to the amplitude limiting / constant multiplication unit 405.

The enhancement layer decoding unit 702 comprises an entropy decoding unit 710, frame memory 2 (412), the motion compensation unit 414, the intra prediction unit 415, the prediction signal selection unit 416, the inverse quantization / inverse orthogonal transform unit 420, the signal synthesis unit 421, and the deblocking filter unit 422. Because the functions of every part other than the entropy decoding unit 710 can be realized by the same parts as in FIG. 4, they are denoted by the same reference numbers.

The entropy decoding unit 710 receives, from the bitstream output by the extraction unit 109, the portion corresponding to the enhancement layer and decodes it. It outputs the decoded signal to the inverse quantization / inverse orthogonal transform unit 420 and the decoded motion information to the motion compensation unit 414.

FIG. 8 shows the procedure for decoding the signal at the resolution of the original video signal (the enhancement layer) using the configuration example of the enhancement layer decoding unit 702 shown in FIG. 7.
The bitstream corresponding to the enhancement layer, obtained from the extraction unit 109, is decoded by the entropy decoding unit 710 [step S801]. The decoded signal is inversely quantized and inversely transformed by the inverse quantization / inverse orthogonal transform unit 420 to restore the prediction error signal [step S802]. For the block of interest, the decoder determines whether intra prediction, motion-compensated prediction, or prediction from the high-resolution estimated signal was selected, or whether they were combined, and performs the corresponding processing [step S803]. If intra prediction was selected, intra prediction is performed by the intra prediction unit 415 [step S804]. If motion-compensated prediction was selected, motion compensation is performed by the motion compensation unit 414 [step S805]. If prediction from the high-resolution estimated signal was selected, the high-resolution estimated signal is restored by the high-resolution estimated signal restoration unit 701 [step S806]. The detailed procedure is described later. If the signals were combined, steps S804, S805, and S806 are all executed and their results are combined with weights.

The signal obtained from one of steps S804, S805, and S806, or from their combination, is combined with the prediction error signal by the signal synthesis unit 421 [step S807]. The combined signal is deblocking-filtered by the deblocking filter unit 422 [step S808]. The deblocking-filtered signal is output to a display or the like as the decoded video signal. If bitstream data remains to be decoded, the decoded video signal is stored in frame memory 2 (412) as a reference frame [step S810]. The processing from step S801 to step S810 is then repeated [step S809].
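The per-block branching of steps S803 to S807 amounts to a dispatch on the decoded prediction mode. The sketch below is illustrative only: how the mode and weights are signalled in the bitstream is not specified in this section, so the mode labels, the `weights` mapping, and the function names are all hypothetical, and deblocking (S808) is left out.

```python
def predict(mode, predictors, weights=None):
    """Return the prediction signal for one block.

    predictors maps a mode label to a zero-argument callable, so only the
    predictors that the decoded mode requires are actually run.
    """
    if mode == "blend":
        # Combined case: run S804, S805 and S806, then mix with weights.
        signals = {name: fn() for name, fn in predictors.items()}
        length = len(next(iter(signals.values())))
        return [sum(weights[name] * signals[name][i] for name in signals)
                for i in range(length)]
    return predictors[mode]()  # exactly one of S804, S805, S806


def decode_block(mode, residual, predictors, weights=None):
    prediction = predict(mode, predictors, weights)         # S803 dispatch
    return [p + r for p, r in zip(prediction, residual)]    # S807


predictors = {
    "intra": lambda: [10, 10],       # stands in for unit 415
    "inter": lambda: [12, 14],       # stands in for unit 414
    "upsampled": lambda: [11, 13],   # stands in for restoration unit 701
}
block = decode_block("inter", [1, -1], predictors)
```

Passing callables rather than precomputed signals mirrors the text: the comparatively expensive restoration of the high-resolution estimated signal only runs for blocks that actually use it.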

FIG. 9 shows the procedure for restoring the high-resolution estimated signal using the configuration example of the high-resolution estimated signal restoration unit 701 shown in FIG. 7.
The bitstream corresponding to the parameters, obtained from the extraction unit 109, is decoded by the entropy decoding unit 709 and sent to the amplitude limiting / constant multiplication unit 405 [step S901].

The base layer decoded signal is interpolated to the enhancement layer resolution by the second interpolation unit 407 [step S902].
The first high-pass filtering unit 403 extracts a high-frequency component signal from the base layer decoded signal [step S903]. The extracted high-frequency component signal is interpolated to the enhancement layer resolution by the first interpolation unit 404 [step S904]. The amplitude limiting / constant multiplication unit 405 applies amplitude limiting and constant multiplication to the interpolated signal [step S905]. The second high-pass filtering unit 406 applies high-pass filtering to the signal that has undergone amplitude limiting and constant multiplication, yielding the estimated high-frequency component signal [step S906].

The signal synthesis unit 408 adds the interpolated input signal and the estimated high-frequency component signal to obtain the high-resolution estimated signal [step S907].
FIG. 10 is a block diagram of an example of an information processing apparatus 1001 that has encoding and decoding functions to which an embodiment of the present invention is applied. The information processing apparatus 1001 comprises an external storage device 1002, a temporary storage device 1003, a communication device 1004, an input device 1005, a central processing control device 1006, and an output device 1007. The central processing control device 1006, which is a computer, realizes the functions of the encoding and decoding apparatus of Embodiment 1 described above by means of a program. The program may be read from a recording medium and loaded into the central processing control device 1006, or it may be received by the communication device 1004 over a network and loaded into the central processing control device 1006.

By means of the above program, the central processing control device 1006 realizes each of the means shown inside the central processing control device in FIG. 10 through hardware or software processing.
[Embodiment 2]
A hierarchical encoding / decoding apparatus that realizes spatial resolution scalability and to which Embodiment 2 of the present invention is applied is described below. The apparatus of Embodiment 2 partially modifies the high-resolution estimated signal generation unit 106 (FIG. 4) and the high-resolution estimated signal restoration unit 701 (FIG. 7) of Embodiment 1. By changing the order of interpolation and high-frequency component extraction used in Embodiment 1, it obtains the same effect as Embodiment 1 while also somewhat reducing the processing load and the use of resources such as memory.

In Embodiment 1, high-frequency components are first extracted from the base layer (local) decoded signal, and interpolation is then applied separately to the extracted high-frequency components and to the base layer (local) decoded signal. In Embodiment 2, by contrast, the base layer (local) decoded signal is interpolated first, and the high-frequency components are then extracted from the interpolated signal, which somewhat reduces the processing load and the use of resources such as memory. Because interpolation and high-frequency component extraction are each linear, the result is the same even if their order is exchanged. However, in Embodiment 2 the high-frequency components are extracted after interpolation, that is, the filter is applied to a signal whose sampling frequency has changed, so it is desirable to use a filter adapted to that change. The details of Embodiment 2 are given below.
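The order exchange can be checked numerically under one specific set of assumptions: interpolation is modelled as zero insertion followed by a linear filter, both filters are finite convolutions, and the high-pass kernel used after interpolation is stretched by the same factor (the standard multirate "noble identity"). The kernels and function names below are illustrative choices, not the filters defined in the patent, and integer taps are used so that the comparison is exact.

```python
FACTOR = 2
INTERP_KERNEL = [1, 2, 1]       # linear interpolation for factor 2, scaled by 2 (assumed)
HIGHPASS_KERNEL = [-1, 2, -1]   # simple unnormalised high-pass (assumed)


def convolve(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def zero_stuff(x, factor):
    """Insert factor - 1 zeros between neighbouring samples."""
    out = [0] * (factor * (len(x) - 1) + 1)
    out[::factor] = x
    return out


def interpolate(x):
    """Zero insertion followed by the interpolation filter."""
    return convolve(INTERP_KERNEL, zero_stuff(x, FACTOR))


def embodiment1_order(x):
    """High-pass at base resolution, then interpolate."""
    return interpolate(convolve(HIGHPASS_KERNEL, x))


def embodiment2_order(x):
    """Interpolate first, then high-pass with the kernel stretched by FACTOR."""
    stretched = zero_stuff(HIGHPASS_KERNEL, FACTOR)
    return convolve(stretched, interpolate(x))


signal = [3, 1, 4, 1, 5, 9, 2, 6]
same = embodiment1_order(signal) == embodiment2_order(signal)
```

Stretching the kernel by the interpolation factor compresses its frequency response by the same factor, which is one way to read the advice that the filter should be adapted to the new sampling frequency. Applying the unstretched `HIGHPASS_KERNEL` after interpolation does not reproduce the Embodiment 1 result.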

FIG. 16 shows the high-resolution estimated signal generation unit 1601 of Embodiment 2. The high-resolution estimated signal generation unit 1601 comprises a first interpolation unit 1602, a first high-pass filtering unit 1603, the amplitude limiting / constant multiplication unit 405, the second high-pass filtering unit 406, the signal synthesis unit 408, the estimation degree determination unit 409, and the entropy encoding unit 410. Because the functions of every part other than the first interpolation unit 1602 and the first high-pass filtering unit 1603 can be realized by the same parts as in FIG. 4, they are denoted by the same reference numbers.

The first interpolation unit 1602 receives the base layer (local) decoded signal as input and interpolates it to the resolution of the original video signal that is input to the enhancement layer. The interpolation can be realized by equation (8) given earlier. Here too, an interpolation method (filter coefficients, interpolation function, and so on) other than equation (8) may be used. The first interpolation unit 1602 outputs the interpolated signal to the first high-pass filtering unit 1603 and the signal synthesis unit 408.

The first high-pass filtering unit 1603 receives the signal output from the first interpolation unit 1602 as input and extracts its high-frequency components. The high-frequency components are obtained with equations (1) and (2) given earlier. Because the signal input to the first high-pass filtering unit 1603 in Embodiment 2 has had its sampling frequency (resolution) raised by interpolation, it is desirable to set the band in equation (2) accordingly. For example, when the enlargement ratio is 2, the band in equation (2) is set to half of that used in Embodiment 1. Equations (1) and (2) may also be replaced by another method. However, it is desirable that the filters and interpolation functions used here, together with those used in the spatial decimation unit 104, the first interpolation unit 1602, the second high-pass filtering unit 406, and the second interpolation unit 407, satisfy a pyramid structure. The first high-pass filtering unit 1603 outputs the high-frequency components obtained here to the amplitude limiting / constant multiplication unit 405.

FIG. 17 shows the procedure for generating the high-resolution estimated signal using the configuration example of the high-resolution estimated signal generation unit 1601 shown in FIG. 16. Steps S504 through S509 are the same as in FIG. 5 (Embodiment 1) and are therefore denoted by the same step numbers.

First, the input signal is interpolated by the first interpolation unit 1602 [step S1701]. The interpolated signal is sent to the first high-pass filtering unit 1603 and the signal synthesis unit 408.

Next, the first high-pass filtering unit 1603 extracts a high-frequency component signal from the interpolated signal [step S1702]. The amplitude limiting / constant multiplication unit 405 applies amplitude limiting and constant multiplication to the extracted high-frequency component signal [step S504]. From there on, the high-resolution estimated signal is generated by the same procedure as steps S505 to S509 of Embodiment 1.

FIG. 18 shows the high-resolution estimated signal restoration unit 1801 of Embodiment 2. The high-resolution estimated signal restoration unit 1801 comprises the first interpolation unit 1602, the first high-pass filtering unit 1603, the amplitude limiting / constant multiplication unit 405, the second high-pass filtering unit 406, the signal synthesis unit 408, and the entropy decoding unit 709. Because the functions of these parts can be realized by the same parts as in FIG. 4, FIG. 7, and FIG. 16, they are denoted by the same reference numbers.

FIG. 19 shows the procedure for restoring the high-resolution estimated signal using the configuration example of the high-resolution estimated signal restoration unit 1801 shown in FIG. 18. Step S901 and steps S905 through S907 are the same as in FIG. 9 (Embodiment 1) and are therefore denoted by the same step numbers.

The bitstream corresponding to the parameters, obtained from the extraction unit 109, is decoded by the entropy decoding unit 709 and sent to the amplitude limiting / constant multiplication unit 405 [step S901].
The base layer decoded signal is interpolated to the enhancement layer resolution by the first interpolation unit 1602 [step S1901].

The first high-pass filtering unit 1603 extracts a high-frequency component signal from the interpolated base layer decoded signal [step S1902]. The amplitude limiting / constant multiplication unit 405 applies amplitude limiting and constant multiplication to the extracted high-frequency component signal [step S905]. The second high-pass filtering unit 406 applies high-pass filtering to the signal that has undergone amplitude limiting and constant multiplication, yielding the estimated high-frequency component signal [step S906].

The signal synthesis unit 408 adds the interpolated base layer decoded signal and the estimated high-frequency component signal to obtain the high-resolution estimated signal [step S907].

FIG. 1 is a block diagram showing an example of a hierarchical encoding / decoding apparatus to which Embodiment 1 of the present invention is applied.
FIG. 2 is a flowchart showing the operation of the encoding unit of the apparatus shown in FIG. 1.
FIG. 3 is a flowchart showing the operation of the decoding unit of the apparatus shown in FIG. 1.
FIG. 4 is a block diagram showing the high-resolution estimated signal generation unit and the enhancement layer encoding unit in the encoding unit of the apparatus shown in FIG. 1.
FIG. 5 is a flowchart showing the operation of the high-resolution estimated signal generation unit shown in FIG. 4.
FIG. 6 is a flowchart showing the operation of the enhancement layer encoding unit shown in FIG. 4.
FIG. 7 is a block diagram showing the high-resolution estimated signal restoration unit and the enhancement layer decoding unit in the decoding unit of the apparatus shown in FIG. 1.
FIG. 8 is a flowchart showing the operation of the enhancement layer decoding unit shown in FIG. 7.
FIG. 9 is a flowchart showing the operation of the high-resolution estimated signal restoration unit shown in FIG. 7.
FIG. 10 is a block diagram showing an example of an information processing apparatus that executes an encoding and decoding program to which an embodiment of the present invention is applied.
FIG. 11 is a block diagram showing the encoding unit and the decoding unit of the prior art.
FIG. 12 is a flowchart showing the operation of the encoding unit of the prior art.
FIG. 13 is a flowchart showing the operation of the decoding unit of the prior art.
FIG. 14 is a block diagram showing a prior-art image enlargement unit with high-frequency component estimation.
FIG. 15 is a flowchart showing the operation of the prior-art image enlargement unit with high-frequency component estimation.
FIG. 16 is a block diagram showing the high-resolution estimated signal generation unit in a hierarchical encoding / decoding apparatus to which Embodiment 2 of the present invention is applied.
FIG. 17 is a flowchart showing the operation of the high-resolution estimated signal generation unit shown in FIG. 16.
FIG. 18 is a block diagram showing the high-resolution estimated signal restoration unit in a hierarchical encoding / decoding apparatus to which Embodiment 2 of the present invention is applied.
FIG. 19 is a flowchart showing the operation of the high-resolution estimated signal restoration unit shown in FIG. 18.

Explanation of symbols

101 Encoding unit
102 Communication line or media
103 Decoding unit
104 Spatial decimation unit
105 Base layer encoding unit
106 High-resolution estimated signal generation unit
107 Enhancement layer encoding unit
108 Multiplexing unit
109 Extraction unit
110 Base layer decoding unit
111 High-resolution estimated signal restoration unit
112 Enhancement layer decoding unit
403 First high-pass filtering unit
404 First interpolation unit
405 Amplitude limitation and constant multiplication processing unit
406 Second high-pass filtering unit
407 Second interpolation unit
408 Signal synthesis unit
409 Estimation degree determination unit
410 Entropy encoding unit
411 Frame memory 1
412 Frame memory 2
413 Motion estimation unit
414 Motion compensation unit
415 Intra prediction unit
416 Prediction signal selection unit
417 Prediction error signal generation unit
418 Orthogonal transform and quantization unit
419 Entropy encoding unit
420 Inverse quantization and inverse orthogonal transform unit
421 Signal synthesis unit
422 Deblocking filter unit
701 High-resolution estimated signal restoration unit
702 Enhancement layer decoding unit
709 Entropy decoding unit
710 Entropy decoding unit
1001 Information processing apparatus
1002 External storage device
1003 Temporary storage device
1004 Communication device
1005 Input device
1006 Central processing control device
1007 Output device
1101 Encoding unit
1102 Communication line or media
1103 Decoding unit
1104 Spatial decimation unit
1105 Base layer encoding unit
1106 Spatial interpolation unit
1107 Enhancement layer encoding unit
1108 Multiplexing unit
1109 Extraction unit
1110 Base layer decoding unit
1111 Spatial interpolation unit
1112 Enhancement layer decoding unit
1401 Image enlargement unit with high-frequency component estimation
1402 First high-pass filtering unit
1403 First interpolation unit
1404 Amplitude processing and constant multiplication processing unit
1405 Second high-pass filtering unit
1406 Second interpolation unit
1407 Signal synthesis unit
1601 High-resolution estimated signal generation unit
1602 First interpolation unit
1603 First high-pass filtering unit
1801 High-resolution estimated signal restoration unit

Claims

A video signal hierarchical encoding apparatus that encodes a video signal having a lower resolution than an input video signal, the lower-resolution video signal being obtained by decomposing the input video signal into layers of different resolutions, generates a prediction signal from the lower-resolution video signal, encodes the input video signal on the higher-resolution side by inter-spatial-resolution prediction using the prediction signal, and thereby obtains encoded data of video signals of different resolutions, the apparatus comprising:
spatial reduction means for spatially reducing the input video signal to obtain a first video signal having a lower resolution than the input video signal;
first encoding means for obtaining first encoded data by encoding the first video signal using an encoding process that includes a local decoding process;
spatial enlargement means for performing high-resolution processing that generates a second video signal, which is a high-resolution enlarged video signal, by adding a second high-frequency component signal to a second local decoded signal, the second high-frequency component signal being obtained by enlarging a first high-frequency component signal, extracted from a first local decoded signal obtained by the local decoding process, to the resolution of the input video signal and then applying amplitude limitation and constant multiplication processing, and the second local decoded signal being obtained by enlarging the first local decoded signal to the resolution of the input video signal, wherein the spatial enlargement means compares the generated second video signal with the input video signal and repeats the generation and comparison of the second video signal, while changing parameters for the amplitude limitation and constant multiplication processing, until the comparison result satisfies a predetermined condition, thereby obtaining the second video signal that satisfies the predetermined condition and the parameters used to generate that second video signal;
second encoding means for using the second video signal finally obtained by the spatial enlargement means as a prediction signal generated from the first video signal, generating a prediction signal of inter-spatial-resolution prediction at the resolution of the input video signal either by selecting among, or by weighting and then combining, a spatial-direction prediction signal that is the result of spatial-direction prediction at the resolution of the input video signal, a temporal-direction prediction signal that is the result of temporal-direction prediction, and the prediction signal generated from the first video signal, and encoding the result of subtracting the prediction signal of inter-spatial-resolution prediction from the input video signal, thereby obtaining second encoded data that is encoded data of the video signal on the higher-resolution side;
third encoding means for obtaining third encoded data by encoding the parameters for the amplitude limitation and constant multiplication processing finally obtained by the spatial enlargement means; and
multiplexing means for multiplexing the first to third encoded data.
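The parameter search performed by the spatial enlargement means in the claim above can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: signals are 1-D, the high-pass kernel, the linear 2x enlargement, the sum of squared errors as the comparison, the threshold as the "predetermined condition", and every function name are hypothetical choices.

```python
def high_pass(signal):
    """[-1, 2, -1] / 4 high-pass filter with edge replication (assumed kernel)."""
    n = len(signal)
    return [
        (2 * signal[i] - signal[max(i - 1, 0)] - signal[min(i + 1, n - 1)]) / 4.0
        for i in range(n)
    ]


def enlarge_2x(signal):
    """Enlarge by a factor of 2 with linear interpolation (assumed kernel)."""
    out = []
    for i, value in enumerate(signal):
        nxt = signal[i + 1] if i + 1 < len(signal) else value
        out.extend([float(value), (value + nxt) / 2.0])
    return out


def limit_and_scale(signal, limit, gain):
    """Amplitude limitation (clip to [-limit, limit]) then constant multiplication."""
    return [gain * max(-limit, min(limit, v)) for v in signal]


def estimate_high_res(local_decoded, limit, gain):
    """Generate the second video signal for one (limit, gain) parameter pair."""
    first_hf = high_pass(local_decoded)                 # first high-frequency component
    second_hf = limit_and_scale(enlarge_2x(first_hf), limit, gain)
    enlarged = enlarge_2x(local_decoded)                # second local decoded signal
    return [b + h for b, h in zip(enlarged, second_hf)]


def sse(a, b):
    """Sum of squared errors, used here as the comparison (an assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_parameters(input_signal, local_decoded, candidates, threshold):
    """Try candidate parameters in order; stop at the first whose error meets
    the threshold, otherwise return the best candidate seen.

    Returns (error, limit, gain, estimate), or None if candidates is empty.
    """
    best = None
    for limit, gain in candidates:
        estimate = estimate_high_res(local_decoded, limit, gain)
        error = sse(estimate, input_signal)
        if best is None or error < best[0]:
            best = (error, limit, gain, estimate)
        if error <= threshold:
            break
    return best


# Hypothetical data: an 8-sample input and its 4-sample local decoded signal.
input_signal = [0, 0, 0, 0, 8, 8, 8, 8]
local_decoded = [0, 0, 8, 8]
candidates = [(1.0, 0.0), (1.0, 1.0), (4.0, 2.0)]
result = search_parameters(input_signal, local_decoded, candidates, threshold=1.0)
```

The returned `limit` and `gain` correspond to the parameters that the third encoding means would encode, so that a decoder can regenerate the same second video signal without access to the input video signal.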

A video signal hierarchical encoding method that encodes a video signal having a lower resolution than an input video signal, the lower-resolution video signal being obtained by decomposing the input video signal into layers of different resolutions, generates a prediction signal from the lower-resolution video signal, encodes the input video signal on the higher-resolution side by inter-spatial-resolution prediction using the prediction signal, and thereby obtains encoded data of video signals of different resolutions, the method comprising:
a spatial reduction step of spatially reducing the input video signal to obtain a first video signal having a lower resolution than the input video signal;
a first encoding step of obtaining first encoded data by encoding the first video signal using an encoding process that includes a local decoding process;
a spatial enlargement step of performing high-resolution processing that generates a second video signal, which is a high-resolution enlarged video signal, by adding a second high-frequency component signal to a second local decoded signal, the second high-frequency component signal being obtained by enlarging a first high-frequency component signal, extracted from a first local decoded signal obtained by the local decoding process, to the resolution of the input video signal and then applying amplitude limitation and constant multiplication processing, and the second local decoded signal being obtained by enlarging the first local decoded signal to the resolution of the input video signal, wherein the spatial enlargement step compares the generated second video signal with the input video signal and repeats the generation and comparison of the second video signal, while changing parameters for the amplitude limitation and constant multiplication processing, until the comparison result satisfies a predetermined condition, thereby obtaining the second video signal that satisfies the predetermined condition and the parameters used to generate that second video signal;
a second encoding step of using the second video signal finally obtained in the spatial enlargement step as a prediction signal generated from the first video signal, generating a prediction signal of inter-spatial-resolution prediction at the resolution of the input video signal either by selecting among, or by weighting and then combining, a spatial-direction prediction signal that is the result of spatial-direction prediction at the resolution of the input video signal, a temporal-direction prediction signal that is the result of temporal-direction prediction, and the prediction signal generated from the first video signal, and encoding the result of subtracting the prediction signal of inter-spatial-resolution prediction from the input video signal, thereby obtaining second encoded data that is encoded data of the video signal on the higher-resolution side;
a third encoding step of obtaining third encoded data by encoding the parameters for the amplitude limitation and constant multiplication processing finally obtained in the spatial enlargement step; and
a multiplexing step of multiplexing the first to third encoded data.
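The second encoding step forms its prediction either by selecting one of three candidate prediction signals or by weighting and combining them, and then encodes the difference from the input. The sketch below illustrates both options on 1-D signals. The selection criterion (sum of absolute differences), the weights, and all names are assumptions; the claim does not specify them.

```python
def combine_predictions(predictions, weights):
    """Weighted sample-wise combination of equally sized prediction signals."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per prediction signal")
    length = len(predictions[0])
    return [
        sum(w * p[i] for p, w in zip(predictions, weights))
        for i in range(length)
    ]


def select_prediction(target, predictions):
    """Select the candidate with the smallest sum of absolute differences.

    SAD is an assumed criterion. Returns (index, prediction).
    """
    def sad(prediction):
        return sum(abs(t - v) for t, v in zip(target, prediction))

    index = min(range(len(predictions)), key=lambda k: sad(predictions[k]))
    return index, predictions[index]


def residual(target, prediction):
    """Prediction error: the input signal minus the prediction signal."""
    return [t - p for t, p in zip(target, prediction)]


# Hypothetical signals at the resolution of the input video signal.
target = [10, 12, 14, 16]
spatial = [10, 10, 14, 14]      # spatial-direction (intra) prediction
temporal = [9, 12, 15, 16]      # temporal-direction (motion-compensated) prediction
inter_layer = [10, 12, 14, 17]  # prediction generated from the first video signal

index, chosen = select_prediction(target, [spatial, temporal, inter_layer])
error_signal = residual(target, chosen)
blended = combine_predictions([spatial, temporal, inter_layer], [0.25, 0.25, 0.5])
```

In the symbol list, selection corresponds to the prediction signal selection unit 416 and the subtraction to the prediction error signal generation unit 417; the residual then passes to orthogonal transform and quantization.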

A video signal hierarchical encoding program for causing a computer to execute operations of encoding a video signal having a lower resolution than an input video signal, the lower-resolution video signal being obtained by decomposing the input video signal into layers of different resolutions, generating a prediction signal from the lower-resolution video signal, encoding the input video signal on the higher-resolution side by inter-spatial-resolution prediction using the prediction signal, and thereby obtaining encoded data of video signals of different resolutions, the program causing the computer to function as:
spatial reduction means for spatially reducing the input video signal to obtain a first video signal having a lower resolution than the input video signal;
first encoding means for obtaining first encoded data by encoding the first video signal using an encoding process that includes a local decoding process;
spatial enlargement means for performing high-resolution processing that generates a second video signal, which is a high-resolution enlarged video signal, by adding a second high-frequency component signal to a second local decoded signal, the second high-frequency component signal being obtained by enlarging a first high-frequency component signal, extracted from a first local decoded signal obtained by the local decoding process, to the resolution of the input video signal and then applying amplitude limitation and constant multiplication processing, and the second local decoded signal being obtained by enlarging the first local decoded signal to the resolution of the input video signal, wherein the spatial enlargement means compares the generated second video signal with the input video signal and repeats the generation and comparison of the second video signal, while changing parameters for the amplitude limitation and constant multiplication processing, until the comparison result satisfies a predetermined condition, thereby obtaining the second video signal that satisfies the predetermined condition and the parameters used to generate that second video signal;
second encoding means for using the second video signal finally obtained by the spatial enlargement means as a prediction signal generated from the first video signal, generating a prediction signal of inter-spatial-resolution prediction at the resolution of the input video signal either by selecting among, or by weighting and then combining, a spatial-direction prediction signal that is the result of spatial-direction prediction at the resolution of the input video signal, a temporal-direction prediction signal that is the result of temporal-direction prediction, and the prediction signal generated from the first video signal, and encoding the result of subtracting the prediction signal of inter-spatial-resolution prediction from the input video signal, thereby obtaining second encoded data that is encoded data of the video signal on the higher-resolution side;
third encoding means for obtaining third encoded data by encoding the parameters for the amplitude limitation and constant multiplication processing finally obtained by the spatial enlargement means; and
multiplexing means for multiplexing the first to third encoded data.