JP4819856B2

JP4819856B2 - Moving picture coding method, moving picture coding apparatus, moving picture coding program, and computer-readable recording medium recording the program

Info

Publication number: JP4819856B2
Application number: JP2008209861A
Authority: JP
Inventors: 幸浩坂東; 和也早瀬; 誠之高村; 一人上倉; 由幸八島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2008-08-18
Filing date: 2008-08-18
Publication date: 2011-11-24
Anticipated expiration: 2028-08-18
Also published as: JP2010045722A

Description

The present invention relates to a moving picture coding method and apparatus that code a moving picture by compressing information through transform coding and quantization of a prediction error signal, to a moving picture coding program used to implement the method, and to a computer-readable recording medium on which the program is recorded.

[Coding scheme using a squared-error cost function]
With the introduction of intra prediction and variable-block-shape motion compensation, H.264 offers more prediction modes than earlier standards. Reducing the code amount while maintaining a given subjective image quality therefore requires selecting an appropriate prediction mode. The H.264 reference software JM (see Non-Patent Document 1) selects the prediction mode that minimizes the following R-D cost. In the notation below, the symbol ^ in "^X" (where X is a letter) denotes a mark placed above "X".

Here, S is the original signal, q is the quantization parameter, m is the number identifying the prediction mode, and ^S_{m,q} is the decoded signal obtained when the original signal S is predicted with prediction mode m and quantized with quantization parameter q. λ is the Lagrange multiplier used for mode selection. D(S, ^S_{m,q}) is the sum of squared errors given by the following equation.

Here, S^Y, S^U, and S^V are the Y, U, and V components of the original signal, and ^S^Y_{m,q}, ^S^U_{m,q}, and ^S^V_{m,q} are the Y, U, and V components of the decoded signal. x₀ and y₀ are the coordinates of the position in the block closest to the origin.
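The equation images for the R-D cost did not survive in this text. As a minimal sketch, assume the usual Lagrangian form J = D(S, ^S_{m,q}) + λ·R, where R is the number of bits spent when coding with mode m. The rate term, the function names, and the `candidates` structure are assumptions made for illustration; they are not taken from JM.

```python
def ssd(orig, dec):
    """Sum of squared differences between two equally sized 2-D blocks."""
    return sum((o - d) ** 2 for ro, rd in zip(orig, dec) for o, d in zip(ro, rd))


def distortion(orig_yuv, dec_yuv):
    """D(S, ^S): squared error summed over the Y, U and V components."""
    return sum(ssd(orig_yuv[c], dec_yuv[c]) for c in ("Y", "U", "V"))


def select_mode(orig_yuv, candidates, lam):
    """Return the mode number m that minimises J = D + lam * R.

    candidates maps m -> (decoded_yuv, rate_bits). Both values are hypothetical
    inputs that a real encoder would obtain by coding the block with mode m.
    """
    def cost(m):
        decoded, rate_bits = candidates[m]
        return distortion(orig_yuv, decoded) + lam * rate_bits

    return min(candidates, key=cost)


orig = {"Y": [[10, 12], [14, 16]], "U": [[5]], "V": [[7]]}
candidates = {
    0: ({"Y": [[10, 12], [14, 15]], "U": [[5]], "V": [[7]]}, 40),  # D = 1, R = 40
    1: ({"Y": [[9, 12], [14, 14]], "U": [[5]], "V": [[8]]}, 10),   # D = 6, R = 10
}
```

With a small λ the low-distortion mode 0 wins; with a larger λ the cheaper mode 1 wins, which is the trade-off the multiplier controls.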

The method of calculating the decoded signal in H.264 is described below. The symbols used in the description are summarized in the following table.

Let P_m be the prediction signal obtained for the original signal S with the prediction method of mode number m. In H.264 encoding, the prediction error signal R_m (= S − P_m) for mode number m undergoes an orthogonal transform using the transform matrix ~Φ, as in the following equation. In the notation below, the symbol ~ in "~X" (where X is a letter) denotes a mark placed above "X".

Here, ~Φ^t denotes the transpose of the transform matrix ~Φ. The transform matrix ~Φ is an orthogonal matrix with integer elements, given by the following equation.

Next, because the matrix ~Φ is not normalized, processing equivalent to normalizing the matrix is performed, as shown below.

This is equivalent to using Φ given by the following equation in place of ~Φ in Equation (3).
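The matrix images are missing from this text. As a sketch, assume ~Φ is the widely documented 4×4 integer core transform matrix of H.264 and that Φ is obtained by scaling each row to unit norm. The helper names are illustrative only.

```python
import math

# Assumed integer core transform matrix of the H.264 4x4 transform (stand-in for ~Φ).
PHI_INT = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]


def normalize_rows(mat):
    """Divide every row by its Euclidean norm, giving the normalized matrix Φ."""
    return [[v / math.sqrt(sum(w * w for w in row)) for v in row] for row in mat]


def forward_transform(block, phi):
    """C = phi * R * phi^t for a square block R."""
    return matmul(matmul(phi, block), transpose(phi))


PHI = normalize_rows(PHI_INT)
residual = [[5, -3, 0, 1], [2, 2, -1, 0], [0, 1, 4, -2], [1, 0, 0, 3]]
coeffs = forward_transform(residual, PHI)
energy_pixels = sum(v * v for row in residual for v in row)
energy_coeffs = sum(v * v for row in coeffs for v in row)
```

The rows of the integer matrix are mutually orthogonal but have squared norms 4 and 10, so only the row-scaled matrix preserves signal energy between the pixel and coefficient domains.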

C_m is then quantized using the quantization parameter q, as in the following equation. In the H.264 reference software JM, the normalization is built into the quantization.

In H.264 decoding, on the other hand, V is inverse-quantized as in the following equation to obtain the decoded transform coefficients ^C_{m,q}.

Next, ^C_{m,q} is inverse-transformed as in the following equation to obtain the decoded prediction error signal.

Finally, the decoded signal of the picture being coded is obtained by the following equation.
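The chain from residual to decoded signal can be sketched as follows. This is an illustration under stated assumptions, not the H.264 arithmetic: the quantizer is a plain uniform quantizer with a hypothetical step size `step`, and Φ is taken to be the row-normalized 4×4 core transform matrix.

```python
import math


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]


# Assumption: row-normalized H.264 4x4 core transform used as the orthonormal Φ.
_ROWS = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]
PHI = [[v / math.sqrt(sum(w * w for w in row)) for v in row] for row in _ROWS]


def encode_decode(orig, pred, step):
    """Return (levels, decoded) for one 4x4 block using a uniform quantizer."""
    n = len(orig)
    resid = [[orig[y][x] - pred[y][x] for x in range(n)] for y in range(n)]  # R = S - P
    coeff = matmul(matmul(PHI, resid), transpose(PHI))                       # C = Φ R Φ^t
    levels = [[round(c / step) for c in row] for row in coeff]               # quantization
    coeff_hat = [[lv * step for lv in row] for row in levels]                # inverse quantization
    resid_hat = matmul(matmul(transpose(PHI), coeff_hat), PHI)               # inverse transform
    decoded = [[pred[y][x] + resid_hat[y][x] for x in range(n)] for y in range(n)]
    return levels, decoded


orig = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
pred = [[50, 50, 60, 60], [70, 60, 60, 70], [60, 60, 60, 90], [70, 60, 70, 100]]
```

A very fine step reproduces the original block almost exactly, while a very coarse step zeroes every level so that the decoded block collapses to the prediction.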

[Weighting the distortion amount in consideration of subjective image quality]
As described above, the measure of subjective image quality used in the H.264 reference software JM is the squared error. However, the squared error is not necessarily a distortion amount that reflects subjective degradation of image quality. For example, changes in high-frequency components are harder to detect visually than changes in low-frequency components.

An encoder that does not exploit such visual characteristics (for example, JM) therefore leaves room for improvement in reducing the code amount efficiently.

Studies have accordingly been made on exploiting the differences in visual sensitivity across spatio-temporal frequency components. One example sets the quantization step size adaptively according to the displacement amount (see Patent Document 1).

Another example uses a distortion amount weighted according to spatio-temporal visual sensitivity in the cost function for selecting coding parameters (see Non-Patent Document 2).

The method described in Non-Patent Document 2 defines a distortion amount corresponding to subjective image quality by weighting the distortion of the orthogonal transform coefficients for each spatial frequency component according to visual sensitivity. To account for visual sensitivity in the temporal direction as well, the weighted distortion amount is further weighted according to the displacement amount. The distortion amount weighted in this way according to spatio-temporal visual sensitivity is then used in the cost function for selecting coding parameters.

The weighting of the quantization error signal based on visual sensitivity is described below. The methods described below (including the present invention) assume the R-D cost of the following equation.

Here, C_m denotes the transform coefficients of the prediction error signal R_m for mode number m, and ^C_{m,q} denotes the decoded transform coefficients obtained by quantizing and inverse-quantizing C_m with quantization parameter q. The following weighted distortion amount is used as the distortion amount in calculating this R-D cost.

Here, C_m^{s(i)}[k, l] (s = Y, U, V) is an element of C_m: a transform coefficient contained in the sub-block that is scanned i-th in raster-scan order among the sub-blocks (N × N pixels) of the macroblock (16 × 16 pixels for the Y component, 8 × 8 pixels for the U and V components). Likewise, ^C_{m,q}^{s(i)}[k, l] (s = Y, U, V) is an element of ^C_{m,q}: the decoded transform coefficient contained in the i-th sub-block of the same macroblock in raster-scan order.

W_{k,l}^s (s = Y, U, V) is a weighting coefficient set to a value of 1 or less, hereinafter called the sensitivity coefficient. In Equation (12), setting W_{k,l}^s to a small value corresponds to estimating the quantization distortion ~D(C_m, ^C_{m,q}) to be small.

Because the orthogonal transform is normalized, setting W_{k,l}^s = 1 (∀ k, l; s = Y, U, V) makes the weighted distortion amount equivalent to the sum of squared errors.
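Equation (12) itself is missing here. From the surrounding definitions it is presumably a sum, over sub-blocks i and coefficient positions (k, l), of W_{k,l} times the squared coefficient error. The sketch below implements that reading for a single colour component; the function name and data layout are assumptions.

```python
def weighted_distortion(coeff_blocks, decoded_blocks, weights):
    """Sum over sub-blocks i and positions (k, l) of W[k][l] * (C - ^C)^2.

    coeff_blocks and decoded_blocks are lists of N x N coefficient arrays,
    one per sub-block in raster-scan order; weights is a single N x N array.
    """
    total = 0.0
    for coeff, coeff_hat in zip(coeff_blocks, decoded_blocks):
        for k, (row, row_hat) in enumerate(zip(coeff, coeff_hat)):
            for l, (value, value_hat) in enumerate(zip(row, row_hat)):
                total += weights[k][l] * (value - value_hat) ** 2
    return total


coeff_blocks = [[[4.0, 2.0], [2.0, 1.0]]]
decoded_blocks = [[[4.0, 0.0], [0.0, 0.0]]]      # high-frequency terms quantized to zero
unit_weights = [[1.0, 1.0], [1.0, 1.0]]          # W = 1: plain squared error
visual_weights = [[1.0, 0.5], [0.5, 0.25]]       # illustrative values only
```

With unit weights the result is the ordinary sum of squared coefficient errors; with weights below 1 at higher frequencies the same quantization error is counted as a smaller distortion.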

W_{k,l}^s (s = Y, U, V) takes smaller values as the spatial and temporal frequencies increase. A specific calculation method is discussed in Non-Patent Document 2.
Non-Patent Document 1: K. P. Lim, G. Sullivan and T. Wiegand, Text Description of Joint Model Reference Encoding Methods and Decoding Concealment Methods. Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, JVT-R95, Jan. 2006. http://ftp3.itu.ch/av-arch/jvt-site/2006_01_Bangkok/JVT-R095.zip
Patent Document 1: JP-A-5-236444
Non-Patent Document 2: Yukihiro Bando, Kazuya Hayase, Seishi Takamura, Kazuto Kamikura, Yoshiyuki Yashima, A study of an H.264/AVC mode selection method based on spatio-temporal visual sensitivity characteristics. Annual Convention of the Institute of Image Information and Television Engineers, 7-4, 2007.

In the above method of weighting the distortion amount based on a contrast sensitivity function (the method described in Non-Patent Document 2), the sensitivity-function weighting is applied macroblock by macroblock, so the weighted distortion amount does not reflect discontinuities at block boundaries (block distortion).

In block-based coding schemes that combine motion-compensated inter-frame prediction with an orthogonal transform (for example, H.264), block distortion is a characteristic coding distortion. If block distortion is not taken into account, the resulting weighted distortion amount can fail to reflect subjective image quality correctly.

The present invention has been made in view of these circumstances. Its object is to provide a new moving picture coding technique that, for block-based coding schemes combining motion-compensated inter-frame prediction with an orthogonal transform (for example, H.264), introduces a measure of coding distortion that properly evaluates subjective image quality including block distortion, so that the cost function used to select coding parameters reflects subjective image quality including block distortion.

The reason why block distortion was not reflected in the coding distortion measure of the conventional method described in Non-Patent Document 2 is considered below.

The conventional method weights the DCT coefficients of each macroblock based on a contrast sensitivity function. It therefore reflects the contrast sensitivity to the waveform inside each macroblock, but does not consider discontinuities between adjacent blocks.

Taking the one-dimensional signal shown in FIG. 10 as an example, processing that is confined to individual blocks and weights the DCT coefficients of each block (block k−1, block k, block k+1) cannot detect discontinuities between blocks (between block k−1 and block k, or between block k and block k+1).

The present invention therefore performs a frequency analysis that considers discontinuities between adjacent blocks together with the waveform inside each block, and weights the distortion amount based on a contrast sensitivity function.

[Calculation of the sensitivity coefficient]
In the present invention, the distortion amount in each block is weighted based on a spatio-temporal visual sensitivity function. The inputs to the calculation of the weighting coefficients are the transform coefficients and the displacement amount (a quantity, such as a motion vector, that indicates the temporal motion of the image signal). For frames coded with intra prediction, the displacement amount can also be obtained by determining the temporal motion of the image signal between frames.

In the following, an image of height H is assumed to be viewed at a viewing distance rH; r is called the viewing distance parameter. To simplify the notation, the index distinguishing Y, U, and V is omitted and the Y component is discussed. The same discussion applies to the U and V components.

Transforming a two-dimensional signal with an orthogonal transform means representing the signal using the basis images of that transform. Let φ_k be the k-th column vector (an N-dimensional vector) of the transform matrix Φ (an N × N matrix). The basis images of this matrix are then given by
f_{k,l}(x, y) = φ_k[y] φ_l[x]^t (0 ≤ x, y ≤ N−1)
where φ_l^t is the transpose of φ_l. In H.264, N is either 4 or 8.

The prediction error signal R_m[x, y] in a region of N × N pixels (N i_x ≤ x ≤ N i_x + N − 1, N i_y ≤ y ≤ N i_y + N − 1) is abbreviated as R_m^{(i_x,i_y)} and is hereinafter called a reference block. Here, i_x and i_y are integers indicating the position of the reference block. If the corresponding orthogonal transform coefficients are C_m^{(i_x,i_y)}[k, l] (0 ≤ k, l ≤ N−1), the prediction error signal R_m^{(i_x,i_y)} can be expressed as in the following equation.
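The expansion of a reference block over the basis images can be sketched as follows. An orthonormal DCT-II basis is used here purely as a stand-in for the vectors φ_k, and the function names are illustrative; the text's own transform matrix and its row/column convention are not reproduced.

```python
import math


def basis_vectors(n):
    """Orthonormal DCT-II vectors, used as a stand-in for the phi_k."""
    vectors = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        vectors.append([scale * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
                        for x in range(n)])
    return vectors


def basis_image(phi, k, l):
    """f_{k,l}(x, y) = phi_k[y] * phi_l[x], stored as image[y][x]."""
    n = len(phi)
    return [[phi[k][y] * phi[l][x] for x in range(n)] for y in range(n)]


def analyse(block, phi):
    """C[k][l] = inner product of the block with the basis image f_{k,l}."""
    n = len(phi)
    return [[sum(block[y][x] * phi[k][y] * phi[l][x]
                 for y in range(n) for x in range(n))
             for l in range(n)] for k in range(n)]


def synthesise(coeff, phi):
    """Rebuild the block as the sum over (k, l) of C[k][l] * f_{k,l}."""
    n = len(phi)
    out = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for l in range(n):
            image = basis_image(phi, k, l)
            for y in range(n):
                for x in range(n):
                    out[y][x] += coeff[k][l] * image[y][x]
    return out
```

Because the stand-in basis is orthonormal, analysing a block and synthesising it again returns the original samples.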

Consider now the subjective image quality, with block distortion taken into account, of a prediction error signal R_m[x, y] (N i_{x0} ≤ x ≤ N(i_{x0} + M_x) − 1, N i_{y0} ≤ y ≤ N(i_{y0} + M_y) − 1) that contains M_x N × M_y N pixels and is composed of M_x × M_y reference blocks R_m^{(i_x,i_y)}. This signal is called the analysis target block.

Because the analysis target block is composed of M_x × M_y reference blocks, applying the Fourier transform to it makes it possible to evaluate not only the content of each reference block but also the discontinuities between adjacent reference blocks (block distortion).

To express the relationship between the analysis target block and each reference block, R_m^{(i_x,i_y)} is zero-padded to obtain a signal of M_x N × M_y N pixels, as in the following equation.

Similarly, each basis image f_{k,l}(x, y) (0 ≤ x, y ≤ N−1) is zero-padded to obtain a signal of M_x N × M_y N pixels, as in the following equation.

The zero-padded result ~f_{k,l}^{(p)}(x, y) (x = 0, ..., M_x N − 1; y = 0, ..., M_y N − 1; p = 0, ..., M_x M_y − 1) is called a modified basis image. For example, when N = 4, M_x = 2, and M_y = 2, the basis image is placed in the shaded 4 × 4 pixels and zeros are padded at all other positions, as shown in FIG. 1.

Here, FIG. 1(a) corresponds to i_x = i_{x0} + 1, i_y = i_{y0}; FIG. 1(b) to i_x = i_{x0}, i_y = i_{y0}; FIG. 1(c) to i_x = i_{x0} + 1, i_y = i_{y0} + 1; and FIG. 1(d) to i_x = i_{x0}, i_y = i_{y0} + 1.

The analysis target block can then be expressed using the modified basis images as in the following equation.
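The zero-padding step that produces a modified basis image can be sketched as follows. The raster-order mapping from the index p to a block position is an assumption, since the text does not state how p is ordered, and the function names are illustrative.

```python
def block_position(p, mx):
    """Map block index p to (bx, by), assuming raster order across the mx columns."""
    return p % mx, p // mx


def modified_basis_image(basis, bx, by, mx, my):
    """Place the N x N basis image at block (bx, by) of an (my*N) x (mx*N)
    canvas and leave every other pixel at zero."""
    n = len(basis)
    canvas = [[0.0] * (mx * n) for _ in range(my * n)]
    for y in range(n):
        for x in range(n):
            canvas[by * n + y][bx * n + x] = basis[y][x]
    return canvas


tiny_basis = [[1.0, 2.0], [3.0, 4.0]]                 # illustrative 2 x 2 "basis image"
padded = modified_basis_image(tiny_basis, 1, 0, 2, 2)  # upper-right block of a 2 x 2 grid
```

Summing such canvases over all blocks p and all (k, l), each scaled by the corresponding transform coefficient, rebuilds the whole analysis target block, which is the content the missing Equation (16) is described as expressing.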

By the linearity of the Fourier transform, the Fourier transform (denoted F[·]) of both sides of Equation (16) can be expressed as in the following equation.

Equation (17) means that the Fourier transform coefficients obtained by Fourier-transforming the analysis target block (which contain not only the frequency components inside the reference blocks but also the frequency components caused by discontinuities between adjacent reference blocks) are expressed as a linear sum of the Fourier transform coefficients obtained by Fourier-transforming the modified basis images.

F[~f_{k,l}^{(i_x,i_y)}] is an M_x N × M_y N-dimensional complex vector, and its (u_x, u_y)-th element is the Fourier transform coefficient given by the following equation. In what follows, N = 2^m.

Here, j is the imaginary unit. u_x and u_y are called spatial frequency indices.

By taking ~f_{k,l}(x, y) rather than f_{k,l}(x, y) as the input to the Fourier transform in this way, the frequency analysis can also account for the frequency components caused by block discontinuities.
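A one-dimensional toy case illustrates the point. Two adjacent blocks that are each perfectly flat have no AC content when analysed separately, yet the step between them produces AC components in the transform of the joined signal, and that transform equals the sum of the transforms of the zero-padded blocks. The DFT sign and scaling convention used here is an assumption, since Equation (18) is missing from this text.

```python
import cmath


def dft(signal):
    """Unnormalised discrete Fourier transform of a 1-D sequence."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * cmath.pi * u * x / n) for x in range(n))
            for u in range(n)]


def zero_pad(block, position, count):
    """Embed the block at slot `position` of a signal made of `count` slots."""
    n = len(block)
    out = [0.0] * (n * count)
    out[position * n:(position + 1) * n] = block
    return out


flat_a = [1.0] * 4                 # block k: pure DC
flat_b = [3.0] * 4                 # block k+1: pure DC at a different level
joined = flat_a + flat_b           # signal with a step at the block boundary

spectrum = dft(joined)
linear_sum = [a + b for a, b in zip(dft(zero_pad(flat_a, 0, 2)),
                                    dft(zero_pad(flat_b, 1, 2)))]
```

Here `dft(flat_a)` is zero at every non-zero frequency, while `spectrum[1]` is clearly non-zero: that energy comes entirely from the discontinuity between the two blocks.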

The Fourier transform coefficients F_{k,l}(u_x, u_y) (0 ≤ u_x ≤ M_x N − 1, 0 ≤ u_y ≤ M_y N − 1) above are weighted as follows, where (d_x, d_y) is the displacement amount estimated for the analysis target block.

~F_{k,l}(u_x, u_y) is described below. Here, ^g(η, d) is a function set according to characteristics of the visual system such as contrast sensitivity, and is called the visual sensitivity function. It can be set, for example, by the methods described later under [Visual sensitivity function setting 1] or [Visual sensitivity function setting 2].

The sensitivity coefficient W_{k,l}^s(d_x, d_y) for the (k, l) basis component of the prediction error signal R_m is defined as the power ratio in the following equation.

The model parameters may be set differently for the luminance component and the chrominance components.

The meaning of Equation (20) is as follows. Because the analysis target block is composed of M_x × M_y reference blocks, Fourier-transforming it makes it possible to evaluate discontinuities between adjacent blocks as well as the content inside each block. As Equation (17) shows, Fourier-transforming the analysis target block is equivalent to computing a linear sum of the Fourier transform coefficients obtained by Fourier-transforming the modified basis images, with the orthogonal transform coefficients as the weights of the sum.

To evaluate discontinuities between adjacent blocks as well as the content inside each block, the present invention therefore computes the sum of squares of the expression inside {·} on the right-hand side of Equation (17) (the denominator of Equation (20)) and the corresponding sum of squares weighted by Equation (19) (the numerator of Equation (20)), and calculates the sensitivity coefficient W_{k,l}^s(d_x, d_y) for the (k, l) basis component of the prediction error signal R_m as their ratio according to Equation (20).

Incidentally, summing the denominator of Equation (20) over k and l gives the sum of squares of Equation (16), so the denominator of Equation (20) corresponds to the power of the prediction error signal.

Because the sensitivity coefficient W_{k,l}^s(d_x, d_y) determined in this way takes values that depend on the spatial and temporal frequencies, it allows a distortion amount corresponding to subjective image quality to be defined. Moreover, because it is determined by a frequency analysis that considers discontinuities between adjacent blocks together with the waveform inside each block, it allows a distortion amount corresponding to subjective image quality including block distortion to be defined.

That is, distortion in spatio-temporal frequency components that are hard to detect visually is evaluated as relatively small because the sensitivity coefficient W_{k,l}^s(d_x, d_y) is relatively small, whereas distortion in components that are easy to detect visually is evaluated as relatively large because the coefficient is relatively large. This is what makes the distortion amount correspond to subjective image quality. And because the value of W_{k,l}^s(d_x, d_y) comes from a frequency analysis that considers discontinuities between adjacent blocks together with the waveform inside each block, the distortion amount also covers block distortion.
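Equations (19) and (20) are missing from this text, so the following is only a sketch of one reading of the description: for a fixed (k, l), the denominator is the total power of the coefficient-weighted sum of the modified-basis-image spectra, and the numerator is the same power after each frequency bin is scaled by a visual sensitivity value. Combining the horizontal and vertical sensitivities as a product, the callable `g_hat`, and the fallback for zero power are all assumptions.

```python
import cmath


def dft2(image):
    """Unnormalised 2-D DFT; result[uy][ux] for an image stored as image[y][x]."""
    h, w = len(image), len(image[0])
    return [[sum(image[y][x] * cmath.exp(-2j * cmath.pi * (ux * x / w + uy * y / h))
                 for y in range(h) for x in range(w))
             for ux in range(w)] for uy in range(h)]


def sensitivity_coefficient(coeffs, spectra, g_hat, dx, dy):
    """Weighted power divided by unweighted power for one (k, l) basis component.

    coeffs[p]  : transform coefficient C^(p)[k, l] of reference block p
    spectra[p] : 2-D DFT of the modified basis image of block p
    g_hat(u, d): hypothetical visual sensitivity function
    """
    h, w = len(spectra[0]), len(spectra[0][0])
    numerator = denominator = 0.0
    for uy in range(h):
        for ux in range(w):
            mix = sum(c * s[uy][ux] for c, s in zip(coeffs, spectra))
            weight = g_hat(ux, dx) * g_hat(uy, dy)   # assumed separable weighting
            numerator += abs(weight * mix) ** 2
            denominator += abs(mix) ** 2
    # Assumed fallback: with no signal power there is nothing to attenuate.
    return numerator / denominator if denominator > 0 else 1.0
```

Because the coefficients of neighbouring blocks are combined before the power is taken, the relative phase of the modified basis images, and hence any step between blocks, influences the ratio.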

Next, [Visual sensitivity function setting 1] and [Visual sensitivity function setting 2] are described as examples of how to set the visual sensitivity function.

[Visual sensitivity function setting 1]
Consider a contrast sensitivity function of the following form.

Here, a₁, a₂, a₃, and a₄ are parameters that determine the form of the visual sensitivity function (hereinafter called model parameters). For example, the following values are used:
(a₁, a₂, a₃, a₄) = (6.1, 7.31, 2, 45.9)

η is the spatial frequency [cycles/degree], that is, the number of light-dark pairs per unit visual angle. Note that η is a one-dimensional spatial frequency.

η is related to the spatial frequency index u (either u_x or u_y) by
η(u, r) = θ(r, H) u / 2MN ... Equation (22)

Here, M = M_x when u = u_x, and M = M_y when u = u_y. θ(r, H) is the angle per pixel [degrees/pixel] when an image of height H is viewed at a viewing distance rH, and is given by
θ(r, H) = (1/H) × arctan(1/r) × (180/π) ... Equation (23)

ω is the change in angle per unit time [degrees/sec]. It is related to the displacement amount d (either d_x or d_y) by
ω(d) = tan^{-1}(f_r d / rH) ... Equation (24)
where f_r is the frame rate.
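Equations (22) to (24) can be transcribed directly, as in the sketch below. The formulas are implemented as they read in this text: the grouping θ·u/(2·M·N) is an assumption, because the flattened text does not show how the fraction was typeset, and no unit conversion is applied to the arctangent in Equation (24). The function and parameter names are illustrative.

```python
import math


def theta(r, height):
    """Equation (23): (1/H) * arctan(1/r) * (180/pi), the angle per pixel."""
    return (1.0 / height) * math.atan(1.0 / r) * (180.0 / math.pi)


def eta(u, r, height, m, n):
    """Equation (22), read as theta(r, H) * u / (2 * M * N)."""
    return theta(r, height) * u / (2 * m * n)


def omega(d, r, height, frame_rate):
    """Equation (24): arctan(f_r * d / (r * H)), with no unit conversion applied."""
    return math.atan(frame_rate * d / (r * height))
```

For example, `theta(1, 45)` evaluates to 1.0 because arctan(1) is 45 degrees, and `eta(8, 1, 45, 2, 4)` then evaluates to 0.5.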

Substituting Equations (22) and (24) into Equation (21) and expressing the contrast sensitivity function g(η, ω) as a function of u and d gives the visual sensitivity function ^g(u, d):
^g(u, d) = g(η(u), ω(d)) ... Equation (25)

[Visual sensitivity function setting 2]
Consider a contrast sensitivity function of the following form.

Here, b₁, b₂, b₃, and b₄ are parameters that determine the form of the visual sensitivity function (hereinafter called model parameters). For example, they take values such as the following:
(b₁, b₂, b₃, b₄) = (0.4992, 0.2964, -0.114, 1.1)
(b₁, b₂, b₃, b₄) = (0.2, 0.45, -0.18, 1)
(b₁, b₂, b₃, b₄) = (0.31, 0.69, -0.29, 1)
(b₁, b₂, b₃, b₄) = (0.246, 0.615, -0.25, 1)

η is related to the spatial frequency index u (either u_x or u_y) by
η(u, r) = θ(r, H) u / 2MN ... Equation (27)

Here, M = M_x when u = u_x, and M = M_y when u = u_y. θ(r, H) is the angle per pixel [degrees/pixel] when an image of height H is viewed at a viewing distance rH, and is given by
θ(r, H) = (1/H) × arctan(1/r) × (180/π) ... Equation (28)

In addition, r is varied adaptively according to the magnitude of the displacement amount d, for example as in the following equation, where A is a threshold and r₁ > r₂.

Substituting Equations (27) and (28) into Equation (26) and expressing the contrast sensitivity function g(η) as a function of u and d gives the visual sensitivity function ^g(u, d):
^g(u, d) = g(η(u, r(d))) ... Equation (30)

With the configuration described above, the present invention makes it possible to introduce a measure of coding distortion that properly evaluates subjective image quality including block distortion. The cost function used to select coding parameters can thereby reflect subjective image quality including block distortion, which enables highly efficient coding and a reduced code amount.

Next, the configuration of a moving picture coding apparatus according to the present invention is described.

The moving picture coding apparatus of the present invention codes a moving picture by compressing information through transform coding and quantization of an image signal, or of a prediction error signal obtained by intra-frame prediction and inter-frame prediction. For this purpose it comprises: (a) storage means for storing spatial frequency components calculated for modified basis images, which are defined in association with an analysis target block composed of a plurality of blocks subject to the transform matrix, each modified basis image being formed by placing a basis image of the transform matrix in one block and padding the other blocks with zeros, with as many modified basis images as there are blocks; (b) estimation means for estimating a displacement amount indicating the temporal motion of the image signal of the analysis target block; (c) weighting means for reading the spatial frequency components of the modified basis images from the storage means, calculating the visual sensitivity value assigned to each spatial frequency component from its spatial frequency index and the displacement amount estimated by the estimation means, and weighting the spatial frequency component accordingly; (d) calculation means for calculating the importance of each basis component of the prediction error signal from the spatial frequency components weighted by the weighting means, the unweighted spatial frequency components, and the transform coefficients of the blocks constituting the analysis target block; and (e) determination means for determining coding parameters by evaluating the coding cost using a coding distortion amount weighted by the importance calculated by the calculation means. (f) The calculation means obtains the product of the sum of squares of the transform coefficients and the sum of squared norms of the weighted spatial frequency components, and the product of the sum of squares of the transform coefficients and the sum of squared norms of the unweighted spatial frequency components, and calculates the importance as the quotient of these two products.

The storage means for the spatial frequency components of the modified basis images is provided because those components can be obtained independently of the image being encoded and therefore do not need to be recalculated each time.

In this configuration, the weighting means may calculate the visual sensitivity value assigned to a spatial frequency component of a modified basis image read from the storage means by calculating a horizontal visual sensitivity value from the horizontal spatial frequency index and the horizontal component of the estimated displacement, and a vertical visual sensitivity value from the vertical spatial frequency index and the vertical component of the estimated displacement.

The directional dependence of the spatial frequency components and of the motion is taken into account because the detection mechanism of the visual system in the spatio-temporal domain depends on the spatial edge direction and on the direction of motion. For example, when vertical stripes move, only the horizontal component of the motion can be perceived as motion. Evaluating visual sensitivity to spatio-temporal frequency therefore requires considering both the edge direction of the spatial frequency component and the direction of the displacement. For this reason, the weighting means may calculate the visual sensitivity value in a way that accounts for the directional dependence of the spatial frequency components and of the motion.

The moving image encoding method of the present invention, which is realized by the operation of the processing means described above, can also be implemented as a computer program. The program is provided on a suitable computer-readable recording medium or over a network, is installed when the invention is practiced, and realizes the invention by running on control means such as a CPU.

Thus, the present invention concerns moving image encoding in which a moving image is encoded by compressing an image signal, or a prediction error signal obtained by intra-frame and inter-frame prediction, through transform coding and quantization. Encoding parameters, such as the motion compensation block size, inter prediction mode, and quantization parameter in moving image encoding, or the intra prediction mode and quantization parameter in still image encoding, are determined on the basis of a Lagrangian cost function composed of a distortion, a code amount, and an undetermined multiplier. In doing so, the spatial frequency components within a block and the spatial frequency components associated with discontinuities between adjacent blocks are measured, and temporal frequency components are estimated from the obtained spatial frequency components and the motion vector obtained by motion estimation. An importance is calculated for each spatio-temporal frequency component on the basis of a visual sensitivity function; the distortion is obtained as a squared error weighted per frequency by that importance; the cost function is set as a weighted sum of this distortion and the code amount; and the encoding parameter is selected by choosing the mode that minimizes the cost function.

For a block-based encoding scheme that combines inter-frame prediction by motion compensation with an orthogonal transform, the present invention estimates the spatio-temporal frequency by taking into account the direction of the spatial frequency components and the direction of the displacement, and introduces a measure of encoding distortion that properly evaluates subjective image quality, including block distortion.

As a result, according to the present invention, the cost function used for selecting encoding parameters can reflect subjective image quality, including block distortion, which makes highly efficient encoding possible and reduces the code amount.

Hereinafter, the present invention will be described in detail in accordance with embodiments.

FIG. 2 shows the configuration of a moving image encoding apparatus 1 to which the present invention is applied.

The moving image encoding apparatus 1 to which the present invention is applied encodes a moving image in accordance with H.264. As shown in the figure, it comprises an encoding parameter selection unit 10 that selects the encoding parameters used for encoding, and an encoding unit 11 that encodes the moving image using the encoding parameters selected by the encoding parameter selection unit 10.

FIGS. 3 to 5 show an example of the flowcharts that the encoding parameter selection unit 10 executes to realize the present invention. These flowcharts assume that the encoding parameter selection unit 10 selects the optimal prediction mode as the encoding parameter.

When selecting the optimal prediction mode for encoding in accordance with the present invention, the encoding parameter selection unit 10 first sets, in step S101, the initial value of the prediction mode (the prediction mode used as the starting point), as shown in the flowchart of FIG. 3.

Next, in step S102, the registers are initialized: an initial cost with a large value is stored in the register that holds the minimum cost (hereinafter sometimes called the minimum cost register), and a meaningless value is stored in the register that holds the optimal prediction mode (hereinafter sometimes called the optimal mode register).

Next, in step S103, the displacement (the $(d_x, d_y)$ described above) is estimated, and the prediction error of each candidate vector is stored in a table. The method of estimating the displacement is assumed to be given externally. For example, the motion vector calculated by the H.264 reference software JM can be used as the estimate of the displacement in the steps below.

Next, in step S104, the code amount that results from encoding with the current prediction mode is calculated, taking as input the prediction mode currently set as the processing target, the prediction vector for that mode, the quantization parameter, the encoding target frame signal, and the reference frame signal. The specific calculation follows the method of the H.264 reference software JM.

Next, in step S105, a weight that reflects spatio-temporal visual sensitivity is first determined from the displacement estimated in step S103. Then, taking as input the prediction mode currently set as the processing target, the prediction vector for that mode, the quantization parameter, the encoding target frame signal, and the reference frame signal, the weighted distortion that results from encoding with that prediction mode is calculated from these inputs and the determined weight. The specific calculation is described with the flowcharts of FIGS. 4 and 5.

Next, in step S106, the undetermined multiplier for encoding with the current prediction mode is calculated, taking as input the prediction mode currently set as the processing target and the quantization parameter.

Next, in step S107, the R-D cost given by equation (11) is calculated from the code amount calculated in step S104, the weighted distortion calculated in step S105, and the undetermined multiplier calculated in step S106.

Next, in step S108, the calculated R-D cost is compared with the cost stored in the minimum cost register. If the calculated R-D cost is smaller than the stored cost, the process proceeds to step S109, where the calculated R-D cost is stored in the minimum cost register, and then to step S110, where the identification information of the current prediction mode is stored in the optimal mode register.

If, on the other hand, it is determined in step S108 that the calculated R-D cost is larger than the cost stored in the minimum cost register, steps S109 and S110 are skipped.

Next, in step S111, it is determined whether all prediction modes have been processed. If not, the process proceeds to step S112, where the processing target is updated by selecting one of the unprocessed prediction modes in a predetermined order, and then returns to step S104.

If, on the other hand, it is determined in step S111 that all prediction modes have been processed, the process proceeds to step S113, where the prediction mode stored in the optimal mode register is output to the encoding unit 11 as the optimal prediction mode, and the process ends.
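The loop of steps S101 to S113 can be sketched in Python as follows. This is a minimal illustration, not the JM implementation: the function name `select_best_mode`, the three callables standing in for steps S104 to S106, and the toy numbers are all assumptions introduced here, and the cost is taken in the form J = D + λR described for the Lagrangian cost function.

```python
import math


def select_best_mode(modes, rate_fn, distortion_fn, lambda_fn):
    """Return (best_mode, min_cost) for the mode minimizing J = D_w + lambda * R.

    rate_fn, distortion_fn and lambda_fn are hypothetical stand-ins for
    steps S104, S105 and S106 respectively.
    """
    min_cost = math.inf   # S102: minimum cost register starts at a large value
    best_mode = None      # S102: optimal mode register starts with a meaningless value
    for mode in modes:    # S101, S111, S112: visit the modes in a predetermined order
        rate = rate_fn(mode)               # S104: code amount
        distortion = distortion_fn(mode)   # S105: weighted distortion
        lam = lambda_fn(mode)              # S106: undetermined multiplier
        cost = distortion + lam * rate     # S107: R-D cost
        if cost < min_cost:                # S108: compare with the stored minimum
            min_cost = cost                # S109: update the minimum cost register
            best_mode = mode               # S110: update the optimal mode register
    return best_mode, min_cost             # S113: output the optimal mode


# Toy numbers purely for illustration (not taken from the source).
rates = {"intra16x16": 120.0, "inter16x16": 60.0, "inter8x8": 90.0}
dists = {"intra16x16": 40.0, "inter16x16": 70.0, "inter8x8": 35.0}
best, cost = select_best_mode(list(rates), rates.get, dists.get, lambda m: 0.5)
```

Because the comparison in S108 is strict, a later mode with the same cost as the current minimum does not replace it, which matches skipping S109 and S110 when the new cost is not smaller.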

Next, the weighted distortion calculation executed in step S105 of the flowchart of FIG. 3 is described with reference to the flowchart of FIG. 4.

The weighted distortion is calculated by evaluating equation (12). For convenience of explanation, the following flowchart is described in terms of a one-dimensional index $i$ (an index that stands for the pair $k, l$).

On entering step S105 of the flowchart of FIG. 3, the encoding parameter selection unit 10 first normalizes the transform coefficients in step S201, as shown in the flowchart of FIG. 4.

Next, in step S202, the decoded value of each transform coefficient, denoted $\hat{C}_{m,q}^{s(i)}[k, l]$ ($s = Y, U, V$) in equation (12), is obtained.

Next, in step S203, the displacement estimated in step S103 of the flowchart of FIG. 3 (the $(d_x, d_y)$ described above) is read. This displacement is needed to execute the flowchart of FIG. 5 and is therefore read at this stage.

Next, in step S204, the sensitivity coefficient $W_{k,l}^{s}$ ($s = Y, U, V$) that appears in equation (12) is set. The specific setting method is described with the flowchart of FIG. 5. The sensitivity coefficient set here corresponds to the weight described for step S105 of the flowchart of FIG. 3.

Next, in step S205, the counter $i$, which designates the component index, is initialized to 0, and the register $S$ is also initialized to 0.

Next, in step S206, the encoding distortion for the $i$-th component of the transform coefficients is calculated according to the expression

$$\left| C_{m}^{s(i)}[k, l] - \hat{C}_{m,q}^{s(i)}[k, l] \right|^{2}, \quad s = Y, U, V$$

that appears in equation (12).

Next, in step S207, the weighted distortion for the $i$-th component of the transform coefficients is calculated by multiplying the encoding distortion calculated in step S206 by the sensitivity coefficient $W_{k,l}^{s}$ ($s = Y, U, V$) set in step S204.

Next, in step S208, the product calculated in step S207 is added to the register $S$.

Next, in step S209, it is determined whether all components of the transform coefficients have been processed. If not, the process proceeds to step S210, where the counter $i$ is incremented by 1 so that the remaining components are processed, and then returns to step S206.

If, on the other hand, it is determined in step S209 that all components of the transform coefficients have been processed, the weighted distortion calculation is complete. The process proceeds to step S211, where the weighted distortion stored in the register $S$ is output as the result of step S105 of the flowchart of FIG. 3, and the process ends.
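Steps S205 to S211 amount to accumulating a weighted sum of squared coefficient errors. The sketch below illustrates that accumulation for a single color component under the one-dimensional index $i$; the function name `weighted_distortion` and the sample values are assumptions made for illustration, and the normalization of S201 and the decoding of S202 are taken as already done.

```python
def weighted_distortion(coeffs, decoded, weights):
    """Accumulate sum_i W[i] * |C[i] - C_hat[i]|^2 over all coefficient components."""
    s = 0.0                                   # S205: register S initialized to 0
    for c, c_hat, w in zip(coeffs, decoded, weights):
        err = abs(c - c_hat) ** 2             # S206: squared error of the i-th component
        s += w * err                          # S207 and S208: weight it and add to S
    return s                                  # S211: output the weighted distortion


# Illustrative values only: original coefficients, decoded coefficients, weights.
d = weighted_distortion([10.0, -4.0, 2.0], [8.0, -4.0, 0.0], [1.0, 0.5, 0.25])
```

With all weights equal to 1 the result reduces to the ordinary sum of squared errors, so the weights are the only place where visual sensitivity enters.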

Next, the process of setting the sensitivity coefficient $W_{k,l}^{s}$, executed in step S204 of the flowchart of FIG. 4, is described with reference to the flowchart of FIG. 5.

The sensitivity coefficient $W_{k,l}^{s}$ is set by evaluating equation (20).

On entering step S204 of the flowchart of FIG. 4, the encoding parameter selection unit 10 first executes, in step S301, "Process 4", which takes the basis images as input and calculates and outputs the DFT coefficients of the modified basis images, as shown in the flowchart of FIG. 5.

Because the DFT coefficients of the modified basis images calculated here are reused, they are stored in a table (the modified basis image DFT coefficient storage unit 2102 shown in FIG. 7, described later).

Next, in step S302, "Process 2" is executed. It takes as input the DFT coefficients of the modified basis images obtained in "Process 4" and the DCT coefficients, and calculates and outputs the prediction error power for the analysis target block (the denominator of equation (20)).

Next, in step S303, "Process 3" is executed. It takes as input the DFT coefficients of the modified basis images obtained in "Process 4", the DCT coefficients, the motion vector of the analysis target block, and the contrast sensitivity function, and calculates and outputs the prediction error power for the analysis target block (the numerator of equation (20)).

Next, in step S304, "Process 1" is executed. It takes as input the prediction error power obtained in "Process 2" (the denominator of equation (20)) and the prediction error power obtained in "Process 3" (the numerator of equation (20)), and calculates and outputs the sensitivity coefficient (weight) applied to the distortion in the cost function for mode selection.

Next, the detailed flowcharts of "Process 1", "Process 2", "Process 3", and "Process 4" are described. For convenience of explanation, the flowchart of "Process 1" below is presented as running through to step S107 of the flowchart of FIG. 3 (the cost function calculation) on the basis of the sensitivity coefficient that has been set.

[1] "Process 4"
Input: the $(k, l)$-th basis image ($k, l = 0, \ldots, N-1$)
Output: the DFT coefficients of the modified basis image
Processing:
(1) Read the indices $i_x, i_y$ indicating the position information.
(2) $l = 0$
(3) $k = 0$
(4) Read the $(k, l)$-th basis image $f_{k,l}^{(i_x, i_y)}$.
(5) Generate the $M_x N \times M_x N$ image $\tilde{f}_{k,l}^{(i_x, i_y)}$ by zero-padding the basis image of (4) according to the position information of (1). The image obtained here is called the modified basis image. The specific generation method is given by equation (15).
(6) Apply the DFT to the modified basis image of (5) and calculate the distribution of frequency components within the modified basis image. The specific calculation is given by equation (18).
(7) $k = k + 1$
(8) If $k = N$, go to (9); otherwise go to (4).
(9) $l = l + 1$
(10) If $l = N$, end; otherwise go to (3).
Following this flowchart, in "Process 4" the encoding parameter selection unit 10 takes the basis image $f_{k,l}^{(i_x, i_y)}$ as input and calculates the DFT coefficients $F_{k,l}^{(i_x, i_y)}(u_x, u_y)$ of the modified basis image.

The DFT coefficients $F_{k,l}^{(i_x, i_y)}(u_x, u_y)$ of the modified basis images calculated in this way are stored in a table (the modified basis image DFT coefficient storage unit 2102 of FIG. 7, described later).
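The steps of "Process 4" can be sketched as follows. Equations (15) and (18) are not reproduced in this passage, so the sketch makes several assumptions that should be checked against them: the basis images are taken to be orthonormal 2-D DCT-II basis images, the padded image is taken to have $M_y N$ rows and $M_x N$ columns, the DFT is the unnormalized forward transform, and the table is filled for every block position rather than for a single $(i_x, i_y)$. All function names are hypothetical.

```python
import cmath
import math


def dct_basis(k, l, n):
    """N x N DCT-II basis image (assumed orthonormal); k is horizontal, l is vertical."""
    def a(i):
        return math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
    return [[a(k) * a(l)
             * math.cos((2 * x + 1) * k * math.pi / (2 * n))
             * math.cos((2 * y + 1) * l * math.pi / (2 * n))
             for x in range(n)] for y in range(n)]


def modified_basis(basis, ix, iy, mx, my):
    """Place the basis image in block (ix, iy) and fill the other blocks with zeros."""
    n = len(basis)
    img = [[0.0] * (mx * n) for _ in range(my * n)]
    for y in range(n):
        for x in range(n):
            img[iy * n + y][ix * n + x] = basis[y][x]
    return img


def dft2(img):
    """Naive unnormalized 2-D DFT; the result is indexed as [u_y][u_x]."""
    h, w = len(img), len(img[0])
    return [[sum(img[y][x] * cmath.exp(-2j * math.pi * (ux * x / w + uy * y / h))
                 for y in range(h) for x in range(w))
             for ux in range(w)] for uy in range(h)]


def build_dft_table(n, mx, my):
    """Precompute F[k, l, ix, iy] once; it does not depend on the image being encoded."""
    table = {}
    for iy in range(my):
        for ix in range(mx):
            for l in range(n):
                for k in range(n):
                    img = modified_basis(dct_basis(k, l, n), ix, iy, mx, my)
                    table[(k, l, ix, iy)] = dft2(img)
    return table


table = build_dft_table(n=2, mx=2, my=2)
```

The naive DFT keeps the sketch dependency-free; for realistic block sizes an FFT routine would be the natural replacement. Since the table depends only on $N$, $M_x$, and $M_y$, it can be built once and reused for every analysis target block.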

[2] "Process 2"
Input: the DFT coefficients of the modified basis images ($k, l = 0, \ldots, N-1$; $i_x = 0, \ldots, M_x-1$; $i_y = 0, \ldots, M_y-1$)
Input: the DCT coefficients
Output: the prediction error power $E[k, l]$ for the analysis target block (the denominator of equation (20)), where $E[k, l]$ is an array
Processing:
(1) $l = 0$
(2) $k = 0$
(3) $E[k, l] = 0$
(4) $i_y = 0$
(5) $i_x = 0$
(6) $S = 0$
(7) $u_y = 0$
(8) $u_x = 0$
(9) Read from the table described above the $(u_x, u_y)$ component of the DFT coefficients of the $(k, l)$-th modified basis image at position index $i_x, i_y$.
(10) Calculate the squared norm $|F_{k,l}^{(i_x, i_y)}(u_x, u_y)|^2$ of the complex number just read.
(11) $S = S + |F_{k,l}^{(i_x, i_y)}(u_x, u_y)|^2$
(12) $u_x = u_x + 1$
(13) If $u_x = N M_x$, go to the next step; otherwise go to (9).
(14) $u_y = u_y + 1$
(15) If $u_y = N M_y$, go to the next step; otherwise go to (8).
(16) Read the DCT coefficient $C^{(i_x, i_y)}[k, l]$ of the $(k, l)$-th basis.
(17) $E[k, l] = E[k, l] + C^{(i_x, i_y)}[k, l]^2 \, S$
(18) $i_x = i_x + 1$
(19) If $i_x = M_x$, go to the next step; otherwise go to (6).
(20) $i_y = i_y + 1$
(21) If $i_y = M_y$, go to the next step; otherwise go to (5).
(22) $k = k + 1$
(23) If $k = N$, go to the next step; otherwise go to (3).
(24) $l = l + 1$
(25) If $l = N$, end; otherwise go to (2).
Following this flowchart, in "Process 2" the encoding parameter selection unit 10 takes as input the DFT coefficients $F_{k,l}^{(i_x, i_y)}(u_x, u_y)$ of the modified basis images obtained in "Process 4" and the DCT coefficients $C^{(i_x, i_y)}[k, l]$, and calculates the prediction error power $E[k, l]$ for the analysis target block (the denominator of equation (20)).
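The nested loops of "Process 2" compute, for each basis index $(k, l)$, the sum over block positions of the squared DCT coefficient times the total spectral energy of the corresponding modified basis image. A compact sketch follows; the dictionary layouts (`dft_table[(k, l, ix, iy)]` as a 2-D list of complex values, `dct_coeffs[(ix, iy)][l][k]` as the coefficient) and the toy inputs are assumptions made for illustration.

```python
def prediction_error_power(dft_table, dct_coeffs, n, mx, my):
    """Return E[(k, l)] = sum over (ix, iy) of C[k, l]^2 * sum over (ux, uy) of |F|^2."""
    e = {}
    for l in range(n):
        for k in range(n):
            total = 0.0                                   # step (3)
            for iy in range(my):
                for ix in range(mx):
                    # steps (6)-(15): spectral energy S of this modified basis image
                    s = sum(abs(f) ** 2
                            for row in dft_table[(k, l, ix, iy)] for f in row)
                    c = dct_coeffs[(ix, iy)][l][k]        # step (16)
                    total += c * c * s                    # step (17)
            e[(k, l)] = total
    return e


# Toy inputs for illustration: N = 1, two blocks side by side (Mx = 2, My = 1).
toy_table = {(0, 0, 0, 0): [[1 + 0j, 1j]], (0, 0, 1, 0): [[2 + 0j, 0j]]}
toy_dct = {(0, 0): [[3.0]], (1, 0): [[1.0]]}
e = prediction_error_power(toy_table, toy_dct, n=1, mx=2, my=1)
```

Since the spectral energy $S$ depends only on the table entry, it could itself be cached per $(k, l, i_x, i_y)$ so that only the multiplication by the squared DCT coefficient is repeated per block.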

[3] "Process 3"
Input: the DFT coefficients of the modified basis images ($k, l = 0, \ldots, N-1$; $i_x = 0, \ldots, M_x-1$; $i_y = 0, \ldots, M_y-1$)
Input: the DCT coefficients
Input: the motion vector $(d_x, d_y)$ of the analysis target block
Input: the contrast sensitivity function $\hat{g}(\eta, \omega)$
Output: the prediction error power $\hat{E}(d_x, d_y)[k, l]$ for the analysis target block (the numerator of equation (20)), where $\hat{E}(d_x, d_y)[k, l]$ is an array
Processing:
(0) Read the motion vector $(d_x, d_y)$ from the displacement storage unit 208.
(1) $l = 0$
(2) $k = 0$
(3) $\hat{E}(d_x, d_y)[k, l] = 0$
(4) $i_y = 0$
(5) $i_x = 0$
(6) $\hat{S} = 0$
(7) $u_y = 0$
(8) $u_x = 0$
(9) Read from the table described above the $(u_x, u_y)$ component of the DFT coefficients of the $(k, l)$-th modified basis image at position index $i_x, i_y$.
(10) Calculate the squared norm $|F_{k,l}^{(i_x, i_y)}(u_x, u_y)|^2$ of the complex number just read.
(11) Calculate $\hat{g}(\eta_x, d_x)$ and $\hat{g}(\eta_y, d_y)$. The specific calculation is given, for example, by equation (21).
(12) $\hat{S} = \hat{S} + |F_{k,l}^{(i_x, i_y)}(u_x, u_y)|^2 \, \hat{g}(\eta_x, d_x)^2 \, \hat{g}(\eta_y, d_y)^2$
(13) $u_x = u_x + 1$
(14) If $u_x = N M_x$, go to the next step; otherwise go to (9).
(15) $u_y = u_y + 1$
(16) If $u_y = N M_y$, go to the next step; otherwise go to (8).
(17) Read the DCT coefficient $C^{(i_x, i_y)}[k, l]$ of the $(k, l)$-th basis.
(18) $\hat{E}(d_x, d_y)[k, l] = \hat{E}(d_x, d_y)[k, l] + C^{(i_x, i_y)}[k, l]^2 \, \hat{S}$
(19) $i_x = i_x + 1$
(20) If $i_x = M_x$, go to the next step; otherwise go to (6).
(21) $i_y = i_y + 1$
(22) If $i_y = M_y$, go to the next step; otherwise go to (5).
(23) $k = k + 1$
(24) If $k = N$, go to the next step; otherwise go to (3).
(25) $l = l + 1$
(26) If $l = N$, end; otherwise go to (2).
Following this flowchart, in "Process 3" the encoding parameter selection unit 10 takes as input the DFT coefficients $F_{k,l}^{(i_x, i_y)}(u_x, u_y)$ of the modified basis images obtained in "Process 4", the DCT coefficients $C^{(i_x, i_y)}[k, l]$, the motion vector $(d_x, d_y)$ of the analysis target block, and the contrast sensitivity function $\hat{g}(\eta, \omega)$, and calculates the prediction error power $\hat{E}(d_x, d_y)[k, l]$ for the analysis target block (the numerator of equation (20)).
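"Process 3" differs from "Process 2" only in that each spectral term is multiplied by the squared horizontal and vertical sensitivity values before accumulation. The sketch below shows that structure. Equation (21) is not reproduced in this passage, so the sensitivity function is passed in as a callable `g(u, d)`, and the example `toy_g` is a made-up placeholder, not the contrast sensitivity function of the source; using the DFT index directly in place of $\eta$ is likewise an assumption. The data layouts are hypothetical.

```python
def weighted_error_power(dft_table, dct_coeffs, n, mx, my, d, g):
    """Return E_hat[(k, l)]: spectral energy weighted by g(ux, dx)^2 * g(uy, dy)^2."""
    dx, dy = d                                            # step (0): motion vector
    e_hat = {}
    for l in range(n):
        for k in range(n):
            total = 0.0                                   # step (3)
            for iy in range(my):
                for ix in range(mx):
                    s_hat = 0.0                           # step (6)
                    for uy, row in enumerate(dft_table[(k, l, ix, iy)]):
                        for ux, f in enumerate(row):
                            # steps (10)-(12): weight each term by the sensitivities
                            s_hat += abs(f) ** 2 * g(ux, dx) ** 2 * g(uy, dy) ** 2
                    c = dct_coeffs[(ix, iy)][l][k]        # step (17)
                    total += c * c * s_hat                # step (18)
            e_hat[(k, l)] = total
    return e_hat


def toy_g(u, d):
    """Placeholder sensitivity (an assumption): falls off as frequency times speed grows."""
    return 1.0 / (1.0 + u * abs(d))


# Toy inputs for illustration: N = 1, two blocks side by side (Mx = 2, My = 1).
toy_table = {(0, 0, 0, 0): [[1 + 0j, 1j]], (0, 0, 1, 0): [[2 + 0j, 0j]]}
toy_dct = {(0, 0): [[3.0]], (1, 0): [[1.0]]}
e_hat = weighted_error_power(toy_table, toy_dct, 1, 2, 1, (1, 0), toy_g)
e_flat = weighted_error_power(toy_table, toy_dct, 1, 2, 1, (1, 0), lambda u, d: 1.0)
```

When the sensitivity is identically 1, the result coincides with the unweighted power of "Process 2", which gives a convenient consistency check between the numerator and the denominator of the weight.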

[4] "Process 1"
Input: $E[k, l]$ obtained in "Process 2" ($k, l = 0, \ldots, N-1$)
Input: $\hat{E}(d_x, d_y)[k, l]$ obtained in "Process 3" ($k, l = 0, \ldots, N-1$)
Output: the weight (sensitivity coefficient) applied to the distortion in the cost function for mode selection
Processing:
(1) $l = 0$, $D' = 0$
(2) $k = 0$
(3) $D = 0$
(4) $i_y = 0$
(5) $i_x = 0$
(6) Read the squared value $e[k, l]^{(i_x, i_y)}$ of the coefficient of the $(k, l)$-th DCT component in the reference block at position index $i_x, i_y$.
(7) Read $E[k, l]$.
(8) Read $\hat{E}(d_x, d_y)[k, l]$.
(9) Calculate the weight coefficient: $W(d_x, d_y)[k, l] = \hat{E}(d_x, d_y)[k, l] / E[k, l]$
(10) $D = D + e[k, l]^{(i_x, i_y)}$
(11) $i_x = i_x + 1$
(12) If $i_x = M_x$, go to the next step; otherwise go to (6).
(13) $i_y = i_y + 1$
(14) If $i_y = M_y$, go to the next step; otherwise go to (5).
(15) $D' = D' + D \cdot W[k, l]$
(16) $k = k + 1$
(17) If $k = N$, go to the next step; otherwise go to (3).
(18) $l = l + 1$
(19) If $l = N$, end; otherwise go to (2).
(20) Calculate the estimated code amount $\alpha^{(i_x, i_y)}$ in the reference block at position index $i_x, i_y$, and calculate the estimated total code amount in the analysis target block, $A = \sum_{i_x, i_y} \alpha^{(i_x, i_y)}$.
(21) Calculate the cost function $J$: $J = D' + \lambda A$
Following this flowchart, in "Process 1" the encoding parameter selection unit 10 takes as input the prediction error power $E[k, l]$ for the analysis target block obtained in "Process 2" (the denominator of equation (20)) and the prediction error power $\hat{E}(d_x, d_y)[k, l]$ for the analysis target block obtained in "Process 3" with spatio-temporal visual sensitivity taken into account (the numerator of equation (20)), and calculates the weight $W(d_x, d_y)[k, l]$ applied to the distortion in the cost function for mode selection (the sensitivity coefficient $W_{k,l}^{s}(d_x, d_y)$ described above) according to equation (20):

$$W(d_x, d_y)[k, l] = \hat{E}(d_x, d_y)[k, l] / E[k, l]$$
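The steps of "Process 1" combine the two error powers into a weight per basis index and then form the cost $J = D' + \lambda A$. A minimal sketch follows; the function name `rd_cost`, the dictionary layouts, and the numbers are assumptions for illustration, and the sketch assumes every $E[k, l]$ is nonzero, since the source does not say how a zero denominator is handled.

```python
def rd_cost(e, e_hat, sq_coeffs, rates, lam):
    """Return J = D' + lambda * A with D' = sum over (k, l) of W[k, l] * D[k, l].

    e, e_hat  : dicts keyed by (k, l), from "Process 2" and "Process 3"
    sq_coeffs : dict keyed by block position (ix, iy) -> dict keyed by (k, l)
                holding the squared coefficient value e[k, l]^(ix, iy)
    rates     : dict keyed by block position -> estimated code amount alpha
    """
    d_weighted = 0.0                                    # step (1): D' = 0
    for key in e:
        w = e_hat[key] / e[key]                         # step (9): equation (20)
        d = sum(block[key] for block in sq_coeffs.values())   # steps (3)-(14)
        d_weighted += d * w                             # step (15)
    a = sum(rates.values())                             # step (20): total code amount A
    return d_weighted + lam * a                         # step (21): J = D' + lambda * A


# Illustrative numbers only (two basis indices, two block positions).
e = {(0, 0): 4.0, (1, 0): 2.0}
e_hat = {(0, 0): 2.0, (1, 0): 2.0}
sq_coeffs = {(0, 0): {(0, 0): 8.0, (1, 0): 1.0},
             (1, 0): {(0, 0): 4.0, (1, 0): 3.0}}
rates = {(0, 0): 10.0, (1, 0): 6.0}
j = rd_cost(e, e_hat, sq_coeffs, rates, lam=0.5)
```

In this toy case the first component has weight 0.5 and the second weight 1.0, so errors in the first component count half as much toward the cost as the same squared error in the second.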

In this way, by executing the flowcharts of FIGS. 3 to 5, the encoding parameter selection unit 10 can use, when selecting the prediction mode for encoding a moving image, a cost function that reflects subjective image quality, including block distortion. As a result, the encoding unit 11 can achieve highly efficient encoding and reduce the code amount.

FIGS. 6 and 7 show the configuration of the encoding parameter selection unit 10 arranged to execute the flowcharts of FIGS. 3 to 5.

Next, the configuration of the encoding parameter selection unit 10 is described with reference to FIGS. 6 and 7.

To execute the flowchart of FIG. 3, the encoding parameter selection unit 10 includes, as shown in FIG. 6:
(1) a shift amount storage unit 101 that stores the estimated shift amount of the block to be encoded;
(2) a prediction vector calculation unit 102 that calculates the prediction vector of the block to be encoded;
(3) a prediction vector storage unit 103 that stores the prediction vector calculated by the prediction vector calculation unit 102;
(4) an initial mode setting unit 104 that sets the initial value of the prediction mode;
(5) a mode setting unit 105 that sets the prediction mode to be processed;
(6) a mode storage unit 106 that stores the prediction mode set by the initial mode setting unit 104 or the mode setting unit 105;
(7) a code amount calculation unit 107 that calculates the code amount obtained when encoding in the prediction mode stored in the mode storage unit 106;
(8) a code amount storage unit 108 that stores the code amount calculated by the code amount calculation unit 107;
(9) a weighted distortion amount calculation unit 109 that calculates the weighted distortion amount obtained when encoding in the prediction mode stored in the mode storage unit 106;
(10) a weighted distortion amount storage unit 110 that stores the weighted distortion amount calculated by the weighted distortion amount calculation unit 109;
(11) an undetermined multiplier calculation unit 111 that calculates the undetermined multiplier used when encoding in the prediction mode stored in the mode storage unit 106;
(12) an undetermined multiplier storage unit 112 that stores the undetermined multiplier calculated by the undetermined multiplier calculation unit 111;
(13) a cost calculation unit 113 that calculates, based on the code amount, the weighted distortion amount, and the undetermined multiplier, the encoding cost obtained when encoding in the prediction mode stored in the mode storage unit 106;
(14) a cost storage unit 114 that stores the encoding cost calculated by the cost calculation unit 113;
(15) a minimum cost storage unit 115 that stores the minimum cost obtained so far;
(16) a minimum cost determination unit 116 that, referring to the minimum cost stored in the minimum cost storage unit 115, determines whether the encoding cost calculated by the cost calculation unit 113 is the minimum cost obtained so far;
(17) an optimal mode storage unit 117 that stores the optimal prediction mode;
(18) an optimal mode update unit 118 that, when the minimum cost determination unit 116 determines that the cost is the minimum, updates the prediction mode stored in the optimal mode storage unit 117 according to the prediction mode stored in the mode storage unit 106;
(19) a final mode determination unit 119 that determines whether all prediction modes have been processed, doing so immediately when the minimum cost determination unit 116 determines that the cost is not the minimum, and upon an instruction from the optimal mode update unit 118 when the minimum cost determination unit 116 determines that the cost is the minimum, and that, when the current prediction mode is not the final one, instructs the mode setting unit 105 to set the next prediction mode; and
(20) an optimal mode output unit 120 that, when the final mode determination unit 119 determines that the current prediction mode is the final one, outputs the prediction mode stored in the optimal mode storage unit 117 as the optimal prediction mode.

符号化パラメータ選択部１０は、この装置構成に従って、図３のフローチャートを実行するのである。 The encoding parameter selection unit 10 executes the flowchart of FIG. 3 according to this apparatus configuration.

図７に、図４および図５のフローチャートを実行する重み付き歪み量算出部１０９の装置構成を図示する。 FIG. 7 illustrates a device configuration of the weighted distortion amount calculation unit 109 that executes the flowcharts of FIGS. 4 and 5.

To execute the flowcharts of FIGS. 4 and 5, the weighted distortion amount calculation unit 109 includes, as shown in FIG. 7:
(1) a transform coefficient normalization unit 201 that normalizes the transform coefficients;
(2) a normalized transform coefficient storage unit 202 that stores the normalized transform coefficients produced by the transform coefficient normalization unit 201;
(3) a transform coefficient decoding unit 203 that decodes the transform coefficients;
(4) a decoded transform coefficient storage unit 204 that stores the decoded transform coefficients produced by the transform coefficient decoding unit 203;
(5) a distortion amount calculation unit 205 that calculates the encoding distortion amount of the i-th component of the transform coefficients (i is updated as processing proceeds), based on the transform coefficients stored in the normalized transform coefficient storage unit 202 and the transform coefficients stored in the decoded transform coefficient storage unit 204;
(6) a distortion amount storage unit 206 that stores the distortion amount calculated by the distortion amount calculation unit 205;
(7) a transform coefficient index storage unit 207 that stores the index i of the i-th transform coefficient component currently being processed;
(8) a shift amount storage unit 208 that stores the estimated shift amount of the analysis target block;
(9) a base image storage unit 209 that stores the base images;
(10) a sensitivity coefficient calculation unit 210 that calculates the sensitivity coefficient W_{k,l} based on the shift amount stored in the shift amount storage unit 208, the base images stored in the base image storage unit 209, the transform coefficients stored in the normalized transform coefficient storage unit 202, and the contrast sensitivity function;
(11) a sensitivity coefficient storage unit 211 that stores the sensitivity coefficient W_{k,l} calculated by the sensitivity coefficient calculation unit 210;
(12) a sensitivity coefficient multiplication unit 212 that calculates a weighted distortion amount by multiplying the distortion amount stored in the distortion amount storage unit 206 by the sensitivity coefficient W_{k,l} stored in the sensitivity coefficient storage unit 211;
(13) a distortion amount storage unit 213 that stores the distortion amount calculated by the sensitivity coefficient multiplication unit 212; and
(14) a distortion amount sum calculation unit 214 that calculates the weighted distortion amount as the sum of the distortion amounts stored one after another in the distortion amount storage unit 213.
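The data flow through units 205, 212, and 214 (per-coefficient distortion, multiplication by the sensitivity coefficient, and summation) can be sketched as follows. This is only an illustrative sketch: the function name, the flattening of the index pair (k, l) into a single index, and the use of squared error as the per-coefficient distortion are assumptions, since this passage does not specify the distortion measure.

```python
from typing import Sequence


def weighted_distortion(normalized: Sequence[float],
                        decoded: Sequence[float],
                        sensitivity: Sequence[float]) -> float:
    """Sum of per-coefficient distortions, each scaled by a sensitivity coefficient.

    normalized  -- normalized transform coefficients (role of storage unit 202)
    decoded     -- decoded transform coefficients (role of storage unit 204)
    sensitivity -- sensitivity coefficients W, flattened to one index (assumption)
    """
    if not (len(normalized) == len(decoded) == len(sensitivity)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for c, c_hat, w in zip(normalized, decoded, sensitivity):
        d_i = (c - c_hat) ** 2   # assumed distortion measure: squared error (unit 205)
        total += w * d_i         # weighting (unit 212) and accumulation (unit 214)
    return total


# Hypothetical values, chosen only to exercise the function.
coeffs = [4.0, -2.0, 1.0, 0.5]
recon = [4.0, -1.0, 0.0, 0.0]
weights = [1.0, 2.0, 0.5, 0.25]
print(weighted_distortion(coeffs, recon, weights))
```

Components with a large sensitivity coefficient contribute more to the total, so an encoding choice that distorts them is penalized more heavily in the cost.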

The sensitivity coefficient calculation unit 210 in turn includes:
(1) a process 4 execution unit 2101 that executes "process 4" described above;
(2) a modified base image DFT coefficient storage unit 2102 that stores the DFT coefficients of the modified base images calculated by the process 4 execution unit 2101;
(3) a process 2 execution unit 2103 that executes "process 2" described above, taking as inputs the modified base image DFT coefficients stored in the modified base image DFT coefficient storage unit 2102 and the transform coefficients stored in the normalized transform coefficient storage unit 202;
(4) a process 3 execution unit 2104 that executes "process 3" described above, taking as inputs the modified base image DFT coefficients stored in the modified base image DFT coefficient storage unit 2102, the transform coefficients stored in the normalized transform coefficient storage unit 202, the shift amount stored in the shift amount storage unit 208, and the contrast function; and
(5) a process 1 execution unit 2105 that executes the part of "process 1" described above that concerns calculation of the sensitivity coefficient, taking as inputs the prediction error power calculated by the process 2 execution unit 2103 and the prediction error power calculated by the process 3 execution unit 2104.

重み付き歪み量算出部１０９は、この装置構成に従って、図４および図５のフローチャートを実行するのである。 The weighted distortion amount calculation unit 109 executes the flowcharts of FIGS. 4 and 5 according to this apparatus configuration.

以上に説明した実施の形態では、符号化パラメータ選択部１０が符号化パラメータとして最適な予測モードを選択することで説明したが、本発明は、予測モード以外の符号化パラメータを選択する場合にもそのまま適用できるものである。 In the embodiment described above, the encoding parameter selection unit 10 has been described as selecting the optimum prediction mode as the encoding parameter. However, the present invention is also applicable to the case of selecting an encoding parameter other than the prediction mode. It can be applied as it is.

For example, when the encoding parameter selection unit 10 selects an optimal quantization parameter as the encoding parameter, it executes the flowchart of FIG. 8 instead of the flowchart of FIG. 3.

That is, when determining the optimal quantization parameter to use for encoding the macroblock to be encoded, the encoding parameter selection unit 10 first sets, in step S401, the initial value of the quantization parameter (the quantization parameter serving as the initial value), as shown in the flowchart of FIG. 8.

続いて、ステップＳ４０２で、最小コストを格納するレジスタ（最小コストレジスタ）に対して大きな値を示す初期コストを格納するとともに、最適な量子化パラメータを格納するレジスタ（以下、最適量子化パラメータレジスタと称することがある）に対して意味のない値を格納することで、これらのレジスタを初期化する。 Subsequently, in step S402, an initial cost indicating a large value is stored with respect to a register (minimum cost register) for storing a minimum cost, and a register for storing an optimal quantization parameter (hereinafter referred to as an optimal quantization parameter register). These registers are initialized by storing meaningless values.

Subsequently, in step S403, the shift amount (the aforementioned (d_x, d_y)) is estimated, and the prediction error of each candidate vector is stored in a table. The method for estimating the shift amount is assumed to be given externally. For example, the motion vector calculated by the H.264 reference software JM can be used as the estimated shift amount used below.

Subsequently, in step S404, the code amount obtained when encoding with the currently set quantization parameter is calculated, taking as inputs that quantization parameter, the prediction mode used for encoding, the prediction vector for that prediction mode, the encoding target frame signal, and the reference frame signal.

Subsequently, in step S405, a weight that takes spatiotemporal visual sensitivity into account is first determined based on the shift amount estimated in step S403. Then, taking as inputs the currently set quantization parameter, the prediction mode used for encoding, the prediction vector for that prediction mode, the encoding target frame signal, and the reference frame signal, the weighted distortion amount obtained when encoding with that quantization parameter is calculated from those input signals and the determined weight. The specific calculation method is as described with the flowcharts of FIGS. 4 and 5.

続いて、ステップＳ４０６で、設定されている量子化パラメータ、符号化に用いる予測モードを入力として、その量子化パラメータを用いて符号化する場合の未定乗数を算出する。 Subsequently, in step S406, the set quantization parameter and the prediction mode used for encoding are input, and an undetermined multiplier when encoding using the quantization parameter is calculated.

Subsequently, in step S407, the R-D cost given by Expression (11) is calculated based on the code amount calculated in step S404, the weighted distortion amount calculated in step S405, and the undetermined multiplier calculated in step S406.

Subsequently, in step S408, the calculated R-D cost is compared with the cost stored in the minimum cost register. When the calculated R-D cost is determined to be smaller than the stored cost, the process proceeds to step S409, where the calculated R-D cost is stored in the minimum cost register, and then to step S410, where the currently set quantization parameter is stored in the optimal quantization parameter register. On the other hand, when the calculated R-D cost is determined in step S408 to be larger than the stored cost, steps S409 and S410 are skipped.

続いて、ステップＳ４１１で、全ての量子化パラメータについて処理したのか否かを判断して、全ての量子化パラメータについて処理していないことを判断するときには、ステップＳ４１２に進んで、予め定められる順番に従って未処理の量子化パラメータの中から量子化パラメータを１つ選択することで処理対象の量子化パラメータを更新してから、ステップＳ４０４の処理に戻る。 Subsequently, in step S411, it is determined whether or not all the quantization parameters have been processed, and when it is determined that all the quantization parameters have not been processed, the process proceeds to step S412 in accordance with a predetermined order. After the quantization parameter to be processed is updated by selecting one quantization parameter from the unprocessed quantization parameters, the process returns to step S404.

一方、ステップＳ４１１で、全ての量子化パラメータについて処理したことを判断するときには、ステップＳ４１３に進んで、最適量子化パラメータレジスタに格納されている量子化パラメータを最適な量子化パラメータとして出力して、処理を終了する。 On the other hand, when it is determined in step S411 that all the quantization parameters have been processed, the process proceeds to step S413, and the quantization parameter stored in the optimum quantization parameter register is output as the optimum quantization parameter. The process ends.
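The loop of steps S401 to S413 can be sketched as below. This is a minimal sketch rather than the patented implementation: Expression (11) is not reproduced in this passage, so the cost is assumed to take the common form of weighted distortion plus multiplier times code amount, and the three callbacks as well as all names are hypothetical stand-ins for steps S404 to S406. The shift amount estimation of step S403 is assumed to happen inside the distortion callback.

```python
import math
from typing import Callable, Iterable, Optional, Tuple


def select_quantization_parameter(
    candidates: Iterable[int],
    code_amount: Callable[[int], float],
    weighted_distortion: Callable[[int], float],
    multiplier: Callable[[int], float],
) -> Tuple[Optional[int], float]:
    """Return the candidate with the smallest R-D cost and that cost."""
    min_cost = math.inf   # S402: minimum cost register starts at a large value
    best_qp = None        # S402: optimal QP register starts with a meaningless value

    for qp in candidates:               # S401 / S412: candidates in a fixed order
        rate = code_amount(qp)          # S404
        dist = weighted_distortion(qp)  # S405
        lam = multiplier(qp)            # S406
        cost = dist + lam * rate        # S407: assumed form of Expression (11)
        if cost < min_cost:             # S408
            min_cost = cost             # S409
            best_qp = qp                # S410
    return best_qp, min_cost            # S413


# Toy models, purely hypothetical, to make the sketch executable.
result = select_quantization_parameter(
    candidates=range(20, 41, 4),
    code_amount=lambda qp: 400.0 - 8.0 * qp,
    weighted_distortion=lambda qp: float((qp - 20) ** 2),
    multiplier=lambda qp: 2.0,
)
print(result)
```

The same loop structure applies to the prediction mode selection of FIG. 3 if the candidates are prediction modes instead of quantization parameters.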

FIG. 9 illustrates the apparatus configuration of the encoding parameter selection unit 10 configured to execute the flowchart of FIG. 8. Components identical to those shown in FIG. 7 are denoted by the same reference symbols.

この図９に示すように、符号化パラメータ選択部１０は、図８のフローチャートを実行するために、前述の初期モード設定部１０４に代えて初期量子化パラメータ設定部３０４を備え、前述のモード設定部１０５に代えて量子化パラメータ設定部３０５を備え、前述のモード記憶部１０６に代えて量子化パラメータ記憶部３０６を備え、前述の最適モード記憶部１１７に代えて最適量子化パラメータ記憶部３１７を備え、前述の最適モード更新部１１８に代えて最適量子化パラメータ更新部３１８を備え、前述の最終モード判定部１１９に代えて最終量子化パラメータ判定部３１９を備え、前述の最適モード出力部１２０に代えて最適量子化パラメータ出力部３２０を備えることになる。 As shown in FIG. 9, the encoding parameter selection unit 10 includes an initial quantization parameter setting unit 304 in place of the above-described initial mode setting unit 104 in order to execute the flowchart of FIG. A quantization parameter setting unit 305 instead of the unit 105, a quantization parameter storage unit 306 instead of the mode storage unit 106, and an optimal quantization parameter storage unit 317 instead of the optimal mode storage unit 117. An optimal quantization parameter update unit 318 instead of the optimal mode update unit 118 described above, a final quantization parameter determination unit 319 instead of the final mode determination unit 119, and the optimal mode output unit 120 described above. Instead, an optimum quantization parameter output unit 320 is provided.

最後に、本発明の有効性を検証するために行った実験について説明する。 Finally, an experiment conducted to verify the effectiveness of the present invention will be described.

この実験は、本発明を参照ソフトウェアＪＳＶＭ（version 8.0.[3])に実装して、無改造のＪＳＶＭと比較することで行った。 This experiment was performed by implementing the present invention in the reference software JSVM (version 8.0. [3]) and comparing it with an unmodified JSVM.

The following table shows the experimental conditions. The sequences to be encoded have a size of 352 × 288 [pixels] and a frame rate of 30 [fps]. Encoding was performed on the first 120 frames. The parameters in Expression (30) were set to $r_1 = 8$, $r_2 = 6$, and $A = 5$. The parameter giving the size of the reference block was set to $N = 4$, and the parameters giving the size of the analysis target block were set to $M_x = M_y = 4$.

The following table shows the comparison of code amounts. For every sequence and QP value, it can be confirmed that the present invention reduces the code amount. It has also been confirmed that no subjective difference in image quality is perceived between the decoded images of the two methods. Furthermore, to evaluate the code amount reduction rate of the present invention relative to JSVM, let $R_{\mathrm{JSVM}}$ and $R_{\mathrm{Ours}}$ denote the code amounts of JSVM and of the present invention, respectively. The code amount reduction rate given by

$$\frac{R_{\mathrm{Ours}} - R_{\mathrm{JSVM}}}{R_{\mathrm{JSVM}}} \times 100\,\%$$

is shown in parentheses in the third column of the table. As a result, it was confirmed that the present invention achieves an average code amount reduction of 5.3% relative to JSVM.

The following table shows the proportions of the encoding modes. The SKIP column shows the proportion of the skip mode, the INTER column shows the proportion of inter prediction excluding the skip mode, and the INTRA column shows the proportion in which any intra prediction mode was selected. As the table shows, the bit rate reduction of the present invention is achieved by selecting the skip mode, which generates little code, more often.

Furthermore, evaluation results for block distortion are shown. Let $S(x, y)$ be the decoded pixel value at position $(x, y)$ of each frame ($X \times Y$ [pixels]), and define the horizontal and vertical inter-pixel difference values as

$$\delta_h(x, y) = S(x+1, y) - S(x, y)$$

$$\delta_v(x, y) = S(x, y+1) - S(x, y)$$

The discontinuity between adjacent blocks is then evaluated using the average value $\Delta$ of the inter-pixel difference values at block boundaries, given by

$$\Delta = \frac{\sum_{i_x=1}^{X/N-1} \sum_{i_y=0}^{Y-1} \left|\delta_h(N i_x, i_y)\right|}{2Y\,(X/N-1)} + \frac{\sum_{i_y=1}^{Y/N-1} \sum_{i_x=0}^{X-1} \left|\delta_v(i_x, N i_y)\right|}{2X\,(Y/N-1)}$$
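The measure Δ defined above can be computed for a single decoded frame as in the following sketch. The function name and the `frame[y][x]` storage layout are assumptions made for illustration, and the frame dimensions are assumed to be multiples of N. The code follows the expression exactly as written, sampling the horizontal difference at x = N·i_x and the vertical difference at y = N·i_y.

```python
from typing import Sequence


def block_boundary_discontinuity(frame: Sequence[Sequence[float]], n: int) -> float:
    """Average inter-pixel difference at block boundaries for one frame.

    frame -- decoded pixel values, frame[y][x] = S(x, y) (layout is an assumption)
    n     -- block size N; width and height are assumed to be multiples of n
    """
    height = len(frame)       # Y
    width = len(frame[0])     # X
    blocks_x = width // n     # X / N
    blocks_y = height // n    # Y / N
    if n < 2 or blocks_x < 2 or blocks_y < 2:
        raise ValueError("need n >= 2 and at least two blocks in each direction")

    horizontal = 0.0
    for ix in range(1, blocks_x):            # i_x = 1 .. X/N - 1
        x = n * ix
        for y in range(height):              # i_y = 0 .. Y - 1
            horizontal += abs(frame[y][x + 1] - frame[y][x])   # |delta_h(N i_x, i_y)|

    vertical = 0.0
    for iy in range(1, blocks_y):            # i_y = 1 .. Y/N - 1
        y = n * iy
        for x in range(width):               # i_x = 0 .. X - 1
            vertical += abs(frame[y + 1][x] - frame[y][x])     # |delta_v(i_x, N i_y)|

    return (horizontal / (2 * height * (blocks_x - 1))
            + vertical / (2 * width * (blocks_y - 1)))


# Hypothetical 8x8 test frames with N = 4.
flat = [[10.0] * 8 for _ in range(8)]
ramp = [[float(x + y) for x in range(8)] for y in range(8)]
print(block_boundary_discontinuity(flat, 4), block_boundary_discontinuity(ramp, 4))
```

Averaging this value over all frames of a sequence gives the per-sequence figure that the evaluation compares between the two encoders.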

The following table shows the average of $\Delta$ over all frames for JSVM and for the present invention. Letting the values in the second and third columns of each row be $\Delta_{\mathrm{JSVM}}$ and $\Delta_{\mathrm{Ours}}$, respectively, the block distortion reduction rate given by

$$\frac{\Delta_{\mathrm{Ours}} - \Delta_{\mathrm{JSVM}}}{\Delta_{\mathrm{JSVM}}} \times 100\,\%$$

is shown in parentheses in the third column of the table. As a result, it was confirmed that the present invention achieves an average block distortion reduction of 1.2% relative to JSVM.
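Both reduction rates in this evaluation use the same relative-change expression, which can be written as a small helper. The function name and the numbers below are hypothetical and are not taken from the experimental tables. Note that, as the expression is written, a reduction relative to JSVM yields a negative value.

```python
def relative_change_percent(ours: float, baseline: float) -> float:
    """Relative change of `ours` with respect to `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (ours - baseline) / baseline


# Hypothetical code amounts: the proposed method uses 950 units where the baseline uses 1000.
change = relative_change_percent(950.0, 1000.0)
print(change)
```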

本実験によって、本発明によれば、ＪＳＶＭに対して符号量を低減し、さらに、ブロック歪みも低減できることが確認できた。本発明では、隣接ブロック間の依存関係を考慮した周波数分析を行っている。このため、隣接ブロック間の不連続性は、水平・垂直方向のエッジとして、コントラスト感度関数が大きな重みを与える周波数成分として分析される。つまり、大きな感度係数の値が設定される。この感度係数により重み付けされた歪み尺度がコスト関数に用いられるため、ブロック間の不連続性をもたらす成分の歪みは回避され、結果としてブロック歪みが低減したものと考察される。 From this experiment, it was confirmed that according to the present invention, the code amount can be reduced with respect to JSVM, and further, block distortion can be reduced. In the present invention, frequency analysis is performed in consideration of the dependency between adjacent blocks. For this reason, the discontinuity between adjacent blocks is analyzed as a frequency component to which the contrast sensitivity function gives a large weight as horizontal and vertical edges. That is, a large sensitivity coefficient value is set. Since a distortion measure weighted by this sensitivity coefficient is used in the cost function, distortion of components that cause discontinuity between blocks is avoided, and it is considered that block distortion is reduced as a result.

The present invention is applicable to encoding a moving image by performing information compression, through transform coding and quantization, on a prediction error signal obtained by intra-frame prediction or inter-frame prediction. Applying the present invention makes it possible to realize, as the cost function used for selecting encoding parameters, one that reflects subjective image quality including block distortion, thereby achieving highly efficient encoding and a reduction in code amount.

FIG. 1 is an explanatory diagram of a modified base image.
FIG. 2 is an apparatus configuration diagram of a moving image encoding apparatus to which the present invention is applied.
FIG. 3 is a flowchart executed by the encoding parameter selection unit.
FIG. 4 is a flowchart executed by the encoding parameter selection unit.
FIG. 5 is a flowchart executed by the encoding parameter selection unit.
FIG. 6 is an apparatus configuration diagram of the encoding parameter selection unit.
FIG. 7 is an apparatus configuration diagram of the encoding parameter selection unit.
FIG. 8 is a flowchart executed by the encoding parameter selection unit.
FIG. 9 is an apparatus configuration diagram of the encoding parameter selection unit.
FIG. 10 is an explanatory diagram of the discontinuity between blocks.

Explanation of symbols

201 Transform coefficient normalization unit
202 Normalized transform coefficient storage unit
203 Transform coefficient decoding unit
204 Decoded transform coefficient storage unit
205 Distortion amount calculation unit
206 Distortion amount storage unit
207 Transform coefficient index storage unit
208 Shift amount storage unit
209 Base image storage unit
210 Sensitivity coefficient calculation unit
211 Sensitivity coefficient storage unit
212 Sensitivity coefficient multiplication unit
213 Distortion amount storage unit
214 Distortion amount sum calculation unit
2101 Process 4 execution unit
2102 Modified base image DFT coefficient storage unit
2103 Process 2 execution unit
2104 Process 3 execution unit
2105 Process 1 execution unit

Claims

A moving image encoding method for encoding a moving image by performing information compression by transform coding and quantization on an image signal or on a prediction error signal obtained by intra-frame prediction and inter-frame prediction, the method comprising:
reading spatial frequency components of modified base images from storage means that stores the spatial frequency components calculated for as many modified base images as there are blocks, each modified base image being defined in association with an analysis target block composed of a plurality of blocks to which a transform matrix is applied, and being formed by placing a base image of the transform matrix in one block and filling the other blocks with zero values;
estimating a shift amount indicating temporal movement of the image signal of the analysis target block;
calculating a visual sensitivity value to be assigned to each read spatial frequency component based on the spatial frequency index of that component and the shift amount, and weighting the spatial frequency component accordingly;
calculating an importance for each base component of the prediction error signal based on the weighted spatial frequency components, the unweighted spatial frequency components, and the transform coefficients of the blocks constituting the analysis target block; and
determining an encoding parameter by evaluating an encoding cost using an encoding distortion amount weighted by the importance,
wherein the calculating of the importance obtains the product of the sum of squares of the transform coefficients and the sum of squared norms of the weighted spatial frequency components, and the product of the sum of squares of the transform coefficients and the sum of squared norms of the unweighted spatial frequency components, and calculates the importance according to the quotient of the two products.

The moving image encoding method according to claim 1,
wherein the weighting calculates the visual sensitivity value to be assigned to the read spatial frequency component by calculating a horizontal visual sensitivity value based on the horizontal spatial frequency index and the horizontal component of the shift amount, and calculating a vertical visual sensitivity value based on the vertical spatial frequency index and the vertical component of the shift amount.

A moving image encoding apparatus that encodes a moving image by performing information compression by transform coding and quantization on an image signal or on a prediction error signal obtained by intra-frame prediction and inter-frame prediction, the apparatus comprising:
storage means that stores spatial frequency components calculated for as many modified base images as there are blocks, each modified base image being defined in association with an analysis target block composed of a plurality of blocks to which a transform matrix is applied, and being formed by placing a base image of the transform matrix in one block and filling the other blocks with zero values;
estimating means that estimates a shift amount indicating temporal movement of the image signal of the analysis target block;
weighting means that reads the spatial frequency components of the modified base images from the storage means, calculates a visual sensitivity value to be assigned to each spatial frequency component based on the spatial frequency index of that component and the shift amount, and weights the spatial frequency component accordingly;
calculating means that calculates an importance for each base component of the prediction error signal based on the weighted spatial frequency components, the unweighted spatial frequency components, and the transform coefficients of the blocks constituting the analysis target block; and
determining means that determines an encoding parameter by evaluating an encoding cost using an encoding distortion amount weighted by the importance,
wherein the calculating means obtains the product of the sum of squares of the transform coefficients and the sum of squared norms of the weighted spatial frequency components, and the product of the sum of squares of the transform coefficients and the sum of squared norms of the unweighted spatial frequency components, and calculates the importance according to the quotient of the two products.

The moving image encoding apparatus according to claim 3,
wherein the weighting means calculates the visual sensitivity value to be assigned to the read spatial frequency component by calculating a horizontal visual sensitivity value based on the horizontal spatial frequency index and the horizontal component of the shift amount, and calculating a vertical visual sensitivity value based on the vertical spatial frequency index and the vertical component of the shift amount.

A moving image encoding program for causing a computer to execute the moving image encoding method according to claim 1 or 2.

A computer-readable recording medium on which a moving image encoding program for causing a computer to execute the moving image encoding method according to claim 1 or 2 is recorded.