JP4709074B2

JP4709074B2 - Moving picture encoding method, apparatus, program thereof, and recording medium recording the program

Info

Publication number: JP4709074B2
Application number: JP2006160390A
Authority: JP
Inventors: 幸浩坂東; 一人上倉; 由幸八島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2006-06-09
Filing date: 2006-06-09
Publication date: 2011-06-22
Anticipated expiration: 2026-06-09
Also published as: JP2007329780A

Description

The present invention relates to high-efficiency image signal coding technology, and in particular to a moving picture encoding method that takes visual sensitivity into account.

In H.264, the introduction of intra prediction and variable-shape motion compensation has increased the number of prediction modes compared with earlier standards. To reduce the code amount while maintaining a given subjective image quality, an appropriate prediction mode must therefore be selected. The H.264 reference software JM [Non-Patent Document 1] selects the prediction mode that minimizes the following R-D cost.

Here, S is the original signal, q is the quantization parameter, m is the index of the prediction mode, and Ŝ_{m,q} is the decoded signal obtained when S is predicted using mode m and quantized using q. λ is the Lagrange undetermined multiplier used for mode selection. D(S, Ŝ_{m,q}) is the sum of squared errors given by the following equation.

Here, S^Y, S^U, and S^V are the Y, U, and V components of the original signal, and Ŝ_{m,q}^Y, Ŝ_{m,q}^U, and Ŝ_{m,q}^V are the Y, U, and V components of the decoded signal.
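The two equations referred to above are not reproduced in this text. Based on the symbol definitions given here and on the usual Lagrangian form of the JM mode decision, they can plausibly be written as follows. The rate term R(S, m, q) (the code amount when S is encoded with mode m and quantization parameter q) and the pixel index [x, y] are notation assumed for illustration.

```latex
J(m) = D(S, \hat{S}_{m,q}) + \lambda \, R(S, m, q)

D(S, \hat{S}_{m,q}) =
    \sum_{x,y} \bigl( S^{Y}[x,y] - \hat{S}^{Y}_{m,q}[x,y] \bigr)^{2}
  + \sum_{x,y} \bigl( S^{U}[x,y] - \hat{S}^{U}_{m,q}[x,y] \bigr)^{2}
  + \sum_{x,y} \bigl( S^{V}[x,y] - \hat{S}^{V}_{m,q}[x,y] \bigr)^{2}
```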

The calculation of the decoded signal in H.264 is described below. The symbols used in the explanation are summarized in Table 1.

In the H.264 encoding process, the prediction error signal R (= S − P_m) obtained with the prediction of mode number m is subjected to an orthogonal transform using the transform matrix Φ, as follows.

C = Φ R Φ^t    (1)

Here, Φ^t denotes the transpose of the transform matrix Φ. The transform matrix Φ is an orthogonal matrix with integer elements, given by the following equation.

Next, because the matrix Φ is not normalized, processing equivalent to normalizing the matrix is performed.

C_n = N(C)    (3)

In addition, C is quantized using the quantization parameter q as follows. In JM, the normalization is built into the quantization.

V = Q(C_n)    (4)

In the H.264 decoding process, on the other hand, V is inverse-quantized as in the following equation to obtain the decoded values of the transform coefficients.

Next, Ĉ_q is inverse-transformed as in the following equation to obtain the decoded prediction error signal.

Finally, the decoded signal of the encoding target image is obtained by the following equation.
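The encoding and decoding chain described above (transform, normalization, quantization, inverse quantization, inverse transform, reconstruction) can be sketched as follows. This is a minimal floating-point illustration, not the JM implementation: the matrix `PHI` is assumed to be the well-known 4×4 H.264 core transform (equation (2) is not reproduced in this text), the quantizer is a plain uniform scalar quantizer with a hypothetical `step` parameter standing in for Q, and all function names are illustrative.

```python
# Assumed 4x4 H.264 core transform; rows are orthogonal but not unit length.
PHI = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]
ROW_NORMS = [sum(v * v for v in row) ** 0.5 for row in PHI]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def forward_transform(r):
    """Equation (1): C = PHI * R * PHI^t."""
    return matmul(matmul(PHI, r), transpose(PHI))


def normalize(c):
    """Equation (3): C_n = N(C), dividing each coefficient by the row norms."""
    return [[c[k][l] / (ROW_NORMS[k] * ROW_NORMS[l]) for l in range(4)]
            for k in range(4)]


def quantize(c_n, step):
    """Stand-in for Q in equation (4): uniform scalar quantizer (assumption)."""
    return [[round(v / step) for v in row] for row in c_n]


def dequantize(levels, step):
    """Stand-in for the inverse quantization that yields the decoded coefficients."""
    return [[level * step for level in row] for row in levels]


def inverse_transform(c_hat):
    """Inverse of the normalized transform: R_hat = T^t * C_hat * T."""
    t = [[PHI[k][j] / ROW_NORMS[k] for j in range(4)] for k in range(4)]
    return matmul(matmul(transpose(t), c_hat), t)


def encode_decode(s, p, step):
    """Return (C_n, C_hat, S_hat) for original block s and prediction p."""
    r = [[s[y][x] - p[y][x] for x in range(4)] for y in range(4)]
    c_n = normalize(forward_transform(r))
    c_hat = dequantize(quantize(c_n, step), step)
    r_hat = inverse_transform(c_hat)
    s_hat = [[p[y][x] + r_hat[y][x] for x in range(4)] for y in range(4)]
    return c_n, c_hat, s_hat
```

With a very small `step` the reconstruction `s_hat` approaches `s`; larger steps introduce the quantization error between `c_n` and `c_hat` that the distortion measure evaluates.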

In addition to Non-Patent Document 1 above, Non-Patent Documents 2 to 5 below describe technology related to the present invention. Patent Document 1 below describes a technique that, in order to improve coding efficiency through multiple encoding passes without a large increase in hardware, detects the amount of motion in a moving image from the motion vectors detected during the first encoding pass, and controls the coding rate of the second encoding pass in consideration of human visual characteristics and the reduction of coding distortion.

Patent Document 1: JP-A-10-336675
Non-Patent Document 1: K. P. Lim, G. Sullivan, and T. Wiegand, "Text Description of Joint Model Reference Encoding Methods and Decoding Concealment Methods," Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, JVT-R95, Jan. 2006.
Non-Patent Document 2: J. L. Mannos and D. J. Sakrison, "The effect of a visual fidelity criterion on the encoding of images," IEEE Trans. Information Theory, Vol. IT-20, pp. 523-536, July 1974.
Non-Patent Document 3: N. B. Nill, "A visual model weighted cosine transform for image compression and quality assessment," IEEE Trans. Commun., Vol. COM-33, No. 12, pp. 551-557, June 1985.
Non-Patent Document 4: B. Chitprasert and K. R. Rao, "Human visual weighted progressive image transmission," IEEE Trans. Commun., pp. 1040-1044, July 1990.
Non-Patent Document 5: Kazuto Kamikura, Hiroshi Watanabe, Naoki Kobayashi, Susumu Ichinose, and Hiroshi Yasuda, "Global motion and luminance change compensated video coding considering reduction of computational complexity" (in Japanese), IEICE Transactions, Vol. J82-B, No. 9, pp. 1676-1688, Sept. 1999.

As described above, the measure of subjective image quality used in JM is the squared error. However, the squared error is not necessarily a distortion measure that reflects subjective degradation of image quality. For example, changes in high-frequency components are harder to detect visually than changes in low-frequency components. It is also known that image quality degradation in fast-moving scenes is relatively less noticeable than degradation in still scenes (the temporal masking effect). An encoder that does not exploit these visual characteristics (for example, JM) therefore leaves room for improvement in reducing the code amount efficiently.

The technique described in Patent Document 1 controls the coding rate, for example by controlling the quantization step size in consideration of visual characteristics, but it does not address the selection of the coding mode.

The present invention has been made in view of these circumstances. Its object is to establish a prediction mode selection method that reduces the code amount efficiently by using a distortion measure that reflects subjective image quality, taking spatio-temporal masking effects into account.

The spatial masking effect described above arises from differences in visual sensitivity across spatial frequency components. A distortion measure corresponding to subjective image quality is therefore defined by weighting the distortion of each orthogonal transform coefficient according to the visual sensitivity for its frequency component. In addition, to account for temporal masking, this weighted distortion is further weighted according to the amount of motion. The distortion weighted according to these spatio-temporal properties is used in the R-D cost.

That is, the present invention concerns moving picture encoding that sets a prediction mode, such as the block size, for intra-frame prediction and inter-frame prediction, and compresses the prediction error signal obtained by that prediction through orthogonal transformation and quantization. When the prediction mode is determined based on a Lagrangian cost function composed of a distortion amount, a code amount, and an undetermined multiplier, the invention is characterized by comprising: a step of calculating, as a sensitivity coefficient, the ratio between the frequency components in the block subjected to the orthogonal transform multiplied by the visual sensitivity values calculated with a predetermined visual sensitivity function, and the frequency components in that block; a step of calculating, for each prediction mode, a weighted distortion amount obtained by multiplying the squared error by the sensitivity coefficient; a step of setting a cost function using the weighted distortion amount and calculating the cost of each prediction mode with that cost function; and a step of selecting the prediction mode that minimizes the cost calculated with the cost function.

Further, in the present invention described above, the visual sensitivity function has, as at least one of its parameters, a viewing distance parameter obtained by dividing the viewing distance by the width of the image, and is a decreasing function of the viewing distance parameter. When the sensitivity coefficient is calculated, the viewing distance parameter is set, according to the global amount of motion of the frame, so that its value when the amount of motion is large is greater than its value when the amount of motion is small, and the sensitivity coefficient is thereby changed adaptively for each frame.

Further, in the present invention described above, the visual sensitivity function has, as at least one of its parameters, a viewing distance parameter obtained by dividing the viewing distance by the width of the image, and is a decreasing function of the viewing distance parameter. When the sensitivity coefficient is calculated, the viewing distance parameter is set, according to the amount of motion within the block, so that its value when the amount of motion is large is greater than its value when the amount of motion is small, and the sensitivity coefficient is thereby changed adaptively for each block.

In the present invention, the R-D cost for mode selection is switched adaptively in consideration of how much the coding distortion contributes to subjective image quality. As a result, for regions where coding distortion contributes little to subjective image quality, the R-D cost is weighted so as to emphasize code amount reduction. Because the code amount is then reduced in regions where distortion is hard to detect visually, the code amount can be reduced efficiently while the subjective image quality of the decoded image is maintained.

Specific embodiments of the present invention are described in detail below.

[Weighting of the quantization error signal]
First, the weighting of the quantization error signal based on visual sensitivity is described. In this embodiment, the following R-D cost is used.

The following weighted distortion amount is used as the distortion amount in the calculation of this R-D cost.

Here, C_n^{Y(i)}[k, l], C_n^{U(i)}[k, l], and C_n^{V(i)}[k, l] are the transform coefficients of the sub-block (N × N pixels) that is scanned i-th in raster-scan order among the sub-blocks within the macroblock (16 × 16 pixels for the Y component, 8 × 8 pixels for the U and V components). Likewise, Ĉ_q^{Y(i)}[k, l], Ĉ_q^{U(i)}[k, l], and Ĉ_q^{V(i)}[k, l] are the decoded transform coefficients of the sub-block (N × N pixels) that is scanned i-th in raster-scan order within the macroblock (16 × 16 pixels for the Y component, 8 × 8 pixels for the U and V components). W_{k,l}^Y, W_{k,l}^U, and W_{k,l}^V are weighting coefficients set to 1 or less, referred to below as sensitivity coefficients. Their calculation is described in detail in [Calculation of the sensitivity coefficients].

In the above equation, setting W_{k,l}^Y, W_{k,l}^U, and W_{k,l}^V to small values corresponds to underestimating the quantization distortion D(C_n, Ĉ_q). Note that, by the orthonormality of the orthogonal transform, if W_{k,l}^Y = 1, W_{k,l}^U = 1, and W_{k,l}^V = 1 for all k, l, the weighted distortion amount above is equivalent to the sum of squared errors.
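The R-D cost and weighted distortion equations are not reproduced in this text. A form consistent with the symbol definitions above would be the following; it is a reconstruction, and the rate symbol R and the explicit index ranges are assumptions made for illustration.

```latex
J(m) = D(C_n, \hat{C}_q) + \lambda \, R

D(C_n, \hat{C}_q) =
    \sum_{i} \sum_{k=0}^{N-1} \sum_{l=0}^{N-1}
      W^{Y}_{k,l} \bigl( C_n^{Y(i)}[k,l] - \hat{C}_q^{Y(i)}[k,l] \bigr)^{2}
  + \sum_{i} \sum_{k=0}^{N-1} \sum_{l=0}^{N-1}
      W^{U}_{k,l} \bigl( C_n^{U(i)}[k,l] - \hat{C}_q^{U(i)}[k,l] \bigr)^{2}
  + \sum_{i} \sum_{k=0}^{N-1} \sum_{l=0}^{N-1}
      W^{V}_{k,l} \bigl( C_n^{V(i)}[k,l] - \hat{C}_q^{V(i)}[k,l] \bigr)^{2}
```

Each squared coefficient error is scaled by the sensitivity coefficient of its frequency position (k, l), so errors at positions with small W contribute less to the cost.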

[Calculation of the sensitivity coefficients]
Let φ_k be the k-th column vector (an N-dimensional vector) of the transform matrix Φ (an N × N matrix). The basis images for this matrix are then obtained from the following equation. In H.264, N is either 4 or 8.

Here, φ_l^t is the transpose of φ_l.

Each basis image f_{k,l}(x, y) (0 ≤ x, y ≤ N−1) is subjected to the discrete Fourier transform given by the following equation to obtain its Fourier coefficients. In what follows, N = 2^m.

Here, j is the imaginary unit. The Fourier coefficients F_{k,l}(u, v) (0 ≤ u ≤ N−1, 0 ≤ v ≤ N−1) obtained in this way are weighted as follows.

~F_{k,l}(u, v) (where ~ is a diacritical mark placed above F) is explained below. ĝ(η) is a function defined by the following equation in terms of g(η).

Here, α is a parameter that controls the weighting and is supplied externally. g(η) is a function known as the visual sensitivity function and has the following functional form.

g(η) = (a + bη) exp(−(cη)^d)    (12)

Here, a, b, c, and d are parameters that determine the functional form of the visual sensitivity function (hereinafter referred to as model parameters), and take, for example, the following values. The values below are taken, in order from the top, from Non-Patent Document 2, Non-Patent Document 3, and Non-Patent Document 4.

(a, b, c, d) = (0.4992, 0.2964, −0.114, 1.1)    (13)
(a, b, c, d) = (0.2, 0.45, −0.18, 1)    (14)
(a, b, c, d) = (0.246, 0.615, −0.25, 1)    (15)

In equation (12), η takes the following value.

η₀ is the argument at which g(η) attains its maximum, as follows.
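The visual sensitivity function of equation (12) and its maximizer η₀ can be sketched as follows, using the model parameters of equation (14). One assumption is made loudly here: the listed values of c are negative, and taken literally (cη)^d would be complex for d = 1.1 and would make the exponential grow for d = 1, so the sketch uses |c|, which gives the usual decaying band-pass shape. The grid search, its bounds, and the function names are illustrative choices, not the patent's procedure.

```python
import math

# Model parameters (a, b, c, d) of equation (14), with |c| in place of c
# (assumption: the sketch evaluates exp(-(|c| * eta) ** d)).
NILL = (0.2, 0.45, 0.18, 1.0)


def g(eta, params=NILL):
    """Visual sensitivity function, equation (12), evaluated with |c|."""
    a, b, c, d = params
    return (a + b * eta) * math.exp(-((abs(c) * eta) ** d))


def argmax_g(params=NILL, eta_max=60.0, step=0.001):
    """Approximate eta_0, the argument maximizing g, by a simple grid search."""
    best_eta, best_val = 0.0, g(0.0, params)
    for i in range(1, int(eta_max / step) + 1):
        eta = i * step
        val = g(eta, params)
        if val > best_val:
            best_eta, best_val = eta, val
    return best_eta
```

For d = 1 the maximizer also has the closed form η₀ = 1/|c| − a/b, which the grid search should reproduce to within the grid spacing.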

θ(r, H) is the viewing angle per pixel when an image of height H is observed at a viewing distance of rH, and is given by the following equation.

Hereinafter, r is referred to as the viewing distance parameter. From the discussion above, ~F_k(u, v) is a function of the viewing distance parameter r. In what follows, the notation ~F_k(u, v, r) is therefore used to show this dependence on r explicitly.

The sensitivity coefficient for the basis image f_{k,l}(x, y) (0 ≤ x, y ≤ N−1) is defined as the power ratio given by the following equation.

The sensitivity coefficients can be obtained independently of the image to be encoded. Therefore, if they are computed before encoding and stored in a lookup table, the computation of the sensitivity coefficients during encoding can be omitted. W_{k,l}^U(r) and W_{k,l}^V(r) can be obtained in the same way. The model parameters may also be chosen differently for the luminance component and the chrominance components.
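The precomputation described above can be sketched as follows: build each basis image as an outer product, take its 2-D discrete Fourier transform, weight the Fourier coefficients, and store the ratio of weighted to unweighted power in a lookup table keyed by the viewing distance parameter. Several parts are assumptions: `BASIS` holds the basis vectors of the well-known 4×4 H.264 core transform (whether they appear as rows or columns of Φ depends on how the matrix is written), and `example_weight` is a purely hypothetical stand-in for the weighting derived from ĝ(η), whose defining equations are not reproduced in this text.

```python
import cmath

# Assumed basis vectors phi_0..phi_3 of the 4x4 H.264 core transform.
BASIS = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]
N = len(BASIS)


def basis_image(k, l):
    """f_{k,l}(x, y) as the outer product of phi_k and phi_l."""
    return [[BASIS[k][x] * BASIS[l][y] for y in range(N)] for x in range(N)]


def dft2(f):
    """Direct 2-D discrete Fourier transform of an N x N array."""
    return [[sum(f[x][y] * cmath.exp(-2j * cmath.pi * (u * x + v * y) / N)
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def sensitivity_coefficient(k, l, weight):
    """Power ratio of the weighted to the unweighted Fourier coefficients."""
    big_f = dft2(basis_image(k, l))
    num = sum(abs(weight(u, v) * big_f[u][v]) ** 2
              for u in range(N) for v in range(N))
    den = sum(abs(big_f[u][v]) ** 2 for u in range(N) for v in range(N))
    return num / den


def build_table(weight):
    return [[sensitivity_coefficient(k, l, weight) for l in range(N)]
            for k in range(N)]


def example_weight(r):
    """Hypothetical weight: 1 at DC, decaying faster for larger r."""
    def weight(u, v):
        fu, fv = min(u, N - u), min(v, N - v)  # fold to |frequency|
        return 1.0 / (1.0 + r * (fu + fv))
    return weight


# Lookup table for two candidate viewing distance parameters (values assumed).
LUT = {r: build_table(example_weight(r)) for r in (2.0, 6.0)}
```

At encoding time only `LUT[r][k][l]` needs to be read, which is the saving the lookup table is meant to provide.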

In this case, W_{k,l}^Y(r), W_{k,l}^U(r), and W_{k,l}^V(r) satisfy the following inequalities.

W_{k,l}^Y(r) ≤ 1
W_{k,l}^U(r) ≤ 1
W_{k,l}^V(r) ≤ 1

This is because g(η) ≤ 1 leads to the following relation.

Furthermore, since ~F_k(u, v, r) is a decreasing function of r, W_{k,l}^Y(r), W_{k,l}^U(r), and W_{k,l}^V(r) are decreasing functions of the viewing distance parameter r. This corresponds to the visual characteristic that visual sensitivity declines as the viewing distance increases.

[Modification considering the temporal masking effect]
When a large change along the time axis (a fast camera pan or tilt, a scene change, and so on) occurs in a frame, sensitivity to the image quality of that frame drops sharply. This visual characteristic is known as the temporal masking effect. For frames in which such a large temporal change occurs, the sensitivity coefficients are therefore controlled so that they take small values. In this embodiment, the viewing distance parameter is used to control the sensitivity coefficients.

For example, the control is performed as follows.

(1) When the viewing distance parameter is controlled for each frame:
r = r₁ ... if the magnitude of the global motion vector is greater than or equal to a threshold
r = r₂ ... otherwise    (20)
Here, r₁ > r₂. The global motion vector is calculated by, for example, the method of Non-Patent Document 5.

(2) When the viewing distance parameter is controlled for each macroblock:
r = r₁' ... if the magnitude of the motion vector of the macroblock is greater than or equal to a threshold
r = r₂' ... otherwise    (21)
When the sub-macroblocks within a macroblock have different motion vectors, the following is used instead.
r = r₁' ... if the minimum magnitude of the motion vectors contained in the sub-macroblocks is greater than or equal to the threshold
r = r₂' ... otherwise    (22)
Here, r₁' > r₂'.
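The threshold rules of equations (20) to (22) can be sketched as follows. Motion vectors are represented as (x, y) tuples and their magnitude is taken as the Euclidean norm; both choices, along with the function names and the numeric values in the usage lines, are assumptions for illustration.

```python
import math


def magnitude(mv):
    """Euclidean length of a motion vector (x, y); the norm is an assumption."""
    return math.hypot(mv[0], mv[1])


def viewing_distance_for_frame(global_mv, threshold, r1, r2):
    """Equation (20): larger parameter r1 when global motion is large (r1 > r2)."""
    return r1 if magnitude(global_mv) >= threshold else r2


def viewing_distance_for_macroblock(motion_vectors, threshold, r1p, r2p):
    """Equations (21) and (22): use the smallest motion vector magnitude.

    With a single motion vector this reduces to equation (21); with several
    sub-macroblock vectors it is the minimum rule of equation (22).
    """
    smallest = min(magnitude(mv) for mv in motion_vectors)
    return r1p if smallest >= threshold else r2p


# Hypothetical values: threshold of 8 pixels, r1 = 6.0, r2 = 4.0.
fast_pan = viewing_distance_for_frame((10, 0), 8.0, 6.0, 4.0)
slow_pan = viewing_distance_for_frame((3, 4), 8.0, 6.0, 4.0)
```

Because the sensitivity coefficients decrease with r, choosing the larger parameter for fast motion lowers the weight on distortion in exactly the frames or macroblocks where temporal masking applies.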

[Processing flowchart]
The processing of the embodiment of the present invention is described with reference to the processing flowchart of FIG. 1. Processing other than the selection of the prediction mode is the same as in conventional moving picture encoding, so its description is omitted.

Step S1: Write the initial value of the prediction mode to register X.

Step S2: Set register C, which stores the minimum cost, and register M, which stores the optimal mode, to their initial values.

Step S3: Taking the prediction mode, the quantization parameter, the encoding target signal, and the reference signal as inputs, calculate the code amount obtained with the given prediction mode and write the calculated value to a register. The specific calculation follows the JM method.

Step S4: Taking the prediction mode, the quantization parameter, the encoding target signal, and the reference signal as inputs, calculate the weighted distortion amount obtained with the given prediction mode and write the calculated value to a register. The details of this processing are described later with reference to FIG. 2.

Step S5: Taking the prediction mode and the quantization parameter as inputs, calculate the undetermined multiplier and write the calculated value to a register. The specific calculation follows the JM method.

Step S6: Taking the code amount, the weighted distortion amount, and the undetermined multiplier as inputs, calculate the R-D cost and write the calculated value to a register. The specific calculation follows equation (8).

Step S7: Taking the R-D cost calculated in step S6 and the value of register C as inputs, determine whether the R-D cost calculated in step S6 is smaller than the value of register C, and output the resulting Boolean value. If the output is true, proceed to step S8. If the output is false, proceed to step S10.

Step S8: Write the R-D cost calculated in step S6 to register C.

Step S9: Write the value of register X to register M.

Step S10: Determine whether the above processing has been completed for all prediction modes. If not, proceed to step S11. If all prediction modes have been processed, end the processing. The mode finally stored in register M is the prediction mode selected as the optimal mode, and the value stored in register C is its cost.

Step S11: Write the value representing the next prediction mode to register X. The order in which prediction modes are written to the register is assumed to be given in advance. Then return to step S3 and repeat the processing.
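The loop of steps S1 to S11 can be sketched as follows. The callables `rate_of`, `distortion_of`, and `lambda_of` are hypothetical placeholders for the JM code amount calculation, the weighted distortion of step S4, and the JM multiplier calculation. The cost is assumed to take the Lagrangian form distortion + λ × code amount, since equation (8) is not reproduced in this text, and the initial value of register C is assumed to be infinity.

```python
import math


def select_mode(modes, rate_of, distortion_of, lambda_of):
    """Return (best_mode, best_cost) over the candidate prediction modes."""
    best_cost = math.inf   # register C, step S2 (initial value assumed)
    best_mode = None       # register M, step S2
    for mode in modes:     # register X, steps S1, S10, S11
        rate = rate_of(mode)                # step S3: code amount
        distortion = distortion_of(mode)    # step S4: weighted distortion
        lam = lambda_of(mode)               # step S5: undetermined multiplier
        cost = distortion + lam * rate      # step S6: R-D cost (assumed form)
        if cost < best_cost:                # step S7
            best_cost = cost                # step S8
            best_mode = mode                # step S9
    return best_mode, best_cost


# Usage with made-up numbers for three hypothetical modes.
rates = {"intra16": 120, "inter16": 40, "inter8": 70}
dists = {"intra16": 300.0, "inter16": 900.0, "inter8": 500.0}
best = select_mode(list(rates), rates.get, dists.get, lambda mode: 5.0)
```

The strict comparison in step S7 means that, among modes with equal cost, the one visited first in the predefined order is kept.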

FIG. 2 shows the detailed processing flow of step S4 in FIG. 1.

Step S21: Taking the transform coefficients obtained with the transform matrix Φ as input, perform the normalization and write the normalized transform coefficients to a register as a one-dimensional array. The scanning method that converts the two-dimensional transform coefficients into one-dimensional data is assumed to be given externally. The same applies to steps S22 and S24 below.

Step S22: Taking the quantization result of the transform coefficients as input, perform inverse quantization and write the decoded values of the transform coefficients to a register as a one-dimensional array.

Step S23: Taking a motion vector as input, calculate the viewing distance parameter according to that motion vector and write the calculated viewing distance parameter to a register. For example, the methods of equations (20) to (22) are followed. An example with a choice between two sets of sensitivity coefficients is shown here, but the present invention applies equally when there are three or more candidates. The motion vector used as input here is assumed to be given separately. One option is to use motion vectors calculated during encoding (for example, the average of the motion vectors within the macroblock). Another is to use a motion vector obtained for the macroblock (16 × 16 pixels) independently of the encoding.

Step S24: Taking the viewing distance parameter as input, calculate the sensitivity coefficients according to the viewing distance parameter and write them to a register as a one-dimensional array. The specific calculation follows equation (19). If the selectable viewing distance parameters are determined in advance, the sensitivity coefficients can also be calculated in advance, so their calculation can be omitted. In that case, this step becomes the following: taking the viewing distance parameter as input, read the sensitivity coefficients stored in advance for that viewing distance parameter and write them to a register as a one-dimensional array.

Step S25: Initialize to 0 the counter i used to count the number of iterations. Initialize the value of register S to 0.

Step S26: Taking as inputs the i-th component of the normalized transform coefficients output in step S21 and the i-th component of the decoded transform coefficients output in step S22, compute their difference, square it, and write the resulting squared error (the coding distortion of the transform coefficient) to a register.

Step S27: Taking as inputs the squared error output in step S26 and the i-th component of the sensitivity coefficients, multiply the squared error by the sensitivity coefficient and write the product to a register.

Step S28: Taking as inputs the value calculated in step S27 and the value of register S, add them and write the sum to register S.

Steps S29, S30: Repeat the above processing while incrementing the counter i by 1, until all components of the transform coefficients have been processed.
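The accumulation in steps S25 to S30 can be sketched as follows, operating on the three one-dimensional arrays produced by steps S21, S22, and S24. The function and argument names are illustrative, and the length check is an added safeguard rather than part of the described flow.

```python
def weighted_distortion(coeffs, decoded, weights):
    """Sum of sensitivity-weighted squared coefficient errors (steps S25-S30).

    coeffs  : normalized transform coefficients from step S21
    decoded : decoded transform coefficients from step S22
    weights : sensitivity coefficients from step S24, in the same scan order
    """
    if not (len(coeffs) == len(decoded) == len(weights)):
        raise ValueError("all three arrays must have the same length")
    total = 0.0                                    # register S, step S25
    for i in range(len(coeffs)):                   # counter i, steps S25, S29, S30
        sq_err = (coeffs[i] - decoded[i]) ** 2     # step S26
        total += weights[i] * sq_err               # steps S27 and S28
    return total


# Made-up values: the last two coefficients were quantized to zero.
example = weighted_distortion(
    [4.0, -2.0, 1.0, 0.5],
    [4.0, -1.0, 0.0, 0.0],
    [1.0, 0.5, 0.25, 0.0],
)
```

With all weights equal to 1 the function returns the plain sum of squared errors, matching the equivalence noted for the weighted distortion amount.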

[Example of apparatus configuration]
FIG. 3 shows the configuration of an apparatus according to the embodiment of the present invention. The configuration of the processing unit that encodes a moving picture using the optimal prediction mode selected by this embodiment is the same as that of a conventional, general moving picture encoding apparatus, so only the part of the configuration that selects the prediction mode is described here.

初期モード設定部１０１：予測モードの初期値をモード記憶部１０２に書き出す。 Initial mode setting unit 101: Writes the initial value of the prediction mode to the mode storage unit 102.

符号量算出部１０３：予測モード，量子化パラメータ，符号化対象信号，参照信号を入力とし，符号化した場合の符号量を算出し，算出した値を符号量記憶部１０４に書き出す。具体的な算出方法は，ＪＭの方法に従う。 Code amount calculation unit 103: Inputs a prediction mode, a quantization parameter, an encoding target signal, and a reference signal, calculates a code amount when encoding, and writes the calculated value to the code amount storage unit 104. The specific calculation method follows the JM method.

重み付き歪み量算出部１０５：予測モード，量子化パラメータ，符号化対象信号，参照信号を入力とし，符号化した場合の重み付き歪み量を算出し，算出した値を重み付き歪み量記憶部１０６に書き出す。本処理の詳細については図２を用いて後述する。 Weighted distortion amount calculation unit 105: The prediction mode, the quantization parameter, the encoding target signal, and the reference signal are input, the weighted distortion amount when encoding is calculated, and the calculated value is used as the weighted distortion amount storage unit 106. Export to Details of this processing will be described later with reference to FIG.

未定乗数算出部１０７：予測モード，量子化パラメータを入力とし，未定乗数を算出し，算出した値を未定乗数記憶部１０８に書き出す。具体的な算出方法は，ＪＭの方法に従う。 Undetermined multiplier calculation unit 107: Inputs the prediction mode and the quantization parameter, calculates the undefined multiplier, and writes the calculated value to the undefined multiplier storage unit 108. The specific calculation method follows the JM method.

Cost calculation unit 109: Takes as input the code amount, the weighted distortion amount, and the undetermined multiplier read from the code amount storage unit 104, the weighted distortion amount storage unit 106, and the undetermined multiplier storage unit 108, respectively, calculates the R-D cost, and writes the calculated value to the cost storage unit 110. The specific calculation follows equation (8).

Minimum cost determination unit 111: Takes as input the R-D cost read from the cost storage unit 110 and the minimum cost read from the minimum cost storage unit 112, determines whether the R-D cost is smaller than the minimum cost, and outputs the result as a Boolean value. If the output is true, the R-D cost read from the cost storage unit 110 is written to the minimum cost storage unit 112, and processing proceeds to the optimal mode update unit 113. If the output is false, processing moves to the final mode determination unit 115.

Optimal mode update unit 113: Writes the mode that was used to calculate the R-D cost written to the minimum cost storage unit 112 into the optimal mode storage unit 114.

Final mode determination unit 115 and mode setting unit 116: The above processing is performed for all prediction modes.

Optimal mode output unit 117: Once processing has finished for all prediction modes, outputs the prediction mode stored in the optimal mode storage unit 114 as the optimal mode.
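The control flow of units 101 to 117 can be sketched as a simple search loop. The following Python is only an illustrative sketch, not the patented implementation: it assumes that equation (8) has the usual Lagrangian form J = D_w + λR (weighted distortion plus multiplier times code amount), that the minimum cost storage unit starts at infinity, and the function and parameter names (`select_optimal_mode`, `rate_of`, `weighted_distortion_of`, `multiplier_of`) are hypothetical stand-ins for units 103, 105, and 107.

```python
import math
from typing import Callable, Iterable, Optional, Tuple


def select_optimal_mode(
    modes: Iterable[int],
    rate_of: Callable[[int], float],                 # stand-in for code amount calculation unit 103
    weighted_distortion_of: Callable[[int], float],  # stand-in for weighted distortion unit 105
    multiplier_of: Callable[[int], float],           # stand-in for undetermined multiplier unit 107
) -> Tuple[Optional[int], float]:
    """Return (optimal mode, minimum R-D cost) over all candidate modes."""
    min_cost = math.inf    # assumed initial content of minimum cost storage unit 112
    optimal_mode = None    # content of optimal mode storage unit 114

    for mode in modes:     # units 115/116: visit every prediction mode
        rate = rate_of(mode)
        distortion = weighted_distortion_of(mode)
        lam = multiplier_of(mode)

        # Cost calculation unit 109 (assumed form of equation (8)).
        cost = distortion + lam * rate

        # Minimum cost determination unit 111: strict "smaller than" test.
        if cost < min_cost:
            min_cost = cost        # update minimum cost storage unit 112
            optimal_mode = mode    # optimal mode update unit 113

    return optimal_mode, min_cost  # optimal mode output unit 117
```

Because the comparison is strict, the first mode that reaches the minimum cost is kept when two modes tie, which matches the "smaller than the minimum cost" test described for unit 111.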

FIG. 4 shows the detailed configuration of the weighted distortion amount calculation unit 105 in FIG. 3.

Transform coefficient normalization unit 201: Takes the transform coefficients obtained with the transform matrix Φ as input, normalizes them, and writes the normalized transform coefficients to the normalized transform coefficient storage unit 202.

Transform coefficient decoding unit 203: Takes the quantization result of the transform coefficients as input, performs inverse quantization, and writes the decoded values of the transform coefficients to the decoded transform coefficient storage unit 204. As in JM, the quantization here includes the normalization processing.

Distortion amount calculation unit 205: Takes as input the normalized transform coefficients read from the normalized transform coefficient storage unit 202 and the decoded transform coefficient values read from the decoded transform coefficient storage unit 204, obtains the difference between the two for each component, squares that difference, and writes the resulting squared error to the distortion amount storage unit 206.

Sensitivity coefficient multiplication unit 207: Takes as input the sensitivity coefficient read from one of the sensitivity coefficient storage units 208-1 and 208-2 and the squared error read from the distortion amount storage unit 206, multiplies the squared error by the sensitivity coefficient, and writes the resulting weighted distortion amount to the distortion amount storage unit 212. Although this example selects between two sets of sensitivity coefficients, the present invention applies equally when there are three or more candidates.

Viewing distance parameter calculation unit 209: Takes a motion vector as input, calculates the viewing distance parameter according to that motion vector, and writes the calculated viewing distance parameter to the viewing distance parameter storage unit 210. The calculation follows, for example, the method shown in equations (20) to (22). Although this example selects between two viewing distance parameters, the present invention applies equally when there are three or more candidates. The motion vector used as input here is assumed to be supplied separately. One option is to take the motion vectors calculated during encoding and use the average of the motion vectors within the macroblock. Alternatively, a motion vector obtained for the macroblock (16 × 16 pixels) independently of encoding may be used.

Sensitivity coefficient selection unit 211: Takes the viewing distance parameter as input, selects a sensitivity coefficient according to the viewing distance parameter, and outputs a control signal indicating that the selected sensitivity coefficient is to be used as the input to the sensitivity coefficient multiplication unit 207.

Distortion amount sum calculation unit 213: Takes as input the weighted distortion amount of each component read from the distortion amount storage unit 212, calculates the sum over all components, and writes that sum to the weighted distortion amount storage unit 106 in FIG. 3.
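The data path through units 201 to 213 can be illustrated with a short sketch. The Python below is a minimal illustration under stated assumptions, not the method of equations (20) to (22), which are not reproduced in this section: the motion threshold, the two viewing distance parameter values, the switching point, and all function names are hypothetical. The only behavior taken from the text is that a larger amount of motion yields a larger viewing distance parameter, that the parameter selects one of two stored sets of sensitivity coefficients, and that the weighted distortion is the sum over components of the sensitivity coefficient times the squared error.

```python
import math
from typing import Sequence, Tuple


def viewing_distance_parameter(
    motion_vector: Tuple[float, float],
    threshold: float = 4.0,           # hypothetical motion threshold (pixels)
    small_motion_value: float = 3.0,  # hypothetical parameter for small motion
    large_motion_value: float = 6.0,  # hypothetical parameter for large motion
) -> float:
    """Unit 209 stand-in: larger motion gives a larger viewing distance parameter."""
    magnitude = math.hypot(motion_vector[0], motion_vector[1])
    return large_motion_value if magnitude > threshold else small_motion_value


def select_sensitivity(
    parameter: float,
    table_1: Sequence[float],         # contents of sensitivity coefficient storage 208-1
    table_2: Sequence[float],         # contents of sensitivity coefficient storage 208-2
    switch_at: float = 4.5,           # hypothetical switching point
) -> Sequence[float]:
    """Unit 211 stand-in: pick one stored coefficient set from the parameter."""
    return table_2 if parameter > switch_at else table_1


def weighted_distortion(
    normalized: Sequence[float],      # normalized transform coefficients (storage 202)
    decoded: Sequence[float],         # decoded transform coefficients (storage 204)
    sensitivity: Sequence[float],     # selected sensitivity coefficients
) -> float:
    """Units 205, 207, 213: sum of sensitivity-weighted squared errors."""
    if not (len(normalized) == len(decoded) == len(sensitivity)):
        raise ValueError("all inputs must have one entry per frequency component")
    return sum(w * (c - d) ** 2 for c, d, w in zip(normalized, decoded, sensitivity))
```

A caller would compute the parameter from the macroblock's motion vector, select the coefficient set, and pass it to `weighted_distortion`; extending the selection to three or more candidate sets only changes `select_sensitivity`.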

The moving image encoding processing described above can also be realized by a computer and a software program. The program can be provided recorded on a computer-readable recording medium or provided over a network.

By using the above means, the present invention makes it possible to select an appropriate coding mode in an encoder that has many coding modes, such as H.264, and thereby to improve coding efficiency while maintaining the subjective image quality of the decoded image. Because high-efficiency image signal coding schemes such as H.264 are characterized by a high degree of freedom in choosing the coding mode, their coding performance can be fully exploited only when an appropriate coding mode is selected.

FIG. 1 is a processing flowchart of the embodiment of the present invention.
FIG. 2 is a processing flowchart of step S4 in FIG. 1.
FIG. 3 is an apparatus configuration diagram according to the embodiment of the present invention.
FIG. 4 is a detailed configuration diagram of the weighted distortion amount calculation unit in FIG. 3.

Explanation of symbols

101 Initial mode setting unit
102 Mode storage unit
103 Code amount calculation unit
104 Code amount storage unit
105 Weighted distortion amount calculation unit
106 Weighted distortion amount storage unit
107 Undetermined multiplier calculation unit
108 Undetermined multiplier storage unit
109 Cost calculation unit
110 Cost storage unit
111 Minimum cost determination unit
112 Minimum cost storage unit
113 Optimal mode update unit
114 Optimal mode storage unit
115 Final mode determination unit
116 Mode setting unit
117 Optimal mode output unit
201 Transform coefficient normalization unit
202 Normalized transform coefficient storage unit
203 Transform coefficient decoding unit
204 Decoded transform coefficient storage unit
205 Distortion amount calculation unit
206 Distortion amount storage unit
207 Sensitivity coefficient multiplication unit
208-1, 208-2 Sensitivity coefficient storage units
209 Viewing distance parameter calculation unit
210 Viewing distance parameter storage unit
211 Sensitivity coefficient selection unit
212 Distortion amount storage unit
213 Distortion amount sum calculation unit

Claims

A moving image encoding method that sets a prediction mode for intra-frame prediction and inter-frame prediction and compresses information by applying orthogonal transform and quantization to a prediction error signal obtained by prediction in the set prediction mode, the method comprising, when determining the prediction mode based on a Lagrangian cost function consisting of a distortion amount, a code amount, and an undetermined multiplier:
a step of calculating, as a sensitivity coefficient, the ratio between a frequency component in an orthogonally transformed block multiplied by a visual sensitivity value calculated by a predetermined visual sensitivity function and the frequency component in the orthogonally transformed block;
a step of calculating, for each prediction mode, a weighted distortion amount obtained by multiplying the squared error by the sensitivity coefficient;
a step of setting a cost function using the weighted distortion amount and calculating the cost of each prediction mode using the cost function; and
a step of selecting the prediction mode that minimizes the cost calculated using the cost function.

The moving image encoding method according to claim 1, wherein
the visual sensitivity function has, as at least one parameter, a viewing distance parameter obtained by dividing the viewing distance by the width of the image, and is a decreasing function of the viewing distance parameter, and
when the sensitivity coefficient is calculated, the viewing distance parameter is set, according to the global motion of the frame, to a larger value when the amount of motion is large than when the amount of motion is small, so that the sensitivity coefficient is changed adaptively for each frame.

The moving image encoding method according to claim 1, wherein
the visual sensitivity function has, as at least one parameter, a viewing distance parameter obtained by dividing the viewing distance by the width of the image, and is a decreasing function of the viewing distance parameter, and
when the sensitivity coefficient is calculated, the viewing distance parameter is set, according to the amount of motion within the block, to a larger value when the amount of motion is large than when the amount of motion is small, so that the sensitivity coefficient is changed adaptively for each block.

A moving image encoding apparatus that sets a prediction mode for intra-frame prediction and inter-frame prediction and compresses information by applying orthogonal transform and quantization to a prediction error signal obtained by prediction in the set prediction mode, the apparatus comprising:
means for calculating a code amount for each prediction mode;
means for calculating, as a sensitivity coefficient, the ratio between a frequency component in an orthogonally transformed block multiplied by a visual sensitivity value calculated by a predetermined visual sensitivity function and the frequency component in the orthogonally transformed block, and for calculating, for each prediction mode, a weighted distortion amount obtained by multiplying the squared error by the sensitivity coefficient;
means for calculating, for each prediction mode, a Lagrange undetermined multiplier with the quantization parameter as input;
means for calculating the cost of each prediction mode using a cost function determined by the code amount, the weighted distortion amount, and the undetermined multiplier; and
means for selecting the prediction mode that minimizes the cost calculated using the cost function.

A moving image encoding program for causing a computer to execute the moving image encoding method according to any one of claims 1 to 3.

A computer-readable recording medium on which is recorded a moving image encoding program for causing a computer to execute the moving image encoding method according to any one of claims 1 to 3.