JPH099263A

JPH099263A - Encoding method and encoder for motion compensation prediction of dynamic image

Info

Publication number: JPH099263A
Application number: JP15469995A
Authority: JP
Inventors: Atsushi Sagata; 淳嵯峨田; Hirotaka Jiyosawa; 裕尚如沢; Yutaka Watanabe; 裕渡辺
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1995-06-21
Filing date: 1995-06-21
Publication date: 1997-01-10

Abstract

PURPOSE: To detect spatially and temporally continuous motion vectors in a block-matching motion vector detection method. CONSTITUTION: An image to be encoded 1 is input to a frame memory 66 and joined with a plurality of temporally consecutive images to be encoded already stored in the frame memory 66 to form a spatiotemporal image 61. The spatiotemporal image 61 is input from the frame memory 66 to a luminance value continuity information detector 65, which extracts line elements indicating the directions along which luminance values remain similar over time. The image to be encoded is input, together with the luminance value continuity information 62, to a motion detection unit 63, which obtains a motion vector 64 for each block. The motion detection unit 63 derives an estimated search position for each block of the image to be encoded 1 from the curve equation that approximates these directions, searches only the neighborhood of that position, and detects the motion vector 64.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to digital compression coding of image signals for use in image communication, image recording, and the like, and in particular to a motion-compensated predictive coding method for moving images based on region segmentation.

[0002]

[Prior Art] In digital compression coding of moving images, motion-compensated interframe prediction is often used to suppress the temporal redundancy of the moving image signal. In this interframe prediction, the image to be encoded is usually divided into rectangular blocks of, for example, 16 pixels × 16 lines; the amount of motion (motion vector) relative to a reference image is detected for each block; and the difference (motion-compensated prediction error) signal between the image to be encoded and a predicted image, generated by shifting the reference image by the motion vector, is encoded. Motion-compensated interframe prediction greatly improves the interframe correlation of moving images and yields far greater compression than simple interframe prediction. Applying a discrete cosine transform (DCT) or subband decomposition to the motion-compensated prediction error signal additionally suppresses spatial redundancy and compresses the information further. For this reason, standards such as ITU-T H.261 for videophone and videoconference video coding and ISO/IEC 11172 (MPEG-1) for storage-media video coding adopt a hybrid coding configuration in which the residual signal from motion-compensated interframe prediction is DCT-coded.

[0003] ITU-T (formerly CCITT) Recommendation H.261, entitled "Video codec for audiovisual services at p × 64 kb/s", is a video coding standard for communications at bit rates from 64 kb/s (p = 1) to about 2 Mb/s (p = 30). Standardization work began in December 1984, and the Recommendation was approved in December 1990. Applications include videophones and videoconferencing. H.261 suppresses the temporal redundancy of the moving image signal by motion-compensated prediction and suppresses the spatial redundancy of each frame by discrete cosine transform (DCT) coding. The H.261 coding algorithm is outlined below with reference to FIG. 2.

[0004] First, the image to be encoded 1 is input, together with a square pattern 50, to a motion detection unit 51 and divided into square blocks of 16 pixels × 16 lines called macroblocks. For each macroblock of the image to be encoded 1, the motion detection unit 51 finds the most similar block in the reference image within the search range, for example by full search, using the sum of absolute differences or the sum of squared differences as the evaluation function. The relative position of that block is detected as a motion vector 52 and sent to a block motion compensation unit 53. The motion vector of each macroblock is thus the displacement between the coordinates of the macroblock of interest and the coordinates of the block in the reference image that matches it best. The motion vector search range is limited to ±15 pixels × ±15 lines around the coordinates of the macroblock of interest.
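The full-search block matching described above can be sketched as follows. This is a minimal illustration in plain Python, not the H.261 reference procedure: the function names `sad` and `full_search`, the list-of-rows image layout, and the skipping of candidates that fall outside the reference image are all assumptions made for the example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n=16, search=15):
    """Return ((dx, dy), cost) for the displacement within +/-search that
    minimizes SAD. Candidates reaching outside `ref` are skipped."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= bx + dx and bx + dx + n <= width
                      and 0 <= by + dy and by + dy + n <= height)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# Toy check: `cur` is `ref` shifted by 3 pixels and 2 lines.
ref = [[(7 * x + 13 * y + x * y) % 251 for x in range(24)] for y in range(24)]
cur = [[ref[min(y + 2, 23)][min(x + 3, 23)] for x in range(24)]
       for y in range(24)]
vector, cost = full_search(cur, ref, bx=8, by=8, n=4, search=3)
```

With the ±15 range of the text, a full search evaluates up to 31 × 31 = 961 candidate positions per macroblock, which is the cost that a constrained search tries to avoid.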

[0005] Next, the block motion compensation unit 53 generates a motion-compensated predicted image 15 from the motion vector 52 of each macroblock and the locally decoded image 6 of the immediately preceding frame stored in a frame memory 5. The motion-compensated predicted image 15 is input to a subtractor 16 together with the image to be encoded 1. Their difference, the motion-compensated prediction error 17, is DCT-transformed and then quantized in a DCT/quantization unit 54 to become compressed difference data 19. The DCT block size is 8 × 8. The compressed difference data 19 (quantization indices) are compressed in a difference data encoding unit 20 to become difference image coded data 21. Meanwhile, the motion vector 52 is encoded in a motion vector encoding unit 26, and the resulting motion vector coded data 27 are multiplexed with the difference image coded data 21 in a multiplexing unit 28 and transmitted as multiplexed data 29.

[0006] So that the encoder obtains the same decoded image as the decoder, the compressed difference data 19 (quantization indices) are restored to quantization representative values in an inverse quantization/inverse DCT unit 55 and then inverse-DCT-transformed to give a decoded difference image 23. The decoded difference image 23 and the motion-compensated predicted image 15 are added by an adder 24 to form a locally decoded image 25, which is stored in the frame memory 5 and used as the reference image when the next frame is encoded.
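The reason the encoder keeps its own locally decoded image can be illustrated with a simplified sketch. The DCT is omitted here and the prediction error is quantized directly with a uniform step, which is an assumption made for brevity; `encode_block`, `decode_block`, and `step` are hypothetical names, not terms from the patent.

```python
def encode_block(cur, pred, step):
    """Quantize the prediction error and rebuild the block exactly as a
    decoder would. Returns (quantization indices, locally decoded block)."""
    indices = [round((c - p) / step) for c, p in zip(cur, pred)]
    local_decoded = [p + q * step for p, q in zip(pred, indices)]
    return indices, local_decoded


def decode_block(indices, pred, step):
    """Decoder side: prediction plus dequantized prediction error."""
    return [p + q * step for p, q in zip(pred, indices)]


cur = [101, 104, 97, 121]    # pixels of the image to be encoded
pred = [98, 100, 100, 110]   # motion-compensated prediction
indices, local_decoded = encode_block(cur, pred, step=4)
decoded = decode_block(indices, pred, step=4)
```

Because the encoder stores `local_decoded` rather than `cur` as the next reference, encoder and decoder predict from identical data and quantization error does not accumulate from frame to frame.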

[0007]

[Problems to Be Solved by the Invention] When motion vectors are obtained with the conventional motion estimation techniques of digital compression coding of moving images, the motion search normally selects as the match the position that minimizes the power of the prediction error (mean squared error) within a 16 × 16 block. Differences in average luminance therefore strongly influence the search: a change in the light source, for example, can cause a block to be matched to the position of minimum error power even though that is not the true match. As a result, the motion vectors of the blocks often lack continuity in the spatial and temporal directions.

[0008] An object of the present invention is to solve the above problem and to provide a motion-compensated predictive coding method and an encoder for moving images that detect spatially and temporally continuous motion vectors.

[0009]

[Means for Solving the Problems] To achieve the above object, the present invention provides a motion-compensated predictive coding method for moving images in which an image to be encoded is divided into polygonal patches, the amount of motion between the image to be encoded and a prediction reference image is detected for each polygonal patch, motion compensation is performed to generate a predicted image, and the difference between the predicted image and the image to be encoded is encoded. The method is characterized in that, when the vertex motion vectors of the polygonal patches in the image to be encoded are obtained, continuity information on the luminance values in a plurality of temporally consecutive encoded reference images, namely the direction of line elements along which luminance values are similar, is obtained for each pixel as preprocessing for motion vector detection, and the direction of the line elements is referred to when the amount of motion is detected, thereby constraining the motion estimation search range.

[0010] The motion-compensated predictive encoder for moving images according to the present invention comprises: a frame memory that stores images to be encoded and joins a newly input image to be encoded with a plurality of already stored images to be encoded that are temporally consecutive with it, to form a spatiotemporal image; luminance value continuity information extraction means that receives the spatiotemporal image, extracts line elements representing the directions along which luminance values are similar in the time direction, and outputs luminance value continuity information; and motion detection means that receives the image to be encoded and the luminance value continuity information and obtains a motion vector for each block.

[0011]

[Operation] In a moving image sequence, the region shapes of objects are temporally correlated. For example, the region shape of the background of a scene hardly changes over time. Over a short interval, the region shape of an object can also be regarded as unchanged.

[0012] In practice, however, noise in the image, the influence of the light source, and similar factors change region shapes subtly, so that even background and stationary regions are difficult to predict from the region shapes of the prediction reference image. For moving regions, the current region shape could be estimated from the region shape of the prediction reference image and the motion information if the motion were obtained correctly. In practice, however, the position that minimizes the prediction error power within the block is selected as the match, so differences in average luminance strongly influence the motion search, and the block is matched to the position of minimum error power rather than to the position corresponding to the actual motion.

[0013] Therefore, as set out in claims 2 and 3, the direction of line elements along which luminance values are similar in the time direction (the trajectories of objects traced in the spatiotemporal image) is obtained from a plurality of temporally consecutive encoded reference images as preprocessing for motion vector detection; the direction of similar luminance is approximated by a function using, for example, a three-dimensional Hough transform; and the direction of the line elements is used as a constraint in the motion vector search. Narrowing the motion vector search range in this way reduces the amount of computation in the search and preserves the spatial and temporal continuity of the detected motion vectors.

[0014] Furthermore, the motion-compensated prediction method of the prior art described above predicts block by block, treating a rectangular block of, for example, 16 pixels × 16 lines as a single rigid body, so block-shaped discontinuity distortion appears in the predicted image. This distortion is especially pronounced where motion is large, and it becomes a serious visual impairment in low-bit-rate coding, where not enough bits can be allocated to coding the prediction error image.

[0015] As a means of solving this problem, methods have been proposed in which the image to be encoded is divided into polygonal patches, for example triangles or quadrilaterals, and the motion vectors at the vertices of each patch are interpolated by a spatial transformation to perform motion compensation pixel by pixel. As a representative example, "Motion Compensation for Video Compression Using Control Grid Interpolation" by Gary J. Sullivan et al. (IEEE ICASSP '91, pp. 2713-2716, 1991) is briefly described with reference to FIG. 3.

[0016] First, the image to be encoded 1 is divided into square patches of, for example, 16 pixels × 16 lines, and a motion detection unit 3 uses a polygon pattern 2 to obtain the motion vector 4 at each patch vertex. Next, for each patch, a motion vector interpolation unit 12 calculates a per-pixel motion vector 13 from the four vertex vectors 4. As shown in FIG. 4, let the motion vectors at vertices A, B, C, and D be, respectively,

[0017]

[Ext. 1]; then the interpolated vector at coordinates (x, y) within the square ABCD,

[0018]

[Ext. 2], is calculated by the following equation.

[0019]

[Equation 1] This spatial transformation is called bilinear interpolation. With it, the motion vector varies smoothly from pixel to pixel and is smoothly continuous across block boundaries as well. A pixel-unit motion compensation unit 14 uses the per-pixel motion vectors 13 obtained in this way to perform motion-compensated prediction for each pixel. Compared with the conventional motion-compensated prediction method, which assigns the same motion vector to every pixel in a block, this has the advantage that block-shaped discontinuity distortion does not appear in the predicted image.
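A minimal sketch of bilinear interpolation of vertex motion vectors follows. The expressions themselves are not reproduced in this text, so the vertex layout (A top-left, B top-right, C bottom-left, D bottom-right), the coordinate origin at A, and the function name `interpolate_vector` are assumptions; FIG. 4 of the patent may label the vertices differently.

```python
def interpolate_vector(va, vb, vc, vd, x, y, size):
    """Bilinear interpolation of four vertex motion vectors.

    Assumed layout: A top-left, B top-right, C bottom-left, D bottom-right.
    (x, y) is measured from A, with 0 <= x, y <= size.
    """
    u, v = x / size, y / size
    # Each weight is the normalized area of the rectangle opposite the vertex.
    wa = (1 - u) * (1 - v)
    wb = u * (1 - v)
    wc = (1 - u) * v
    wd = u * v
    return (wa * va[0] + wb * vb[0] + wc * vc[0] + wd * vd[0],
            wa * va[1] + wb * vb[1] + wc * vc[1] + wd * vd[1])


va, vb, vc, vd = (0, 0), (4, 0), (0, 8), (4, 8)
center = interpolate_vector(va, vb, vc, vd, x=8, y=8, size=16)
corner_b = interpolate_vector(va, vb, vc, vd, x=16, y=0, size=16)
```

At a vertex the result equals that vertex's own vector, and along a shared edge it depends only on the two vertices of that edge, which is why neighboring patches join without a discontinuity.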

[0020] However, when motion vectors are obtained with the conventional motion estimation techniques of digital compression coding of moving images, the motion search selects, as described above, the position that minimizes the prediction error power within the block. The block is therefore matched to the position of minimum error power even when that is not the true match, and the motion vectors of the blocks often lack continuity in the spatial and temporal directions. Interpolating motion vectors across such discontinuities degrades not only the quality of the predicted image but also the prediction efficiency.

[0021] Therefore, as set out in claims 2 and 3, the direction along which luminance values are similar in the time direction is obtained from a plurality of temporally consecutive encoded reference images as preprocessing for motion vector detection; that direction is approximated by a function using, for example, a three-dimensional Hough transform; and the resulting continuity information is used as a constraint in the motion vector search. This preserves the spatial and temporal continuity of the detected motion vectors and reduces the distortion of the predicted image after motion vector interpolation.

[0022]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

[0023] FIG. 1 shows the configuration of an encoder that implements the motion-compensated predictive coding method for moving images according to one embodiment of the present invention, and corresponds to the inventions of claims 1 and 2.

[0024] In the encoder, the image to be encoded 1 is first input to a frame memory 66, where it is joined with a plurality of temporally consecutive images to be encoded already stored in the frame memory 66 to create a spatiotemporal image 61.

[0025] The spatiotemporal image 61 is input from the frame memory 66 to a luminance value continuity information detector 65, which extracts line elements representing the directions along which luminance values are similar in the time direction. To do so, for each pixel of the current frame the squared error against the luminance values of pixels in the other frames is computed, and the resulting direction is approximated by a spline curve. Any line extraction algorithm can be applied: examples using local operators include the line detection operators of VanderBrug, Paton, and Kasvand, and other options include relaxation methods and the Hough transform. These techniques are described in detail in the "Image Analysis Handbook" (supervised by Mikio Takagi and Haruhisa Shimoda, University of Tokyo Press, January 1991).
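The per-pixel squared-error step can be sketched on a single scan line over time. This is only one possible reading of the paragraph above, not the patent's procedure: the function name `trace_pixel`, the one-dimensional simplification, the search radius, and the tie-breaking toward the previous position are assumptions, and the spline fitting step is left out.

```python
def trace_pixel(rows, x, radius=2):
    """rows[t] is one scan line of frame t; the last row is the current frame.

    Starting from pixel x of the current frame, walk backwards in time and,
    in each earlier frame, move to the position within +/-radius whose
    luminance has the least squared error against the starting pixel.
    Returns the x position in every frame, oldest first.
    """
    target = rows[-1][x]
    pos = x
    path = [x]
    for row in reversed(rows[:-1]):
        lo, hi = max(0, pos - radius), min(len(row) - 1, pos + radius)
        # Ties are broken in favor of staying close to the previous position.
        pos = min(range(lo, hi + 1),
                  key=lambda c: ((row[c] - target) ** 2, abs(c - pos)))
        path.append(pos)
    path.reverse()
    return path


# A bright dot moving right by one pixel per frame on a dark background.
rows = [[200 if c == 2 + t else 10 for c in range(10)] for t in range(4)]
moving = trace_pixel(rows, 5)
background = trace_pixel(rows, 8)
```

The returned positions are the sample points to which a curve (a spline in the text) would then be fitted.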

[0026] The image to be encoded 1 is also input, together with the luminance value continuity information 62, to a motion detection unit 63, which obtains a motion vector 64 for each block. The luminance value continuity information 62 consists of the coefficients obtained when the direction of similar luminance is approximated by a curve equation.

[0027] The motion detection unit 63 derives from the curve equation an estimated search position for each block of the image to be encoded 1, searches only the neighborhood of that position, and detects the motion vector 64.
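How curve coefficients could restrict the search can be sketched as follows. The patent does not specify the form of the curve, so the polynomial trajectory x(t), y(t), the helper names `poly`, `estimated_displacement`, and `candidate_vectors`, and the ±1 neighborhood are all assumptions for illustration.

```python
def poly(coeffs, t):
    """Evaluate c0 + c1*t + c2*t**2 + ... using Horner's rule."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * t + c
    return value


def estimated_displacement(cx, cy, t_cur, t_ref):
    """Displacement from the current frame to the reference frame along the
    fitted trajectory (x(t), y(t)), rounded to whole pixels."""
    return (round(poly(cx, t_ref) - poly(cx, t_cur)),
            round(poly(cy, t_ref) - poly(cy, t_cur)))


def candidate_vectors(center, radius=1):
    """Only displacements within +/-radius of the estimate are examined."""
    ex, ey = center
    return [(ex + dx, ey + dy)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)]


# Hypothetical trajectory: x(t) = 10 + 3t, y(t) = 20 - t.
estimate = estimated_displacement([10.0, 3.0], [20.0, -1.0], t_cur=4, t_ref=3)
candidates = candidate_vectors(estimate, radius=1)
```

In this example 9 candidates are evaluated instead of the 961 positions of a ±15 full search, and neighboring blocks on the same trajectory receive similar estimates, which is what keeps the resulting vectors continuous.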

[0028] The subsequent operation is exactly the same as in FIG. 2.

[0029] Next, the spatiotemporal image is described with reference to FIGS. 5 and 6.

[0030] FIG. 5 shows the frames of a moving image sequence arranged vertically in time order. In this sequence, object 71 is stationary, while object 72 moves to the right as it recedes into the distance (becoming smaller).

[0031] FIG. 6 shows the frames of FIG. 5 stacked and joined along the time direction; this is called a spatiotemporal image.

[0032] As FIG. 6 shows, each object in the moving image sequence leaves a three-dimensional trajectory in the spatiotemporal image that depends on its motion. FIG. 7 is a cross section of the spatiotemporal image of FIG. 6 cut by a plane perpendicular to the y-axis (y = a).
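The stacking of frames and the cross section at y = a can be sketched with nested lists. The `frames[t][y][x]` layout, the helper names, and the toy scene (one stationary and one moving pixel) are assumptions for the example.

```python
def spatiotemporal_slice(frames, a):
    """frames[t][y][x] is the stacked sequence; return the x-t cross section
    at y = a, indexed as [t][x]."""
    return [list(frame[a]) for frame in frames]


def make_frame(t, width=8, height=5):
    """Toy frame: a stationary pixel and a pixel moving right over time."""
    frame = [[0] * width for _ in range(height)]
    frame[2][6] = 128        # stationary object
    frame[2][1 + t] = 255    # object moving one pixel to the right per frame
    return frame


frames = [make_frame(t) for t in range(4)]   # the spatiotemporal image
xt = spatiotemporal_slice(frames, a=2)
moving_track = [row.index(255) for row in xt]
static_track = [row.index(128) for row in xt]
```

In the cross section the stationary object traces a line parallel to the time axis, while the moving object traces a slanted line whose slope corresponds to its horizontal velocity.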

[0033] The Hough transform is described here as one example of line element extraction in a spatiotemporal image. For simplicity, the description covers a cross section of the spatiotemporal image, namely FIG. 7, and explains how to extract straight lines among the line elements using the two-dimensional Hough transform.

[0034] The Hough transform is a means of detecting, in an image, figures that can be expressed by parameters (for example straight lines, circles, ellipses, and parabolas). The straight-line detection method of Duda and Hart is described here as a representative example.

[0035] A straight line is expressed by the equation ρ = x cos θ + y sin θ, with (ρ, θ) as the parameters describing the line. Here ρ is the length of the perpendicular dropped from the origin to the line, and θ is the angle between that perpendicular and the x-axis. If this line passes through a point (x₀, y₀) in the image, the following equation (2) holds.

[0036]

[Equation 2]

ρ = x₀ cos θ + y₀ sin θ    (2)

This function is a sinusoid in the ρ-θ parameter space. That is, one point in x-y space corresponds to one curve in ρ-θ space; conversely, the curve in ρ-θ space given by equation (2) represents the family of all straight lines through (x₀, y₀) in x-y space. Consequently, when points lying on a single straight line in x-y space are mapped into ρ-θ space, the curves they generate intersect at a single point.

[0037] Specifically, pixels that are candidate elements of a straight line are first extracted from the image by a line detection operator. Letting the x-y coordinates of these pixels be (x_i, y_i), a count is accumulated in ρ-θ space along every corresponding curve ρ = x_i cos θ + y_i sin θ. If a straight line ρ₀ = x cos θ₀ + y sin θ₀ exists in the image, every pixel on it increments the count at the point (ρ₀, θ₀) in ρ-θ space, so a peak should appear at (ρ₀, θ₀). The straight line is detected by detecting this peak.
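The voting procedure can be sketched as follows. This is a minimal illustration of the (ρ, θ) accumulator, not the patent's implementation: the function names, the dictionary accumulator, the one-degree θ resolution, and the unit ρ bin width are assumptions.

```python
import math


def hough_lines(points, n_theta=180, rho_step=1.0):
    """Vote in (rho, theta) space for every candidate point.

    Returns the accumulator as {(rho_index, theta_index): votes}, where
    theta = pi * theta_index / n_theta and rho = rho_index * rho_step.
    """
    acc = {}
    for x, y in points:
        for k in range(n_theta):
            theta = math.pi * k / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            key = (round(rho / rho_step), k)
            acc[key] = acc.get(key, 0) + 1
    return acc


def strongest_line(acc, n_theta=180, rho_step=1.0):
    """Return (rho, theta, votes) of the accumulator peak."""
    (r, k), votes = max(acc.items(), key=lambda item: item[1])
    return r * rho_step, math.pi * k / n_theta, votes


# Candidate pixels of a stationary object in an x-t cross section:
# the same x = 5 in 50 consecutive frames.
points = [(5, t) for t in range(50)]
rho, theta, votes = strongest_line(hough_lines(points))
```

Here the peak lies at ρ = 5, θ = 0, that is, the line x = 5 parallel to the time axis; a moving object would produce a peak at a nonzero θ determined by its velocity.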

[0038]

[Effects of the Invention] As described above, when obtaining the vertex motion vectors of the polygonal patches in the image to be encoded, the present invention determines the direction of line elements along which luminance values are similar in the time direction and refers to it when detecting the amount of motion. This narrows the motion vector search range and reduces the computation required for the search, while preserving the spatial and temporal continuity of the motion vectors, so that edges in the decoded image are natural and close to the original. Furthermore, because the spatial and temporal continuity of the detected motion vectors is preserved, interpolating motion vectors within that motion vector field reduces the distortion of the predicted image after interpolation.

[Brief description of drawings]

FIG. 1 is a diagram showing the configuration of an encoder using the image coding method according to one embodiment of the present invention.

FIG. 2 is a diagram showing the configuration of an encoder for the conventional block-based motion-compensated predictive coding method.

FIG. 3 is a diagram showing the configuration of an encoder for the conventional motion-compensated predictive coding method using motion vector interpolation.

FIG. 4 is a diagram showing the method of motion vector interpolation.

FIG. 5 is a diagram in which the frames of a moving image sequence are arranged vertically in time order.

FIG. 6 is a diagram in which the frames of the sequence of FIG. 5 are stacked and joined along the time direction.

FIG. 7 is a cross section of the spatiotemporal image of FIG. 6 cut by a plane perpendicular to the y-axis.

[Explanation of symbols]

1: image to be encoded
2: polygon pattern
3: motion detection unit
4: vertex motion vector
5: frame memory
6: locally decoded image
7: region segmentation unit
8: region shape reference image
9: shape motion compensation unit
10: polygon pattern
11: motion-compensated predicted shape
12: adaptive motion vector interpolation unit
13: pixel-unit motion vector
14: pixel-unit motion compensation unit
15: motion-compensated predicted image
16: subtractor
17: motion-compensated prediction error
18: spatial redundancy compression unit
19: compressed difference data
20: difference data encoding unit
21: difference coded data
22: difference data expansion unit
23: expanded difference image
24: adder
25: locally decoded image
26: motion vector encoding unit
27: motion vector coded data
28: multiplexing unit
29: multiplexed data
30: region segmentation unit
31: region shape image
50: square pattern
51: motion detection unit
52: motion vector
53: block motion compensation unit
61: spatiotemporal image
62: luminance value continuity information
63: motion detection unit
64: motion vector
65: luminance value continuity information detector
66: frame memory
71: stationary object
72: moving object

Claims

[Claims]

1. A motion-compensated predictive coding method for moving images, in which an image to be encoded is divided into polygonal patches, the amount of motion between the image to be encoded and a prediction reference image is detected for each polygonal patch, motion compensation is performed to generate a predicted image, and the difference between the predicted image and the image to be encoded is encoded, characterized in that, when the vertex motion vectors of the polygonal patches in the image to be encoded are obtained, continuity information on the luminance values in a plurality of temporally consecutive encoded reference images, namely the direction of line elements along which luminance values are similar, is obtained for each pixel as preprocessing for motion vector detection, and the direction of the line elements is referred to when the amount of motion is detected, thereby constraining the motion estimation search range.

2. The motion-compensated predictive coding method for moving images according to claim 1, wherein, in obtaining the direction of line elements along which pixel luminance values are similar in the time direction within the plurality of encoded reference images, the direction along which luminance values are similar across the temporally consecutive encoded reference images is approximated by a function.

3. The motion-compensated predictive coding method for moving images according to claim 2, wherein a three-dimensional Hough transform is used to approximate by a function the direction of line elements along which pixel luminance values are similar in the time direction within the plurality of encoded reference images.

4. A motion-compensated predictive encoder for moving images, in which an image to be encoded is divided into polygonal patches, the amount of motion between the image to be encoded and a prediction reference image is detected for each polygonal patch, motion compensation is performed to generate a predicted image, and the difference between the predicted image and the image to be encoded is encoded, the encoder comprising: a frame memory that stores images to be encoded and joins a newly input image to be encoded with a plurality of already stored images to be encoded that are temporally consecutive with it, to form a spatiotemporal image; luminance value continuity information extraction means that receives the spatiotemporal image, extracts line elements representing the directions along which luminance values are similar in the time direction, and outputs luminance value continuity information; and motion detection means that receives the image to be encoded and the luminance value continuity information and obtains a motion vector for each block.