JP2007228560A

JP2007228560A - Moving picture coding method and moving picture coding apparatus

Info

Publication number: JP2007228560A
Application number: JP2007001782A
Authority: JP
Inventors: Shintaro Kudo; 慎太郎工藤; Seishi Abe; 清史安倍; Shinya Sumino; 眞也角野; Hiroaki Toida; 博明樋田
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2006-01-25
Filing date: 2007-01-09
Publication date: 2007-09-06

Abstract

Problem: To provide a moving picture coding apparatus that prevents the image quality degradation caused by reduced motion vector prediction accuracy in temporal direct mode coding, and that compresses moving pictures efficiently.
Solution: The moving picture coding apparatus includes, as direct mode processing for B pictures: a temporal direct mode processing unit that predicts and generates the motion vector of a target block by referring to a motion vector of a temporally nearby coded picture; a spatial direct mode processing unit that predicts and generates the motion vector of the target block by referring to a motion vector of a coded block located in the spatial periphery of the target block; and a temporal direct mode prohibition determination unit that determines, according to conditions of the coding target, whether use of the temporal direct mode is prohibited. When the determination unit prohibits the temporal direct mode, direct mode coding of the coding target is performed using only the spatial direct mode processing unit.
Selected drawing: Figure 1

Description

The present invention relates to a moving picture coding method and a moving picture coding apparatus, and in particular to a technique for preventing the image quality degradation caused by reduced motion vector prediction accuracy in the direct mode.

In recent years the multimedia era, in which audio, images, and other content are handled in an integrated manner, has arrived, and conventional information media, that is, means of conveying information to people such as newspapers, magazines, television, radio, and telephones, have come to be treated as subjects of multimedia. In general, multimedia refers to representing not only text but also graphics, audio, and especially images in association with one another. To treat the conventional information media above as multimedia, representing their information in digital form is a prerequisite.

However, when the amount of information carried by each of these media is estimated as digital data, text requires only 1 to 2 bytes per character, whereas audio requires 64 kbit per second (telephone quality) and moving pictures require 100 Mbit or more per second (current television reception quality). Handling such enormous amounts of information in digital form without modification is therefore impractical for these media. For example, videophones are already in practical use over the Integrated Services Digital Network (ISDN), which offers transmission rates of 64 kbit/s to 1.5 Mbit/s, but the video from a television camera cannot be sent over ISDN as it is.

What is needed, therefore, is information compression technology. Videophones, for example, use the video compression techniques of the H.261 and H.263 standards recommended by the ITU-T (International Telecommunication Union Telecommunication Standardization Sector). With the information compression technology of the MPEG-1 standard, image information can also be stored together with audio information on an ordinary music CD (compact disc).

Here, MPEG (Moving Picture Experts Group) is an international standard for moving picture signal compression standardized by ISO/IEC (International Organization for Standardization / International Electrotechnical Commission). MPEG-1 is a standard that compresses a moving picture signal down to 1.5 Mbps, that is, compresses the information of a television signal to about one hundredth. Because the MPEG-1 standard targets a medium quality achievable at a transmission rate of mainly about 1.5 Mbps, MPEG-2, standardized to meet the demand for higher image quality, achieves TV broadcast quality at 2 to 15 Mbps. Furthermore, the working group that had advanced the standardization of MPEG-1 and MPEG-2 (ISO/IEC JTC1/SC29/WG11) standardized MPEG-4, which achieves compression ratios exceeding those of MPEG-1 and MPEG-2, enables coding, decoding, and manipulation on a per-object basis, and realizes new functions needed in the multimedia era. MPEG-4 initially aimed at standardizing low-bit-rate coding methods, but it has since been extended to more general-purpose coding that also covers interlaced images and high bit rates.

In addition, in 2003, MPEG-4 AVC and ITU H.264 were standardized jointly by ISO/IEC and the ITU-T as a next-generation image coding scheme with a higher compression ratio (see, for example, Non-Patent Document 1). The H.264 standard also defines a High Profile suited to HD (High Definition) images and the like, and its adoption as the compression standard for next-generation media such as BD-ROM (Blu-ray Disc ROM) has been decided.

In general, moving picture coding compresses the amount of information by reducing redundancy in the temporal and spatial directions. In inter-picture predictive coding, which aims to reduce temporal redundancy, motion is detected and a predicted image is created block by block with reference to a preceding or following picture, and the difference between the resulting predicted image and the picture to be coded is then coded. Here, "picture" is a term for a single screen: it means a frame in a progressive image, and a frame or a field in an interlaced image. An interlaced image is one in which each frame consists of two fields captured at different times. In coding and decoding interlaced images, a frame can be processed as a frame, processed as two fields, or processed with a frame structure or field structure chosen for each block in the frame.

A picture that has no reference picture and is coded by intra-picture prediction is called an I picture. A picture that is coded by inter-picture prediction with reference to only one picture is called a P picture. A picture that can be coded by inter-picture prediction with reference to two pictures at the same time is called a B picture. A B picture can reference two pictures in any combination of pictures that precede or follow it in display time. The reference picture can be specified for each block, the basic unit of coding and decoding; the reference picture described earlier in the coded bitstream is distinguished as the first reference picture, and the one described later as the second reference picture. As a condition for coding and decoding these pictures, however, the referenced pictures must already have been coded and decoded.

Motion-compensated inter-picture predictive coding is used to code P pictures and B pictures. It is a coding scheme that applies motion compensation to inter-picture predictive coding. Rather than predicting simply from the pixel values of the reference frame, motion compensation detects the amount of motion of each part of a picture (hereinafter called a motion vector) and performs prediction that takes this motion into account, which improves prediction accuracy and reduces the amount of data. For example, the amount of data is reduced by detecting the motion vector of the picture to be coded and coding the prediction residual between the picture to be coded and a prediction value shifted by that motion vector. Because this scheme requires motion vector information at decoding time, the motion vectors are also coded and then recorded or transmitted.

Motion vectors are detected in units of macroblocks. Specifically, the macroblock on the side of the picture to be coded (the base block) is held fixed, the macroblock on the reference picture side is moved within the search range, and the motion vector is detected by finding the position of the reference block that most closely resembles the base block.
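The search described above can be sketched as an exhaustive block-matching loop. This is only an illustration under stated assumptions: the description does not name a similarity measure, so the sum of absolute differences (SAD) is assumed here, and the function names `sad` and `find_motion_vector`, the list-of-lists picture layout, and the tiny block size are all hypothetical.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a block of cur and a block of ref."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def find_motion_vector(cur, ref, cx, cy, size, search_range):
    """Full search: return the (dx, dy) with the smallest SAD in the search range."""
    height, width = len(ref), len(ref[0])
    best_mv = (0, 0)
    best_cost = sad(cur, ref, cx, cy, cx, cy, size)  # zero displacement as baseline
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block would leave the reference picture
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv


# A bright 2x2 patch sits at (x=1, y=1) in the reference picture
# and at (x=3, y=2) in the picture to be coded.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for j in range(2):
    for i in range(2):
        ref[1 + j][1 + i] = 200
        cur[2 + j][3 + i] = 200

mv = find_motion_vector(cur, ref, cx=3, cy=2, size=2, search_range=3)
print(mv)  # displacement from the base block to its best match in the reference
```

The base block stays fixed at (3, 2) while the candidate position in the reference picture moves, which mirrors the fixed/moving roles in the paragraph above.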

In the H.264 scheme, a coding mode called the direct mode can be selected when coding B pictures. There are two direct mode methods: a temporal method (temporal direct mode) and a spatial method (spatial direct mode). For a given slice (group of blocks) to be coded, only one of the two, temporal or spatial, can be used.

In the temporal direct mode, the block to be coded has no motion vector of its own. Instead, the motion vector of another, already-coded picture is taken as a reference motion vector and scaled according to the display-time relationship between pictures, and the motion vector used for the block to be coded is thereby predicted and generated.

FIG. 11 is a schematic diagram showing how motion vectors are predicted and generated in the temporal direct mode. In the figure, P denotes a P picture and B denotes a B picture, and the number attached to each picture type indicates the display order of the picture. The pictures P1, B2, B3, and P4 have display order information T1, T2, T3, and T4, respectively. The case described here is one in which block BL0 of picture B3 shown in FIG. 11 is coded in the temporal direct mode.

The motion vector MV1 of block BL1 is used; BL1 is located at the same position as block BL0 and belongs to picture P4, an already-coded picture located near picture B3 in display time. MV1 is the motion vector that was used when block BL1 was coded, and it references picture P1. In this case, the motion vectors used to code block BL0 are motion vector MV_F with respect to picture P1 and motion vector MV_B with respect to picture P4. Letting MV be the magnitude of motion vector MV1, MVf the magnitude of MV_F, and MVb the magnitude of MV_B, MVf and MVb are given by equations (1) and (2), respectively.

MVf = (T3 - T1) / (T4 - T1) × MV ... (1)
MVb = (T3 - T4) / (T4 - T1) × MV ... (2)

Using motion vectors MV_F and MV_B obtained by scaling motion vector MV1 in this way, motion compensation of block BL0 is performed from the reference pictures P1 and P4.

If block BL1, which is referenced for the scaling process, was coded by intra-picture prediction and therefore has no motion vector, motion compensation is performed with the magnitudes of both MV_F and MV_B taken to be 0.
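The scaling of equations (1) and (2), together with the zero-vector rule for an intra-coded co-located block, can be sketched as follows. This is a minimal illustration, not the encoder's actual arithmetic: the function name `temporal_direct_mvs`, the use of `None` to mark a missing motion vector, and the exact rational arithmetic (a real codec would use fixed-point rounding) are assumptions.

```python
from fractions import Fraction


def temporal_direct_mvs(mv, t_cur, t_fwd, t_bwd):
    """Scale the co-located motion vector as in equations (1) and (2).

    mv    : (x, y) of the co-located block (MV1), or None if it was intra coded
    t_cur : display order of the picture being coded (T3)
    t_fwd : display order of the picture MV1 references (T1)
    t_bwd : display order of the picture holding the co-located block (T4)
    """
    if mv is None:
        # Intra-coded co-located block: both vectors are taken to be zero.
        return (0, 0), (0, 0)
    span = t_bwd - t_fwd
    scale_f = Fraction(t_cur - t_fwd, span)  # (T3 - T1) / (T4 - T1)
    scale_b = Fraction(t_cur - t_bwd, span)  # (T3 - T4) / (T4 - T1)
    mv_f = tuple(scale_f * c for c in mv)
    mv_b = tuple(scale_b * c for c in mv)
    return mv_f, mv_b


# Pictures P1, B2, B3, P4 with display order T1..T4 = 1..4; block BL0 is in B3.
mv_f, mv_b = temporal_direct_mvs((12, -6), t_cur=3, t_fwd=1, t_bwd=4)
print(mv_f, mv_b)
```

With T1 = 1, T3 = 3, and T4 = 4 the forward scale is 2/3 and the backward scale is -1/3, so the backward vector points in the opposite direction from MV1.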

In the spatial direct mode, as in the temporal direct mode, the block to be coded has no motion vector of its own. Coding is performed using the motion vectors of already-coded blocks located spatially around the block to be coded.

FIG. 12 is a schematic diagram showing how motion vectors are predicted and generated in the spatial direct mode. In the figure, P denotes a P picture and B denotes a B picture, and the number attached to each picture type indicates the display order of the picture. The case described here is one in which block BL0 of picture B3 shown in FIG. 12 is coded in the spatial direct mode.

Among the motion vectors MVA1, MVB1, and MVC1 of the already-coded blocks that contain the three pixels A, B, and C around block BL0, those that reference the already-coded picture nearest in display time to the picture being coded are chosen as candidates for the motion vector of the block to be coded. If three motion vectors are chosen, their median is selected as the motion vector of the block to be coded. If two are chosen, their average is computed and used as the motion vector of the block to be coded. If only one is chosen, that motion vector is used as the motion vector of the block to be coded.
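The candidate selection rule in the paragraph above can be sketched as below. Several details are assumptions made only for illustration: the median is taken per component, the average is kept as an exact fraction (the passage does not specify rounding), the case with no candidate returns `None` because the passage does not cover it, and the name `spatial_direct_mv` and the picture labels are hypothetical.

```python
from fractions import Fraction
from statistics import median


def spatial_direct_mv(neighbors, nearest_ref):
    """Pick the motion vector for the block to be coded from its neighbors.

    neighbors   : list of (mv, ref_picture) for the coded blocks containing A, B, C
    nearest_ref : the already-coded picture nearest in display time
    """
    candidates = [mv for mv, ref in neighbors if ref == nearest_ref]
    if len(candidates) == 3:
        # Three candidates: component-wise median (assumed interpretation).
        return tuple(median(c[k] for c in candidates) for k in range(2))
    if len(candidates) == 2:
        # Two candidates: their average.
        return tuple(Fraction(candidates[0][k] + candidates[1][k], 2) for k in range(2))
    if len(candidates) == 1:
        # One candidate: use it directly.
        return candidates[0]
    return None  # assumption: this case is not covered by the passage


# Two neighbors reference the nearest picture, one references a farther picture.
two = spatial_direct_mv([((4, 2), "near"), ((9, 9), "far"), ((6, -2), "near")], "near")
three = spatial_direct_mv([((1, 5), "near"), ((3, 1), "near"), ((2, 9), "near")], "near")
print(two, three)
```

Passing the nearest picture in explicitly keeps the sketch free of any assumption about how display-time distance is measured.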

In the example shown in FIG. 12, motion vectors MVA1 and MVC1 were obtained with reference to picture P2, and motion vector MVB1 was obtained with reference to picture P1. Accordingly, the average of MVA1 and MVC1, which reference picture P2, the already-coded picture nearest in display time to the picture being coded, is computed and used as MV_F, the first motion vector of the block to be coded. The second motion vector, MV_B, is obtained in the same way.
Non-Patent Document 1: ISO/IEC 14496-10, International Standard: "Information technology - Coding of audio-visual objects - Part 10: Advanced video coding" (2003-12-01).

Most film media such as movies consist of 24 frames per second (24 fps). To broadcast such 24 fps material on television, the display time interval of the pictures must be converted, from 24 fps to 29.97 fps, because television (NTSC) runs at 29.97 fps. This conversion is called telecine conversion (2-3 conversion).

FIG. 13 is a diagram showing the telecine conversion method.
In telecine conversion, as shown in FIG. 13, the first 24 fps frame is converted into two fields, the next frame into three fields, and subsequent frames alternately into two and three fields. Every two of the resulting fields are then combined into one frame, which converts 24 fps into 30 fps.
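The 2-3 field pattern can be sketched as follows. This is a simplified illustration: field parity (top and bottom fields) and the later frame dropping toward 29.97 fps are ignored, and the function name `pulldown_2_3` and the letter labels for film frames are hypothetical.

```python
def pulldown_2_3(film_frames):
    """Expand film frames into fields (2, 3, 2, 3, ...) and pair fields into frames."""
    fields = []
    for index, frame in enumerate(film_frames):
        repeat = 2 if index % 2 == 0 else 3  # alternate two fields, three fields
        fields.extend([frame] * repeat)
    # Every two consecutive fields form one output frame.
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]


video = pulldown_2_3(["A", "B", "C", "D"])
print(video)
```

Four film frames yield ten fields and therefore five output frames, which is the 24-to-30 ratio; two of the five output frames mix fields from different film frames.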

Because the 30 fps produced by telecine conversion differs from the 29.97 fps of NTSC, frames are dropped at fixed intervals to bring the timing into line.

After this conversion, images of the same field are displayed at different display times, as with field 1-0 of 30P frame 1 and field 1-0 of 30P frame 2 in FIG. 13. This means that the display time intervals of the telecine-converted images do not exactly match the recording time intervals of the images being displayed.

When a moving picture that has undergone such telecine conversion is coded using the temporal direct mode, a problem arises in the scaling process of motion vector prediction.

That is, as described above, temporal direct mode coding predicts motion vectors using equations (1) and (2). This means that the calculation is based on the display order information of the moving picture being coded, not on the recording time intervals of the images being displayed.

Consequently, when the display order information and the recording time intervals do not match, the scaling process in temporal direct mode motion vector prediction is no longer meaningful. FIG. 14 shows an example.

FIG. 14 is a diagram showing temporal direct mode coding of telecine-converted images. Here, I, P, and B mean that the field is coded as an I picture, a P picture, and a B picture, respectively, and the numbers indicate the display order. The parentheses indicate the corresponding field in FIG. 13.

In this case, in the temporal direct mode coding of B4, the motion vector is predicted by the following equations, using the motion vector MV_P6 of block BL3, located at the same position as block BL2 of field P6, an already-coded picture located near field B4 in display time, together with the display order information Tb6, Tp2, and Ti0 of field B4, field P6, and field I0 (the field that BL3 references).

MVf_b6 = (Tb6 - Ti0) / (Tp2 - Ti0) × MV_P2 ... (3)
MVb_b6 = (Tb6 - Tp2) / (Tp2 - Ti0) × MV_P2 ... (4)

Letting Ta be the interval of the field display order information, equations (3) and (4) above can be written as equations (5) and (6) below.

MVf_b6 = (4 × Ta) / (6 × Ta) × MV_P2 = (2/3) × MV_P2 ... (5)
MVb_b6 = (-2 × Ta) / (6 × Ta) × MV_P2 = -(1/3) × MV_P2 ... (6)

Because these results for MVf_b6 and MVb_b6 use display order information that deviates from the times at which the pictures were actually recorded, the scaling is incorrect. For example, when an object in the picture moves at a constant rate in the original material, as in FIG. 15, the recording times of field I0, field B4, and field P6 are equally spaced, so if the object has moved a distance L by field P6 relative to field I0, it has moved L/2 by field B4. If the above scaling were performed on the basis of recording time, the intervals between times Ti0, Tb6, and Tp2 would be equal, and the predicted motion vectors would be as in equations (7) and (8) below.

MVf_b6 = (1/2) × MV_P2 ... (7)
MVb_b6 = -(1/2) × MV_P2 ... (8)

This agrees with the scaling of the travel distance in the example of FIG. 15. In moving pictures, the background and objects in a picture move at a constant speed in many cases, so predicting motion vectors on the basis of recording time gives highly accurate predictions.
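The gap between the two scalings can be checked numerically. The sketch below assumes display order values 0, 4, and 6 (in units of Ta) for fields I0, B4, and P6, equally spaced recording times 0, 1, and 2, and an arbitrary travel distance L = 12; the function name `direct_scales` and these sample values are illustrative only.

```python
from fractions import Fraction


def direct_scales(t_cur, t_fwd, t_bwd):
    """Return the forward and backward scale factors used by temporal direct mode."""
    span = t_bwd - t_fwd
    return Fraction(t_cur - t_fwd, span), Fraction(t_cur - t_bwd, span)


# Display order after telecine conversion: I0, B4, P6 (equations (5) and (6)).
display = direct_scales(t_cur=4, t_fwd=0, t_bwd=6)
# Recording times of the same three fields are equally spaced (equations (7) and (8)).
recorded = direct_scales(t_cur=1, t_fwd=0, t_bwd=2)

L = 12  # distance the object moves between I0 and P6 (sample value)
predicted = display[0] * L   # forward displacement predicted from display order
actual = recorded[0] * L     # forward displacement under constant-speed motion
error = predicted - actual
print(display, recorded, error)
```

For this sample the display-order scaling predicts a forward displacement of 8 where constant-speed motion gives 6, so the prediction overshoots by one sixth of L.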

However, when temporal direct mode motion vector prediction is applied to a telecine-converted moving picture as described above, the prediction uses display order information that deviates from the recording time intervals. The scaling is therefore incorrect and prediction accuracy drops, which in turn degrades image quality.

Thus there is a problem that image quality degrades when the temporal direct mode is used for a moving picture whose display time intervals have been converted, by telecine conversion or the like, in a way that makes the display order information deviate from the recording time intervals.

Also, when the already-coded picture located nearby in display time and used for motion vector reference is an I picture, temporal direct mode coding proceeds in the same way as when block BL1 above is intra-coded. In other words, no motion vector prediction in the (spatial) direct mode is performed for any block of the picture being coded, so prediction accuracy drops and image quality degrades.

Accordingly, an object of the present invention is to provide a moving picture coding method and a moving picture coding apparatus that prevent the image quality degradation caused by reduced motion vector prediction accuracy in temporal direct mode coding and that can compress moving pictures efficiently.

To solve the above problems, the moving picture coding method according to the present invention is a method of coding a moving picture that includes B pictures, which are predictively coded with reference to a plurality of coded pictures located temporally before or after them. The method includes: a temporal direct mode processing step of, as direct mode processing for the B picture, predicting and generating the motion vector of a target block by referring to a motion vector of a temporally nearby coded picture; and a temporal direct mode prohibition determination step of determining, according to conditions of the coding target, whether use of the temporal direct mode is prohibited. When the temporal direct mode is prohibited in the determination step, the predictive coding is performed on the coding target using a processing step other than the temporal direct mode processing step.

Thus, when image quality degradation is predicted to occur if temporal direct mode coding were used, temporal direct mode coding is not performed. This prevents the image quality degradation caused by reduced motion vector prediction accuracy in temporal direct mode coding and, as a result, makes it possible to compress moving pictures efficiently.

In the moving picture coding method according to the present invention, the processing steps other than the temporal direct mode processing step may include a spatial direct mode processing step of, as direct mode processing for the B picture, predicting and generating the motion vector of the target block by referring to a motion vector of a coded block located in the spatial periphery of the target block; and when the temporal direct mode is prohibited in the determination step, the predictive coding may be performed on the coding target using the spatial direct mode processing step.

This prevents the image quality degradation caused by reduced motion vector prediction accuracy in temporal direct mode coding and, as a result, makes it possible to compress moving pictures efficiently.

In the moving picture coding method according to the present invention, the temporal direct mode prohibition determination step may determine that use of the temporal direct mode is prohibited when the time intervals between the pictures constituting the coding target are determined not to be constant.

This makes it possible to prevent the image quality degradation caused by reduced motion vector prediction accuracy in temporal direct mode coding, which arises when the time intervals at which pictures were recorded as images are not constant.

In the moving picture coding method according to the present invention, the temporal direct mode prohibition determination step may determine that use of the temporal direct mode is prohibited when the coding target is determined to have undergone conversion of the picture display time interval.

In a moving picture whose picture display time interval has been converted, the time intervals between the constituent pictures may not be constant; this too makes it possible to prevent the image quality degradation caused by reduced motion vector prediction accuracy in temporal direct mode coding in such a case.

In the moving picture coding method according to the present invention, the temporal direct mode prohibition determination step may determine that use of the temporal direct mode is prohibited when the coded picture used for motion vector reference in the temporal direct mode is determined to be an I picture, which is coded by intra-picture prediction.

This too makes it possible to prevent the image quality degradation caused by reduced motion vector prediction accuracy in temporal direct mode coding, which arises when an I picture is referenced.

In the moving picture coding method according to the present invention, the temporal direct mode prohibition determination step may determine whether any of at least two of the following cases applies: the time intervals between the pictures constituting the coding target are determined not to be constant; the coding target is determined to have undergone conversion of the picture display time interval; and the coded picture used for motion vector reference in the temporal direct mode is determined to be an I picture, which is coded by intra-picture prediction. If any of those at least two cases applies, use of the temporal direct mode is determined to be prohibited.

これにより、時間的ダイレクトモード符号化における動きベクトル予測の精度の低下を確実に防ぐことができ、画質の劣化をより防ぐことが可能となる。 As a result, it is possible to reliably prevent a reduction in the accuracy of motion vector prediction in temporal direct mode encoding, and to further prevent deterioration in image quality.

なお、本発明は、このような画像符号化方法として実現することができるだけでなく、このような画像符号化方法が含む特徴的なステップを手段とする画像符号化装置として実現したり、当該画像符号化装置が備える手段を集積化した集積回路として実現したり、それらのステップをコンピュータに実行させるプログラムとして実現したりすることもできる。そして、そのようなプログラムは、ＣＤ−ＲＯＭ等の記録媒体やインターネット等の伝送媒体を介して配信することができるのはいうまでもない。 Note that the present invention can be realized not only as such an image encoding method but also as an image encoding apparatus using the characteristic steps included in such an image encoding method as a means, The means included in the encoding apparatus can be realized as an integrated circuit, or can be realized as a program for causing a computer to execute these steps. Needless to say, such a program can be distributed via a recording medium such as a CD-ROM or a transmission medium such as the Internet.

以上の説明から明らかなように、本発明の動画像符号化方法によれば、時間的ダイレクトモード符号化における動きベクトルの予測の精度低下による画質劣化を防ぎ、動画像を効率良く圧縮することが可能となる。 As is apparent from the above description, the moving picture coding method of the present invention makes it possible to prevent image quality deterioration caused by reduced accuracy of motion vector prediction in temporal direct mode coding, and to compress a moving picture efficiently.

よって、本発明により、圧縮率が高く、高画質の動画像の配信が可能となり、インターネットが普及してきた今日における本願発明の実用的価値は極めて高い。 Therefore, according to the present invention, it is possible to deliver a high-quality moving image with a high compression rate, and the practical value of the present invention in the present day when the Internet has become widespread is extremely high.

以下本発明の実施の形態について、図面を参照しながら説明する。 Embodiments of the present invention will be described below with reference to the drawings.

（実施の形態１） (Embodiment 1)
図１は、本発明の実施の形態１における動画像符号化装置１００ａの機能構成を示すブロック図である。 FIG. 1 is a block diagram showing a functional configuration of a moving picture coding apparatus 100a according to Embodiment 1 of the present invention.

動画像符号化装置１００ａは、ＡＶ機器等から入力される画像を圧縮符号化する装置であり、図１に示されるように、予測残差符号化部１０１と、符号列生成部１０２と、予測残差復号化部１０３と、面内予測部１０４と、フレームメモリ１０５と、動き検出部１０６と、動き補償部１０７と、動きベクトル記憶部１０８と、時間的ダイレクトモード処理部１０９と、空間的ダイレクトモード処理部１１０と、ダイレクト処理判定部１１１と、減算部１１２と、モード選択部１１３と、加算部１１４と、フレームメモリ１１５と、時間的ダイレクトモード禁止判定部１１６と、モード選択部１２１とを備える。 The moving picture coding apparatus 100a is an apparatus that compresses and encodes images input from an AV device or the like. As shown in FIG. 1, it includes a prediction residual encoding unit 101, a code string generation unit 102, a prediction residual decoding unit 103, an in-plane prediction unit 104, a frame memory 105, a motion detection unit 106, a motion compensation unit 107, a motion vector storage unit 108, a temporal direct mode processing unit 109, a spatial direct mode processing unit 110, a direct processing determination unit 111, a subtraction unit 112, a mode selection unit 113, an addition unit 114, a frame memory 115, a temporal direct mode prohibition determination unit 116, and a mode selection unit 121.

動き検出部１０６は、符号化済みの再構成画像データを参照ピクチャとして用いて、そのピクチャ内の探索領域において最適と予測される位置を示す動きベクトルの検出を行う。 The motion detection unit 106 uses the encoded reconstructed image data as a reference picture, and detects a motion vector indicating a position predicted to be optimal in the search area in the picture.

動き補償部１０７は、動き検出部１０６で検出された動きベクトルを用いてブロックの画面間符号化における符号化モードを決定し、この符号化モードに基づいて予測画像データを生成する。この符号化モードは、マクロブロックをどのような方法で符号化するか示すものである。 The motion compensation unit 107 determines a coding mode in inter-frame coding of a block using the motion vector detected by the motion detection unit 106, and generates predicted image data based on the coding mode. This encoding mode indicates how the macroblock is encoded.

動きベクトル記憶部１０８は、動き検出部１０６で検出された動きベクトルを記憶する。 The motion vector storage unit 108 stores the motion vector detected by the motion detection unit 106.

時間的ダイレクトモード禁止判定部１１６は、符号化対象の動画像情報から時間的ダイレクトモードの使用の可否、つまり時間的ダイレクトモードの使用禁止の可否を判定し、判定結果をダイレクト処理判定部１１１に通知する。 The temporal direct mode prohibition determination unit 116 determines, from the moving image information to be encoded, whether the temporal direct mode may be used, that is, whether its use is to be prohibited, and notifies the direct processing determination unit 111 of the determination result.

ダイレクト処理判定部１１１は、時間的ダイレクトモード禁止判定部１１６の通知を元に、符号化対象画像に対して、ダイレクト符号化モードとして時間的ダイレクトモードを使用するのか、時間的ダイレクトモード以外の処理を用いて予測符号化を行うのかの判定を行う。 Based on the notification from the temporal direct mode prohibition determination unit 116, the direct processing determination unit 111 determines whether the encoding target image is to use the temporal direct mode as its direct encoding mode or is to be predictively encoded by processing other than the temporal direct mode.

時間的ダイレクトモード処理部１０９は、ダイレクト符号化モードが時間的ダイレクトモードである場合に、動きベクトル記憶部１０８に記憶してある符号化対象画像の表示時間的に近傍に位置する既に符号化済み画像の符号化対象ブロックと同じ位置にあるブロックの動きベクトルを参照して、スケーリング処理を行い動きベクトルの予測を行う。 When the direct encoding mode is the temporal direct mode, the temporal direct mode processing unit 109 refers to the motion vector, stored in the motion vector storage unit 108, of the block located at the same position as the encoding target block in an already encoded picture that is near the encoding target picture in display time, and predicts the motion vector of the target block by scaling that vector.
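
The scaling step described for the temporal direct mode processing unit 109 can be sketched as follows. This is a minimal illustration and not the apparatus's actual implementation: the formula (forward vector = co-located vector × tb/td, backward vector = forward vector − co-located vector) follows the temporal direct convention of H.264/AVC and is an assumption, since this passage does not state the formula. The function name and the integer display times `t_cur`, `t_fwd`, `t_bwd` are hypothetical.

```python
from fractions import Fraction

def temporal_direct_mvs(mv_col, t_cur, t_fwd, t_bwd):
    """Scale the co-located block's motion vector by display-time distances.

    mv_col : (x, y) vector of the co-located block in the backward reference
             picture, assumed to point to the forward reference picture.
    t_cur, t_fwd, t_bwd : integer display times of the current B picture and
             of its forward and backward reference pictures (assumed inputs).
    """
    tb = t_cur - t_fwd            # forward reference -> current picture
    td = t_bwd - t_fwd            # forward reference -> backward reference
    scale = Fraction(tb, td)      # exact ratio, no rounding in this sketch
    mv_fwd = tuple(scale * c for c in mv_col)
    mv_bwd = tuple(f - c for f, c in zip(mv_fwd, mv_col))
    return mv_fwd, mv_bwd

# Picture B11 between P10 and P13: tb = 1, td = 3.
print(temporal_direct_mvs((6, -3), t_cur=11, t_fwd=10, t_bwd=13))
```

Because the result depends only on the ratio tb/td, any mismatch between the nominal display times and the real capture times (for example after a frame drop) feeds directly into the predicted vectors, which is the accuracy problem the prohibition determination addresses.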

空間的ダイレクトモード処理部１１０は、ダイレクト符号化モードが空間的ダイレクトモードである場合に、動きベクトル記憶部１０８に記憶してある符号化対象ブロックの符号化済み隣接ブロックの動きベクトルを参照して動きベクトルの予測を行う。 When the direct coding mode is the spatial direct mode, the spatial direct mode processing unit 110 refers to the motion vector of the coded adjacent block of the coding target block stored in the motion vector storage unit 108. Predict motion vectors.
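
As a rough sketch of the spatial direct mode prediction, the vector of the target block can be derived from the vectors of already encoded neighbouring blocks. The component-wise median used here is an assumption borrowed from the usual H.264/AVC-style median predictor; the passage above only says that neighbouring vectors are referenced, and a real encoder also has to handle reference indices and unavailable neighbours. The function name is hypothetical.

```python
from statistics import median

def spatial_direct_mv(neighbor_mvs):
    """Component-wise median of the neighbouring blocks' motion vectors.

    neighbor_mvs : list of (x, y) vectors of encoded adjacent blocks
                   (e.g. left, upper, and upper-right; assumed layout).
    """
    xs = [mv[0] for mv in neighbor_mvs]
    ys = [mv[1] for mv in neighbor_mvs]
    return (median(xs), median(ys))

print(spatial_direct_mv([(4, 0), (2, 2), (8, -2)]))
```

Unlike the temporal variant, this prediction uses no display-time distances, so it is unaffected by irregular picture intervals.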

モード選択部１２１は、時間的ダイレクトモード処理部１０９の判定結果に基づいて、時間的ダイレクトモード処理部１０９による動きベクトルの予測と空間的ダイレクトモード処理部１１０による動きベクトルの予測とのいずれかを、動き補償部１０７に出力する。 Based on the determination result of the temporal direct mode processing unit 109, the mode selection unit 121 performs either motion vector prediction by the temporal direct mode processing unit 109 or motion vector prediction by the spatial direct mode processing unit 110. To the motion compensation unit 107.

面内予測部１０４は、面内予測の符号化モードとして符号化対象ブロックの隣接画素を利用して予測画像データを生成する。 The in-plane prediction unit 104 generates predicted image data using adjacent pixels of the encoding target block as an encoding mode for in-plane prediction.

モード選択部１１３は、動き補償部１０７が決定した画面間予測の符号化モードと、面内予測部１０４の符号化モードにおいて、符号化効率の良いモードを選択する。 The mode selection unit 113 selects a mode with good coding efficiency among the coding mode of inter-frame prediction determined by the motion compensation unit 107 and the coding mode of the in-plane prediction unit 104.

減算部１１２は、フレームメモリ１１５より読み出された画像データと動き補償部１０７、もしくは面内予測部１０４の予測画像データとの差分を演算し、予測残差画像データを生成する。 The subtraction unit 112 calculates a difference between the image data read from the frame memory 115 and the prediction image data of the motion compensation unit 107 or the in-plane prediction unit 104, and generates prediction residual image data.

予測残差符号化部１０１は、入力された予測残差画像データに対して周波数変換や量子化等の符号化処理を行い、符号化データを生成する。 The prediction residual encoding unit 101 performs encoding processing such as frequency conversion and quantization on the input prediction residual image data to generate encoded data.

符号列生成部１０２は、入力された符号化データに対して可変長符号化等を行い、さらにモード選択部１１３から入力される符号化モード情報等を付加することにより符号列を生成する。 The code string generation unit 102 performs variable-length coding or the like on the input encoded data, and generates a code string by adding the encoding mode information and the like input from the mode selection unit 113.

予測残差復号化部１０３は、入力された符号化データに対して、逆量子化や逆周波数変換等の復号化処理を行い、符号化データを生成する。 The prediction residual decoding unit 103 performs decoding processing such as inverse quantization and inverse frequency transform on the input encoded data, and generates encoded data.

加算部１１４は、予測残差復号化部１０３より入力された復号化差分画像データと、モード選択部１１３が選択したモードの予測画像データとを加算して再構成画像データを生成する。 The addition unit 114 adds the decoded difference image data input from the prediction residual decoding unit 103 and the predicted image data of the mode selected by the mode selection unit 113 to generate reconstructed image data.

フレームメモリ１０５は、加算部１１４で生成された再構成画像データを格納する。 The frame memory 105 stores the reconstructed image data generated by the addition unit 114.
次に、上記のように構成された動画像符号化装置１００ａの動作について説明する。 Next, the operation of the moving picture coding apparatus 100a configured as described above will be described.

図２は、フレームメモリ１１５におけるピクチャの順序を示す説明図であり、特に図２（ａ）は入力された順序を、図２（ｂ）は並び替えられた順序を、それぞれ示す説明図である。ここで、縦線はピクチャを示し、各ピクチャの右下に示される記号は１文字目のアルファベットがピクチャタイプ（Ｉ、Ｐ、Ｂ）を示し、２文字目以降の数字は表示時間順のピクチャ番号を示している。またＰピクチャは、表示時間順で前方にある近傍のＩピクチャ又はＰピクチャを参照ピクチャとし、Ｂピクチャは、表示時間順で前方にある近傍のＩピクチャ、Ｐピクチャ、参照可能なＢピクチャと、表示時間順で後方にある近傍の１枚のＩピクチャ又はＰピクチャを参照ピクチャとして用いるものとしている。 FIG. 2 is an explanatory diagram showing the order of pictures in the frame memory 115; FIG. 2(a) shows the input order and FIG. 2(b) shows the rearranged order. Each vertical line represents a picture. In the symbol at the lower right of each picture, the first letter indicates the picture type (I, P, or B) and the following digits indicate the picture number in display time order. A P picture uses a nearby I picture or P picture that precedes it in display time order as a reference picture. A B picture uses as reference pictures nearby I pictures, P pictures, and referenceable B pictures that precede it in display time order, together with one nearby I picture or P picture that follows it in display time order.

入力画像は、例えば図２（ａ）に示されるように表示時間順にピクチャ単位でフレームメモリ１１５に入力される。フレームメモリ１１５に入力された各ピクチャは、符号化するピクチャタイプが決定されると、例えば図２（ｂ）に示されるように符号化が行われる順に並び替えられる。この符号化順への並び替えは、画面間予測符号化における参照関係に基づいて行われ、参照ピクチャとして用いられるピクチャが先に符号化されるように並び替えられる。 For example, as shown in FIG. 2A, the input image is input to the frame memory 115 in picture units in order of display time. When the picture type to be encoded is determined, the pictures input to the frame memory 115 are rearranged in the order of encoding as shown in FIG. 2B, for example. This rearrangement to the coding order is performed based on the reference relationship in the inter-frame predictive coding, and is rearranged so that the picture used as the reference picture is first coded.
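
The rearrangement from display order into coding order can be illustrated with a small sketch. It assumes the simple case described in this embodiment, in which B pictures are not used as references, so each run of B pictures is deferred until the following I or P picture has been emitted. The picture labels follow the naming of FIG. 2 (type letter plus display number); the exact sequence in FIG. 2 is not reproduced here, and the function name is hypothetical.

```python
def to_coding_order(display_order):
    """Reorder pictures so that every reference picture is coded first.

    display_order : list of labels such as "I1", "B2", "P4" in display order.
    B pictures are held back until the next I/P picture (their backward
    reference) has been placed in the output.
    """
    coded, pending_b = [], []
    for pic in display_order:
        if pic.startswith("B"):
            pending_b.append(pic)
        else:
            coded.append(pic)         # I or P picture goes first
            coded.extend(pending_b)   # then the B pictures that precede it
            pending_b.clear()
    coded.extend(pending_b)           # trailing B pictures, if any
    return coded

print(to_coding_order(["I1", "B2", "B3", "P4", "B5", "B6", "P7"]))
```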

フレームメモリ１１５で並び替えが行われた各ピクチャは、例えば水平１６×垂直１６画素のグループに分割されたマクロブロック単位で読み出される。また、動き補償および動き検出は、例えば水平１６×垂直１６画素、水平８×垂直１６画素、水平１６×垂直８画素、水平８×垂直８画素のグループに分割されたブロック単位で行っている。 Each picture rearranged in the frame memory 115 is read in units of macroblocks divided into groups of horizontal 16 × vertical 16 pixels, for example. Also, motion compensation and motion detection are performed in units of blocks divided into groups of, for example, horizontal 16 × vertical 16 pixels, horizontal 8 × vertical 16 pixels, horizontal 16 × vertical 8 pixels, and horizontal 8 × vertical 8 pixels.

以降の動作については、符号化対象のピクチャがＢピクチャである場合について説明する。 The subsequent operation will be described in the case where the picture to be encoded is a B picture.

Ｂピクチャでは、２方向参照を用いた画面間予測符号化を行っている。例えば、図２（ａ）に示される例でピクチャＢ１１の符号化処理を行う場合、表示時間順で前方にある参照ピクチャはピクチャＰ１０、Ｐ７、Ｐ４、表示時間順で後方にある参照ピクチャはピクチャＰ１３となる。ここでは、Ｂピクチャが他のピクチャの符号化時に、参照ピクチャとして用いられない場合を考える。 For B pictures, inter-picture predictive coding using two-way reference is performed. For example, when the encoding process of the picture B11 is performed in the example shown in FIG. 2A, the reference pictures ahead in the display time order are the pictures P10, P7, P4, and the reference pictures behind the display time order are the pictures. P13. Here, consider a case where a B picture is not used as a reference picture when another picture is encoded.

フレームメモリ１１５より読み出されたピクチャＢ１１のマクロブロックは、動き検出部１０６および減算部１１２に入力される。動き検出部１０６はフレームメモリ１０５に格納された参照ピクチャを用いて、マクロブロック内の各ブロックに対して前方動きベクトルと後方動きベクトルの検出を行う。ここでは、フレームメモリ１０５に格納されたピクチャＰ１０、Ｐ７、Ｐ４の再構成画像データを前方参照ピクチャとして、ピクチャＰ１３の再構成画像データを後方参照ピクチャとして用いることになる。動き検出部１０６は検出した動きベクトルを動き補償部１０７に対して出力する。 The macroblock of the picture B11 read from the frame memory 115 is input to the motion detection unit 106 and the subtraction unit 112. The motion detection unit 106 uses the reference picture stored in the frame memory 105 to detect a forward motion vector and a backward motion vector for each block in the macroblock. Here, the reconstructed image data of the pictures P10, P7, and P4 stored in the frame memory 105 is used as a forward reference picture, and the reconstructed image data of the picture P13 is used as a backward reference picture. The motion detection unit 106 outputs the detected motion vector to the motion compensation unit 107.

動き補償部１０７は動き検出部１０６で検出された動きベクトルを用いて、マクロブロックの画面間予測における符号化モードを決定する。ここでＢピクチャの画面間符号化モードは、例えば前方動きベクトルを用いた画面間予測符号化、後方動きベクトルを用いた画面間予測符号化、双方向動きベクトルを用いた画面間予測符号化、ダイレクトモードの中から、いずれの方法で符号化するかを選択することができるものとする。ダイレクトモードはダイレクト処理判定部１１１で時間的ダイレクトモード、又は空間的ダイレクトモードのどちらを用いるかをある特定の単位で決めている。なお、上記のある特定の単位とは、スライス単位、ピクチャ単位、ＧＯＰ単位、シーケンス単位など、スライスよりも大きな単位であればどのようなものでもよい。 The motion compensation unit 107 uses the motion vector detected by the motion detection unit 106 to determine the coding mode for inter-picture prediction of the macroblock. For a B picture, the inter-picture coding mode can be selected from, for example, inter-picture predictive coding using a forward motion vector, inter-picture predictive coding using a backward motion vector, inter-picture predictive coding using bidirectional motion vectors, and the direct mode. For the direct mode, the direct processing determination unit 111 decides, in a specific unit, whether the temporal direct mode or the spatial direct mode is used. The specific unit may be any unit of slice size or larger, such as a slice, a picture, a GOP, or a sequence.

モード選択部１１３は動き補償部１０７で決定した画面間予測の符号化モードと面内予測部１０４で決定した面内予測の符号化モードを入力として、最も符号化効率が高くなる符号化モードを選択し、そのモードがそのマクロブロックの符号化モードとなる。 The mode selection unit 113 receives the inter-picture prediction coding mode determined by the motion compensation unit 107 and the intra prediction coding mode determined by the in-plane prediction unit 104, and selects the one with the highest coding efficiency; the selected mode becomes the coding mode of the macroblock.
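
The selection performed by the mode selection unit 113 can be pictured as choosing the candidate with the lowest cost. The cost measure itself is not specified in this description, so the numeric costs below are purely hypothetical placeholders (for instance, an estimated bit count or a rate-distortion cost), as are the function and mode names.

```python
def select_coding_mode(mode_costs):
    """Return the candidate mode with the lowest cost.

    mode_costs : dict mapping a mode name to a cost value, where a lower
                 cost stands for higher coding efficiency (assumed metric).
    """
    return min(mode_costs, key=mode_costs.get)

# Hypothetical costs for one macroblock of a B picture.
print(select_coding_mode({"intra": 120, "forward": 95, "direct": 80}))
```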

次に、時間的ダイレクトモード禁止判定部１１６における処理の動作を説明する。この時間的ダイレクトモード禁止判定の動作は、以下に説明する方法１によって行うことができる。 Next, the processing operation in temporal direct mode prohibition determination unit 116 will be described. The temporal direct mode prohibition determination operation can be performed by the method 1 described below.

（方法１） (Method 1)
図３は、方法１による時間的ダイレクトモード禁止判定の動作を示すフローチャートである。 FIG. 3 is a flowchart showing the operation of temporal direct mode prohibition determination according to Method 1.

時間的ダイレクトモード禁止判定部１１６は、符号化対象の動画像情報を元に判定を行う。まず、時間的ダイレクトモード禁止判定部１１６は、符号化対象の動画像を構成しているピクチャの時間間隔が一定か否かを判定する（ステップＳ２０１）。一定でない場合（ステップＳ２０１でＮＯ）、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードの使用を禁止する（ステップＳ２０２）。ここで、時間間隔が一定でない場合は、例えばコマ落ちが発生した場合に生じる。一方、一定の場合（ステップＳ２０１でＹＥＳ）、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードの使用を許可する（ステップＳ２０３）。そして、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードの使用の可否をダイレクト処理判定部１１１に通知する。 The temporal direct mode prohibition determination unit 116 makes its determination based on the moving image information to be encoded. First, it determines whether the time interval between the pictures constituting the moving image to be encoded is constant (step S201). If the interval is not constant (NO in step S201), the temporal direct mode prohibition determination unit 116 prohibits use of the temporal direct mode (step S202). A non-constant time interval arises, for example, when a frame drop occurs. If the interval is constant (YES in step S201), the temporal direct mode prohibition determination unit 116 permits use of the temporal direct mode (step S203). The temporal direct mode prohibition determination unit 116 then notifies the direct processing determination unit 111 of whether the temporal direct mode may be used.
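
The decision of steps S201 to S203 can be sketched as below. The sketch assumes that the determination is made from a list of per-picture display timestamps in integer units, which is only one of the information sources this embodiment allows; the function names are hypothetical.

```python
def intervals_constant(timestamps):
    """Step S201: True if all consecutive picture intervals are equal."""
    deltas = {b - a for a, b in zip(timestamps, timestamps[1:])}
    return len(deltas) <= 1

def temporal_direct_allowed_method1(timestamps):
    """Return True to permit (S203) or False to prohibit (S202)."""
    return intervals_constant(timestamps)

print(temporal_direct_allowed_method1([0, 1, 2, 3]))  # regular intervals
print(temporal_direct_allowed_method1([0, 1, 3, 4]))  # one picture dropped
```

The returned flag corresponds to the notification sent to the direct processing determination unit 111.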

なお、上記のステップＳ２０１での判定は、符号化装置に与えられるピクチャ毎の時間情報を用いて行ってもよく、もしくは符号化装置に与えられる時間間隔が一定であるかどうかの情報を用いて行ってもよく、もしくは符号化装置が入力されたピクチャ毎に実際に符号化を行なったかそれとも符号化をスキップしたかを判別することによって得られる時間間隔情報を用いて行ってもよい。つまり、外部からの情報であってもよく、内部の時間管理部１２０で得られた情報であってもよい。 Note that the determination in step S201 may be performed using time information for each picture given to the encoding device, or using information on whether the time interval given to the encoding device is constant. Alternatively, the encoding apparatus may use the time interval information obtained by determining whether the encoding is actually performed for each input picture or whether the encoding is skipped. That is, it may be information from the outside, or information obtained by the internal time management unit 120.

なお、本実施の形態における時間的ダイレクトモード禁止判定処理は、ピクチャ単位で行っても、もしくは複数のピクチャを１つのグループにしたＧＯＰ単位で行なっても、もしくはある特定のピクチャで区切られるシーケンス単位で行なっても、もしくは符号化対象の動画像列全体であるストリーム単位で行なってもよい。つまり、時間的ダイレクトモードを禁止する範囲は、参照ピクチャと符号化対象のピクチャとの間にコマ落ちが発生している場合の最小の範囲だけ禁止してもよく、最小の範囲を超えたタイムスケールにおいてコマ落ちが発生している場合の広い範囲の単位についても禁止するようにしてもよい。 Note that the temporal direct mode prohibition determination processing in the present embodiment may be performed in units of pictures, in units of GOPs each grouping a plurality of pictures, in units of sequences delimited by a specific picture, or in units of streams, that is, over the entire moving image sequence to be encoded. In other words, the temporal direct mode may be prohibited only over the minimum range, namely where a frame drop occurs between the reference picture and the picture to be encoded, or it may also be prohibited over a wider unit when a frame drop occurs on a time scale beyond that minimum range.

以上の方法１により、時間的ダイレクトモード符号化の動きベクトル予測におけるスケーリング処理によって、予測精度の低い動きベクトルを算出することがなくなり、画質の劣化を防ぐことができる。 With the method 1 described above, a motion vector with low prediction accuracy is not calculated by scaling processing in motion vector prediction of temporal direct mode encoding, and deterioration in image quality can be prevented.

（実施の形態２） (Embodiment 2)
次いで、本発明の他の実施の形態について説明する。 Next, another embodiment of the present invention will be described.

図４は、本発明の実施の形態２における動画像符号化装置１００ｂの機能構成を示すブロック図である。なお、同図にはテレシネ変換装置２００も併せて図示されている。また、図１に示される動画像符号化装置１００ａの構成と対応する動画像符号化装置１００ｂの部分に同じ番号を付し、その説明を省略し、異なる部分を詳細に説明する。 FIG. 4 is a block diagram showing a functional configuration of the moving picture coding apparatus 100b according to Embodiment 2 of the present invention. In the figure, a telecine conversion device 200 is also shown. Also, the same reference numerals are given to the parts of the moving picture encoding apparatus 100b corresponding to the configuration of the moving picture encoding apparatus 100a shown in FIG. 1, the description thereof will be omitted, and different parts will be described in detail.

ここで、この動画像符号化装置１００ｂにおいては、テレシネ変換装置２００等から受け取った符号化対象の動画像情報に基づいて、時間的ダイレクトモード禁止判定部１１６が、符号化対象がピクチャの表示時間間隔の変換を行った動画像であるか否かを判断するように構成されている点が、図１に示される動画像符号化装置１００ａの構成と異なっている。 Here, in the moving image encoding apparatus 100b, the temporal direct mode prohibition determining unit 116 determines that the encoding target is the picture display time based on the encoding target moving image information received from the telecine conversion apparatus 200 or the like. The configuration is different from the configuration of the video encoding device 100a shown in FIG. 1 in that it is configured to determine whether or not the video has undergone interval conversion.

次に、上記のように構成された動画像符号化装置１００ｂの動作について説明する。 Next, the operation of the moving picture coding apparatus 100b configured as described above will be described.

（方法２） (Method 2)
図５は、方法２による時間的ダイレクトモード禁止判定の動作を示すフローチャートである。 FIG. 5 is a flowchart showing the operation of temporal direct mode prohibition determination according to Method 2.

時間的ダイレクトモード禁止判定部１１６は、符号化対象の動画像情報を元に判定を行う。まず、符号化対象がピクチャの表示時間間隔の変換を行った動画像か否かを判定する（ステップＳ３０１）。変換を行った動画像の場合（ステップＳ３０１でＹＥＳ）、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードの使用を禁止する（ステップＳ３０２）。一方、変換を行っていない動画像の場合（ステップＳ３０１でＮＯ）、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードの使用を許可する（ステップＳ３０３）。そして、時間的ダイレクトモードの使用の可否をダイレクト処理判定部１１１に通知する。 The temporal direct mode prohibition determination unit 116 performs determination based on moving image information to be encoded. First, it is determined whether or not the encoding target is a moving image that has undergone conversion of a picture display time interval (step S301). In the case of a converted moving image (YES in step S301), the temporal direct mode prohibition determination unit 116 prohibits the use of the temporal direct mode (step S302). On the other hand, in the case of a moving image that has not been converted (NO in step S301), the temporal direct mode prohibition determination unit 116 permits the use of the temporal direct mode (step S303). Then, the direct processing determination unit 111 is notified of whether or not the temporal direct mode can be used.

なお、ここではピクチャの表示時間間隔の変換を判定に用いたが、表示時間間隔の変換をテレシネ変換（２−３変換）に限定しても良い。 Although the conversion of the display time interval of the picture is used for the determination here, the conversion of the display time interval may be limited to the telecine conversion (2-3 conversion).
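
The Method 2 decision (steps S301 to S303), including the optional restriction to telecine conversion mentioned above, can be sketched as follows. The representation of the conversion information as `None` or a label string, the label `"telecine_2_3"`, and the `telecine_only` switch are all assumptions made for illustration.

```python
def temporal_direct_allowed_method2(conversion, telecine_only=False):
    """Return True to permit or False to prohibit the temporal direct mode.

    conversion    : None if no display-time-interval conversion was applied,
                    otherwise a label naming the conversion (assumed format).
    telecine_only : if True, only 2-3 telecine conversion triggers the
                    prohibition; other conversions are ignored.
    """
    if conversion is None:
        return True                      # S301 NO -> S303 (permit)
    if telecine_only and conversion != "telecine_2_3":
        return True                      # conversion outside the restricted set
    return False                         # S301 YES -> S302 (prohibit)

print(temporal_direct_allowed_method2(None))
print(temporal_direct_allowed_method2("telecine_2_3"))
```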

また、上記のステップＳ３０１での判定に用いるピクチャの表示時間間隔の変換を行ったかどうかを示す情報は、符号化装置の外部のテレシネ変換装置２００から与えられてもよく、動画像符号化装置１００ａ内部で画像の特徴からピクチャの表示時間間隔の変換を行ったかどうかを決定してもよい。 The information used for the determination in step S301, indicating whether the picture display time interval has been converted, may be supplied from the telecine conversion device 200 outside the encoding apparatus, or the moving picture coding apparatus 100a may itself determine from the features of the images whether the picture display time interval has been converted.

また、テレシネ変換（２−３変換）を行なった場合、通常テレシネ変換された全範囲について時間的ダイレクトモードの使用を禁止するようにしてもよいが、符号化対象ピクチャと時間的ダイレクトモードで動きベクトルの参照を行なうピクチャとの位置関係によっては、表示時間間隔の変換の変換前と変換後で時間間隔の関係が同じになっている場合がある。つまり、時間間隔のスケーリングに支障がない場合がある。そのような条件を検出し、上記条件に該当する場合は、従来どおり時間的ダイレクトモードを使用することを許可するような判定処理としてもよい。 When telecine conversion (2-3 conversion) has been performed, use of the temporal direct mode may normally be prohibited over the entire telecine-converted range. However, depending on the positional relationship between the picture to be encoded and the picture whose motion vectors are referenced in the temporal direct mode, the relationship between the time intervals may be the same before and after the display time interval conversion. In such cases the time-interval scaling is not impaired. The determination processing may therefore detect this condition and, when it is met, permit the temporal direct mode to be used as before.
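
One way to picture the exception described above is to compare the scaling ratio tb/td computed from the time positions before conversion with the ratio computed from the positions after conversion, and to permit the temporal direct mode only when the two agree. This is an illustrative sketch under the assumption that both sets of time positions are known to the encoder; the description does not specify how the condition is detected, and the function names are hypothetical.

```python
from fractions import Fraction

def scaling_ratio(t_fwd, t_cur, t_bwd):
    """Ratio tb/td used when scaling a co-located motion vector."""
    return Fraction(t_cur - t_fwd, t_bwd - t_fwd)

def temporal_direct_allowed_after_conversion(before, after):
    """Permit temporal direct only if conversion kept the interval ratio.

    before, after : (t_fwd, t_cur, t_bwd) integer time positions of the
                    forward reference, current picture, and backward
                    reference, before and after the conversion.
    """
    return scaling_ratio(*before) == scaling_ratio(*after)

# 2-3 pulldown example: source frames 0..4 start at fields 0, 2, 5, 7, 10.
print(temporal_direct_allowed_after_conversion((0, 1, 2), (0, 2, 5)))
print(temporal_direct_allowed_after_conversion((0, 2, 4), (0, 5, 10)))
```

In the first call the ratio changes from 1/2 to 2/5, so scaling would be distorted; in the second it stays at 1/2, so the temporal direct mode could be kept.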

以上の方法２により、時間的ダイレクトモード符号化の動きベクトル予測におけるスケーリング処理によって、予測精度の低い動きベクトルを算出することがなくなり、画質の劣化を防ぐことができる。 With the method 2 described above, a motion vector with low prediction accuracy is not calculated by scaling processing in motion vector prediction of temporal direct mode encoding, and deterioration of image quality can be prevented.

（実施の形態３） (Embodiment 3)
次いで、本発明のさらに他の実施の形態について説明する。 Next, still another embodiment of the present invention will be described.

図６は、本発明の実施の形態３における動画像符号化装置１００ｃの機能構成を示すブロック図である。なお、図１，図４に示される動画像符号化装置１００ａ，１００ｂの構成と対応する動画像符号化装置１００ｃの部分に同じ番号を付し、その説明を省略し、異なる部分を詳細に説明する。 FIG. 6 is a block diagram showing a functional configuration of the moving picture coding apparatus 100c according to Embodiment 3 of the present invention. Parts of the moving picture coding apparatus 100c that correspond to the configurations of the moving picture coding apparatuses 100a and 100b shown in FIGS. 1 and 4 are given the same reference numerals and their description is omitted; only the differing parts are described in detail.

ここで、この動画像符号化装置１００ｃにおいては、例えばモード選択部１１３から送られてくる参照ピクチャの種別に基づいて、時間的ダイレクトモード禁止判定部１１６が時間ダイレクトモードで動きベクトルの参照に用いる符号化済みピクチャがＩピクチャであるか否か判定するように構成されている点が、図１，図４に示される動画像符号化装置１００ａ，１００ｂの構成と異なっている。 Here, in the moving picture coding apparatus 100c, for example, based on the type of the reference picture sent from the mode selection unit 113, the temporal direct mode prohibition determination unit 116 uses the motion vector in the temporal direct mode. This is different from the configurations of the moving image encoding apparatuses 100a and 100b shown in FIGS. 1 and 4 in that it is configured to determine whether or not the encoded picture is an I picture.

次に、上記のように構成された動画像符号化装置１００ｃの動作について説明する。 Next, the operation of the moving picture coding apparatus 100c configured as described above will be described.

（方法３） (Method 3)
図７は、方法３による時間的ダイレクトモード禁止判定の動作を示すフローチャートである。 FIG. 7 is a flowchart showing the operation of temporal direct mode prohibition determination according to Method 3.

時間的ダイレクトモード禁止判定部１１６は、符号化対象の動画像情報を元に判定を行う。まず、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードで動きベクトルの参照に用いる符号化済みピクチャはＩピクチャか否かを判定する（ステップＳ４０１）。この判定は、例えば、モード選択部１１３から送られてくる参照ピクチャの種別によって行われる。Ｉピクチャの場合（ステップＳ４０１でＹＥＳ）、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードの使用を禁止する（ステップＳ４０２）。 The temporal direct mode prohibition determination unit 116 performs determination based on moving image information to be encoded. First, the temporal direct mode prohibition determining unit 116 determines whether or not an encoded picture used for referring to a motion vector in the temporal direct mode is an I picture (step S401). This determination is performed based on the type of reference picture sent from the mode selection unit 113, for example. In the case of an I picture (YES in step S401), the temporal direct mode prohibition determination unit 116 prohibits the use of the temporal direct mode (step S402).

一方、Ｉピクチャでない場合（ステップＳ４０１でＮＯ）、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードの使用を許可する（ステップＳ４０３）。そして、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードの使用の可否をダイレクト処理判定部１１１に通知する。 On the other hand, if it is not an I picture (NO in step S401), the temporal direct mode prohibition determination unit 116 permits the use of the temporal direct mode (step S403). Then, the temporal direct mode prohibition determination unit 116 notifies the direct processing determination unit 111 of whether the temporal direct mode can be used.
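
Steps S401 to S403 reduce to a check on the type of the picture whose motion vectors would be referenced. The sketch below assumes the picture type arrives as a one-letter string ("I", "P", or "B"), for example from the mode selection unit 113; that encoding and the function name are assumptions.

```python
def temporal_direct_allowed_method3(ref_picture_type):
    """Return True to permit (S403) or False to prohibit (S402).

    ref_picture_type : type of the encoded picture used for motion vector
                       reference in the temporal direct mode, as "I", "P",
                       or "B" (assumed encoding).
    """
    # S401: an I picture prohibits the temporal direct mode.
    return ref_picture_type != "I"

print(temporal_direct_allowed_method3("I"))
print(temporal_direct_allowed_method3("P"))
```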

以上の方法３により、時間的ダイレクトモード符号化の動きベクトル予測におけるスケーリング処理によって、予測精度の低い動きベクトルを算出することがなくなり、画質の劣化を防ぐことができる。 With the method 3 described above, the motion vector with low prediction accuracy is not calculated by the scaling process in the motion vector prediction of temporal direct mode encoding, and deterioration in image quality can be prevented.

（実施の形態４）
実施の形態１から３では、方法１から３を別々な方法として説明したが、これらの方法を少なくとも２つ組み合わせて実施するようにしても良い。以下ではその例として、方法２と方法３とを組み合わせた場合を説明する。 (Embodiment 4)
In the first to third embodiments, the methods 1 to 3 have been described as separate methods, but at least two of these methods may be combined. Below, the case where the method 2 and the method 3 are combined is demonstrated as the example.

図８は、本発明の実施の形態４における動画像符号化装置１００ｄのブロック図である。なお、図１，図４，図６に示される動画像符号化装置１００ａ，１００ｂ，１００ｃの構成と対応する動画像符号化装置１００ｄの部分に同じ番号を付し、異なる部分を詳細に説明する。 FIG. 8 is a block diagram of the moving picture coding apparatus 100d according to Embodiment 4 of the present invention. Parts of the moving picture coding apparatus 100d that correspond to the configurations of the moving picture coding apparatuses 100a, 100b, and 100c shown in FIGS. 1, 4, and 6 are given the same reference numerals, and the differing parts are described in detail.

ここで、この動画像符号化装置１００ｄは、上記の方法２と方法３とを組み合わせて行うように構成されている点が、図１，図４，図６に示される動画像符号化装置１００ａ，１００ｂ，１００ｃの構成と異なっている。 The moving picture coding apparatus 100d differs from the moving picture coding apparatuses 100a, 100b, and 100c shown in FIGS. 1, 4, and 6 in that it is configured to perform Method 2 and Method 3 in combination.

次に、上記のように構成された動画像符号化装置１００ｄの動作について説明する。 Next, the operation of the moving picture coding apparatus 100d configured as described above will be described.
図９は、方法２と方法３を組み合わせた場合のフローチャートである。 FIG. 9 is a flowchart for the case where Method 2 and Method 3 are combined.

まず、時間的ダイレクトモード禁止判定部１１６は、符号化対象がピクチャの表示時間間隔の変換を行った動画像であるか否かを判定する（ステップＳ５０１）。変換を行った動画像の場合（ステップＳ５０１でＹＥＳ）、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードの使用を禁止する（ステップＳ５０３）。一方、変換を行っていない動画の場合（ステップＳ５０１でＮＯ）、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードで動きベクトルの参照に用いる符号化済みピクチャはＩピクチャか否かを判定する（ステップＳ５０２）。Ｉピクチャの場合（ステップＳ５０２でＹＥＳ）、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードの使用を禁止する（ステップＳ５０３）。 First, the temporal direct mode prohibition determination unit 116 determines whether or not the encoding target is a moving image that has undergone conversion of a picture display time interval (step S501). In the case of a converted moving image (YES in step S501), the temporal direct mode prohibition determination unit 116 prohibits the use of the temporal direct mode (step S503). On the other hand, in the case of a moving image that has not been converted (NO in step S501), the temporal direct mode prohibition determining unit 116 determines whether or not the encoded picture used for motion vector reference in the temporal direct mode is an I picture. (Step S502). In the case of an I picture (YES in step S502), the temporal direct mode prohibition determination unit 116 prohibits the use of the temporal direct mode (step S503).

一方、Ｉピクチャでない場合（ステップＳ５０２でＮＯ）、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードの使用を許可する（ステップＳ５０４）。 On the other hand, if it is not an I picture (NO in step S502), the temporal direct mode prohibition determination unit 116 permits the use of the temporal direct mode (step S504).

そして、時間的ダイレクトモード禁止判定部１１６は、時間的ダイレクトモードの使用の可否をダイレクト処理判定部１１１に通知する。 Then, the temporal direct mode prohibition determination unit 116 notifies the direct processing determination unit 111 of whether the temporal direct mode can be used.

このような方法で、方法１と方法２と方法３、方法１と方法２、方法１と方法３を組み合わせることにより、時間的ダイレクトモード符号化の動きベクトル予測におけるスケーリング処理によって、予測精度の低い動きベクトルを算出する頻度がさらに少なくでき、画質の劣化を防ぐことができる。ここで、例えば図９では、ステップＳ５０１、Ｓ５０２の順番に処理したが、この逆の順番に処理してもよく、他の組み合わせにおいても処理の順番の如何は限定されるものではない。 In the same manner, Methods 1, 2, and 3 may all be combined, or Method 1 may be combined with Method 2 or with Method 3. Such combinations further reduce how often the scaling process in motion vector prediction of temporal direct mode encoding yields a motion vector of low prediction accuracy, and thus prevent deterioration of image quality. Although FIG. 9 processes steps S501 and S502 in that order, they may be processed in the reverse order, and the processing order is likewise not limited in the other combinations.
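
The combination of methods can be sketched as a single function in which each supplied check may prohibit the temporal direct mode and a check that is not supplied is simply skipped. Since the result is the logical OR of the individual prohibitions, the order of the checks does not affect the outcome, matching the remark above. The parameter names and input encodings are assumptions for illustration.

```python
def temporal_direct_allowed(timestamps=None, interval_converted=None,
                            ref_picture_type=None):
    """Combine Methods 1 to 3; pass only the inputs of the methods in use.

    timestamps         : integer display times of the pictures (Method 1).
    interval_converted : True if the display time interval was converted
                         (Method 2).
    ref_picture_type   : "I", "P", or "B" for the picture whose motion
                         vectors would be referenced (Method 3).
    """
    prohibit = []
    if timestamps is not None:
        deltas = {b - a for a, b in zip(timestamps, timestamps[1:])}
        prohibit.append(len(deltas) > 1)        # non-constant intervals
    if interval_converted is not None:
        prohibit.append(bool(interval_converted))
    if ref_picture_type is not None:
        prohibit.append(ref_picture_type == "I")
    return not any(prohibit)

# Methods 2 and 3 combined, as in FIG. 9.
print(temporal_direct_allowed(interval_converted=False, ref_picture_type="P"))
print(temporal_direct_allowed(interval_converted=False, ref_picture_type="I"))
```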

（実施の形態５）
さらに、上記各実施の形態１〜４で示した動画像符号化方法を実現するためのプログラムを、フレキシブルディスク等の記録媒体に記録するようにすることにより、上記各実施の形態で示した処理を、独立したコンピュータシステムにおいて簡単に実施することが可能となる。 (Embodiment 5)
Further, the program for realizing the moving picture encoding method shown in each of the above embodiments 1 to 4 is recorded on a recording medium such as a flexible disk, so that the processing shown in each of the above embodiments is performed. Can be easily implemented in an independent computer system.

図１０は、上記各実施の形態の動画像符号化方法を、フレキシブルディスク等の記録媒体に記録されたプログラムを用いて、コンピュータシステムにより実施する場合の説明図である。 FIG. 10 is an explanatory diagram when the moving image encoding method of each of the above embodiments is implemented by a computer system using a program recorded on a recording medium such as a flexible disk.

図１０（ｂ）は、フレキシブルディスクの正面からみた外観、断面構造、およびフレキシブルディスクを示し、図１０（ａ）は、記録媒体本体であるフレキシブルディスクの物理フォーマットの例を示している。フレキシブルディスクＦＤはケースＦ内に内蔵され、該ディスクの表面には、同心円状に外周からは内周に向かって複数のトラックＴｒが形成され、各トラックは角度方向に１６のセクタＳｅに分割されている。従って、上記プログラムを格納したフレキシブルディスクでは、上記フレキシブルディスクＦＤ上に割り当てられた領域に、上記プログラムが記録されている。 FIG. 10B shows an appearance, a cross-sectional structure, and a flexible disk as viewed from the front of the flexible disk, and FIG. 10A shows an example of a physical format of the flexible disk that is a recording medium body. The flexible disk FD is built in the case F, and on the surface of the disk, a plurality of tracks Tr are formed concentrically from the outer periphery toward the inner periphery, and each track is divided into 16 sectors Se in the angular direction. ing. Therefore, in the flexible disk storing the program, the program is recorded in an area allocated on the flexible disk FD.

また、図１０（ｃ）は、フレキシブルディスクＦＤに上記プログラムの記録再生を行うための構成を示す。動画像符号化方法を実現する上記プログラムをフレキシブルディスクＦＤに記録する場合は、コンピュータシステムＣｓから上記プログラムをフレキシブルディスクドライブを介して書き込む。また、フレキシブルディスク内のプログラムにより動画像符号化方法を実現する上記動画像符号化方法をコンピュータシステム中に構築する場合は、フレキシブルディスクドライブによりプログラムをフレキシブルディスクから読み出し、コンピュータシステムに転送する。 FIG. 10C shows a configuration for recording and reproducing the program on the flexible disk FD. When the program for realizing the moving image encoding method is recorded on the flexible disk FD, the program is written from the computer system Cs through the flexible disk drive. When the above-described moving picture coding method for realizing the moving picture coding method by a program in a flexible disk is constructed in a computer system, the program is read from the flexible disk by a flexible disk drive and transferred to the computer system.

In the above description, a flexible disk is used as the recording medium, but the same can be done using an optical disk. The recording medium is not limited to these; any medium capable of recording a program, such as an IC card or a ROM cassette, can be used in the same way.

Note that the processing for realizing the moving picture encoding method shown in each of the above embodiments may be realized as an LSI, which is an integrated circuit. The processing may be integrated into a single chip that includes part or all of it.

The term LSI is used here, but depending on the degree of integration it may also be called an IC, a system LSI, a super LSI, or an ultra LSI.

Further, the method of circuit integration is not limited to LSI, and may be realized with a dedicated circuit or a general-purpose processor. An FPGA (Field Programmable Gate Array) that can be programmed after the LSI is manufactured, or a reconfigurable processor in which the connections and settings of circuit cells inside the LSI can be reconfigured, may also be used.

Furthermore, if integrated circuit technology that replaces LSI emerges through progress in semiconductor technology or another technology derived from it, the functional blocks may naturally be integrated using that technology.

In the above embodiment, the telecine conversion apparatus 200 is provided outside the moving picture encoding apparatus 100a; however, the telecine conversion apparatus 200 may instead be provided inside the moving picture encoding apparatus 100a.

The moving picture encoding method according to the present invention is useful as a method for encoding each picture constituting a moving picture to generate a code string in, for example, a DVD device, a mobile phone, or a personal computer.

FIG. 1 is a block diagram showing the configuration of the moving picture encoding apparatus according to Embodiment 1 of the present invention.
FIG. 2 is a diagram showing the order of pictures in the picture memory; FIG. 2A shows the input order, and FIG. 2B shows the rearranged order.
FIG. 3 is a flowchart showing the operation by which the temporal direct mode prohibition determination unit decides, by method 1, whether the temporal direct mode may be used.
FIG. 4 is a block diagram showing the configuration of the moving picture encoding apparatus according to Embodiment 2 of the present invention.
FIG. 5 is a flowchart showing the operation by which the temporal direct mode prohibition determination unit decides, by method 2, whether the temporal direct mode may be used.
FIG. 6 is a block diagram showing the configuration of the moving picture encoding apparatus according to Embodiment 3 of the present invention.
FIG. 7 is a flowchart showing the operation by which the temporal direct mode prohibition determination unit decides, by method 3, whether the temporal direct mode may be used.
FIG. 8 is a block diagram showing the configuration of the moving picture encoding apparatus according to Embodiment 4 of the present invention.
FIG. 9 is a flowchart showing the operation by which the temporal direct mode prohibition determination unit decides, by a combination of method 2 and method 3, whether the temporal direct mode may be used.
FIG. 10 is an explanatory diagram of a recording medium for storing a program for realizing the moving picture encoding method of Embodiments 1 to 4 by a computer system.
FIG. 11 is a schematic diagram showing the method of predicting and generating a motion vector in the temporal direct mode.
FIG. 12 is a schematic diagram showing the method of predicting and generating a motion vector in the spatial direct mode.
FIG. 13 is a diagram showing an example of telecine conversion (2-3 conversion).
FIG. 14 is a diagram showing an example of temporal direct mode motion vector prediction in a moving picture that has undergone telecine conversion (2-3 conversion).
FIG. 15 is a diagram showing an example of temporal direct mode motion vector prediction in a moving picture that has undergone telecine conversion (2-3 conversion).

Explanation of symbols

100a, 100b, 100c, 100d Moving picture encoding apparatus
101 Prediction residual encoding unit
102 Code string generation unit
103 Prediction residual decoding unit
104 Intra prediction unit
105, 115 Frame memory
106 Motion detection unit
107 Motion compensation unit
108 Motion vector storage unit
109 Temporal direct mode processing unit
110 Spatial direct mode processing unit
111 Direct processing determination unit
112 Subtraction unit
113 Mode selection unit
114 Addition unit
116 Temporal direct mode prohibition determination unit
120 Time management unit
121 Mode selection unit
200 Telecine conversion apparatus

Claims

A moving picture encoding method for encoding a moving picture including a B picture that is predictively encoded with reference to a plurality of encoded pictures located temporally forward or backward, the method comprising:
a temporal direct mode processing step of, as direct mode processing of the B picture, predicting and generating a motion vector of a target block with reference to a motion vector of a temporally nearby encoded picture; and
a temporal direct mode prohibition determining step of determining whether to prohibit use of the temporal direct mode according to a condition of the encoding target,
wherein, when the temporal direct mode is prohibited in the determining step, the predictive encoding is performed on the encoding target using a processing step other than the temporal direct mode processing step.

The moving picture encoding method according to claim 1, wherein the processing step other than the temporal direct mode processing step includes a spatial direct mode processing step of, as direct mode processing of the B picture, predicting and generating the motion vector of the target block with reference to a motion vector of an encoded block located in the spatial periphery of the target block, and
when the temporal direct mode is prohibited in the determining step, the predictive encoding is performed on the encoding target using the spatial direct mode processing step.

The moving picture encoding method according to claim 1, wherein, in the temporal direct mode prohibition determining step, use of the temporal direct mode is determined to be prohibited when it is determined that the time intervals between the pictures constituting the encoding target are not constant.

The moving picture encoding method according to claim 1, wherein, in the temporal direct mode prohibition determining step, use of the temporal direct mode is determined to be prohibited when it is determined that the encoding target has undergone conversion of the picture display time interval.

The moving picture encoding method according to claim 1, wherein, in the temporal direct mode prohibition determining step, use of the temporal direct mode is determined to be prohibited when it is determined that the encoded picture whose motion vector is referenced in the temporal direct mode is an I picture subjected to intra prediction encoding.

The moving picture encoding method according to claim 1, wherein, in the temporal direct mode prohibition determining step, at least two of the following determinations are made: whether the time intervals between the pictures constituting the encoding target are not constant; whether the encoding target has undergone conversion of the picture display time interval; and whether the encoded picture whose motion vector is referenced in the temporal direct mode is an I picture subjected to intra prediction encoding; and use of the temporal direct mode is determined to be prohibited when any one of the at least two conditions is satisfied.

A moving picture encoding apparatus for encoding a moving picture including a B picture that is predictively encoded with reference to a plurality of encoded pictures located temporally forward or backward, the apparatus comprising:
temporal direct mode processing means for, as direct mode processing of the B picture, predicting and generating a motion vector of a target block with reference to a motion vector of a temporally nearby encoded picture; and
temporal direct mode prohibition determining means for determining whether to prohibit use of the temporal direct mode according to a condition of the encoding target,
wherein, when the temporal direct mode is prohibited by the determining means, the predictive encoding is performed on the encoding target using means other than the temporal direct mode processing means.

A program that causes a computer to execute the steps included in the moving picture encoding method according to claim 1.

An integrated circuit in which the means included in the moving picture encoding apparatus according to claim 7 are integrated.
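
For illustration only, the prohibition determination recited in the claims might be sketched as follows. This is a minimal sketch and not the implementation of the embodiments: the names `EncodingTarget`, `has_constant_interval`, `temporal_direct_prohibited`, and `select_direct_mode` are hypothetical, display times are assumed to be integer clock ticks, and all three conditions are checked here although the claims allow any one condition or a combination of at least two.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class PictureType(Enum):
    I = "I"
    P = "P"
    B = "B"


class DirectMode(Enum):
    TEMPORAL = "temporal"
    SPATIAL = "spatial"


@dataclass(frozen=True)
class EncodingTarget:
    """Hypothetical summary of the conditions the determination step looks at."""
    # Display times of the pictures constituting the encoding target
    # (assumed integer ticks; the unit is an assumption of this sketch).
    display_times: Sequence[int]
    # True if the target has undergone display-time-interval conversion
    # (for example telecine conversion).
    interval_converted: bool
    # Type of the encoded picture whose motion vector the temporal
    # direct mode would reference.
    referenced_picture_type: PictureType


def has_constant_interval(display_times: Sequence[int]) -> bool:
    """Return True when all gaps between consecutive pictures are equal."""
    gaps = {b - a for a, b in zip(display_times, display_times[1:])}
    return len(gaps) <= 1


def temporal_direct_prohibited(target: EncodingTarget) -> bool:
    """Prohibit temporal direct mode if any checked condition holds."""
    return (
        not has_constant_interval(target.display_times)
        or target.interval_converted
        or target.referenced_picture_type is PictureType.I
    )


def select_direct_mode(target: EncodingTarget) -> DirectMode:
    """Fall back to spatial direct mode when temporal direct is prohibited.

    When temporal direct mode is allowed, this sketch simply picks it;
    an actual encoder could still choose between both modes.
    """
    if temporal_direct_prohibited(target):
        return DirectMode.SPATIAL
    return DirectMode.TEMPORAL
```

In this sketch, a target with evenly spaced display times, no interval conversion, and a P picture as the referenced picture keeps the temporal direct mode available, while violating any single condition selects the spatial direct mode.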