JP2003219430A

JP2003219430A - Method and closed-loop transcoder for reduced spatial resolution transcoding of compressed bitstream of sequence of frames of video signal

Info

Publication number: JP2003219430A
Application number: JP2002381469A
Authority: JP
Inventors: Anthony Vetro; Huifang Sun; Peng Yin; Bede Liu
Original assignee: Mitsubishi Electric Research Laboratories Inc
Current assignee: Mitsubishi Electric Research Laboratories Inc
Priority date: 2002-01-14
Filing date: 2002-12-27
Publication date: 2003-07-31

Abstract

PROBLEM TO BE SOLVED: To provide a method and closed-loop transcoder for reducing the spatial resolution of a compressed bitstream of a sequence of frames of a video signal by first decoding the frames and storing the decoded frames in a first frame buffer.

SOLUTION: While decoding, motion compensation is performed with full-resolution motion vectors on the stored decoded frames. The decoded frames are then down-sampled to a reduced resolution and stored in a second frame buffer. The reduced-resolution frames are partially encoded to produce a reduced-resolution compressed bitstream of the video. While performing the partial encoding, motion compensation is performed with reduced-resolution motion vectors on the stored reduced-resolution frames.

COPYRIGHT: (C)2003,JPO

Description

Detailed Description of the Invention

[0001]

TECHNICAL FIELD OF THE INVENTION

[Cross-reference to related patent applications] This patent application is a continuation-in-part of U.S. patent application Ser. No. 09/853,394, entitled "Video Transcoder with Spatial Resolution Reduction," filed on May 11, 2001 and assigned to the same assignee as this patent application. The present invention relates generally to the field of transcoding bitstreams, and more particularly to reducing spatial resolution with drift compensation. Accordingly, the present invention relates to methods and video transcoders that include spatial resolution reduction and drift compensation.

BACKGROUND OF THE INVENTION

Video compression enables the storage, transmission, and processing of image information with fewer storage, network, and processor resources. The most widely used video compression standards include MPEG-1 for the storage and retrieval of moving pictures, MPEG-2 for digital television, and H.263 for video conferencing. See ISO/IEC 11172-2:1993, "Information Technology - Coding of Moving Pictures and Associated Audio for Digital Storage Media up to about 1.5 Mbit/s - Part 2: Video"; D. LeGall, "MPEG: A Video Compression Standard for Multimedia Applications," Communications of the ACM, Vol. 34, No. 4, pp. 46-58, 1991; ISO/IEC 13818-2:1996, "Information Technology - Generic Coding of Moving Pictures and Associated Audio Information - Part 2: Video," 1994; ITU-T SG XV, DRAFT H.263, "Video Coding for Low Bitrate Communication," 1996; and ITU-T SG XVI, DRAFT13 H.263+ Q15-A-60 rev. 0, "Video Coding for Low Bitrate Communication," 1997.

[0002] The standards above are relatively low-level specifications that deal primarily with the spatial compression of images or frames and with the spatial and temporal compression of frame sequences. As a common feature, these standards perform compression on a per-frame basis. With these standards, high compression ratios can be achieved for a wide range of applications.

[0003] With the emergence of newer video coding standards such as MPEG-4 for multimedia applications (see ISO/IEC 14496-2:1999, "Information technology - coding of audio/visual objects, Part 2: Visual"), objects of arbitrary shape can be encoded and decoded as separate video object planes (VOPs). The objects can be visual, audio, natural, synthetic, primitive, or compound objects, or combinations thereof. A considerable amount of error resilience is also built in to enable robust transmission over error-prone channels such as wireless channels.

[0004] The emerging MPEG-4 standard is intended to enable multimedia applications, such as interactive video, in which natural and synthetic materials are integrated and access is universal (not one-way). In the context of video transmission, compression standards are needed to reduce the amount of bandwidth required on networks. The network may be wireless or the Internet. In either case, network capacity is limited, and contention for scarce resources should be minimized.

[0005] Great effort has been devoted to systems and methods that enable devices to transmit content robustly and to adapt the quality of the content to the available network resources. In this regard, when content has been encoded, it is sometimes necessary to first decode the bitstream before it can be transmitted through the network at a lower bit rate or a lower resolution.

[0006] As shown in FIG. 1, this can be accomplished by a transcoder 100. In its simplest configuration, the transcoder 100 includes a cascaded decoder 110 and encoder 120. A compressed input bitstream 101 is fully decoded at an input bit rate [Equation 1] and then re-encoded to produce an output bitstream 103 at a bit rate [Equation 2]. The output bit rate is usually lower than the input bit rate. In practice, however, full decoding and full re-encoding are not performed in a transcoder, because re-encoding the decoded bitstream is highly complex.

[0007] Early work on MPEG-2 transcoding was published by Sun et al. in "Architectures for MPEG compressed bitstream scaling," IEEE Transactions on Circuits and Systems for Video Technology, April 1996. That paper describes four rate-reduction methods that differ in complexity and architecture.

[0008] FIG. 2 shows a first example method 200, referred to as an open-loop architecture. In this architecture, the input bitstream 201 is only partially decoded. Specifically, macroblocks of the input bitstream are variable-length decoded (VLD) 210 and inverse quantized 220 with a fine quantizer [Equation 3] to yield discrete cosine transform (DCT) coefficients. Given a desired output bit rate 202, the DCT blocks are requantized with the coarser quantizer [Equation 4] of the quantizer 230. The requantized blocks are then variable-length coded (VLC) 240, producing a new output bitstream 203 at a lower bit rate. This scheme is considerably simpler than the one shown in FIG. 1, because the motion vectors are reused and no inverse DCT operation is needed. Note that the choice of [Equation 5] and [Equation 6] depends strictly on the bit-rate characteristics of the bitstream. Other possible factors, such as the spatial characteristics of the bitstream, are not considered.
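The open-loop requantization of FIG. 2 can be illustrated with a minimal Python sketch. It assumes a plain uniform quantizer that rounds half away from zero, which is a simplification of the actual MPEG quantization rules. The function names and the step sizes `fine_step` and `coarse_step` are hypothetical stand-ins for the fine and coarse quantizers and do not come from the patent.

```python
def dequantize(levels, step):
    """Map quantization levels back to coefficient values (uniform quantizer)."""
    return [level * step for level in levels]


def quantize(coefficients, step):
    """Uniform quantizer that rounds half away from zero (an assumed rule)."""
    return [int(c / step + (0.5 if c >= 0 else -0.5)) for c in coefficients]


def open_loop_requantize(levels, fine_step, coarse_step):
    """Inverse quantize with the fine step, then requantize with the coarse step.

    No inverse DCT and no frame memory are involved, so the requantization
    error is never fed back. That is what makes the scheme open loop.
    """
    return quantize(dequantize(levels, fine_step), coarse_step)


# Levels coded with step 4 are re-expressed with the coarser step 8.
coarse_levels = open_loop_requantize([10, -3, 1, 0], fine_step=4, coarse_step=8)
```

Because the coarser levels span a smaller range, they generally need fewer bits after variable-length coding, at the cost of a larger quantization error.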

[0009] FIG. 3 shows a second example method 300, referred to as a closed-loop architecture. In this method, the input video bitstream is again partially decoded: macroblocks of the input bitstream are variable-length decoded (VLD) 310 and inverse quantized 320 with the quantizer [Equation 7] to yield discrete cosine transform (DCT) coefficients 321. In contrast to the first example method, correction DCT coefficients 332 are added 330 to the incoming DCT coefficients 321 to compensate for the mismatch produced by requantization. This correction improves the quality of the reference frames that are eventually used for decoding. After the correction has been added, the newly formed blocks are requantized 340 with [Equation 8] to meet the new bit rate and are variable-length coded 350 as described above. Note again that [Equation 9] and [Equation 10] are determined by the bit rate.

[0010] To obtain the correction component 332, the requantized DCT coefficients are inverse quantized 360 and subtracted 370 from the original partially decoded DCT coefficients. The resulting difference is transformed to the spatial domain via an inverse DCT (IDCT) 365 and stored in a frame memory 380. The motion vectors 381 associated with each incoming block are then used to recall the corresponding difference blocks, as in motion compensation 390. The recalled blocks are transformed via the DCT 332 to yield the correction component. A derivation of the method shown in FIG. 3 is described by Assuncao et al. in "A frequency domain video transcoder for dynamic bit-rate reduction of MPEG-2 bitstreams," IEEE Transactions on Circuits and Systems for Video Technology, pp. 953-957, 1998.
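The effect of feeding the requantization error back, as in the closed-loop architecture above, can be seen in a deliberately simplified Python sketch. It tracks a single coefficient over a run of inter-coded frames and assumes zero motion, so the stored error can be added back directly and the IDCT/DCT pair drops out. The uniform quantizer and all names are illustrative assumptions, not the patent's implementation.

```python
def quantize(value, step):
    """Uniform quantizer that rounds half away from zero (an assumed rule)."""
    return int(value / step + (0.5 if value >= 0 else -0.5))


def simulate_drift(residuals, step, closed_loop):
    """Requantize a run of inter-frame residuals for one coefficient.

    Returns the decoder-side drift (true value minus decoded value) after
    each frame. With closed_loop=True, the error of the previous frame is
    added to the next residual before requantization.
    """
    stored_error = 0.0   # plays the role of the frame memory
    true_value = 0.0     # what a decoder of the input bitstream reconstructs
    decoded_value = 0.0  # what a decoder of the output bitstream reconstructs
    drift = []
    for residual in residuals:
        corrected = residual + (stored_error if closed_loop else 0.0)
        reconstructed = quantize(corrected, step) * step
        stored_error = corrected - reconstructed
        true_value += residual
        decoded_value += reconstructed
        drift.append(true_value - decoded_value)
    return drift


open_drift = simulate_drift([3] * 8, step=8, closed_loop=False)
closed_drift = simulate_drift([3] * 8, step=8, closed_loop=True)
```

In this toy run the open-loop drift grows with every predicted frame, while the closed-loop drift never exceeds half a quantizer step. Real sequences involve motion, so the stored error must be motion compensated before it is reused.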

[0011] Assuncao et al. also proposed an alternative method for the same task. In that method, a motion compensation (MC) loop operating in the frequency domain is used for drift compensation. Approximate matrices are derived for fast computation of the MC blocks in the frequency domain. A Lagrangian optimization is used to calculate the best quantizer scales for transcoding. This method removes the need for the IDCT/DCT components.

[0012] In conventional compression standards, the number of bits allocated for encoding texture information is controlled by a quantization parameter (QP). The methods described above are similar in that they reduce the texture bit rate by changing the QP based on information contained in the original bitstream. For an efficient implementation, the information is usually extracted directly from the compressed domain and can include measures that relate to the motion of macroblocks or the residual energy of DCT blocks. The methods described above are applicable only to bit-rate reduction.

[0013] Besides bit-rate reduction, other types of bitstream transformation can also be performed. For example, object-based transformations are described in U.S. patent application Ser. No. 09/504,323, "Object-Based Bitstream Transcoder," filed by Vetro et al. on February 14, 2000. Transformations of spatial resolution are described by Shanableh and Ghanbari in "Heterogeneous video transcoding to lower spatio-temporal resolution, and different encoding formats," IEEE Transactions on Multimedia, June 2000.

[0014] These methods either produce bitstreams at a reduced spatial resolution with unsatisfactory quality, or achieve better quality only at the cost of higher complexity. In addition, proper consideration has not been given to the means by which reconstructed macroblocks are formed. This affects both quality and complexity, and it is especially problematic when reduction factors other than two are considered. Moreover, these methods do not specify any architectural details. Most of the attention has gone to various means of scaling motion vectors by a factor of two.

[0015] FIG. 4 shows details of a method 400 for transcoding an input bitstream to an output bitstream 402 at a lower spatial resolution. This method extends the method shown in FIG. 1, showing the details of the decoder 110 and the encoder 120 and adding a down-sampling block 410 between the decoding and encoding processes. The decoder 110 performs a partial decoding of the bitstream. The down-sampling block 410 reduces the spatial resolution of groups of partially decoded macroblocks. Motion compensation 420 in the decoder uses the full-resolution motion vectors [Equation 11] 421, while motion compensation 430 in the encoder uses the low-resolution motion vectors [Equation 12] 431. The low-resolution motion vectors are either estimated from the down-sampled spatial-domain frames [Equation 13] 403 or mapped from the full-resolution motion vectors. Further details of the transcoder 400 are described below.

[0016] FIG. 5 shows details of an open-loop method 500 for transcoding an input bitstream 501 to an output bitstream 502 at a lower spatial resolution. In this method, the bitstream 101 is again partially decoded: macroblocks of the input bitstream are variable-length decoded (VLD) 510 and inverse quantized 520 to yield discrete cosine transform (DCT) coefficients. These processing steps are well known.

[0017] The DCT macroblocks are then down-sampled 530 by a factor of two by masking the high-frequency coefficients of each 8×8 (2^3×2^3) luminance block in the 16×16 (2^4×2^4) macroblock, which yields four 4×4 DCT blocks. See U.S. Pat. No. 5,262,854, "Low-resolution HDTV receivers," issued to Ng on November 16, 1993. In other words, down-sampling turns a group of blocks, for example four, into a group of four blocks of a smaller size.
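The masking step described above can be sketched in a few lines of Python. The sketch keeps only the low-frequency (top-left) 4×4 coefficients of an 8×8 DCT block stored as a list of rows. Any amplitude rescaling that a particular DCT normalization would require is deliberately left out, and the function name is hypothetical.

```python
def mask_high_frequencies(dct_block, keep=4):
    """Keep the top-left keep x keep low-frequency coefficients of a DCT block.

    dct_block is a list of rows, with the DC coefficient at [0][0].
    Amplitude rescaling (which depends on the DCT normalization) is omitted.
    """
    return [row[:keep] for row in dct_block[:keep]]


# A dummy 8x8 block whose entries encode their own position (row * 8 + column).
block = [[row * 8 + col for col in range(8)] for row in range(8)]
low_res_block = mask_high_frequencies(block)
```

Applying this to the four 8×8 luminance blocks of a macroblock gives the four 4×4 DCT blocks mentioned in the text.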

[0018] Because down-sampling is performed in the transcoder, the transcoder must take additional steps to re-form a compliant 16×16 macroblock, namely a transformation back to the spatial domain followed by a transformation back to the DCT domain. After down-sampling, the blocks are requantized 540 using the same quantization level and then variable-length coded 550. No method has been described for performing rate control on the reduced-resolution blocks.

[0019] Several suitable methods have been proposed in the prior art for frame-based motion vectors to perform motion vector mapping 560 from the full motion vectors 559 to the reduced motion vectors 561. To map four frame-based motion vectors, one for each macroblock in a group, to one motion vector for the newly formed 16×16 macroblock, simple averaging or median filters can be applied. This is referred to as 4:1 mapping.

[0020] However, certain compression standards, such as MPEG-4 and H.263, support advanced prediction modes that allow one motion vector per 8×8 block. In this case, each motion vector is mapped from a 16×16 macroblock at the original resolution to an 8×8 block in the reduced-resolution macroblock. This is referred to as 1:1 mapping.
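The 4:1 and 1:1 mappings described above can be sketched as follows. This is only an illustration under stated assumptions: motion vectors are `(x, y)` tuples, the median is taken per component, the resolution is reduced by a factor of two so the mapped vectors are halved, and no rounding to the half-pel grid is applied. The function names are hypothetical.

```python
from statistics import median


def map_4_to_1(motion_vectors, use_median=False):
    """Map the motion vectors of four macroblocks to one reduced-resolution vector."""
    xs = [mv[0] for mv in motion_vectors]
    ys = [mv[1] for mv in motion_vectors]
    if use_median:
        x, y = median(xs), median(ys)  # component-wise median (an assumption)
    else:
        x, y = sum(xs) / len(xs), sum(ys) / len(ys)
    # Halve the result because the picture is half as wide and half as tall.
    return (x / 2, y / 2)


def map_1_to_1(motion_vectors):
    """Map each 16x16 macroblock vector to the vector of one 8x8 block."""
    return [(x / 2, y / 2) for (x, y) in motion_vectors]


group = [(2, 0), (4, 0), (6, 0), (20, 4)]
averaged = map_4_to_1(group)
median_filtered = map_4_to_1(group, use_median=True)
per_block = map_1_to_1(group)
```

In this example the median is less affected by the outlier `(20, 4)` than the average, which is one reason a median filter may be preferred when the four vectors disagree.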

[0021] FIG. 6 shows examples of mapping 600 motion vectors from a group of four 16×16 macroblocks 601 to either one 16×16 macroblock 602 or a group of four 8×8 blocks 603. Always using the 1:1 mapping is inefficient, because more bits are needed to code four motion vectors. In general, the extension to field-based motion vectors for interlaced images is also non-trivial. As is well known, given the down-sampled DCT coefficients and the mapped motion vectors, the data can be variable-length coded to form the reduced-resolution bitstream.

[0022] It is desirable to provide a method for transcoding bitstreams that overcomes the problems of the prior-art methods for spatial resolution reduction. Furthermore, it is desirable to compensate for drift and to provide a balance between complexity and quality in the transcoder.

[0023]

SUMMARY OF THE INVENTION

A method and system reduce the spatial resolution of a compressed bitstream of a sequence of frames of a video signal by first decoding the frames and storing the decoded frames in a first frame buffer. While decoding, motion compensation is performed with full-resolution motion vectors on the stored decoded frames. The decoded frames are then down-sampled to a reduced resolution and stored in a second frame buffer. The reduced-resolution frames are partially encoded to produce a reduced-resolution compressed bitstream of the video. While performing the partial encoding, motion compensation is performed with reduced-resolution motion vectors on the stored reduced-resolution frames.
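As a rough illustration of the down-sampling step between the two frame buffers, the following Python sketch halves the width and height of a decoded frame by averaging each 2×2 group of pixels. The summary does not specify the down-sampling filter, so the averaging filter, the list-of-rows frame layout, and the function name are all assumptions made for this sketch.

```python
def downsample_by_two(frame):
    """Halve the width and height of a frame by averaging 2x2 pixel groups.

    frame is a list of rows of pixel values with even width and height.
    """
    height, width = len(frame), len(frame[0])
    return [
        [
            (frame[r][c] + frame[r][c + 1] + frame[r + 1][c] + frame[r + 1][c + 1]) / 4
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]


# A decoded frame taken from the first buffer; the result would be written
# to the second, reduced-resolution buffer.
full_resolution_frame = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 4, 4],
    [2, 2, 4, 4],
]
reduced_frame = downsample_by_two(full_resolution_frame)
```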

[0024]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Introduction

The present invention provides a system and method for transcoding a compressed bitstream of a digital video signal to a reduced spatial resolution with minimum drift. First, several content distribution applications that can use the transcoder according to the invention are described. Next, a basic method for generating a bitstream at a lower spatial resolution is analyzed. Based on this analysis, several alternatives to the basic method are described, together with the architecture associated with each alternative.

[0025] A first alternative (see FIG. 9) uses an open-loop architecture, while the other three alternatives (see FIGS. 10 and 11A-11B) correspond to closed-loop architectures that provide a means of compensating for the drift caused by down-sampling, requantization, and motion vector truncation. One of the closed-loop architectures performs this compensation at the reduced resolution, while the other two perform it at the original resolution in the DCT domain to ensure better quality.

[0026] As described in detail below, the open-loop architecture of FIG. 9 is of low complexity. It has no reconstruction loop, no DCT/IDCT blocks, and no frame memory, and its quality is reasonable for low picture resolutions and bit rates. This architecture is suitable for Internet applications and software implementations. The first closed-loop architecture, shown in FIG. 10, is of moderate complexity. It includes a reconstruction loop, IDCT/DCT blocks, and a frame memory. With this architecture, quality can be improved by compensating for drift in the reduced-resolution domain. The second closed-loop architecture, shown in FIG. 11A, is also of moderate complexity. It includes a reconstruction loop, IDCT/DCT blocks, and a frame memory. With this architecture, quality can be improved by compensating for drift in the original-resolution domain, but up-sampling of the reduced-resolution frames is required. The third closed-loop architecture uses a correction signal obtained in the reduced-resolution domain.

[0027] To support the architectures according to the invention, several additional techniques are also described for processing blocks that would otherwise have groups of macroblocks with "mixed" modes at the reduced resolution.

[0028] A group of blocks to be down-sampled, for example four blocks, is referred to as a "mixed block" when the group includes blocks coded in both intra-mode and inter-mode. In the MPEG standards, I-frames contain only macroblocks coded in intra-mode, whereas P-frames can contain both intra-mode and inter-mode coded blocks. These modes need to be respected, particularly during down-sampling; otherwise, the quality of the output can be degraded.
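The definition of a mixed block above amounts to a simple check on the coding modes of the group. The sketch below is a minimal Python illustration; representing the modes as the strings `"intra"` and `"inter"` and the function name are assumptions made here, not part of the patent.

```python
def is_mixed_group(modes):
    """Return True if a group contains both intra- and inter-coded blocks."""
    return "intra" in modes and "inter" in modes


# Modes of the four macroblocks that will be merged by down-sampling.
p_frame_group = ["inter", "intra", "inter", "inter"]
i_frame_group = ["intra", "intra", "intra", "intra"]
needs_special_handling = is_mixed_group(p_frame_group)
```

A transcoder could run such a check on every group in a P-frame before down-sampling and route mixed groups to the special handling described later in the document.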

[0029] Methods for drift compensation and for up-sampling DCT-based data are also described. These methods are useful for the second and third closed-loop architectures, because they allow the operations after up-sampling to be performed properly and without additional conversion steps.

[0030]

Applications for Reduced Spatial Resolution Transcoding

The primary target application of the present invention is the distribution of digital television (DTV) broadcast and Internet content to devices with low-resolution displays, such as wireless telephones, pagers, and personal digital assistants (PDAs). MPEG-2 is currently used as the compression format for DTV broadcast and DVD recording, and MPEG-1 content is available over the Internet.

[0031] Because MPEG-4 has been adopted as the compression format for video transmission over mobile networks, the present invention addresses methods for transcoding MPEG-1/2 content to lower-resolution MPEG-4 content.

[0032] FIG. 7 shows a first example of a multimedia content distribution system 700 that uses the invention. The system 700 includes an adaptive server 701 connected to clients 702 via an external network 703. A characteristic of this system is that the clients have small displays and are connected via low bit-rate channels. Therefore, the resolution of any content distributed to the clients 702 needs to be reduced.

[0033] Input source multimedia content 704 is stored in a database 710. The content is subject to a feature extraction and indexing process 720. A database server 740 allows the clients 702 to browse the content of the database 710 and to make requests for specific content. A search engine 730 can be used to locate multimedia content. After the desired content has been located, the database server 740 forwards the multimedia content to a transcoder 750 according to the invention.

[0034] The transcoder 750 reads the network and client characteristics. If the spatial resolution of the content is higher than the display characteristics of the client, the method according to the invention is used to reduce the resolution of the content to match the display characteristics of the client. The invention can also be applied when the bit rate on the network channel is lower than the bit rate of the content.
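The decision described above, comparing the content against the client display and the network channel, might be sketched as follows. The function name, its parameters, and the returned dictionary are hypothetical; the patent does not define an interface for reading network and client characteristics.

```python
def plan_transcoding(content_width, content_height, content_bitrate,
                     display_width, display_height, channel_bitrate):
    """Decide which reductions are needed for a given client and channel.

    Sizes are in pixels and bit rates in bits per second (assumed units).
    """
    return {
        # The content exceeds what the client display can show.
        "reduce_resolution": (content_width > display_width
                              or content_height > display_height),
        # The content needs more bandwidth than the channel offers.
        "reduce_bitrate": content_bitrate > channel_bitrate,
    }


# HD content requested by a client with a small display on a slow channel.
plan = plan_transcoding(1920, 1080, 8_000_000, 720, 480, 1_000_000)
```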

[0035] FIG. 8 shows a second example of a content distribution system 800. The system 800 includes a local "home" network 801, the external network 703, a broadcast network 803, and the adaptive server 701 described for FIG. 7. In this application, high-quality input source content 804 can be transported to clients 805 connected to the home network 801 via the broadcast network 803, for example a cable, terrestrial, or satellite network. The content is received by a set-top box or gateway 820 and stored in a local memory or hard disk drive (HDD) 830. The received content can be distributed to the clients 805 within the home. In addition, the content can be transcoded 850 to accommodate clients that do not have the capability to decode or display the full-resolution content. This is the case, for example, when a high-definition television (HDTV) bitstream is received for a standard-definition television set. Therefore, the content should be transcoded to match the capabilities of the clients within the home.

[0036] Furthermore, if a low-resolution external client 806 requests access, via the external network 802, to content stored in the local memory (HDD) 830, the transcoder 850 can be used to deliver low-resolution multimedia content to that client.

[0037] Analysis of the Basic Method
To design a transcoder with variable complexity and quality, the signals generated by the method shown in FIG. 4 are analyzed and described further. Regarding the notation in the equations, lowercase variables denote spatial-domain signals, while uppercase variables denote the equivalent signals in the DCT domain. A subscript on a variable indicates time. A superscript equal to 1 denotes a signal that has drift, and a superscript equal to 2 denotes a signal without drift. Drift is introduced by lossy processes such as re-quantization, motion vector truncation (rounding), and down-sampling. A method for compensating drift is described below.

[0038] I-frames
No motion-compensated prediction is possible for I-frames, that is,

[Equation 14]

and the signal is therefore down-sampled (410), that is,

[Equation 15]

It is then encoded in the encoder 120 according to

[Equation 16]

[0039] The signal

[Equation 17]

is subjected to the DCT 440 and then quantized (450) with the quantization parameter

[Equation 18]

The quantized signal

[Equation 19]

is variable-length coded (460) and written to the transcoded bitstream 402. As part of the motion compensation loop in the encoder,

[Equation 20]

is inverse quantized (470) and subjected to the IDCT 480. The resulting reduced-resolution reference signal

[Equation 21]

481 is stored in the frame buffer 490 as a reference for predicting future frames.

[0040] P-frames
For P-frames, the identity

[Equation 22]

yields the reconstructed full-resolution picture. As with I-frames, this signal is then down-converted according to equation (2). The reduced-resolution residual is then generated according to equation (5):

[Equation 23]

Equation (5) can be expressed equivalently as

[Equation 24]

[0041] The signal given by equation (6) is the reference signal that the architectures according to the invention approximate. Generating this reference signal is computationally complex, so it is desirable to approximate its quality while reducing the complexity substantially.

[0042] Open-Loop Architecture
Given the approximations

[Equation 25]

the reduced-resolution residual signal of equation (6) is expressed as

[Equation 26]

[0043] This equation suggests the open-loop architecture for the transcoder 900 shown in FIG. 9.

[0044] In the transcoder 900, the input bitstream 901 is variable-length decoded (910) to produce quantized DCT coefficients 911 and full-resolution motion vectors

[Equation 27]

902. The full-resolution motion vectors are mapped by the MV mapping 920 to reduced-resolution motion vectors

[Equation 28]

903. The quantized DCT coefficients 911 are inverse quantized with the quantizer

[Equation 29]

930 to produce the signal

[Equation 30]

931. This signal is then passed to the group-of-blocks processor 1300, which is described in detail below. The output of the processor 1300 is down-sampled (950) to produce the signal

[Equation 31]

951. After down-sampling, the signal is quantized (960) with the quantizer

[Equation 32]

Finally, the reduced-resolution re-quantized DCT coefficients and motion vectors are variable-length coded (970) and written to the transcoded output bitstream 902.

[0045] A preferred embodiment of the group-of-blocks processor 1300 is described in detail below. Briefly, the purpose of the processor is to pre-process selected groups of macroblocks so that the down-sampling process 950 does not generate macroblocks whose sub-blocks have different coding modes, for example both inter-block and intra-block modes. Mixed coding modes within a macroblock are not supported by any known video coding standard.

[0046] Drift Compensation in Reduced Resolution
Given only the approximation of equation (7b), the reduced-resolution residual signal of equation (6) is expressed as

[Equation 33]

[0047] This equation suggests the closed-loop architecture 1000 shown in FIG. 10, which compensates for drift at the reduced resolution.

[0048] In this architecture, the input signal 1001 is variable-length decoded (1010) to produce quantized DCT coefficients 1011 and full-resolution motion vectors

[Equation 34]

1012. The full-resolution motion vectors 1012 are mapped by the MV mapping 1020 to produce a set of reduced-resolution motion vectors

[Equation 35]

1021. The quantized DCT coefficients are inverse quantized (1030) with the quantizer

[Equation 36]

to produce the signal

[Equation 37]

1031. This signal is then passed to the group-of-blocks processor 1300 and down-sampled (1050). After the down-sampling 1050, a reduced-resolution drift-compensating signal 1051 is added (1060) to the low-resolution residual 1052 in the DCT domain.

[0049] The signal 1061 is quantized with the spatial quantizer

[Equation 38]

1070. Finally, the reduced-resolution re-quantized DCT coefficients 1071 and the motion vectors 1021 are variable-length coded (1080) to produce the transcoded output bitstream 1002.

[0050] The reference frame from which the reduced-resolution drift-compensating signal is generated is obtained by inverse quantizing (1090) the re-quantized residual

[Equation 39]

1071 and subtracting (1092) the result from the down-sampled residual

[Equation 40]

1052. This difference signal is subjected to the IDCT 1094 and added (1095) to the low-resolution predictive component 1096 of the previous macroblock stored in the frame memory 1091. The new signal represents the difference

[Equation 41]

1097 and is used as the reference for low-resolution motion compensation of the current block.

[0051] Low-resolution motion compensation 1098 is performed on the stored reference signal, and the prediction is subjected to the DCT 1099. The resulting DCT-domain signal is the reduced-resolution drift-compensating signal 1051. This operation is performed macroblock by macroblock using the set of low-resolution motion vectors

[Equation 42]

1021.

[0052] First Method of Drift Compensation in Original Resolution
For the approximation

[Equation 43]

the reduced-resolution residual signal of equation (6) is expressed as

[Equation 44]

[0053] This equation suggests the closed-loop architecture 1100 shown in FIG. 11, which compensates for drift in the original-resolution bitstream.

[0054] In this architecture, the input signal 1001 is variable-length decoded (1110) to produce quantized DCT coefficients 1111 and full-resolution motion vectors

[Equation 45]

1112. The quantized DCT coefficients 1111 are inverse quantized (1130) with the quantizer

[Equation 46]

to produce the signal

[Equation 47]

1131. This signal is then passed to the group-of-blocks processor 1300. After the group-of-blocks processing 1300, an original-resolution drift-compensating signal 1151 is added (1160) to the residual 1141 in the DCT domain. The signal 1162 is then down-sampled (1150) and quantized (1170) with the quantizer

[Equation 48]

Finally, the reduced-resolution re-quantized DCT coefficients 1171 and the motion vectors 1121 are variable-length coded (1180) and written to the transcoded bitstream 1102.

[0055] The reference frame from which the original-resolution drift-compensating signal 1151 is generated is obtained by inverse quantizing (1190) the re-quantized residual

[Equation 49]

1171 and up-sampling (1191) the result. In this method, the up-sampled signal is then subtracted (1192) from the original-resolution residual 1161. This difference signal is subjected to the IDCT 1194 and added (1195) to the original-resolution predictive component 1196 of the previous macroblock. The new signal represents the difference

[Equation 50]

1197 and is used as the reference for motion compensation of the current macroblock at the original resolution.

[0056] Original-resolution motion compensation 1198 is performed on the reference signal stored in the frame buffer 1181, and the prediction is subjected to the DCT 1199. The resulting DCT-domain signal is the original-resolution drift-compensating signal 1151. This operation is performed macroblock by macroblock using the set of original-resolution motion vectors

[Equation 51]

1121.

[0057] Second Method of Drift Compensation in Original Resolution
FIG. 11B shows an alternative form of the closed-loop architecture shown in FIG. 11A. In this variant, the output of the inverse quantization 1190 of the re-quantized residual

[Equation 52]

1172 is subtracted (1192) from the reduced-resolution signal before the up-sampling 1191.

[0058] Neither of the two original-resolution drift-compensating architectures described above uses a motion vector approximation to generate the drift-compensating signal 1151. This is made possible by the up-sampling 1191. The two architectures differ mainly in the choice of the signals used to generate the difference signal. In the first method, the difference signal represents the error due to both re-quantization and resolution conversion, whereas in the second method the difference signal accounts only for the error due to re-quantization.

[0059] Because the up-sampled signal is not considered in the future decoding of the transcoded bitstream, it is reasonable to exclude from the drift-compensating signal any error measured by the consecutive down-sampling and up-sampling. Up-sampling is nevertheless employed for two reasons: to use the full-resolution motion vectors 1121 and so avoid any further approximation, and to bring the drift-compensating signal to the original resolution so that it can be added (1160) to the incoming residual 1161 before the down-sampling 1150.

[0060] Mixed Block Processor
The purpose of the group-of-blocks processor 1300 is to pre-process selected macroblocks so that the down-sampling process does not generate macroblocks whose sub-blocks have different coding modes, for example inter-block and intra-block modes. Mixed coding modes within a macroblock are not supported by any known video coding standard.

[0061] FIG. 12 shows an example of a group of macroblocks 1201 that can yield a group of blocks 1202 at the reduced resolution after transcoding 1203. In this example there are three inter-mode blocks and one intra-mode block. Note that the motion vector (MV) of the intra-mode block is zero. Whether a particular group of blocks is a mixed group depends solely on the macroblock modes. The group-of-blocks processor 1300 considers groups of four macroblocks 1201 that form a single macroblock 1202 at the reduced resolution. In other words, for the luminance component, MB(0) 1210 corresponds to sub-block b(0) 1220 in the reduced-resolution macroblock 1202. Similarly, MB(1) 1211 corresponds to b(1) 1221, MB(k) 1212 corresponds to b(2) 1222, and MB(k+1) 1213 corresponds to b(3) 1223, where k is the number of macroblocks per row at the original resolution. The chrominance components are handled in a similar manner that is consistent with the luminance modes.
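The correspondence between MB(0), MB(1), MB(k), MB(k+1) and the sub-blocks b(0) to b(3) can be sketched as a small index calculation. This assumes, beyond what the text states, that macroblocks are numbered in raster order and that k counts the macroblocks in one row of the original picture. The function name is hypothetical.

```python
def group_macroblock_indices(group_row: int, group_col: int, k: int) -> list:
    """Raster indices of the four full-resolution macroblocks that map to
    sub-blocks b(0)..b(3) of one reduced-resolution macroblock.

    Assumptions: raster-order numbering, k macroblocks per row at the
    original resolution, and k even so that rows split into 2x2 groups.
    """
    if k <= 0 or k % 2:
        raise ValueError("k must be a positive even number")
    top_left = (2 * group_row) * k + 2 * group_col
    # Order matches b(0), b(1), b(2), b(3): MB(0), MB(1), MB(k), MB(k+1)
    # relative to the top-left macroblock of the group.
    return [top_left, top_left + 1, top_left + k, top_left + k + 1]
```

For the first group in a picture that is four macroblocks wide, `group_macroblock_indices(0, 0, 4)` yields the indices 0, 1, 4 and 5.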

[0062] The group of MB modes determines whether the group-of-blocks processor 1300 should process a particular MB (macroblock). The group of blocks is processed if it contains at least one intra-mode block and at least one inter-mode block. After a macroblock is selected, its DCT coefficients and motion vector data are modified.
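The selection rule above can be written as a one-line test. The mode labels `INTRA` and `INTER` and the function name are illustrative choices, not identifiers from the specification.

```python
INTRA = "intra"
INTER = "inter"


def is_mixed_group(modes) -> bool:
    """True if the group holds at least one intra-mode and at least one
    inter-mode macroblock, i.e. the group needs pre-processing."""
    modes = list(modes)
    return INTRA in modes and INTER in modes
```

A group such as the one in FIG. 12, with three inter-mode blocks and one intra-mode block, would be selected, while an all-inter or all-intra group would pass through unchanged.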

[0063] FIG. 13 shows the components of the group-of-blocks processor 1300. For a selected group of mixed blocks 1301, the processor performs mode mapping 1310, motion vector modification 1320, and DCT coefficient modification 1330 to produce an output non-mixed block 1302. Once the group of blocks 1301 has been identified, the modes of the macroblocks are modified so that all macroblocks have the same mode. This is done according to a pre-specified strategy for matching the modes of each sub-block in the reduced-resolution block.

[0064] In accordance with the chosen mode mapping, the MV data are then modified (1320). Possible modifications that agree with the corresponding mode mappings are described in detail below with reference to FIGS. 14A to 14C. Finally, given the new MB (macroblock) modes and MV (motion vector) data, the corresponding DCT coefficients are also modified (1330) to agree with the mapping.

[0065] In a first embodiment of the group-of-blocks processor, shown in FIG. 14A, the MB modes of the group of blocks 1301 are changed to inter mode by the mode mapping 1310. The MV data of the intra blocks are therefore reset to zero by the motion vector processing, and the DCT coefficients corresponding to the intra blocks are also reset to zero by the DCT processing 1330. In this way, the converted blocks are replicated with data from the corresponding block in the reference frame.
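A minimal sketch of this first strategy follows. It simplifies each macroblock to a mode, one motion vector, and a single 8×8 coefficient array, which is an assumption made for brevity. `Macroblock` and `map_group_to_inter` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


def _zero_block() -> List[List[int]]:
    return [[0] * 8 for _ in range(8)]


@dataclass
class Macroblock:
    mode: str                              # "intra" or "inter"
    mv: Tuple[int, int] = (0, 0)           # motion vector
    dct: List[List[int]] = field(default_factory=_zero_block)


def map_group_to_inter(group: List[Macroblock]) -> List[Macroblock]:
    """FIG. 14A-style mapping: in a mixed group, turn every intra block into
    an inter block with zero motion vector and zero residual coefficients."""
    modes = {mb.mode for mb in group}
    if modes != {"intra", "inter"}:
        return list(group)                 # not a mixed group: leave untouched
    out = []
    for mb in group:
        if mb.mode == "intra":
            # Zero MV plus zero residual makes the decoder copy the
            # co-located block from the reference frame.
            out.append(Macroblock("inter", (0, 0), _zero_block()))
        else:
            out.append(mb)
    return out
```

The appeal of this mapping is its low cost, since no coefficients have to be recomputed. The price is that the intra block's texture is discarded.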

[0066] In a second embodiment of the group-of-blocks processor, shown in FIG. 14B, the MB modes of the mixed group of blocks are again changed to inter mode by the mapping 1310. Unlike in the first embodiment, however, the MV data for the intra MBs (macroblocks) are predicted. The prediction is based on data in neighboring blocks, which can include both texture and motion data. Based on the predicted motion vector, a new residual for the modified block is calculated. In the final step 1320, the inter DCT coefficients are reset to intra DCT coefficients.

[0067] In a third embodiment, shown in FIG. 14C, the MB modes of the group of blocks are changed (1310) to intra mode. Because no motion information is associated with the reduced-resolution macroblock in this case, all associated motion vector data are reset (1320) to zero. This must be done in the transcoder because the motion vectors of neighboring blocks are predicted from the motion of this block. To ensure proper reconstruction in the decoder, the MV data for the group of blocks must be reset to zero in the transcoder. The final step 1330 generates intra DCT coefficients to replace the inter DCT coefficients, as described above.

[0068] To implement the second and third embodiments described above, a decoding loop that reconstructs to full resolution can be used. The reconstructed data can serve as a reference for converting DCT coefficients between intra and inter modes, in either direction. However, the use of such a loop is not required, because the conversion can alternatively be performed within the drift-compensating loop.

[0069] For a sequence of frames with a small amount of motion and a low level of detail, the low-complexity strategy of FIG. 14A can be used. Otherwise, the strategy of FIG. 14B or FIG. 14C, with correspondingly higher complexity, should be used. Note that the strategy of FIG. 14C provides the best quality.

[0070] Drift Compensation with Block Processing
The group-of-blocks processor 1300 can also be used to control or minimize drift. Because intra-coded blocks are not subject to drift, converting inter-coded blocks to intra-coded blocks lessens the impact of drift.

[0071] In a first step 1350 of FIG. 14C, the amount of drift in the compressed bitstream is measured. For the closed-loop architectures, the drift can be measured according to the energy of the difference signal generated by 1092 and 1192, or the energy of the drift-compensating signal stored in 1091 and 1191. The energy of a signal can be computed by any well-known method. The computed energy accounts for the various approximations, including re-quantization, down-sampling, and motion vector truncation (rounding).
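The text leaves the energy computation open. One common choice is the sum of squared samples, sketched here under that assumption with a hypothetical function name.

```python
def signal_energy(block) -> float:
    """Energy of a 2-D signal, taken here as the sum of squared samples.
    The specification only says a well-known method can be used; this
    particular definition is an assumption."""
    return float(sum(v * v for row in block for v in row))
```

Applied to a difference signal, a larger value indicates a larger mismatch between the two signals being compared and hence more accumulated drift.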

[0072] Another method of computing the drift, which is also applicable to open-loop architectures, estimates the error introduced by truncated (rounded) motion vectors. It is known that half-pixel motion vectors at the original resolution lead to large reconstruction errors when the resolution is reduced. Full-pixel motion vectors are not subject to such errors, because they can still be mapped correctly to half-pixel locations. One possible way to measure the drift is therefore to record the percentage of half-pixel motion vectors. However, because the impact of the motion vector approximation depends on the complexity of the content, the measured drift may instead be a function of the residual components associated with the blocks that have half-pixel motion vectors.
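Recording the percentage of half-pixel motion vectors could look like the sketch below. It assumes that motion vectors are stored as integer pairs in half-pixel units, so an odd component marks a half-pixel position. That representation and the function name are assumptions.

```python
def half_pel_percentage(motion_vectors) -> float:
    """Percentage of motion vectors with at least one half-pixel component.

    motion_vectors: iterable of (dx, dy) integer pairs in half-pixel units
    (assumed representation); an odd value is a half-pixel displacement.
    """
    mvs = list(motion_vectors)
    if not mvs:
        return 0.0
    half = sum(1 for dx, dy in mvs if dx % 2 or dy % 2)
    return 100.0 * half / len(mvs)
```

The same count can be restricted to a sub-region of a frame by passing only the motion vectors of the macroblocks in that region.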

[0073] The methods that use the energy of the difference signal and the motion vector data to measure drift can be used in combination, and they can also be applied over sub-regions of a frame. Considering sub-regions of a frame is advantageous because it allows the locations of the macroblocks that benefit most from drift compensation to be identified. To use the methods in combination, the drift is measured by the energy of the difference signal, or of the drift-compensating signal, for the macroblocks that have half-pixel motion vectors at the original resolution.

[0074] In a second step, the measured value of the drift is translated into an "intra refresh rate" 1351 that is used as input to the group-of-blocks processor 1300. Controlling the percentage of intra-coded blocks has previously been considered for encoding video for error-resilient transmission. See, for example, "Analysis of Video Transmission over Lossy Channels," Journal of Selected Areas of Communications, by Stuhlmuller, et al., 2000. In that work, a back-channel from the receiver to the encoder communicates the amount of loss incurred by the transmission channel, and intra-coded blocks are encoded directly from the source to counter the errors caused by lost data in a predictive coding scheme.

[0075] In contrast, the invention generates new intra blocks in the compressed domain for video that has already been encoded, and the conversion from inter mode to intra mode is accomplished by the group-of-blocks processor 1300.

[0076] If the drift exceeds a threshold amount of drift, the group-of-blocks processor 1300 of FIG. 14C is invoked to convert inter-mode blocks to intra-mode blocks. In this case, the conversion is performed at a fixed, pre-specified intra refresh rate. Alternatively, the conversion can be performed at an intra refresh rate that is proportional to the amount of drift measured. The rate-distortion characteristics of the signal can also be taken into account to make an appropriate trade-off between the intra refresh rate and the quantizers used for coding intra and inter blocks.
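The two alternatives above, a fixed rate and a rate proportional to the measured drift, can be sketched in one function. The parameter names, the default values, and the clamping to a maximum rate are illustrative assumptions. The specification gives no numeric values.

```python
def intra_refresh_rate(drift: float, threshold: float,
                       fixed_rate: float = 0.1,
                       gain: float = None,
                       max_rate: float = 1.0) -> float:
    """Fraction of inter blocks to convert to intra blocks (hypothetical helper).

    Below the threshold no conversion is requested. Above it, the rate is
    either the fixed, pre-specified value or, when `gain` is given,
    proportional to the measured drift and clamped to `max_rate`.
    """
    if drift <= threshold:
        return 0.0
    if gain is None:
        return fixed_rate
    return min(max_rate, gain * drift)
```

A rate-distortion-aware variant would additionally weigh the bits spent on the new intra blocks against the quantizer step sizes, which this sketch does not attempt.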

[0077] Note that the invention generates new intra blocks in the compressed domain, and that this form of drift compensation can be performed in any transcoder, with or without resolution reduction.

[0078] Down-Sampling
Any down-sampling method can be used by the transcoder according to the invention. However, the preferred down-sampling method is the one described in U.S. Pat. No. 5,855,151, "Method and apparatus for down-converting a digital signal," dated Nov. 10, 1999, by Sun et al., which is incorporated herein by reference.

[0079] The concept of this down-sampling method is shown in FIG. 15A. A group includes four

[Equation 53]

DCT blocks 1501. That is, the size of the group is

[Equation 54]

A frequency synthesis or filtering 1510 is applied to the group of blocks to generate a single

[Equation 55]

DCT block 1511. The down-sampled DCT block 1512 can be extracted from this synthesized block.

[0080] This operation has been described for the DCT domain using 2D operations, but it can also be performed using separable 1D filters. The operations can also be performed entirely in the spatial domain. Equivalent spatial-domain filters can be derived using the methods described in U.S. patent application Sn. 09/135,969, "Three layer scalable decoder and method of decoding," dated Mar. 6, 1998, by Vetro et al., which is also incorporated herein by reference.

[0081] The main advantage of using this downsampling method in the transcoder according to the invention is that the correct dimension of the sub-blocks within a macroblock is obtained directly. For example, a single 8×8 block can be formed from four 8×8 DCT blocks. Conventional downsampling methods, by contrast, produce downsampled data whose dimension is not equal to the dimension required of the output sub-blocks of the macroblock. For example, four 4×4 DCT blocks are obtained from the 8×8 DCT blocks. The conventional methods therefore require an additional step to compose a single 8×8 DCT block.
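The dimension argument in this paragraph can be illustrated with a small 1-D sketch. It is only an illustration under stated assumptions: the merge is done by going through the spatial domain and averaging neighbouring samples, which is a stand-in for, not a reproduction of, the frequency-synthesis filter of the cited patent. The helper names `dct`, `idct` and `downsample_pair` are hypothetical.

```python
import math

def dct(x):
    """Orthonormal N-point DCT-II of a list of samples (textbook form)."""
    n = len(x)
    out = []
    for q in range(n):
        z = math.sqrt(0.5) if q == 0 else 1.0
        s = sum(x[i] * math.cos((2 * i + 1) * q * math.pi / (2 * n))
                for i in range(n))
        out.append(z * math.sqrt(2.0 / n) * s)
    return out

def idct(X):
    """Inverse of dct() above."""
    n = len(X)
    out = []
    for i in range(n):
        s = 0.0
        for q in range(n):
            z = math.sqrt(0.5) if q == 0 else 1.0
            s += z * X[q] * math.cos((2 * i + 1) * q * math.pi / (2 * n))
        out.append(math.sqrt(2.0 / n) * s)
    return out

def downsample_pair(A, B):
    """Merge two N-point DCT blocks into ONE N-point DCT block (1-D case).

    Assumed filter: average each pair of neighbouring spatial samples.
    The output already has the block size a standard macroblock expects.
    """
    pixels = idct(A) + idct(B)                      # 2N spatial samples
    halved = [(pixels[2 * i] + pixels[2 * i + 1]) / 2.0
              for i in range(len(A))]               # N spatial samples
    return dct(halved)

if __name__ == "__main__":
    A = dct([10.0] * 8)
    B = dct([20.0] * 8)
    C = downsample_pair(A, B)
    print(len(C), [round(v, 6) for v in idct(C)])
```

In 2-D the same idea applied along rows and then columns turns four 8×8 blocks into one 8×8 block, so no extra regrouping step is needed afterwards.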

[0082] The filter described above is a useful component for efficiently implementing the architecture shown in FIG. 11, which requires upsampling. In general, the filters derived here can be applied to any system that needs to perform arithmetic operations on upsampled DCT data, with or without resolution reduction or drift compensation.

Upsampling

[0083] Any conventional upsampling means can be used in the present invention. However, the previously cited U.S. patent application by Vetro et al., "Three layer scalable decoder and method of decoding", states that the optimal upsampling method depends on the downsampling method. It is therefore advantageous to use the upsampling filter [Equation 57] that corresponds to the downsampling filter [Equation 56]. The relationship between the two filters is given by the following equation.

[Equation 58]

[0084] There are two problems associated with the filters derived from the above equation. The first is that, because DCT filters are not invertible, these filters are applicable only as spatial-domain filters. This is not a serious problem, however, since the corresponding spatial-domain filter can be derived and then converted to the DCT domain.

[0085] The second problem is more serious: the upsampling filter obtained in this way corresponds to the process shown in FIG. 15B. In this process, for example, a single [Equation 59] block 1502 is upsampled (1520) to a single [Equation 60] block 1530. No problem arises if the upsampling is performed entirely in the spatial domain. If the upsampling is performed in the DCT domain, however, one must deal with a single [Equation 61] DCT block, that is, a single DCT component. This is unsuitable for operations that require the upsampled DCT data to be in the standard MB format, that is, four [Equation 62] DCT blocks (where N = 4). In other words, the upsampled blocks should have the same format or dimension as the original block, only there are more of them.

[0086] The above upsampling method in the DCT domain is not suitable for use in the transcoder described in connection with the present invention. Referring to FIG. 11A, the upsampled DCT data is subtracted from the DCT data output by the mixed block processor 1300. The DCT data of these two blocks must have the same format. A filter capable of performing the upsampling shown in FIG. 15C is therefore required. Here, a single [Equation 63] block 1502 is upsampled (1540) to four [Equation 64] blocks 1550. Because such a filter has not previously been considered and does not exist in the prior art, the equations for the 1D case are described below.

[0087] Regarding the notation in the equations below, lowercase variables denote spatial-domain signals, while uppercase variables denote the equivalent signals in the DCT domain.

[0088] As shown in FIG. 16, C 1601 denotes the DCT block to be upsampled in the DCT domain, and c 1602 denotes the equivalent block in the spatial domain. The two blocks are related to each other by the definitions of the N-pt DCT and IDCT 1603; see, for example, Rao and Yip, "Discrete Cosine Transform: Algorithms, Advantages and Applications", Academic, Boston, 1990. For convenience, the definitions are given below.

[0089] The DCT is defined by

[Equation 65]

and the IDCT is defined by

[Equation 66]

In equations (13) and (14) above,

[Equation 67]
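Because the equation images for (13) and (14) are not reproduced in this text, the following sketch uses the common orthonormal textbook form of the N-point DCT and IDCT, written as a matrix. This normalisation is an assumption and may differ in scaling from the patent's own equations; the function names are hypothetical.

```python
import math

def dct_matrix(n):
    """N x N matrix T with T[q][i] = z_q * sqrt(2/N) * cos((2i+1) q pi / 2N),
    where z_0 = 1/sqrt(2) and z_q = 1 otherwise (assumed normalisation)."""
    return [[(math.sqrt(0.5) if q == 0 else 1.0) * math.sqrt(2.0 / n)
             * math.cos((2 * i + 1) * q * math.pi / (2 * n))
             for i in range(n)]
            for q in range(n)]

def forward(T, c):
    """DCT: uppercase C from lowercase c, C = T c."""
    return [sum(T[q][i] * c[i] for i in range(len(c))) for q in range(len(T))]

def inverse(T, C):
    """IDCT: c = T^t C, valid because T is orthonormal."""
    n = len(T)
    return [sum(T[q][i] * C[q] for q in range(n)) for i in range(n)]

if __name__ == "__main__":
    T = dct_matrix(4)
    c = [1.0, 2.0, 3.0, 4.0]
    C = forward(T, c)
    print([round(v, 6) for v in C])
    print([round(v, 6) for v in inverse(T, C)])
```

With this normalisation the transform matrix is orthonormal, so the inverse is simply the transpose, which keeps the later filter derivation compact.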

[0090] Given the above, block E 1610 denotes the upsampled DCT block obtained by filtering C with [Equation 68] 1611, and e denotes the upsampled spatial-domain block obtained by filtering c with [Equation 69] 1621, which is given by equation (12). Note that e and E are related by the 2N-pt DCT/IDCT 1630. The input/output relations of the filtered input are given by the following equations.

[Equation 70]

[0091] Referring to FIG. 16, the desired DCT blocks are denoted A 1611 and B 1612. The aim is to derive filters [Equation 71] 1641 and [Equation 72] 1642 that can be used to compute A and B, respectively, directly from C.

[0092] In the first step, equation (14) is substituted into equation (16b).

[0093] The resulting equation expresses the spatial-domain output e as a function of the DCT input C:

[Equation 73]

[0094] To express A and B in terms of C using equation (17), note that the spatial-domain relationship among a, b and e is

[Equation 74]

where i denotes the spatial-domain index. The DCT-domain expression for a is given by

[Equation 75]

[0095] From equations (17) to (19), the following is obtained:

[Equation 76]

which can be written equivalently as

[Equation 77]

where

[Equation 78]

Similarly,

[Equation 79]

which can be written equivalently as

[Equation 80]

where

[Equation 81]

[0096] The above filters can then be used to upsample a single block of a given dimension into multiple blocks, each having the same dimension as the original block. In general, the filters derived here can be applied to any system that requires arithmetic operations on upsampled DCT data.

[0097] To implement the filters given by equations (22) and (25), consider a k×q matrix of filter taps, where k is the index of the output pixel and q is the index of the input pixel. For 1D data, the output pixels are computed as a matrix multiplication. For 2D data, two steps are taken. First, the data is upsampled in a first direction, for example horizontally. The horizontally upsampled data is then upsampled in the second direction, for example vertically. The order of the two directions can be reversed without affecting the result.
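The 1-D case can be sketched as two N×N tap matrices that map the DCT block C directly to the DCT blocks A and B. The construction follows the steps of the derivation (inverse transform, spatial upsampling, split into two halves, forward transform), but the spatial filter used here, simple sample replication, is an assumption chosen for brevity and is not the filter of equation (12). The orthonormal DCT normalisation and all function names are likewise assumptions.

```python
import math

def dct_matrix(n):
    """Orthonormal N-point DCT-II matrix (assumed normalisation)."""
    return [[(math.sqrt(0.5) if q == 0 else 1.0) * math.sqrt(2.0 / n)
             * math.cos((2 * i + 1) * q * math.pi / (2 * n))
             for i in range(n)] for q in range(n)]

def transpose(P):
    return [list(col) for col in zip(*P)]

def matmul(P, Q):
    return [[sum(P[r][k] * Q[k][c] for k in range(len(Q)))
             for c in range(len(Q[0]))] for r in range(len(P))]

def matvec(P, v):
    return [sum(P[r][k] * v[k] for k in range(len(v))) for r in range(len(P))]

def replication_filter(n):
    """Assumed 2N x N spatial upsampling filter: e[i] = c[i // 2]."""
    return [[1.0 if j == i // 2 else 0.0 for j in range(n)]
            for i in range(2 * n)]

def dct_domain_upsamplers(n, x_u):
    """Return (X_a, X_b): k x q tap matrices with A = X_a C and B = X_b C.

    a is the first half of e = x_u c and b the second half, so
    X_a = T * x_u[0:N] * T^t and X_b = T * x_u[N:2N] * T^t.
    """
    T = dct_matrix(n)
    Tt = transpose(T)
    X_a = matmul(matmul(T, x_u[:n]), Tt)
    X_b = matmul(matmul(T, x_u[n:]), Tt)
    return X_a, X_b

if __name__ == "__main__":
    n = 4
    T = dct_matrix(n)
    C = matvec(T, [1.0, 2.0, 3.0, 4.0])
    X_a, X_b = dct_domain_upsamplers(n, replication_filter(n))
    print([round(v, 6) for v in matvec(X_a, C)])
    print([round(v, 6) for v in matvec(X_b, C)])
```

Each output block has N coefficients, the same size as the input block, which is the property the text requires of the upsampling filter.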

[0098] For horizontal upsampling, each row of the block is operated on independently and treated as an N-dimensional input vector. Each input vector is filtered according to equations (21) and (24). The output of this process is two standard DCT blocks.

[0099] For vertical upsampling, each column of the block is operated on independently and treated as an N-dimensional input vector. As with horizontal upsampling, each input vector is filtered according to equations (21) and (24). The output of this process is four standard DCT blocks, as shown in FIG. 15C.
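The two-pass procedure can be sketched for a single N×N DCT block as follows. As before, this is an illustration under assumptions: the tap matrices are built from a sample-replication upsampler and an orthonormal DCT, neither of which is taken from the patent's equations, and the names `upsamplers` and `upsample_2d` are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal N-point DCT-II matrix (assumed normalisation)."""
    return [[(math.sqrt(0.5) if q == 0 else 1.0) * math.sqrt(2.0 / n)
             * math.cos((2 * i + 1) * q * math.pi / (2 * n))
             for i in range(n)] for q in range(n)]

def transpose(P):
    return [list(col) for col in zip(*P)]

def matmul(P, Q):
    return [[sum(P[r][k] * Q[k][c] for k in range(len(Q)))
             for c in range(len(Q[0]))] for r in range(len(P))]

def upsamplers(n):
    """DCT-domain tap matrices for an assumed sample-replication upsampler."""
    T = dct_matrix(n)
    Tt = transpose(T)
    rep = [[1.0 if j == i // 2 else 0.0 for j in range(n)]
           for i in range(2 * n)]
    return matmul(matmul(T, rep[:n]), Tt), matmul(matmul(T, rep[n:]), Tt)

def upsample_2d(C):
    """Upsample one N x N DCT block into a 2 x 2 grid of N x N DCT blocks."""
    n = len(C)
    X_a, X_b = upsamplers(n)
    # Horizontal pass: every row is filtered, giving a left and a right block.
    left = matmul(C, transpose(X_a))
    right = matmul(C, transpose(X_b))
    # Vertical pass: every column of both intermediate blocks is filtered.
    return [[matmul(X_a, left), matmul(X_a, right)],
            [matmul(X_b, left), matmul(X_b, right)]]

if __name__ == "__main__":
    C = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    blocks = upsample_2d(C)
    print(len(blocks), len(blocks[0]), len(blocks[0][0]), len(blocks[0][0][0]))
```

Both passes are plain matrix products, so swapping the horizontal and vertical passes only changes the order of multiplication and yields the same four blocks up to floating-point rounding.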

Drift Error Analysis of the Open-Loop Transcoder

[0100] An analysis of the drift error caused by a reduced-resolution transcoder is given below. The analysis is based on the open-loop transcoder shown in FIG. 9. In this transcoder, the reduced-resolution residual is given by

[Equation 82]

Compared with equation (6), the drift error [Equation 83] is expressed as

[Equation 84]

where

[Equation 85]

In the above equation, the drift error has two components. The first component, [Equation 86], represents the error in the reference frames used for motion compensation. This error is caused by requantization, which eliminates non-zero DCT coefficients, and by arithmetic errors due to integer truncation. It is a drift error common to many transcoders; see Assuncao et al., "A frequency domain video transcoder for dynamic bit-rate reduction of MPEG-2 bitstreams", IEEE Transactions on Circuits and Systems for Video Technology, pp. 953-957, 1998. In this case, the frame originally used as a reference by the transcoder differs from the corresponding frame in the decoder, which causes a mismatch between the prediction and residual components. The second component, [Equation 87], is due to the non-commutativity of motion compensation and downsampling, and is specific to reduced-resolution transcoding.

Two main factors contribute to the effect of [Equation 88]: motion vector (MV) mapping and downsampling. When MVs are mapped from the original resolution to the reduced resolution, they are truncated because of the limited precision with which MVs are coded. When downsampling to a lower spatial resolution in the compressed domain, block constraints are often observed in order to avoid filters that overlap between blocks. These constraints reduce the complexity of the system, but they degrade the quality of the downsampling process and usually introduce some error. Regardless of the magnitude of these errors for a single frame, the combination of these two transformations generally creates a larger mismatch between the prediction and residual components, and the mismatch grows with each successively predicted frame.

To illustrate this mismatch between the prediction and residual components caused by the non-commutativity of motion compensation and downsampling, consider an example using a 1-D signal, and ignore all drift error due to requantization (that is, [Equation 89]). Let [Equation 90] denote the reconstructed block, [Equation 91] the reference block, and [Equation 92] the error (residual) block, all at the original resolution. Further, let [Equation 93] denote the full-resolution motion compensation filter and [Equation 94] the reduced-resolution motion compensation filter. The reconstructed block at the original resolution is then given by

[Equation 95]

Applying the down-conversion process to both sides of equation (30) gives

[Equation 96]

The quality of the signal produced by the above equation is not affected by the drift error contained in [Equation 97]. However, this is not the signal produced by the reduced-resolution transcoder. The actual reconstructed signal is given by

[Equation 98]

Because [Equation 99], there is a mismatch between the reduced-resolution prediction and residual components. To achieve the quality produced by equation (31), either or both of the prediction and residual components must be modified so that they match each other. In the reference transcoder of FIG. 4, this mismatch is eliminated by a second coding loop that determines a new reduced-resolution residual. With this second loop, the prediction and residual components are realigned.
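The non-commutativity in the 1-D example can be made concrete with a toy calculation. Everything in this sketch is an assumption made for illustration: motion compensation is modelled as an integer shift with edge clamping, downsampling as averaging of sample pairs, and MV mapping as halving with truncation. None of these is claimed to be the filter pair used in the equations above.

```python
def motion_compensate(ref, mv):
    """Predict each sample from ref[i + mv], clamping at the block edges."""
    n = len(ref)
    return [ref[min(max(i + mv, 0), n - 1)] for i in range(n)]

def downsample(x):
    """2:1 downsampling by averaging neighbouring sample pairs."""
    return [(x[2 * i] + x[2 * i + 1]) / 2.0 for i in range(len(x) // 2)]

def map_mv(mv):
    """Map a full-resolution MV to the reduced resolution (truncating)."""
    return int(mv / 2)

def mismatch(ref, mv):
    """D(M_f(ref)) minus M_r(D(ref)), sample by sample."""
    ideal = downsample(motion_compensate(ref, mv))
    actual = motion_compensate(downsample(ref), map_mv(mv))
    return [a - b for a, b in zip(ideal, actual)]

if __name__ == "__main__":
    ref = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]
    print(mismatch(ref, 3))  # odd MV: truncation makes the two paths differ
    print(mismatch(ref, 2))  # even MV: paths agree away from the clamped edge
```

With an odd full-resolution MV the truncated reduced-resolution MV points half a sample away from where the downsampled prediction should come from, so the two orders of operation disagree on every sample; this is the kind of mismatch that accumulates from frame to frame.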

Reduced Resolution Transcoding with Drift Compensation

[0101] With the approximation

[Equation 100]

the reduced-resolution residual signal of equation (6) can be expressed as follows.

[Equation 101]

The above equation suggests the closed-loop transcoder shown in FIG. 17, which compensates for drift in the reduced-resolution signal.

Video Transcoder with Drift Compensation by Partial Encoding

FIG. 17 is a block diagram of a closed-loop transcoder 1700 according to the present invention that reduces spatial resolution with drift compensation in the reduced-resolution signal. The transcoder 1700 includes a decoder 1703 and a partial encoder 1704. In the transcoder 1700, an input signal 1701, that is, a sequence of frames of a compressed video bitstream, is supplied to the decoder 1703. The decoder 1703 includes VLD 1710, inverse quantization 1720, IDCT 1730 and motion compensation 1740. The decoded frames are stored in a first frame buffer 1760 for motion compensation (1740) during the decoding (1703), in which the full-resolution motion-compensated prediction of each previously decoded frame is added (1780) to the next frame being decoded. Each frame of the decoded bitstream is downsampled by the down-conversion block 1750. The reduced-resolution frames are stored in a second frame buffer 1760 for motion compensation (1770) during the partial encoding (1704), in which the motion-compensated prediction of the previous reduced-resolution frame is subtracted (1782) from the current reduced-resolution frame.

Motion compensation in the decoder 1703 for the full-resolution frames uses the full-resolution motion vectors [Equation 102], while motion compensation (1770) in the partial encoder 1704 for the reduced-resolution frames uses the reduced-resolution motion vectors [Equation 103]. The reduced-resolution motion vectors are either estimated from the downsampled spatial-domain frames or mapped (1765) from the full-resolution motion vectors. The reduced-resolution residual is obtained by subtracting (1782) the motion-compensated prediction of the previous reduced-resolution frame from the current reduced-resolution frame. The reduced-resolution residual is then subjected to DCT 1783, quantization 1784 and variable-length coding (VLC) 1786 to produce the reduced-resolution, drift-compensated output transcoded bitstream 1702.

The transcoder 1700 according to the present invention reduces the drift error caused by [Equation 104]. Because [Equation 105] is usually significantly larger than [Equation 106], the transcoder 1700 minimizes the complexity associated with fully reconstructing the reference frame, which prior-art decoders normally do in order to form the motion-compensated prediction. The inverse quantization 470, IDCT 480 and addition operations of the prior-art decoder 400 shown in FIG. 4 are therefore omitted. The drift compensation according to the present invention can be regarded as a full decoding followed by a partial encoding in which, unlike in FIG. 4, the requantization error is not compensated. Finally, note that because full-resolution decoding is performed in the transcoder 1700, the mixed-block problem of the prior art does not arise.

Although the present invention has been described by way of examples of preferred embodiments, it is to be understood that various other adaptations and modifications can be made within the spirit and scope of the invention. Therefore, the object of the appended claims is to cover all such variations and modifications as come within the true spirit and scope of the invention.
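The data flow of the closed-loop structure can be sketched with a 1-D toy model and contrasted with an open-loop structure. This is a simplification under loudly stated assumptions: frames are short lists of samples, motion compensation is an integer shift with edge clamping, downsampling averages sample pairs, MVs are mapped by truncated halving, and the DCT, quantization and variable-length coding stages are omitted entirely. All function names are hypothetical.

```python
def motion_compensate(ref, mv):
    """Assumed MC model: integer shift with clamping at the edges."""
    n = len(ref)
    return [ref[min(max(i + mv, 0), n - 1)] for i in range(n)]

def downsample(x):
    """Assumed down-conversion: average neighbouring sample pairs."""
    return [(x[2 * i] + x[2 * i + 1]) / 2.0 for i in range(len(x) // 2)]

def decode(residuals, mvs):
    """Decoder loop: frame = MC(previous frame) + residual (first is intra)."""
    frames, prev = [], None
    for e, mv in zip(residuals, mvs):
        if prev is None:
            cur = list(e)
        else:
            cur = [p + r for p, r in zip(motion_compensate(prev, mv), e)]
        frames.append(cur)
        prev = cur
    return frames

def closed_loop_residuals(full_frames, mvs):
    """Partial encoder: new residual from the stored reduced-resolution frame."""
    out, prev = [], None
    for x, mv in zip(full_frames, mvs):
        y = downsample(x)                       # down-conversion
        if prev is None:
            g = y
        else:
            pred = motion_compensate(prev, int(mv / 2))
            g = [a - b for a, b in zip(y, pred)]
        out.append(g)
        prev = y                                # second frame buffer
    return out

def open_loop_residuals(residuals):
    """Open loop: the residual is simply downsampled, with no second loop."""
    return [downsample(e) for e in residuals]

if __name__ == "__main__":
    residuals = [[0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0],
                 [1.0] * 8, [0.0] * 8]
    mvs = [0, 3, 3]
    reduced_mvs = [int(mv / 2) for mv in mvs]
    full = decode(residuals, mvs)
    target = [downsample(f) for f in full]
    closed = decode(closed_loop_residuals(full, mvs), reduced_mvs)
    opened = decode(open_loop_residuals(residuals), reduced_mvs)
    print(target[-1], closed[-1], opened[-1])
```

In this model a receiver decoding the closed-loop output reproduces the downsampled full-resolution frames, while the open-loop output drifts away from them as soon as a predicted frame uses an odd motion vector.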

[Brief description of drawings]

FIG. 1 is a block diagram of a conventional cascaded transcoder.

FIG. 2 is a block diagram of a conventional open-loop transcoder for bit-rate reduction.

FIG. 3 is a block diagram of a conventional closed-loop transcoder for bit-rate reduction.

FIG. 4 is a block diagram of a conventional cascaded transcoder for spatial resolution reduction.

FIG. 5 is a block diagram of a conventional open-loop transcoder for spatial resolution reduction.

FIG. 6 is a block diagram of conventional motion vector mapping.

FIG. 7 is a block diagram showing a first embodiment of the present invention, in which bitstream transcoding according to the invention is applied to spatial resolution reduction.

FIG. 8 is a block diagram showing a second embodiment of the present invention, in which transcoding of a bitstream to a reduced spatial resolution according to the invention is applied.

FIG. 9 is a block diagram of an open-loop transcoder for spatial resolution reduction according to the present invention.

FIG. 10 is a block diagram of a first closed-loop transcoder for spatial resolution reduction according to the present invention, with drift compensation at the reduced resolution.

FIG. 11A is a block diagram of a second closed-loop transcoder for spatial resolution reduction according to the present invention, with drift compensation at the original resolution.

FIG. 11B is a block diagram of a third closed-loop transcoder for spatial resolution reduction according to the present invention, with drift compensation at the original resolution.

FIG. 12 is a diagram showing an example of a group of macroblocks including macroblock modes, DCT coefficient data and the corresponding motion vector data.

FIG. 13 is a block diagram of a group-of-blocks processor according to the present invention.

FIG. 14A is a block diagram illustrating a first group-of-blocks processing method according to the present invention.

FIG. 14B is a block diagram illustrating a second group-of-blocks processing method according to the present invention.

FIG. 14C is a block diagram illustrating a third group-of-blocks processing method according to the present invention.

FIG. 15A is a diagram illustrating the conventional concept of downsampling in the DCT or spatial domain.

FIG. 15B is a block diagram illustrating conventional upsampling in the DCT or spatial domain.

FIG. 15C is a block diagram illustrating upsampling in the DCT domain according to the present invention.

FIG. 16 is a diagram illustrating upsampling in the DCT domain according to the present invention.

FIG. 17 is a block diagram of a closed-loop transcoder for reducing spatial resolution with drift compensation according to the present invention.

Continuation of front page

(72) Inventor: Anthony Vetro, 113 Regis Drive, Staten Island, New York, United States
(72) Inventor: Huifang Sun, 61 Kinglet Drive South, Cranbury, New Jersey, United States
(72) Inventor: Peng Yin, 222B Halsey Street, Princeton, New Jersey, United States
(72) Inventor: Bede Liu, 248 Hartley Avenue, Princeton, New Jersey, United States

F-term (reference): 5C059 KK41 LB04 MA00 MA01 MA23 MA31 MC11 ME01 NN21 PP05 PP06 SS02 SS08 UA02 UA33; 5J064 AA01 AA02 BA09 BA16 BB01 BB03 BC01 BC16

Claims

1. A method for transcoding a compressed bitstream of a sequence of frames of a video signal to a reduced spatial resolution, comprising: decoding the frames; storing the decoded frames in a first frame buffer; down-sampling the decoded frames to a reduced resolution; storing the reduced-resolution frames in a second frame buffer; and partially encoding the reduced-resolution frames to produce a reduced-resolution compressed bitstream of the video.

2. The method of claim 1, wherein the decoding further comprises: variable-length decoding the bitstream to produce an output including full-resolution motion vectors and quantized DCT coefficients for each block in each frame; inverse quantizing the quantized DCT coefficients of each block in each frame; applying an inverse DCT to the inverse-quantized blocks of the frame; and motion compensating with the full-resolution motion vectors of the stored decoded frames.

3. The method of claim 1, wherein the partial encoding further comprises: motion compensating with the reduced-resolution motion vectors of the stored reduced-resolution frames; applying a DCT to the reduced-resolution motion-compensated difference; quantizing the DCT blocks of the frame; and variable-length coding the quantized blocks of the frame.

4. The method of claim 2, wherein the motion compensation during the decoding further comprises adding a full-resolution motion-compensated prediction of a previously decoded frame to a current frame.

5. The method of claim 3, wherein the motion compensation during the partial encoding further comprises subtracting a reduced-resolution motion-compensated prediction of a previous reduced-resolution frame from a current reduced-resolution frame.

6. The method of claim 3, further comprising estimating the reduced resolution motion vector from the reduced resolution frame.

7. The method of claim 2, further comprising mapping the full-resolution motion vectors of the variable-length decoded frames to the reduced-resolution motion vectors.

8. A closed-loop transcoder for transcoding a compressed bitstream of a sequence of frames of a video signal to a reduced spatial resolution, comprising: a decoder that performs motion compensation using full-resolution motion vectors and frames stored in a first frame buffer to generate decoded frames from the compressed bitstream; a down-conversion block that downsamples the decoded frames to reduced-resolution frames; and a partial encoder that performs motion compensation using reduced-resolution motion vectors and frames stored in a second frame buffer to generate a reduced-spatial-resolution compressed bitstream of the video.