JP7498502B2

JP7498502B2 - Explicit signaling of extended long-term reference picture retention

Publication number: JP7498502B2
Application number: JP2021543479A
Authority: JP
Inventors: Borivoje Furht; Hari Kalva; Velibor Adzic
Original assignee: OP Solutions, LLC
Priority date: 2019-01-28
Filing date: 2020-01-28
Publication date: 2024-06-12
Anticipated expiration: 2040-01-28
Also published as: BR112021014753A2; JP2024100973A; WO2020159993A1; SG11202108105YA; CN113615184A; JP2022524917A; MX2021009024A; CN113615184B; EP3918799A1; CN118714324A; KR20210118155A; PH12021551798A1; EP3918799A4

Description

This application claims the benefit of priority to U.S. Provisional Patent Application No. 62/797,806, filed January 28, 2019, and entitled "EXPLICIT SIGNALING OF EXTENDED LONG TERM REFERENCE PICTURE RETENTION," which is incorporated herein by reference in its entirety.

The present invention relates generally to the field of video compression. In particular, the present invention is directed to explicit signaling of extended long-term reference picture retention.

A video codec may include electronic circuitry or software that compresses or decompresses digital video. It can convert uncompressed video to a compressed format, or vice versa. In the context of video compression, a device that compresses video (and/or performs some portion of that function) may typically be called an encoder, and a device that decompresses video (and/or performs some portion of that function) may be called a decoder.

The format of the compressed data can conform to a standard video compression specification. The compression can be lossy, in that the compressed video lacks some information present in the original video. One consequence is that the decompressed video may have lower quality than the original uncompressed video, because insufficient information exists to reconstruct the original video exactly.

There can be complex relationships among video quality, the amount of data used to represent the video (e.g., as determined by the bit rate), the complexity of the encoding and decoding algorithms, sensitivity to data loss and errors, ease of editing, random access, end-to-end delay (e.g., latency), and the like.

Motion compensation may include an approach for predicting a video frame or a portion thereof given reference frames, such as previous and/or future frames, by accounting for motion of the camera and/or of objects in the video. It may be employed in encoding and decoding video data for video compression, for example, in encoding and decoding using the H.264 standard (also referred to as Advanced Video Coding (AVC) or Moving Picture Experts Group (MPEG)-4 Part 10). Motion compensation may describe a picture in terms of the transformation of a reference picture into the current picture. The reference picture may be earlier in time than the current picture, may be from the future relative to the current picture, or may include a long-term reference (LTR) frame. Compression efficiency may be improved when images can be accurately synthesized from previously transmitted and/or stored images.

Long-term reference (LTR) frames are used in video coding standards such as MPEG-2, H.264 (also referred to as AVC or MPEG-4 Part 10), and H.265 (also referred to as High Efficiency Video Coding (HEVC)). A frame marked as an LTR frame in a video bitstream is available for use as a reference until it is explicitly removed by bitstream signaling. LTR frames improve prediction and compression efficiency in scenes that have a static background over a long period (e.g., the background in a video conference or in parking lot surveillance video). Over time, however, the background of a scene changes gradually (e.g., when a car parks in an empty spot, the car becomes part of the background scene). Updating the LTR frame therefore improves compression performance by allowing better prediction.

Current standards such as H.264 and H.265 allow an LTR frame to be updated by signaling that a newly decoded frame is to be stored and made available as a reference frame. Such updates are signaled by the encoder, and the entire frame is updated. However, updating the entire frame can be costly. Also, when an LTR frame is updated, the previous LTR frame is discarded. If the static background associated with the discarded LTR frame reappears in the video (e.g., in a video that switches from a first scene to a second scene and then back to the first scene), the previous LTR frame must be encoded in the bitstream again, which reduces compression efficiency.

In one aspect, a decoder includes circuitry configured to receive a bitstream, store a plurality of long-term reference frames in a reference list, retain a long-term reference frame in the reference list for a length of time based on a retention time, and decode at least a portion of a video using the long-term reference frame retained in the reference list.

In another aspect, a method includes receiving, by a decoder, a bitstream. The method includes storing, by the decoder, a plurality of long-term reference frames in a reference list. The method includes retaining, by the decoder, a long-term reference frame in the reference list for a length of time based on a retention time. The method includes decoding, by the decoder, at least a portion of a video using the long-term reference frame retained in the reference list.

The details of one or more variations of the subject matter described herein are set forth in the accompanying drawings and the description below. Other features and advantages of the subject matter described herein will be apparent from the description and drawings, and from the claims.
The present invention provides, for example, the following items.
(Item 1)
A decoder comprising circuitry, the circuitry configured to:
receive a bitstream;
store a plurality of long-term reference frames in a reference list;
retain a long-term reference frame in the reference list for a length of time based on a retention time; and
decode at least a portion of a video using the long-term reference frame retained in the reference list.
(Item 2)
The decoder of item 1, wherein each long-term reference frame of the stored long-term reference frames includes an associated retention time.
(Item 3)
The decoder of item 1, further configured to mark the long-term reference frame as unavailable after the long-term reference frame has resided in the reference list for at least the retention time.
(Item 4)
The decoder of item 3, further configured to mark the long-term reference frame as available based on a signal in the bitstream.
(Item 5)
The decoder of item 1, wherein the bitstream includes a signal for removing the long-term reference frame from memory.
(Item 6)
The decoder of item 5, further configured to remove the long-term reference frame from the reference list based on the signal.
(Item 7)
The decoder of item 1, further comprising:
an entropy decoder processor configured to receive the bitstream and decode the bitstream into quantized coefficients;
an inverse quantization and inverse transform processor configured to process the quantized coefficients, including performing an inverse discrete cosine transform;
a deblocking filter;
a frame buffer; and
an intra prediction processor.
(Item 8)
The decoder of item 1, further configured to:
receive a coded block;
determine that an inter prediction mode is enabled for the coded block; and
determine a decoded block using the long-term reference frame as a reference frame and according to the inter prediction mode.
(Item 9)
The decoder of item 8, wherein the decoded block forms part of a quad-tree plus binary decision tree.
(Item 10)
The decoder of item 8, wherein the decoded block is a non-leaf node of the quad-tree plus binary decision tree.
(Item 11)
A method comprising:
receiving, by a decoder, a bitstream;
storing, by the decoder, a plurality of long-term reference frames in a reference list;
retaining, by the decoder, a long-term reference frame in the reference list for a length of time based on a retention time; and
decoding, by the decoder, at least a portion of a video using the long-term reference frame retained in the reference list.
(Item 12)
The method of item 11, wherein each long-term reference frame of the stored long-term reference frames includes an associated retention time.
(Item 13)
The method of item 11, further comprising marking the long-term reference frame as unavailable after the long-term reference frame has resided in the reference list for at least the retention time.
(Item 14)
The method of item 13, further comprising marking the long-term reference frame as available based on a signal in the bitstream.
(Item 15)
The method of item 11, wherein the bitstream includes a signal for removing the long-term reference frame from memory.
(Item 16)
The method of item 15, further comprising removing the long-term reference frame from the reference list based on the signal.
(Item 17)
The method of item 11, wherein the decoder further comprises:
an entropy decoder processor configured to receive the bitstream and decode the bitstream into quantized coefficients;
an inverse quantization and inverse transform processor configured to process the quantized coefficients, including performing an inverse discrete cosine transform;
a deblocking filter;
a frame buffer; and
an intra prediction processor.
(Item 18)
The method of item 11, further comprising:
receiving a coded block;
determining that an inter prediction mode is enabled for the coded block; and
determining a decoded block using the long-term reference frame as a reference frame and according to the inter prediction mode.
(Item 19)
The method of item 18, wherein the decoded block forms part of a quad-tree plus binary decision tree.
(Item 20)
The method of item 18, wherein the decoded block is a non-leaf node of the quad-tree plus binary decision tree.

For the purpose of illustrating the invention, the drawings show aspects of one or more embodiments of the invention. However, it should be understood that the invention is not limited to the precise arrangements and instrumentalities shown in the drawings.
FIG. 1 illustrates an example reference list for frame prediction over a long period of time.

FIG. 2 is a process flow diagram illustrating an example process of extended long-term reference (eLTR) frame retention in which an eLTR frame is retained in a reference list.

FIG. 3 is a system block diagram illustrating an example decoder capable of decoding a bitstream with eLTR frames retained in a reference list.

FIG. 4 is a process flow diagram illustrating an example process of encoding a video with eLTR frames retained in a reference list according to some aspects of the present subject matter, which can enable compression efficiency improvements compared to some existing approaches.

FIG. 5 is a system block diagram illustrating an example video encoder capable of signaling for eLTR retention in a reference list.

FIG. 6 is a block diagram of a computing system that can be used to implement any one or more of the methods disclosed herein and any one or more portions thereof.

The drawings are not necessarily to scale and may be illustrated by phantom lines, diagrammatic representations, and fragmentary views. In some instances, details that are not necessary for an understanding of the embodiments, or that render other details difficult to perceive, may have been omitted. Like reference symbols in the various drawings indicate like elements.

Long-term reference (LTR) pictures may be used for better prediction of video frames in cases where a portion of a frame is repeatedly occluded and then uncovered over time. Traditionally, an LTR is used for the duration of a scene or a group of pictures, after which it is replaced or discarded. Some implementations of the present subject matter extend the usefulness of LTRs by selecting the best candidate LTR for retention in the reference list. In some implementations, an explicitly signaled extended long-term reference (eLTR) frame may be retained in the reference list for an explicitly signaled length of time. Some implementations of the present subject matter may provide significant compression efficiency gains compared to some existing approaches.

Some implementations of the present subject matter may achieve selection and retention of eLTR frames in video coding. An eLTR may be retained in a picture reference list, which may be used by a current frame or group of frames for prediction. While all other frames in the list change over a relatively short period of time, the eLTR can be retained in the reference list. For example, FIG. 1 illustrates an example reference list for frame prediction over a long period of time. As a non-limiting and illustrative example, the video frames shown shaded may be reconstructed using reference frames. The reference list may contain frames that change over time and a retained eLTR.

In some implementations, with continued reference to FIG. 1, an encoder performs the eLTR selection and retention calculation operations. The selected frame and the retention time may be signaled to the decoder, for example, using a pair (eLTRn, TRn) indicating an index of the eLTR for frame n (eLTRn) and its retention time (TRn). The decoder may retain frame eLTRn in the reference list for a period of TRn. After the eLTRn frame has resided in the reference list for at least TRn, the eLTRn frame may be marked as unavailable for further use. In some implementations, the eLTRn frame may be maintained in memory but in an unavailable state. In some implementations, the encoder may explicitly signal the decoder to mark the eLTRn frame as available or as unavailable. For example, an eLTRn frame that was previously marked as unavailable after the retention time TRn elapsed may be marked as available. Such a feature may allow eLTRn to be used again in the future, for example, for videos containing scenes that switch back and forth. In some implementations, the encoder may include in the bitstream a signal for the decoder to remove the eLTRn frame from memory. The decoder may remove the eLTRn frame from the reference list and from memory based on such a signal.
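As a non-limiting illustration only, the retention behavior described above may be sketched in Python as follows. The class, field, and method names are hypothetical, and the sketch assumes that the retention time TRn is counted in decoded frames; the description leaves the unit of TRn open.

```python
from dataclasses import dataclass, field


@dataclass
class ELTREntry:
    """One retained eLTR frame; all field names are hypothetical."""
    index: int              # eLTRn: index identifying the frame
    retention: int          # TRn: assumed here to be a count of decoded frames
    stored_at: int          # value of the frame counter when the entry was stored
    available: bool = True  # False means kept in memory but not usable


@dataclass
class ReferenceList:
    eltr: dict = field(default_factory=dict)  # index -> ELTREntry
    now: int = 0                              # decoded-frame counter

    def store(self, index, retention):
        self.eltr[index] = ELTREntry(index, retention, self.now)

    def tick(self):
        """Advance one decoded frame; expire entries exactly when TRn elapses."""
        self.now += 1
        for entry in self.eltr.values():
            if self.now - entry.stored_at == entry.retention:
                entry.available = False

    def mark(self, index, available):
        """Explicit signal from the encoder: mark available or unavailable."""
        self.eltr[index].available = available

    def remove(self, index):
        """Explicit signal from the encoder: drop from the list and from memory."""
        del self.eltr[index]

    def usable(self):
        return sorted(i for i, e in self.eltr.items() if e.available)


refs = ReferenceList()
refs.store(index=7, retention=3)
for _ in range(3):
    refs.tick()
print(refs.usable())   # [] : TRn elapsed, frame 7 is kept but unavailable
refs.mark(7, True)     # explicit signal re-enables the frame
print(refs.usable())   # [7]
```

Expiring an entry only at the moment TRn elapses, rather than on every later frame, is a design choice of this sketch; it lets an explicit "available" signal take effect after expiry without being immediately undone.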

FIG. 2 is a process flow diagram illustrating a non-limiting example of a process 200 of eLTR frame retention in which an eLTR frame is retained in a reference list. Such eLTR retention may enable compression efficiency improvements compared to some existing approaches to video encoding and decoding.

At step 210, with continued reference to FIG. 2, a bitstream is received by a decoder. The bitstream may include, for example, data found in a stream of bits that is the input to a decoder when data compression is used. The bitstream may include information necessary to decode a video. Receiving may include extracting and/or parsing a block and associated signaling information from the bitstream. In some implementations, receiving the bitstream may include parsing an eLTR frame, an index to such a frame (eLTRn), and an associated retention time (TRn), where the retention time is based on decoded frames and/or time within the video.

With continued reference to FIG. 2, at step 220, the eLTR frame may be stored in a reference picture list.

At step 230, with continued reference to FIG. 2, the stored eLTR frame may be retained (e.g., maintained) in the reference list for a length of time based on the associated retention time (TRn).

At step 240, with continued reference to FIG. 2, at least a portion of the video may be decoded from the bitstream. Decoding may include decoding a current block. For example, a received current coded block contained in the bitstream may be decoded, for example, by using inter prediction. Decoding via inter prediction may include using a previous frame, a future frame, and/or an eLTR frame as a reference for computing a prediction, which may be combined with a residual contained in the bitstream.

With further reference to FIG. 2, for a subsequent current block, the eLTR frame may be used as a reference frame for inter prediction. For example, a second coded block may be received. Whether an inter prediction mode is enabled for the second coded block may be determined; the determination may include receiving, from the bitstream, an explicit signal indicating whether the inter prediction mode is enabled. A second decoded block may be determined using the eLTR frame as a reference frame and according to the inter prediction mode. For example, decoding via inter prediction may include using the eLTR frame as a reference for computing a prediction, and the prediction may be combined with a residual contained in the bitstream.
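As a non-limiting illustration only, inter prediction from a retained eLTR frame may be sketched in Python as follows. The function names are hypothetical, and the sketch assumes integer-sample motion vectors, 8-bit samples, and a displaced block that lies entirely inside the reference frame; practical codecs additionally use fractional-sample interpolation and boundary padding.

```python
def predict_block(reference, top, left, size, mv):
    """Copy a size x size block from `reference`, displaced by mv = (dy, dx)."""
    dy, dx = mv
    return [
        [reference[top + dy + r][left + dx + c] for c in range(size)]
        for r in range(size)
    ]


def reconstruct_block(prediction, residual, max_value=255):
    """Add the residual to the prediction and clip to the valid sample range."""
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(prow, rrow)]
        for prow, rrow in zip(prediction, residual)
    ]


# A 4x4 stand-in for a retained eLTR frame (sample value = 10 * row + column).
eltr_frame = [[10 * r + c for c in range(4)] for r in range(4)]
prediction = predict_block(eltr_frame, top=0, left=0, size=2, mv=(1, 1))
residual = [[1, -1], [0, 250]]          # residual as parsed from the bitstream
decoded = reconstruct_block(prediction, residual)
print(prediction)  # [[11, 12], [21, 22]]
print(decoded)     # [[12, 11], [21, 255]]
```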

FIG. 3 is a system block diagram illustrating a non-limiting example of a decoder 300 capable of decoding a bitstream 370 with eLTR frames retained in a reference list. The decoder 300 may include an entropy decoder processor 310, an inverse quantization and inverse transform processor 320, a deblocking filter 330, a frame buffer 340, a motion compensation processor 350, and an intra prediction processor 360. In some implementations, the bitstream 370 may include parameters (e.g., fields in a header of the bitstream) that signal an eLTR index (eLTRn) and a retention time (TRn). The motion compensation processor 350 may reconstruct pixel information using an eLTR frame and may retain the eLTR frame according to its associated retention time (TRn). For example, when an eLTR frame (eLTRn) is received and retained in the reference list for at least the associated retention time, the eLTR frame (eLTRn) may be used as a reference for an inter prediction mode for at least the associated retention time.

In operation, with continued reference to FIG. 3, the bitstream 370 may be received by the decoder 300 and input to the entropy decoder processor 310, which may entropy decode the bitstream into quantized coefficients. The quantized coefficients may be provided to the inverse quantization and inverse transform processor 320, which may perform inverse quantization and inverse transformation to create a residual signal. The residual signal may be added to the output of the motion compensation processor 350 or of the intra prediction processor 360, according to the processing mode. The outputs of the motion compensation processor 350 and the intra prediction processor 360 may include a block prediction based on a previously decoded block and/or an eLTR frame maintained in the reference list. The sum of the prediction and the residual may be processed by the deblocking filter 330 and stored in the frame buffer 340.
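As a non-limiting illustration only, the inverse quantization and inverse transform stage may be sketched in Python as follows. The sketch assumes a single uniform quantization step and an orthonormal floating-point inverse discrete cosine transform; these are simplifying assumptions, since practical codecs use integer transform approximations and frequency-dependent scaling. All function names are hypothetical.

```python
import math


def idct_1d(coeffs):
    """Orthonormal inverse of the DCT-II (that is, a DCT-III) of a 1-D list."""
    n = len(coeffs)
    out = []
    for x in range(n):
        s = coeffs[0] / math.sqrt(n)
        for k in range(1, n):
            s += (math.sqrt(2.0 / n) * coeffs[k]
                  * math.cos(math.pi * (2 * x + 1) * k / (2 * n)))
        out.append(s)
    return out


def idct_2d(block):
    """Separable 2-D inverse DCT: transform rows first, then columns."""
    rows = [idct_1d(row) for row in block]
    cols = [idct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def dequantize(levels, step):
    """Uniform inverse quantization with one assumed step size."""
    return [[level * step for level in row] for row in levels]


def decode_residual(levels, step):
    """Quantized levels -> integer residual samples."""
    return [[round(v) for v in row] for row in idct_2d(dequantize(levels, step))]


# A block whose only nonzero level is the DC term decodes to a flat residual.
print(decode_residual([[2, 0], [0, 0]], step=4))  # [[4, 4], [4, 4]]
```

The resulting residual would then be added to the block prediction from motion compensation or intra prediction before deblocking.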

FIG. 4 is a process flow diagram illustrating a non-limiting example of a process 400 of encoding a video with eLTR frames retained in a reference list according to some aspects of the present subject matter, which may enable compression efficiency improvements compared to some existing approaches. At step 410, a sequence of video frames is encoded, which may include determining one or more eLTR frames. At step 420, an eLTR frame retention time (TRn) may be determined, for example, based on the length of time for which the eLTR frame is used by the encoder/decoder; for example, the time may be based on the frames being decoded in the video.

At step 430, with continued reference to FIG. 4, additional signaling parameters may be determined. For example, whether and when to mark an eLTR frame as unavailable or available may be determined, and whether and when each eLTR frame should be removed from memory may be determined.
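As a non-limiting illustration only, steps 420 and 430 may be sketched in Python as follows. The description does not specify how the encoder derives TRn or the additional signals, so the heuristic below is purely an assumption: TRn covers the first run of frames that reference the eLTR without a gap longer than a hypothetical `max_gap`, a single later reappearance triggers a "mark available" signal, and removal is signaled after the last use.

```python
def plan_eltr_signals(stored_at, used_at, max_gap):
    """Return (TRn, events) for one eLTR frame.

    stored_at: frame number at which the eLTR entered the reference list
    used_at:   frame numbers that reference the eLTR (any order)
    max_gap:   hypothetical threshold separating two periods of use
    events:    list of (frame number, action) signals to place in the bitstream
    """
    uses = sorted(used_at)
    if not uses:
        return 0, [(stored_at, "remove")]

    # The first run of uses without a long gap defines the retention time.
    end = uses[0]
    later = []
    for i, frame in enumerate(uses[1:], start=1):
        if frame - end > max_gap:
            later = uses[i:]
            break
        end = frame
    retention = end - stored_at + 1

    events = []
    if later:
        # The frame expires after TRn and is re-enabled when the scene returns.
        events.append((later[0], "mark_available"))
    events.append((uses[-1] + 1, "remove"))
    return retention, events


print(plan_eltr_signals(0, [1, 2, 3, 50, 51], max_gap=10))
# (4, [(50, 'mark_available'), (52, 'remove')])
```

This sketch handles only one reappearance of the scene; an encoder with look-ahead over the full sequence could emit one availability signal per reappearance.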

At step 440, with continued reference to FIG. 4, the eLTR retention time and the additional signaling parameters may be included in the bitstream.
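As a non-limiting illustration only, including the eLTR parameters in a bitstream header may be sketched in Python as follows. The four-byte layout below is entirely hypothetical; the description states only that such parameters may be carried, for example, as fields in a header, and does not define field widths, ordering, or flag semantics.

```python
import struct

# Hypothetical layout (not defined by the description), big-endian:
#   uint8  eltr_index   eLTRn
#   uint16 retention    TRn, assumed to be counted in frames
#   uint8  flags        bit 0 = mark available, bit 1 = remove from memory
_FMT = ">BHB"
FLAG_AVAILABLE = 0x01
FLAG_REMOVE = 0x02


def pack_eltr_params(eltr_index, retention, available=True, remove=False):
    """Encoder side: serialize the eLTR signaling parameters."""
    flags = (FLAG_AVAILABLE if available else 0) | (FLAG_REMOVE if remove else 0)
    return struct.pack(_FMT, eltr_index, retention, flags)


def unpack_eltr_params(payload):
    """Decoder side: parse the parameters back into named fields."""
    eltr_index, retention, flags = struct.unpack(_FMT, payload)
    return {
        "eltr_index": eltr_index,
        "retention": retention,
        "available": bool(flags & FLAG_AVAILABLE),
        "remove": bool(flags & FLAG_REMOVE),
    }


payload = pack_eltr_params(eltr_index=3, retention=120)
print(payload.hex())                # 03007801
print(unpack_eltr_params(payload))
```

In a practical codec such fields would be entropy coded together with the other header syntax elements rather than written as fixed-width bytes.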

FIG. 5 is a system block diagram illustrating a non-limiting example of a video encoder 500 capable of signaling for eLTR retention in a reference list. The example video encoder 500 receives an input video 505, which may initially be segmented or divided according to a processing scheme, such as a tree-structured macroblock partitioning scheme (e.g., quad-tree plus binary tree). An example of a tree-structured macroblock partitioning scheme may include partitioning a picture frame into large block elements, which for purposes of this disclosure may be referred to as coding tree units (CTUs). In some implementations, each CTU may be further partitioned one or more times into a number of sub-blocks called coding units (CUs). The result of this partitioning may include a group of sub-blocks, which for purposes of this disclosure may be referred to as prediction units (PUs). Transform units (TUs) may also be used.

With continued reference to FIG. 5, the example video encoder 500 may include an intra prediction processor 515, a motion estimation/compensation processor 520 (also referred to as an inter prediction processor) capable of supporting eLTR frame retention, a transform/quantization processor 525, an inverse quantization/inverse transform processor 530, an in-loop filter 535, a decoded picture buffer 540, and an entropy coding processor 545. In some implementations, the motion estimation/compensation processor 520 may determine the eLTR retention time and the additional signaling parameters. Bitstream parameters that signal eLTR frame retention and the additional parameters may be input to the entropy coding processor 545 for inclusion in an output bitstream 550.

In operation, with continued reference to FIG. 5, for each block of a frame of the input video 505, it may be determined whether to process the block via intra picture prediction or using motion estimation/compensation. The block may be provided to the intra prediction processor 510 or to the motion estimation/compensation processor 520. If the block is to be processed via intra prediction, the intra prediction processor 510 may perform the processing and output a predictor. If the block is to be processed via motion estimation/compensation, the motion estimation/compensation processor 520 may perform the processing, including using an eLTR frame as a reference for inter prediction where applicable.

Still referring to FIG. 5, a residual may be formed by subtracting the predictor from the input video. The residual may be received by transform/quantization processor 525, which may perform transform processing (e.g., a discrete cosine transform (DCT)) to produce coefficients, which may be quantized. The quantized coefficients and any associated signaling information may be provided to entropy coding processor 545 for entropy encoding and inclusion in output bitstream 550. Entropy coding processor 545 may support encoding of signaling information related to eLTR frame retention. In addition, the quantized coefficients may be provided to inverse quantization/inverse transform processor 530, which may reproduce pixels. The pixels may be combined with the predictor and processed by in-loop filter 535, the output of which may be stored in decoded picture buffer 540 for use by motion estimation/compensation processor 520, which is capable of supporting eLTR frame retention.
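The residual, transform, quantization, and reconstruction path described above can be sketched in a few lines. This is only an illustration under stated assumptions: a uniform quantizer with a single step size `qstep`, an orthonormal DCT-II, and square blocks. Entropy coding and the in-loop filter are omitted, and `dct_matrix`, `matmul`, `transpose`, and `encode_block` are hypothetical names, not elements of FIG. 5.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def dct_matrix(n: int) -> Matrix:
    """Orthonormal DCT-II basis; its transpose is the inverse transform."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def encode_block(block: Matrix, predictor: Matrix,
                 qstep: float) -> Tuple[List[List[int]], Matrix]:
    """Return (quantized levels, reconstructed block) for one square block."""
    c = dct_matrix(len(block))
    ct = transpose(c)
    residual = [[b - p for b, p in zip(rb, rp)] for rb, rp in zip(block, predictor)]
    coeffs = matmul(matmul(c, residual), ct)                       # forward transform
    levels = [[round(v / qstep) for v in row] for row in coeffs]   # quantization
    dequant = [[lv * qstep for lv in row] for row in levels]       # inverse quantization
    rec_res = matmul(matmul(ct, dequant), c)                       # inverse transform
    recon = [[p + r for p, r in zip(rp, rr)] for rp, rr in zip(predictor, rec_res)]
    return levels, recon


block = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
predictor = [[60] * 4 for _ in range(4)]
levels, recon = encode_block(block, predictor, qstep=2.0)
max_err = max(abs(a - b) for ra, rb in zip(block, recon) for a, b in zip(ra, rb))
print(max_err)
```

The `levels` array is what would go to the entropy coder, while `recon` is what the encoder would store (after filtering) so that its reference pictures match those the decoder reconstructs.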

With continued reference to FIG. 5, although a few variations have been described in detail above, other modifications or additions are possible. For example, in some implementations, a current block may include any symmetric block (8×8, 16×16, 32×32, 64×64, 128×128, and the like) as well as any asymmetric block (8×4, 16×8, and the like).

In some implementations, and still referring to FIG. 5, a quad-tree plus binary decision tree (QTBT) may be implemented. In QTBT, at the coding tree unit level, the partition parameters of QTBT may be dynamically derived to adapt to local characteristics without transmitting any overhead. Subsequently, at the coding unit level, a joint-classifier decision tree structure may eliminate unnecessary iterations and control the risk of false prediction.
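A QTBT partition can be pictured as recursive quad splits followed by binary splits, which is how asymmetric blocks such as 8×4 arise. The sketch below is an illustration only. The `decide` callback stands in for whatever derivation or classifier an implementation uses and is entirely hypothetical. The rule that a quad split is not applied below a binary split is an assumption about common QTBT designs, not a statement from this description.

```python
from typing import Callable, List, Tuple

Rect = Tuple[int, int, int, int]  # x, y, width, height


def quad_split(r: Rect) -> List[Rect]:
    x, y, w, h = r
    hw, hh = w // 2, h // 2
    return [(x, y, hw, hh), (x + hw, y, hw, hh),
            (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]


def binary_split(r: Rect, horizontal: bool) -> List[Rect]:
    x, y, w, h = r
    if horizontal:  # cut along a horizontal line: two blocks of w x h/2
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]


def partition(r: Rect, decide: Callable[[Rect, bool], str],
              in_binary: bool = False) -> List[Rect]:
    """Return the leaf blocks of a QTBT-style partition of r.

    decide(rect, in_binary) returns "quad", "hor", "ver", or "leaf".
    Quad splits are ignored once a binary split has been made (assumed rule).
    """
    choice = decide(r, in_binary)
    if choice == "quad" and not in_binary:
        return [leaf for c in quad_split(r) for leaf in partition(c, decide, False)]
    if choice in ("hor", "ver"):
        return [leaf for c in binary_split(r, choice == "hor")
                for leaf in partition(c, decide, True)]
    return [r]


def decide(r: Rect, in_binary: bool) -> str:
    """Toy policy: quad-split the 16x16 root, then halve the top-left 8x8."""
    x, y, w, h = r
    if (w, h) == (16, 16):
        return "quad"
    if (x, y, w, h) == (0, 0, 8, 8):
        return "hor"
    return "leaf"


leaves = partition((0, 0, 16, 16), decide)
print(leaves)
```

With this toy policy the 16×16 root yields three 8×8 leaves and two 8×4 leaves, and the leaves tile the root without overlap.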

In some implementations, a decoder may include an eLTR frame retention processor (not shown) that determines whether, and when, to mark an eLTR frame as unavailable or to remove it from the reference list.
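The bookkeeping such a retention processor might perform can be sketched as follows. This is a minimal illustration under assumptions: retention time is counted in decoded pictures, a frame that has resided in the list for at least its retention time is marked unavailable but kept in the list, and an explicit signal can either make it available again (restarting its clock, which is an assumption) or remove it. `LongTermRef` and `ReferenceList` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LongTermRef:
    frame_id: int
    inserted_at: int       # picture count when the frame entered the list
    retention_time: int    # signaled retention, in pictures (assumed unit)
    available: bool = True


@dataclass
class ReferenceList:
    frames: Dict[int, LongTermRef] = field(default_factory=dict)

    def store(self, frame_id: int, now: int, retention_time: int) -> None:
        self.frames[frame_id] = LongTermRef(frame_id, now, retention_time)

    def tick(self, now: int) -> None:
        """Mark frames unavailable once they have resided for their retention time."""
        for ref in self.frames.values():
            if now - ref.inserted_at >= ref.retention_time:
                ref.available = False

    def mark_available(self, frame_id: int, now: int) -> None:
        """Explicit signal: make a retained frame usable again and restart its clock."""
        ref = self.frames[frame_id]
        ref.available = True
        ref.inserted_at = now

    def remove(self, frame_id: int) -> None:
        """Explicit signal: drop the frame from the reference list."""
        self.frames.pop(frame_id, None)

    def usable(self) -> List[int]:
        return sorted(fid for fid, ref in self.frames.items() if ref.available)


refs = ReferenceList()
refs.store(frame_id=0, now=0, retention_time=3)
refs.store(frame_id=1, now=1, retention_time=10)
refs.tick(now=3)               # frame 0 has resided for 3 pictures, frame 1 for 2
print(refs.usable())           # only frame 1 remains usable
refs.mark_available(0, now=3)  # signaled re-activation of frame 0
refs.remove(1)                 # signaled removal of frame 1
print(refs.usable())
```

Keeping an unavailable frame in the list, rather than deleting it, is what allows a later signal to bring it back without retransmitting the picture.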

In some implementations, the present subject matter can be applied to broadcast (and similar) scenarios in which a decoder tunes in midway through a retention period. To support standard playback, an encoder may mark (e)LTR frames as instantaneous decoding refresh (IDR) type frames. In this case, streaming may resume after the next available LTR (IDR) frame. Such an approach may be similar to some current broadcast standards that specify interframes as IDR frames.
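The tune-in behavior can be illustrated with a small sketch that looks for the next LTR frame marked as IDR at or after the point where the decoder joins the stream. The `Frame` record and the `resume_point` helper are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    index: int
    is_ltr: bool = False
    is_idr: bool = False  # set by the encoder on (e)LTR frames for broadcast use


def resume_point(frames: List[Frame], tune_in: int) -> Optional[int]:
    """Index of the first IDR-marked LTR frame at or after tune_in, else None."""
    for f in frames:
        if f.index >= tune_in and f.is_ltr and f.is_idr:
            return f.index
    return None


stream = [Frame(i) for i in range(12)]
stream[0] = Frame(0, is_ltr=True, is_idr=True)
stream[8] = Frame(8, is_ltr=True, is_idr=True)
print(resume_point(stream, tune_in=3))
```

A decoder joining at picture 3 has missed the LTR frame at picture 0, so in this sketch it waits for the IDR-marked LTR frame at picture 8 before decoding resumes.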

The subject matter described herein provides many technical advantages. For example, some implementations of the present subject matter may provide for decoding blocks using eLTR frames retained in a reference list. Such an approach may improve compression efficiency.

It is to be noted that any one or more of the aspects and embodiments described herein may be conveniently implemented using digital electronic circuitry, integrated circuitry, specially designed application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), computer hardware, firmware, software, and/or combinations thereof, as realized and/or implemented in one or more machines (e.g., one or more computing devices utilized as a user computing device for an electronic document, one or more server devices such as a document server, etc.) programmed according to the teachings of the present specification, as will be apparent to those of ordinary skill in the computer art. These various aspects or features may include implementation in one or more computer programs and/or software that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special purpose or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device. Appropriate software coding may readily be prepared by skilled programmers based on the teachings of the present disclosure, as will be apparent to those of ordinary skill in the software art. Aspects and implementations discussed above that employ software and/or software modules may also include appropriate hardware for assisting in the implementation of the machine-executable instructions of the software and/or software modules.

Such software may be a computer program product that employs a machine-readable storage medium. A machine-readable storage medium may be any medium that is capable of storing and/or encoding a sequence of instructions for execution by a machine (e.g., a computing device) and that causes the machine to perform any one of the methods and/or embodiments described herein. Examples of a machine-readable storage medium include, but are not limited to, a magnetic disk, an optical disc (e.g., CD, CD-R, DVD, DVD-R, etc.), a magneto-optical disk, a read-only memory "ROM" device, a random-access memory "RAM" device, a magnetic card, an optical card, a solid-state memory device, an EPROM, an EEPROM, a programmable logic device (PLD), and/or any combination thereof. A machine-readable medium, as used herein, is intended to include a single medium as well as a collection of physically separate media, such as, for example, a compact disc or one or more hard disk drives in combination with a computer memory. As used herein, a machine-readable storage medium does not include transitory forms of signal transmission.

Such software may also include information (e.g., data) carried as a data signal on a data carrier, such as a carrier wave. For example, machine-executable information may be included as a data-carrying signal embodied in a data carrier in which the signal encodes a sequence of instructions, or a portion thereof, for execution by a machine (e.g., a computing device), together with any related information (e.g., data structures and data) that causes the machine to perform any one of the methods and/or embodiments described herein.

Examples of a computing device include, but are not limited to, an electronic book reading device, a computer workstation, a terminal computer, a server computer, a handheld device (e.g., a tablet computer, a smartphone, etc.), a web appliance, a network router, a network switch, a network bridge, any machine capable of executing a sequence of instructions that specify an action to be taken by that machine, and any combination thereof. In one example, a computing device may include and/or be included in a kiosk.

FIG. 6 shows a diagrammatic representation of one embodiment of a computing device in the exemplary form of a computer system 600 within which a set of instructions for causing a control system to perform any one or more of the aspects and/or methods of the present disclosure may be executed. It is also contemplated that multiple computing devices may be utilized to implement a specially configured set of instructions for causing one or more of the devices to perform any one or more of the aspects and/or methods of the present disclosure. Computer system 600 includes a processor 604 and a memory 608 that communicate with each other, and with other components, via a bus 612. Bus 612 may include any of several types of bus structures including, but not limited to, a memory bus, a memory controller, a peripheral bus, a local bus, and any combination thereof, using any of a variety of bus architectures.

Memory 608 may include various components (e.g., machine-readable media) including, but not limited to, a random-access memory component, a read-only component, and any combination thereof. In one example, a basic input/output system 616 (BIOS), including basic routines that help to transfer information between elements within computer system 600, such as during start-up, may be stored in memory 608. Memory 608 may also include (e.g., stored on one or more machine-readable media) instructions (e.g., software) 620 embodying any one or more of the aspects and/or methods of the present disclosure. In another example, memory 608 may further include any number of program modules including, but not limited to, an operating system, one or more application programs, other program modules, program data, and any combination thereof.

Computer system 600 may also include a storage device 624. Examples of a storage device (e.g., storage device 624) include, but are not limited to, a hard disk drive, a magnetic disk drive, an optical disc drive in combination with an optical medium, a solid-state memory device, and any combination thereof. Storage device 624 may be connected to bus 612 by an appropriate interface (not shown). Example interfaces include, but are not limited to, SCSI, advanced technology attachment (ATA), serial ATA, universal serial bus (USB), IEEE 1394 (FIREWIRE®), and any combination thereof. In one example, storage device 624 (or one or more components thereof) may be removably interfaced with computer system 600 (e.g., via an external port connector (not shown)). In particular, storage device 624 and an associated machine-readable medium 628 may provide non-volatile and/or volatile storage of machine-readable instructions, data structures, program modules, and/or other data for computer system 600. In one example, software 620 may reside, completely or partially, within machine-readable medium 628. In another example, software 620 may reside, completely or partially, within processor 604.

Computer system 600 may also include an input device 632. In one example, a user of computer system 600 may enter commands and/or other information into computer system 600 via input device 632. Examples of an input device 632 include, but are not limited to, an alphanumeric input device (e.g., a keyboard), a pointing device, a joystick, a gamepad, an audio input device (e.g., a microphone, a voice response system, etc.), a cursor control device (e.g., a mouse), a touchpad, an optical scanner, a video capture device (e.g., a still camera, a video camera), a touchscreen, and any combination thereof. Input device 632 may be interfaced to bus 612 via any of a variety of interfaces (not shown) including, but not limited to, a serial interface, a parallel interface, a game port, a USB interface, a FIREWIRE® interface, a direct interface to bus 612, and any combination thereof. Input device 632 may include a touchscreen interface that may be a part of, or separate from, display 636, discussed further below. Input device 632 may be utilized as a user selection device for selecting one or more graphical representations in a graphical interface as described above.

A user may also input commands and/or other information to computer system 600 via storage device 624 (e.g., a removable disk drive, a flash drive, etc.) and/or network interface device 640. A network interface device, such as network interface device 640, may be utilized for connecting computer system 600 to one or more of a variety of networks, such as network 644, and to one or more remote devices 648 connected thereto. Examples of a network interface device include, but are not limited to, a network interface card (e.g., a mobile network interface card, a LAN card), a modem, and any combination thereof. Examples of a network include, but are not limited to, a wide area network (e.g., the Internet, an enterprise network), a local area network (e.g., a network associated with an office, a building, a campus, or other relatively small geographic space), a telephone network, a data network associated with a telephone/voice provider (e.g., a mobile communications provider data and/or voice network), a direct connection between two computing devices, and any combination thereof. A network, such as network 644, may employ a wired and/or a wireless mode of communication. In general, any network topology may be used. Information (e.g., data, software 620, etc.) may be communicated to and/or from computer system 600 via network interface device 640.

Computer system 600 may further include a video display adapter 652 for communicating a displayable image to a display device, such as display device 636. Examples of a display device include, but are not limited to, a liquid crystal display (LCD), a cathode ray tube (CRT), a plasma display, a light-emitting diode (LED) display, and any combination thereof. Display adapter 652 and display device 636 may be utilized in combination with processor 604 to provide graphical representations of aspects of the present disclosure. In addition to a display device, computer system 600 may include one or more other peripheral output devices including, but not limited to, an audio speaker, a printer, and any combination thereof. Such peripheral output devices may be connected to bus 612 via a peripheral interface 656. Examples of a peripheral interface include, but are not limited to, a serial port, a USB connection, a FIREWIRE® connection, a parallel connection, and any combination thereof.

The foregoing has been a detailed description of illustrative embodiments of the invention. Various modifications and additions can be made without departing from the spirit and scope of this invention. Features of each of the various embodiments described above may be combined with features of other described embodiments, as appropriate, in order to provide a multiplicity of feature combinations in associated new embodiments. Furthermore, while the foregoing describes a number of separate embodiments, what has been described herein is merely illustrative of the application of the principles of the present invention. Additionally, although particular methods herein may be illustrated and/or described as being performed in a specific order, the ordering is highly variable within ordinary skill to achieve embodiments as disclosed herein. Accordingly, this description is meant to be taken only by way of example, and not to otherwise limit the scope of this invention.

In the descriptions above and in the claims, phrases such as "at least one of" or "one or more of" may occur followed by a conjunctive list of elements or features. The term "and/or" may also occur in a list of two or more elements or features. Unless otherwise implicitly or explicitly contradicted by the context in which it is used, such a phrase is intended to mean any of the listed elements or features individually, or any of the recited elements or features in combination with any of the other recited elements or features. For example, the phrases "at least one of A and B," "one or more of A and B," and "A and/or B" are each intended to mean "A alone, B alone, or A and B together." A similar interpretation is also intended for lists including three or more items. For example, the phrases "at least one of A, B, and C," "one or more of A, B, and C," and "A, B, and/or C" are each intended to mean "A alone, B alone, C alone, A and B together, A and C together, B and C together, or A and B and C together." In addition, use of the term "based on," above and in the claims, is intended to mean "based at least in part on," such that an unrecited feature or element is also permissible.

The subject matter described herein can be embodied in systems, apparatus, methods, and/or articles, depending on the desired configuration. The implementations set forth in the foregoing description do not represent all implementations consistent with the subject matter described herein. Instead, they are merely some examples consistent with aspects related to the described subject matter. Although a few variations have been described in detail above, other modifications or additions are possible. In particular, further features and/or variations can be provided in addition to those set forth herein. For example, the implementations described above can be directed to various combinations and subcombinations of the disclosed features and/or combinations and subcombinations of several further features disclosed above. In addition, the logic flows depicted in the accompanying figures and/or described herein do not necessarily require the particular order shown, or sequential order, to achieve desirable results. Other implementations may be within the scope of the following claims.

Claims

1. A decoder, the decoder comprising circuitry configured to:
receive a bitstream;
store a plurality of long-term reference frames in a reference list;
retain a long-term reference frame in the reference list for a length of time based on a retention time; and
decode at least a portion of a video using the long-term reference frames maintained in the reference list, each long-term reference frame in the stored long-term reference frames including an associated retention time.

2. The decoder of claim 1, further configured to mark the long-term reference frame as unavailable after the long-term reference frame has resided in the reference list for at least the retention time.

3. The decoder of claim 2, further configured to mark the long-term reference frame as available based on a signal in the bitstream.

4. The decoder of claim 1, wherein the bitstream includes a signal for removing the long-term reference frame from memory.

5. The decoder of claim 4, further configured to remove the long-term reference frame from the reference list based on the signal.

6. The decoder of claim 1, further comprising:
an entropy decoder processor configured to receive the bitstream and decode the bitstream into quantized coefficients;
an inverse quantization and inverse transform processor configured to process the quantized coefficients, including performing an inverse discrete cosine transform;
a deblocking filter;
a frame buffer; and
an intra-prediction processor.

7. The decoder of claim 1, further configured to:
receive a coded block;
determine that an inter prediction mode is enabled for the coded block; and
determine a decoded block using the long-term reference frame as a reference frame and according to the inter prediction mode.

8. The decoder of claim 7, wherein the decoded block forms part of a quad tree plus binary decision tree.

9. The decoder of claim 8, wherein the decoded block is a non-leaf node of the quad tree plus binary decision tree.

10. A method, comprising:
receiving, by a decoder, a bitstream;
storing a plurality of long-term reference frames in a reference list;
retaining, by the decoder, a long-term reference frame in the reference list for a length of time based on a retention time; and
decoding, by the decoder, at least a portion of a video using the long-term reference frames maintained in the reference list, each long-term reference frame in the stored long-term reference frames including an associated retention time.

11. The method of claim 10, further comprising marking the long-term reference frame as unavailable after the long-term reference frame has resided in the reference list for at least the retention time.

12. The method of claim 11, further comprising marking the long-term reference frame as available based on a signal in the bitstream.

13. The method of claim 10, wherein the bitstream includes a signal for removing the long-term reference frame from memory.

14. The method of claim 13, further comprising removing the long-term reference frame from the reference list based on the signal.

15. The method of claim 10, wherein the decoder further comprises:
an entropy decoder processor configured to receive the bitstream and decode the bitstream into quantized coefficients;
an inverse quantization and inverse transform processor configured to process the quantized coefficients, including performing an inverse discrete cosine transform;
a deblocking filter;
a frame buffer; and
an intra-prediction processor.

16. The method of claim 10, further comprising:
receiving a coded block;
determining that an inter prediction mode is enabled for the coded block; and
determining a decoded block using the long-term reference frame as a reference frame and according to the inter prediction mode.

17. The method of claim 16, wherein the decoded block forms part of a quad tree plus binary decision tree.

18. The method of claim 17, wherein the decoded block is a non-leaf node of the quad tree plus binary decision tree.

19. A decoder configured to:
receive a bitstream including signaling information and a plurality of coded pictures, the plurality of coded pictures including a picture used as a long-term reference picture, a first coded picture, a second coded picture, and a subsequent coded picture, the picture used as the long-term reference picture being marked as an IDR picture; and
decode the plurality of coded pictures, including the picture used as the long-term reference picture, the first coded picture, the second coded picture, and the subsequent coded picture, and store the decoded plurality of coded pictures in a decoded picture buffer, wherein the decoding includes:
decoding the first coded picture by forming a first list of reference pictures from the decoded coded pictures in the decoded picture buffer, wherein one picture in the first list of reference pictures is the long-term reference picture, and decoding blocks of the first coded picture using the long-term reference picture;
indicating, in response to at least one parameter in the signaling information, that the long-term reference picture is unavailable as a reference picture while continuing to store the long-term reference picture in the decoded picture buffer;
decoding the second coded picture without using the long-term reference picture indicated as unavailable as a reference picture;
changing an indication of the long-term reference picture in a subsequent list of reference pictures from unavailable to available as a reference picture in response to at least one parameter in the signaling information; and
decoding a block from the subsequent coded picture using the long-term reference picture.

20. The decoder of claim 19, further configured to mark the long-term reference picture as unused for reference purposes and remove the long-term reference picture from the decoded picture buffer.

21. The decoder of claim 19, wherein the parameters are time-related parameters.

22. The decoder of claim 19, wherein the bitstream is a single-view bitstream.