CN118694964A - Unequal weight planar motion vector derivation

Info

Publication number: CN118694964A
Application number: CN202411021308.4A
Authority: CN
Inventors: 克里特·帕努索波内; 洪胜煜; 余越; 王利民
Original assignee: Arris Enterprises LLC
Current assignee: Arris Enterprises LLC
Priority date: 2018-04-15
Filing date: 2019-04-15
Publication date: 2024-09-24
Also published as: CN111955009A; CN118694963A; CN118764641A; CN118784863A; CN111955009B

Abstract

Unequal weight planar motion vector derivation is provided. A method of planar motion vector derivation can employ unequally weighted combinations of neighboring motion vectors. In some embodiments, motion vector information associated with a bottom-right pixel or block adjacent to a current coding unit can be derived from motion information associated with a top row or top neighboring row of the current coding unit and motion information associated with a left column or left neighboring column of the current coding unit. Weighted or unweighted combinations of these values can be used in a planar mode prediction model to derive the associated motion information for the bottom and/or right neighboring pixels or blocks.

Description

Unequal weight planar motion vector derivation

This application is a divisional application of patent application No. 201980025081.1 (PCT/US2019/027560), titled "Unequal Weight Planar Motion Vector Derivation", which has a filing date of April 15, 2019 and was submitted on October 10, 2020.

Priority claim

This application claims priority under 35 U.S.C. §119(e) to earlier-filed U.S. Provisional Application No. 62/657,831, filed on April 15, 2018, the entire contents of which are hereby incorporated by reference.

Technical Field

The present disclosure relates to the field of video coding, and more particularly to increased coding efficiency and a reduced memory burden associated with a reduction in the number of stored co-located pictures.

Background Art

The technical improvements in evolving video coding standards show a trend of increasing coding efficiency to enable higher bit rates, higher resolutions and better video quality. The Joint Video Exploration Team has developed a new video coding scheme referred to as JVET and is developing a newer video coding scheme referred to as Versatile Video Coding (VVC). The complete contents of the VVC 7th edition of draft 2 of the standard titled Versatile Video Coding (Draft 2), published by JVET on October 1, 2018, are hereby incorporated by reference. Similar to other video coding schemes such as HEVC (High Efficiency Video Coding), both JVET and VVC are block-based hybrid spatial and temporal predictive coding schemes. However, relative to HEVC, JVET and VVC include many modifications to bitstream structure, syntax, constraints, and mapping for the generation of decoded pictures. JVET has been implemented in Joint Exploration Model (JEM) encoders and decoders, but VVC is not anticipated to be implemented until early 2020.

Current and anticipated video coding schemes typically rely on a simple assumption of similarity between neighboring pixels or blocks to determine predicted intensity values for neighboring pixels or pixel blocks. The same process can be applied to the associated motion vectors (MVs). However, this assumption can lead to erroneous results. What is needed is a system and method of unequal weight planar motion vector derivation.

Summary of the invention

A system of one or more computers can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by a data processing apparatus, cause the apparatus to perform the actions. One general aspect includes: identifying a coding unit having a top neighboring row, a left neighboring column, a bottom neighboring row, and a right neighboring column; determining motion information associated with a bottom right neighboring pixel positioned at the intersection of the bottom neighboring row and the right neighboring column based at least in part on motion information associated with the top row and the left neighboring column; determining motion information associated with the right neighboring column based at least in part on the motion information associated with the bottom right neighboring pixel; and encoding the coding unit. Other embodiments of this aspect can include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the method.

Implementations can include one or more of the following features: determining motion information associated with the bottom neighboring row based at least in part on the motion information associated with the bottom right neighboring pixel; employing a planar coding mode; and determining a first weight value associated with the top neighboring row and a second weight value associated with the left neighboring column, where the step of determining the motion information associated with the bottom right neighboring pixel is based at least in part on a combination of the first weight value with the motion information associated with the top neighboring row and a combination of the second weight value with the motion information associated with the left neighboring column. Implementations of the described techniques can further include hardware, a method or process, or computer software on a computer-accessible medium.

In addition, a general aspect can include a system of video coding, including: storing in memory a coding unit having a top neighboring row, a left neighboring column, a bottom neighboring row, and a right neighboring column; determining and storing in the memory motion information associated with a bottom right neighboring pixel positioned at the intersection of the bottom neighboring row and the right neighboring column based at least in part on motion information associated with the top neighboring row and the left neighboring column; determining and storing in the memory motion information associated with the right neighboring column based at least in part on the motion information associated with the bottom right neighboring pixel; and encoding the coding unit. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the method.

Implementations can include one or more of the following features. The system of video coding further includes determining and storing in memory motion information associated with the bottom neighboring row based at least in part on the motion information associated with the bottom right neighboring pixel. The system of video coding further includes determining and storing in memory a first weight value associated with the top neighboring row and a second weight value associated with the left neighboring column, where the step of determining the motion information associated with the bottom right neighboring pixel is based at least in part on a combination of the first weight value with the motion information associated with the top neighboring row and a combination of the second weight value with the motion information associated with the left neighboring column. Implementations of the described techniques can include hardware, a method or process, or computer software on a computer-accessible medium.
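The weighted combination described above can be illustrated with a minimal Python sketch. It is not the claimed method: the choice of which top-row and left-column motion vectors feed the combination, the weight values, the normalisation by the weight sum, and the linear interpolation down the right neighboring column are all assumptions made for illustration, and the names `weighted_mv` and `right_column_mvs` are hypothetical.

```python
from typing import List, Tuple

MV = Tuple[float, float]  # (horizontal, vertical) displacement


def weighted_mv(mv_a: MV, w_a: float, mv_b: MV, w_b: float) -> MV:
    """Combine two motion vectors with (possibly unequal) weights.

    The result is normalised by the weight sum (an assumption).
    """
    total = w_a + w_b
    return ((w_a * mv_a[0] + w_b * mv_b[0]) / total,
            (w_a * mv_a[1] + w_b * mv_b[1]) / total)


def right_column_mvs(mv_top_right: MV, mv_bottom_right: MV, height: int) -> List[MV]:
    """Interpolate motion vectors for rows 0..height-1 of the right neighboring column.

    The top-right neighbor is assumed to sit at row -1 and the bottom-right
    neighbor at row `height`; each row weights a neighbor by its distance
    to the opposite neighbor (hypothetical linear model).
    """
    return [weighted_mv(mv_top_right, height - y, mv_bottom_right, y + 1)
            for y in range(height)]


# Bottom-right MV from a top-row MV (weight 1) and a left-column MV (weight 3).
bottom_right = weighted_mv((4, 0), 1, (0, 4), 3)
```

With unequal weights, the derived bottom-right motion vector leans toward whichever neighbor carries the larger weight; equal weights reduce the combination to a plain average.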

BRIEF DESCRIPTION OF THE DRAWINGS

Further details of the invention are explained with the aid of the accompanying drawings, in which:

FIG. 1 depicts the partitioning of a frame into a plurality of coding tree units (CTUs).

FIGS. 2a-2c depict an exemplary partitioning of a CTU into coding units (CUs).

FIG. 3 depicts a quadtree plus binary tree (QTBT) representation of the CU partitioning of FIG. 2.

FIG. 4 depicts a simplified block diagram for CU coding in a JVET or VVC encoder.

FIG. 5 depicts possible intra prediction modes for luma components in JVET or VVC.

FIG. 6 depicts a simplified block diagram for CU coding in a JVET or VVC decoder.

FIGS. 7a-7b depict graphical representations of the horizontal and vertical prediction calculators, respectively.

FIG. 8 depicts an exemplary embodiment of the weight parameter S[n], where the sum of width and height is 256.

FIG. 9 depicts an exemplary embodiment of the weight parameter S[n], where the sum of width and height is 512.

FIG. 10 depicts a block flow diagram of a system and method of unequal weight planar motion vector derivation.

FIG. 11 depicts an embodiment of a computer system adapted and configured to provide variable template sizes for template matching.

FIG. 12 depicts an embodiment of a video encoder/decoder adapted and configured to provide variable template sizes for template matching.

DETAILED DESCRIPTION

FIG. 1 depicts the partitioning of a frame into a plurality of coding tree units (CTUs) 100. A frame can be an image in a video sequence. A frame can include a matrix, or a set of matrices, with pixel values representing intensity measures in the image. A set of these matrices can thus generate a video sequence. Pixel values can be defined to represent color and brightness in full-color video coding, where pixels are divided into three channels. For example, in a YCbCr color space, a pixel can have a luma value, Y, that represents gray-level intensity in the image, and two chrominance values, Cb and Cr, that represent the extent to which the color differs from gray toward blue and red. In other embodiments, pixel values can be represented with values in different color spaces or models. The resolution of the video can determine the number of pixels in a frame. A higher resolution can mean more pixels and a better definition of the image, but can also lead to higher bandwidth, storage, and transmission requirements.

Frames of a video sequence can be encoded and decoded using JVET. JVET is a video coding scheme being developed by the Joint Video Exploration Team. Multiple versions of JVET have been implemented in JEM (Joint Exploration Model) encoders and decoders. Similar to other video coding schemes such as HEVC (High Efficiency Video Coding), JVET is a block-based hybrid spatial and temporal predictive coding scheme. During coding with JVET, a frame is first divided into square blocks called CTUs 100, as shown in FIG. 1. For example, a CTU 100 can be a block of 128×128 pixels.

FIG. 2a depicts an exemplary partitioning of a CTU 100 into CUs 102. Each CTU 100 in a frame can be partitioned into one or more CUs (coding units) 102. CUs 102 can be used for prediction and transform as described below. Unlike HEVC, in JVET the CUs 102 can be rectangular or square, and can be coded without further partitioning into prediction units or transform units. A CU 102 can be as large as its root CTU 100, or be a smaller subdivision of the root CTU 100 as small as a 4×4 block.

In JVET, a CTU 100 can be partitioned into CUs 102 according to a quadtree plus binary tree (QTBT) scheme, in which the CTU 100 can be recursively split into square blocks according to a quadtree, and those square blocks can then be recursively split horizontally or vertically according to binary trees. Parameters can be set to control the splitting according to the QTBT, such as the CTU size, the minimum sizes for the quadtree and binary tree leaf nodes, the maximum size for the binary tree root node, and the maximum depth for the binary trees. In VVC, a CTU 100 can also be partitioned into CUs using ternary tree splitting.

As a non-limiting example, FIG. 2a shows a CTU 100 partitioned into CUs 102, with solid lines indicating quadtree splitting and dashed lines indicating binary tree splitting. As illustrated, the binary splitting allows horizontal splitting and vertical splitting to define the structure of the CTU and its subdivision into CUs. FIGS. 2b and 2c depict alternative, non-limiting examples of ternary tree splitting of a CU, in which the subdivisions of the CU are unequal.

FIG. 3 depicts a QTBT representation of the partitioning of FIG. 2. The quadtree root node represents the CTU 100, and each child node in the quadtree portion represents one of four square blocks split from a square parent block. The square blocks represented by the quadtree leaf nodes can then be split zero or more times using binary trees, with the quadtree leaf nodes serving as the root nodes of the binary trees. At each level of the binary tree portion, a block can be split either vertically or horizontally. A flag set to "0" indicates that the block is split horizontally, while a flag set to "1" indicates that the block is split vertically.
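The quadtree and binary tree splits described above can be sketched in Python as operations on a block given as (x, y, width, height). This is an illustrative sketch only: the function names are hypothetical, and the reading of a "horizontal" split as producing a top half and a bottom half is an assumption about the flag semantics.

```python
from typing import List, Tuple

Block = Tuple[int, int, int, int]  # (x, y, width, height)


def quad_split(block: Block) -> List[Block]:
    """Split a square block into four equal square children (quadtree step)."""
    x, y, w, h = block
    hw, hh = w // 2, h // 2
    return [(x, y, hw, hh), (x + hw, y, hw, hh),
            (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]


def binary_split(block: Block, flag: int) -> List[Block]:
    """Split a block in two (binary tree step).

    flag 0 = horizontal split (assumed: top and bottom halves),
    flag 1 = vertical split (assumed: left and right halves).
    """
    x, y, w, h = block
    if flag == 0:
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]


# A 128x128 CTU split once by the quadtree, then its first child split vertically.
children = quad_split((0, 0, 128, 128))
cus = binary_split(children[0], 1)
```

The binary step is what allows rectangular CUs: after the vertical split above, the two resulting blocks are 32 wide and 64 tall.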

After quadtree splitting and binary tree splitting, the blocks represented by the leaf nodes of the QTBT represent the final CUs 102 to be coded, for example using inter prediction or intra prediction. For slices or full frames coded with inter prediction, different partitioning structures can be used for luma and chroma components. For example, for an inter slice, a CU 102 can have coding blocks (CBs) for different color components, such as one luma CB and two chroma CBs. For slices or full frames coded with intra prediction, the partitioning structure can be the same for luma and chroma components.

FIG. 4 depicts a simplified block diagram for CU coding in a JVET encoder. The main stages of video coding include partitioning to identify CUs 102 as described above, followed by encoding the CUs 102 using prediction at 404 or 406, generating a residual CU 410 at 408, transforming at 412, quantizing at 416, and entropy coding at 420. The encoder and encoding process illustrated in FIG. 4 also include a decoding process that is described in more detail below.

Given a current CU 102, the encoder can obtain a prediction CU 402 either spatially using intra prediction at 404 or temporally using inter prediction at 406. The basic idea of predictive coding is to transmit a differential, or residual, signal between the original signal and a prediction for the original signal. At the receiver side, the original signal can be reconstructed by adding the residual and the prediction, as described below. Because the differential signal has a lower correlation than the original signal, fewer bits are needed for its transmission.
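The residual relationship described above can be sketched in a few lines of Python. The sketch ignores transform and quantization, so the reconstruction is exact; the helper names and sample values are hypothetical.

```python
from typing import List

Samples = List[List[int]]


def residual(original: Samples, prediction: Samples) -> Samples:
    """Residual = original minus prediction, sample by sample."""
    return [[o - p for o, p in zip(row_o, row_p)]
            for row_o, row_p in zip(original, prediction)]


def reconstruct(prediction: Samples, res: Samples) -> Samples:
    """Receiver side: add the residual back onto the prediction."""
    return [[p + r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(prediction, res)]


original = [[100, 102], [98, 101]]
prediction = [[99, 101], [99, 100]]
res = residual(original, prediction)       # small values around zero
rebuilt = reconstruct(prediction, res)     # equals `original` in this lossless sketch
```

When the prediction is good, the residual values cluster near zero, which is what makes them cheaper to code than the original samples.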

A slice, such as an entire picture or a portion of a picture, coded entirely with intra-predicted CUs can be an I slice that can be decoded without reference to other slices, and as such can be a possible point where decoding can begin. A slice coded with at least some inter-predicted CUs can be a predictive (P) or bi-predictive (B) slice that can be decoded based on one or more reference pictures. P slices can use intra prediction and inter prediction with previously coded slices. For example, P slices can be compressed further than I slices by the use of inter prediction, but they need a previously coded slice in order to be coded. B slices can use data from previous and/or subsequent slices for their coding, using intra prediction or inter prediction with an interpolated prediction from two different frames, thereby increasing the accuracy of the motion estimation process. In some cases, P slices and B slices can also or alternatively be encoded using intra block copy, in which data from other portions of the same slice is used.

As will be discussed below, intra prediction or inter prediction can be performed based on reconstructed CUs 434 obtained from previously coded CUs 102, such as neighboring CUs 102 or CUs 102 in reference pictures.

When a CU 102 is coded spatially with intra prediction at 404, an intra prediction mode can be found that best predicts the pixel values of the CU 102 based on samples from neighboring CUs 102 in the picture.

When coding a CU's luma component, the encoder can generate a list of candidate intra prediction modes. While HEVC had 35 possible intra prediction modes for luma components, in JVET there are 67 possible intra prediction modes for luma components, and in VVC there are 85 prediction modes. These include a planar mode that uses a three-dimensional plane of values generated from neighboring pixels, a DC mode that uses values averaged from neighboring pixels, the 65 directional modes shown in FIG. 5 that use values copied from neighboring pixels along the directions indicated by the solid lines, and 18 wide-angle prediction modes that can be used with non-square blocks.
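For orientation, the DC and planar modes mentioned above can be sketched as follows. The planar function follows the familiar HEVC-style equal-weight formulation for a square N×N block, which is an assumption brought in for illustration rather than the unequal-weight derivation of this disclosure; the function names and the integer rounding are likewise illustrative.

```python
from typing import List


def dc_predict(top: List[int], left: List[int]) -> List[List[int]]:
    """DC mode: fill the block with the rounded mean of the neighboring samples."""
    n = len(top) + len(left)
    dc = (sum(top) + sum(left) + n // 2) // n
    return [[dc] * len(top) for _ in left]


def planar_predict(top: List[int], left: List[int],
                   top_right: int, bottom_left: int) -> List[List[int]]:
    """HEVC-style planar mode for an N x N block (equal weights, assumed).

    Each sample averages a horizontal interpolation (left column toward the
    top-right neighbor) and a vertical one (top row toward the bottom-left).
    """
    n = len(top)
    pred = []
    for y in range(n):
        row = []
        for x in range(n):
            horiz = (n - 1 - x) * left[y] + (x + 1) * top_right
            vert = (n - 1 - y) * top[x] + (y + 1) * bottom_left
            row.append((horiz + vert + n) // (2 * n))
        pred.append(row)
    return pred


flat = planar_predict([50] * 4, [50] * 4, 50, 50)   # flat neighbors give a flat block
ramp = planar_predict([0] * 4, [0] * 4, 8, 8)       # values rise toward the bottom-right
```

In the equal-weight form, the horizontal and vertical interpolations contribute equally to every sample regardless of how far the sample is from each reference.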

When generating a list of candidate intra prediction modes for a CU's luma component, the number of candidate modes on the list can depend on the CU's size. The candidate list can include: a subset of HEVC's 35 modes with the lowest SATD (Sum of Absolute Transform Difference) costs; new directional modes added for JVET that neighbor the candidates found from the HEVC modes; and modes from a set of six most probable modes (MPMs) for the CU 102 that are identified based on the intra prediction modes used for previously coded neighboring blocks, as well as from a list of default modes.

When coding a CU's chroma components, a list of candidate intra prediction modes can also be generated. The list of candidate modes can include modes generated with cross-component linear model projection from luma samples, intra prediction modes found for luma CBs at particular co-located positions in the chroma block, and chroma prediction modes previously found for neighboring blocks. The encoder can find the candidate modes on the lists with the lowest rate-distortion costs and use those intra prediction modes when coding the CU's luma and chroma components. Syntax can be coded in the bitstream that indicates the intra prediction modes used to code each CU 102.

After the best intra prediction modes for a CU 102 have been selected, the encoder can generate a prediction CU 402 using those modes. When the selected mode is a directional mode, a 4-tap filter can be used to improve the directional accuracy. Columns or rows at the top or left side of the prediction block can be adjusted with boundary prediction filters, such as 2-tap or 3-tap filters.

The prediction CU 402 can be smoothed further with a position dependent intra prediction combination (PDPC) process that adjusts a prediction CU 402 generated based on filtered samples of neighboring blocks using unfiltered samples of neighboring blocks, or with adaptive reference sample smoothing that uses 3-tap or 5-tap low-pass filters to process the reference samples.

When a CU 102 is coded temporally with inter prediction at 406, a set of motion vectors (MVs) can be found that point to samples in reference pictures that best predict the pixel values of the CU 102. Inter prediction exploits temporal redundancy between slices by representing a displacement of a block of pixels in a slice. The displacement is determined according to the values of pixels in previous or following slices through a process called motion compensation. Motion vectors and associated reference indices that indicate pixel displacement relative to a particular reference picture can be provided in the bitstream to a decoder, along with the residual between the original pixels and the motion-compensated pixels. The decoder can use the residual and the signaled motion vectors and reference indices to reconstruct a block of pixels in a reconstructed slice.

In JVET, motion vectors can be stored with an accuracy of 1/16 pixel, and the difference between a motion vector and a CU's predicted motion vector can be coded with either quarter-pixel resolution or integer-pixel resolution.
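The two precisions mentioned above can be sketched as follows, with motion vectors held as integers in 1/16-pixel units. The rounding rule (round half away from zero) and the names used here are assumptions made for illustration and are not taken from JVET.

```python
MV_STORAGE_SHIFT = 4  # 2**4 = 16 storage units per pixel (1/16-pixel accuracy)


def to_storage_units(pixels: float) -> int:
    """Convert a displacement in pixels to integer 1/16-pixel units."""
    return round(pixels * (1 << MV_STORAGE_SHIFT))


def round_mvd(mvd_16th: int, resolution: str) -> int:
    """Round an MV difference (1/16-pixel units) to quarter- or integer-pixel steps.

    Rounds half away from zero, which is an assumed convention.
    """
    step = {"quarter": 4, "integer": 16}[resolution]
    sign = -1 if mvd_16th < 0 else 1
    return sign * ((abs(mvd_16th) + step // 2) // step) * step


stored = to_storage_units(2.25)           # 2.25 pixels in 1/16-pixel units
coarse = round_mvd(37, "integer")         # nearest whole pixel, still in 1/16 units
```

Keeping the stored vector at 1/16-pixel accuracy while coding the difference at a coarser step trades a small loss in precision for fewer bits spent on the difference.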

In JVET, motion vectors can be found for multiple sub-CUs within a CU 102 using techniques such as advanced temporal motion vector prediction (ATMVP), spatial-temporal motion vector prediction (STMVP), affine motion compensation prediction, pattern matched motion vector derivation (PMMVD), and/or bi-directional optical flow (BIO).

Using ATMVP, the encoder can find a temporal vector for the CU 102 that points to a corresponding block in a reference picture. The temporal vector can be found based on motion vectors and reference pictures found for previously coded neighboring CUs 102. Using the reference block pointed to by the temporal vector for the entire CU 102, a motion vector can be found for each sub-CU within the CU 102.

STMVP can find motion vectors for sub-CUs by scaling and averaging motion vectors found for neighboring blocks previously coded with inter prediction, together with a temporal vector.

Affine motion compensation prediction can be used to predict a field of motion vectors for each sub-CU in a block, based on two control motion vectors found for the top corners of the block. For example, motion vectors for sub-CUs can be derived based on the top corner motion vectors found for each 4×4 block within the CU 102.
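A motion vector field driven by two top-corner control vectors can be sketched with the commonly used four-parameter affine model. The model form, the use of sub-block centers as sample positions, and the function names are assumptions made for illustration; the text above does not specify them.

```python
from typing import List, Tuple

MV = Tuple[float, float]


def affine_mv(v0: MV, v1: MV, width: int, x: float, y: float) -> MV:
    """Four-parameter affine model (assumed form).

    v0 is the control MV at the top-left corner, v1 at the top-right corner.
    """
    a = (v1[0] - v0[0]) / width
    b = (v1[1] - v0[1]) / width
    return (a * x - b * y + v0[0], b * x + a * y + v0[1])


def sub_block_mv_field(v0: MV, v1: MV, width: int, height: int,
                       sub: int = 4) -> List[List[MV]]:
    """One MV per sub x sub block, evaluated at the block center (assumed)."""
    return [[affine_mv(v0, v1, width, x0 + sub / 2, y0 + sub / 2)
             for x0 in range(0, width, sub)]
            for y0 in range(0, height, sub)]


# Identical control vectors describe pure translation: every sub-block gets the same MV.
field = sub_block_mv_field((2, 3), (2, 3), width=16, height=8)
```

When the two control vectors differ, the field varies smoothly across the block, which lets a single pair of vectors describe rotation or zoom.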

PMMVD can find an initial motion vector for the current CU 102 using bilateral matching or template matching. Bilateral matching can look at the current CU 102 and reference blocks in two different reference pictures along a motion trajectory, while template matching can look at corresponding blocks in the current CU 102 and a reference picture identified by a template. The initial motion vector found for the CU 102 can then be refined individually for each sub-CU.

BIO can be used when inter prediction is performed with bi-prediction based on earlier and later reference pictures, and allows motion vectors to be found for sub-CUs based on the gradient of the difference between the two reference pictures.

In some situations, local illumination compensation (LIC) can be used at the CU level to find values for a scaling factor parameter and an offset parameter, based on samples neighboring the current CU 102 and corresponding samples neighboring a reference block identified by a candidate motion vector. In JVET, the LIC parameters can change and be signaled at the CU level.

For some of the above methods, the motion vectors found for each of a CU's sub-CUs can be signaled to decoders at the CU level. For other methods, such as PMMVD and BIO, motion information is not signaled in the bitstream in order to save overhead, and decoders can derive the motion vectors through the same processes.

After the motion vectors for a CU 102 have been found, the encoder can generate a prediction CU 402 using those motion vectors. In some cases, when motion vectors have been found for individual sub-CUs, overlapped block motion compensation (OBMC) can be used when generating a prediction CU 402 by combining those motion vectors with motion vectors previously found for one or more neighboring sub-CUs.

When bi-prediction is used, JVET can use decoder-side motion vector refinement (DMVR) to find motion vectors. DMVR allows a motion vector to be found based on the two motion vectors found for bi-prediction, using a bilateral template matching process. In DMVR, a weighted combination of the prediction CUs 402 generated with each of the two motion vectors can be found, and the two motion vectors can be refined by replacing them with new motion vectors that best point to the combined prediction CU 402. The two refined motion vectors can be used to generate the final prediction CU 402.

At 408, once a prediction CU 402 has been found with intra prediction at 404 or inter prediction at 406 as described above, the encoder can subtract the prediction CU 402 from the current CU 102 to find a residual CU 410.

The encoder can use one or more transform operations at 412 to convert the residual CU 410 into transform coefficients 414 that express the residual CU 410 in a transform domain, such as using a discrete cosine block transform (DCT transform) to convert the data into the transform domain. JVET allows more types of transform operations than HEVC, including DCT-II, DST-VII, DCT-VIII, DST-I, and DCT-V operations. The allowed transform operations can be grouped into subsets, and an indication of which subsets, and which specific operations in those subsets, were used can be signaled by the encoder. In some cases, large block-size transforms can be used to zero out high-frequency transform coefficients in CUs 102 larger than a certain size, such that only the lower-frequency transform coefficients are maintained for those CUs 102.
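As a small illustration of the transform step, the following sketch computes a one-dimensional orthonormal DCT-II and zeroes out the higher-frequency coefficients. It uses floating-point arithmetic and a one-dimensional signal for brevity, whereas practical codecs use integer approximations applied in two dimensions; the function names and the `keep` parameter are hypothetical.

```python
import math
from typing import List


def dct_ii(signal: List[float]) -> List[float]:
    """Orthonormal 1-D DCT-II of a residual row (floating-point sketch)."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(s * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, s in enumerate(signal)))
    return out


def zero_high_frequencies(coeffs: List[float], keep: int) -> List[float]:
    """Keep only the first `keep` (lowest-frequency) coefficients."""
    return [c if k < keep else 0.0 for k, c in enumerate(coeffs)]


coeffs = dct_ii([1.0, 1.0, 1.0, 1.0])       # a flat row has energy only in the DC term
low_only = zero_high_frequencies([5.0, 3.0, 2.0, 1.0], keep=2)
```

A smooth residual concentrates its energy in the first few coefficients, which is why discarding the high-frequency ones for large blocks loses little information.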

在一些情况下，可以在正向内核变换之后向低频变换系数414应用模式相关的非独立二次变换(MDNSST)。MDNSST操作可以使用基于旋转数据的Hypercube-Givens变换(HyGT)。在使用时，可以由编码器信令通知标识特定MDNSST操作的索引值。In some cases, a mode-dependent non-independent secondary transform (MDNSST) may be applied to the low-frequency transform coefficients 414 after the forward kernel transform. The MDNSST operation may use a Hypercube-Givens transform (HyGT) based on rotated data. When used, an index value identifying a specific MDNSST operation may be signaled by the encoder.

At 416, the encoder can quantize the transform coefficients 414 into quantized transform coefficients 418. The quantization of each coefficient can be computed by dividing the coefficient value by a quantization step size, which is derived from a quantization parameter (QP). In some embodiments, Qstep is defined as 2^((QP-4)/6). Because high-precision transform coefficients 414 are converted into quantized transform coefficients 418 with a finite number of possible values, quantization assists data compression: it limits the number of bits generated and sent by the transform process. Quantization is a lossy operation, and the loss it introduces cannot be recovered, so the quantization process trades off the quality of the reconstructed sequence against the amount of information needed to represent it. For example, a lower QP value can produce better-quality decoded video, although more data is required to represent and transmit it. Conversely, a high QP value can produce a lower-quality reconstructed video sequence with lower data and bandwidth requirements.
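By way of non-limiting illustration, the relationship between QP and the step size can be sketched as follows. The function names are hypothetical, and floating-point arithmetic is used only for readability; practical codecs typically use integer arithmetic with rounding offsets.

```python
def qstep(qp: int) -> float:
    """Quantization step size, Qstep = 2^((QP - 4) / 6)."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeff: float, qp: int) -> int:
    """Divide the coefficient by the step size and round to an integer level."""
    return int(round(coeff / qstep(qp)))


def dequantize(level: int, qp: int) -> float:
    """Multiply the level by the step size; the rounding loss is not recovered."""
    return level * qstep(qp)
```

Under this definition the step size doubles each time QP increases by 6, which is why a higher QP yields fewer distinct levels and a coarser reconstruction.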

JVET can utilize variance-based adaptive quantization techniques, which allow every CU 102 to use a different quantization parameter for its coding process (instead of using the same frame QP when coding every CU 102 of the frame). The variance-based adaptive quantization techniques adaptively lower the quantization parameter of certain blocks while increasing it in others. To select a specific QP for a CU 102, the CU's variance is computed. In brief, if a CU's variance is higher than the average variance of the frame, a QP higher than the frame's QP can be set for the CU 102. If the CU 102 presents a lower variance than the average variance of the frame, a lower QP can be assigned.
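A minimal sketch of this selection rule is given below. The function name and the fixed offset `delta` are assumptions made for illustration; the description does not specify how far the CU's QP departs from the frame QP.

```python
from statistics import pvariance


def select_cu_qp(cu_pixels, frame_avg_variance, frame_qp, delta=2):
    """Pick a per-CU QP from the CU's variance relative to the frame average.

    `delta` is a hypothetical fixed offset, not taken from the description.
    """
    variance = pvariance(cu_pixels)
    if variance > frame_avg_variance:
        return frame_qp + delta  # higher variance than the frame: higher QP
    if variance < frame_avg_variance:
        return frame_qp - delta  # lower variance than the frame: lower QP
    return frame_qp
```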

At 420, the encoder can find the final compressed bits 422 by entropy coding the quantized transform coefficients 418. Entropy coding aims to remove statistical redundancies from the information to be transmitted. In JVET, CABAC (context adaptive binary arithmetic coding), which uses probability measures to remove statistical redundancies, can be used to code the quantized transform coefficients 418. For CUs 102 with non-zero quantized transform coefficients 418, the quantized transform coefficients 418 can be converted into binary. Each bit ("bin") of the binary representation can then be encoded using a context model. A CU 102 can be broken up into three regions, each with its own set of context models to use for pixels within that region.

Multiple scan passes can be performed to encode the bins. During the passes that encode the first three bins (bin0, bin1, and bin2), an index value indicating which context model to use for a bin can be found by summing that bin position over up to five previously coded neighboring quantized transform coefficients 418 identified by a template.

A context model can be based on the probabilities of a bin's value being "0" or "1". As values are coded, the probabilities in the context model can be updated based on the actual number of "0" and "1" values encountered. While HEVC uses fixed tables to re-initialize context models for each new picture, in JVET the probabilities of context models for new inter-predicted pictures can be initialized based on context models developed for previously coded inter-predicted pictures.

The encoder can produce a bitstream that contains the entropy-coded bits 422 of residual CUs 410, prediction information such as selected intra prediction modes or motion vectors, indicators of how the CUs 102 were partitioned from a CTU 100 according to the QTBT structure, and/or other information about the encoded video. The bitstream can be decoded by a decoder as discussed below.

In addition to using the quantized transform coefficients 418 to find the final compressed bits 422, the encoder can also use the quantized transform coefficients 418 to generate reconstructed CUs 434 by following the same decoding process that a decoder would use to generate reconstructed CUs 434. Thus, once the transform coefficients have been computed and quantized by the encoder, the quantized transform coefficients 418 can be passed to the decoding loop in the encoder. After quantization of a CU's transform coefficients, the decoding loop allows the encoder to generate a reconstructed CU 434 identical to the one the decoder generates in the decoding process. Accordingly, when performing intra prediction or inter prediction for a new CU 102, the encoder can use the same reconstructed CUs 434 that a decoder would use for neighboring CUs 102 or reference pictures. Reconstructed CUs 102, reconstructed slices, or full reconstructed frames can serve as references for further prediction stages.

At the encoder's decoding loop (and, as described below, for the same operations in the decoder), a dequantization process can be performed to obtain pixel values for the reconstructed image. To dequantize a frame, for example, the quantized value for each pixel of the frame is multiplied by the quantization step size, e.g., Qstep as described above, to obtain reconstructed dequantized transform coefficients 426. For example, in the decoding process shown in FIG. 4, in the encoder, the quantized transform coefficients 418 of a residual CU 410 can be dequantized at 424 to find dequantized transform coefficients 426. If an MDNSST operation was performed during encoding, that operation can be reversed after dequantization.

At 428, the dequantized transform coefficients 426 can be inverse transformed to find a reconstructed residual CU 430, for example by applying a DCT to the values to obtain the reconstructed image. At 432, the reconstructed residual CU 430 can be added to the corresponding prediction CU 402, found with intra prediction at 404 or inter prediction at 406, to find a reconstructed CU 434.

At 436, one or more filters can be applied to the reconstructed data during the decoding process (in the encoder or, as described below, in the decoder), at either a picture level or a CU level. For example, the encoder can apply a deblocking filter, a sample adaptive offset (SAO) filter, and/or an adaptive loop filter (ALF). The encoder's decoding process can implement filters to estimate the optimal filter parameters that address potential artifacts in the reconstructed image, and transmit those parameters to a decoder. Such improvements increase the objective and subjective quality of the reconstructed video. In deblocking filtering, pixels near a sub-CU boundary can be modified, whereas in SAO, pixels in a CTU 100 can be modified using either an edge offset or a band offset classification. JVET's ALF can use filters with circularly symmetric shapes for each 2×2 block. An indication of the size and identity of the filter used for each 2×2 block can be signaled.

If reconstructed pictures are reference pictures, they can be stored in a reference buffer 438 for inter prediction of future CUs 102 at 406.

During the above steps, JVET allows content adaptive clipping operations to be used to adjust color values so that they fall between lower and upper clipping bounds. The clipping bounds can change for each slice, and parameters identifying the bounds can be signaled in the bitstream.

FIG. 6 depicts a simplified block diagram for CU coding in a JVET decoder. A JVET decoder can receive a bitstream containing information about encoded CUs 102. The bitstream can indicate how the CUs 102 of a picture were partitioned from a CTU 100 according to a QTBT structure, prediction information for the CUs 102 (e.g., intra prediction modes or motion vectors), and bits 602 representing entropy-encoded residual CUs.

At 604, the decoder can decode the entropy-encoded bits 602 using the CABAC context models signaled in the bitstream by the encoder. The decoder can use parameters signaled by the encoder to update the context models' probabilities in the same way they were updated during encoding.

After reversing the entropy encoding at 604 to find quantized transform coefficients 606, the decoder can dequantize them at 608 to find dequantized transform coefficients 610. If an MDNSST operation was performed during encoding, that operation can be reversed by the decoder after dequantization.

At 612, the dequantized transform coefficients 610 can be inverse transformed to find a reconstructed residual CU 614. At 616, the reconstructed residual CU 614 can be added to the corresponding prediction CU 626, found with intra prediction at 622 or inter prediction at 624, to find a reconstructed CU 618.

At 620, one or more filters can be applied to the reconstructed data, at either a picture level or a CU level. For example, the decoder can apply a deblocking filter, a sample adaptive offset (SAO) filter, and/or an adaptive loop filter (ALF). As described above, the in-loop filters located in the decoding loop of the encoder can be used to estimate optimal filter parameters that increase the objective and subjective quality of a frame. These parameters are transmitted to the decoder, which filters the reconstructed frame at 620 so that it matches the filtered reconstructed frame in the encoder.

After reconstructed pictures have been generated by finding reconstructed CUs 618 and applying the signaled filters, the decoder can output the reconstructed pictures as output video 628. If reconstructed pictures are to be used as reference pictures, they can be stored in a reference buffer 630 for inter prediction of future CUs 102 at 624.

Planar mode is often the most frequently used intra coding mode in VVC, HEVC, and JVET. FIGS. 7a and 7b show the VVC, HEVC, and JVET planar predictor generation process for a coding unit (block) with height H = 8 702 and width W = 8 704, for the horizontal predictor calculation (FIG. 7a) 700 and the vertical predictor calculation (FIG. 7b) 710, where the (0,0) coordinate 706 corresponds to the top-left position within the coding CU.

Planar mode in VVC, HEVC, and JVET (HEVC Planar) generates a first-order approximation of the prediction for the current coding unit (CU) by forming a plane based on the intensity values of the neighboring pixels. Due to the raster-scan coding order, the reconstructed left-column neighboring pixels and the reconstructed top-row neighboring pixels are available for the current CU, but the right-column and bottom-row neighboring pixels are not. The planar predictor generation process of VVC, HEVC, and JVET sets the intensity values of all right-column neighboring pixels equal to the intensity value of the top-right neighboring pixel, and the intensity values of all bottom-row pixels equal to the intensity value of the bottom-left neighboring pixel.

Once the neighboring pixels surrounding the prediction block are defined, the horizontal and vertical predictors (P_h(x,y) and P_v(x,y), respectively) for each pixel within the CU can be determined according to the following equations:

P_h(x,y) = (W-1-x)*R(-1,y) + (x+1)*R(W,-1)

P_v(x,y) = (H-1-y)*R(x,-1) + (y+1)*R(-1,H)

where

R(x,y) denotes the intensity value of the reconstructed neighboring pixel at the (x,y) coordinate;

W is the block width; and

H is the block height.

From these values, the final planar predictor P(x,y) is computed by averaging the horizontal and vertical predictors, with certain adjustments when the current CU is non-square, according to the following equation:
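The planar predictor generation described above can be sketched as follows. The horizontal and vertical predictors follow the equations for P_h and P_v given above. The final averaging equation is not reproduced in this text, so the normalization used here, (H*P_h + W*P_v + W*H) / (2*W*H) with a rounding offset of W*H, is an assumption, as are the function and argument names.

```python
def planar_predict(top, left, top_right, bottom_left):
    """Planar intra prediction sketch.

    top[x]      = R(x, -1) for x in 0..W-1
    left[y]     = R(-1, y) for y in 0..H-1
    top_right   = R(W, -1)
    bottom_left = R(-1, H)
    Returns pred[y][x] for the W x H block.
    """
    W, H = len(top), len(left)
    pred = [[0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            p_h = (W - 1 - x) * left[y] + (x + 1) * top_right
            p_v = (H - 1 - y) * top[x] + (y + 1) * bottom_left
            # Assumed normalization: p_h is scaled by W and p_v by H, so
            # cross-multiplying gives both a common scale of W*H.
            pred[y][x] = (H * p_h + W * p_v + W * H) // (2 * W * H)
    return pred
```

With constant neighbors the sketch returns a constant block, which is the expected behavior of a planar fit.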

This planar prediction concept can be applied to determine MVs at a fine granularity. In VVC and JVET, each 4×4 sub-block within a CU can have its own MV. If MV(x,y) is taken to be the MV of the sub-block containing pixel (x,y), the planar intra prediction concept described earlier carries over by replacing the reconstructed pixel R(x,y) with the MV at the sub-block level, MV(x,y). Note in particular that for intra planar, R(x,y) is a 1D intensity, whereas for inter planar, P(x,y) is a 2D MV. That is, the horizontal and vertical predictors P_h(x,y) and P_v(x,y) can be converted to horizontal and vertical MV predictors, MV_h(x,y) and MV_v(x,y), respectively. The final planar MV, MV(x,y), can then be computed by averaging the horizontal and vertical predictors.

In some embodiments, multiple reference slices can be available for temporal prediction. Neighboring sub-blocks can therefore use different references for their associated MVs. For simplicity, where more than one reference is employed, a single reference can be used when combining more than one MV. In some embodiments, one possible choice when combining multiple references is to select the reference of the neighboring sub-block that is closest to the coding picture in terms of POC distance.

Planar derivation of MVs can require the MVs of the neighboring sub-blocks surrounding the CU. However, in some embodiments, some neighboring sub-blocks may be considered unavailable for planar derivation because they do not have a suitable MV. By way of non-limiting example, a neighboring sub-block may be coded in intra mode, may not use the appropriate reference list, or may not use the appropriate reference slice. In such cases, a default MV or a substitute MV can be used in place of the specific MV of the neighboring sub-block. A substitute MV can, for example, be based on the MV of the first available adjacent neighbor. In an alternative embodiment in which a neighboring sub-block MV does not use the appropriate reference slice, one possible choice is to scale the available MV to the desired reference slice using a weighting factor given by the ratio of temporal distances.
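As a non-limiting sketch of scaling by the ratio of temporal distances, the function below rescales an MV that points to one reference so that it points to another. The function name, argument names, and the use of exact fractions are assumptions for illustration; a practical codec would typically use fixed-point arithmetic with clipping.

```python
from fractions import Fraction


def scale_mv(mv, poc_current, poc_ref_source, poc_ref_target):
    """Scale mv = (mvx, mvy) from the source reference to the target reference.

    The weighting factor is the ratio of the POC distance to the target
    reference over the POC distance to the source reference.
    """
    td = poc_current - poc_ref_source  # distance covered by the available MV
    tb = poc_current - poc_ref_target  # distance to the desired reference
    if td == 0:
        raise ValueError("source reference must differ from the current picture")
    factor = Fraction(tb, td)
    return (round(mv[0] * factor), round(mv[1] * factor))
```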

The present disclosure presents a system and method for deriving the MV of the bottom-right neighboring sub-block of the current CU (the lifting process), and then using the derived MV of the bottom-right neighboring sub-block, together with the MVs of the other corner neighboring sub-blocks (e.g., the top-right and bottom-left neighboring sub-blocks), to compute the MVs of the bottom-row and right-column neighboring sub-blocks.

In some embodiments, the bottom-right sub-block MV derivation process can be a weighted average of the MVs of the top-right and bottom-left neighboring sub-blocks, as defined in the equations presented below:
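The weighted-average equations are not reproduced in this text. Exact-division forms consistent with the shift-based versions given later in this description, in which multiplication by S[W+H] followed by a right shift of ShiftDenom stands in for division by W+H, would be the following; they are a reconstruction under that assumption, not a quotation:

```latex
MV(W,H) = \frac{W \cdot MV(W,-1) + H \cdot MV(-1,H)}{W + H}
\qquad\text{or}\qquad
MV(W,H) = \frac{H \cdot MV(W,-1) + W \cdot MV(-1,H)}{W + H}
```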

In an alternative embodiment, a flat plane can be assumed, and the MV can be derived from the MVs of the top-left, top-right, and bottom-left neighboring sub-blocks according to the following equation:

MV(W,H) = MV(W,-1) + MV(-1,H) - MV(-1,-1)

where position (0,0) denotes the top-left sub-block position of the current block, W is the width of the current block, and H is the height of the current block. MV(x,y) can therefore denote both the MV of the reconstructed sub-block containing the pixel at position (x,y) and the estimated/predicted MV at the sub-block containing position (x,y).
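The flat-plane derivation is a per-component sum and difference, and can be sketched as follows (the function and argument names are hypothetical):

```python
def bottom_right_flat_plane(mv_top_right, mv_bottom_left, mv_top_left):
    """MV(W,H) = MV(W,-1) + MV(-1,H) - MV(-1,-1), applied to each component."""
    return tuple(
        tr + bl - tl
        for tr, bl, tl in zip(mv_top_right, mv_bottom_left, mv_top_left)
    )
```

For example, with MV(W,-1) = (4, 2), MV(-1,H) = (2, 6), and MV(-1,-1) = (1, 1), the derived bottom-right MV is (5, 7).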

Yet another non-limiting example can derive the bottom-right sub-block MV from the MV of the sub-block at the co-located position containing pixel (W,H) in the co-located reference (similar to TMVP derivation).

With the MV of the bottom-right neighboring sub-block, MV(W,H), derived, the MVs of the bottom-row neighboring sub-blocks, MV_b(x,H), and the MVs of the right-column neighboring sub-blocks, MV_r(W,y), can be computed. If linear interpolation is assumed, the MVs can be defined as follows:
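The linear-interpolation equations are not reproduced in this text. The sketch below assumes the exact-division forms implied by the shift-based versions given later in this description, in which S[W] and S[H] approximate division by W and H. Function and argument names are hypothetical, and exact fractions are used so that no rounding rule has to be assumed.

```python
from fractions import Fraction


def interpolate_border_mvs(mv_top_right, mv_bottom_left, mv_bottom_right, W, H):
    """Return (bottom, right): bottom[x] = MV_b(x,H), right[y] = MV_r(W,y).

    Assumed forms:
      MV_b(x,H) = ((W-1-x)*MV(-1,H) + (x+1)*MV(W,H)) / W
      MV_r(W,y) = ((H-1-y)*MV(W,-1) + (y+1)*MV(W,H)) / H
    """
    def lerp(a, b, k, n):
        # Per-component weighted combination of corner MVs a and b.
        return tuple(
            Fraction((n - 1 - k) * ca + (k + 1) * cb, n) for ca, cb in zip(a, b)
        )

    bottom = [lerp(mv_bottom_left, mv_bottom_right, x, W) for x in range(W)]
    right = [lerp(mv_top_right, mv_bottom_right, y, H) for y in range(H)]
    return bottom, right
```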

However, in alternative embodiments, models other than linear interpolation can be used for the relevant MVs.

Once the motion vectors of the neighboring sub-blocks surrounding the current CU are defined, the horizontal and vertical MVs (MV_h(x,y) and MV_v(x,y), respectively) for each sub-block within the CU can be determined according to the following equations:

MV_h(x,y) = (W-1-x)*MV(-1,y) + (x+1)*MV_r(W,y)

MV_v(x,y) = (H-1-y)*MV(x,-1) + (y+1)*MV_b(x,H)

where MV_h(x,y) and MV_v(x,y) are scaled versions of the horizontal and vertical MV predictors: the weights in the two equations sum to W and H, respectively. These scaling factors can be compensated for in the final MV predictor calculation step.
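The two equations above can be sketched as follows. The function name and the list-based layout of the neighboring MVs are assumptions made for illustration, and coordinates are expressed in units in which W and H count positions along the top row and left column.

```python
def planar_mv_predictors(left, top, right, bottom):
    """Compute the scaled horizontal and vertical MV predictors.

    left[y]   = MV(-1, y)     top[x]    = MV(x, -1)
    right[y]  = MV_r(W, y)    bottom[x] = MV_b(x, H)
    Each MV is a (mvx, mvy) tuple. Returns (mv_h, mv_v), indexed [y][x].
    """
    W, H = len(top), len(left)
    mv_h = [[None] * W for _ in range(H)]
    mv_v = [[None] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            mv_h[y][x] = tuple(
                (W - 1 - x) * l + (x + 1) * r for l, r in zip(left[y], right[y])
            )
            mv_v[y][x] = tuple(
                (H - 1 - y) * t + (y + 1) * b for t, b in zip(top[x], bottom[x])
            )
    return mv_h, mv_v
```

When every neighboring MV equals the same vector m, the outputs are W*m and H*m, which makes the scaling by W and H explicit.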

In some embodiments, the top-right and bottom-left corner sub-block positions can be set to MV(W-1,-1) and MV(-1,H-1), respectively. In such embodiments, the interpolation for the intermediate predictors can be described, for example, by the following equations:

MV_h(x,y) = (W-1-x)*MV(-1,y) + (x+1)*MV_r(W-1,y)

MV_v(x,y) = (H-1-y)*MV(x,-1) + (y+1)*MV_b(x,H-1)

Some embodiments can use an unequal-weight combination for the final planar MV derivation. Unequal weights can be employed to exploit differences in the accuracy of the inputs to the final interpolation process. Specifically, larger weights can be applied to sub-block positions that are closer to more reliable neighboring sub-block positions. In VVC and JVET, the processing order follows a raster scan at the CTU level and a z-scan for CUs within a CTU. The top-row and left-column neighboring sub-blocks are therefore actual reconstructed sub-blocks, and are more reliable than the bottom-row and right-column neighboring sub-blocks, which are estimated. An example application of unequal weights at the final MV derivation is described in the following equation:

The example of unequal weight assignment shown in the above equation can also be generalized into a generic equation as shown below:

where A(x,y) and B(x,y) are position-dependent weighting factors for the horizontal and vertical predictors, respectively, c(x,y) is a position-dependent rounding factor, and D(x,y) is a position-dependent scaling factor.
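The unequal-weight equation itself is not reproduced in this text. The sketch below assumes the exact-division form implied by the shift-based version given later in this description, MV(x,y) = (H*MV_h(x,y)*(y+1) + W*MV_v(x,y)*(x+1)) / (H*W*(x+y+2)), which corresponds to the generic form with A = H*(y+1), B = W*(x+1), c = 0, and D = H*W*(x+y+2). The function name is hypothetical.

```python
from fractions import Fraction


def unequal_weight_mv(mv_h, mv_v, x, y, W, H):
    """Combine the scaled predictors MV_h(x,y) and MV_v(x,y) at position (x, y).

    mv_h and mv_v are (mvx, mvy) tuples already scaled by W and H respectively.
    Exact fractions are returned so that no rounding rule is assumed.
    """
    denominator = H * W * (x + y + 2)
    return tuple(
        Fraction(H * h * (y + 1) + W * v * (x + 1), denominator)
        for h, v in zip(mv_h, mv_v)
    )
```

As a sanity check on the assumed form, if every neighboring MV equals m, then MV_h = W*m and MV_v = H*m, and the combination returns m at every position.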

It should be noted that unequal weight assignment can also be used at the horizontal and vertical MV predictor calculation stage, and that the bottom-right position adjustment and unequal weight assignment components can be used together or separately, depending on codec design considerations. In some embodiments, the weighting factors and the lifting process can be modified according to picture type (I, P, B, and/or any other known, convenient, and/or desired type), temporal layer, or color component (Y, Cb, Cr, and/or any other known, convenient, and/or desired color component).

In some embodiments, a special merge mode with a unique merge candidate index can be used to signal the use of unequal-weight planar MV derivation. In embodiments where this special merge mode is selected, the merge sub-block MVs can be computed according to the following equation:

The equations presented above for the bottom-right sub-block position adjustment process and the unequal weight assignment process involve division operations, which can be costly in terms of computational complexity. However, these division operations can generally be converted into scale operations to make them more efficient and implementation friendly, as presented in the following equations:

MV(W,H) = ((W*MV(W,-1) + H*MV(-1,H)) * S[W+H]) >> ShiftDenom

MV(W,H) = ((H*MV(W,-1) + W*MV(-1,H)) * S[W+H]) >> ShiftDenom

MV_b(x,H) = (((W-1-x)*MV(-1,H) + (x+1)*MV(W,H)) * S[W]) >> ShiftDenom

MV_r(W,y) = (((H-1-y)*MV(W,-1) + (y+1)*MV(W,H)) * S[H]) >> ShiftDenom

MV(x,y) = ((H*MV_h(x,y)*(y+1) + W*MV_v(x,y)*(x+1)) * S[x+y+2]) >> (ShiftDenom + log₂W + log₂H)

where S[n] is the weight factor for parameter n, and ShiftDenom is the factor for the down-shift operation. Specifically, S[n] is an approximation of the factor 1/n, and can be described as:

FIG. 8 depicts an example 800 of S[n] in which the sum of the width and height is 256 and ShiftDenom is 10, and FIG. 9 depicts another example 900 of S[n] in which the sum of the width and height is 512 and ShiftDenom is 10.

In the examples presented in FIGS. 8 and 9, a memory size of 2570 bits (257 entries of 10 bits each, for FIG. 8) and 5130 bits (513 entries of 10 bits each, for FIG. 9) is required to hold the weight table. This memory size can be excessive and the memory burden heavy, so reducing this memory requirement can benefit efficiency and memory management. The following are two examples of possible ways to reduce the size and memory burden of S[n].

A non-limiting example of S[n] is presented below, in which the sum of the width and height is 128 and ShiftDenom is 10.

S[n] = {341,256,205,171,146,128,114,102,93,85,79,73,68,
64,60,57,54,51,49,47,45,43,41,39,38,37,35,34,33,
32,31,30,29,28,28,27,26,26,25,24,24,23,23,22,22,
21,21,20,20,20,19,19,19,18,18,18,17,17,17,17,16,
16,16,16,15,15,15,15,14,14,14,14,14,13,13,13,13,
13,13,12,12,12,12,12,12,12,12,11,11,11,11,11,11,
11,11,10,10,10,10,10,10,10,10,10,10,9,9,9,9,
9,9,9,9,9,9,9,9,9,8,8,8,8,8,8,8,8}

Another non-limiting example of S[n] is shown below, in which the sum of the width and height is 128 and ShiftDenom is 9.

S[n] = {171,128,102,85,73,64,57,51,47,43,39,37,34,32,30,28,27,26,24,23,22,21,20,20,19,18,18,17,17,16,16,15,15,14,14,13,13,13,12,12,12,12,11,11,11,11,10,10,10,10,10,9,9,9,9,9,9,9,8,8,8,8,8,8,8,8,7,7,7,7,7,7,7,7,7,7,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4}
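The leading and trailing entries of both tables listed above are consistent with rounding 2^ShiftDenom / n to the nearest integer for n = 3 through 128. The sketch below regenerates such a table under that assumption; the rounding rule is inferred from the listed values rather than stated in the text, and the function name is hypothetical.

```python
def build_weight_table(max_n, shift_denom):
    """Return [S[3], S[4], ..., S[max_n]] with S[n] ~ round(2^shift_denom / n).

    Entry S[n] is stored at list index n - 3. Integer arithmetic is used so the
    result does not depend on floating-point rounding behavior.
    """
    one = 1 << shift_denom
    return [(one + n // 2) // n for n in range(3, max_n + 1)]


table_10 = build_weight_table(128, 10)  # ShiftDenom = 10
table_9 = build_weight_table(128, 9)    # ShiftDenom = 9
```

Each table holds 126 entries, matching the stored entry count given in this description.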

In the above examples, only 126 of the 129 necessary entries need to be stored, because the first two entries (1/0 and 1/1) are not used in the presented system and method. In addition, the third entry, representing 1/2, has the values 512 and 256 in the two examples above, respectively, and can be handled separately during the weight calculation. Accordingly, the weighted average calculations shown in the equations presented above:

can be modified as shown in the following equations:

MV_b(x,H) = (((W-1-x)*MV(-1,H) + (x+1)*MV(W,H)) * S[W-3]) >> ShiftDenom

MV_r(W,y) = (((H-1-y)*MV(W,-1) + (y+1)*MV(W,H)) * S[H-3]) >> ShiftDenom
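The modified bottom-row equation with the reduced table can be sketched for a single MV component as follows. The table construction assumes S[n] is 2^ShiftDenom / n rounded to the nearest integer, and the function name is hypothetical. Widths below 3 fall outside the reduced table and are rejected here rather than handled separately.

```python
SHIFT_DENOM = 10
# Reduced table: S_REDUCED[n - 3] approximates (1 << SHIFT_DENOM) / n for n = 3..128.
S_REDUCED = [((1 << SHIFT_DENOM) + n // 2) // n for n in range(3, 129)]


def mv_bottom_row(x, W, mv_bottom_left, mv_bottom_right):
    """MV_b(x,H) = (((W-1-x)*MV(-1,H) + (x+1)*MV(W,H)) * S[W-3]) >> ShiftDenom."""
    if not 3 <= W <= 128:
        raise ValueError("W is outside the range covered by the reduced table")
    numerator = (W - 1 - x) * mv_bottom_left + (x + 1) * mv_bottom_right
    return (numerator * S_REDUCED[W - 3]) >> SHIFT_DENOM
```

When W is a power of two, the table entry is exact and the result matches integer division by W. For other widths the rounded entry and the truncating shift introduce a small error.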

However, a simple shift conversion as identified above does not provide an accurate output, which results in poor coding efficiency. The inefficiency may be due to the conversion process, which can allow error to accumulate linearly with distance. In some embodiments, this error can be reduced by exploiting the fact that the weights of the horizontal and vertical predictors are complementary in the following equation:

Accordingly, the weighting can be computed based on the weight of either the horizontal or the vertical predictor, whichever is more accurate. This can be achieved by introducing the parameters horWeight and verWeight into the following equations,

which yields the following equation:

MV(x,y) = (H*MV_h(x,y)*horWeight + W*MV_v(x,y)*verWeight) >> (ShiftDenom + log₂W + log₂H)

Alternatively, the following equation can be employed when using the reduced table identified above.

horWeight = verWeight = (1 << (ShiftDenom-1)), where x = y = 0

In some embodiments, the size of the table used to store the weights S[n] can be reduced further, because unequal-weight planar MV derivation operates at the sub-block level rather than the pixel level, and the sub-block level can be configured at a coarser granularity (e.g., 4×4 in VVC and JVET).

Thus, when the sub-block size is N×N, MV(x,y) can be mapped to the sub-block coordinates MV(x/N,y/N). With the dimensions reduced from W×H to (W/N)×(H/N), the maximum table size is correspondingly lower, so a smaller table can be used. Accordingly, the above equations can be reformulated and presented as:
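The coordinate mapping can be sketched as follows, assuming integer division and a default sub-block size of N = 4. The function names are hypothetical.

```python
def to_subblock_coords(x, y, n=4):
    """Map pixel position (x, y) to the coordinates of its N x N sub-block."""
    return x // n, y // n


def subblock_grid(width, height, n=4):
    """Dimensions of the CU measured in N x N sub-blocks: (W/N, H/N)."""
    return width // n, height // n
```

For a 64×32 CU with 4×4 sub-blocks, the grid is 16×8, so the largest table index needed for the sum of the dimensions drops from 96 to 24.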

FIG. 10 depicts a flow diagram 1000 of a system and method of unequal-weight planar motion vector derivation. In step 1002 a CU is received, and in step 1004 motion information associated with the CU is determined. In step 1006, motion information associated with the bottom-right neighboring pixel or block is derived based on the motion information associated with the CU. In steps 1008 and 1010, motion information can then be derived and/or defined according to the systems and methods described herein, based at least in part on the motion information associated with the CU and the derived motion information associated with the bottom-right neighboring pixel or block. Although FIG. 10 depicts step 1008 preceding step 1010, in some embodiments steps 1008 and 1010 can occur in parallel and/or step 1010 can precede step 1008. In step 1012, it is determined whether an unequal weighting technique was used to derive the relevant motion vectors. If it is determined in step 1012 that weighting was not employed, the system can proceed with encoding in step 1014, as described herein. If, however, it is determined in step 1012 that the motion information was derived using a weighted combination technique, an indicator can be set in step 1016 and the system can proceed with encoding in step 1014.

Execution of the sequences of instructions required to practice the embodiments can be performed by a computer system 1100 as shown in FIG. 11. In an embodiment, execution of the sequences of instructions is performed by a single computer system 1100. According to other embodiments, two or more computer systems 1100 coupled by a communication link 1115 can perform the sequence of instructions in coordination with one another. Although a description of only one computer system 1100 is presented below, it should be understood that any number of computer systems 1100 can be employed to practice the embodiments.

A computer system 1100 according to an embodiment will now be described with reference to FIG. 11, which is a block diagram of the functional components of a computer system 1100. As used herein, the term computer system 1100 is used broadly to describe any computing device that can store and independently run one or more programs.

Each computer system 1100 can include a communication interface 1114 coupled to the bus 1106. The communication interface 1114 provides two-way communication between computer systems 1100. The communication interface 1114 of a respective computer system 1100 transmits and receives electrical, electromagnetic, or optical signals that include data streams representing various types of signal information (e.g., instructions, messages, and data). A communication link 1115 links one computer system 1100 with another computer system 1100. For example, the communication link 1115 can be a LAN, in which case the communication interface 1114 can be a LAN card; or the communication link 1115 can be a PSTN, in which case the communication interface 1114 can be an integrated services digital network (ISDN) card or a modem; or the communication link 1115 can be the Internet, in which case the communication interface 1114 can be a dial-up, cable, or wireless modem.

A computer system 1100 can transmit and receive messages, data, and instructions, including programs (i.e., application code), through its respective communication link 1115 and communication interface 1114. Received program code can be executed by the respective processor(s) 1107 as it is received, and/or stored in the storage device 1110 or other associated non-volatile media for later execution.

In an embodiment, the computer system 1100 operates in conjunction with a data storage system 1131, e.g., a data storage system 1131 that contains a database 1132 that is readily accessible by the computer system 1100. The computer system 1100 communicates with the data storage system 1131 through a data interface 1133. The data interface 1133, which is coupled to the bus 1106, transmits and receives electrical, electromagnetic, or optical signals that include data streams representing various types of signal information (e.g., instructions, messages, and data). In embodiments, the functions of the data interface 1133 can be performed by the communication interface 1114.

The computer system 1100 includes a bus 1106 or other communication mechanism for communicating instructions, messages, and data (collectively, information), and one or more processors 1107 coupled with the bus 1106 for processing information. The computer system 1100 also includes a main memory 1108, such as a random access memory (RAM) or other dynamic storage device, coupled to the bus 1106 for storing dynamic data and instructions to be executed by the processor(s) 1107. The main memory 1108 can also be used to store temporary data, i.e., variables or other intermediate information, during execution of instructions by the processor(s) 1107.

The computer system 1100 can further include a read only memory (ROM) 1109 or other static storage device coupled to the bus 1106 for storing static data and instructions for the processor(s) 1107. A storage device 1110, such as a magnetic disk or optical disk, can also be provided and coupled to the bus 1106 for storing data and instructions for the processor(s) 1107.

The computer system 1100 can be coupled via the bus 1106 to a display device 1111, such as, but not limited to, a cathode ray tube (CRT) or a liquid crystal display (LCD), for displaying information to a user. An input device 1112, e.g., alphanumeric and other keys, can be coupled to the bus 1106 for communicating information and command selections to the processor(s) 1107.

According to one embodiment, an individual computer system 1100 performs specific operations by having its respective processor(s) 1107 execute one or more sequences of one or more instructions contained in the main memory 1108. Such instructions can be read into the main memory 1108 from another computer-usable medium, such as the ROM 1109 or the storage device 1110. Execution of the sequences of instructions contained in the main memory 1108 causes the processor(s) 1107 to perform the processes described herein. In alternative embodiments, hard-wired circuitry can be used in place of or in combination with software instructions. Thus, embodiments are not limited to any specific combination of hardware circuitry and/or software.

As used herein, the term "computer-usable medium" refers to any medium that provides information or is usable by the processor(s) 1107. Such a medium can take many forms, including, but not limited to, non-volatile, volatile, and transmission media. Non-volatile media, i.e., media that can retain information in the absence of power, include the ROM 1109, CD ROM, magnetic tape, and magnetic discs. Volatile media, i.e., media that cannot retain information in the absence of power, include the main memory 1108. Transmission media include coaxial cables, copper wire, and fiber optics, including the wires that make up the bus 1106. Transmission media can also take the form of carrier waves, i.e., electromagnetic waves that can be modulated in frequency, amplitude, or phase to transmit information signals. Additionally, transmission media can take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.

In the foregoing specification, the embodiments have been described with reference to specific elements thereof. It will be evident, however, that various modifications and changes can be made thereto without departing from the broader spirit and scope of the embodiments. For example, the reader is to understand that the specific ordering and combination of process actions shown in the process flow diagrams described herein is merely illustrative, and that different or additional process actions, or a different combination or ordering of process actions, can be used to practice the embodiments. The specification and drawings are, accordingly, to be regarded in an illustrative rather than restrictive sense.

It should also be noted that the present invention can be implemented in a variety of computer systems. The various techniques described herein can be implemented in hardware or software, or a combination of both. Preferably, the techniques are implemented in computer programs executing on programmable computers that each include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Program code is applied to data entered using the input device to perform the functions described above and to generate output information. The output information is applied to one or more output devices. Each program is preferably implemented in a high-level procedural or object-oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language can be a compiled or interpreted language. Each such computer program is preferably stored on a storage medium or device (e.g., ROM or magnetic disk) that is readable by a general or special purpose programmable computer, for configuring and operating the computer when the storage medium or device is read by the computer to perform the procedures described above. The system can also be considered to be implemented as a computer-readable storage medium configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner. Further, the storage elements of the exemplary computing applications can be relational or sequential (flat file) type computing databases that are capable of storing data in various combinations and configurations.

FIG. 12 is a high-level view of a source device 1212 and a destination device 1214 that can incorporate features of the systems and devices described herein. As shown in FIG. 12, an example video coding system 1210 includes a source device 1212 and a destination device 1214 where, in this example, the source device 1212 generates encoded video data. Accordingly, the source device 1212 can be referred to as a video encoding device. The destination device 1214 can decode the encoded video data generated by the source device 1212. Accordingly, the destination device 1214 can be referred to as a video decoding device. The source device 1212 and the destination device 1214 can be examples of video coding devices.

The destination device 1214 can receive encoded video data from the source device 1212 via a channel 1216. The channel 1216 can comprise a type of medium or device capable of moving the encoded video data from the source device 1212 to the destination device 1214. In one example, the channel 1216 can comprise a communication medium that enables the source device 1212 to transmit encoded video data directly to the destination device 1214 in real time.

In this example, the source device 1212 can modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and can transmit the modulated video data to the destination device 1214. The communication medium can comprise a wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium can form part of a packet-based network, such as a local area network, a wide area network, or a global network such as the Internet. The communication medium can include routers, switches, base stations, or other equipment that facilitates communication from the source device 1212 to the destination device 1214. In another example, the channel 1216 can correspond to a storage medium that stores the encoded video data generated by the source device 1212.

In the example of FIG. 12, the source device 1212 includes a video source 1218, a video encoder 1220, and an output interface 1228. In some cases, the output interface 1228 can include a modulator/demodulator (modem) and/or a transmitter. In the source device 1212, the video source 1218 can include a source such as a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video feed interface for receiving video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources.

The video encoder 1220 can encode the captured, pre-captured, or computer-generated video data. An input image can be received by the video encoder 1220 and stored in the input frame memory 1221. The general purpose processor 1223 can load information from there and perform the encoding. The program for driving the general purpose processor can be loaded from a storage device, such as the example memory modules depicted in FIG. 12. The general purpose processor can use processing memory 1222 to perform the encoding, and the encoded output of the general purpose processor can be stored in a buffer, such as the output buffer 1226.

The video encoder 1220 can include a resampling module 1225, which can be configured to code (e.g., encode) video data in a scalable video coding scheme that defines at least one base layer and at least one enhancement layer. The resampling module 1225 can resample at least some video data as part of the encoding process, wherein the resampling can be performed in an adaptive manner using resampling filters.
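A minimal sketch of how a base layer and an enhancement layer might relate through resampling is given below. It operates on a single row of samples, uses a fixed nearest-neighbor filter rather than an adaptive one, and assumes the enhancement layer carries the residual against the upsampled base layer. These choices and all function names are illustrative assumptions, not details taken from this disclosure.

```python
def downsample(samples, factor=2):
    """Base layer: keep every `factor`-th sample."""
    return samples[::factor]


def upsample(samples, factor, length):
    """Nearest-neighbor resampling back to the original length."""
    return [samples[min(i // factor, len(samples) - 1)] for i in range(length)]


def encode_layers(samples, factor=2):
    """Split a row into a base layer and an enhancement (residual) layer."""
    base = downsample(samples, factor)
    predicted = upsample(base, factor, len(samples))
    enhancement = [s - p for s, p in zip(samples, predicted)]
    return base, enhancement


def decode_layers(base, enhancement, factor=2):
    """Rebuild the full-resolution row from both layers."""
    predicted = upsample(base, factor, len(enhancement))
    return [p + e for p, e in zip(predicted, enhancement)]
```

A decoder that receives only `base` can still display a lower-resolution row; one that also receives `enhancement` recovers the original samples.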

The encoded video data, e.g., a coded bit stream, can be transmitted directly to the destination device 1214 via the output interface 1228 of the source device 1212. In the example of FIG. 12, the destination device 1214 includes an input interface 1238, a video decoder 1230, and a display device 1232. In some cases, the input interface 1238 can include a receiver and/or a modem. The input interface 1238 of the destination device 1214 receives encoded video data over the channel 1216. The encoded video data can include a variety of syntax elements generated by the video encoder 1220 that represent the video data. Such syntax elements can be included with the encoded video data transmitted on a communication medium, stored on a storage medium, or stored on a file server.

The encoded video data can also be stored on a storage medium or a file server for later access by the destination device 1214 for decoding and/or playback. For example, the coded bitstream can be temporarily stored in the input buffer 1231 and then loaded into the general purpose processor 1233. The program for driving the general purpose processor can be loaded from a storage device or memory. The general purpose processor can use processing memory 1232 to perform the decoding. The video decoder 1230 can also include a resampling module 1235 similar to the resampling module 1225 employed in the video encoder 1220.

FIG. 12 depicts the resampling module 1235 separately from the general purpose processor 1233, but those skilled in the art will appreciate that the resampling function can be performed by a program executed by the general purpose processor, and that the processing in the video encoder can be accomplished using one or more processors. The decoded image(s) can be stored in the output frame buffer 1236 and then sent to the input interface 1238.
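The buffer-to-buffer data path described for FIG. 12 can be sketched as follows. The class and method names are hypothetical, the channel is modeled as an in-memory queue, and `zlib` compression is only a stand-in for a real video codec.

```python
import zlib
from collections import deque


class SourceDevice:
    def __init__(self):
        self.input_frame_memory = deque()  # plays the role of memory 1221
        self.output_buffer = deque()       # plays the role of buffer 1226

    def capture(self, frame: bytes):
        self.input_frame_memory.append(frame)

    def encode_next(self):
        frame = self.input_frame_memory.popleft()
        self.output_buffer.append(zlib.compress(frame))  # stand-in encoder

    def transmit(self, channel):
        channel.append(self.output_buffer.popleft())


class DestinationDevice:
    def __init__(self):
        self.input_buffer = deque()         # plays the role of buffer 1231
        self.output_frame_buffer = deque()  # plays the role of buffer 1236

    def receive(self, channel):
        self.input_buffer.append(channel.popleft())

    def decode_next(self):
        data = self.input_buffer.popleft()
        self.output_frame_buffer.append(zlib.decompress(data))  # stand-in decoder


# Usage: one frame travels from source to destination over the channel.
channel = deque()  # stands in for channel 1216
src, dst = SourceDevice(), DestinationDevice()
src.capture(b"frame-0" * 10)
src.encode_next()
src.transmit(channel)
dst.receive(channel)
dst.decode_next()
```

Replacing the queue with a file or a socket would correspond to the storage-medium and communication-medium variants of the channel.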

The display device 1238 can be integrated with or can be external to the destination device 1214. In some examples, the destination device 1214 can include an integrated display device and can also be configured to interface with an external display device. In other examples, the destination device 1214 can be a display device. In general, the display device 1238 displays the decoded video data to a user.

The video encoder 1220 and the video decoder 1230 can operate according to a video compression standard. ITU-T VCEG (Q6/16) and ISO/IEC MPEG (JTC 1/SC 29/WG 11) are studying the potential need for standardization of future video coding technology with a compression capability that significantly exceeds that of the current High Efficiency Video Coding (HEVC) standard (including its current extensions and near-term extensions for screen content coding and high-dynamic-range coding). The groups are working together on this exploration activity in a joint collaboration effort known as the Joint Video Exploration Team (JVET) to evaluate compression technology designs proposed by their experts in this area. A recent capture of JVET development is described in "Algorithm Description of Joint Exploration Test Model 5 (JEM 5)", JVET-E1001-V2, authored by J. Chen, E. Alshina, G. Sullivan, J. Ohm, and J. Boyce.

Additionally or alternatively, the video encoder 1220 and the video decoder 1230 can operate according to other proprietary or industry standards that function with the disclosed JVET features. Such other standards include, for example, the ITU-T H.264 standard, alternatively referred to as MPEG-4, Part 10, Advanced Video Coding (AVC), or extensions of such standards. Thus, while newly developed for JVET, the techniques of this disclosure are not limited to any particular coding standard or technique. Other examples of video compression standards and techniques include MPEG-2, ITU-T H.263, and proprietary or open source compression formats and related formats.

The video encoder 1220 and the video decoder 1230 can be implemented in hardware, software, firmware, or any combination thereof. For example, the video encoder 1220 and the decoder 1230 can employ one or more processors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, or any combinations thereof. When the video encoder 1220 and the decoder 1230 are implemented partially in software, a device can store instructions for the software in a suitable non-transitory computer-readable storage medium and can execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Each of the video encoder 1220 and the video decoder 1230 can be included in one or more encoders or decoders, either of which can be integrated as part of a combined encoder/decoder (CODEC) in a respective device.

Aspects of the subject matter described herein can be described in the general context of computer-executable instructions, such as program modules, being executed by a computer, such as the general purpose processors 1223 and 1233 described above. Generally, program modules include routines, programs, objects, components, data structures, and so forth, which perform particular tasks or implement particular abstract data types. Aspects of the subject matter described herein can also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including memory storage devices.

Examples of memory include random access memory (RAM), read only memory (ROM), or both. Memory can store instructions, such as source code or binary code, for performing the techniques described above. Memory can also be used for storing variables or other intermediate information during execution of instructions to be executed by a processor, such as the processors 1223 and 1233.

A storage device can also store instructions, such as source code or binary code, for performing the techniques described above. A storage device can additionally store data used and manipulated by the computer processor. For example, a storage device in a video encoder 1220 or a video decoder 1230 can be a database that is accessed by the computer system 1223 or 1233. Other examples of storage devices include random access memory (RAM), read only memory (ROM), a hard drive, a magnetic disk, an optical disk, a CD-ROM, a DVD, a flash memory, a USB memory card, or any other medium from which a computer can read.

A memory or storage device can be an example of a non-transitory computer-readable storage medium for use by or in connection with the video encoder and/or decoder. The non-transitory computer-readable storage medium contains instructions for controlling a computer system to be configured to perform the functions described by particular embodiments. The instructions, when executed by one or more computer processors, can be configured to perform that which is described in particular embodiments.

Also, it is noted that some embodiments have been described as a process which can be depicted as a flow diagram or block diagram. Although each may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations can be rearranged. A process can have additional steps not included in the figures.

Particular embodiments can be implemented in a non-transitory computer-readable storage medium for use by or in connection with an instruction execution system, apparatus, system, or machine. The computer-readable storage medium contains instructions for controlling a computer system to perform a method described by particular embodiments. The computer system can include one or more computing devices. The instructions, when executed by one or more computer processors, can be configured to perform that which is described in particular embodiments.

As used in the description herein and throughout the claims that follow, "a", "an", and "the" include plural references unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of "in" includes "in" and "on" unless the context clearly dictates otherwise.

Although exemplary embodiments of the invention have been described in detail, and in language specific to structural features and/or methodological acts, it is to be understood that those skilled in the art will readily appreciate that many additional modifications are possible in the exemplary embodiments without materially departing from the novel teachings and advantages of the invention. Moreover, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Accordingly, these and all such modifications are intended to be included within the scope of this invention, construed in breadth and scope in accordance with the appended claims.

Claims

1. A method of decoding video included in a bitstream by a decoder, comprising:

(a) Receiving the bitstream, wherein the bitstream comprises at least four slices indicating how to divide a coding tree unit into coding units according to a quadtree plus multi-tree structure, the quadtree plus multi-tree structure allowing dividing a square parent node with quadtree division, the quadtree division dividing the square parent node in half in both horizontal and vertical directions to define leaf nodes shaped as squares, each leaf node having the same size;

(b) Wherein the quadtree plus multi-tree structure allows a first one of the leaf nodes to be partitioned based on a symmetric binary tree partition that partitions the first one of the leaf nodes of the quadtree partition in a horizontal direction or a vertical direction into half, generating two rectangular blocks of the same size as leaf nodes, wherein the two rectangular blocks generated by the symmetric binary tree partition are rectangular coding units, wherein the two rectangular blocks generated by the symmetric binary tree partition are not prediction units, wherein the two rectangular blocks generated by the symmetric binary tree partition are not transform units;

(c) Wherein the quadtree plus multi-tree structure allows a second one of the leaf nodes to be partitioned based on an asymmetric tree partition that partitions the second one of the leaf nodes of the quadtree partition in a horizontal direction or a vertical direction, generating three rectangular blocks as leaf nodes, wherein a size of two of the three rectangular blocks is different from a size of a third one of the three rectangular blocks, wherein the three rectangular blocks generated by the asymmetric tree partition are rectangular coding units, wherein the three rectangular blocks generated by the asymmetric tree partition are not prediction units, wherein the three rectangular blocks generated by the asymmetric tree partition are not transform units;

(d) Identifying final coding units to be decoded represented by leaf nodes of the quadtree plus multi-tree structure, wherein a plurality of the final coding units are rectangles, wherein each of the rectangular final coding units is a decision point whether to perform inter-picture or intra-picture prediction;

(e) The following two are received: (1) A first motion vector associated with a rectangular coding unit of a B slice of a current frame of the video, wherein one of the rectangular final coding units is the rectangular coding unit included in a bi-directionally predicted slice of the current frame of the video that references a temporally previous reference slice relative to a temporally previous reference frame of the current frame of the rectangular coding unit, and (2) a second motion vector associated with the rectangular coding unit of the B slice of the current frame of the video that references a temporally future reference slice relative to a temporally future frame of the current frame of the rectangular coding unit;

(f) Wherein the rectangular coding unit of the B slices of the current frame of the video has a top adjacent row, a left adjacent column, a bottom adjacent row, and a right adjacent column, each of the top adjacent row, the left adjacent column, the bottom adjacent row, and the right adjacent column being in the current frame;

(g) Applying optical flow between the temporally future reference slice and the temporally previous reference slice to perform sample-based motion modification using a corrected motion vector based on at least one of the second motion vector and the first motion vector;

(h) Decoding the rectangular coding unit based on the sample-based motion modification as a result of applying the optical flow between the temporally previous reference slice and the temporally future reference slice.

2. A method of encoding video included in a bitstream by an encoder, comprising:

(a) Providing the bitstream, wherein the bitstream comprises at least four slices indicating how to divide a coding tree unit into coding units according to a quadtree plus multi-tree structure, the quadtree plus multi-tree structure allowing dividing a square parent node with quadtree division, the quadtree division dividing the square parent node in half in both horizontal and vertical directions to define leaf nodes shaped as squares, each leaf node having the same size;

(d) Wherein the bitstream is configured for identifying final coding units to be decoded represented by leaf nodes of the quadtree plus multi-tree structure, wherein a plurality of the final coding units are rectangles, wherein each of the rectangular final coding units is a decision point whether to perform inter-picture or intra-picture prediction;

(e) The following two are provided: (1) A first motion vector associated with a rectangular coding unit of a B slice of a current frame of the video, wherein one of the rectangular final coding units is the rectangular coding unit included in a bi-directionally predicted slice of the current frame of the video that references a temporally previous reference slice relative to a temporally previous reference frame of the current frame of the rectangular coding unit, and (2) a second motion vector associated with the rectangular coding unit of the B slice of the current frame of the video that references a temporally future reference slice relative to a temporally future frame of the current frame of the rectangular coding unit;

(g) Wherein the bitstream is configured for applying optical flow between the temporally future reference slice and the temporally previous reference slice to perform sample-based motion modification using a corrected motion vector based on at least one of the second motion vector and the first motion vector;

(h) Encoding the rectangular coding unit based on the sample-based motion modification as a result of applying the optical flow between the temporally previous reference slice and the temporally future reference slice.

3. A non-transitory computer readable storage medium containing instructions for controlling a computer system, which when executed, perform the method of claim 1.

4. A non-transitory computer readable storage medium containing instructions for controlling a computer system, which when executed, perform the method of claim 2.
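For illustration only, the block geometry of the three partition types recited in claims 1 and 2 can be sketched in Python. Blocks are written as hypothetical `(x, y, width, height)` tuples. The 1:2:1 ratio used for the three-way split is an assumption; the claims require only that two of the three blocks differ in size from the third.

```python
def quad_split(x, y, w, h):
    """Halve a square parent in both directions: four equal square leaves."""
    hw, hh = w // 2, h // 2
    return [(x, y, hw, hh), (x + hw, y, hw, hh),
            (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]


def binary_split(x, y, w, h, split_width):
    """Symmetric split into two equal rectangles."""
    if split_width:  # divide the width; the blocks sit side by side
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]


def ternary_split(x, y, w, h, split_width):
    """Three-way split; the 1:2:1 proportions are an illustrative assumption."""
    if split_width:
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    q = h // 4
    return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
```

Applying `quad_split` to a 64×64 node and then `binary_split` or `ternary_split` to one of its 32×32 leaves yields the rectangular blocks that the claims treat as final coding units.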