CN110557639B - Application of interleaved prediction

Info

Publication number: CN110557639B
Application number: CN201910468418.8A
Authority: CN
Inventors: 张凯; 张莉; 刘鸿彬; 王悦
Original assignee: Beijing ByteDance Network Technology Co Ltd; ByteDance Inc
Current assignee: Beijing ByteDance Network Technology Co Ltd; ByteDance Inc
Priority date: 2018-05-31
Filing date: 2019-05-31
Publication date: 2022-09-02
Anticipated expiration: 2039-05-31
Also published as: TWI868071B; CN110557639A; WO2019229682A1; TW202005388A

Abstract

A method of processing a video block, comprising: determining, because the block satisfies a condition, to apply interleaved prediction to the block; determining a prediction block based on a first intermediate prediction block and a second intermediate prediction block; and generating an encoded or decoded representation of the block using the prediction block. The first intermediate prediction block is generated from a first set of sub-blocks obtained by partitioning the block according to a first partitioning pattern, and the second intermediate prediction block is generated from a second set of sub-blocks obtained by partitioning the block according to a second partitioning pattern. At least one sub-block in the second set has a different size than a sub-block in the first set.

Description

Application of Interleaved Prediction

Cross-Reference to Related Applications

Under the applicable patent law and/or rules pursuant to the Paris Convention, this application timely claims the priority to and benefit of (1) the prior Chinese patent application filed on May 31, 2018 under International Patent Application No. PCT/CN2018/089242 and (2) the prior Chinese patent application filed on January 2, 2019 under International Patent Application No. PCT/CN2019/070058, both of which were subsequently abandoned after filing. The entire disclosures of International Patent Application Nos. PCT/CN2018/089242 and PCT/CN2019/070058 are incorporated herein by reference as part of the disclosure of this application.

Technical Field

This patent document relates to video coding techniques, devices, and systems.

Background

Motion compensation (MC) is a video processing technique that, given previous and/or future frames, predicts a frame in a video by accounting for the motion of the camera and/or of objects in the video. Motion compensation can be used in the encoding of video data to achieve video compression.

Summary of the Invention

This document discloses methods, systems, and devices related to sub-block-based motion prediction in video motion compensation.

In one representative aspect, a method of processing a video block is disclosed. The method includes: determining, because the block satisfies a condition, to apply interleaved prediction to the block; determining a prediction block based on a first intermediate prediction block and a second intermediate prediction block; and generating an encoded or decoded representation of the block using the prediction block. The first intermediate prediction block is generated from a first set of sub-blocks obtained by partitioning the block according to a first partitioning pattern, and the second intermediate prediction block is generated from a second set of sub-blocks obtained by partitioning the block according to a second partitioning pattern. At least one sub-block in the second set has a different size than a sub-block in the first set.

In another representative aspect, an apparatus includes a processor configured to implement the methods described herein.

In yet another representative aspect, the various techniques described herein can be implemented as a computer program product stored on a non-transitory computer-readable medium, the computer program product including program code for carrying out the methods described herein.

In yet another representative aspect, a video decoding apparatus can implement the methods described herein.

The details of one or more embodiments are set forth in the accompanying drawings, the description, and the claims.

Description of the Drawings

FIG. 1 is a schematic diagram showing an example of sub-block-based prediction.

FIG. 2 shows an example of an affine motion field of a block described by two control point motion vectors.

FIG. 3 shows an example of an affine motion vector field for each sub-block of a block.

FIG. 4 shows an example of motion vector prediction for block 400 in AF_INTER mode.

FIG. 5A shows an example of the selection order of candidate blocks for a current coding unit (CU).

FIG. 5B shows another example of candidate blocks for a current CU in AF_MERGE mode.

FIG. 6 shows an example of the alternative temporal motion vector prediction (ATMVP) motion prediction process for a CU.

FIG. 7 shows an example of one CU with four sub-blocks and its neighboring blocks.

FIG. 8 is a flowchart of an example method of video processing.

FIG. 9 shows an example of a functional block diagram of a video encoder or video decoder.

FIG. 10 shows an example of bilateral matching used in the frame rate up-conversion (FRUC) method.

FIG. 11 shows an example of template matching used in the FRUC method.

FIG. 12 shows an example of unilateral motion estimation (ME) in the FRUC method.

FIG. 13 shows an example of interleaved prediction with two partitioning patterns in accordance with the disclosed technology.

FIG. 14A shows an example partitioning pattern in which a block is partitioned into 4×4 sub-blocks in accordance with the disclosed technology.

FIG. 14B shows an example partitioning pattern in which a block is partitioned into 8×8 sub-blocks in accordance with the disclosed technology.

FIG. 14C shows an example partitioning pattern in which a block is partitioned into 4×8 sub-blocks in accordance with the disclosed technology.

FIG. 14D shows an example partitioning pattern in which a block is partitioned into 8×4 sub-blocks in accordance with the disclosed technology.

FIG. 14E shows an example partitioning pattern in which a block is partitioned into non-uniform sub-blocks in accordance with the disclosed technology.

FIG. 14F shows another example partitioning pattern in which a block is partitioned into non-uniform sub-blocks in accordance with the disclosed technology.

FIG. 14G shows yet another example partitioning pattern in which a block is partitioned into non-uniform sub-blocks in accordance with the disclosed technology.

FIG. 15A is an example flowchart of a method for improving the bandwidth usage and prediction accuracy of a block-based motion prediction video system in accordance with the disclosed technology.

FIG. 15B is another example flowchart of a method for improving the bandwidth usage and prediction accuracy of a block-based motion prediction video system in accordance with the disclosed technology.

FIG. 16 is a schematic diagram illustrating an example of the architecture of a computer system or other control device that can be used to implement various portions of the disclosed technology.

FIG. 17 is a block diagram of an example embodiment of a mobile device that can be used to implement various portions of the disclosed technology.

Detailed Description

Global motion compensation is one of the variations of motion compensation techniques in video compression and can be used to predict camera motion. However, objects moving within the frames of a video file are not sufficiently represented by the various implementations of global motion compensation. Local motion estimation, such as block motion compensation, in which a frame is partitioned into blocks of pixels for performing motion prediction, can be used to account for objects moving within a frame.

Sub-block-based prediction, developed from block motion compensation, was first introduced into a video coding standard by Annex I (3D-HEVC) of High Efficiency Video Coding (HEVC).

FIG. 1 is a schematic diagram showing an example of sub-block-based prediction. With sub-block-based prediction, a block 100, such as a coding unit (CU) or a prediction unit (PU), is partitioned into several non-overlapping sub-blocks 101. Different sub-blocks can be assigned different motion information, such as a reference index or a motion vector (MV). Motion compensation is then performed separately for each sub-block.
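The partitioning step described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `split_into_subblocks` and the `(x, y, w, h)` tuple layout are assumptions made for the example and do not appear in the disclosure.

```python
def split_into_subblocks(width, height, sub_w, sub_h):
    """Tile a width x height block with non-overlapping sub-blocks.

    Returns a list of (x, y, w, h) tuples. Names and layout are
    hypothetical, chosen only for this sketch.
    """
    subblocks = []
    for y in range(0, height, sub_h):
        for x in range(0, width, sub_w):
            # Clip the last row/column so no sub-block leaves the block.
            w = min(sub_w, width - x)
            h = min(sub_h, height - y)
            subblocks.append((x, y, w, h))
    return subblocks


# A 16x8 block split into 4x4 sub-blocks; each sub-block could then be
# given its own motion vector and reference index.
blocks = split_into_subblocks(16, 8, 4, 4)
```

Each returned tuple would then be motion-compensated on its own, which is what allows different sub-blocks to carry different motion information.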

To explore future video coding technologies beyond HEVC, the Video Coding Experts Group (VCEG) and the Moving Picture Experts Group (MPEG) jointly founded the Joint Video Exploration Team (JVET) in 2015. JVET has adopted many methods and added them to the reference software known as the Joint Exploration Model (JEM). In the JEM, sub-block-based prediction is used in several coding tools, such as affine prediction, alternative temporal motion vector prediction (ATMVP), spatial-temporal motion vector prediction (STMVP), bi-directional optical flow (BIO), and frame rate up-conversion (FRUC), each of which is discussed in detail below.

Affine Prediction

In HEVC, only a translational motion model is applied for motion compensated prediction (MCP). However, the camera and objects can have many kinds of motion, such as zooming in or out, rotation, perspective motion, and/or other irregular motion. The JEM, on the other hand, applies a simplified affine transform motion compensated prediction.

FIG. 2 shows an example of an affine motion field of block 200 described by two control point motion vectors $V_0$ and $V_1$. The motion vector field (MVF) of block 200 can be described by the following equation:
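The equation itself is not reproduced in this text. As a hedged reconstruction, the four-parameter affine model used in the JEM is commonly written as follows, where $w$ is assumed to denote the width of block 200 and $(x, y)$ a sample position relative to the top-left corner of the block:

$$
\begin{cases}
v_x = \dfrac{v_{1x} - v_{0x}}{w}\,x - \dfrac{v_{1y} - v_{0y}}{w}\,y + v_{0x} \\[2ex]
v_y = \dfrac{v_{1y} - v_{0y}}{w}\,x + \dfrac{v_{1x} - v_{0x}}{w}\,y + v_{0y}
\end{cases}
\tag{1}
$$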

As shown in FIG. 2, $(v_{0x}, v_{0y})$ is the motion vector of the top-left control point, and $(v_{1x}, v_{1y})$ is the motion vector of the top-right control point. To simplify the motion compensated prediction, sub-block-based affine transform prediction can be applied. The sub-block size M×N is derived as follows:
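The derivation is not reproduced in this text. A reconstruction in the form commonly given for the JEM, offered here as an assumption, is shown below. In it, $\mathrm{clip3}(a, b, c)$ is assumed to clamp $c$ to the range $[a, b]$, and $w$ and $h$ are the width and height of the block:

$$
\begin{cases}
M = \mathrm{clip3}\!\left(4,\ w,\ \dfrac{w \times MvPre}{\max\left(|v_{1x} - v_{0x}|,\ |v_{1y} - v_{0y}|\right)}\right) \\[2ex]
N = \mathrm{clip3}\!\left(4,\ h,\ \dfrac{h \times MvPre}{\max\left(|v_{2x} - v_{0x}|,\ |v_{2y} - v_{0y}|\right)}\right)
\end{cases}
\tag{2}
$$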

Here, MvPre is the motion vector fractional accuracy (e.g., 1/16 in the JEM). $(v_{2x}, v_{2y})$ is the motion vector of the bottom-left control point, calculated according to equation (1). If necessary, M and N can be adjusted downward so that they are divisors of w and h, respectively.

FIG. 3 shows an example of the affine MVF for each sub-block of block 300. To derive the motion vector of each M×N sub-block, the motion vector of the center sample of each sub-block can be calculated according to equation (1) and rounded to the motion vector fractional accuracy (e.g., 1/16 in the JEM). Motion compensation interpolation filters can then be applied to generate the prediction of each sub-block with the derived motion vector. After the MCP, the high-accuracy motion vector of each sub-block is rounded and saved with the same accuracy as a normal motion vector.
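The per-sub-block derivation can be illustrated with a short Python sketch. It assumes the commonly used four-parameter form of the affine model, in which the motion at a sample is a linear function of its position and the two control point vectors. The function name `affine_subblock_mvs`, the floating-point arithmetic, and the dictionary layout are choices made for this example; a real codec would use fixed-point arithmetic.

```python
def affine_subblock_mvs(v0, v1, width, height, m=4, n=4, precision=16):
    """Derive one motion vector per M x N sub-block of a block.

    v0, v1: top-left and top-right control point MVs as (x, y) pairs,
            in luma samples (hypothetical representation).
    Returns {(x0, y0): (mv_x, mv_y)} keyed by sub-block origin, with each
    component rounded to 1/precision of a sample.
    """
    a = (v1[0] - v0[0]) / width   # horizontal gradient of mv_x
    b = (v1[1] - v0[1]) / width   # horizontal gradient of mv_y
    field = {}
    for y0 in range(0, height, n):
        for x0 in range(0, width, m):
            cx, cy = x0 + m / 2, y0 + n / 2   # center sample of the sub-block
            vx = a * cx - b * cy + v0[0]
            vy = b * cx + a * cy + v0[1]
            field[(x0, y0)] = (round(vx * precision) / precision,
                               round(vy * precision) / precision)
    return field


# Identical control point vectors describe pure translation, so every
# sub-block receives the same motion vector.
translation = affine_subblock_mvs((1.5, -2.0), (1.5, -2.0), 16, 16)
zoom = affine_subblock_mvs((0, 0), (16, 0), 16, 16)
```

Sampling at the sub-block center rather than at its corner keeps the approximation error symmetric across the sub-block, which is the reason the text evaluates the model at the center sample.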

In the JEM, there are two affine motion modes: AF_INTER mode and AF_MERGE mode. AF_INTER mode can be applied to CUs whose width and height are both larger than 8. An affine flag at the CU level is signaled in the bitstream to indicate whether AF_INTER mode is used. In AF_INTER mode, a candidate list of motion vector pairs $\{(v_0, v_1) \mid v_0 = \{v_A, v_B, v_C\},\ v_1 = \{v_D, v_E\}\}$ is constructed using the neighboring blocks.

FIG. 4 shows an example of motion vector prediction (MVP) for block 400 in AF_INTER mode. As shown in FIG. 4, $v_0$ is selected from the motion vectors of sub-block A, B, or C. The motion vectors from the neighboring blocks can be scaled according to the reference list. They can also be scaled according to the relationship among the picture order count (POC) of the reference for the neighboring block, the POC of the reference for the current CU, and the POC of the current CU. The approach for selecting $v_1$ from the neighboring sub-blocks D and E is similar. When the number of candidates in the list is smaller than 2, the list is padded with motion vector pairs composed by duplicating each of the AMVP candidates. When the list contains more than 2 candidates, the candidates can first be sorted according to the neighboring motion vectors (e.g., based on the similarity of the two motion vectors in a candidate pair). In some implementations, the first two candidates are kept. In some embodiments, a rate-distortion (RD) cost check is used to determine which motion vector pair candidate is selected as the control point motion vector prediction (CPMVP) of the current CU. An index indicating the position of the CPMVP in the candidate list can be signaled in the bitstream. After the CPMVP of the current affine CU is determined, affine motion estimation is applied and the control point motion vector (CPMV) is found. The difference between the CPMV and the CPMVP is then signaled in the bitstream.

When a CU is coded in AF_MERGE mode, it obtains the first block coded in affine mode from the valid neighboring reconstructed blocks. FIG. 5A shows an example of the selection order of candidate blocks for the current CU 500. As shown in FIG. 5A, the selection order can be from left (501), above (502), above-right (503), bottom-left (504) to above-left (505) of the current CU 500. FIG. 5B shows another example of candidate blocks for the current CU 500 in AF_MERGE mode. If the neighboring bottom-left block 501 is coded in affine mode, as shown in FIG. 5B, the motion vectors $v_2$, $v_3$, and $v_4$ of the top-left corner, top-right corner, and bottom-left corner of the CU containing sub-block 501 are derived. The motion vector $v_0$ of the top-left corner of the current CU 500 is calculated based on $v_2$, $v_3$, and $v_4$. The motion vector $v_1$ of the top-right corner of the current CU can be calculated accordingly.

After the CPMVs $v_0$ and $v_1$ of the current CU are computed according to the affine motion model in equation (1), the MVF of the current CU can be generated. To identify whether the current CU is coded in AF_MERGE mode, an affine flag can be signaled in the bitstream when at least one neighboring block is coded in affine mode.

Alternative Temporal Motion Vector Prediction (ATMVP)

In the ATMVP method, the temporal motion vector prediction (TMVP) method is modified by fetching multiple sets of motion information (including motion vectors and reference indices) from blocks smaller than the current CU.

FIG. 6 shows an example of the ATMVP motion prediction process for CU 600. The ATMVP method predicts the motion vectors of the sub-CUs 601 within CU 600 in two steps. The first step is to identify the corresponding block 651 in a reference picture 650 with a temporal vector. The reference picture 650 is also referred to as the motion source picture. The second step is to split the current CU 600 into sub-CUs 601 and obtain the motion vector and reference index of each sub-CU from the block corresponding to that sub-CU.

In the first step, the reference picture 650 and the corresponding block are determined from the motion information of the spatial neighboring blocks of the current CU 600. To avoid a repetitive scanning process over the neighboring blocks, the first MERGE candidate in the MERGE candidate list of the current CU 600 is used. The first available motion vector and its associated reference index are set to be the temporal vector and the index of the motion source picture. In this way, the corresponding block can be identified more accurately than with TMVP, in which the corresponding block (sometimes called the collocated block) is always located at the bottom-right or center position relative to the current CU.

In the second step, the corresponding block of the sub-CU 651 is identified by the temporal vector in the motion source picture 650, by adding the temporal vector to the coordinates of the current CU. For each sub-CU, the motion information of its corresponding block (e.g., the smallest motion grid that covers the center sample) is used to derive the motion information of the sub-CU. After the motion information of the corresponding N×N block is identified, it is converted to the motion vector and reference index of the current sub-CU in the same way as the TMVP of HEVC, wherein motion scaling and other procedures apply. For example, the decoder checks whether the low-delay condition is fulfilled (e.g., the POCs of all reference pictures of the current picture are smaller than the POC of the current picture) and possibly uses motion vector $MV_x$ (e.g., the motion vector corresponding to reference picture list X) to predict motion vector $MV_y$ for each sub-CU (e.g., with X being equal to 0 or 1 and Y being equal to 1−X).
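The low-delay check mentioned above reduces to a comparison of picture order counts. The following Python sketch illustrates it; the function name `is_low_delay` and the plain list of POC values are assumptions made for the example.

```python
def is_low_delay(current_poc, reference_pocs):
    """Return True when every reference picture precedes the current one.

    current_poc: POC of the current picture.
    reference_pocs: POCs of all reference pictures of the current picture
                    (hypothetical flat list covering both lists).
    """
    return all(poc < current_poc for poc in reference_pocs)


# All references are in the past, so the low-delay condition holds.
past_only = is_low_delay(8, [0, 4, 6])
# One reference lies in the future, so the condition fails.
with_future = is_low_delay(8, [4, 16])
```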

Spatial-Temporal Motion Vector Prediction (STMVP)

In the STMVP method, the motion vectors of the sub-CUs are derived recursively, following raster scan order. FIG. 7 shows an example of one CU with four sub-blocks and its neighboring blocks. Consider an 8×8 CU 700 that includes four 4×4 sub-CUs A (701), B (702), C (703), and D (704). The neighboring 4×4 blocks in the current frame are labeled a (711), b (712), c (713), and d (714).

The motion derivation for sub-CU A starts by identifying its two spatial neighbors. The first neighbor is the N×N block above sub-CU A (701), namely block c (713). If block c (713) is not available or is intra-coded, the other N×N blocks above sub-CU A (701) are checked (from left to right, starting at block c 713). The second neighbor is the block to the left of sub-CU A (701), namely block b (712). If block b (712) is not available or is intra-coded, the other blocks to the left of sub-CU A (701) are checked (from top to bottom, starting at block b 712). The motion information obtained from the neighboring blocks for each list is scaled to the first reference frame of the given list. Next, the temporal motion vector predictor (TMVP) of sub-block A (701) is derived by following the same procedure as the TMVP derivation specified in HEVC. The motion information of the collocated block at block D (704) is fetched and scaled accordingly. Finally, after retrieving and scaling the motion information, all available motion vectors are averaged separately for each reference list. The averaged motion vector is assigned as the motion vector of the current sub-CU.
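The final averaging step can be sketched as follows. This Python example assumes that the spatial and temporal candidates for one reference list have already been fetched and scaled, and that an unavailable candidate is represented by `None`; the function name and this representation are hypothetical.

```python
def average_available_mvs(candidates):
    """Average the available motion vectors for one reference list.

    candidates: already-scaled MVs, e.g. [above, left, temporal], each an
                (x, y) pair or None when unavailable (assumed encoding).
    Returns the averaged (x, y) pair, or None if nothing is available.
    """
    available = [mv for mv in candidates if mv is not None]
    if not available:
        return None
    count = len(available)
    return (sum(mv[0] for mv in available) / count,
            sum(mv[1] for mv in available) / count)


# The left neighbor is unavailable, so only two vectors are averaged.
sub_cu_mv = average_available_mvs([(4, 0), None, (2, 2)])
```

The function would be called once per reference list, since the text averages the vectors separately for each list.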

Frame Rate Up-Conversion (FRUC)

A FRUC flag can be signaled for a CU when its MERGE flag is true. When the FRUC flag is false, a MERGE index can be signaled and the regular MERGE mode is used. When the FRUC flag is true, an additional FRUC mode flag can be signaled to indicate which method (e.g., bilateral matching or template matching) is to be used to derive the motion information of the block.

At the encoder side, the decision on whether to use FRUC MERGE mode for a CU is based on RD cost selection, as done for normal MERGE candidates. For example, multiple matching modes (e.g., bilateral matching and template matching) are checked for a CU by using RD cost selection. The mode leading to the lowest cost is further compared with the other CU modes. If a FRUC matching mode is the most efficient one, the FRUC flag is set to true for the CU and the related matching mode is used.

Typically, the motion derivation process in FRUC MERGE mode has two steps: a CU-level motion search is performed first, followed by a sub-CU-level motion refinement. At the CU level, an initial motion vector is derived for the whole CU based on bilateral matching or template matching. First, a list of MV candidates is generated, and the candidate that leads to the lowest matching cost is selected as the starting point for further CU-level refinement. A local search based on bilateral matching or template matching is then performed around the starting point. The MV that results in the minimum matching cost is taken as the MV of the whole CU. Subsequently, the motion information is further refined at the sub-CU level, with the derived CU motion vector as the starting point.

For example, the following derivation process is performed for the motion information derivation of a W×H CU. At the first stage, the MV of the whole W×H CU is derived. At the second stage, the CU is further split into M×M sub-CUs. The value of M is calculated according to equation (13), where D is a predefined splitting depth that is set to 3 by default in the JEM. The MV of each sub-CU is then derived.
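Equation (13) is not reproduced in this text. The sketch below assumes the form commonly given for the JEM, M = max(4, min(W, H) / 2^D), and should be read as an illustration under that assumption rather than as the equation of this disclosure. The function name `fruc_sub_cu_size` is hypothetical.

```python
def fruc_sub_cu_size(width, height, depth=3):
    """Side length M of the M x M sub-CUs used in FRUC refinement.

    Assumes M = max(4, min(W, H) / 2**D); depth defaults to 3 as stated
    for the JEM. Dimensions are assumed to be powers of two.
    """
    return max(4, min(width, height) >> depth)


# A 64x64 CU is refined on 8x8 sub-CUs; smaller CUs fall back to 4x4.
m_large = fruc_sub_cu_size(64, 64)
m_small = fruc_sub_cu_size(16, 16)
```

The lower bound of 4 matches the 4×4 granularity at which motion is stored elsewhere in the description.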

FIG. 10 shows an example of bilateral matching used in the frame rate up-conversion (FRUC) method. Bilateral matching is used to derive the motion information of the current CU (1000) by finding the closest match between two blocks along the motion trajectory of the current CU in two different reference pictures (1010, 1011). Under the assumption of a continuous motion trajectory, the motion vectors MV0 (1001) and MV1 (1002) pointing to the two reference blocks are proportional to the temporal distances between the current picture and the two reference pictures (e.g., TD0 (1003) and TD1 (1004)). In some embodiments, when the current picture 1000 is temporally located between the two reference pictures (1010, 1011) and the temporal distances from the current picture to the two reference pictures are the same, bilateral matching becomes a mirror-based bi-directional MV.

FIG. 11 shows an example of template matching used in the FRUC method. Template matching can be used to derive the motion information of the current CU 1100 by finding the closest match between a template in the current picture (e.g., the top and/or left neighboring blocks of the current CU) and a block in a reference picture 1110 (e.g., of the same size as the template). In addition to the FRUC MERGE mode described above, template matching can also be applied to AMVP mode. In both the JEM and HEVC, AMVP has two candidates. With the template matching method, a new candidate can be derived. If the candidate newly derived by template matching differs from the first existing AMVP candidate, it is inserted at the very beginning of the AMVP candidate list, and the list size is then set to 2 (e.g., by removing the second existing AMVP candidate). When applied to AMVP mode, only the CU-level search is applied.
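The AMVP list update described above can be written compactly. The Python sketch below is illustrative only; the function name `insert_template_candidate` and the representation of candidates as `(x, y)` tuples are assumptions.

```python
def insert_template_candidate(amvp_list, template_mv, max_size=2):
    """Insert a template-matching candidate into an AMVP candidate list.

    The new candidate goes to the front only if it differs from the first
    existing candidate; the list is then cut back to max_size entries.
    A new list is returned and the input is left untouched.
    """
    updated = list(amvp_list)
    if not updated or template_mv != updated[0]:
        updated.insert(0, template_mv)
    return updated[:max_size]


# The new candidate differs from the first entry, so it displaces the
# second existing candidate.
replaced = insert_template_candidate([(1, 0), (2, 0)], (3, 3))
# The new candidate equals the first entry, so the list is unchanged.
unchanged = insert_template_candidate([(1, 0), (2, 0)], (1, 0))
```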

The MV candidate set at the CU level can include the following: (1) the original AMVP candidates if the current CU is in AMVP mode, (2) all MERGE candidates, (3) several MVs in the interpolated MV field (described later), and (4) the top and left neighboring motion vectors.

When bilateral matching is used, each valid MV of a MERGE candidate can be used as an input to generate an MV pair under the assumption of bilateral matching. For example, one valid MV of a MERGE candidate is $(MV_a, ref_a)$ in reference list A. The reference picture $ref_b$ of its paired bilateral MV is then found in the other reference list B such that $ref_a$ and $ref_b$ are temporally on different sides of the current picture. If such a $ref_b$ is not available in reference list B, $ref_b$ is determined as a reference that is different from $ref_a$ and whose temporal distance to the current picture is the smallest in list B. After $ref_b$ is determined, $MV_b$ is derived by scaling $MV_a$ based on the temporal distances between the current picture and $ref_a$ and $ref_b$.
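The scaling of the paired vector can be sketched as below. The example assumes that temporal distance is measured as a signed POC difference and uses floating-point scaling for readability; actual codecs use fixed-point arithmetic with clipping. The function name `derive_paired_mv` is hypothetical.

```python
def derive_paired_mv(mv_a, poc_cur, poc_ref_a, poc_ref_b):
    """Scale mv_a, which points to ref_a, so that it points to ref_b.

    Temporal distances are signed POC differences (an assumption of this
    sketch), so references on opposite sides of the current picture give
    vectors of opposite sign.
    """
    td_a = poc_ref_a - poc_cur
    td_b = poc_ref_b - poc_cur
    if td_a == 0:
        raise ValueError("ref_a must differ from the current picture")
    scale = td_b / td_a
    return (mv_a[0] * scale, mv_a[1] * scale)


# References at equal distance on both sides yield the mirrored vector.
mirrored = derive_paired_mv((4, -2), poc_cur=4, poc_ref_a=2, poc_ref_b=6)
# A reference twice as far away yields a vector twice as long.
stretched = derive_paired_mv((4, -2), poc_cur=4, poc_ref_a=2, poc_ref_b=8)
```

The mirrored case corresponds to the situation in which the current picture lies midway between the two reference pictures.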

In some implementations, four MVs from the interpolated MV field can also be added to the CU-level candidate list. More specifically, the interpolated MVs at the positions (0, 0), (W/2, 0), (0, H/2), and (W/2, H/2) of the current CU are added. When FRUC is applied in AMVP mode, the original AMVP candidates are also added to the CU-level MV candidate set. In some implementations, at the CU level, 15 MVs for AMVP CUs and 13 MVs for MERGE CUs can be added to the candidate list.

The MV candidate set at the sub-CU level includes (1) the MV determined from the CU-level search, (2) the top, left, top-left, and top-right neighboring MVs, (3) scaled versions of collocated MVs from the reference pictures, (4) one or more ATMVP candidates (e.g., up to four), and (5) one or more STMVP candidates (e.g., up to four). The scaled MVs from the reference pictures are derived as follows. The reference pictures in both lists are traversed. The MV at the collocated position of the sub-CU in a reference picture is scaled to the reference of the starting CU-level MV. The ATMVP and STMVP candidates can be limited to the first four. At the sub-CU level, one or more MVs (e.g., up to 17) are added to the candidate list.

Generation of the Interpolated MV Field

Before a frame is coded, an interpolated motion field is generated for the whole picture based on unilateral ME. The motion field can then be used later as CU-level or sub-CU-level MV candidates.

In some embodiments, the motion field of each reference picture in both reference lists is traversed at the 4×4 block level. FIG. 12 shows an example of unilateral motion estimation (ME) 1200 in the FRUC method. For each 4×4 block, if the motion associated with the block passes through a 4×4 block in the current picture and that block has not been assigned any interpolated motion, the motion of the reference block is scaled to the current picture according to the temporal distances TD0 and TD1 (in the same way as the MV scaling of TMVP in HEVC), and the scaled motion is assigned to the block in the current frame. If no scaled MV is assigned to a 4×4 block, the motion of that block is marked as unavailable in the interpolated motion field.

Interpolation and Matching Cost

Motion-compensated interpolation is required when a motion vector points to a fractional sample position. To reduce complexity, bilinear interpolation is used instead of the regular 8-tap HEVC interpolation for both bidirectional matching and template matching.

The matching cost is calculated slightly differently at different steps. When a candidate is selected from the CU-level candidate set, the matching cost may be the sum of absolute differences (SAD) of bidirectional matching or template matching. After the starting MV is determined, the matching cost C of bidirectional matching in the sub-CU-level search is calculated as follows:

Here, w is a weighting factor. In some embodiments, w may be empirically set to 4. MV and MV^s denote the current MV and the starting MV, respectively. SAD may still be used as the matching cost of template matching in the sub-CU-level search.
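A minimal sketch of such a cost follows. It assumes the form C = SAD + w·(|MV_x − MV^s_x| + |MV_y − MV^s_y|), which is how JEM descriptions of FRUC commonly state this cost; the function names and the list-of-lists block layout are illustrative only.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D sample blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def sub_cu_bilateral_cost(block_a, block_b, mv, mv_start, w=4):
    """SAD plus a penalty for moving away from the starting MV.

    mv and mv_start are (x, y) tuples. w=4 is the empirical value from the
    text. The L1-distance penalty is an assumption of this sketch.
    """
    penalty = abs(mv[0] - mv_start[0]) + abs(mv[1] - mv_start[1])
    return sad(block_a, block_b) + w * penalty
```

With this form, a candidate MV far from the starting MV must reduce the SAD by more than w times its L1 distance to be selected, which keeps the sub-CU refinement close to the CU-level result.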

In FRUC mode, the MV is derived using luma samples only. The derived motion is used for both luma and chroma in MC inter prediction. After the MV is determined, the final MC is performed using an 8-tap interpolation filter for luma and a 4-tap interpolation filter for chroma.

MV refinement is a pattern-based MV search that uses the bidirectional matching cost or the template matching cost as its criterion. JEM supports two search patterns: an unrestricted center-biased diamond search (UCBDS) and an adaptive cross search, used for MV refinement at the CU level and the sub-CU level, respectively. For both CU-level and sub-CU-level MV refinement, the MV is searched directly at quarter-luma-sample MV precision, followed by one-eighth-luma-sample MV refinement. The search range of MV refinement for the CU and sub-CU steps is set to 8 luma samples.

In the bidirectional matching MERGE mode, bidirectional prediction is applied because the motion information of a CU is derived from the closest match between two blocks along the motion trajectory of the current CU in two different reference pictures. In the template matching MERGE mode, the encoder can choose among unidirectional prediction from list 0, unidirectional prediction from list 1, or bidirectional prediction for a CU. The selection can be based on the template matching cost as follows:

If costBi <= factor * min(cost0, cost1)

then bidirectional prediction is used;

otherwise, if cost0 <= cost1

then unidirectional prediction from list 0 is used;

otherwise,

unidirectional prediction from list 1 is used;

Here, cost0 is the SAD of the list 0 template matching, cost1 is the SAD of the list 1 template matching, and costBi is the SAD of the bidirectional template matching. For example, a factor value of 1.25 means that the selection process is biased toward bidirectional prediction. The inter prediction direction selection can be applied to the CU-level template matching process.
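The selection rule above maps directly onto a small function. This is a sketch only; the function name and the string return values are illustrative, and factor defaults to the 1.25 example from the text.

```python
def select_prediction_direction(cost0, cost1, cost_bi, factor=1.25):
    """Choose the inter prediction direction from template matching costs.

    cost0, cost1 : SAD of list 0 and list 1 template matching
    cost_bi      : SAD of bidirectional template matching
    factor       : values above 1 bias the choice toward bidirectional prediction
    """
    if cost_bi <= factor * min(cost0, cost1):
        return "bi"
    if cost0 <= cost1:
        return "list0"
    return "list1"
```

With factor = 1.25, bidirectional prediction is still chosen when its cost is up to 25% higher than the better unidirectional cost.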

The sub-block-based prediction techniques discussed above can be used to obtain more accurate motion information for each sub-block when the sub-block size is small. However, smaller sub-blocks impose a higher bandwidth requirement on motion compensation. In addition, the motion information derived for a smaller sub-block may be inaccurate, especially when there is noise in the block. A fixed sub-block size within a block may therefore be suboptimal.

This document describes techniques that can be used in various embodiments to address the bandwidth and accuracy problems introduced by a fixed sub-block size by using non-uniform and/or variable sub-block sizes. These techniques, also referred to as interleaved prediction, divide a block in different ways so that motion information can be obtained more reliably without increasing bandwidth consumption.

With interleaved prediction, a block is divided into sub-blocks using one or more partition patterns. A partition pattern (also called a partition mode) specifies how a block is divided into sub-blocks, including the sizes and positions of the sub-blocks. For each partition pattern, a corresponding prediction block can be generated by deriving the motion information of each sub-block based on that pattern. Therefore, in some embodiments, multiple prediction blocks can be generated from multiple partition patterns even for a single prediction direction. In some embodiments, only one partition pattern may be applied for each prediction direction.

FIG. 13 shows an example of interleaved prediction with two partition patterns in accordance with the disclosed technology. The current block 1300 can be divided using multiple patterns. For example, as shown in FIG. 13, the current block is divided using both pattern 0 (1301) and pattern 1 (1302). Two prediction blocks, P₀ (1303) and P₁ (1304), are generated. The final prediction block P (1305) of the current block 1300 can be generated by computing a weighted sum of P₀ (1303) and P₁ (1304).

In general, given X partition patterns, X prediction blocks of the current block (denoted P₀, P₁, …, P_{X−1}) can be generated by sub-block-based prediction with the X partition patterns. The final prediction of the current block (denoted P) can be generated as:

Here, (x, y) is the coordinate of a pixel in the block, and w_i(x, y) is the weighting coefficient of P_i. By way of example and not limitation, the weights can be expressed as:

N is a non-negative value. Alternatively, the shift operation in equation (8) can also be expressed as:

Because the sum of the weights is a power of 2, the weighted sum P can be computed more efficiently with a bit-shift operation instead of a floating-point division.
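The weighted combination described above can be sketched as follows. This is a minimal illustration, assuming the final prediction at each sample is the sum of w_i(x, y)·P_i(x, y) divided by the sum of the weights, and that the weights at every position sum to 1 << N so that the division becomes a right shift. The name combine_predictions and the list-of-lists block layout are choices made for this sketch only.

```python
def combine_predictions(preds, weights, n):
    """Combine X intermediate prediction blocks sample by sample.

    preds   : list of X blocks, each a 2-D list of integer samples
    weights : list of X blocks of integer weights w_i(x, y)
    n       : shift amount; the weights at each position must sum to 1 << n
    """
    height, width = len(preds[0]), len(preds[0][0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total_w = sum(w[y][x] for w in weights)
            if total_w != (1 << n):
                raise ValueError("weights must sum to 1 << n")
            acc = sum(w[y][x] * p[y][x] for w, p in zip(weights, preds))
            out[y][x] = acc >> n  # truncating shift; no rounding offset in this sketch
    return out
```

For two patterns with weights 3 and 1 (N = 2), a sample pair (100, 108) combines to (3·100 + 1·108) >> 2 = 102.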

Partition patterns can have different sub-block shapes, sizes, or positions. In some embodiments, a partition pattern may include irregular sub-block sizes. FIGS. 14A-14G show several examples of partition patterns for a 16×16 block. In FIG. 14A, the block is divided into 4×4 sub-blocks in accordance with the disclosed technology. This pattern is also used in JEM. FIG. 14B shows an example of a partition pattern that divides the block into 8×8 sub-blocks. FIG. 14C shows an example of a partition pattern that divides the block into 8×4 sub-blocks. FIG. 14D shows an example of a partition pattern that divides the block into 4×8 sub-blocks. In FIG. 14E, a portion of the block is divided into 4×4 sub-blocks, while the pixels at the block boundaries are divided into smaller sub-blocks with sizes such as 2×4, 4×2, or 2×2. Some sub-blocks can be merged to form larger sub-blocks. FIG. 14F shows an example in which adjacent sub-blocks (e.g., 4×4 sub-blocks and 2×4 sub-blocks) are merged to form larger sub-blocks with sizes such as 6×4, 4×6, or 6×6. In FIG. 14G, a portion of the block is divided into 8×8 sub-blocks, while the pixels at the block boundaries are divided into smaller sub-blocks such as 8×4, 4×8, or 4×4.
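One way to represent such patterns is as a list of sub-block rectangles. The sketch below is an assumption-laden illustration: it models a pattern as a regular grid with an optional offset, which reproduces the uniform 4×4 case and a layout like the one described for FIG. 14G (an 8×8 interior with 8×4, 4×8, and 4×4 sub-blocks at the boundaries). The offset-grid model and all names are hypothetical and are not taken from the figures themselves.

```python
def split_points(length, step, offset=0):
    """Boundaries along one dimension: 0, then offset (or step), ..., length."""
    points = [0]
    pos = offset if offset > 0 else step
    while pos < length:
        points.append(pos)
        pos += step
    points.append(length)
    return points


def partition(width, height, sub_w, sub_h, off_x=0, off_y=0):
    """Return sub-blocks as (x, y, w, h) tuples covering a width x height block."""
    xs = split_points(width, sub_w, off_x)
    ys = split_points(height, sub_h, off_y)
    return [(x0, y0, x1 - x0, y1 - y0)
            for y0, y1 in zip(ys, ys[1:])
            for x0, x1 in zip(xs, xs[1:])]


uniform_4x4 = partition(16, 16, 4, 4)              # sixteen 4x4 sub-blocks
offset_8x8 = partition(16, 16, 8, 8, off_x=4, off_y=4)  # 8x8 centre, smaller edges
```

Because the offset grid places sub-block boundaries where the uniform grid has sub-block centres, each pixel sits near the centre of a sub-block in at least one of the two patterns.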

In sub-block-based prediction, the shapes and sizes of the sub-blocks can be determined based on the shape and/or size of the coding block and/or on coding block information. The coding block information may include the coding algorithm used on the block and/or sub-block, such as whether the motion-compensated prediction is (1) an affine prediction method, (2) an alternative temporal motion vector prediction (ATMVP) method, (3) a spatial-temporal motion vector prediction (STMVP) method, (4) a bidirectional optical flow (BIO) method, or (5) a frame rate up-conversion (FRUC) method. For example, in some embodiments, when the size of the current block is M×N, the size of the sub-blocks is 4×N (or 8×N, etc.), i.e., the sub-blocks have the same height as the current block. In some embodiments, when the size of the current block is M×N, the size of the sub-blocks is M×4 (or M×8, etc.), i.e., the sub-blocks have the same width as the current block. In some embodiments, when the size of the current block is M×N (where M > N), the size of the sub-blocks is A×B, where A > B (e.g., 8×4). Alternatively, the size of the sub-blocks is B×A (e.g., 4×8).

In some embodiments, the size of the current block is M×N. When M×N <= T (or min(M, N) <= T, or max(M, N) <= T, etc.), the size of the sub-blocks is A×B; when M×N > T (or min(M, N) > T, or max(M, N) > T, etc.), the size of the sub-blocks is C×D, where A <= C and B <= D. For example, if M×N <= 256, the size of the sub-blocks may be 4×4. In some implementations, the size of the sub-blocks is 8×8.
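The threshold rule can be sketched as a small selector. The threshold T = 256 and the 4×4 size come from the example in the text; pairing 8×8 with the larger-block case is an assumption made here for illustration, as are the function and parameter names.

```python
def choose_sub_block_size(m, n, t=256, criterion="area",
                          small=(4, 4), large=(8, 8)):
    """Pick the sub-block size A x B or C x D for an M x N block.

    criterion selects which measure is compared with t:
    "area" (M*N), "min" (min(M, N)) or "max" (max(M, N)).
    small and large must satisfy A <= C and B <= D.
    """
    measures = {"area": m * n, "min": min(m, n), "max": max(m, n)}
    return small if measures[criterion] <= t else large
```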

In some embodiments, whether to apply interleaved prediction can be determined based on the inter prediction direction. The direction indicates whether the first or second intermediate prediction block is predicted backward in time or forward in time. For example, in some embodiments, interleaved prediction may be applied to bidirectional prediction but not to unidirectional prediction.

As another example, when multiple hypothesis prediction is applied, interleaved prediction may be applied to one prediction direction when there is more than one reference block. Multiple hypothesis means that multiple video frames are used to produce the prediction block. When multiple video frames are used to produce the prediction block, interleaved prediction can be applied to one prediction direction. The prediction direction may be forward or backward. The forward prediction direction means that the multiple video frames precede the prediction block in the video sequence, and the backward direction means that the multiple reference frames follow the prediction block in the video sequence.

In some embodiments, how to apply interleaved prediction can also be determined based on the inter prediction direction. In some embodiments, a bidirectionally predicted block with sub-block-based prediction is divided into sub-blocks using two different partition patterns for the two different reference lists. In bidirectional prediction, a first reference list and a second reference list may be obtained, where the two lists represent frames that are forward and backward in time from the prediction block. More specifically, the first reference list may include a first set of blocks in a first direction relative to the prediction block in the video sequence, and the first set of blocks can be used to create the prediction block. The second reference list may include a second set of blocks in a second direction relative to the prediction block in the video sequence, and the second set of blocks can likewise be used to create the prediction block. The first direction and the second direction may be opposite, i.e., one direction may be forward in time and the other backward in time from the prediction block. The first reference list may be partitioned into a first set of sub-blocks according to a first partition pattern, and the second reference list may be partitioned into a second set of sub-blocks according to a second partition pattern, where the first pattern and the second pattern are different.

For example, when predicting from reference list 0 (L0), the bidirectionally predicted block is divided into 4×8 sub-blocks, as shown in FIG. 14D. When predicting from reference list 1 (L1), the same block is divided into 8×4 sub-blocks, as shown in FIG. 14C. The final prediction P is calculated as:

Here, P⁰ and P¹ are the predictions from L0 and L1, respectively, and w⁰ and w¹ are the weighting values for L0 and L1, respectively. As shown in equation (16), the weighting values can be determined as w⁰(x, y) + w¹(x, y) = 1 << N (where N is a non-negative integer). Because fewer sub-blocks are used for prediction in each direction (e.g., 4×8 sub-blocks instead of 8×8 sub-blocks), the computation requires less bandwidth than existing sub-block-based methods. Using larger sub-blocks also makes the prediction result less susceptible to noise.

In some embodiments, a unidirectionally predicted block with sub-block-based prediction is divided into sub-blocks using two or more different partition patterns for the same reference list. For example, for prediction from list L (L = 0 or 1), P^L is calculated as:

Here, X_L is the number of partition patterns for list L, P_i^L(x, y) is the prediction generated with the i-th partition pattern, and w_i^L(x, y) is the weighting value of P_i^L(x, y). For example, when X_L is 2, two partition patterns are applied for list L. In the first partition pattern, the block is divided into 4×8 sub-blocks as shown in FIG. 14D; in the second partition pattern, the block is divided into 8×4 sub-blocks as shown in FIG. 14C.

In some embodiments, a bidirectionally predicted block with sub-block-based prediction is regarded as a combination of two unidirectionally predicted blocks from L0 and L1, respectively. The prediction from each list can be derived as described in the example above. The final prediction P can be calculated as:

Here, the parameters a and b are two additional weights applied to the two internal prediction blocks. In this specific example, both a and b can be set to 1. As in the example above, because fewer sub-blocks are used for prediction in each direction (e.g., 4×8 sub-blocks instead of 8×8 sub-blocks), the bandwidth usage is better than or on par with that of existing sub-block-based methods. At the same time, the prediction result can be improved by using larger sub-blocks.

In some embodiments, a single non-uniform pattern can be used in each unidirectionally predicted block. For example, for each list L (e.g., L0 or L1), the block is divided using a different pattern (e.g., as shown in FIG. 14E or FIG. 14F). Using a smaller number of sub-blocks reduces the bandwidth requirement. The non-uniformity of the sub-blocks also increases the robustness of the prediction result.

In some embodiments, for a multiple-hypothesis coded block, there can be more than one prediction block generated by different partition patterns for each prediction direction (or reference picture list). The multiple prediction blocks can be used to generate the final prediction with additional weights applied. For example, the additional weights may be set to 1/M, where M is the total number of prediction blocks generated.

In some embodiments, the encoder can determine whether and how to apply interleaved prediction. The encoder can then send information corresponding to the determination to the decoder at the sequence level, picture level, view level, slice level, coding tree unit (CTU) (also known as largest coding unit (LCU)) level, CU level, PU level, tree unit (TU) level, tile level, tile group level, or region level (which may include multiple CUs/PUs/TUs/LCUs). The information can be signaled in a sequence parameter set (SPS), view parameter set (VPS), picture parameter set (PPS), slice header (SH), picture header, sequence header, at the tile level or tile group level, or in the first block of a CTU/LCU, CU, PU, TU, or region.

In some implementations, interleaved prediction is applied to existing sub-block methods such as affine prediction, ATMVP, STMVP, FRUC, or BIO. In such cases, no additional signaling cost is needed. In some implementations, new sub-block MERGE candidates generated by interleaved prediction can be inserted into the MERGE list, e.g., interleaved prediction + ATMVP, interleaved prediction + STMVP, interleaved prediction + FRUC, and so on.

In some embodiments, a flag can be signaled to indicate whether interleaved prediction is used. Signaling the flag may include encoding the flag in the video information. In one example, if the current block is affine inter-coded, a flag can be signaled to indicate whether interleaved prediction is used. In another example, if the current block is affine MERGE-coded and unidirectional prediction is applied, the flag can be signaled to indicate whether interleaved prediction is used. In a third example, if the current block is affine MERGE-coded, the flag can be signaled to indicate whether interleaved prediction is used.

In some embodiments, interleaved prediction may always be used if the current block is affine MERGE-coded and unidirectional prediction is applied. In some embodiments, interleaved prediction may always be used if the current block is affine MERGE-coded.

In some embodiments, the flag indicating whether interleaved prediction is used can be inherited without being signaled. In one example, inheritance can be used if the current block is affine MERGE-coded. In another example, the flag can be inherited from the flag of the neighboring block from which the affine model is inherited. In a third example, the flag is inherited from a predefined neighboring block, such as the left or above neighboring block. In a fourth example, the flag can be inherited from the first affine-coded neighboring block encountered. If no neighboring block is affine-coded, the flag can be inferred to be zero; in other words, interleaved prediction is not applied in that case. In a fifth example, the flag can be inherited only when unidirectional prediction is applied to the current block. In a sixth example, the flag can be inherited only when the current block and the neighboring block to be inherited from are in the same CTU. In a seventh example, the flag can be inherited only when the current block and the neighboring block to be inherited from are in the same CTU row. In an eighth example, the flag cannot be inherited from the flag of a neighboring block when the affine model is derived from a temporal neighboring block. In a ninth example, the flag cannot be inherited from the flag of a neighboring block that is not in the same LCU, LCU row, or video data processing unit (e.g., 64×64 or 128×128). How the flag is signaled and/or derived may depend on the block dimensions and/or coding information of the current block. The coding information includes information encoded in the video.
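The inheritance rules above can be combined in several ways. The sketch below illustrates one possible combination of the fourth and sixth examples: take the flag of the first affine-coded neighbor, optionally restricted to neighbors in the same CTU, and infer zero otherwise. The Neighbor record, its fields, and the choice to skip (rather than stop at) a neighbor in another CTU are all assumptions of this sketch, not details stated in the text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Neighbor:
    """Hypothetical summary of a neighboring block's coding state."""
    is_affine: bool
    interleave_flag: int
    ctu_index: int


def inherit_interleave_flag(neighbors: Sequence[Neighbor],
                            current_ctu_index: Optional[int] = None) -> int:
    """Return the flag of the first eligible affine-coded neighbor, else 0.

    neighbors is given in checking order (for example left, then above).
    If current_ctu_index is set, neighbors in other CTUs are skipped.
    """
    for nb in neighbors:
        if not nb.is_affine:
            continue
        if current_ctu_index is not None and nb.ctu_index != current_ctu_index:
            continue
        return nb.interleave_flag
    return 0  # no eligible neighbor: interleaved prediction is not applied
```

Because the decoder can evaluate the same rule from already-decoded neighbors, no bits are spent on the flag for the current block.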

In some embodiments, interleaved prediction is not applied if the reference picture is the current picture. In one example, if the reference frame is the current frame containing the prediction block, the flag indicating whether interleaved prediction is used is not signaled. The reference frame serves as the basis for predicting the prediction block.

In some embodiments, the partition pattern to be used by the current block can be derived from information from spatially and/or temporally neighboring blocks. For example, instead of relying on the encoder to signal the relevant information, both the encoder and the decoder can apply a set of predetermined rules to obtain the partition pattern based on temporal adjacency (e.g., the partition pattern previously used for the same block) or spatial adjacency (e.g., the partition patterns used by neighboring blocks).

In some embodiments, the weighting values w can be fixed. For example, all partition patterns can be weighted equally: w_i(x, y) = 1. In some embodiments, the weighting values can be determined based on the position within the block and the partition pattern used. For example, w_i(x, y) may be different for different (x, y). In some embodiments, the weighting values may further depend on the sub-block-based coding technique (e.g., affine or ATMVP) and/or other coding information (e.g., skip or non-skip mode, and/or MV information).

In some embodiments, the encoder can determine the weighting values and send them to the decoder at the sequence level, picture level, slice level, CTU/LCU level, CU level, PU level, or region level (which may include multiple CUs/PUs/TUs/LCUs). The weighting values can be signaled in a sequence parameter set (SPS), picture parameter set (PPS), slice header (SH), or the first block of a CTU/LCU, CU, PU, or region. In some embodiments, the weighting values can be derived from the weighting values of spatially and/or temporally neighboring blocks.

It should be noted that the interleaved prediction techniques disclosed herein can be applied to one, some, or all of the sub-block-based coding techniques. For example, interleaved prediction can be applied to affine prediction while other sub-block-based coding techniques (e.g., ATMVP, STMVP, FRUC, or BIO) do not use it. As another example, affine prediction, ATMVP, and STMVP all apply the interleaved prediction techniques disclosed herein.

FIG. 15A is an example flowchart of a method 1500 for improving motion prediction in a video system in accordance with the disclosed technology. The method 1500 includes, at 1502, selecting a set of pixels from a video frame to form a block. The method 1500 includes, at 1504, partitioning the block into a first set of sub-blocks according to a first pattern. The method 1500 includes, at 1506, generating a first intermediate prediction block based on the first set of sub-blocks. The method 1500 includes, at 1508, partitioning the block into a second set of sub-blocks according to a second pattern. At least one sub-block in the second set has a size different from that of a sub-block in the first set. The method 1500 includes, at 1510, generating a second intermediate prediction block based on the second set of sub-blocks. The method 1500 also includes, at 1512, determining a prediction block based on the first intermediate prediction block and the second intermediate prediction block.
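The steps of method 1500 can be walked through in a toy form. The sketch below is not an implementation of any codec: it assumes integer motion vectors, copies samples from a single reference picture with coordinates clamped to the picture, and averages the two intermediate blocks with equal weights. The callback mv_for, which returns one MV per sub-block, stands in for whichever sub-block technique (affine, ATMVP, and so on) supplies the motion; it and every other name here are hypothetical.

```python
def uniform_partition(width, height, sub_w, sub_h):
    """Steps 1504/1508: sub-blocks as (x, y, w, h) tuples on a regular grid."""
    return [(x, y, min(sub_w, width - x), min(sub_h, height - y))
            for y in range(0, height, sub_h)
            for x in range(0, width, sub_w)]


def intermediate_prediction(ref, bx, by, width, height, sub_blocks, mv_for):
    """Steps 1506/1510: one integer-MV copy per sub-block from picture ref."""
    ref_h, ref_w = len(ref), len(ref[0])
    pred = [[0] * width for _ in range(height)]
    for sx, sy, sw, sh in sub_blocks:
        dx, dy = mv_for(sx, sy, sw, sh)  # one MV shared by the whole sub-block
        for y in range(sy, sy + sh):
            for x in range(sx, sx + sw):
                rx = min(max(bx + x + dx, 0), ref_w - 1)  # clamp to the picture
                ry = min(max(by + y + dy, 0), ref_h - 1)
                pred[y][x] = ref[ry][rx]
    return pred


def interleaved_prediction(ref, bx, by, width, height,
                           pattern0, pattern1, mv_for):
    """Step 1512: combine the two intermediate blocks with equal weights."""
    p0 = intermediate_prediction(ref, bx, by, width, height, pattern0, mv_for)
    p1 = intermediate_prediction(ref, bx, by, width, height, pattern1, mv_for)
    # Weights 1 and 1 sum to 1 << 1, so the average is a one-bit shift.
    return [[(a + b) >> 1 for a, b in zip(r0, r1)] for r0, r1 in zip(p0, p1)]
```

Calling interleaved_prediction with uniform_partition(8, 8, 4, 8) and uniform_partition(8, 8, 8, 4) as the two patterns mirrors the 4×8 and 8×4 pairing used in the examples above.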

In some embodiments, the first intermediate prediction block or the second intermediate prediction block is generated using at least one of (1) an affine prediction method, (2) an alternative temporal motion vector prediction method, (3) a spatial-temporal motion vector prediction method, (4) a bidirectional optical flow method, or (5) a frame rate up-conversion method.

In some embodiments, the sub-blocks in the first set or the second set have a rectangular shape. In some embodiments, the sub-blocks in the first set have non-uniform shapes. In some embodiments, the sub-blocks in the second set have non-uniform shapes.

In some embodiments, the method includes determining the first pattern or the second pattern based on the size of the block. In some embodiments, the method includes determining the first pattern or the second pattern based on information from a second block that is temporally or spatially adjacent to the block.

In some embodiments, the partitioning of the block into the first set of sub-blocks is performed for motion prediction of the block in a first direction. In some embodiments, the partitioning of the block into the second set of sub-blocks is performed for motion prediction of the block in a second direction.

In some embodiments, both the partitioning of the block into the first set of sub-blocks and the partitioning of the block into the second set of sub-blocks are performed for motion prediction of the block in a first direction. In some embodiments, the method further includes performing motion prediction of the block in a second direction by: partitioning the block into a third set of sub-blocks according to a third pattern; generating a third intermediate prediction block based on the third set of sub-blocks; partitioning the block into a fourth set of sub-blocks according to a fourth pattern, where at least one sub-block in the fourth set has a size different from that of a sub-block in the third set; generating a fourth intermediate prediction block based on the fourth set of sub-blocks; determining a second prediction block based on the third intermediate prediction block and the fourth intermediate prediction block; and determining a third prediction block based on the prediction block and the second prediction block.

In some embodiments, the method includes transmitting information about the first pattern and the second pattern used for partitioning the block to a coding device in a block-based motion prediction video system. In some embodiments, the information about the first pattern and the second pattern is transmitted at one of: (1) a sequence level, (2) a picture level, (3) a view level, (4) a slice level, (5) a coding tree unit, (6) a largest coding unit level, (7) a coding unit level, (8) a prediction unit level, (9) a tree unit level, or (10) a region level.

In some embodiments, determining the prediction block includes: applying a first set of weights to the first intermediate prediction block to obtain a first weighted prediction block; applying a second set of weights to the second intermediate prediction block to obtain a second weighted prediction block; and computing a weighted sum of the first weighted prediction block and the second weighted prediction block to obtain the prediction block.

In some embodiments, the first set of weights or the second set of weights includes fixed weight values. In some embodiments, the first set of weights or the second set of weights is determined based on information from another block that is temporally or spatially adjacent to the block. In some embodiments, the first set of weights or the second set of weights is determined using the coding algorithm used to generate the first prediction block or the second prediction block. In some implementations, at least one value in the first set of weights differs from another value in the first set of weights. In some implementations, at least one value in the second set of weights differs from another value in the second set of weights. In some implementations, the sum of the weights is equal to a power of two.

In some embodiments, the method includes transmitting the weights to a coding device in a block-based motion prediction video system. In some embodiments, the weights are transmitted at one of: (1) sequence level, (2) picture level, (3) view level, (4) slice level, (5) coding tree unit level, (6) largest coding unit level, (7) coding unit level, (8) prediction unit level, (10) tree unit level, or (11) region level.

FIG. 15B is an example flowchart of a method 1550 for improving block-based motion prediction in a video system in accordance with the disclosed technology. The method 1550 includes, at 1552, selecting a set of pixels from a video frame to form a block. The method 1550 includes, at 1554, dividing the block into a plurality of sub-blocks based on the size of the block or on information from another block that is spatially or temporally adjacent to the block. At least one sub-block of the plurality of sub-blocks has a different size than the other sub-blocks. The method 1550 also includes, at 1556, generating a motion vector prediction by applying a coding algorithm to the plurality of sub-blocks. In some embodiments, the coding algorithm includes at least one of (1) an affine prediction method, (2) an alternative temporal motion vector prediction (ATMVP) method, (3) a spatial-temporal motion vector prediction (STMVP) method, (4) a bi-directional optical flow (BIO) method, or (5) a frame rate up-conversion (FRUC) method.
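Step 1554 can be illustrated with a small sketch. The Python function below tiles a block with sub-blocks of a nominal size and an optional offset, so that the sub-blocks along the block border come out smaller than the interior ones. The name `split_block`, its parameters, and the offset-based rule are hypothetical; the document does not prescribe how the sub-block sizes are derived.

```python
def split_block(width, height, sub_w, sub_h, off_x=0, off_y=0):
    """Return (x, y, w, h) rectangles that tile a width x height block.

    Vertical cuts are placed at off_x, off_x + sub_w, ... and horizontal
    cuts at off_y, off_y + sub_h, ...  A non-zero offset therefore yields
    smaller sub-blocks along the block border.
    Assumes 0 <= off_x < sub_w and 0 <= off_y < sub_h.
    """
    def cuts(length, step, off):
        first = off if off > 0 else step
        return [0] + list(range(first, length, step)) + [length]

    xs = cuts(width, sub_w, off_x)
    ys = cuts(height, sub_h, off_y)
    return [(x0, y0, x1 - x0, y1 - y0)
            for y0, y1 in zip(ys, ys[1:])
            for x0, x1 in zip(xs, xs[1:])]


# An 8x8 block with 4x4 sub-blocks shifted by (2, 2): nine sub-blocks
# of sizes 2x2, 4x2, 2x4 and 4x4 that still cover all 64 samples.
parts = split_block(8, 8, 4, 4, off_x=2, off_y=2)
print(len(parts), sorted({(w, h) for _, _, w, h in parts}))
```

Step 1556 would then derive one motion vector per rectangle with one of the listed coding algorithms; that part is not modelled here.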

As further described herein, the encoding process may avoid checking the affine mode for a block split from a parent block when the parent block itself is coded with a mode other than the affine mode.

Table 1 shows example performance results for conventional 2x2 affine prediction under the random access (RA) configuration.

Table 1: Example test results for 2x2 affine prediction

Class      Y         U         V         EncT    DecT
A1         -0.11%    -0.18%    -0.09%    139%    111%
A2         -0.9%     -0.85%    -0.68%    142%    125%
B          -0.58%    -0.51%    -0.67%    136%    114%
C          -0.26%    -0.24%    -0.24%    129%    108%
D          -0.54%    -0.52%    -0.53%    130%    118%
F          -0.89%    -1.02%    -0.97%    125%    108%
Overall    -0.47%    -0.44%    -0.44%    136%    114%

Table 2 shows example performance results obtained by applying interleaved prediction to uni-prediction in accordance with embodiments of the present technology. Table 3 shows example performance results obtained by applying interleaved prediction to bi-prediction in accordance with embodiments of the present technology.

Table 2: Example test results for interleaved prediction in uni-prediction

Class      Y         U         V         EncT    DecT
A1         -0.05%    -0.14%    -0.02%    101%    100%
A2         -0.55%    -0.17%    -0.11%    102%    101%
B          -0.33%    -0.17%    -0.20%    101%    101%
C          -0.15%    -0.16%    -0.04%    100%    100%
D          -0.21%    -0.09%    -0.02%    106%    106%
F          -0.39%    -0.40%    -0.39%    102%    102%
Overall    -0.27%    -0.16%    -0.11%    101%    101%

Table 3: Example test results for interleaved prediction in bi-prediction

Class      Y         U         V         EncT    DecT
A1         -0.09%    -0.18%    -0.12%    103%    102%
A2         -0.74%    -0.40%    -0.28%    104%    104%
B          -0.37%    -0.39%    -0.35%    103%    102%
C          -0.22%    -0.19%    -0.13%    102%    102%
D          -0.42%    -0.28%    -0.32%    103%    102%
F          -0.60%    -0.64%    -0.62%    102%    102%
Overall    -0.38%    -0.30%    -0.23%    103%    102%

As shown in Tables 2 and 3, interleaved prediction achieves most of the coding gain of conventional 2x2 affine prediction at much lower complexity. In particular, interleaved prediction applied to bi-prediction obtains an overall coding gain of 0.38% (Y), compared with 0.47% for the 2x2 affine method, while the encoding and decoding times are 103% and 102%, respectively, compared with 136% and 114% for the 2x2 affine method.

FIG. 16 is a schematic diagram illustrating an example of the architecture of a computer system or other control device 1600 that can be used to implement various portions of the disclosed technology. In FIG. 16, the computer system 1600 includes one or more processors 1605 and memory 1610 connected via an interconnect 1625. The interconnect 1625 may represent any one or more separate physical buses, point-to-point connections, or both, connected by appropriate bridges, adapters, or controllers. The interconnect 1625 may therefore include, for example, a system bus, a Peripheral Component Interconnect (PCI) bus, a HyperTransport or Industry Standard Architecture (ISA) bus, a Small Computer System Interface (SCSI) bus, a Universal Serial Bus (USB), an IIC (I2C) bus, or an Institute of Electrical and Electronics Engineers (IEEE) Standard 1394 bus (sometimes referred to as "FireWire").

The processor 1605 may include a central processing unit (CPU) to control, for example, the overall operation of a host computer. In some embodiments, the processor 1605 accomplishes this by executing software or firmware stored in the memory 1610. The processor 1605 may be, or may include, one or more programmable general-purpose or special-purpose microprocessors, digital signal processors (DSPs), programmable controllers, application-specific integrated circuits (ASICs), programmable logic devices (PLDs), or the like, or a combination of such devices.

The memory 1610 may be or include the main memory of the computer system. The memory 1610 represents any suitable form of random access memory (RAM), read-only memory (ROM), flash memory, or the like, or a combination of such devices. In use, the memory 1610 may contain, among other things, a set of machine instructions that, when executed by the processor 1605, cause the processor 1605 to perform operations that implement embodiments of the disclosed technology.

Also connected to the processor 1605 through the interconnect 1625 is an (optional) network adapter 1615. The network adapter 1615 provides the computer system 1600 with the ability to communicate with remote devices, such as storage clients and/or other storage servers, and may be, for example, an Ethernet adapter or a Fibre Channel adapter.

FIG. 17 shows a block diagram of an example embodiment of a mobile device 1700 that can be used to implement various portions of the disclosed technology. The mobile device 1700 may be a laptop, a smartphone, a tablet, a camcorder, or another device capable of processing video. The mobile device 1700 includes a processor or controller 1701 to process data, and memory 1702 in communication with the processor 1701 to store and/or buffer data. For example, the processor 1701 may include a central processing unit (CPU) or a microcontroller unit (MCU). In some implementations, the processor 1701 may include a field-programmable gate array (FPGA). In some implementations, the mobile device 1700 includes or is in communication with a graphics processing unit (GPU), a video processing unit (VPU), and/or a wireless communication unit for various visual and/or communication data processing functions of the smartphone device.

For example, the memory 1702 may include and store processor-executable code that, when executed by the processor 1701, configures the mobile device 1700 to perform various operations, such as receiving information, commands, and/or data, processing information and data, and transmitting or providing the processed information or data to another device, such as an actuator or an external display. To support various functions of the mobile device 1700, the memory 1702 may store information and data, such as instructions, software, values, images, and other data processed or referenced by the processor 1701. For example, various types of random access memory (RAM) devices, read-only memory (ROM) devices, flash memory devices, and other suitable storage media may be used to implement the storage functions of the memory 1702.

In some implementations, the mobile device 1700 includes an input/output (I/O) unit 1703 to interface the processor 1701 and/or the memory 1702 with other modules, units, or devices. For example, the I/O unit 1703 may interface the processor 1701 and the memory 1702 with various wireless interfaces compatible with typical data communication standards, e.g., between one or more computers in the cloud and the user device. In some implementations, the mobile device 1700 may interface with other devices through the I/O unit 1703 using a wired connection. The mobile device 1700 may also interface with other external interfaces (e.g., data storage) and/or a visual or audio display device 1704 to retrieve and transfer data and information that can be processed by the processor, stored in the memory, or presented on an output unit of the display device 1704 or of an external device. For example, the display device 1704 may display a video frame modified based on MVP (e.g., a video frame that includes the prediction block 1305 shown in FIG. 13) in accordance with the disclosed technology.

In some embodiments, a video decoder apparatus may implement a video decoding method in which the improved block-based motion prediction described herein is used for video decoding. The method may include forming a video block using a set of pixels from a video frame. The block may be partitioned into a first set of sub-blocks according to a first division mode. A first intermediate prediction block may correspond to the first set of sub-blocks. The block may also be partitioned into a second set of sub-blocks according to a second division mode. At least one sub-block in the second set has a different size than a sub-block in the first set. The method may further determine a prediction block based on the first intermediate prediction block and a second intermediate prediction block generated from the second set of sub-blocks. Other features of the method may be similar to those of the method 1500 described above.

In some embodiments, a decoder-side method of video decoding may use block-based motion prediction to improve the quality of the predicted video by operating on a block of a video frame, where the block corresponds to a set of pixels. Based on the size of the block or on information from another block that is spatially or temporally adjacent to the block, the block may be divided into a plurality of sub-blocks, where at least one of the sub-blocks has a different size than the other sub-blocks. The decoder may use a motion vector prediction generated by applying a coding algorithm to the plurality of sub-blocks. Other features of the method are described with reference to FIG. 15B and the corresponding description.

In some embodiments, the video decoding method may be implemented using a decoding apparatus that is implemented on a hardware platform as described with respect to FIGS. 16 and 17.

FIG. 8 is a flowchart of an example method 800 of video encoding or decoding. The method 800 includes determining (802), because a block satisfies a condition, to apply interleaved prediction to the block. The method 800 includes determining (804) a prediction block based on a first intermediate prediction block and a second intermediate prediction block. The method 800 includes generating (806) an encoded or decoded representation of the block using the prediction block. For example, a video encoder or transcoder may perform the encoding at 806, and a video decoder may generate the decoded representation at 806. The first intermediate prediction block is generated from a first set of sub-blocks obtained by dividing the block according to a first division mode, the second intermediate prediction block is generated from a second set of sub-blocks obtained by dividing the block according to a second division mode, and at least one sub-block in the second set has a different size than the sub-blocks in the first set.
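The flow of steps 802 and 804 can be sketched as follows, under several simplifying assumptions that are not taken from this document: square blocks, integer motion vectors supplied by a hypothetical callback `mv_at`, a single reference frame, 4-sample sub-blocks with the second division shifted by 2 samples, bi-prediction as the example condition, and an equal-weight average of the two intermediate prediction blocks. Step 806 (producing the coded or decoded representation) is not modelled.

```python
def tile(size, step, offset):
    """Sub-block boundaries along one axis, e.g. tile(8, 4, 2) -> [0, 2, 6, 8]."""
    first = offset if offset > 0 else step
    return [0] + list(range(first, size, step)) + [size]


def intermediate_prediction(ref, bx, by, size, step, offset, mv_at):
    """Motion-compensate every sub-block of one division with its own integer MV."""
    edges = tile(size, step, offset)
    pred = [[0] * size for _ in range(size)]
    for y0, y1 in zip(edges, edges[1:]):
        for x0, x1 in zip(edges, edges[1:]):
            # Hypothetical MV model, evaluated at the sub-block centre.
            dx, dy = mv_at((x0 + x1) / 2, (y0 + y1) / 2)
            for y in range(y0, y1):
                for x in range(x0, x1):
                    # Clamp reference coordinates to the frame.
                    ry = min(max(by + y + dy, 0), len(ref) - 1)
                    rx = min(max(bx + x + dx, 0), len(ref[0]) - 1)
                    pred[y][x] = ref[ry][rx]
    return pred


def method_800(ref, bx, by, size, is_bi_predicted, mv_at):
    # 802: apply interleaved prediction only if the block meets the condition
    # (assumed here: the block is bi-predicted).
    if not is_bi_predicted:
        return intermediate_prediction(ref, bx, by, size, 4, 0, mv_at)
    # 804: two division modes give two intermediate prediction blocks,
    # which are averaged with rounding (assumed equal weights).
    p0 = intermediate_prediction(ref, bx, by, size, 4, 0, mv_at)
    p1 = intermediate_prediction(ref, bx, by, size, 4, 2, mv_at)
    return [[(a + b + 1) >> 1 for a, b in zip(r0, r1)] for r0, r1 in zip(p0, p1)]


# Toy usage: a 16x16 gradient frame and an MV that grows with x.
ref = [[x + 10 * y for x in range(16)] for y in range(16)]
pred = method_800(ref, 4, 4, 8, True, lambda cx, cy: (int(cx // 4), 0))
print(pred[0][:4])
```

Because the two divisions place their sub-block boundaries at different positions, a sample near a boundary in one division lies in the interior of a sub-block in the other, which is the point of combining the two intermediate blocks.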

In some embodiments, the condition satisfied by the block is that the block is coded using bidirectional predictive coding. In some embodiments, the condition satisfied by the block is that the block is predicted using multiple-hypothesis prediction, and interleaved prediction is applied to a prediction direction for which multiple reference blocks are available. In some embodiments, the block is partitioned into the first set of sub-blocks and into the second set of sub-blocks in order to perform motion prediction of the block in a first direction. In some embodiments, the coded representation may include information of the first division mode and the second division mode.

In some embodiments, the condition includes that the block is coded using bidirectional prediction rather than unidirectional prediction, where bidirectional prediction is based on both a preceding and a subsequent video frame, and unidirectional prediction is based on only a preceding or only a subsequent video frame. In some embodiments, the condition is that the block is bi-directionally predicted, the first intermediate prediction block is generated from the first set of sub-blocks using a first reference list of the block, and the second intermediate prediction block is generated from the second set of sub-blocks using a second reference list of the block. In some embodiments, the condition is that the block is uni-directionally predicted, the first intermediate prediction block is generated from the first set of sub-blocks using reference list L0 or L1 of the block, and the second intermediate prediction block is generated from the second set of sub-blocks using reference list L0 or L1 of the block. In some embodiments in which the block is bi-directionally predicted and the first and second intermediate prediction blocks are generated using the first and second reference lists as described above, the method may further include generating one or more third intermediate prediction blocks from one or more third sets of sub-blocks of the block using the first reference list, and generating one or more fourth intermediate prediction blocks from one or more fourth sets of sub-blocks of the block using the second reference list, wherein the prediction block is determined based on the one or more third intermediate prediction blocks and/or the one or more fourth intermediate prediction blocks.
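The pairing of division modes with reference lists described above can be summarized in a small planning function. This is only an illustrative sketch: the name `plan_predictions`, the numbering of division modes, and the reading that a uni-predicted block uses the same list for both divisions are assumptions.

```python
def plan_predictions(used_lists, extra_rounds=0):
    """Return (division_index, reference_list) pairs, one per intermediate block.

    used_lists: ("L0",) or ("L1",) for a uni-predicted block,
                ("L0", "L1") for a bi-predicted block.
    extra_rounds: for bi-prediction, how many additional (third, fourth, ...)
                  intermediate blocks to add per reference list.
    """
    if len(used_lists) == 1:
        # Uni-prediction: both divisions are assumed to use the same list.
        return [(0, used_lists[0]), (1, used_lists[0])]
    # Bi-prediction: first division with the first list, second with the second.
    plan = [(0, used_lists[0]), (1, used_lists[1])]
    next_division = 2
    for _ in range(extra_rounds):
        for ref_list in used_lists:
            plan.append((next_division, ref_list))
            next_division += 1
    return plan


print(plan_predictions(("L0", "L1"), extra_rounds=1))
# [(0, 'L0'), (1, 'L1'), (2, 'L0'), (3, 'L1')]
```

Each pair would then drive one sub-block motion compensation pass, and the resulting intermediate blocks would be combined into the prediction block.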

In some embodiments, the condition is that the block is a multiple-hypothesis coded block. The method may further include generating one or more additional intermediate prediction blocks from one or more additional sets of sub-blocks of the block using a reference list used for the block, wherein the prediction block is determined based on the one or more additional intermediate prediction blocks.
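One way to fold additional intermediate prediction blocks into the prediction block is an equal-weighted sum, the option recited in claim 9. The sketch below assumes integer samples and round-to-nearest division; the function name `equal_weight_combine` is hypothetical.

```python
def equal_weight_combine(blocks):
    """Average any number of equally sized intermediate prediction blocks.

    blocks: list of 2-D lists of non-negative integer samples.
    Rounding to nearest via the n // 2 offset is an assumption.
    """
    n = len(blocks)
    height, width = len(blocks[0]), len(blocks[0][0])
    return [[(sum(b[y][x] for b in blocks) + n // 2) // n
             for x in range(width)]
            for y in range(height)]


# First, second, and one additional intermediate prediction block (1x2 each).
print(equal_weight_combine([[[10, 20]], [[13, 20]], [[16, 23]]]))  # [[13, 21]]
```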

Other features and variations of the method 800 may be similar to those described with reference to FIGS. 15A and 15B.

FIG. 9 shows a functional block diagram of an example apparatus 1900 that implements the interleaved prediction techniques disclosed herein. For example, the apparatus 1900 may be a video encoder or transcoder that receives video 1902. The received video 1902 may be in the form of compressed or uncompressed video. The video 1902 may be received through a network interface or from a storage device. The video 1902 (in either uncompressed or compressed form) may correspond to video frames of a certain size. The apparatus 1900 may perform a preprocessing 1904 operation on the video 1902. The preprocessing 1904 may be optional and may include operations such as decryption, color space conversion, and quality-enhancement filtering. The encoder 1906 may convert the video 1902 into a coded representation, which may optionally be post-processed by a post-processing block 1910 to produce the output video. For example, the encoder 1906 may perform interleaved prediction on blocks of the video 1902. A block may represent a video region of any size, but is typically chosen to have fixed horizontal and vertical dimensions in pixels (e.g., 128x128 or 16x16). In some cases, a block may represent a coding unit. The optional post-processing may include filtering, encryption, packetization, and so on. The output video 1910 may be stored on a storage device or transmitted through a network interface.

From the foregoing, it will be appreciated that specific embodiments of the presently disclosed technology have been described herein for purposes of illustration, but that various modifications may be made without departing from the scope of the invention. Accordingly, the presently disclosed technology is not limited except as by the appended claims.

The disclosed and other embodiments, modules, and functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer-readable medium for execution by, or to control the operation of, data processing apparatus. The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term "data processing apparatus" encompasses all apparatus, devices, and machines for processing data, including, by way of example, a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to a suitable receiver apparatus.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, subprograms, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special-purpose logic circuitry, e.g., an FPGA (field-programmable gate array) or an ASIC (application-specific integrated circuit).

Processors suitable for the execution of a computer program include, by way of example, both general-purpose and special-purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic disks, magneto-optical disks, or optical disks. However, a computer need not have such devices. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, by way of example, semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special-purpose logic circuitry.

While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.

Only a few implementations and examples are described, and other implementations, enhancements, and variations can be made based on what is described and illustrated in this patent document.

Claims

1. A method of processing a video block using interleaved prediction, the method comprising:

determining, because the block satisfies one or more conditions, a prediction block for the block based on a first intermediate prediction block and a second intermediate prediction block; and

generating an encoded or decoded representation of the block using the prediction block;

wherein the first intermediate prediction block is generated from a first set of sub-blocks obtained by dividing the block according to a first division mode, and the second intermediate prediction block is generated from a second set of sub-blocks obtained by dividing the block according to a second division mode,

wherein the first division mode is different from the second division mode;

wherein the encoded or decoded representation contains, in a sequence parameter set (SPS), a view parameter set (VPS), a picture parameter set (PPS), a slice header (SH), a picture header, or a sequence header, at slice level or slice group level, or at the first block of a region, information on whether and how to apply the interleaved prediction, wherein the information includes a flag that is selectively included based on the condition of the block,

wherein the flag is ignored when the block is coded using a sub-block prediction method,

wherein the sub-block prediction method is an alternative temporal motion vector prediction method, an affine prediction method, a frame rate up-conversion method, a bi-directional optical flow method, or a spatial-temporal motion vector prediction method,

wherein the condition of the block is related to the block size and/or codec information of the block.

2. The method of claim 1, wherein the condition is that the block is uni-directionally predicted, and wherein the first intermediate prediction block is generated from the first set of sub-blocks from the same reference picture, and the second intermediate prediction block is generated from the second set of sub-blocks.

3. The method of claim 1, wherein the condition satisfied by the block is that the block uses bidirectional predictive coding.

4. The method of claim 1, wherein the condition comprises coding the block using bidirectional prediction instead of unidirectional prediction, wherein the bidirectional prediction is based on a reference list L0 and a reference list L1, and the unidirectional prediction is based only on the reference list L0 or the reference list L1.

5. The method of claim 1, wherein the condition is that the block is bi-directionally predicted, and wherein the first intermediate prediction block is generated from the first set of sub-blocks using a first reference list of the block, and the second intermediate prediction block is generated from the second set of sub-blocks using a second reference list of the block.

6. The method of claim 1, wherein the condition satisfied by the block is that the block is predicted using multiple-hypothesis prediction, and wherein the interleaved prediction is applied to a prediction direction for which multiple reference blocks are available.

7. The method of claim 1, wherein the condition is that the block is bi-directionally predicted, and wherein the first intermediate prediction block of the block is generated from the first set of sub-blocks using a first reference list and the second intermediate prediction block of the block is generated from the second set of sub-blocks using a second reference list, the method further comprising:

generating one or more third intermediate prediction blocks from one or more third sets of sub-blocks of the block using the first reference list;

generating one or more fourth intermediate prediction blocks from one or more fourth sets of sub-blocks of the block using the second reference list; and

wherein the prediction block is determined based on the one or more third intermediate prediction blocks and/or the one or more fourth intermediate prediction blocks.

8. The method of claim 1, wherein the condition is that the block is a multiple hypothesis coded block, the method further comprising:

generating one or more additional intermediate prediction blocks from one or more additional sets of sub-blocks of the block using the reference list used by the block; and

wherein the prediction block is determined based on the one or more additional intermediate prediction blocks.

9. The method of claim 8, wherein the prediction block is determined as an equally weighted sum of the first intermediate prediction block, the second intermediate prediction block, and the one or more additional intermediate prediction blocks.
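As a minimal sketch of the equal weighting recited in claim 9, the function below averages any number of same-sized intermediate prediction blocks sample by sample. The function name `combine_equal_weight`, the integer sample representation, and the round-to-nearest offset are assumptions made for illustration and are not taken from the claims.

```python
def combine_equal_weight(intermediate_blocks):
    """Average same-sized 2-D sample arrays, rounding to the nearest integer."""
    n = len(intermediate_blocks)
    if n == 0:
        raise ValueError("at least one intermediate prediction block is required")
    height = len(intermediate_blocks[0])
    width = len(intermediate_blocks[0][0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            total = sum(block[y][x] for block in intermediate_blocks)
            row.append((total + n // 2) // n)  # every block has weight 1/n
        result.append(row)
    return result


# Hypothetical 2x2 sample values for the first, second, and one additional
# intermediate prediction block.
p0 = [[100, 102], [104, 106]]
p1 = [[110, 108], [100, 98]]
p2 = [[90, 100], [105, 111]]
prediction = combine_equal_weight([p0, p1, p2])
print(prediction)  # [[100, 103], [103, 105]]
```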

10. The method of claim 1, wherein dividing the block into the first set of sub-blocks and dividing the block into the second set of sub-blocks are performed for motion prediction of the block in a first direction.

11. The method of any one of claims 1 to 10, further comprising:

including information on the first division mode and the second division mode in the codec representation.

12. The method of claim 11, wherein the information on the first division mode and the second division mode is included at one of: (1) a sequence level, (2) a picture level, (3) a view level, (4) a slice level, (5) a coding tree unit level, (6) an LCU level, (7) a coding unit level, (8) a prediction unit level, (10) a tree unit level, or (11) a region level.

13. The method of claim 1, wherein the codec representation contains information related to the interleaved prediction at (1) a sequence level, (2) a picture level, (3) a view level, (4) a slice level, (5) a coding tree unit (CTU) level, (6) a largest coding unit (LCU) level, (7) a coding unit (CU) level, (8) a prediction unit (PU) level, (9) a tree unit (TU) level, (10) a slice level, (11) a slice group level, or (12) a region level, wherein a region may include multiple CUs, PUs, TUs, or LCUs.

14. The method of claim 1, further comprising signaling a flag indicating whether to use the interleaved prediction if the block is affine coded.

15. The method of claim 1, wherein a flag indicates whether to use the interleaved prediction, and the flag is not signaled.

16. The method of claim 1, further comprising inheriting a flag indicating whether to use the interleaved prediction for the block from previously coded information in the codec representation.

17. The method of claim 16, comprising inheriting the flag from a neighboring block from which an affine model is inherited.

18. The method of claim 17, comprising inheriting the flags of predefined neighboring blocks, the predefined neighboring blocks including the neighboring block to the left or the neighboring block above.

19. The method of claim 16, comprising inheriting the flag from the first encountered affine-coded neighboring block.

20. The method of claim 16, comprising inferring that the interleaved prediction is not applied when no neighboring block is affine coded.

21. The method of claim 16, comprising inheriting the flag when the block applies unidirectional prediction.

22. The method of claim 16, comprising inheriting the flag when the block and a neighboring block from which the flag is inherited are located in the same CTU.

23. The method of claim 16, comprising inheriting the flag when the block and a neighboring block from which the flag is inherited are located in the same CTU row.
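Purely as an illustration, several of the flag inheritance options of claims 16 to 23 can be combined into one decision routine. The sketch below assumes a hypothetical `Neighbor` record and a fixed scan order supplied by the caller; it inherits the flag from the first affine-coded neighboring block when that block lies in the same CTU as the current block, and otherwise infers that interleaved prediction is not used. The fallback for a neighbor in a different CTU is an assumption, since the claims present these options separately.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Neighbor:
    """Hypothetical record of previously coded information for one neighboring block."""
    is_affine: bool        # whether the neighbor is affine coded
    interleave_flag: bool  # whether the neighbor used interleaved prediction
    ctu_index: int         # index of the CTU that contains the neighbor


def inherit_interleave_flag(current_ctu_index: int,
                            neighbors: Sequence[Neighbor]) -> bool:
    """Derive the flag for the current block without signaling it."""
    for nb in neighbors:  # neighbors are visited in the caller's scan order
        if nb.is_affine:
            if nb.ctu_index == current_ctu_index:
                return nb.interleave_flag  # inherit from the first affine neighbor
            return False  # assumed fallback: neighbor lies in a different CTU
    return False  # no affine-coded neighbor: infer no interleaved prediction


left = Neighbor(is_affine=False, interleave_flag=True, ctu_index=7)
above = Neighbor(is_affine=True, interleave_flag=True, ctu_index=7)
flag = inherit_interleave_flag(7, [left, above])
```

Replacing the CTU comparison with a comparison of CTU rows would give the variant of claim 23 under the same assumptions.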

24. The method of claim 1, wherein the condition comprises a width and a height of the block.

25. The method of claim 1, wherein the condition comprises that the block is not used for coding another block of the video frame.

26. A video processing apparatus comprising a processor configured to implement the method of any of claims 1 to 25.

27. A non-transitory computer readable medium having stored thereon instructions which, when executed by a processor, cause the processor to perform the method of any one of claims 1 to 25.