CN111247804A - Method and device for image processing - Google Patents
- Publication number
- CN111247804A (application number CN201980005232.7A)
- Authority
- CN
- China
- Prior art keywords
- image block
- motion vector
- sub
- cpmv
- prediction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/107—Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/13—Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/56—Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/80—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
- H04N19/82—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
Copyright notice
The disclosure of this patent document contains material that is subject to copyright protection. The copyright belongs to the copyright owner. The copyright owner has no objection to the reproduction by anyone of the patent document or the patent disclosure as it appears in the official records and files of the Patent and Trademark Office.
Technical field
The present application relates to the field of image processing, and more particularly to a method and apparatus for image processing.
Background
The general idea of inter prediction in video coding is to exploit the temporal correlation between adjacent video frames: a previously encoded reconstructed frame serves as the reference frame, and the current frame is predicted through motion estimation and motion compensation, thereby removing temporal redundancy from the video. The general flow of inter prediction comprises motion estimation (ME) and motion compensation (MC). For the current coding block of the current frame, the most similar block in the reference frame is found and used as the prediction block of the current block; the relative displacement between the current block and its similar block is the motion vector (MV). Motion estimation is the process of obtaining the motion vector by searching and comparing in the reference frame for the current coding block of the current frame. Motion compensation is the process of obtaining the predicted frame from the MV and the reference frame. The predicted frame obtained by motion compensation may differ from the original current frame, so the difference between the predicted frame and the current frame (the residual) is transformed, quantized, and so on, and then transmitted to the decoder; in addition, the MV and the reference frame information must also be transmitted to the decoder. From the MV, the reference frame, and the difference between the predicted frame and the current frame, the decoder can reconstruct the current frame.
Because natural objects move continuously, the motion vector of an object between two adjacent frames is not necessarily an integer number of pixels. Sub-pixel precision was introduced to improve the precision of motion vectors. For example, in the High Efficiency Video Coding (HEVC) standard, motion estimation of the luma component uses motion vectors with 1/4-pixel precision. However, digital video contains no samples at fractional pixel positions. In general, to achieve motion estimation with 1/K-pixel precision, the values at these fractional positions must be approximated by interpolation; that is, the reference frame is interpolated K-fold in the row and column directions, and the prediction block is searched for in the interpolated reference frame. Interpolating the current block requires the pixels in the current block and the pixels in its neighboring region.
Typically, only a traditional motion model (for example, translational motion) is considered in inter prediction. In the real world, however, there are many other kinds of motion, such as zooming, rotation, perspective motion, and other irregular motion. To account for these, VTM-3.0 introduced affine motion compensated prediction (referred to as Affine for short). In Affine mode, the affine motion field of an image block can be derived from the motion vectors of two control points (four parameters) or three control points (six parameters).
In Affine mode, motion estimation of the image processing unit can use motion vectors with 1/4-pixel precision, 1/16-pixel precision, or another sub-pixel precision. The image processing unit of the Affine technique is the sub-CU (also called a sub-block), whose size is 4×4 (in pixels), and this causes the Affine technique to generate considerable bandwidth pressure.
Summary of the invention
The present application provides an image processing method and apparatus that can reduce, to a certain extent, the bandwidth pressure caused by the Affine prediction technique.
In a first aspect, an image processing method is provided, the method comprising: obtaining a control point motion vector (CPMV) of an image block; and obtaining, according to the CPMV of the image block, a motion vector of a sub-image block in the image block, the motion vector having integer-pixel precision.
In a second aspect, an image processing apparatus is provided, the apparatus comprising: a first obtaining unit configured to obtain a control point motion vector (CPMV) of an image block; and a second obtaining unit configured to obtain, according to the CPMV of the image block obtained by the first obtaining unit, a motion vector of a sub-image block in the image block, the motion vector having integer-pixel precision.
In a third aspect, an image processing apparatus is provided, the encoding apparatus comprising a memory and a processor, wherein the memory is configured to store instructions, the processor is configured to execute the instructions stored in the memory, and execution of the instructions stored in the memory causes the processor to perform the method provided in the first aspect.
In a fourth aspect, a chip is provided, the chip comprising a processing module and a communication interface, wherein the processing module is configured to control the communication interface to communicate with the outside, and the processing module is further configured to implement the method provided in the first aspect.
In a fifth aspect, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a computer, the computer program causes the computer to implement the method of the first aspect or of any possible implementation of the first aspect.
In a sixth aspect, a computer program product comprising instructions is provided; when executed by a computer, the instructions cause the computer to implement the method provided in the first aspect.
In the solution provided by the present application, the motion vector of the sub-image block, which serves as the image processing unit, has integer-pixel precision, so the motion compensation of the sub-image block involves no sub-pixels. This can reduce, to a certain extent, the bandwidth pressure caused by the Affine prediction technique.
Brief description of the drawings
Figure 1 is a schematic diagram of a video coding architecture.
Figure 2 is a schematic diagram of 1/4-pixel interpolation.
Figures 3(a) and 3(b) are schematic diagrams of the four-parameter Affine model and the six-parameter Affine model, respectively.
Figure 4 is a schematic diagram of an Affine motion vector field.
Figure 5 compares the reference pixels required by the prior-art Affine mode and by HEVC mode.
Figure 6 is a schematic flowchart of an image processing method according to an embodiment of the present application.
Figure 7 is another schematic flowchart of an image processing method according to an embodiment of the present application.
Figure 8 is yet another schematic flowchart of an image processing method according to an embodiment of the present application.
Figure 9 is a schematic flowchart of an image processing apparatus according to an embodiment of the present application.
Figure 10 is another schematic flowchart of an image processing apparatus according to an embodiment of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of the present application. The terms used in this specification serve only to describe specific embodiments and are not intended to limit the present application.
To make the solutions of the embodiments easier to understand, several related concepts are described first.
1. Inter prediction
As shown in Figure 1, a video coding framework mainly comprises intra prediction, inter prediction, transform, quantization, entropy coding, and in-loop filtering.
The present application mainly improves the inter prediction part.
The general idea of inter prediction is to exploit the temporal correlation between adjacent video frames: a reconstructed frame serves as the reference frame, and the current frame is predicted through motion estimation (ME) and motion compensation (MC), thereby removing temporal redundancy from the video.
The current frame mentioned herein is the frame currently being encoded in an encoding scenario, and the frame currently being decoded in a decoding scenario.
The reconstructed frame mentioned herein is a previously encoded frame in an encoding scenario, and a previously decoded frame in a decoding scenario.
During encoding, a frame is not processed as a whole; it is usually divided into image blocks, which are then processed.
As an example, the frame is first divided into coding tree units (CTUs), for example of size 64×64 or 128×128 (in pixels); each CTU can then be further divided into square or rectangular coding units (CUs). Encoding operates on CUs.
All image block sizes mentioned herein are in pixels.
The general flow of inter prediction is as follows.
For the current image block in the current frame (hereinafter the current block), the most similar block in the reference frame is found and used as the prediction block of the current block. The relative displacement between the current block and the similar block is called the motion vector (MV). Motion estimation is the process of obtaining the motion vector by searching and comparing in the reference frame for the current block of the current frame. Motion compensation is the process of obtaining the prediction block from the reference block and the motion vector obtained by motion estimation.
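The search-and-compare step can be illustrated with a minimal sketch. It assumes a full search over integer displacements and the sum of absolute differences (SAD) as the matching cost; the text does not prescribe a search strategy or cost, and the names `sad`, `motion_estimate`, and `motion_compensate` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in cur
    and the block displaced by (dx, dy) in ref."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def motion_estimate(cur, ref, bx, by, n, search):
    """Full search over integer displacements in [-search, search] (assumed strategy).
    Returns the best motion vector (dx, dy) and its SAD cost."""
    h, w = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > w or y0 + n > h:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def motion_compensate(ref, bx, by, mv, n):
    """Fetch the prediction block that the motion vector points to in ref."""
    dx, dy = mv
    return [[ref[by + dy + j][bx + dx + i] for i in range(n)] for j in range(n)]
```

Real encoders typically use faster search patterns and rate-distortion costs; the exhaustive search here is only the simplest correct instance of the idea.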
The prediction block obtained by inter prediction may differ from the original current block, so the difference between the prediction block and the current block is computed; this difference is called the residual. The residual is transformed, quantized, and entropy coded to produce the coded bitstream.
At the encoder, after image encoding is complete, that is, after entropy coding has produced the bitstream, the bitstream and the coding mode information (for example, the inter prediction mode and motion vector information) are stored or sent to the decoder.
At the decoder, after the entropy-coded bitstream is obtained, it is first entropy decoded to obtain the corresponding residual; next, the prediction block is obtained from the decoded coding mode information such as the motion vector; finally, the value of each pixel in the current block is obtained from the residual and the prediction block, that is, the current block is reconstructed. Repeating this for the remaining blocks reconstructs the current frame.
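The relationship between residual, prediction block, and reconstructed block can be sketched as follows. This is a simplified illustration that skips transform, quantization, and entropy coding (so it is lossless); the function names and the 8-bit default are assumptions.

```python
def residual(cur_block, pred_block):
    """Encoder side: residual = current block - prediction block."""
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(cur_block, pred_block)]


def reconstruct(pred_block, res_block, bit_depth=8):
    """Decoder side: reconstruction = prediction + residual,
    clipped to the valid sample range for the assumed bit depth."""
    max_val = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_val) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(pred_block, res_block)]
```

With quantization in the loop, the decoded residual would only approximate the encoder's residual, so the reconstruction would be close to, but not identical to, the original block.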
As shown in Figure 1, the encoding process may also include steps such as inverse quantization and inverse transform. Inverse quantization is the reverse of quantization, and the inverse transform is the reverse of the transform.
Inter prediction mainly includes forward prediction, backward prediction, and bi-prediction. Forward prediction predicts the current frame from a previously reconstructed frame that precedes it (which may be called a historical frame). Backward prediction predicts the current frame from a frame that follows it (which may be called a future frame). Bi-prediction may be bidirectional prediction, in which both a historical frame and a future frame are used to predict the current frame. Bi-prediction may also use two references on the same side, for example two historical frames or two future frames, to predict the current frame.
2. Sub-pixel precision motion estimation
In real scenes, because natural objects move continuously, the motion vector of an object between two adjacent frames is not necessarily an integer number of pixels, so the precision of motion estimation needs to be raised to the sub-pixel level (also called 1/K-pixel precision). For example, in the HEVC standard, motion estimation of the luma component uses motion vectors with 1/4-pixel precision.
Digital video, however, contains no samples at 1/K-pixel positions. To achieve motion estimation with 1/K-pixel precision, the values at the 1/K-pixel positions are usually approximated by interpolation; in other words, the reference frame is interpolated K-fold in the row and column directions, and the search is carried out in the interpolated image. Interpolating the current block requires the pixels in the current block and the pixels in its neighboring region.
As an example, the 1/4-pixel interpolation process is shown in Figure 2. For an image block of size 8×8, 4×8, 4×4, or 8×4, the 3 pixels to the left of the block and the 4 pixels to the right of it are used to generate the pixel values of the interpolated points. As shown in Figure 2, for a 4×4 image block, a0,0 and d0,0 are 1/4-pixel points, b0,0 and h0,0 are half-pixel points, and c0,0 and n0,0 are 3/4-pixel points. Suppose the current block is the 2×2 block enclosed by A0,0 to A1,0 and A0,0 to A0,1. Computing all interpolated points in this 2×2 block requires some points outside the block: 3 to the left, 4 to the right, 3 above, and 4 below.
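The "3 to the left, 4 to the right" requirement follows from using an 8-tap interpolation filter. The sketch below uses the HEVC luma half-sample filter coefficients to compute one horizontal half-pixel value and to count the reference samples a block needs; the helper names and the 8-bit clipping are illustrative assumptions, not part of the patent text.

```python
# HEVC luma half-sample interpolation filter (coefficients sum to 64).
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def half_pel_row(row, x):
    """Half-sample value between row[x] and row[x + 1].
    Needs 3 integer samples to the left of x and 4 to the right of it."""
    if x - 3 < 0 or x + 4 >= len(row):
        raise ValueError("not enough neighbouring samples for an 8-tap filter")
    window = row[x - 3 : x + 5]  # 8 samples: x-3 .. x+4
    acc = sum(t * s for t, s in zip(HALF_PEL_TAPS, window))
    # Normalise by 64 with rounding, then clip to the 8-bit range (assumed bit depth).
    return min(max((acc + 32) >> 6, 0), 255)


def reference_samples(block_w, block_h, taps=8):
    """Integer samples fetched to interpolate a block in both directions:
    the block itself plus (taps - 1) extra rows and columns."""
    return (block_w + taps - 1) * (block_h + taps - 1)
```

For a 4×4 block this gives 11×11 = 121 integer samples for 16 predicted samples, which is why small blocks are comparatively expensive in memory bandwidth.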
3. Affine motion compensated prediction (hereinafter Affine)
Affine is an inter prediction technique.
In the HEVC standard, inter prediction considers only a traditional motion model (for example, translational motion). In the real world, however, there are many other kinds of motion, such as zooming, rotation, perspective motion, and other irregular motion. To account for these, VTM-3.0 introduced the Affine technique.
As shown in Figure 3, the motion field of an Affine-mode block can be derived from the motion vectors of two control points (four parameters, Figure 3(a)) or three control points (six parameters, Figure 3(b)).
Hereinafter, the MV of a control point (control point motion vector) is abbreviated as CPMV.
The processing unit of Affine is not the CU but the sub-blocks (sub-CUs) obtained by dividing the CU; each sub-CU has a size of 4×4. In Affine mode, each sub-CU has one MV. Unlike an ordinary CU, therefore, an Affine-mode CU has more than one MV: it has as many MVs as it has sub-CUs.
As an example, the MVs of the sub-CUs in a CU are derived from the CPMVs of the two or three control points shown in Figure 3. For example, for the four-parameter Affine motion model, the MV of the sub-CU at position (x, y) is calculated by the following formula:
As another example, for the six-parameter Affine motion model, the MV of the sub-CU at position (x, y) is calculated by the following formula:
Here (mv0x, mv0y) is the MV of the top-left control point, (mv1x, mv1y) is the MV of the top-right control point, and (mv2x, mv2y) is the MV of the bottom-left control point. In the formulas above, W is the width and H is the height of the CU that contains the sub-CU.
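The two formula images referred to above did not survive extraction. For orientation, the four- and six-parameter affine models as they are commonly written for VVC, using the symbols defined in this passage, are sketched below. The exact typesetting in the original filing may differ, and the numbering (1) and (2) is assumed from the later references to formula (1) and formula (2).

```latex
% Formula (1): four-parameter model (two control points), assumed standard form
\begin{aligned}
mv_x &= \frac{mv_{1x}-mv_{0x}}{W}\,x - \frac{mv_{1y}-mv_{0y}}{W}\,y + mv_{0x} \\
mv_y &= \frac{mv_{1y}-mv_{0y}}{W}\,x + \frac{mv_{1x}-mv_{0x}}{W}\,y + mv_{0y}
\end{aligned}
\tag{1}

% Formula (2): six-parameter model (three control points), assumed standard form
\begin{aligned}
mv_x &= \frac{mv_{1x}-mv_{0x}}{W}\,x + \frac{mv_{2x}-mv_{0x}}{H}\,y + mv_{0x} \\
mv_y &= \frac{mv_{1y}-mv_{0y}}{W}\,x + \frac{mv_{2y}-mv_{0y}}{H}\,y + mv_{0y}
\end{aligned}
\tag{2}
```

In both forms, substituting (x, y) = (0, 0) returns the top-left CPMV and (x, y) = (W, 0) returns the top-right CPMV; in the six-parameter form, (x, y) = (0, H) additionally returns the bottom-left CPMV.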
After the calculation of formula (1) above, the motion vectors in a CU are as illustrated in Figure 4, where each square represents a 4×4 sub-CU. After this calculation, the MVs of all sub-CUs are converted to a 1/16-pixel-precision representation; that is, the highest precision of a sub-CU MV is 1/16 pixel.
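The derivation of a per-sub-CU motion field can be sketched as follows. The sketch assumes the commonly used VVC form of the four- and six-parameter affine models, evaluates each sub-CU at its centre (an assumption; the text only says "the sub-CU at position (x, y)"), and uses floating point for clarity rather than the fixed-point arithmetic of a real codec. The function name and argument layout are hypothetical.

```python
def affine_subblock_mvs(cpmvs, width, height, sub=4, frac_bits=4):
    """Derive one MV per sub x sub sub-CU from 2 or 3 CPMVs.

    cpmvs: [(mv0x, mv0y), (mv1x, mv1y)] or with a third (mv2x, mv2y), in pixels.
    Returns {(x0, y0): (mvx, mvy)} with MVs in 1/(1 << frac_bits)-pixel units
    (1/16 pixel by default), keyed by the sub-CU's top-left corner.
    """
    (mv0x, mv0y), (mv1x, mv1y) = cpmvs[0], cpmvs[1]
    a = (mv1x - mv0x) / width   # d(mvx)/dx
    b = (mv1y - mv0y) / width   # d(mvy)/dx
    if len(cpmvs) == 3:
        mv2x, mv2y = cpmvs[2]
        c = (mv2x - mv0x) / height  # d(mvx)/dy, six-parameter model
        d = (mv2y - mv0y) / height  # d(mvy)/dy
    else:
        c, d = -b, a                # four-parameter model: rotation + zoom only
    scale = 1 << frac_bits
    field = {}
    for y0 in range(0, height, sub):
        for x0 in range(0, width, sub):
            # Assumption: evaluate the model at the centre of the sub-CU.
            x, y = x0 + sub / 2, y0 + sub / 2
            mvx = a * x + c * y + mv0x
            mvy = b * x + d * y + mv0y
            field[(x0, y0)] = (round(mvx * scale), round(mvy * scale))
    return field
```

For a 16×16 CU this yields 16 motion vectors from at most three transmitted CPMVs, and the results are generally fractional even when the CPMVs themselves are integers.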
After the MV of each sub-CU has been calculated, the prediction block of each sub-CU is obtained through motion compensation. The sub-CU size is 4×4 for both the chroma and luma components; the motion vector of a 4×4 chroma block is obtained by averaging the motion vectors of its four corresponding 4×4 luma blocks.
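The chroma averaging can be sketched as below. The text states only that four luma MVs are averaged; the rounding rule (add 2, then shift right by 2) and the function name are assumptions made for illustration.

```python
def chroma_mv(luma_mvs):
    """Average the MVs of the four 4x4 luma sub-CUs that correspond to one
    4x4 chroma block. MVs are integer pairs in 1/16-pixel units.
    Rounding rule is an assumption: (sum + 2) >> 2."""
    if len(luma_mvs) != 4:
        raise ValueError("expected exactly four luma motion vectors")
    sum_x = sum(mv[0] for mv in luma_mvs)
    sum_y = sum(mv[1] for mv in luma_mvs)
    return (sum_x + 2) >> 2, (sum_y + 2) >> 2
```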
在Affine模式的编码过程中,在码流中写入CPMV信息,不需要写入每个sub-CU的MV信息。In the encoding process of the Affine mode, the CPMV information is written in the code stream, and the MV information of each sub-CU does not need to be written.
4、自适应运动矢量精度(Adaptive Motion Vector Resolution,AMVR)4. Adaptive Motion Vector Resolution (AMVR)
AMVR技术可以使得CU具有整像素精度和亚像素精度的运动矢量。整像素精度例如可以为1像素精度、2像素精度等。亚像素精度例如可以为1/2像素精度、1/4像素精度、1/8像素精度或1/16像素精度等。AMVR technology can enable the CU to have integer-pixel precision and sub-pixel precision motion vectors. The integer pixel precision may be, for example, 1 pixel precision, 2 pixel precision, or the like. The sub-pixel precision may be, for example, 1/2 pixel precision, 1/4 pixel precision, 1/8 pixel precision, or 1/16 pixel precision.
例如,对于每一个采用Affine AMVR技术的CU(有些情况下可能CU不采用AffineAMVR),在编码端自适应地决策其对应的MV精度,并将决策的结果写进码流传递到解码端。For example, for each CU that uses Affine AMVR technology (in some cases, the CU may not use Affine AMVR), the encoding end adaptively determines its corresponding MV precision, and writes the decision result into the code stream and transmits it to the decoding end.
Affine AMVR技术中提及的整像素精度或亚像素精度指的是CPMV的像素精度,而不是sub-CU的像素精度。The integer pixel precision or sub-pixel precision mentioned in Affine AMVR technology refers to the pixel precision of CPMV, not the pixel precision of sub-CU.
对于整像素的CPMV,CU的运动估计的过程都是整像素的过程,但是经过上述公式(1)或公式(2)计算之后得到的sub-CU的MV可能是1/4像素精度或其它亚像素精度。For the CPMV of an integer pixel, the motion estimation process of the CU is the process of the whole pixel, but the MV of the sub-CU obtained after the calculation of the above formula (1) or formula (2) may be 1/4 pixel precision or other sub-CUs. pixel precision.
如果sub-CU的MV是亚像素精度,则sub-CU的运动补偿过程会涉及到亚像素,且由于sub-CU的大小为4×4,这会使得Affine预测过程产生较大的带宽压力。If the MV of the sub-CU is sub-pixel precision, the motion compensation process of the sub-CU will involve sub-pixels, and since the size of the sub-CU is 4×4, this will cause a large bandwidth pressure to the Affine prediction process.
申请人在VVC最新的参考软件VTM-4.0上,选取官方通测数据作为测试序列,进行了仿真,仿真结果如图5所示。On VVC's latest reference software VTM-4.0, the applicant selected the official pass-through test data as the test sequence, and carried out a simulation. The simulation results are shown in Figure 5.
如图5所示,左侧方框表示HEVC最坏情况(1/4像素精度的MV)为8×8的双向帧间预测CU,所需的参考像素点的个数为(8+7)×(8+7)×2=450。右侧方框表示VVC的Affine模式下最坏情况(1/16和1/4像素精度MV)下的4×4双向帧间预测的CU,所需参考像素点的个数为(4+7)×(4+7)×2×4=968。As shown in Figure 5, the left box represents the worst-case (MV of 1/4 pixel precision) of HEVC bi-directional inter-prediction CU of 8×8, and the required number of reference pixels is (8+7) ×(8+7)×2=450. The box on the right represents the CU of 4×4 bidirectional inter-frame prediction in the worst case (1/16 and 1/4 pixel precision MV) in the Affine mode of VVC, and the number of required reference pixels is (4+7 )×(4+7)×2×4=968.
从图5可知,现有的Affine模式,相比于HEVC,增加了115%的参考像素点,造成了较大的带宽压力。It can be seen from Fig. 5 that, compared with HEVC, the existing Affine mode increases the reference pixels by 115%, which causes a large bandwidth pressure.
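The sample counts quoted for Figure 5 can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name and parameters are hypothetical, the 7 extra rows and columns are taken from the figures above and are assumed to come from an 8-tap interpolation filter, and the factor of 4 is assumed to reflect the four 4×4 sub-CUs that cover one 8×8 area.

```python
def reference_samples(block_w, block_h, taps=8, directions=2, blocks=1):
    """Reference samples fetched for interpolated motion compensation.

    taps: assumed interpolation filter length (8 taps -> 7 extra rows/columns).
    directions: 2 for bi-directional prediction.
    blocks: number of independently compensated blocks.
    """
    ext = taps - 1
    return (block_w + ext) * (block_h + ext) * directions * blocks


hevc = reference_samples(8, 8)               # one 8x8 bi-predicted CU
affine = reference_samples(4, 4, blocks=4)   # four 4x4 bi-predicted sub-CUs
increase = (affine - hevc) / hevc

print(hevc, affine, f"{increase:.0%}")
```

Under these assumptions the script prints 450, 968 and 115%, matching the numbers given for Figure 5.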
针对上述问题,本申请提出一种图像处理的方法与装置,可以在一定程度上减小Affine技术产生的带宽压力。In view of the above problems, the present application proposes an image processing method and apparatus, which can reduce the bandwidth pressure generated by the Affine technology to a certain extent.
本申请适用于数字视频编码技术领域,具体用于视频编解码器的帧间预测部分。本申请可以应用于符合国际视频编码标准H.264/HEVC和中国AVS2标准等的编解码器,以及符合下一代视频编码标准VVC或AVS3等的编解码器。The present application applies to the technical field of digital video coding, and is specifically used for the inter-frame prediction part of a video codec. The present application can be applied to codecs conforming to international video coding standards H.264/HEVC and Chinese AVS2 standards, etc., as well as codecs conforming to next-generation video coding standards VVC or AVS3, etc.
本申请可以应用于视频编解码器的帧间预测部分,也就是说,根据本申请实施例的图像处理的方法可以由编码装置执行,也可以由解码装置执行。The present application may be applied to the inter-frame prediction part of a video codec, that is, the image processing method according to the embodiment of the present application may be performed by an encoding apparatus or a decoding apparatus.
图6为本申请提供的图像处理的方法600的示意性流程图,该方法600包括如下步骤。FIG. 6 is a schematic flowchart of a method 600 for image processing provided by the present application. The method 600 includes the following steps.
610,获取图像块的控制点的运动矢量(CPMV)。610. Obtain the motion vector (CPMV) of the control point of the image block.
下文将描述获取图像块的CPMV的方式,这里暂不描述。The manner of acquiring the CPMV of the image block will be described below, which will not be described here for the time being.
620,根据该图像块的CPMV,获取图像块中子图像块的运动矢量,该运动矢量为整像素精度。620. Acquire a motion vector of a sub-image block in the image block according to the CPMV of the image block, where the motion vector is of integer pixel precision.
换句话说,基于该图像块的CPMV,获取该图像块中的子图像块的运动矢量,并使得该子图像块的运动矢量的像素精度为整像素精度。In other words, based on the CPMV of the image block, the motion vector of the sub-image block in the image block is obtained, and the pixel precision of the motion vector of the sub-image block is made to be an integer pixel precision.
本申请中提及的子图像块表示图像处理或视频处理的处理单元。该子图像块的宽和/或高可以小于8像素。例如,子图像块的大小为4×4(像素)。Sub-image blocks referred to in this application represent processing units of image processing or video processing. The width and/or height of the sub-image block may be less than 8 pixels. For example, the size of the sub-image block is 4×4 (pixels).
子图像块可以是通过划分图像块得到的块。可以理解到,若图像块与子图像块的大小相同,则子图像块可以认为就是图像块本身。The sub image block may be a block obtained by dividing the image block. It can be understood that if the size of the image block and the sub-image block are the same, the sub-image block can be regarded as the image block itself.
子图像块可以是方形的块、例如大小为4×4或8×8的块,也可以是矩形的块,例如大小为2×4或4×8的块。The sub-image block may be a square block, such as a block with a size of 4×4 or 8×8, or a rectangular block, such as a block with a size of 2×4 or 4×8.
本申请中提及的图像块的大小可以为16×16、16×8、16×4、8×16、4×8、8×8、8×4、4×8等其它尺寸。The size of the image block mentioned in this application may be 16×16, 16×8, 16×4, 8×16, 4×8, 8×8, 8×4, 4×8 and other sizes.
应理解,作为处理单元的子图像块的运动矢量为整像素精度,因此,子图像块的运动补偿过程不会涉及到亚像素,从而可以降低视频帧间预测过程产生的带宽压力。It should be understood that the motion vector of the sub-image block as the processing unit is of integer pixel precision. Therefore, the motion compensation process of the sub-image block does not involve sub-pixels, so that the bandwidth pressure generated by the video inter-frame prediction process can be reduced.
根据图像块的CPMV,获取图像块中子图像块的运动矢量的过程可以包括:根据该图像块的两个或三个控制点的运动矢量,计算获得子图像块的运动矢量,并使得所获得的子图像块的运动矢量的像素精度为整像素精度。The process of obtaining the motion vector of a sub-image block from the CPMV of the image block may include: calculating the motion vector of the sub-image block from the motion vectors of two or three control points of the image block, such that the pixel precision of the resulting motion vector is integer-pixel precision.
作为示例,可以根据前文描述的公式(1)或公式(2),计算得到子图像块的运动矢量。As an example, the motion vector of the sub-image block can be calculated according to the formula (1) or formula (2) described above.
可选地,在一些实施例中,如果直接基于图像块的CPMV计算得到的子图像块的运动矢量的像素精度为整像素精度,则这个运动矢量就是本申请要获取的子图像块的运动矢量。Optionally, in some embodiments, if the motion vector of the sub-image block calculated directly from the CPMV of the image block already has integer-pixel precision, that motion vector is the sub-image block motion vector to be obtained in the present application.
例如,作为一种可能的实现方式,采用一种算法,根据图像块的CPMV计算子图像块的运动矢量,该算法可以保证计算出的子图像块的运动矢量的像素精度为整像素。For example, as a possible implementation manner, an algorithm is used to calculate the motion vector of the sub-image block according to the CPMV of the image block, and the algorithm can ensure that the pixel precision of the calculated motion vector of the sub-image block is an integer pixel.
可选地,在一些实施例中,如果直接基于图像块的CPMV,计算得到的子图像块的运动矢量的像素精度为亚像素精度,例如,1/4像素精度、1/8像素精度或1/16像素精度,则还需要对当前计算得到的运动矢量进行处理,使其由亚像素精度变为整像素精度。Optionally, in some embodiments, if the motion vector of the sub-image block calculated directly from the CPMV of the image block has sub-pixel precision, for example 1/4-pixel, 1/8-pixel or 1/16-pixel precision, the calculated motion vector must be further processed to convert it from sub-pixel precision to integer-pixel precision.
可选地,步骤620包括如下步骤1)和步骤2)。Optionally, step 620 includes the following steps 1) and 2).
1)根据图像块的CPMV,计算子图像块的第一运动矢量,第一运动矢量为亚像素精度。1) Calculate the first motion vector of the sub-image block according to the CPMV of the image block, and the first motion vector is of sub-pixel precision.
例如,根据前文描述的公式(1)或公式(2),基于CPMV计算子图像块的第一运动矢量,计算得到的第一运动矢量的像素精度为亚像素。For example, according to the formula (1) or formula (2) described above, the first motion vector of the sub-image block is calculated based on the CPMV, and the pixel precision of the calculated first motion vector is sub-pixel.
2)将第一运动矢量处理为整像素精度的第二运动矢量。2) Process the first motion vector into a second motion vector with integer pixel precision.
作为步骤2)的一种可能的实现方式:根据子图像块的第一运动矢量,获取第二运动矢量,使得第二运动矢量的终点为与第一运动矢量的终点最接近的整像素点。As a possible implementation manner of step 2): obtain the second motion vector according to the first motion vector of the sub-image block, so that the end point of the second motion vector is the closest integer pixel to the end point of the first motion vector.
例如,最接近的整像素点可以是第一运动矢量的终点的上方、下方、左方或右方的整像素点。For example, the closest integer pixel point may be an integer pixel point above, below, to the left or to the right of the end point of the first motion vector.
作为一个示例,通过如下公式,根据子图像块的第一运动矢量(MV1x,MV1y),计算得到该子图像块的第二运动矢量(MV2x,MV2y)。As an example, through the following formula, the second motion vector (MV2x, MV2y) of the sub-image block is obtained by calculation according to the first motion vector (MV1x, MV1y) of the sub-image block.
若MV1x>=0,MV2x=((MV1x+(1<<(shift-1)))>>shift)<<shift;If MV1x>=0, MV2x=((MV1x+(1<<(shift-1)))>>shift)<<shift;
若MV1x<0,MV2x=-((-MV1x+(1<<(shift-1)))>>shift)<<shift;If MV1x<0, MV2x=-((-MV1x+(1<<(shift-1)))>>shift)<<shift;
若MV1y>=0,MV2y=((MV1y+(1<<(shift-1)))>>shift)<<shift;If MV1y>=0, MV2y=((MV1y+(1<<(shift-1)))>>shift)<<shift;
若MV1y<0,MV2y=-((-MV1y+(1<<(shift-1)))>>shift)<<shift,If MV1y<0, MV2y=-((-MV1y+(1<<(shift-1)))>>shift)<<shift,
式(3)。Formula (3).
其中,shift的取值与编码软件平台中运动矢量的存储精度有关。例如,在当前的VTM-4.0参考软件中,运动矢量的存储精度为1/16精度,则可以将shift的取值设置为4。The value of shift is related to the storage precision of the motion vector in the encoding software platform. For example, in the current VTM-4.0 reference software, the storage precision of the motion vector is 1/16 precision, so the value of shift can be set to 4.
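Formula (3) can be read as rounding each component to the nearest integer pixel, with the sign handled separately so that positive and negative values round symmetrically. The following is a minimal sketch, assuming motion vector components are stored as integers in units of 1/16 pixel (shift = 4, as in the VTM-4.0 example above); the function name is hypothetical.

```python
def round_to_integer_pel(mv, shift=4):
    """Round one MV component to the nearest integer pixel, as in formula (3).

    mv is an integer in units of 1/(1 << shift) pixel. Ties are rounded
    away from zero because the sign is split off before the shift.
    """
    offset = 1 << (shift - 1)  # half a pixel in storage units
    if mv >= 0:
        return ((mv + offset) >> shift) << shift
    return -(((-mv + offset) >> shift) << shift)


# (MV1x, MV1y) -> (MV2x, MV2y)
mv1 = (23, -24)
mv2 = (round_to_integer_pel(mv1[0]), round_to_integer_pel(mv1[1]))
```

With shift = 4, a component of 23 (1 and 7/16 pixels) maps to 16 (1 pixel), and -24 (minus 1.5 pixels) maps to -32 (minus 2 pixels), so `mv2` is `(16, -32)`.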
作为另一个示例,通过如下公式,根据子图像块的第一运动矢量(MV1x,MV1y),获得该子图像块的第二运动矢量(MV2x,MV2y)。As another example, the second motion vector (MV2x, MV2y) of the sub-image block is obtained according to the first motion vector (MV1x, MV1y) of the sub-image block by the following formula.
若MV1x>=0,MV2x=(MV1x>>shift)<<shift;If MV1x>=0, MV2x=(MV1x>>shift)<<shift;
若MV1x<0,MV2x=-(((-MV1x)>>shift)<<shift);If MV1x<0, MV2x=-(((-MV1x)>>shift)<<shift);
若MV1y>=0,MV2y=(MV1y>>shift)<<shift;If MV1y>=0, MV2y=(MV1y>>shift)<<shift;
若MV1y<0,MV2y=-(((-MV1y)>>shift)<<shift), 式(4)。If MV1y<0, MV2y=-(((-MV1y)>>shift)<<shift), formula (4).
其中,shift的含义与前文描述的shift的含义一致。The meaning of shift is the same as the meaning of shift described above.
公式(3)和公式(4)中的“<<”表示左移,“>>”表示右移。"<<" in formula (3) and formula (4) means left shift, and ">>" means right shift.
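Formula (4) drops the fractional part instead of rounding, so the result is truncated toward zero. The sketch below assumes the same 1/16-pixel integer storage (shift = 4); the function name is hypothetical. Splitting off the sign matters here: in languages with an arithmetic right shift, shifting a negative value directly would round toward negative infinity rather than toward zero.

```python
def truncate_to_integer_pel(mv, shift=4):
    """Truncate one MV component toward zero to integer-pixel precision,
    as in formula (4). mv is an integer in units of 1/(1 << shift) pixel."""
    if mv >= 0:
        return (mv >> shift) << shift
    return -(((-mv) >> shift) << shift)


positive = truncate_to_integer_pel(31)    # 1 and 15/16 pixels
negative = truncate_to_integer_pel(-31)   # minus 1 and 15/16 pixels
naive = (-31 >> 4) << 4                   # direct shift, for comparison only
```

Here `positive` is 16 and `negative` is -16, whereas the direct shift in `naive` gives -32 because Python's `>>` floors negative integers.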
需要说明的是,本申请对运动矢量的像素精度由亚像素级别转换为整像素级别的方式不作限定。例如,还可以根据其它可行的从亚像素到整像素的变换算法,根据第一运动矢量获得整像素精度的第二运动矢量。It should be noted that the present application does not limit the manner in which the pixel precision of the motion vector is converted from the sub-pixel level to the integer pixel level. For example, the second motion vector with integer pixel precision can also be obtained according to the first motion vector according to other feasible transformation algorithms from subpixels to integer pixels.
当前Affine技术中处理的最小CU(对应本申请实施例中的图像块)的大小为16×16时,在运动估计的过程中不会带来带宽的压力,因此,对运动估计过程不需要进行修改。这种情形下,图像块的CPMV的像素精度可能为整像素,也可能为亚像素。若图像块的CPMV的像素精度为亚像素,则根据图像块的CPMV计算得到的子图像块的运动矢量的像素精度也为亚像素;若图像块的CPMV的像素精度为整像素,根据图像块的CPMV计算得到的子图像块的运动矢量的像素精度也有可能为亚像素,例如,根据公式(1)或公式(2)计算得到的子图像块的运动矢量的像素精度可能是亚像素。When the smallest CU processed in the current Affine technology (corresponding to the image block in the embodiments of the present application) is 16×16, motion estimation does not create bandwidth pressure, so the motion estimation process does not need to be modified. In this case, the CPMV of the image block may have integer-pixel or sub-pixel precision. If the CPMV has sub-pixel precision, the sub-image block motion vectors calculated from it also have sub-pixel precision; if the CPMV has integer-pixel precision, the calculated sub-image block motion vectors may still have sub-pixel precision, for example when they are calculated according to formula (1) or formula (2).
上述可知,现有的Affine技术中,子图像块,即处理单元的运动矢量的像素精度可能为亚像素,这会导致运动补偿过程涉及亚像素,会增加Affine技术的带宽压力。As can be seen from the above, in the existing Affine technology, the pixel precision of the sub-image block, that is, the motion vector of the processing unit, may be sub-pixel, which will cause the motion compensation process to involve sub-pixels, which will increase the bandwidth pressure of the Affine technology.
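The following sketch illustrates why integer-pixel CPMVs can still produce sub-pixel sub-block motion vectors, and how steps 1) and 2) combine. It is an illustration under stated assumptions, not the method of the application: formulas (1) and (2) are not reproduced in this section, so the commonly used four-parameter affine model is assumed in their place; motion vectors are assumed to be stored in 1/16-pixel units; the sub-block centre is assumed as the sample position; and all function names are hypothetical.

```python
SHIFT = 4  # assumed 1/16-pixel storage precision


def affine_sub_block_mv(cpmv0, cpmv1, width, x, y):
    """Step 1: first MV of the sub-block at position (x, y), using an
    assumed four-parameter affine model with two control points."""
    a = cpmv1[0] - cpmv0[0]
    b = cpmv1[1] - cpmv0[1]
    mvx = cpmv0[0] + (a * x - b * y) // width
    mvy = cpmv0[1] + (b * x + a * y) // width
    return mvx, mvy


def round_to_integer_pel(mv, shift=SHIFT):
    """Step 2: round one component to the nearest integer pixel (formula (3))."""
    offset = 1 << (shift - 1)
    if mv >= 0:
        return ((mv + offset) >> shift) << shift
    return -(((-mv + offset) >> shift) << shift)


# Both CPMVs are integer-pixel (multiples of 16) for a 16-wide block.
cpmv0, cpmv1 = (16, 0), (32, 0)
first_mv = affine_sub_block_mv(cpmv0, cpmv1, width=16, x=2, y=2)
second_mv = tuple(round_to_integer_pel(c) for c in first_mv)
```

In this example `first_mv` is `(18, 2)`, which is sub-pixel even though both CPMVs are integer-pixel, and `second_mv` is `(16, 0)`.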
本申请提供的方案,通过使作为图像处理单元的子图像块的运动矢量为整像素精度,可以使子图像块的运动补偿过程不涉及亚像素,从而在一定程度上可以降低Affine预测技术产生的带宽压力。In the solution provided by the present application, the motion vector of the sub-image block, which serves as the image processing unit, is given integer-pixel precision, so the motion compensation of the sub-image block does not involve sub-pixels. This reduces, to a certain extent, the bandwidth pressure generated by the Affine prediction technology.
应理解,通过扩大作为处理单元的子图像块的大小,在一定程度上也可以缓解带宽压力的问题,但是,这样会降低图像压缩性能。本申请通过将作为处理单元的子图像块的运动矢量处理为整像素精度,可以保证整像素精度的运动补偿,从而一方面可以解决带宽压力的问题,另一方面也可以保证较好的图像压缩性能。It should be understood that enlarging the sub-image block used as the processing unit can also relieve bandwidth pressure to some extent, but doing so reduces image compression performance. By processing the motion vector of the sub-image block to integer-pixel precision, the present application guarantees integer-pixel motion compensation, which addresses the bandwidth pressure while still maintaining good image compression performance.
可以根据本申请提供的方案,对现有的Affine技术进行改进,即将Affine模式下的Sub-CU的运动矢量处理为整像素精度,从而可以降低Affine技术产生的带宽压力。According to the solution provided in this application, the existing Affine technology can be improved, that is, the motion vector of the Sub-CU in the Affine mode is processed into integer pixel precision, so that the bandwidth pressure generated by the Affine technology can be reduced.
除了可以应用于Affine技术之外,本申请提供的方案也可以应用于将来可能出现的其它类似的技术中,例如,运动矢量的像素精度包括整像素精度与亚像素精度,且图像处理单元的尺寸较小,例如,4×4。In addition to the Affine technology, the solution provided by the present application can also be applied to other similar technologies that may appear in the future, for example technologies in which motion vectors may have either integer-pixel or sub-pixel precision and the image processing unit is small, such as 4×4.
应理解,本申请提供的方案,可用于提升压缩视频质量,提升编解码器的硬件友好性,对广播电视、电视会议、网络视频等视频的压缩处理具有重要意义。It should be understood that the solution provided by this application can be used to improve the quality of compressed video and improve the hardware friendliness of codecs, and is of great significance to the compression processing of videos such as broadcast television, video conferences, and online videos.
可选地,在一些实施例中,本申请实施例提供的方法还包括:将该图像块的CPMV处理为整像素精度。Optionally, in some embodiments, the methods provided by the embodiments of the present application further include: processing the CPMV of the image block to integer pixel precision.
本实施例可以保证图像块的CPMV为整像素精度。This embodiment can ensure that the CPMV of the image block is of integer pixel precision.
下文将描述将该图像块的CPMV处理为整像素精度的实施方式。An embodiment of processing the CPMV of the image block to integer pixel precision will be described below.
可选地,如图7所示,在一些实施例中,步骤610包括如下步骤611、步骤612和步骤613。Optionally, as shown in FIG. 7, in some embodiments, step 610 includes the following steps 611, 612 and 613.
611,获取该图像块的运动信息候选列表。611. Obtain a motion information candidate list of the image block.
例如,获取该图像块的空域和/或时域邻近块的运动矢量,基于这些邻近块的运动矢量,构建该图像块的运动信息候选列表。For example, the motion vectors of adjacent blocks in the spatial domain and/or the temporal domain of the image block are obtained, and based on the motion vectors of these adjacent blocks, a motion information candidate list of the image block is constructed.
612,将该运动信息候选列表中的运动矢量处理为整像素精度。612. Process the motion vectors in the motion information candidate list into integer pixel precision.
例如,可以采用前文描述的公式(3)或公式(4),将该运动信息候选列表中的运动矢量处理为整像素精度。For example, the above-described formula (3) or formula (4) can be used to process the motion vectors in the motion information candidate list to integer pixel precision.
邻近块指的是用于构建该图像块的运动信息候选列表的邻近块,例如,时域和/或空域上的邻近块。本申请对于确定邻近块的方式不作限定。Neighboring blocks refer to neighboring blocks used to construct the motion information candidate list of the image block, eg, neighboring blocks in the temporal and/or spatial domains. The present application does not limit the manner of determining adjacent blocks.
613,根据所述运动信息候选列表中处理为整像素精度的运动矢量,获取所述图像块的CPMV。613. Obtain the CPMV of the image block according to the motion vector in the motion information candidate list that is processed as an integer pixel precision.
Affine帧间预测模式可以分为Affine merge模式和Affine inter模式。Affine inter prediction mode can be divided into Affine merge mode and Affine inter mode.
图7所示实施例可以应用于Affine inter模式,也可以应用于Affine merge模式。The embodiment shown in FIG. 7 can be applied to the Affine inter mode and also can be applied to the Affine merge mode.
可选地,在如图7所示的实施例中,该图像块的帧间预测方式为Affine merge模式。Optionally, in the embodiment shown in FIG. 7 , the inter-frame prediction mode of the image block is an Affine merge mode.
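在Affine merge模式下,可以从运动信息候选列表选择一个CPMV直接作为该图像块的CPMV。即步骤613包括:从该图像块的运动信息候选列表中选择一个CPMV作为该图像块的CPMV。In the Affine merge mode, a CPMV can be selected from the motion information candidate list and used directly as the CPMV of the image block. That is, step 613 includes: selecting a CPMV from the motion information candidate list of the image block as the CPMV of the image block.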
因为用于构建运动信息候选列表的邻近块的运动矢量被处理为整像素精度,因此,从运动信息候选列表选择CPMV直接作为该图像块的CPMV,可以保证该图像块的CPMV为整像素。Because the motion vectors of adjacent blocks used to construct the motion information candidate list are processed to integer pixel precision, selecting the CPMV from the motion information candidate list directly as the CPMV of the image block can ensure that the CPMV of the image block is an integer pixel.
作为示例,Affine merge模式的帧间预测的大致流程包括如下步骤。在本示例中,以图像块为CU为例。As an example, the general flow of inter-frame prediction in Affine merge mode includes the following steps. In this example, the image block is taken as the CU as an example.
步骤1-1,从空域临近块和/或时域临近块获取邻近块的运动矢量(MV)。此过程会获取到Affine模式的邻近块的MV以及传统模式的邻近块的MV,根据这些邻近块的MV组合得到CPMVs,并由这些CPMVs构建该CU的运动信息候选列表。Step 1-1, obtain the motion vector (MV) of the adjacent block from the adjacent block in the spatial domain and/or the adjacent block in the temporal domain. In this process, MVs of adjacent blocks in Affine mode and MVs of adjacent blocks in traditional mode are obtained, CPMVs are obtained by combining the MVs of these adjacent blocks, and a motion information candidate list of the CU is constructed from these CPMVs.
步骤1-2,将该CU的运动信息候选列表中的运动矢量,处理为整像素精度。Step 1-2: Process the motion vectors in the motion information candidate list of the CU into integer pixel precision.
步骤1-3,从运动信息候选列表中选择一个组合(该组合中可能包含两个或者三个CPMV,代表两个控制点和三个控制点的CPMV),作为CU的CPMVs。Step 1-3, select a combination from the motion information candidate list (this combination may include two or three CPMVs, representing CPMVs of two control points and three control points) as the CPMVs of the CU.
在Affine merge模式中,将运动信息候选列表中选出的CPMVs作为当前CU的CPMVs,不需要进行运动估计,也不存在Affine inter模式中的MVD的概念(下文将描述)。也就是说,在Affine merge模式中,只需要将从运动信息候选列表中选出的CPMVs的索引(一个CU只需要写一个索引)写入码流,不需要传输MVD。In the Affine merge mode, the CPMVs selected in the motion information candidate list are used as the CPMVs of the current CU, no motion estimation is required, and there is no concept of MVD in the Affine inter mode (described below). That is to say, in the Affine merge mode, only the index of the CPMVs selected from the motion information candidate list (one CU only needs to write one index) needs to be written into the code stream, and there is no need to transmit the MVD.
关于步骤1-1中提及的邻近块,该临近块的帧间预测模式可以是传统的帧间预测模式也可能是affine模式,因此从临近块获取到的MV可能是整像素精度也可能是亚像素精度。Regarding the adjacent blocks mentioned in step 1-1, the inter prediction mode of an adjacent block may be either the conventional inter prediction mode or the affine mode, so the MV obtained from an adjacent block may have either integer-pixel or sub-pixel precision.
本实施例通过将当前图像块的邻近块的运动矢量处理为整像素精度,从而可以保证该图像块的CPMV为整像素精度。In this embodiment, the motion vectors of the adjacent blocks of the current image block are processed to the integer pixel precision, so that the CPMV of the image block can be guaranteed to be the integer pixel precision.
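Steps 1-1 to 1-3 of the Affine merge flow can be sketched as follows. This is a simplified illustration, assuming 1/16-pixel storage (shift = 4), the rounding of formula (3), and a candidate list that has already been assembled from neighbouring blocks; the candidate values, the merge index and all function names are hypothetical.

```python
def round_to_integer_pel(mv, shift=4):
    """Round one MV component to the nearest integer pixel (formula (3))."""
    offset = 1 << (shift - 1)
    if mv >= 0:
        return ((mv + offset) >> shift) << shift
    return -(((-mv + offset) >> shift) << shift)


def round_candidate_list(candidates, shift=4):
    """Step 1-2: process every MV in the candidate list to integer-pixel precision."""
    return [
        [(round_to_integer_pel(x, shift), round_to_integer_pel(y, shift))
         for (x, y) in combination]
        for combination in candidates
    ]


# Step 1-1 (assumed already done): each entry is a combination of 2 or 3 CPMVs.
candidates = [
    [(18, -5), (40, 9)],
    [(16, 32), (-23, 7), (100, -100)],
]
rounded = round_candidate_list(candidates)

# Step 1-3: the selected combination is used directly as the CU's CPMVs;
# only the index would be written to the bitstream, with no MVD.
merge_index = 1
cpmvs = rounded[merge_index]
```

Because the list is rounded before selection, whichever combination is chosen consists of integer-pixel CPMVs; in this example `cpmvs` is `[(16, 32), (-16, 0), (96, -96)]`.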
前文已述,图7所示的实施例也可以应用于Affine inter模式。为了更好地理解本申请实施例,在描述将图7所示的实施例应用于Affine inter模式的实施例之前,先描述一下Affine Inter模式的大致流程。As mentioned above, the embodiment shown in FIG. 7 can also be applied to the Affine inter mode. In order to better understand the embodiments of the present application, before describing the embodiment in which the embodiment shown in FIG. 7 is applied to the Affine inter mode, a general flow of the Affine Inter mode is described first.
作为示例,Affine Inter模式的大致流程包括如下步骤。在本示例中,以图像块为CU为例。As an example, the general flow of the Affine Inter mode includes the following steps. In this example, the image block is taken as the CU as an example.
步骤2-1,从空域临近块和/或时域临近块获取邻近块的运动矢量。此过程会获取到Affine模式的邻近块的运动矢量以及传统模式的邻近块的运动矢量;根据所获取的运动矢量组合得到CPMVs,并由这些CPMVs构建该CU的运动信息候选列表。Step 2-1: Obtain motion vectors of adjacent blocks from adjacent blocks in the spatial domain and/or adjacent blocks in the temporal domain. In this process, the motion vectors of the adjacent blocks in the Affine mode and the motion vectors of the adjacent blocks in the traditional mode are obtained; CPMVs are obtained by combining the obtained motion vectors, and the motion information candidate list of the CU is constructed from these CPMVs.
步骤2-2,从步骤2-1构建的运动信息候选列表中选择一个组合(该组合中可能包含两个或者三个CPMV,代表两个控制点和三个控制点的CPMV),作为当前CU的预测MV(Motion vector prediction,MVP)(即当前CU的预测CPMVs)。Step 2-2: select a combination from the motion information candidate list constructed in step 2-1 (the combination may contain two or three CPMVs, for two or three control points respectively) as the motion vector prediction (MVP) of the current CU, that is, the predicted CPMVs of the current CU.
步骤2-3,以当前整个CU为单位进行运动估计,获取当前CU的CPMVs。Step 2-3, perform motion estimation with the current entire CU as a unit, and obtain the CPMVs of the current CU.
步骤2-4,计算步骤2-2选择的CPMVs与步骤2-3运动估计的CPMVs之间的差值,获得运动矢量差值(Motion Vector Difference,MVD)。Step 2-4: Calculate the difference between the CPMVs selected in step 2-2 and the CPMVs estimated by motion in step 2-3 to obtain a motion vector difference (Motion Vector Difference, MVD).
在Affine Inter模式中,需要将选择的CPMVs的索引,以及MVD写入码流。In Affine Inter mode, the index of the selected CPMVs and the MVD need to be written into the code stream.
在Affine Inter模式中,运动估计过程以CU(对应于本申请实施例中的图像块)为单位进行,运动补偿过程则以4×4的sub-CU(对应于本申请实施例中的子图像块)为单位进行。In the Affine Inter mode, motion estimation is performed in units of CUs (corresponding to the image blocks in the embodiments of the present application), while motion compensation is performed in units of 4×4 sub-CUs (corresponding to the sub-image blocks in the embodiments of the present application).
关于步骤2-1中提及的邻近块,该临近块的帧间预测模式可以是传统的帧间预测模式也可能是affine模式,因此从临近块获取到的MV可能是整像素精度也可能是亚像素精度。Regarding the adjacent blocks mentioned in step 2-1, the inter prediction mode of an adjacent block may be either the conventional inter prediction mode or the affine mode, so the MV obtained from an adjacent block may have either integer-pixel or sub-pixel precision.
在Affine Inter模式中,编码端会进行CU的运动矢量的不同像素精度的选择,这个过程可以称为自适应运动矢量精度(Adaptive Motion Vector Resolution,AMVR)决策。In the Affine Inter mode, the encoding end will select different pixel precisions of the motion vector of the CU, and this process may be called an Adaptive Motion Vector Resolution (Adaptive Motion Vector Resolution, AMVR) decision.
AMVR决策的像素精度本质上是MVD的像素精度,也就是CU的CPMVs的像素精度,而不是sub-CU的MV的像素精度。The pixel precision of AMVR decisions is essentially the pixel precision of the MVD, that is, the pixel precision of the CU's CPMVs, not the pixel precision of the sub-CU's MVs.
在现有的Affine Inter模式中,AMVR决策的像素精度的范围包括但不限于:1/16像素精度、1/8像素精度、1/4像素精度、1/2像素精度、1像素精度、2像素精度、4像素精度等。换句话说,CU可以有多种不同像素精度的CPMVs。例如,CU可以有整像素、1/4像素精度和1/16像素精度三种不同的CPMVs。In the existing Affine Inter mode, the pixel precisions available to the AMVR decision include but are not limited to: 1/16-pixel, 1/8-pixel, 1/4-pixel, 1/2-pixel, 1-pixel, 2-pixel and 4-pixel precision. In other words, a CU can have CPMVs of several different pixel precisions. For example, a CU can have CPMVs of three different precisions: integer-pixel, 1/4-pixel and 1/16-pixel.
可选地,在图7所示的实施例中,图像块的帧间预测模式为Affine Inter模式,步骤611包括获取该图像块的运动信息候选列表;步骤612包括将该运动信息候选列表中的运动矢量处理为整像素精度;步骤613包括:从图像块的运动信息候选列表选择该图像块的预测CPMV,获得该图像块的MVD,该图像块的预测CPMV与该图像块的MVD,获得该图像块的CPMV。Optionally, in the embodiment shown in FIG. 7, the inter prediction mode of the image block is the Affine Inter mode. Step 611 includes obtaining the motion information candidate list of the image block; step 612 includes processing the motion vectors in the motion information candidate list to integer-pixel precision; step 613 includes: selecting the predicted CPMV of the image block from its motion information candidate list, obtaining the MVD of the image block, and obtaining the CPMV of the image block from the predicted CPMV and the MVD.
如图8所示,在本实施例中,步骤610还可以包括步骤614,对该图像块进行N像素的运动矢量精度决策,N为正整数。As shown in FIG. 8, in this embodiment, step 610 may further include step 614: performing an N-pixel motion vector precision decision on the image block, where N is a positive integer.
即对该图像块进行整像素精度的运动矢量精度决策(AMVR决策)。That is, a motion vector precision decision (AMVR decision) with integer pixel precision is performed on the image block.
可以理解到,通过对图像块进行整像素精度的AMVR决策,可以保证图像块的MVD的像素精度为整像素,也可以保证图像块的CPMV的像素精度为整像素。这样,可以保证图像块的运动估计过程中不涉及亚像素,从而可以在一定程度上降低带宽压力。It can be understood that by performing AMVR decision on the image block with integer pixel accuracy, the pixel accuracy of the MVD of the image block can be guaranteed to be an integer pixel, and the pixel accuracy of the CPMV of the image block can also be guaranteed to be an integer pixel. In this way, it can be ensured that no sub-pixels are involved in the motion estimation process of the image block, so that the bandwidth pressure can be reduced to a certain extent.
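The relationship described above can be sketched as follows: if the predicted CPMVs come from a candidate list already processed to integer-pixel precision and the AMVR decision restricts the MVD to integer pixels, their sum is also integer-pixel. This is a minimal illustration assuming 1/16-pixel storage units and assuming that the CPMV is reconstructed as predictor plus MVD; the values and function names are hypothetical.

```python
SHIFT = 4  # assumed 1/16-pixel storage precision


def is_integer_pel(mv, shift=SHIFT):
    """True if both components are whole multiples of one pixel."""
    return all(c % (1 << shift) == 0 for c in mv)


def reconstruct_cpmvs(predicted_cpmvs, mvds):
    """CPMV = predicted CPMV + MVD, per control point (assumed relationship)."""
    return [(px + dx, py + dy)
            for (px, py), (dx, dy) in zip(predicted_cpmvs, mvds)]


predicted = [(16, 32), (-16, 0)]   # from a rounded candidate list
mvds = [(32, -16), (0, 48)]        # integer-pixel AMVR: multiples of 16
cpmvs = reconstruct_cpmvs(predicted, mvds)
```

Here `cpmvs` is `[(48, 16), (-16, 48)]`, and `is_integer_pel` holds for every control point.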
在本实施例中,使用Affine AMVR进行运动矢量精度决策时,不对所有像素精度进行决策,而是跳过其中1/M(M>1)像素精度的决策,也就是说,只进行N像素精度的决策。In this embodiment, when Affine AMVR is used for the motion vector precision decision, not all pixel precisions are evaluated: the 1/M-pixel (M>1) precisions are skipped, that is, only the N-pixel precision decision is made.
应理解,在本实施例中,在将运动矢量精度索引写入码流时,由于像素精度可选项减少,因此写入码流的比特数(bit数)相应减少,甚至可以无需写入表示运动矢量精度索引的比特数。例如,原本像素精度可选项包括三种:整像素、1/4像素和1/16像素,则至少需要2比特的信息表示这三种像素精度,例如,采用“0”表示1/4像素,“10”表示1/16像素,“11”表示整像素。而在本实施例中,可以采用“0”表示整像素,因而只需在码流中写入1比特的数据,或者,可以通过协议约定好采用整像素精度,因而无需将运动矢量精度索引写入码流,这样节省信令开销,同时也可以减小带宽压力。It should be understood that, in this embodiment, when the motion vector precision index is written into the bitstream, fewer pixel precision options are available, so the number of bits written is reduced accordingly, and the bits representing the motion vector precision index may not need to be written at all. For example, if the original options are integer-pixel, 1/4-pixel and 1/16-pixel, up to 2 bits are needed to represent them, for example "0" for 1/4-pixel, "10" for 1/16-pixel and "11" for integer-pixel. In this embodiment, "0" can be used to represent integer-pixel precision, so only 1 bit needs to be written into the bitstream; alternatively, integer-pixel precision can be agreed upon by protocol, so the motion vector precision index need not be written at all. This saves signaling overhead and can also reduce bandwidth pressure.
需要说明的是,在该图像块的帧间预测模式为Affine inter模式的情况下,对该图像块进行N(N为正整数)像素的运动矢量精度决策的实施例与图8所示实施例可以组合实施,也可以解耦于图8所示实施例而独立实施。It should be noted that, when the inter prediction mode of the image block is the Affine inter mode, the embodiment in which an N-pixel (N being a positive integer) motion vector precision decision is performed on the image block may be implemented in combination with the embodiment shown in FIG. 8, or may be decoupled from the embodiment shown in FIG. 8 and implemented independently.
可选地,如图8所示,在一些实施例中,该图像块的帧间预测模式为Affine inter模式,步骤610包括:获取图像块的CPMV,对该图像块进行N像素的运动矢量精度决策,N为正整数。Optionally, as shown in FIG. 8, in some embodiments, the inter prediction mode of the image block is the Affine inter mode, and step 610 includes: obtaining the CPMV of the image block, and performing an N-pixel motion vector precision decision on the image block, where N is a positive integer.
应理解,通过对该图像块进行N像素的运动矢量精度决策,无论是否将该图像块的邻近块的运动矢量处理为整像素精度,都可以保证该图像块的CPMV为整像素精度。It should be understood that, by performing an N-pixel motion vector precision decision on the image block, the CPMV of the image block is guaranteed to have integer-pixel precision regardless of whether the motion vectors of its adjacent blocks are processed to integer-pixel precision.
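The signaling saving described above can be sketched as follows. The codewords "0", "10" and "11" are the examples given in the text; the table names, the function name and the `implied_by_protocol` flag are hypothetical, and the sketch only returns the bit string that would be written rather than modelling a real bitstream writer.

```python
# Example codewords from the text for the unrestricted case.
FULL_CODEWORDS = {"1/4": "0", "1/16": "10", "1": "11"}
# With only integer-pixel precision available, a single "0" is enough.
INTEGER_ONLY_CODEWORDS = {"1": "0"}


def precision_index_bits(precision, integer_only, implied_by_protocol=False):
    """Return the bits written for the motion vector precision index."""
    if integer_only:
        if implied_by_protocol:
            return ""  # nothing is written; the decoder assumes integer pixel
        return INTEGER_ONLY_CODEWORDS[precision]
    return FULL_CODEWORDS[precision]


unrestricted = precision_index_bits("1", integer_only=False)
restricted = precision_index_bits("1", integer_only=True)
implied = precision_index_bits("1", integer_only=True, implied_by_protocol=True)
```

Signaling integer-pixel precision costs 2 bits in the unrestricted example, 1 bit when only integer-pixel precision is allowed, and 0 bits when the precision is fixed by protocol.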
还应理解,在Affine Inter模式中,通过将图像块的CPMV的像素精度处理为整像素精度,可以保证整像素精度的运动估计,有助于减少带宽压力。It should also be understood that, in the Affine Inter mode, by processing the pixel precision of the CPMV of the image block to the integer pixel precision, the motion estimation of the integer pixel precision can be guaranteed, which helps to reduce the bandwidth pressure.
上述可知,在Affine merge模式中,将图像块的CPMV的像素精度处理为整像素精度的实现方式为:将所述运动信息候选列表中的运动矢量处理为整像素精度。It can be seen from the above that, in the Affine merge mode, the implementation manner of processing the pixel precision of the CPMV of the image block to the integer pixel precision is: processing the motion vector in the motion information candidate list to the integer pixel precision.
在Affine inter模式中,将图像块的CPMV的像素精度处理为整像素精度的实现方式为:将所述运动信息候选列表中的运动矢量处理为整像素精度,且对该图像块进行整像素精度的AMVR决策。In the Affine inter mode, the CPMV of the image block is processed to integer-pixel precision as follows: the motion vectors in the motion information candidate list are processed to integer-pixel precision, and an integer-pixel AMVR decision is performed on the image block.
或者,在Affine inter模式中,将图像块的CPMV的像素精度处理为整像素精度的实现方式为:对该图像块进行整像素精度的AMVR决策。Or, in the Affine inter mode, the implementation manner of processing the pixel precision of the CPMV of the image block to the integer pixel precision is: performing AMVR decision of the image block with the integer pixel precision.
在上述涉及将邻近块的运动矢量处理为整像素精度的实施例中,可以采用上述公式(3)或公式(4)所示的方式,将邻近块的运动矢量处理为整像素精度。也可以采用其它可行的由亚像素转整像素的算法或方法,将邻近块的运动矢量处理为整像素精度。本申请对此不作限定。In the above-mentioned embodiments involving processing motion vectors of adjacent blocks to integer pixel precision, the motion vectors of adjacent blocks may be processed to integer pixel precision in the manner shown in the above formula (3) or formula (4). Other feasible algorithms or methods for converting sub-pixels to whole pixels can also be used to process the motion vectors of adjacent blocks to whole-pixel precision. This application does not limit this.
可选地,在一些实施例中,当图像块的大小小于阈值时,将该图像块的CPMV处理为整像素精度。Optionally, in some embodiments, when the size of an image block is smaller than a threshold, the CPMV of the image block is processed to integer pixel precision.
该阈值可以根据实际需求确定。例如,该阈值为16像素。The threshold can be determined according to actual needs. For example, the threshold is 16 pixels.
例如,当图像块的高和/或宽小于16像素时,将该图像块的CPMV处理为整像素精度。For example, when the height and/or width of an image block is less than 16 pixels, the CPMV of the image block is processed to integer pixel precision.
从前文描述的Affine Inter模式可知,在Affine Inter模式下,会进行以图像块为单位的运动估计。例如,当图像块的高和宽等于或大于16像素时,即使是亚像素精度的运动估计过程也不会造成较大的带宽压力,这种情形下,可以不对图像块的CPMV进行处理使之成为整像素精度。As described above, in the Affine Inter mode, motion estimation is performed in units of image blocks. For example, when both the height and the width of the image block are greater than or equal to 16 pixels, even sub-pixel motion estimation does not cause significant bandwidth pressure; in this case, the CPMV of the image block need not be processed to integer-pixel precision.
但是,如果图像块的高和/或宽小于16像素,例如,图像块的大小为4×8、8×4、4×16或16×4,亚像素精度的运动估计过程可能会造成较大的带宽压力。这种情况下,可以将该图像块的CPMV处理为整像素精度。However, if the height and/or width of the image block is less than 16 pixels, for example, the size of the image block is 4×8, 8×4, 4×16 or 16×4, the motion estimation process with sub-pixel precision may cause large bandwidth pressure. In this case, the CPMV of the image block can be processed to integer pixel precision.
可选地,在一些实施例中,图像块的预测模式为Affine Inter模式,且图像块的高和/或宽小于16像素,根据本申请实施例的方法还包括:对图像块进行整像素精度的AMVR决策。Optionally, in some embodiments, the prediction mode of the image block is Affine Inter mode, and the height and/or width of the image block is less than 16 pixels, the method according to this embodiment of the present application further includes: performing integer pixel precision on the image block AMVR decision.
本实施例可以保证整像素精度的运动估计过程,从而可以避免造成较大的带宽压力。This embodiment can ensure a motion estimation process with integer pixel precision, so as to avoid causing a large bandwidth pressure.
此外,将满足高和/或宽小于16像素的条件的图像块的运动矢量精度索引写入码流时,由于像素精度可选项减少,可以减小写入码流的bit数。In addition, when the motion vector precision index of the image block that satisfies the condition that the height and/or width is less than 16 pixels is written into the code stream, the number of bits written into the code stream can be reduced due to the reduction of pixel precision options.
例如,针对高和宽大于或等于16像素的CU,在整像素、1/4像素和1/16像素三种方式中选择AMVR像素精度,例如,采用“0”代表1/4像素,“10”代表1/16像素,“11”代表整像素。针对高和/或宽小于16像素的CU,因为只有一种AMVR像素精度可选项,因此不需要将AMVR像素精度索引写入码流,例如可以通过协议约定采用整像素精度。For example, for a CU whose height and width are both greater than or equal to 16 pixels, the AMVR pixel precision is selected from integer-pixel, 1/4-pixel and 1/16-pixel, using for example "0" for 1/4-pixel, "10" for 1/16-pixel and "11" for integer-pixel. For a CU whose height and/or width is less than 16 pixels, only one AMVR pixel precision is available, so the AMVR pixel precision index does not need to be written into the bitstream; for example, integer-pixel precision can be agreed upon by protocol.
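The size-dependent restriction can be sketched as a simple lookup. The 16-pixel threshold and the three precisions are the examples given in the text; the function names are hypothetical, and a real encoder would apply this check inside its AMVR decision loop.

```python
def needs_integer_cpmv(width, height, threshold=16):
    """True when the block's height and/or width is below the threshold."""
    return width < threshold or height < threshold


def allowed_amvr_precisions(width, height, threshold=16):
    """AMVR precisions (in pixels) the encoder is allowed to evaluate."""
    if needs_integer_cpmv(width, height, threshold):
        return ["1"]               # single option: no index needs to be signaled
    return ["1/4", "1/16", "1"]    # full set from the example above


small = allowed_amvr_precisions(16, 4)
large = allowed_amvr_precisions(16, 16)
```

A 16×4 block yields only integer-pixel precision, while a 16×16 block keeps all three options.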
The embodiments of the present application can be applied to different inter-frame prediction modes, for example forward prediction, backward prediction, or bi-prediction. In other words, the inter-frame prediction mode of the sub-image block mentioned in the embodiments of the present application may be any one of the following: forward prediction, backward prediction, or bi-prediction.
For example, if the inter-frame prediction mode of the sub-image block is forward prediction, the motion vector of the sub-image block obtained in the forward prediction process is processed to integer-pixel precision.
As another example, if the inter-frame prediction mode of the sub-image block is backward prediction, the motion vector of the sub-image block obtained in the backward prediction process is processed to integer-pixel precision.
As another example, if the inter-frame prediction mode of the sub-image block is bi-prediction, the motion vectors of the sub-image block obtained in the bi-prediction process are processed to integer-pixel precision.
Optionally, the inter-frame prediction mode of the sub-image block is bi-prediction, but the method provided by the embodiments of the present application is applied to only one of the two prediction processes in the bi-prediction, and only the motion vector of the sub-image block from that process is processed to integer-pixel precision.
For example, the CPMV of the image block is the CPMV obtained by forward prediction in the bi-prediction process, or the CPMV obtained by backward prediction in the bi-prediction process.
In other words, if the inter-frame prediction mode of the sub-image block is bi-prediction, the motion vector of the sub-image block obtained by one prediction process of the bi-prediction is processed to integer-pixel precision. That prediction process may be either the forward prediction process or the backward prediction process of the bi-prediction.
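The per-direction handling described above can be sketched as follows. This is a minimal illustration, not the rounding formula of the application: it assumes motion vectors are stored as integers in 1/16-pixel units, rounds ties toward positive infinity, and uses hypothetical names (`round_to_integer_pel`, `process_sub_block`).

```python
MV_SHIFT = 4  # assumption: motion vectors are stored in 1/16-pixel units


def round_to_integer_pel(mv):
    """Round each component to the nearest multiple of 16, i.e. a whole pixel."""
    half = 1 << (MV_SHIFT - 1)
    return tuple(((c + half) >> MV_SHIFT) << MV_SHIFT for c in mv)


def process_sub_block(mv_fwd, mv_bwd, round_fwd=True, round_bwd=True):
    """Round the selected directions; a direction that is not used is passed as None."""
    if mv_fwd is not None and round_fwd:
        mv_fwd = round_to_integer_pel(mv_fwd)
    if mv_bwd is not None and round_bwd:
        mv_bwd = round_to_integer_pel(mv_bwd)
    return mv_fwd, mv_bwd


# Forward prediction only.
print(process_sub_block((23, -9), None))
# Bi-prediction, with only the forward process rounded.
print(process_sub_block((23, -9), (7, 40), round_bwd=False))
```

Rounding only one direction keeps the other direction at sub-pixel precision, so only part of the interpolation-related memory traffic is avoided in that case.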
As can be seen from the above, in the solution provided by the present application, the motion vector of the sub-image block, which serves as the image processing unit, has integer-pixel precision, so the motion compensation of the sub-image block does not involve sub-pixels. This reduces, to a certain extent, the bandwidth pressure caused by the Affine prediction technique.
Further, by processing the CPMV of the image block to integer-pixel precision, motion estimation with integer-pixel precision can be ensured in Affine Inter mode, which helps reduce bandwidth pressure.
Therefore, the solution provided by the present application can reduce the bandwidth pressure caused by the inter-frame prediction process while still maintaining a certain level of compression performance.
The method embodiments of the present application are described above; the apparatus embodiments are described below. It should be understood that the description of the apparatus embodiments corresponds to that of the method embodiments, so for content not described in detail, reference may be made to the foregoing method embodiments; for brevity, it is not repeated here.
As shown in FIG. 9, an embodiment of the present application provides an image processing apparatus 900, which includes the following units.
A first obtaining unit 910, configured to obtain the control point motion vectors (CPMV) of an image block.
A second obtaining unit 920, configured to obtain, according to the CPMV of the image block obtained by the first obtaining unit 910, the motion vector of a sub-image block in the image block, the motion vector having integer-pixel precision.
In the solution provided by the present application, the motion vector of the sub-image block, which serves as the image processing unit, has integer-pixel precision, so the motion compensation of the sub-image block does not involve sub-pixels. This reduces, to a certain extent, the bandwidth pressure caused by the Affine prediction technique.
Optionally, in some embodiments, the second obtaining unit 920 is configured to: calculate a first motion vector of the sub-image block according to the CPMV of the image block, the first motion vector having sub-pixel precision; and process the first motion vector into a second motion vector with integer-pixel precision.
Optionally, in some embodiments, the second obtaining unit 920 is configured to obtain the second motion vector according to the first motion vector of the sub-image block, such that the end point of the second motion vector is the integer pixel position closest to the end point of the first motion vector.
For example, the second obtaining unit 920 is configured to process the first motion vector into a second motion vector with integer-pixel precision by using formula (3) or formula (4).
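The two-step behaviour of the second obtaining unit 920 (a sub-pixel first motion vector, then an integer-pixel second motion vector) can be sketched as follows. This is a hedged illustration only: it uses the widely known 4-parameter affine model with two control points (top-left and top-right), samples each 4×4 sub-block at its centre, stores vectors in 1/16-pixel units, and rounds to the nearest whole pixel. None of these choices is taken from formula (3) or formula (4), and all function names are hypothetical.

```python
import math

UNITS_PER_PEL = 16  # assumption: 1/16-pixel motion vector storage


def first_motion_vector(cpmv0, cpmv1, block_w, cx, cy):
    """Sub-pixel motion vector at (cx, cy) under a 4-parameter affine model."""
    a = (cpmv1[0] - cpmv0[0]) / block_w
    b = (cpmv1[1] - cpmv0[1]) / block_w
    return (a * cx - b * cy + cpmv0[0], b * cx + a * cy + cpmv0[1])


def second_motion_vector(mv):
    """Move the vector end point to the nearest integer pixel position."""
    return tuple(math.floor(c / UNITS_PER_PEL + 0.5) * UNITS_PER_PEL for c in mv)


def integer_sub_block_mvs(cpmv0, cpmv1, block_w, block_h, sub=4):
    """Map each sub-block's top-left corner to its integer-precision motion vector."""
    mvs = {}
    for y in range(0, block_h, sub):
        for x in range(0, block_w, sub):
            first = first_motion_vector(cpmv0, cpmv1, block_w, x + sub / 2, y + sub / 2)
            mvs[(x, y)] = second_motion_vector(first)
    return mvs


# A 16x16 block whose top-right control point moves 2 pixels further right.
mvs = integer_sub_block_mvs((0, 0), (32, 0), 16, 16)
print(mvs[(0, 0)], mvs[(12, 0)], mvs[(12, 12)])
```

Every vector in the returned map is a multiple of 16 units, so motion compensation for each sub-block would read whole-pixel positions only.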
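Optionally, in some embodiments, the height and/or width of the sub-image block is 4 pixels.
Optionally, in some embodiments, the first obtaining unit 910 is configured to: obtain a motion information candidate list of the image block, and process the motion vectors in the motion information candidate list to integer-pixel precision; and obtain the CPMV of the image block according to the motion vectors in the candidate list that have been processed to integer-pixel precision.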
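One way to picture the behaviour of the first obtaining unit 910 is the sketch below. It rests on several assumptions that are not stated in this passage: vectors are stored in 1/16-pixel units, each candidate is a tuple of control point motion vectors, and the CPMV is formed as a rounded predictor plus a motion vector difference expressed in whole pixels. All names are hypothetical.

```python
UNITS_PER_PEL = 16  # assumption: 1/16-pixel motion vector storage


def round_mv(mv):
    """Round a motion vector to the nearest whole pixel (ties toward +infinity)."""
    half = UNITS_PER_PEL // 2
    return tuple(((c + half) // UNITS_PER_PEL) * UNITS_PER_PEL for c in mv)


def integer_candidate_list(candidates):
    """Process every motion vector of every candidate to integer-pixel precision."""
    return [tuple(round_mv(mv) for mv in cand) for cand in candidates]


def cpmv_from_candidate(candidates, index, mvds_in_pixels):
    """Add whole-pixel differences to the rounded predictor of the chosen candidate."""
    predictor = integer_candidate_list(candidates)[index]
    return tuple(
        (p[0] + d[0] * UNITS_PER_PEL, p[1] + d[1] * UNITS_PER_PEL)
        for p, d in zip(predictor, mvds_in_pixels)
    )


candidates = [((5, 30), (40, -7))]
print(integer_candidate_list(candidates))
print(cpmv_from_candidate(candidates, 0, ((1, 0), (0, -2))))
```

Because both the predictor and the difference are whole-pixel quantities in this sketch, the resulting CPMV stays at integer-pixel precision without a further rounding step.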
Optionally, in some embodiments, the apparatus 900 further includes a processing unit 930, configured to perform an N-pixel motion vector precision decision on the image block, where N is a positive integer.
Optionally, in some embodiments, the height and/or width of the image block is less than 16 pixels.
Optionally, in some embodiments, the inter-frame prediction mode of the sub-image block is any one of the following: forward prediction, backward prediction, or bi-prediction.
Optionally, in some embodiments, the inter-frame prediction mode of the sub-image block is bi-prediction, where the CPMV of the image block is the CPMV of the image block obtained by forward prediction in the bi-prediction process, or the CPMV of the image block obtained by backward prediction in the bi-prediction process.
Optionally, the image processing apparatus 900 of this embodiment may be an encoder, and the apparatus 900 may further include functional modules for implementing video encoding procedures.
Optionally, the image processing apparatus 900 of this embodiment may be a decoder, and the apparatus 900 may further include functional modules for implementing video decoding procedures.
As shown in FIG. 10, an embodiment of the present invention further provides an image processing apparatus 1000. The apparatus 1000 includes a processor 1010 and a memory 1020. The memory 1020 is configured to store instructions, the processor 1010 is configured to execute the instructions stored in the memory 1020, and execution of those instructions causes the processor 1010 to perform the method of the foregoing method embodiments.
Specifically, the apparatus 1000 further includes a communication interface 1030 for exchanging signals with external devices.
Optionally, the image processing apparatus 1000 of this embodiment is an encoder, and the communication interface 1030 is configured to receive image or video data to be processed from an external device. Alternatively, the communication interface 1030 is further configured to send the encoded bitstream to the decoding end.
Optionally, the image processing apparatus 1000 of this embodiment is a decoder, and the communication interface 1030 is configured to receive the encoded bitstream from the encoding end.
An embodiment of the present invention further provides a computer storage medium on which a computer program is stored; when the computer program is executed by a computer, the computer performs the method of the foregoing method embodiments.
An embodiment of the present invention further provides a computer program product containing instructions, characterized in that, when the instructions are executed by a computer, the computer performs the method of the foregoing method embodiments.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)).
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
The foregoing is only a description of specific embodiments of the present application, and the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (21)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2019/077894 WO2020181507A1 (en) | 2019-03-12 | 2019-03-12 | Image processing method and apparatus |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111247804A true CN111247804A (en) | 2020-06-05 |
CN111247804B CN111247804B (en) | 2023-10-13 |
Family
ID=70865988
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201980005232.7A Active CN111247804B (en) | 2019-03-12 | 2019-03-12 | Image processing methods and devices |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111247804B (en) |
WO (1) | WO2020181507A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107277506A (en) * | 2017-08-15 | 2017-10-20 | 中南大学 | A kind of motion vector accuracy fast selecting method and device based on adaptive motion vector precision |
CN108781284A (en) * | 2016-03-15 | 2018-11-09 | 联发科技股份有限公司 | Method and device for video encoding and decoding with affine motion compensation |
CN109005407A (en) * | 2015-05-15 | 2018-12-14 | 华为技术有限公司 | Encoding video pictures and decoded method, encoding device and decoding device |
CN109391814A (en) * | 2017-08-11 | 2019-02-26 | 华为技术有限公司 | Encoding video pictures and decoded method, device and equipment |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106303544B (en) * | 2015-05-26 | 2019-06-11 | 华为技术有限公司 | A kind of video coding-decoding method, encoder and decoder |
CN106534858B (en) * | 2015-09-10 | 2019-09-06 | 展讯通信(上海)有限公司 | True motion estimation method and device |
CN109218733B (en) * | 2017-06-30 | 2022-03-29 | 华为技术有限公司 | Method for determining prediction motion vector prediction and related equipment |
WO2019032765A1 (en) * | 2017-08-09 | 2019-02-14 | Vid Scale, Inc. | Frame-rate up conversion with reduced complexity |
-
2019
- 2019-03-12 WO PCT/CN2019/077894 patent/WO2020181507A1/en active Application Filing
- 2019-03-12 CN CN201980005232.7A patent/CN111247804B/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109005407A (en) * | 2015-05-15 | 2018-12-14 | 华为技术有限公司 | Encoding video pictures and decoded method, encoding device and decoding device |
CN108781284A (en) * | 2016-03-15 | 2018-11-09 | 联发科技股份有限公司 | Method and device for video encoding and decoding with affine motion compensation |
CN109391814A (en) * | 2017-08-11 | 2019-02-26 | 华为技术有限公司 | Encoding video pictures and decoded method, device and equipment |
CN107277506A (en) * | 2017-08-15 | 2017-10-20 | 中南大学 | A kind of motion vector accuracy fast selecting method and device based on adaptive motion vector precision |
Non-Patent Citations (2)
Title |
---|
HONGBIN LIUZ et al.: "CE2-related: Joint Test of AMVR for Affine Inter Mode (Test 2.1.1 and Test 2.1.2)", 《JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 13TH MEETING, JVET-M0247》 * |
JIANCONG LUO et al.: "CE2: Adaptive precision for affine MVD coding (Test 2.1.1)", 《JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 13TH MEETING, JVET-M0420》 * |
Also Published As
Publication number | Publication date |
---|---|
WO2020181507A1 (en) | 2020-09-17 |
CN111247804B (en) | 2023-10-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP2023014095A (en) | Memory access windowing and padding for motion vector refinement and motion compensation | |
TW201933866A (en) | Improved decoder-side motion vector derivation | |
TW202013979A (en) | Integer motion compensation | |
CA2808160C (en) | Optimized deblocking filters | |
TW202044830A (en) | Method, device, and system for determining prediction weight for merge mode | |
JP7608558B2 (en) | MOTION VECTOR PREDICTION METHOD AND RELATED APPARATUS - Patent application | |
CN111630861A (en) | Video processing method and device | |
CN109922336B (en) | Inter-frame prediction method and device for video data | |
KR20230145097A (en) | Spatial local illumination compensation | |
CN110636312B (en) | Video encoding and decoding method and device and storage medium | |
WO2020140243A1 (en) | Video image processing method and apparatus | |
JP7258209B2 (en) | Video Coding Using Multi-Resolution Reference Image Management | |
TW202031048A (en) | Simplified spatial-temporal motion vector prediction | |
KR20060027779A (en) | Method and apparatus for encoding and decoding video signal using temporal and spatial correlation of video block | |
US20200396476A1 (en) | Merge candidate reorder based on global motion vector | |
WO2020181504A1 (en) | Video encoding method and apparatus, and video decoding method and apparatus | |
CN114827623A (en) | Boundary extension for video coding and decoding | |
JP2006217560A (en) | How to reduce frame buffer memory size and access | |
JP2024510433A (en) | Temporal structure-based conditional convolutional neural network for video compression | |
WO2020219948A1 (en) | Selective motion vector prediction candidates in frames with global motion | |
CN111247804B (en) | Image processing methods and devices | |
KR20230081711A (en) | Motion Coding Using Geometric Models for Video Compression | |
CN111656782A (en) | Video processing method and device | |
KR20230162801A (en) | Externally enhanced prediction for video coding | |
JP2024504689A (en) | Method and apparatus for encoding or decoding video |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||