
CN112514391A - Video processing method and device - Google Patents


Info

Publication number
CN112514391A
CN112514391A (application CN201980049277.4A)
Authority
CN
China
Prior art keywords
block
blocks
motion vector
image
candidate motion
Prior art date
Legal status
Pending
Application number
CN201980049277.4A
Other languages
Chinese (zh)
Inventor
王悦名
郑萧桢
Current Assignee
SZ DJI Technology Co Ltd
Original Assignee
SZ DJI Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by SZ DJI Technology Co Ltd filed Critical SZ DJI Technology Co Ltd
Publication of CN112514391A

Classifications

    • H: ELECTRICITY
      • H04: ELECTRIC COMMUNICATION TECHNIQUE
        • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
          • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
            • H04N19/176: adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
            • H04N19/105: selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
            • H04N19/503: predictive coding involving temporal prediction
            • H04N19/61: transform coding in combination with predictive coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video processing method and device are provided. The method includes: acquiring data of M reference blocks from a reference frame buffer, where M is a positive integer; and, when performing Merge-mode or Skip-mode inter prediction on N image blocks, determining whether a reference block for each of the N image blocks exists among the M reference blocks, where N is a positive integer. By fetching the data of the M reference blocks from the reference frame buffer in advance and determining whether the reference block of each of the N image blocks is among them, much of the data required for Merge-mode or Skip-mode inter prediction can be obtained from the pre-fetched M reference blocks, so the number of requests made to the reference frame buffer can be reduced.

Description

Video processing method and device
Copyright declaration
The disclosure of this patent document contains material that is subject to copyright protection. The copyright is owned by the copyright owner, who has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the official records and files of the patent and trademark office.
Technical Field
The present application relates to the field of video coding, and more particularly, to a method and apparatus for video processing.
Background
In order to reduce the bandwidth occupied by video storage and transmission, video data needs to be encoded. Prediction is an important step in the video encoding process. Its purpose is to obtain, for an image block, the block closest to it (which may be referred to as a reference block), and then subtract the reference block from the image block to obtain a residual. Prediction is divided into intra prediction and inter prediction. Inter prediction obtains a Motion Vector (MV) of the current image block and then determines the position of the reference block in a reference frame according to the MV.
The High Efficiency Video Coding (HEVC) standard proposes two special inter prediction modes: the Merge mode and the Skip mode. In both modes, the reference block of an image block is obtained according to a candidate motion vector of that block; that is, data must be fetched from a reference frame according to the candidate motion vector to obtain the reference block of the current block. In a hardware encoder, this data is requested from a reference frame buffer.
The HEVC standard provides a flexible block partition structure in which a frame is divided into many sub-blocks. For example, a frame is first divided into several Coding Tree Units (CTUs), one CTU may be divided into one or more Coding Units (CUs), and one CU may be divided into one or more Prediction Units (PUs). Under the HEVC standard, data must be fetched from the reference frame buffer for each PU according to its candidate motion vectors, so data has to be requested from the reference frame buffer many times during the inter coding of one CTU, which increases the difficulty of hardware design and also reduces the efficiency of inter prediction.
Disclosure of Invention
The present application provides a video processing method and device. By fetching the data of M reference blocks from the reference frame buffer in advance, much of the data required for Merge-mode or Skip-mode inter prediction can be obtained from the M pre-fetched reference blocks, thereby reducing the number of requests made to the reference frame buffer.
In a first aspect, a video processing method is provided. The method includes: acquiring data of M reference blocks from a reference frame buffer, where M is a positive integer; and, when performing Merge-mode or Skip-mode inter prediction on N image blocks, determining whether a reference block for each of the N image blocks exists among the M reference blocks, where N is a positive integer.
In a second aspect, an encoding apparatus is provided, including a processor and a memory. The memory stores instructions, and executing those instructions causes the processor to perform the following operations: acquiring data of M reference blocks from a reference frame buffer, where M is a positive integer; and, when performing Merge-mode or Skip-mode inter prediction on N image blocks, determining whether a reference block for each of the N image blocks exists among the M reference blocks, where N is a positive integer.
In a third aspect, a chip is provided. The chip includes a processing module and a communication interface; the processing module controls the communication interface to communicate with the outside and implements the method provided in the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, which, when executed by a computer, causes the computer to carry out the method provided by the first aspect.
In a fifth aspect, a computer program product is provided comprising instructions which, when executed by a computer, cause the computer to carry out the method provided in the first aspect.
Therefore, in the present application, by fetching the data of M reference blocks from the reference frame buffer in advance and determining whether a reference block for each of the N image blocks exists among them, much of the data required for Merge-mode or Skip-mode inter prediction can be obtained from the pre-fetched M reference blocks. It is then unnecessary to request data from the reference frame buffer once or several times for the inter prediction of each image block, which reduces the number of requests sent to the reference frame buffer, relieves management pressure on the buffer, lowers encoding complexity, and can improve encoding efficiency.
Drawings
Fig. 1 is a schematic diagram of an architecture of video coding.
Fig. 2 is a schematic diagram of a block division structure under the HEVC standard.
Fig. 3 is a schematic flow chart of a method for video processing according to an embodiment of the present application.
Fig. 4 is another schematic flow chart of a method of video processing provided by an embodiment of the present application.
Fig. 5 is a further schematic flow chart of a method for video processing according to an embodiment of the present application.
Fig. 6 is a schematic block diagram of an encoding apparatus provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.
To facilitate understanding of the embodiments of the present application, concepts relevant to those embodiments are first described.
1. Video coding infrastructure
In order to reduce the bandwidth occupied by video storage and transmission, video data needs to be encoded. The basic architecture of video coding is shown in fig. 1. A frame is first divided into a number of image blocks, and each image block then undergoes prediction, transform, quantization and entropy coding.
Prediction is divided into intra prediction and inter prediction. Intra prediction uses already-encoded blocks in the current frame to generate the reference block (also called the prediction block) of the current image block (hereinafter simply the current block); inter prediction uses a reference frame (also called a reference picture) to obtain the reference block of the current block. The reference block is then subtracted from the current block to obtain residual data. The residual is transformed from the time domain to the frequency domain using a transform matrix, yielding transform coefficients. The transform coefficients are quantized to reduce their dynamic range and further compress the information. The quantized transform coefficients are, on one hand, entropy coded to obtain the entropy-coded bit stream; on the other hand, after inverse quantization and inverse transform they are added back to the reference block, and in-loop filtering is then applied to obtain a reconstructed frame, so that a better prediction mode can be determined based on the reconstructed frame.
In inter prediction, for the current block, the block most similar to it is searched for in a reference frame and serves as its reference block; the reference block is then subtracted from the current block to obtain a residual, which undergoes subsequent transform, quantization and entropy coding to form the bit stream. The process of searching the reference frame for the block most similar to the current block is called motion search, and the reference block of the current block is obtained through motion search. The position offset of the current block relative to its reference block is called the Motion Vector (MV) of the current block. It can be understood that, once the MV of the current block is known, the position of the reference block in the reference frame, i.e., the reference block of the current block, can be obtained.
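The motion-search process described above can be sketched as a brute-force search over a small window around the current block. This is an illustrative sketch only: real encoders use fast search strategies rather than full search, and all function names here are hypothetical. Frames are modeled as 2-D lists of luma samples.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def get_block(frame, x, y, w, h):
    """Extract the w x h block whose top-left sample is at (x, y)."""
    return [row[x:x + w] for row in frame[y:y + h]]

def motion_search(cur_frame, ref_frame, x, y, w, h, search_range=2):
    """Return the motion vector (dx, dy) that minimizes SAD in the window."""
    cur_block = get_block(cur_frame, x, y, w, h)
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = x + dx, y + dy
            # skip candidates that fall outside the reference frame
            if rx < 0 or ry < 0 or rx + w > len(ref_frame[0]) or ry + h > len(ref_frame):
                continue
            cost = sad(cur_block, get_block(ref_frame, rx, ry, w, h))
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv
```

The returned offset is exactly the MV of the current block in the sense used above: the displacement of the best-matching reference block relative to the current block's position.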
2. Inter prediction for Merge mode and Skip mode
In the High Efficiency Video Coding (HEVC) standard, three inter prediction modes are proposed: the normal inter prediction mode, the Merge mode, and the Skip mode. The Merge mode and the Skip mode are two special inter prediction modes.
As the foregoing description of inter prediction shows, its purpose is to obtain the MV of the current block and then determine the position of the reference block in the reference picture according to that MV. Neighboring image blocks are similar: for example, when the current block and a neighboring block belong to the same object, their movement distance and direction are naturally similar or identical as the camera moves. The MV of the current block therefore does not need to be computed separately each time; the MV of a neighboring block can be used as the MV of the current block. The Merge mode and the Skip mode are inter prediction modes based on this idea.
In the Merge mode, a Motion Vector Predictor (MVP) candidate list is built for the current block from its neighboring blocks (temporal and/or spatial neighbors), the optimal MVP is selected as the MV of the current block, and the position of the reference block in the reference frame is then determined directly from this MV, i.e., the reference block is determined according to the MV. In the HEVC standard, the Merge mode can obtain at most 5 candidate motion vectors from neighboring blocks and selects one of them as the motion vector of the current block.
In the Merge mode, the MVP and the MV are identical, i.e., the Motion Vector Difference (MVD) is zero, so it can be considered that no MVD exists. Therefore, the encoder only needs to encode the index, within the MVP candidate list, of the MVP (i.e., the MV) it selected, and no MVD needs to be encoded. The decoder can construct the MVP candidate list by the same method and then obtain the MV of the current block from the index transmitted by the encoder. That is, the Merge mode is characterized by MV = MVP (MVD = 0), and no MVD needs to be coded in the bit stream.
By way of example and not limitation, the encoding operation flow in the Merge mode includes the following steps.
Step one: obtain the Motion Vector Predictor (MVP) candidate list of the current block.
Step two: select the optimal MVP from the MVP candidate list and obtain its index in the list.
Step three: take the MVP selected in step two as the Motion Vector (MV) of the current block.
Step four: determine the position of the reference block in the reference frame according to the MV of the current block, i.e., obtain the reference block of the current block.
Step five: subtract the reference block from the current block to obtain residual data.
Step six: because the MVP and the MV are identical, there is no MVD; only the residual data and the index of the MV (MVP) in the MVP candidate list need to be transmitted to the decoder.
By way of example and not limitation, step two includes the following substeps.
Sub-step (1): treating each MVP in the MVP candidate list as an MV, perform the following processing:
1) obtain the reference block in the reference frame according to this MV;
2) compute the residual between the current block and this reference block, perform transform and quantization based on the residual, and determine the coding cost of encoding the current block with this MV.
Sub-step (2): select the optimal MVP from the MVP candidate list by comparing the coding costs of all MVPs in the list.
It should be understood that after step two, the optimal MVP (i.e., the MV of the current block) has been obtained.
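The candidate-selection loop of step two can be sketched as follows. For brevity the cost here is approximated by the sum of absolute differences rather than the full transform-and-quantize rate-distortion cost the text describes, and all names are hypothetical.

```python
def select_best_mvp(cur_pos, cur_block, ref_frame, mvp_candidates):
    """Return the index of the MVP whose predicted block has the lowest cost.

    A real encoder would measure rate-distortion cost after transform and
    quantization; SAD is used here only as a simple stand-in.
    """
    x, y = cur_pos
    h, w = len(cur_block), len(cur_block[0])
    best_index, best_cost = 0, float("inf")
    for index, (dx, dy) in enumerate(mvp_candidates):
        # treat the candidate MVP as the MV and fetch the reference block
        ref_block = [row[x + dx:x + dx + w] for row in ref_frame[y + dy:y + dy + h]]
        cost = sum(abs(a - b) for row_c, row_r in zip(cur_block, ref_block)
                   for a, b in zip(row_c, row_r))
        if cost < best_cost:
            best_index, best_cost = index, cost
    # only this index (plus the residual) needs to be signalled in Merge mode
    return best_index
```

Note that each candidate evaluated in this loop requires fetching a reference block, which is exactly why the per-candidate buffer requests discussed later become costly in hardware.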
Similar to the Merge mode, the Skip mode also obtains candidate motion vectors from neighboring blocks and selects one of them as the motion vector of the current block.
The Skip mode differs from the Merge mode in two respects. First, only image blocks partitioned under the 2N × 2N structure (e.g., the top-left partition in fig. 2 (C); see the description below in conjunction with fig. 2) can use the Skip mode. Second, in the Skip mode the residual between the current block and the reference block defaults to zero, i.e., the residual does not need to be coded, which greatly saves code rate. In other words, in the Skip mode the reference block of the current block is identical to the reconstructed block; the Skip mode is characterized by reconstructed value (rec) = predicted value (pred) (residual = 0), so the residual does not need to be encoded.
In this sense, the Skip mode can be regarded as a special case of the Merge mode in which the residual is 0 and no residual needs to be encoded.
The Merge mode and the Skip mode are inter prediction modes, so in both modes data must be fetched from a reference frame according to a candidate motion vector to obtain the reference block of the current block; in a hardware encoder, this data must be requested from a reference frame buffer (Buffer).
3. Block partitioning
The new generation of video coding standard HEVC employs a hybrid coding architecture based on block partitioning, as shown in fig. 2.
As shown in fig. 2 (A), a picture is divided into several sub-blocks, each called a Coding Tree Unit (CTU). As shown in fig. 2 (B), each CTU may be further divided into one or four Coding Units (CUs), and each CU may be further divided into smaller CUs. As shown in fig. 2 (C), each CU may be further divided into one, two, or four Prediction Units (PUs). In the HEVC standard, the CTU size may range from 16 × 16 to 64 × 64 (in pixels).
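The recursive CTU-to-CU splitting described above can be sketched with a small quadtree enumerator. This is a hypothetical helper, shown for a 64 × 64 CTU with an 8 × 8 minimum CU size; it lists every candidate CU position and size the quadtree admits.

```python
def quadtree_split(x, y, size, min_size=8):
    """Enumerate all candidate CU regions (x, y, size) under one CTU.

    Each node either stays whole or splits into four equal quadrants,
    mirroring the CTU -> CU quadtree partitioning of HEVC.
    """
    yield (x, y, size)
    if size > min_size:
        half = size // 2
        for qx, qy in ((x, y), (x + half, y), (x, y + half), (x + half, y + half)):
            yield from quadtree_split(qx, qy, half, min_size)
```

For a 64 × 64 CTU this enumerates 1 + 4 + 16 + 64 = 85 candidate CU regions, which gives a feel for how many prediction decisions one CTU can involve.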
In the block-partitioning-based hybrid coding architecture adopted by HEVC, the granularity of prediction is the PU. For example, in Merge-mode and Skip-mode inter prediction, data must be fetched from the reference frame buffer according to the candidate motion vectors of each PU.
In the existing inter coding scheme, for each image block, data must be requested from the reference frame buffer according to each of its candidate motion vectors to obtain the reference block of that image block.
First, an image block may correspond to one candidate motion vector or to several. Second, HEVC provides a flexible block partition structure in which one CTU can be partitioned into one or more CUs and each CU can be further partitioned into one or more PUs. Consequently, during the inter coding of a CTU, data must be requested from the reference frame buffer many times, even frequently, which increases inter coding delay and hardware design difficulty.
The present application provides a video processing method and device. By fetching the data of M reference blocks from the reference frame buffer in advance, much of the data required for inter prediction can be obtained from the M pre-fetched reference blocks, thereby reducing the number of requests made to the reference frame buffer.
The coding standards to which the present application is applicable include, but are not limited to, the HEVC standard. The present application may be applicable to various current and future codec standards.
Fig. 3 is a schematic flow chart of a method for video processing according to an embodiment of the present application. The method comprises the following steps.
S310, acquiring data of M reference blocks from the reference frame buffer.
The reference frame buffer is a buffer that stores data of reference frames. A reference frame is an image frame whose encoding has been completed.
Wherein M is a positive integer. For example, M is 1, or an integer greater than 1. In the case where M is an integer greater than 1, M reference blocks may be located in one reference frame or may be located in multiple reference frames.
S320, when performing Merge-mode or Skip-mode inter prediction on N image blocks, determine whether a reference block for each of the N image blocks exists among the M reference blocks.
Wherein N is a positive integer. For example, N is 1, or an integer greater than 1.
Each of the N image blocks may be the smallest image processing object in video coding. For example, in the HEVC standard, each of the N image blocks may be called a Prediction Unit (PU); other video coding standards may give these image blocks other names. In some embodiments, each of the N image blocks is taken to be a PU by way of example.
Take one image block of the N image blocks (denoted image block x) as an example. When a reference block of image block x exists among the M reference blocks, that reference block is obtained from the M reference blocks, and Merge-mode or Skip-mode inter prediction is performed on image block x using it. When no reference block of image block x exists among the M reference blocks, the Merge-mode or Skip-mode inter prediction process for image block x is skipped.
When a reference block of image block x exists among the M reference blocks, performing Merge-mode or Skip-mode inter prediction on image block x includes: obtaining the reference block of image block x from the M reference blocks.
That a reference block of image block x exists among the M reference blocks means that the data of that reference block is contained in the data of the M reference blocks acquired in step S310; equivalently, one or several of the M reference blocks contain the reference block of image block x.
Obtaining the reference block of image block x from the M reference blocks means obtaining the data of that reference block from the data of the M reference blocks acquired in step S310, i.e., obtaining the reference block of image block x.
That no reference block of image block x exists among the M reference blocks means that the data of that reference block is not contained in the data of the M reference blocks acquired in step S310; equivalently, none of the M reference blocks contains the reference block of image block x.
Step S310 is performed before step S320; therefore, in step S320 it is determined whether a reference block for each of the N image blocks exists among the M reference blocks acquired in advance.
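Steps S310/S320 can be given a minimal sketch under the assumption that each pre-fetched reference block is described by its top-left corner and size within one reference frame; all names here are hypothetical illustration, not the patented implementation itself.

```python
def contains(outer, inner):
    """True if region `inner` (x, y, w, h) lies entirely within `outer`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh

def merge_skip_prediction(prefetched, image_blocks):
    """For each image block, use pre-fetched data if its reference block is
    covered by one of the M pre-fetched regions; otherwise skip Merge/Skip
    prediction for that block. No extra buffer request is ever issued.

    prefetched:   list of (x, y, w, h) regions already read from the buffer
    image_blocks: list of ((x, y), (w, h), (mv_x, mv_y)) tuples
    """
    results = []
    for pos, size, mv in image_blocks:
        ref_region = (pos[0] + mv[0], pos[1] + mv[1], size[0], size[1])
        if any(contains(region, ref_region) for region in prefetched):
            results.append(("predict", ref_region))   # data available locally
        else:
            results.append(("skip", ref_region))      # skip Merge/Skip for it
    return results
```

The key point mirrored here is that the decision in S320 is a pure lookup over already-fetched regions, so the reference frame buffer is touched once in S310 rather than once per PU or per candidate.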
As the foregoing description of the Merge and Skip modes shows, in one implementation of Merge-mode or Skip-mode inter prediction, data must be requested from the reference frame buffer for each candidate motion vector and for each PU.
In the present method, the data of M reference blocks is instead fetched from the reference frame buffer in advance. When Merge-mode or Skip-mode inter prediction is then performed on the N image blocks, it is determined, among the pre-fetched M reference blocks, whether a reference block exists for each of the N image blocks: if so, Merge-mode or Skip-mode inter prediction is performed on that image block; if not, the Merge-mode or Skip-mode inter prediction process for that image block is skipped.
It should be understood that, by fetching the data of the M reference blocks from the reference frame buffer in advance and determining whether a reference block for each of the N image blocks exists among them, much of the data required for Merge-mode or Skip-mode inter prediction can be obtained from the pre-fetched M reference blocks. Data therefore does not have to be requested from the reference frame buffer once or several times per image block; the number of requests sent to the reference frame buffer is reduced, management pressure on the buffer is relieved, encoding complexity is lowered, and encoding efficiency can be improved.
In particular, this avoids accessing the reference frame buffer once for every candidate motion vector processed and once or several times for every image block (e.g., PU) processed.
It should also be understood that reducing the number of data requests to the reference frame buffer reduces hardware design difficulty and can improve the efficiency of inter prediction.
In the case where N is greater than 1, in the present application, the processing manner for each of the N image blocks is similar. For ease of understanding and description, in some embodiments, a description will be given by taking one image block (referred to as a second image block) of the N image blocks as an example. It should be noted that the scheme described below with reference to the second image block is also applicable to other image blocks in the N image blocks, or each image block in the N image blocks is processed similarly to the second image block.
In step S310, there are various methods for obtaining data of M reference blocks from the reference frame buffer.
Optionally, step S310 includes: acquiring the M reference blocks from the reference frame buffer according to S candidate motion vectors of a first image block, where the first image block and the N image blocks belong to the same image unit block.
Acquiring the M reference blocks from the reference frame buffer according to the S candidate motion vectors of the first image block means that M positions are obtained by adding M motion vectors to the position of the first image block, and the data of the M reference blocks is read from the reference frame buffer based on these M positions, for example, centered at them.
Here S is a positive integer less than or equal to M. If S equals M, the M motion vectors are exactly the S candidate motion vectors of the first image block. If S is smaller than M, zero motion vectors or a global motion vector may be used as supplements, i.e., the M motion vectors consist of the S candidate motion vectors plus (M - S) zero motion vectors or global motion vectors.
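The padding rule for S < M can be sketched as follows. This is a hypothetical helper; zero motion vectors are used for the supplement, though the text notes a global motion vector could be used instead.

```python
def build_prefetch_mvs(candidate_mvs, m):
    """Pad the S candidate MVs of the first image block up to M entries.

    If S == M, the list is returned unchanged; if S < M, the remaining
    (M - S) entries are filled with zero motion vectors.
    """
    assert len(candidate_mvs) <= m, "S must not exceed M"
    return list(candidate_mvs) + [(0, 0)] * (m - len(candidate_mvs))
```

The M resulting motion vectors are what S310 adds to the first image block's position to obtain the M fetch positions.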
As an example, assume that the size of the i-th reference block among the M reference blocks is W_i × H_i, where W_i denotes the width, H_i denotes the height, and i = 1, 2, ..., M. Suppose the horizontal and vertical coordinates in the image frame of the pixels on the left and top boundaries of the i-th reference block are X_i and Y_i, respectively; then the horizontal and vertical coordinates of the pixels on its right and bottom boundaries are X_i + W_i - 1 and Y_i + H_i - 1, respectively.
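The boundary coordinates above follow from simple arithmetic; as a small sketch (hypothetical helper name):

```python
def reference_block_bounds(x_i, y_i, w_i, h_i):
    """Right and bottom boundary pixel coordinates of a W_i x H_i block
    whose left/top boundary pixels sit at (X_i, Y_i)."""
    return x_i + w_i - 1, y_i + h_i - 1
```

For instance, a 64 × 32 reference block whose top-left pixel is at (16, 8) has its right and bottom boundary pixels at (79, 39).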
The present application does not limit the sizes of the M reference blocks. For example, the sizes of the M reference blocks may or may not be identical; they may be related to the block partitioning of the coding architecture; alternatively, they may be predefined.
That the first image block and the N image blocks belong to the same image unit block means that they belong to the same non-minimal image processing unit; in other words, an image unit block is a non-minimal image processing unit in the video coding architecture. For example, under the HEVC standard the image unit block may be a CTU or a CU; under other video coding standards it may have other names.
The first image block may be an image block at the same partition level as the N image blocks, or an image block at a higher partition level.
For example, under the HEVC standard, each of the N image blocks is a PU, and the first image block may be a PU, a CU, or a CTU. In one example, the N image blocks are N PUs, and the CTU composed of these N PUs is the first image block, i.e., the CTU contains the N PUs. In another example, the N image blocks are N CUs, and the CTU composed of these N CUs is the first image block, i.e., the CTU contains the N CUs.
It should be understood that, because the first image block and the N image blocks to be inter predicted belong to the same image unit block, the probability that the reference blocks of the N image blocks fall within the M reference blocks acquired based on the candidate motion vectors of the first image block is high. More of the data required for inter prediction can therefore be obtained from the pre-fetched M reference blocks, and the number of accesses to the reference frame buffer can be reduced.
Optionally, in an embodiment where data of the M reference blocks is obtained from the reference frame buffer through the candidate motion vector of the first image block, the first image block includes part or all of the N image blocks.
For example, in the HEVC standard, the N image blocks are N PUs, and the first image block is a CU or CTU that contains some or all of the N PUs.
It should be understood that, when the first image block includes some or all of the N image blocks, the probability that the reference blocks of the N image blocks fall into the M reference blocks obtained by the candidate motion vector of the first image block is further increased, so that more data required for inter-frame prediction can be obtained from the M reference blocks obtained in advance, and the number of accesses to the reference frame buffer can be effectively reduced.
Optionally, in an embodiment where data of the M reference blocks is obtained from the reference frame buffer by candidate motion vectors of the first image block, the first image block is a block of image units including N image blocks.
For example, in the HEVC standard, the N image blocks are N PUs in 1 CTU, and the first image block is the CTU; or the N image blocks are N PUs in 1 CU, and the first image block is the CU.
It should be understood that, when the first image block includes N image blocks, the probability that the reference blocks of the N image blocks fall into the M reference blocks obtained by the candidate motion vector of the first image block is further increased, so that more data required for inter-frame prediction can be obtained from the M reference blocks obtained in advance, and the number of accesses to the reference frame buffer can be effectively reduced.
For example, in an embodiment where the data of the M reference blocks is obtained from the reference frame buffer by the candidate motion vectors of the first image block, and the first image block includes the N image blocks, the required reference data may be obtained from the M reference blocks obtained in advance during inter-frame prediction of the N image blocks; that is, during inter-frame prediction of the N image blocks, data needs to be requested from the reference frame buffer only once.
As an example, in the HEVC standard, the first image block is a CTU, and the N image blocks are N PUs obtained by dividing the CTU. During inter-frame coding in the Merge mode or the Skip mode of all PUs in the CTU, the embodiments of the present application may access the reference frame buffer for data only once, which effectively reduces the number of accesses to the reference frame buffer compared with the prior art.
It was described above that the data of the M reference blocks is obtained from the reference frame buffer according to the candidate motion vectors of the first image block in step S310. Alternatively, in step S310, the data of the M reference blocks may also be obtained from the reference frame buffer in other feasible manners. For example, the data of the M reference blocks may be obtained from the reference frame buffer according to the motion vectors of one or more PUs/CUs in a CTU; then, when inter prediction in the Merge mode or the Skip mode is performed on each PU/CU in the CTU, the reference block of each PU/CU may be obtained from the M reference blocks obtained in advance, without sending a reference block acquisition request to the reference frame buffer multiple times for each PU/CU.
Hereinafter, taking the second image block of the N image blocks as an example for description, as shown in fig. 4, step S320 includes: and when the second image block in the N image blocks is subjected to the inter prediction of the Merge mode or the Skip mode, determining whether a reference block of the second image block exists in the M reference blocks.
Wherein the reference block of the second image block may be obtained by a candidate motion vector of the second image block.
For example, in the example of the encoding operation flow in the Merge mode or the Skip mode described above, step S320 may be completed in step two. In other words, in the example of the encoding operation flow in the Merge mode or the Skip mode described above, in step two, it is determined whether the reference block of the second image block is included in the M reference blocks.
Optionally, as shown in fig. 4, the method of the embodiment shown in fig. 3 may further include step S330 or step S340.
S330, if the determination result in the step S320 is yes, that is, if there is a reference block of the second image block in the M reference blocks, performing inter prediction in the Merge mode or the Skip mode on the second image block according to the reference block of the second image block.
Performing inter-frame prediction in the Merge mode or the Skip mode on the second image block according to the reference block of the second image block includes: obtaining the reference block of the second image block from the M reference blocks.
Acquiring the reference block of the second image block from the M reference blocks means acquiring data of the reference block of the second image block from data of the M reference blocks acquired in advance.
For the inter-frame prediction process of the image block in the Merge mode or the Skip mode, refer to the above-described encoding operation flow in the Merge mode or the Skip mode, and are not described herein again.
For example, in the example of the encoding operation flow in the Merge mode or the Skip mode described above, step S320 and step S330 may be completed in step two. In other words, in the example of the encoding operation flow in the Merge mode or the Skip mode described above, in step two, it is determined whether the reference block of the second image block is included in the M reference blocks, and in case of yes, the reference block of the second image block is acquired from the M reference blocks.
S340, in the case that the determination result in the step S320 is negative, that is, in the case that the reference block of the second image block does not exist in the M reference blocks, skipping the inter prediction process of the second image block in the Merge mode or the Skip mode.
For example, for a candidate motion vector of the second image block, if the reference block corresponding to the candidate motion vector is not in the M reference blocks, the inter prediction process in the Merge mode or the Skip mode for the second image block using the reference block corresponding to the candidate motion vector is skipped.
For another example, if, for all candidate motion vectors of the second image block, none of the corresponding reference blocks is among the M reference blocks, the inter-frame prediction process in the Merge mode or the Skip mode using the reference block corresponding to each candidate motion vector is skipped; that is, the inter-frame prediction process of the second image block in the Merge mode or the Skip mode is skipped, and the inter-frame prediction process of the Merge mode or the Skip mode is not performed on the second image block.
In the case that the inter prediction process of the Merge mode or Skip mode is not performed on the second image block, for example, the inter prediction or intra prediction may be performed on the second image block in other manners.
Optionally, in the embodiment shown in fig. 4, step S320 includes: determining whether a reference block corresponding to the jth candidate motion vector of the second image block exists in the M reference blocks, where j = 1, …, P, and P is the number of candidate motion vectors of the second image block. P is a positive integer and may be 1 or an integer greater than 1.
For example, if it is determined in step S320 that a reference block corresponding to the jth candidate motion vector of the second image block exists in the M reference blocks, step S330 includes: and performing Merge mode or Skip mode inter-frame prediction on the second image block by adopting the reference block corresponding to the jth candidate motion vector.
Performing inter-frame prediction in the Merge mode or the Skip mode on the second image block using the reference block corresponding to the jth candidate motion vector includes: acquiring the reference block corresponding to the jth candidate motion vector from the M reference blocks.
As an example, the sub-step (1) and the sub-step (2) in the second step of the encoding operation flow in the Merge mode or the Skip mode are performed by using the jth candidate motion vector as the MV of the second image block.
For another example, if it is determined in step S320 that there is no reference block corresponding to the jth candidate motion vector in the M reference blocks, in the embodiments shown in fig. 3 and 4, the inter prediction process in the Merge mode or the Skip mode for the second image block using the reference block corresponding to the jth candidate motion vector may be skipped.
It should be understood that the method of the embodiment shown in fig. 3 comprises step S330 shown in fig. 4 as long as it is determined in step S320 that the reference block corresponding to one candidate motion vector of the second image block is included in the M reference blocks.
If it is determined in step S320 that none of the reference blocks corresponding to the P candidate motion vectors of the second image block are included in the M reference blocks, the method of the embodiment shown in fig. 3 includes step S340 shown in fig. 4, but does not include step S330 shown in fig. 4.
The P candidate motion vectors of the second image block in step S320 may be part or all of motion vectors in a Motion Vector Predictor (MVP) candidate list of the second image block. Whether the P candidate motion vectors are partial motion vectors or all motion vectors in the MVP candidate list of the second image block may be determined by actual requirements. Furthermore, in case that the P candidate motion vectors are part of the motion vectors in the MVP candidate list of the second image block, the positions of the P candidate motion vectors in the MVP candidate list may also be determined by actual requirements.
A method of determining whether or not a reference block corresponding to the jth candidate motion vector exists among the M reference blocks will be described below.
Optionally, step S320 includes: determining whether a reference block corresponding to the jth candidate motion vector of the second image block exists from the M reference blocks according to the following factors: the position of the second image block, the jth candidate motion vector, and the positions of the M reference blocks.
For example, the position of the second image block plus the jth candidate motion vector (denoted as the predicted position) is calculated from the position of the second image block and the jth candidate motion vector; it is then determined whether the predicted position is included in the positions of the M reference blocks. If so, it can be considered that a reference block corresponding to the jth candidate motion vector of the second image block exists in the M reference blocks; if not, it can be considered that no such reference block exists in the M reference blocks.
The expression "determining whether there is a reference block corresponding to the jth candidate motion vector of the second image block from among the M reference blocks" may be replaced by "determining whether the reference block corresponding to the jth candidate motion vector is included in the M reference blocks". Accordingly, the expression "there is a reference block corresponding to the jth candidate motion vector of the second image block among the M reference blocks" may be replaced with "the reference block corresponding to the jth candidate motion vector is included in the M reference blocks", and "there is no reference block corresponding to the jth candidate motion vector of the second image block among the M reference blocks" may be replaced with "the reference block corresponding to the jth candidate motion vector is not included in the M reference blocks".
As an example, taking the second image block as a PU, suppose that the width and height of the second image block are denoted as W_PU and H_PU, respectively. The position of the second image block can then be expressed as (X_PU, Y_PU, X_PU + W_PU - 1, Y_PU + H_PU - 1), whose entries represent the positions of the left boundary, upper boundary, right boundary, and lower boundary of the second image block, respectively. It is to be understood that the coordinates of the pixel of the second image block located at the left and upper boundaries may be expressed as (X_PU, Y_PU), and the coordinates of the pixel located at the right and lower boundaries may be expressed as (X_PU + W_PU - 1, Y_PU + H_PU - 1). Suppose that the horizontal component and the vertical component of the jth candidate motion vector are denoted as MVX_PU and MVY_PU, respectively. The predicted position of the second image block plus the jth candidate motion vector can then be represented as (X_PU + MVX_PU, Y_PU + MVY_PU, X_PU + W_PU + MVX_PU - 1, Y_PU + H_PU + MVY_PU - 1), whose entries represent the positions of the left boundary, upper boundary, right boundary, and lower boundary of the second image block plus the jth candidate motion vector, respectively.
For example, when the positions of the left, upper, right, and lower boundaries of the second image block to which the jth candidate motion vector is added are all located within the positions of the M reference blocks, it may be considered that the reference block corresponding to the jth candidate motion vector is included in the M reference blocks, and otherwise, it is considered that the reference block corresponding to the jth candidate motion vector is not included in the M reference blocks.
For another example, when the positions of the left, upper, right, and lower boundaries of the second image block plus the jth candidate motion vector are all located within the contraction positions of the M reference blocks, it may be considered that the reference block corresponding to the jth candidate motion vector is included in the M reference blocks, otherwise, it is considered that the reference block corresponding to the jth candidate motion vector is not included in the M reference blocks. The contraction positions of the M reference blocks represent positions of the M reference blocks after contraction based on pixels required by the sub-pixel interpolation, which is described in detail in the following description of the second to fifth modes.
Optionally, step S320 includes: in the case that the horizontal component and the vertical component of the jth candidate motion vector each point to an integer pixel, if the position of the second image block shifted by the jth candidate motion vector is located within the position of any one of the M reference blocks, determining that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks.
As an example, the data of the M reference blocks is obtained by the candidate motion vectors of the first image block. For example, step S310 includes: acquiring the M reference blocks from the reference frame buffer according to the S candidate motion vectors of the first image block. In step S320, when the horizontal component and the vertical component of the jth candidate motion vector (MVX_PU, MVY_PU) of the second image block each point to an integer pixel, if condition one described below is satisfied, it is determined that the reference block corresponding to the jth candidate motion vector is included in the M reference blocks.
Optionally, in step S320, the factor for determining whether there is a reference block corresponding to the jth candidate motion vector of the second image block from among the M reference blocks may further include a preset value related to the minimum precision of the motion vector and/or the number of taps of the interpolation filter, in addition to the position of the second image block, the jth candidate motion vector, and the positions of the M reference blocks.
For example, under the HEVC standard, the minimum precision of the motion vector is 1/4 pixel, the interpolation filter has 7 or 8 taps, and the preset value may be 3.25. For another example, under other standards, when the minimum precision of the motion vector changes, or the number of taps of the interpolation filter changes, the preset value may change accordingly. For another example, under other video coding standards, the preset value may have a correspondingly different value.
Optionally, step S320 includes: in the case that the horizontal component and/or the vertical component of the jth candidate motion vector of the second image block points to a sub-pixel, comprehensively considering the position of the second image block, the jth candidate motion vector, the positions of the M reference blocks, and the preset value to determine whether a reference block corresponding to the jth candidate motion vector of the second image block exists in the M reference blocks.
For example, step S320 includes: in the case that the horizontal component of the jth candidate motion vector of the second image block points to a sub-pixel and the vertical component points to an integer pixel, if the position of the second image block shifted by the jth candidate motion vector is located within the contraction position of the ith reference block among the M reference blocks, determining that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks, where i = 1, …, M, and the contraction position of the ith reference block represents the position of the ith reference block after its left and right boundaries are each contracted inward by the preset value.
For another example, step S320 includes: in the case that the horizontal component of the jth candidate motion vector of the second image block points to an integer pixel and the vertical component points to a sub-pixel, if the position of the second image block shifted by the jth candidate motion vector is located within the contraction position of the ith reference block among the M reference blocks, determining that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks, where i = 1, …, M, and the contraction position of the ith reference block represents the position of the ith reference block after its upper and lower boundaries are each contracted inward by the preset value.
For another example, step S320 includes: in the case that the horizontal component and the vertical component of the jth candidate motion vector of the second image block both point to sub-pixels, if the position of the second image block shifted by the jth candidate motion vector is located within the contraction position of the ith reference block among the M reference blocks, determining that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks, where i = 1, …, M, and the contraction position of the ith reference block represents the position of the ith reference block after its left, right, upper, and lower boundaries are each contracted inward by the preset value.
As an example, M reference blocks are retrieved from the reference frame buffer in step S310 based on the S candidate motion vectors of the first image block. In step S320, it is determined whether there is a reference block corresponding to the jth candidate motion vector in the M reference blocks by any one of the following manners.
Mode one: if the horizontal component and the vertical component of the jth candidate motion vector (MVX_PU, MVY_PU) both point to integer pixels, and the following condition one is satisfied, it is determined that the reference block corresponding to the jth candidate motion vector is included in the M reference blocks.
Mode two: if the horizontal component of the jth candidate motion vector (MVX_PU, MVY_PU) points to a sub-pixel and the vertical component points to an integer pixel, and the following condition two is satisfied, it is determined that the reference block corresponding to the jth candidate motion vector is included in the M reference blocks.
Mode three: if the horizontal component of the jth candidate motion vector (MVX_PU, MVY_PU) points to an integer pixel and the vertical component points to a sub-pixel, and the following condition three is satisfied, it is determined that the reference block corresponding to the jth candidate motion vector is included in the M reference blocks.
Mode four: if the horizontal component and the vertical component of the jth candidate motion vector (MVX_PU, MVY_PU) both point to sub-pixels, and the following condition four is satisfied, it is determined that the reference block corresponding to the jth candidate motion vector is included in the M reference blocks.
Condition one:
X_PU + MVX_PU ≥ X_i
Y_PU + MVY_PU ≥ Y_i
X_PU + W_PU + MVX_PU - 1 ≤ X_i + W_i - 1
Y_PU + H_PU + MVY_PU - 1 ≤ Y_i + H_i - 1
Condition two:
X_PU + MVX_PU ≥ X_i + a
Y_PU + MVY_PU ≥ Y_i
X_PU + W_PU + MVX_PU - 1 ≤ X_i + W_i - 1 - a
Y_PU + H_PU + MVY_PU - 1 ≤ Y_i + H_i - 1
Condition three:
X_PU + MVX_PU ≥ X_i
Y_PU + MVY_PU ≥ Y_i + a
X_PU + W_PU + MVX_PU - 1 ≤ X_i + W_i - 1
Y_PU + H_PU + MVY_PU - 1 ≤ Y_i + H_i - 1 - a
Condition four:
X_PU + MVX_PU ≥ X_i + a
Y_PU + MVY_PU ≥ Y_i + a
X_PU + W_PU + MVX_PU - 1 ≤ X_i + W_i - 1 - a
Y_PU + H_PU + MVY_PU - 1 ≤ Y_i + H_i - 1 - a
where (X_PU, Y_PU) represents the coordinates of the pixel of the second image block located at the left and upper boundaries, i.e., the coordinates of the top-left pixel of the second image block; W_PU and H_PU represent the width and height of the second image block, respectively; MVX_PU and MVY_PU represent the horizontal and vertical components of the jth candidate motion vector, respectively; X_i and Y_i represent the horizontal and vertical coordinates, in the image frame, of the pixel points at the left and upper boundaries of the ith reference block among the M reference blocks; W_i and H_i represent the width and height of the ith reference block, respectively; i ranges from 1 to M; and a represents a preset value related to the minimum precision of the motion vector and/or the number of taps of the interpolation filter. For example, under the HEVC standard, a may take the value 3.25.
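Conditions one to four can be combined into a single containment test: the shrink value a is applied on the horizontal and/or vertical boundaries exactly when the corresponding motion-vector component points to a sub-pixel. The sketch below assumes quarter-pel motion vectors stored in units of 1/4 pixel (as under HEVC); the function names and the unit convention are assumptions for illustration.

```python
def mv_points_to_subpel(mv_component, precision=4):
    """True when a motion-vector component, stored in 1/precision-pel
    units, points to a sub-pixel position rather than an integer pixel."""
    return mv_component % precision != 0

def reference_in_block(x_pu, y_pu, w_pu, h_pu, mvx, mvy,
                       x_i, y_i, w_i, h_i, a=3.25, precision=4):
    """Check conditions one to four: does the second image block, shifted
    by the jth candidate motion vector, fall inside the ith reference
    block, shrunk by `a` on each boundary whose MV component is sub-pel?

    mvx/mvy are in 1/precision-pel units and are converted to pixels
    before the comparison.
    """
    ax = a if mv_points_to_subpel(mvx, precision) else 0  # horizontal shrink
    ay = a if mv_points_to_subpel(mvy, precision) else 0  # vertical shrink
    px, py = mvx / precision, mvy / precision             # MV in pixels
    return (x_pu + px >= x_i + ax and
            y_pu + py >= y_i + ay and
            x_pu + w_pu + px - 1 <= x_i + w_i - 1 - ax and
            y_pu + h_pu + py - 1 <= y_i + h_i - 1 - ay)

# Integer-pel MV (8, 4) in quarter-pel units, i.e. (2, 1) pixels:
# condition one applies, and the shifted 16x16 PU fits in a 64x64 block.
inside = reference_in_block(10, 10, 16, 16, 8, 4, 0, 0, 64, 64)  # → True
```

A sub-pel MV near the left boundary fails: `reference_in_block(0, 0, 16, 16, 2, 0, 0, 0, 64, 64)` returns False, because the half-pel horizontal component requires a 3.25-pixel margin that the shifted block does not have.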
Optionally, in step S320, regardless of whether the horizontal component and the vertical component of the jth candidate motion vector of the second image block point to a sub-pixel or an integer pixel, the position of the second image block, the jth candidate motion vector, the positions of the M reference blocks, and the preset value are considered to determine whether the reference block corresponding to the jth candidate motion vector is included in the M reference blocks.
For example, in step S320, if the position of the second image block shifted by the jth candidate motion vector is located within the contraction position of the ith reference block among the M reference blocks, it is determined that the reference block corresponding to the jth candidate motion vector is included in the M reference blocks, where the contraction position of the ith reference block indicates the position of the ith reference block after its left, right, upper, and lower boundaries are each contracted inward by the preset value, the preset value is related to the minimum precision of the motion vector, and i = 1, …, M.
As an example, M reference blocks are retrieved from the reference frame buffer in step S310 based on the S candidate motion vectors of the first image block. In step S320, it may be determined whether a reference block corresponding to the jth candidate motion vector exists in the M reference blocks in the following manner.
Mode five: if the jth candidate motion vector (MVX_PU, MVY_PU) satisfies condition four above, it is determined that the reference block corresponding to the jth candidate motion vector is included in the M reference blocks.
In order to better understand the technical solution provided by the present application, an example is given below with reference to fig. 5.
In fig. 5, a candidate motion vector (denoted as a jth candidate motion vector) of an image block (denoted as a second image block) is described as an example. This second image block may for example be referred to as PU.
As shown in fig. 5, inter-coding the second image block includes the following steps.
S510, acquiring data of M reference blocks from the reference frame buffer.
Step S510 is the same as step S310 in the above embodiments, and is described in detail above, and is not described again here.
S520, determining whether the reference block corresponding to the jth candidate motion vector is included in the ith reference block according to the position of the second image block, the jth candidate motion vector of the second image block, and the position of the ith reference block in the M reference blocks, if yes, going to step S530, and if not, going to step S540. Wherein, the value range of j is 1-P, and P represents the number of candidate motion vectors of the second image block. The initial value of i is 1, and the value range of i is 1-M.
The method for determining whether the reference block corresponding to the jth candidate motion vector is included in the ith reference block is described in the above embodiments, and is not described herein again.
And S530, performing inter-frame prediction in Merge mode or Skip mode on the second image block by using the reference block corresponding to the jth candidate motion vector.
It should be understood that, in step S530, the following steps are included: and acquiring a reference block corresponding to the jth candidate motion vector from the data of the ith reference block.
For example, step S530 may be implemented by sub-step (1) in the flow of the encoding operation in the Merge mode or the Skip mode described above. Step 1) in sub-step (1) is equivalent to obtaining a reference block corresponding to the jth candidate motion vector from the ith reference block.
And S540, adding 1 to the value of i.
S550, judging whether the value of i exceeds M, if yes, going to step S560, and if not, going to step S520.
And S560, skipping the inter-frame prediction process of Merge mode or Skip mode on the second image block by adopting the reference block corresponding to the jth candidate motion vector.
Assuming that the second image block has P candidate motion vectors participating in inter prediction in the Merge mode or the Skip mode, the process shown in fig. 5 may be performed for each of the P candidate motion vectors. If, after the processing shown in fig. 5 is performed on the P candidate motion vectors of the second image block, reference blocks corresponding to one or more candidate motion vectors are obtained from the M reference blocks, an inter-frame prediction process in the Merge mode or the Skip mode may be performed on the second image block based on these reference blocks. The specific process is detailed in the sub-steps of step two in the encoding operation flow in the Merge mode or the Skip mode.
It should also be understood that if inter prediction is required for multiple image blocks, the process shown in fig. 5 is performed for each image block.
Assuming that, for an image block, reference blocks corresponding to one or more candidate motion vectors are obtained from the M reference blocks after the processing in fig. 5, an inter-frame prediction process in the Merge mode or the Skip mode may be performed on the image block based on these reference blocks.
Assuming that, for an image block, no reference block corresponding to any candidate motion vector is obtained from the M reference blocks after the processing in fig. 5, the inter-frame prediction process in the Merge mode or the Skip mode for the image block may be skipped. For example, the image block may be inter-predicted or intra-predicted in other ways.
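The flow of fig. 5 (steps S510 to S560) can be sketched as the loop below. The helper `contains` stands in for the containment test of conditions one to four and is an assumption of this sketch; a return value of None signals that Merge/Skip prediction with that candidate is skipped (step S560).

```python
def find_reference_block(contains, second_block, candidate_mv, m_blocks):
    """Steps S520-S560: scan the M prefetched reference blocks and return
    the index of the first one containing the reference block addressed by
    candidate_mv, or None if no prefetched block contains it."""
    for i, ref_block in enumerate(m_blocks):                 # S540/S550: i = 1..M
        if contains(second_block, candidate_mv, ref_block):  # S520: containment test
            return i                                         # S530: use this block
    return None                                              # S560: skip this candidate

# Toy containment test for illustration: the MV must equal a tag stored
# in the prefetched block descriptor.
contains = lambda blk, mv, ref: ref == mv
idx = find_reference_block(contains, "pu0", (1, 0), [(0, 0), (1, 0), (2, 0)])  # → 1
```

In a real encoder the outer loops of fig. 5 would run this scan once per candidate motion vector of each PU, falling back to other prediction modes when it returns None.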
The second image block is taken as an example in the above embodiment. It should be understood that an operation similar to the processing of the second tile may be performed for each of the N tiles.
For a plurality of candidate motion vectors of the second image block, the reference block corresponding to each candidate motion vector is obtained from the M reference blocks, and it is not necessary to request the reference frame cache for one or more times for each motion vector of each image block, so that the number of times of sending requests to the reference frame cache is reduced, the management pressure on the reference frame cache is relieved, the encoding complexity is reduced, and the encoding efficiency can be improved.
Another embodiment of the present application further provides a method for video processing. The method comprises the following steps (A) and (B).
And (A) acquiring data of M reference blocks from a reference frame buffer, wherein M is a positive integer.
M may be 1 or an integer greater than 1. In the case where M is an integer greater than 1, M reference blocks may be located in one reference frame or may be located in multiple reference frames.
Step (a) is the same as step S310 in the above embodiments, and is described in detail above, and is not repeated herein.
And (B) acquiring a reference block of each image block in the N image blocks from the M reference blocks during inter prediction of the N image blocks. N is a positive integer, and N may be 1 or an integer greater than 1.
Each image block of the N image blocks may be a smallest image processing object in video coding. For example, in the HEVC standard, each of the N image blocks may be referred to as a Prediction Unit (PU). As another example, in other video coding standards, each image block of the N image blocks may have other names, respectively.
Step (a) is performed before step (B). Therefore, in step (B), it can be considered that the reference blocks of the N image blocks are acquired from the M reference blocks acquired in advance.
The technical solution of this embodiment can be applied to a scene in which the inter prediction mode is the Merge mode or the Skip mode.
For example, in step (B), a reference block for each of N image blocks is obtained from the M reference blocks, and for each of the N image blocks, inter prediction in the Merge mode or the Skip mode is performed on the image block according to the reference block of the image block.
Optionally, the technical solution provided in this embodiment may also be applied to a scenario in which the inter prediction mode is a non-Merge mode and a non-Skip mode.
In the case where N is an integer greater than 1, acquiring the reference blocks of the plurality of image blocks from the M reference blocks acquired in advance makes it unnecessary, during inter prediction of the plurality of image blocks, to request data from the reference frame buffer one or more times for each image block, so the number of data requests to the reference frame buffer can be reduced.
In the case where N is equal to 1, when this embodiment is applied to a scenario in which the inter prediction mode is the Merge mode or the Skip mode, acquiring the reference blocks corresponding to a plurality of candidate motion vectors of the single image block from the M reference blocks acquired in advance makes it unnecessary to request data from the reference frame buffer for each candidate motion vector of the image block, so the number of data requests to the reference frame buffer can be reduced.
In this embodiment, during inter prediction of the N image blocks, the reference blocks of the N image blocks are acquired from the M reference blocks acquired in advance. Consequently, it is not necessary to request data from the reference frame buffer one or more times for each image block, which reduces the number of requests sent to the reference frame buffer, relieves the management pressure on the reference frame buffer, lowers the encoding complexity, and improves the encoding efficiency.
Therefore, in the scheme provided in this embodiment, the data of the M reference blocks is fetched from the reference frame buffer in advance, and the multiple pieces of data required for inter prediction are then obtained from the prefetched data of the M reference blocks. This reduces the number of data requests to the reference frame buffer, lowers the difficulty of hardware design, and improves the efficiency of inter prediction.
In step (B), the reference block of each of the N image blocks may be obtained from the M reference blocks either unconditionally or conditionally.
The reference block of each of the N image blocks is unconditionally obtained from the M reference blocks, which means that the reference block of each of the N image blocks can be directly obtained from the data of the M reference blocks obtained in advance.
For example, in step (A), the data of the M reference blocks is obtained from the reference frame buffer according to candidate motion vectors of a first image block, where the first image block includes the N image blocks. For example, the N image blocks are N PUs, and the first image block is a CTU containing the N PUs. Suppose the data of the M reference blocks is guaranteed to contain the reference blocks of the N image blocks. In this case, in step (B), the reference block of each of the N image blocks may be unconditionally acquired from the M reference blocks; in other words, it may be acquired directly, without any determination action.
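As a hedged illustration of the unconditional case, the following sketch prefetches one reference region per candidate motion vector of the first image block and then reads each PU's reference samples out of the prefetched data. The function names, the list-of-lists frame layout, and the CTU-sized-region assumption are illustrative conveniences, not part of the embodiment or any standard.

```python
# Illustrative sketch (names assumed): step (A) prefetches one reference
# region per candidate motion vector of the first image block (e.g. a CTU);
# step (B) then serves each PU's reference block from the prefetched data
# instead of issuing a new request to the reference frame buffer.

def prefetch_reference_blocks(ref_frame, ctu_x, ctu_y, ctu_w, ctu_h, candidate_mvs):
    """Step (A): fetch the CTU-sized region addressed by each candidate MV."""
    blocks = []
    for mv_x, mv_y in candidate_mvs:
        x, y = ctu_x + mv_x, ctu_y + mv_y
        data = [row[x:x + ctu_w] for row in ref_frame[y:y + ctu_h]]
        blocks.append({"x": x, "y": y, "w": ctu_w, "h": ctu_h, "data": data})
    return blocks

def read_pu_reference(block, pu_x, pu_y, pu_w, pu_h):
    """Step (B): cut a PU's reference samples out of one prefetched block
    (the PU's shifted position is assumed to lie inside the block)."""
    ox, oy = pu_x - block["x"], pu_y - block["y"]  # offset inside the block
    return [row[ox:ox + pu_w] for row in block["data"][oy:oy + pu_h]]
```

In this sketch the reference frame buffer is touched once per candidate motion vector in step (A); every per-PU access in step (B) is a local slice of already-fetched data.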
Conditionally obtaining the reference block of each of the N image blocks from the M reference blocks means obtaining the reference blocks only after it has been determined that the M reference blocks contain them.
For example, when it cannot be guaranteed that the M reference blocks acquired in step (A) contain the reference blocks of the N image blocks, the reference blocks of the N image blocks may be acquired from the M reference blocks on the premise that they are determined to exist among the M reference blocks.
Taking an image block (denoted as a second image block) of the N image blocks as an example, the method for determining whether a reference block of the second image block exists in the M reference blocks is described in detail in the above embodiments, and is not described herein again.
It should be understood that, during Merge mode or Skip mode inter prediction of the N image blocks, the data of the M reference blocks is acquired from the reference frame buffer in advance, and the reference blocks of the N image blocks are acquired from the M reference blocks on the premise that they exist among the M reference blocks, so that Merge mode or Skip mode inter prediction can be performed on each of the N image blocks. As a result, it is not necessary to request data from the reference frame buffer one or more times for the inter prediction of each image block, which reduces the number of requests sent to the reference frame buffer, relieves the management pressure on the reference frame buffer, lowers the encoding complexity, and improves the encoding efficiency.
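The conditional flow described above can be sketched as follows. Here `reference_block_available` stands for whichever determination method of the foregoing embodiments is used, and all function and variable names are assumptions made for illustration.

```python
# Illustrative conditional flow (names assumed): for one image block, a
# Merge/Skip candidate motion vector is used only when its reference block
# is found among the M prefetched blocks; otherwise that candidate is
# skipped, and no extra reference-frame-buffer request is issued.

def merge_skip_candidates(pu, candidate_mvs, prefetched_blocks, reference_block_available):
    usable, skipped = [], []
    for mv in candidate_mvs:
        if reference_block_available(pu, mv, prefetched_blocks):
            usable.append(mv)   # predict with the prefetched reference block
        else:
            skipped.append(mv)  # skip this candidate; no buffer access
    return usable, skipped
```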
The method for video coding provided by the embodiment of the application can be executed by an encoder or a device with a video coding function.
Method embodiments of the present application are described above and apparatus embodiments of the present application are described below. It should be understood that the description of the apparatus embodiments corresponds to the description of the method embodiments, and therefore, for the sake of brevity, detailed descriptions may be omitted with reference to the foregoing method embodiments.
As shown in fig. 6, an embodiment of the present application further provides an encoding apparatus. The encoding apparatus includes a processor 610 and a memory 620. The memory 620 is configured to store instructions, and the processor 610 is configured to execute the instructions stored in the memory 620; execution of these instructions causes the processor 610 to perform the method of the above method embodiments.
By executing the instructions stored in the memory 620, the processor 610 performs the following operations: acquiring data of M reference blocks from a reference frame buffer, where M is a positive integer; and, when performing Merge mode or Skip mode inter prediction on N image blocks, determining from the M reference blocks whether a reference block of each of the N image blocks exists, where N is a positive integer.
In this embodiment, the data of the M reference blocks is fetched from the reference frame buffer in advance, and whether a reference block of each of the N image blocks exists is determined from the prefetched M reference blocks. The multiple pieces of data required for Merge mode or Skip mode inter prediction can thus be obtained from the prefetched M reference blocks, which avoids accessing the reference frame buffer once for each candidate motion vector processed and once for each image block processed.
Optionally, each image block of the N image blocks is a Prediction Unit (PU).
Optionally, the first image block is any one of: a Coding Tree Unit (CTU), a Coding Unit (CU), a Prediction Unit (PU).
Optionally, the obtaining data of the M reference blocks from the reference frame buffer includes: and acquiring M reference blocks from a reference frame buffer according to S candidate motion vectors of the first image block, wherein S is a positive integer less than or equal to M, and the first image block and the N image blocks belong to the same image unit block.
Optionally, the first image block includes some or all of the N image blocks.
Optionally, the first image block is the image unit block.
Optionally, the N image blocks include a second image block; and when performing Merge mode or Skip mode inter prediction on the N image blocks, determining from the M reference blocks whether a reference block of each of the N image blocks exists includes: determining from the M reference blocks whether a reference block of the second image block exists. The processor 610 is further configured to: in the case where the reference block of the second image block exists in the M reference blocks, perform Merge mode or Skip mode inter prediction on the second image block according to the reference block of the second image block.
Optionally, the processor 610 is further configured to: skip Merge mode or Skip mode inter prediction of the second image block when the reference block of the second image block does not exist in the M reference blocks.
Optionally, the second image block has a plurality of candidate motion vectors; and determining whether a reference block corresponding to the second image block exists in the M reference blocks includes: determining whether a reference block corresponding to the jth candidate motion vector of the second image block exists in the M reference blocks, where j is 1, …, P, and P is the number of the candidate motion vectors.
Optionally, determining whether there is a reference block corresponding to the jth candidate motion vector of the second image block from the M reference blocks includes: determining whether a reference block corresponding to the jth candidate motion vector of the second image block exists from the M reference blocks according to the following factors: the position of the second image block, the jth candidate motion vector, and the positions of the M reference blocks.
Optionally, determining whether a reference block corresponding to the jth candidate motion vector of the second image block exists in the M reference blocks includes: in the case where the horizontal component and the vertical component of the jth candidate motion vector each point to an integer pixel, if the position of the second image block after being shifted by the jth candidate motion vector lies within the position of any one of the M reference blocks, determining that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks.
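The integer-pixel test just described reduces to a rectangle-containment check. The following minimal sketch assumes block positions are kept as a top-left corner plus width and height; all names are illustrative.

```python
# Integer-pel case (names assumed): after shifting the second image block by
# the candidate MV, the shifted block must lie entirely inside some
# prefetched reference block for the corresponding reference block to exist.

def integer_mv_covered(pu_x, pu_y, pu_w, pu_h, mv_x, mv_y, ref_blocks):
    sx, sy = pu_x + mv_x, pu_y + mv_y  # position after the MV shift
    for rb in ref_blocks:
        if (rb["x"] <= sx and sx + pu_w <= rb["x"] + rb["w"]
                and rb["y"] <= sy and sy + pu_h <= rb["y"] + rb["h"]):
            return True  # a reference block for this candidate MV exists
    return False
```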
Optionally, in the case where the horizontal component and/or the vertical component of the jth candidate motion vector points to a sub-pixel, the factors further include a preset value, which is related to the minimum precision of the motion vector and/or the number of taps of the interpolation filter.
Optionally, determining whether a reference block corresponding to the jth candidate motion vector of the second image block exists in the M reference blocks includes: if the position of the second image block after being shifted by the jth candidate motion vector lies within the contraction position of the ith reference block among the M reference blocks, determining that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks, where i is 1, …, M.
In the case where the horizontal component of the jth candidate motion vector points to a sub-pixel and the vertical component points to an integer pixel, the contraction position of the ith reference block indicates the position after the ith reference block is contracted inward from its left and right boundaries by the preset value, respectively.
In the case where the horizontal component of the jth candidate motion vector points to an integer pixel and the vertical component points to a sub-pixel, the contraction position of the ith reference block indicates the position after the ith reference block is contracted inward from its upper and lower boundaries by the preset value, respectively.
In the case where both the horizontal component and the vertical component of the jth candidate motion vector point to sub-pixels, the contraction position of the ith reference block indicates the position after the ith reference block is contracted inward from its left, right, upper, and lower boundaries by the preset value, respectively.
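The three sub-pixel cases can be combined into one containment test by shrinking only the boundaries whose motion-vector component is fractional. The sketch below assumes fractional motion vectors stored as floats and a `margin` parameter standing in for the preset value; all names are assumptions made for illustration.

```python
import math

# Sub-pel case (names assumed): shrink the reference block inward by `margin`
# (the preset value, related to the interpolation-filter tap count) on the
# left/right boundaries when the horizontal MV component is fractional, and
# on the upper/lower boundaries when the vertical component is fractional.

def subpel_mv_covered(pu_x, pu_y, pu_w, pu_h, mv_x, mv_y, rb, margin):
    frac_x = mv_x != math.floor(mv_x)  # horizontal component is sub-pel
    frac_y = mv_y != math.floor(mv_y)  # vertical component is sub-pel
    left = rb["x"] + (margin if frac_x else 0)
    right = rb["x"] + rb["w"] - (margin if frac_x else 0)
    top = rb["y"] + (margin if frac_y else 0)
    bottom = rb["y"] + rb["h"] - (margin if frac_y else 0)
    sx, sy = pu_x + math.floor(mv_x), pu_y + math.floor(mv_y)
    return left <= sx and sx + pu_w <= right and top <= sy and sy + pu_h <= bottom
```

When neither component is fractional, no boundary is shrunk and the test degenerates to the integer-pixel containment check.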
Optionally, determining whether a reference block corresponding to the jth candidate motion vector of the second image block exists in the M reference blocks includes: if the position of the second image block after being shifted by the jth candidate motion vector lies within the contraction position of the ith reference block among the M reference blocks, determining that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks, where the contraction position of the ith reference block indicates the position after the ith reference block is contracted inward from its left, right, upper, and lower boundaries by a preset value, respectively, the preset value is related to the minimum precision of the motion vector, and i is 1, …, M.
Optionally, when a reference block of a second image block exists in the M reference blocks, performing inter prediction in a Merge mode or a Skip mode on the second image block includes: and under the condition that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks, performing Merge-mode or Skip-mode inter-frame prediction on the second image block by adopting the reference block corresponding to the jth candidate motion vector.
Optionally, the processor 610 is further configured to perform the following operations: and under the condition that no reference block corresponding to the jth candidate motion vector exists in the M reference blocks, skipping the inter-frame prediction process of carrying out Merge mode or Skip mode on the second image block by adopting the reference block corresponding to the jth candidate motion vector.
Optionally, as shown in fig. 6, the encoding apparatus further includes a communication interface 630 for transmitting signals with an external device.
Optionally, the encoding apparatus of this embodiment is an encoder, and the communication interface 630 is used to receive image or video data to be processed from an external device. Optionally, the communication interface 630 is further configured to send the encoded bitstream to the decoding end.
Embodiments of the present invention also provide a computer storage medium having a computer program stored thereon, where the computer program is executed by a computer, so that the computer executes the method of the above method embodiments.
Embodiments of the present invention also provide a computer program product comprising instructions, wherein the instructions, when executed by a computer, cause the computer to perform the method of the above method embodiments.
In the above embodiments, all or part of the implementation may be realized by software, hardware, firmware or any other combination. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions according to the embodiments of the invention are brought about in whole or in part when the computer program instructions are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a Digital Video Disk (DVD)), or a semiconductor medium (e.g., a Solid State Disk (SSD)), among others.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (34)

1. A method of video processing, comprising:
acquiring data of M reference blocks from a reference frame buffer, wherein M is a positive integer;
when the Merge mode or Skip mode inter-frame prediction is carried out on N image blocks, whether a reference block of each image block in the N image blocks exists or not is determined from the M reference blocks, and N is a positive integer.
2. The method of claim 1, wherein the retrieving data of the M reference blocks from the reference frame buffer comprises:
and acquiring the M reference blocks from the reference frame buffer according to S candidate motion vectors of a first image block, wherein S is a positive integer less than or equal to M, and the first image block and the N image blocks belong to the same image unit block.
3. The method according to claim 2, wherein the first image block comprises some or all of the N image blocks.
4. The method of claim 3, wherein the first image block is the image unit block.
5. The method according to any of claims 1 to 4, wherein the N image blocks comprise a second image block;
when performing inter prediction in a Merge mode or a Skip mode on N image blocks, determining whether a reference block of each image block in the N image blocks exists in the M reference blocks includes:
determining whether a reference block of the second image block exists from the M reference blocks;
wherein the method further comprises:
and under the condition that the reference block of the second image block exists in the M reference blocks, performing Merge mode or Skip mode inter-frame prediction on the second image block.
6. The method of claim 5, further comprising:
skipping Merge mode or Skip mode inter prediction of the second image block when a reference block of the second image block is absent from the M reference blocks.
7. The method according to claim 5, wherein the second image block has a plurality of candidate motion vectors;
wherein the determining whether a reference block corresponding to the second image block exists in the M reference blocks includes:
and determining whether a reference block corresponding to the jth candidate motion vector of the second image block exists in the M reference blocks, wherein j is 1, …, and P is the number of the candidate motion vectors.
8. The method according to claim 7, wherein said determining whether there is a reference block corresponding to the jth candidate motion vector of the second image block from among the M reference blocks comprises:
determining whether a reference block corresponding to the jth candidate motion vector of the second image block exists from the M reference blocks according to the following factors:
the position of the second image block, the jth candidate motion vector, and the positions of the M reference blocks.
9. The method according to claim 7 or 8, wherein said determining whether there is a reference block corresponding to the jth candidate motion vector of the second image block from among the M reference blocks comprises:
and determining that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks if the position of the second image block after the shift of the jth candidate motion vector is performed is located in the position of any one of the M reference blocks under the condition that the horizontal component and the vertical component of the jth candidate motion vector point to integer pixels respectively.
10. The method according to claim 8, wherein in case the horizontal component, and/or the vertical component of the jth candidate motion vector points to a sub-pixel, said factors further comprise a preset value related to the minimum precision of the motion vector and/or the number of taps of the interpolation filter.
11. The method according to claim 10, wherein said determining whether there is a reference block corresponding to the jth candidate motion vector of the second image block from among the M reference blocks comprises:
if the position of the second image block after the shift of the jth candidate motion vector is located in the contraction position of the ith reference block in the M reference blocks, determining that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks, i is 1, …, M, wherein,
in case that the horizontal component of the jth candidate motion vector points to a sub-pixel and the vertical component points to a whole pixel, the contraction position of the ith reference block indicates a position after the ith reference block is respectively contracted inward by the preset value through a left boundary and a right boundary,
in case that the horizontal component of the jth candidate motion vector points to integer pixels and the vertical component points to sub-pixels, the contraction position of the ith reference block indicates a position after the ith reference block is contracted inward by the preset values through the upper boundary and the lower boundary, respectively,
and under the condition that the horizontal component and the vertical component of the jth candidate motion vector point to the sub-pixels, the contraction position of the ith reference block represents a position of the ith reference block after the ith reference block is respectively contracted inwards by the preset value through a left boundary, a right boundary, an upper boundary and a lower boundary.
12. The method according to claim 7 or 8, wherein said determining whether there is a reference block corresponding to the jth candidate motion vector of the second image block from among the M reference blocks comprises:
if the position of the second image block after the shift of the jth candidate motion vector is located in the contraction position of the ith reference block in the M reference blocks, determining that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks, wherein,
the contraction position of the ith reference block indicates a position of the ith reference block after passing through the left boundary, the right boundary, the upper boundary and the lower boundary and respectively contracting inward by a preset value, the preset value is related to the minimum precision of the motion vector, and i is 1, …, M.
13. The method according to any one of claims 7 to 12, wherein the inter-predicting the second image block in Merge mode or Skip mode in case that a reference block of the second image block exists in the M reference blocks comprises:
and under the condition that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks, performing Merge-mode or Skip-mode inter-frame prediction on the second image block by adopting the reference block corresponding to the jth candidate motion vector.
14. The method according to any one of claims 7 to 12, further comprising:
and under the condition that the reference block corresponding to the jth candidate motion vector does not exist in the M reference blocks, skipping the inter-frame prediction process of the Merge mode or the Skip mode on the second image block by adopting the reference block corresponding to the jth candidate motion vector.
15. The method according to any of claims 1 to 14, wherein each of said N image blocks is a prediction unit, PU.
16. The method according to any of claims 2 to 4, wherein the first image block is any of the following: a coding tree unit CTU, a coding unit CU, a prediction unit PU.
17. An encoding apparatus, comprising:
a memory to store instructions;
a processor for executing the instructions stored by the memory to perform the following operations:
acquiring data of M reference blocks from a reference frame buffer, wherein M is a positive integer;
when the Merge mode or Skip mode inter-frame prediction is carried out on N image blocks, whether a reference block of each image block in the N image blocks exists or not is determined from the M reference blocks, and N is a positive integer.
18. The encoding apparatus as claimed in claim 17, wherein the retrieving data of the M reference blocks from the reference frame buffer comprises:
and acquiring the M reference blocks from the reference frame buffer according to S candidate motion vectors of a first image block, wherein S is a positive integer less than or equal to M, and the first image block and the N image blocks belong to the same image unit block.
19. The encoding device of claim 18, wherein the first image block comprises some or all of the N image blocks.
20. The encoding device according to claim 19, wherein the first image block is the image unit block.
21. The encoding apparatus according to any of claims 17 to 20, wherein the N image blocks comprise a second image block;
when performing inter prediction in a Merge mode or a Skip mode on N image blocks, determining whether a reference block of each image block in the N image blocks exists in the M reference blocks includes:
determining whether a reference block of the second image block exists from the M reference blocks;
wherein the processor is further configured to perform the following operations:
and under the condition that the reference block of the second image block exists in the M reference blocks, performing Merge mode or Skip mode inter-frame prediction on the second image block.
22. The encoding device of claim 21, wherein the processor is further configured to:
skipping Merge mode or Skip mode inter prediction of the second image block when a reference block of the second image block is absent from the M reference blocks.
23. The encoding apparatus as claimed in claim 21, wherein the second image block has a plurality of candidate motion vectors;
wherein the determining whether a reference block corresponding to the second image block exists in the M reference blocks includes:
and determining whether a reference block corresponding to the jth candidate motion vector of the second image block exists in the M reference blocks, wherein j is 1, …, and P is the number of the candidate motion vectors.
24. The encoding apparatus as claimed in claim 23, wherein the determining whether there is a reference block corresponding to the jth candidate motion vector of the second image block from among the M reference blocks comprises:
determining whether a reference block corresponding to the jth candidate motion vector of the second image block exists from the M reference blocks according to the following factors:
the position of the second image block, the jth candidate motion vector, and the positions of the M reference blocks.
25. The encoding apparatus as claimed in claim 23 or 24, wherein the determining whether there is a reference block corresponding to the jth candidate motion vector of the second image block from the M reference blocks comprises:
and determining that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks if the position of the second image block after the shift of the jth candidate motion vector is performed is located in the position of any one of the M reference blocks under the condition that the horizontal component and the vertical component of the jth candidate motion vector point to integer pixels respectively.
26. The encoding apparatus according to claim 24, wherein in the case where the horizontal component and/or the vertical component of the jth candidate motion vector point to a sub-pixel, the factor further comprises a preset value, and the preset value is related to a minimum precision of the motion vector and/or a number of taps of an interpolation filter.
27. The encoding apparatus as claimed in claim 26, wherein the determining whether there is a reference block corresponding to the jth candidate motion vector of the second image block from among the M reference blocks comprises:
if the position of the second image block after the shift of the jth candidate motion vector is located in the contraction position of the ith reference block in the M reference blocks, determining that a reference block corresponding to the jth candidate motion vector exists in the M reference blocks, i is 1, …, M, wherein,
in case that the horizontal component of the jth candidate motion vector points to a sub-pixel and the vertical component points to a whole pixel, the contraction position of the ith reference block indicates a position after the ith reference block is respectively contracted inward by the preset value through a left boundary and a right boundary,
in case that the horizontal component of the jth candidate motion vector points to integer pixels and the vertical component points to sub-pixels, the contraction position of the ith reference block indicates a position after the ith reference block is contracted inward by the preset values through the upper boundary and the lower boundary, respectively,
and under the condition that the horizontal component and the vertical component of the jth candidate motion vector point to the sub-pixels, the contraction position of the ith reference block represents a position of the ith reference block after the ith reference block is respectively contracted inwards by the preset value through a left boundary, a right boundary, an upper boundary and a lower boundary.
28. The encoding apparatus according to claim 23 or 24, wherein determining whether a reference block corresponding to the jth candidate motion vector of the second image block exists among the M reference blocks comprises:
if the position of the second image block after being shifted by the jth candidate motion vector is located within the contraction position of the ith reference block among the M reference blocks, determining that a reference block corresponding to the jth candidate motion vector exists among the M reference blocks, wherein,
the contraction position of the ith reference block indicates the position obtained after the left boundary, the right boundary, the upper boundary and the lower boundary of the ith reference block are each contracted inward by a preset value, the preset value being related to the minimum precision of the motion vector, and i = 1, …, M.
29. The encoding device according to any one of claims 23 to 28, wherein the performing inter prediction in the Merge mode or the Skip mode on the second image block in the case where a reference block of the second image block exists among the M reference blocks comprises:
in the case where a reference block corresponding to the jth candidate motion vector exists among the M reference blocks, performing inter prediction in the Merge mode or the Skip mode on the second image block using the reference block corresponding to the jth candidate motion vector.
30. The encoding apparatus of any one of claims 23 through 28, wherein the processor is further configured to:
in the case where no reference block corresponding to the jth candidate motion vector exists among the M reference blocks, skipping the process of performing inter prediction in the Merge mode or the Skip mode on the second image block using the reference block corresponding to the jth candidate motion vector.
31. The encoding device according to any one of claims 17 to 30, wherein each of the N image blocks is a prediction unit (PU).
32. The encoding device according to any one of claims 18 to 20, wherein the first image block is any one of the following: a coding tree unit (CTU), a coding unit (CU), or a prediction unit (PU).
33. A computer storage medium, having stored thereon a computer program which, when executed by a computer, causes the computer to perform the method of any one of claims 1 to 16.
34. A computer program product comprising instructions which, when executed by a computer, cause the computer to perform the method of any one of claims 1 to 16.
CN201980049277.4A 2019-12-31 2019-12-31 Video processing method and device Pending CN112514391A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/130843 WO2021134631A1 (en) 2019-12-31 2019-12-31 Video processing method and apparatus

Publications (1)

Publication Number Publication Date
CN112514391A true CN112514391A (en) 2021-03-16

Family

ID=74924081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980049277.4A Pending CN112514391A (en) 2019-12-31 2019-12-31 Video processing method and device

Country Status (2)

Country Link
CN (1) CN112514391A (en)
WO (1) WO2021134631A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016104179A1 (en) * 2014-12-26 2016-06-30 Sony Corporation Image processing apparatus and image processing method
WO2016192662A1 (en) * 2015-06-03 2016-12-08 Mediatek Inc. Method and apparatus for resource sharing between intra block copy mode and inter prediction mode in video coding systems
CN107087171A (en) * 2017-05-26 2017-08-22 中国科学技术大学 HEVC Integer Pixel Motion Estimation Method and Device
CN108024116A (en) * 2016-10-28 2018-05-11 Tencent Technology (Shenzhen) Co., Ltd. Data caching method and device
CN110198440A (en) * 2018-03-29 2019-09-03 Tencent Technology (Shenzhen) Co., Ltd. Method and apparatus for determining coding prediction information and for video coding

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103533366B (en) * 2012-07-03 2016-11-23 Spreadtrum Communications (Shanghai) Co., Ltd. Caching method and device for video motion compensation

Also Published As

Publication number Publication date
WO2021134631A1 (en) 2021-07-08

Similar Documents

Publication Publication Date Title
US20220321907A1 (en) Method and device for video image processing
CN112585966B (en) Inter prediction method based on history-based motion vector and apparatus therefor
RU2699258C2 (en) Image prediction method and an image prediction device
CN111630861B (en) Video processing method and device
JP2019519995A (en) Video Coding Using Adaptive Motion Information Improvement
JP7279154B2 (en) Motion vector prediction method and apparatus based on affine motion model
JP2019505144A (en) Geometric transformation for filters for video coding
JP7608558B2 (en) MOTION VECTOR PREDICTION METHOD AND RELATED APPARATUS - Patent application
JP2022523350A (en) Methods, devices and systems for determining predictive weighting for merge modes
KR102621958B1 (en) Candidate motion vector list acquisition method, device, encoder and decoder
JP2021523604A (en) Motion compensation for video coding and decoding
CN112673627B (en) Affine motion prediction-based image decoding method and apparatus using affine merge candidate list in image encoding system
CN117280691A (en) Enhanced motion vector prediction
JP2024149497A (en) Bidirectional inter prediction method and apparatus
JP2024522761A (en) Bidirectional Matching-Based Motion Refinement for Affine Motion Compensation in Video Coding
CN112740663B (en) Image prediction method, device and corresponding encoder and decoder
US20220417550A1 (en) Method and apparatus for constructing motion information list in video encoding and decoding and device
KR20220002991A (en) Adaptive motion vector prediction candidates in frames with global motion
CN110876058B (en) Historical candidate list updating method and device
CN112514391A (en) Video processing method and device
CN118020302A (en) Candidate derivation for affine merge mode in video codec
CN110677645B (en) Image prediction method and device
CN111226440A (en) Video processing method and device
WO2020181507A1 (en) Image processing method and apparatus
KR20240100392A (en) Deriving candidates for affine merge modes in video coding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210316