WO2020180166A1

WO2020180166A1 - Image encoding/decoding method and apparatus

Info

Publication number: WO2020180166A1
Application number: PCT/KR2020/003228
Authority: WO
Inventors: 안용조
Original assignee: 디지털인사이트주식회사
Priority date: 2019-03-07
Filing date: 2020-03-09
Publication date: 2020-09-10
Also published as: MX2021010704A; US20220360776A1; MX2025000107A; KR20210127709A; CA3132582A1; CN113545041A; US12034923B2; US20210321096A1; US11363265B2; US20240323370A1

Abstract

An image encoding/decoding method and apparatus according to the present invention may: reconstruct a current picture on the basis of at least one of intra prediction and inter prediction; specify a block boundary, to which a de-blocking filter is applied, in the reconstructed current picture; and apply a de-blocking filter to the block boundary on the basis of a filter type pre-defined in an encoding apparatus.

Description

Video encoding/decoding method and apparatus

The present invention relates to a video encoding/decoding method and apparatus.

Recently, as demand for high-resolution, high-quality video has increased, the need for high-efficiency video compression technology for next-generation video services has emerged. In response, ISO/IEC MPEG and ITU-T VCEG, which jointly standardized the H.264/AVC and HEVC video compression standards, formed JVET (Joint Video Exploration Team) in October 2015 and carried out research and exploration toward establishing a new video compression standard. In April 2018, standardization of the new video compression standard began with an evaluation of the responses to its CfP (Call for Proposal).

In video compression, the block splitting structure defines the unit in which encoding and decoding are performed and the unit to which major encoding and decoding techniques such as prediction and transformation are applied. As video compression technology has developed, the block sizes used for encoding and decoding have gradually increased, and a wider variety of block division types is supported. In addition, video compression is performed using not only units for encoding and decoding but also units subdivided according to the role of each block.

In the HEVC standard, video encoding and decoding are performed using unit blocks that are subdivided according to a quadtree block splitting structure and according to their roles in prediction and transformation. In addition to the quadtree block splitting structure, various block splitting structures have been proposed to improve video coding efficiency, such as QTBT (QuadTree plus Binary Tree), which combines a quadtree with a binary tree, and MTT (Multi-Type-Tree), which additionally combines a triple tree. By supporting various block sizes and various block splitting structures, one picture is divided into a plurality of blocks, and the information signaled for each coding unit, such as the coding mode, motion information, and intra prediction direction information of each block, becomes more diverse. As a result, the number of bits needed to express this information increases greatly.

An image encoding/decoding method and apparatus according to the present invention provides an in-loop filtering method for a reconstructed picture.

An image encoding/decoding method and apparatus according to the present invention provides a motion compensation method according to a plurality of inter prediction modes.

An image encoding/decoding method and apparatus according to the present invention may reconstruct a current picture based on at least one of intra prediction or inter prediction, specify a block boundary to which a deblocking filter is applied in the reconstructed current picture, and apply the deblocking filter to the specified block boundary based on a filter type pre-defined in the encoding/decoding apparatus.

In the image encoding/decoding method and apparatus according to the present invention, the deblocking filter is applied in units of a predetermined MxN sample grid, where M and N may be integers of 4, 8, or more.

In the video encoding/decoding method and apparatus according to the present invention, the encoding/decoding apparatus defines a plurality of filter types having different filter lengths, and the plurality of filter types may include at least one of a long filter, an intermediate (middle) filter, or a short filter.

In the video encoding/decoding method and apparatus according to the present invention, the filter length of the long filter may be 8, 10, 12, or 14, the filter length of the intermediate filter may be 6, and the filter length of the short filter may be 2 or 4.

In the image encoding/decoding method and apparatus according to the present invention, the number of pixels to which the deblocking filter is applied in a P block and the number of pixels to which the deblocking filter is applied in a Q block are different from each other, where the P block and the Q block may be the blocks adjacent to the specified block boundary on either side.

In the video encoding/decoding method and apparatus according to the present invention, the number of pixels to which the deblocking filter is applied in the P block may be 3, and the number of pixels to which the deblocking filter is applied in the Q block may be 7.

In the video encoding/decoding method and apparatus according to the present invention, the reconstructing of the current picture may comprise: constructing a merge candidate list of a current block; deriving motion information of the current block from the merge candidate list; and performing motion compensation of the current block based on the motion information.

In the method and apparatus for encoding/decoding an image according to the present invention, motion compensation of the current block may be performed based on a reference region according to a current picture reference mode.

In the method and apparatus for encoding/decoding an image according to the present invention, a motion vector among the derived motion information may be corrected by using a motion vector difference value for a merge mode.

In the image encoding/decoding method and apparatus according to the present invention, correction of the motion vector may be performed only when the size of the current block is larger than a predetermined threshold size.

In the present invention, an in-loop filter is applied in a unit of a predetermined sample grid, but the efficiency of in-loop filtering may be improved by considering a boundary between prediction/transform blocks or subblocks thereof.

In addition, the present invention filters block boundaries based on in-loop filters having different filter lengths, so that artifacts on the boundary can be efficiently removed.

In addition, according to the present invention, the efficiency of motion compensation may be improved by adaptively using a plurality of inter prediction modes according to a predetermined priority.

In addition, the present invention can improve the encoding efficiency of the current picture reference mode by adaptively using the reference region according to the current picture reference mode.

In addition, according to the present invention, by selectively using a merge mode based on a motion vector difference value, accuracy of an inter prediction parameter for a merge mode can be improved and encoding efficiency of the merge mode can be improved.

FIG. 1 is a block diagram showing an image encoding apparatus according to the present invention.

FIG. 2 is a block diagram showing an image decoding apparatus according to the present invention.

FIG. 3 is a diagram illustrating a target boundary and a target pixel of a deblocking filter in an embodiment to which the present invention is applied.

FIG. 4 illustrates a deblocking filtering process in the filter unit 150 of the encoding apparatus and the filter unit 240 of the decoding apparatus according to an embodiment to which the present invention is applied.

FIG. 5 is a diagram illustrating a concept of performing prediction and transformation by dividing one coding block into a plurality of sub-blocks.

FIG. 6 is a diagram illustrating an example of sub-block division for one coding block and a concept of a sub-block boundary and a deblocking filter grid.

FIG. 7 is a diagram illustrating a concept of a pixel currently filtered at a boundary between a P block and a Q block and a reference pixel used for filtering.

FIG. 8 is a diagram illustrating a concept of a pixel currently filtered at a boundary between a P block and a Q block and a sub-block boundary within a Q block and a reference pixel used for filtering.

FIG. 9 is a diagram for explaining a basic concept of a current picture reference mode.

FIG. 10 is a diagram illustrating an embodiment of a current picture reference area according to a location of a current block.

FIGS. 11 to 14 illustrate examples of a region including a current block and the searchable and referenceable region of a current picture reference (CPR).

FIG. 15 illustrates an image encoding/decoding method using a merge mode based on a motion vector difference value (MVD) as an embodiment to which the present invention is applied.

FIGS. 16 to 21 are diagrams illustrating a method of determining an inter prediction mode of a current block based on a predetermined priority in an embodiment to which the present invention is applied.

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings in the present specification so that those of ordinary skill in the art may easily implement the present invention. However, the present invention may be implemented in various different forms and is not limited to the embodiments described herein. In the drawings, parts irrelevant to the description are omitted in order to clearly describe the present invention, and similar reference numerals are assigned to similar parts throughout the specification.

Throughout the specification, when a part is said to be 'connected' to another part, this includes not only the case where it is directly connected but also the case where it is electrically connected with another element interposed therebetween. In addition, when a part is said to 'include' a certain element, this means that other elements may be further included, rather than excluded, unless specifically stated to the contrary.

As used throughout this specification, the term “step of (doing) ~” or “step of ~” does not mean “step for ~”. In addition, terms such as first and second may be used to describe various elements, but the elements should not be limited by these terms. These terms are used only to distinguish one component from another.

In addition, the components shown in the embodiments of the present invention are shown independently to represent different characteristic functions; this does not mean that each component is formed of separate hardware or a single software unit. That is, each component is listed separately for convenience of explanation, and at least two components may be combined into a single component, or one component may be divided into a plurality of components that together perform a function. Integrated and separated embodiments of each of these components are also included in the scope of the present invention as long as they do not depart from its essence.

Hereinafter, in the various embodiments of the present invention described herein, terms such as “~ unit”, “~ group”, “~ module”, and “~ block” mean a unit that processes at least one function or operation, and such a unit may be implemented in hardware, in software, or in a combination of hardware and software.

In addition, a coding block refers to a processing unit for a set of target pixels on which encoding and decoding are currently performed, and the terms coding block and coding unit may be used interchangeably. In addition, the term coding unit refers to a coding unit (CU) and may be used generically to include a coding block (CB).

In addition, quadtree splitting means that one block is divided into four independent coding units, and binary splitting means that one block is divided into two independent coding units. In addition, ternary splitting means that one block is divided into three independent coding units in a 1:2:1 ratio.
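The three split types defined above can be illustrated with a minimal sketch. The function name `split_block` and the mode strings are hypothetical labels chosen for illustration and do not appear in the specification; block dimensions are assumed to be divisible by 4 so that the 1:2:1 ratio yields integer sizes.

```python
def split_block(width, height, mode):
    """Return the sub-blocks of one split as (x, y, w, h) tuples.

    Mode names are illustrative only (not taken from the specification).
    """
    if mode == "quad":
        hw, hh = width // 2, height // 2
        return [(0, 0, hw, hh), (hw, 0, hw, hh),
                (0, hh, hw, hh), (hw, hh, hw, hh)]
    if mode == "binary_vertical":
        hw = width // 2
        return [(0, 0, hw, height), (hw, 0, hw, height)]
    if mode == "binary_horizontal":
        hh = height // 2
        return [(0, 0, width, hh), (0, hh, width, hh)]
    if mode == "ternary_vertical":
        q = width // 4  # 1:2:1 ratio -> quarter, half, quarter
        return [(0, 0, q, height), (q, 0, 2 * q, height),
                (3 * q, 0, q, height)]
    if mode == "ternary_horizontal":
        q = height // 4
        return [(0, 0, width, q), (0, q, width, 2 * q),
                (0, 3 * q, width, q)]
    raise ValueError("unknown split mode: " + mode)
```

For example, a vertical ternary split of a 32x32 block yields sub-blocks of widths 8, 16, and 8.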

Referring to FIG. 1, the image encoding apparatus 100 may include a picture splitter 110, prediction units 120 and 125, a transform unit 130, a quantization unit 135, a rearrangement unit 160, an entropy encoder 165, an inverse quantization unit 140, an inverse transform unit 145, a filter unit 150, and a memory 155.

The picture splitter 110 may divide the input picture into at least one processing unit. In this case, the processing unit may be a prediction unit (PU), a transform unit (TU), or a coding unit (CU). Hereinafter, in an embodiment of the present invention, a coding unit may be used as a unit that performs encoding or a unit that performs decoding.

A prediction unit may be obtained by dividing one coding unit into at least one square or rectangle of the same size, or may be divided so that one prediction unit among the prediction units split within one coding unit has a shape and/or size different from another prediction unit. When generating a prediction unit that performs intra prediction based on a coding unit, if the coding unit is not a minimum coding unit, intra prediction may be performed without dividing it into a plurality of NxN prediction units.

The prediction units 120 and 125 may include an inter prediction unit 120 that performs inter prediction and an intra prediction unit 125 that performs intra prediction. Whether to use inter prediction or intra prediction for a prediction unit may be determined, and specific information (e.g., intra prediction mode, motion vector, reference picture, etc.) according to each prediction method may be determined. A residual value (residual block) between the generated prediction block and the original block may be input to the transform unit 130. In addition, prediction mode information, motion vector information, and the like used for prediction may be encoded by the entropy encoder 165 together with the residual value and transmitted to a decoder. However, when the decoder-side motion information derivation technique according to the present invention is applied, the encoder does not generate the prediction mode information and motion vector information, so this information is not transmitted to the decoder. Instead, the encoder may signal and transmit information indicating that motion information is derived and used on the decoder side, together with information on the technique used to derive the motion information.

The inter prediction unit 120 may predict a prediction unit based on information of at least one of a previous picture or a subsequent picture of the current picture. In some cases, it may also predict the prediction unit based on information of a partial region of the current picture that has already been encoded. As the inter prediction mode, various methods such as a merge mode, an advanced motion vector prediction (AMVP) mode, an affine mode, a current picture reference mode, and a combined prediction mode may be used. In the merge mode, at least one motion vector among spatial/temporal merge candidates may be set as the motion vector of the current block, and inter prediction may be performed using that motion vector. However, even in the merge mode, the pre-set motion vector may be corrected by adding a motion vector difference value (MVD) to it. In this case, the corrected motion vector may be used as the final motion vector of the current block, which will be described in detail with reference to FIG. 15. The affine mode is a method of dividing the current block into predetermined sub-block units and performing inter prediction using a motion vector derived for each sub-block unit. Here, the sub-block unit is expressed as NxM, where N and M may each be an integer of 4, 8, 16, or more. The shape of the sub-block may be square or non-square. The sub-block unit may be fixed and pre-defined in the encoding apparatus, or may be variably determined in consideration of the size/shape of the current block, the component type, and the like. The current picture reference mode is an inter prediction method that uses a previously reconstructed region in the current picture to which the current block belongs and a predetermined block vector, which will be described in detail with reference to FIGS. 9 to 14.
In the combined prediction mode, a first prediction block obtained through inter prediction and a second prediction block obtained through intra prediction are each generated for one current block, and predetermined weights are applied to the first and second prediction blocks to generate the final prediction block. Here, inter prediction may be performed using any one of the aforementioned inter prediction modes. The intra prediction may be performed using only a fixed intra prediction mode (e.g., any one of a planar mode, a DC mode, a vertical/horizontal mode, and a diagonal mode) pre-set in the encoding apparatus. Alternatively, the intra prediction mode may be derived based on the intra prediction mode of a neighboring block (e.g., at least one of the left, upper, upper-left, upper-right, and lower-right blocks) adjacent to the current block. In this case, the number of neighboring blocks used may be fixed to one or two, or may be three or more. Even if all of the aforementioned neighboring blocks are available, use may be restricted to only one of the left neighboring block or the upper neighboring block, or to only the left and upper neighboring blocks. The weights may be determined in consideration of whether the aforementioned neighboring blocks are blocks encoded in an intra mode. Assume that a weight w1 is applied to the first prediction block and a weight w2 is applied to the second prediction block. When both the left and upper neighboring blocks are encoded in the intra mode, w1 may be a natural number smaller than w2; for example, the ratio of w1 to w2 may be [1:3]. When neither the left nor the upper neighboring block is encoded in the intra mode, w1 may be a natural number greater than w2; for example, the ratio of w1 to w2 may be [3:1]. When only one of the left and upper neighboring blocks is encoded in the intra mode, w1 may be set equal to w2.
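The weight selection described above may be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the equal-weight case is represented as (2, 2) so that every weight pair sums to 4, and the rounding offset and integer division are assumptions not given in the text.

```python
def combined_weights(left_is_intra, top_is_intra):
    """Return (w1, w2): w1 weights the inter block, w2 the intra block."""
    intra_neighbors = int(left_is_intra) + int(top_is_intra)
    if intra_neighbors == 2:
        return (1, 3)   # both neighbors intra: w1 < w2, ratio 1:3
    if intra_neighbors == 0:
        return (3, 1)   # no neighbor intra: w1 > w2, ratio 3:1
    return (2, 2)       # exactly one neighbor intra: w1 == w2 (assumed 2:2)


def combine_predictions(inter_pred, intra_pred, left_is_intra, top_is_intra):
    """Blend two equally sized 2-D prediction blocks sample by sample."""
    w1, w2 = combined_weights(left_is_intra, top_is_intra)
    total = w1 + w2
    # Rounding to nearest via an offset of total // 2 is an assumption.
    return [[(w1 * a + w2 * b + total // 2) // total
             for a, b in zip(row_inter, row_intra)]
            for row_inter, row_intra in zip(inter_pred, intra_pred)]
```

With an inter sample of 100 and an intra sample of 20, the blended sample is 40 when both neighbors are intra coded, 80 when neither is, and 60 when exactly one is.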

The inter prediction unit 120 may include a reference picture interpolation unit, a motion prediction unit, and a motion compensation unit.

The reference picture interpolation unit may receive reference picture information from the memory 155 and generate pixel information at sub-integer pixel positions from the reference picture. For luminance pixels, a DCT-based interpolation filter with varying filter coefficients may be used to generate sub-integer pixel information in units of 1/4 pixel. For chrominance signals, a DCT-based interpolation filter with varying filter coefficients may be used to generate sub-integer pixel information in units of 1/8 pixel.
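As an illustration of quarter-pixel luminance interpolation, the following sketch applies the 8-tap DCT-based luma filter coefficients of the HEVC standard along one row. The use of HEVC's coefficients is an assumption made for illustration; the text does not specify the coefficients. The single-stage rounding, the edge clamping, and the names `HEVC_LUMA_TAPS` and `interpolate_sample` are likewise simplifications or hypothetical labels.

```python
# HEVC luma interpolation taps, indexed by fractional position in
# quarter-pixel units (0 = integer position). Each set sums to 64.
HEVC_LUMA_TAPS = {
    0: (0, 0, 0, 64, 0, 0, 0, 0),
    1: (-1, 4, -10, 58, 17, -5, 1, 0),
    2: (-1, 4, -11, 40, 40, -11, 4, -1),
    3: (0, 1, -5, 17, 58, -10, 4, -1),
}


def interpolate_sample(row, x, frac, bit_depth=8):
    """Interpolate the sample at position x + frac/4 in a 1-D row."""
    taps = HEVC_LUMA_TAPS[frac]
    acc = 0
    for k, coeff in enumerate(taps):
        # The taps cover positions x-3 .. x+4; clamp at the row edges.
        pos = min(max(x + k - 3, 0), len(row) - 1)
        acc += coeff * row[pos]
    value = (acc + 32) >> 6            # normalize by 64 with rounding
    return min(max(value, 0), (1 << bit_depth) - 1)
```

A production interpolator would filter in two dimensions and keep higher intermediate precision; this one-dimensional version only shows how the fractional position selects a coefficient set.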

The motion prediction unit may perform motion prediction based on the reference picture interpolated by the reference picture interpolation unit. Various methods, such as the full search-based block matching algorithm (FBMA), three-step search (TSS), and new three-step search algorithm (NTS), may be used to calculate a motion vector. The motion vector may have a value in units of 1/2 or 1/4 pixel based on the interpolated pixels. The motion prediction unit may predict the current prediction unit using different motion prediction methods.
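A full search-based block matching algorithm can be sketched at integer-pixel precision as below. The sum of absolute differences (SAD) as the matching cost, the square block shape, and the function names are assumptions made for illustration; the text does not prescribe a cost measure.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def full_search(cur, ref, cx, cy, size, search_range):
    """Test every integer displacement within +/- search_range.

    Returns (dx, dy, cost) of the best match for the block whose
    top-left corner is (cx, cy) in the current picture. Candidates
    that fall outside the reference picture are skipped.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]
```

Fast methods such as the three-step search visit only a subset of these candidate displacements, trading exhaustiveness for speed.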

The intra prediction unit 125 may generate a prediction unit based on reference pixel information around the current block, which is pixel information in the current picture. When a neighboring block of the current prediction unit is a block on which inter prediction has been performed, so that the reference pixel is a pixel obtained by inter prediction, the reference pixel included in the inter-predicted block may be replaced with reference pixel information of a neighboring block on which intra prediction has been performed. That is, when a reference pixel is not available, the unavailable reference pixel information may be replaced with at least one of the available reference pixels.

In addition, a residual block may be generated that contains residual information, that is, the difference between the prediction unit generated by the prediction units 120 and 125 and the original block of that prediction unit. The generated residual block may be input to the transform unit 130.

The transform unit 130 may transform the residual block, which contains the residual information between the original block and the prediction unit generated by the prediction units 120 and 125, using a transform method such as DCT (Discrete Cosine Transform), DST (Discrete Sine Transform), or KLT. Whether to apply DCT, DST, or KLT to transform the residual block may be determined based on intra prediction mode information of the prediction unit used to generate the residual block.

The quantization unit 135 may quantize values converted into the frequency domain by the transform unit 130. Quantization coefficients may vary depending on the block or the importance of the image. The value calculated by the quantization unit 135 may be provided to the inverse quantization unit 140 and the rearrangement unit 160.
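As a simple illustration of quantization and its inverse, the sketch below uses uniform scalar quantization with a single step size. The `step` parameter and both function names are hypothetical; an actual codec derives its scaling from a quantization parameter and may vary it per block, which is not modeled here.

```python
def quantize(coeff, step):
    """Map a transform coefficient to an integer level (round to nearest)."""
    sign = -1 if coeff < 0 else 1
    return sign * ((abs(coeff) + step // 2) // step)


def dequantize(level, step):
    """Reconstruct an approximate coefficient from its level."""
    return level * step
```

For example, with a step of 8 the coefficient 37 maps to level 5 and is reconstructed as 40; the difference of 3 is the quantization error.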

The rearrangement unit 160 may rearrange the coefficient values of the quantized residual values.

The rearrangement unit 160 may change the two-dimensional block shape coefficients into a one-dimensional vector shape through a coefficient scanning method. For example, the rearrangement unit 160 may scan from a DC coefficient to a coefficient in a high frequency region using a Zig-Zag Scan method, and change it into a one-dimensional vector form. Depending on the size of the transform unit and the intra prediction mode, instead of zig-zag scan, a vertical scan that scans two-dimensional block shape coefficients in a column direction and a horizontal scan that scans two-dimensional block shape coefficients in a row direction may be used. That is, according to the size of the transformation unit and the intra prediction mode, it is possible to determine which scan method is to be used among zig-zag scan, vertical direction scan, and horizontal direction scan.
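The three scan methods named above can be sketched for a square block as follows. The function names are hypothetical, and the rule for choosing among the scans (based on transform unit size and intra prediction mode) is not modeled.

```python
def zigzag_order(n):
    """Return (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):          # s = row + col of one anti-diagonal
        diag = [(y, s - y) for y in range(n) if 0 <= s - y < n]
        if s % 2 == 0:
            diag.reverse()              # even diagonals run bottom-left to top-right
        order.extend(diag)
    return order


def zigzag_scan(block):
    """Scan from the DC coefficient toward the high-frequency corner."""
    return [block[y][x] for y, x in zigzag_order(len(block))]


def vertical_scan(block):
    """Scan column by column (column direction)."""
    n = len(block)
    return [block[y][x] for x in range(n) for y in range(n)]


def horizontal_scan(block):
    """Scan row by row (row direction)."""
    return [value for row in block for value in row]
```

Each function turns a two-dimensional coefficient block into a one-dimensional list; the decoder applies the same order in reverse to rebuild the block.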

The entropy encoding unit 165 may perform entropy encoding based on the values calculated by the rearrangement unit 160. Entropy encoding may use various coding methods such as Exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), and Context-Adaptive Binary Arithmetic Coding (CABAC). In this regard, the entropy encoder 165 may encode residual value coefficient information of a coding unit received from the rearrangement unit 160 and the prediction units 120 and 125. In addition, according to the present invention, information indicating that motion information is derived and used on the decoder side, and information on the technique used to derive the motion information, may be signaled and transmitted.
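Of the entropy coding methods listed, Exponential Golomb is the simplest to illustrate. The sketch below implements the order-0 unsigned code on bit strings; the function names and the string representation of bits are choices made for readability and are not taken from the specification.

```python
def exp_golomb_encode(value):
    """Encode a non-negative integer as an order-0 Exp-Golomb bit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    code = bin(value + 1)[2:]               # binary form of value + 1
    return "0" * (len(code) - 1) + code     # leading-zero prefix, then the code


def exp_golomb_decode(bits):
    """Decode one codeword; return (value, remaining bits)."""
    zeros = 0
    while bits[zeros] == "0":               # count the leading zeros
        zeros += 1
    end = 2 * zeros + 1                     # codeword length
    return int(bits[zeros:end], 2) - 1, bits[end:]
```

Smaller values receive shorter codewords: 0 is coded as a single bit, while 3 requires five bits.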

The inverse quantization unit 140 and the inverse transform unit 145 inverse-quantize the values quantized by the quantization unit 135 and inverse-transform the values transformed by the transform unit 130. The residual value generated by the inverse quantization unit 140 and the inverse transform unit 145 may be combined with the prediction unit predicted through the motion estimation unit, motion compensation unit, and intra prediction unit included in the prediction units 120 and 125 to generate a reconstructed block.

The filter unit 150 may include at least one of a deblocking filter, an offset correction unit, and an adaptive loop filter (ALF). The deblocking filter may remove block distortion caused by boundaries between blocks in the reconstructed picture, which will be described with reference to FIGS. 3 to 8. The offset correction unit may correct, in pixel units, the offset of the deblocked image from the original image. To perform offset correction for a specific picture, a method may be used in which the pixels of the image are divided into a certain number of regions, the region to be offset is determined, and the offset is applied to that region, or a method in which the offset is applied in consideration of the edge information of each pixel. Adaptive loop filtering (ALF) may be performed based on a value obtained by comparing the filtered reconstructed image with the original image. After the pixels of the image are divided into predetermined groups, one filter to be applied to each group may be determined, and filtering may be performed differently for each group.

The memory 155 may store the reconstructed block or picture calculated through the filter unit 150, and the stored reconstructed block or picture may be provided to the prediction units 120 and 125 when performing inter prediction.

Referring to FIG. 2, the image decoder 200 may include an entropy decoding unit 210, a rearrangement unit 215, an inverse quantization unit 220, an inverse transform unit 225, prediction units 230 and 235, a filter unit 240, and a memory 245.

When an image bitstream is input from the image encoder, the input bitstream may be decoded in a procedure opposite to that of the image encoder.

The entropy decoding unit 210 may perform entropy decoding in a procedure opposite to that performed by the entropy encoding unit of the image encoder. For example, various methods such as Exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), and Context-Adaptive Binary Arithmetic Coding (CABAC) may be applied in response to the method performed by the image encoder.

The entropy decoder 210 may decode information related to intra prediction and inter prediction performed by the encoder.

The rearrangement unit 215 may rearrange the bitstream entropy-decoded by the entropy decoding unit 210 based on the rearrangement method used by the encoder. Coefficients expressed in the form of a one-dimensional vector may be reconstructed into coefficients in the form of a two-dimensional block and rearranged.

The inverse quantization unit 220 may perform inverse quantization based on a quantization parameter provided by an encoder and a coefficient value of a rearranged block.

The inverse transform unit 225 may perform, on the quantization result produced by the image encoder, the inverse of the transform performed by the transform unit, that is, an inverse DCT, inverse DST, or inverse KLT corresponding to DCT, DST, or KLT. The inverse transformation may be performed based on a transmission unit determined by the image encoder. The inverse transform unit 225 of the image decoder may selectively apply a transformation technique (e.g., DCT, DST, KLT) according to a plurality of pieces of information such as the prediction method, the size of the current block, and the prediction direction.

The prediction units 230 and 235 may generate a prediction block based on information related to generation of a prediction block provided from the entropy decoder 210 and information on a previously decoded block or picture provided from the memory 245.

As described above, when intra prediction is performed in the same manner as in the image encoder and the size of the prediction unit is the same as the size of the transform unit, intra prediction for the prediction unit is performed based on the pixels to the left of the prediction unit, the pixel at its top left, and the pixels above it. When the size of the prediction unit differs from the size of the transform unit, intra prediction may be performed using reference pixels based on the transform unit. In addition, intra prediction using NxN splitting may be used only for the smallest coding unit.

The prediction units 230 and 235 may include a prediction unit determination unit, an inter prediction unit, and an intra prediction unit. The prediction unit determination unit may receive various information input from the entropy decoder 210, such as prediction unit information, prediction mode information of the intra prediction method, and motion prediction-related information of the inter prediction method, identify the prediction unit in the current coding unit, and determine whether the prediction unit performs inter prediction or intra prediction. On the other hand, if the encoder 100 does not transmit motion prediction-related information for inter prediction and instead transmits information indicating that motion information is derived and used on the decoder side, together with information on the technique used to derive the motion information, the prediction unit determination unit determines how the inter prediction unit 230 performs prediction based on the information transmitted from the encoder 100.

The inter prediction unit 230 may perform inter prediction for the current prediction unit based on information included in at least one of a previous picture or a subsequent picture of the current picture containing the current prediction unit, using the information necessary for inter prediction of the current prediction unit provided by the image encoder. To perform inter prediction, the inter prediction mode of a prediction unit included in a coding unit may be determined on the basis of that coding unit. Regarding the inter prediction mode, the above-described merge mode, AMVP mode, affine mode, current picture reference mode, combined prediction mode, and the like may be used in the same manner in the decoding apparatus, and a detailed description thereof will be omitted. The inter prediction unit 230 may determine the inter prediction mode of the current prediction unit according to a predetermined priority, which will be described with reference to FIGS. 16 to 18.

The intra prediction unit 235 may generate a prediction block based on pixel information in the current picture. When the prediction unit is a prediction unit that has performed intra prediction, intra prediction may be performed based on intra prediction mode information of a prediction unit provided from an image encoder. The intra prediction unit 235 may include an AIS (Adaptive Intra Smoothing) filter, a reference pixel interpolation unit, and a DC filter. The AIS filter is a part that performs filtering on a reference pixel of the current block, and may determine whether to apply the filter according to the prediction mode of the current prediction unit and apply it. AIS filtering may be performed on a reference pixel of a current block by using the prediction mode and AIS filter information of the prediction unit provided by the image encoder. When the prediction mode of the current block is a mode in which AIS filtering is not performed, the AIS filter may not be applied.

When the prediction mode of the prediction unit is a mode that performs intra prediction based on pixel values obtained by interpolating reference pixels, the reference pixel interpolation unit may interpolate the reference pixels to generate reference pixels at sub-integer pixel positions. When the prediction mode of the current prediction unit is a prediction mode that generates a prediction block without interpolating reference pixels, the reference pixels may not be interpolated. The DC filter may generate a prediction block through filtering when the prediction mode of the current block is the DC mode.

The reconstructed block or picture may be provided to the filter unit 240. The filter unit 240 may include a deblocking filter, an offset correction unit, and an ALF.

The deblocking filter of the image decoder may receive information related to the deblocking filter from the image encoder, and the image decoder may perform deblocking filtering on the corresponding block. This will be described with reference to FIGS. 3 to 8.

The offset correction unit may perform offset correction on the reconstructed image based on the type of offset correction applied to the image during encoding and information on the offset value. The ALF may be applied to a coding unit based on information on whether to apply ALF and information on ALF coefficients provided from the encoder. Such ALF information may be included in a specific parameter set and provided.

The memory 245 can store the reconstructed picture or block so that it can be used as a reference picture or a reference block, and can provide the reconstructed picture to an output unit.

FIG. 3 is a diagram illustrating block boundaries 320 and 321 between two different blocks (a P block and a Q block); the corresponding block boundaries can be classified into a vertical boundary and a horizontal boundary.

In FIG. 3, a Q block region refers to a region of a current target block on which encoding and/or decoding is currently performed, and a P block region refers to a block spatially adjacent to a Q block as a reconstructed block. The P block and the Q block are pre-restored blocks, the Q block refers to a region in which deblocking filtering is currently performed, and the P block may refer to a block spatially adjacent to the Q block.

FIG. 3 conceptually shows a P block region and a Q block region to which a deblocking filter is applied, and illustrates an embodiment of pixels positioned at a boundary between a P block and a Q block to which the deblocking filter is applied. Accordingly, the number of pixels to which the deblocking filter proposed in the present invention is applied (hereinafter, the number of target pixels) and the number of taps of the deblocking filter are not limited to FIG. 3, and the number of target pixels for each of the P block and the Q block, counted from the boundary between the P block and the Q block, may be 1, 2, 3, 4, 5, 6, 7 or more. The number of target pixels of the P block may be the same as or different from the number of target pixels of the Q block. For example, the number of target pixels of the P block may be 5, and the number of target pixels of the Q block may be 5. Alternatively, the number of target pixels of the P block may be 7 and the number of target pixels of the Q block may be 7. Alternatively, the number of target pixels of the P block may be 3, and the number of target pixels of the Q block may be 7.

In FIG. 3, a case where the number of target pixels of the P block and the Q block is 3, respectively, is described as an embodiment.

In one embodiment of the vertical boundary illustrated in FIG. 3, an example in which a deblocking filter is applied to the first row 330 of the Q block 300 region is illustrated.

Among the four pixels (q0, q1, q2, q3) belonging to the first row, three pixels (q0, q1, q2) adjacent to the vertical boundary are target pixels on which deblocking filtering is performed.

In addition, among the exemplary embodiments of the horizontal boundary illustrated in FIG. 3, in an example in which the deblocking filter is applied to the first column 331 of the Q block 301 region, among the four pixels (q0, q1, q2, q3) belonging to the first column, three pixels (q0, q1, q2) adjacent to the horizontal boundary are similarly target pixels on which deblocking filtering is performed.

However, in performing the deblocking filter on the corresponding pixels, filtering may be performed by referring to the pixel value of another pixel (for example, q3) that belongs to the first row or column but is not a target pixel on which the deblocking filtering is performed. Alternatively, filtering may be performed by referring to pixel values of a neighboring row or column of the first row or column. Here, the neighboring row or column may belong to the current target block or may belong to a block spatially adjacent to the current target block (e.g., left/right, top/bottom). The location of the spatially adjacent block may be adaptively determined in consideration of the filtering direction (or boundary direction). Through the reference, whether or not to perform filtering, filtering strength, filter coefficients, number of filter coefficients, filtering direction, etc. may be adaptively determined. The above-described embodiment can be applied in the same/similar manner to the embodiments described later.

FIG. 3 illustrates an example in which the deblocking filter is applied to the Q block region, with the first row 330 and the first column 331 shown as representatives. The deblocking filter is similarly performed on subsequent rows (second row, third row, etc.) belonging to the Q block region including the first row, and on subsequent columns (second column, third column, etc.) belonging to the Q block region including the first column.

In FIG. 3, the P block region refers to a block region spatially adjacent to a vertical boundary or a horizontal boundary of a current target block on which encoding and/or decoding is currently performed. Among the embodiments of the vertical boundary shown in FIG. 3, an example in which the deblocking filter is applied to the first row 330 of the P block 310 is shown.

Among the four pixels p0, p1, p2, and p3 belonging to the first row, three pixels p0, p1, and p2 adjacent to the vertical boundary are target pixels on which deblocking filtering is performed.

In addition, among the embodiments of the horizontal boundary illustrated in FIG. 3, in an example in which the deblocking filter is applied to the first column 331 of the P block 311 region, among the four pixels (p0, p1, p2, p3) belonging to the first column, three pixels (p0, p1, p2) adjacent to the horizontal boundary are target pixels on which deblocking filtering is performed.

However, in performing the deblocking filter on the corresponding pixels, filtering may be performed by referring to a pixel value of p3, not a target pixel on which deblocking filtering is performed.

FIG. 3 shows an example in which the deblocking filter is applied to the P block region, with the first row 330 and the first column 331 shown as representatives. The deblocking filter is similarly performed on subsequent rows (second row, third row, etc.) belonging to the P block region including the first row, and on subsequent columns (second column, third column, etc.) belonging to the P block region including the first column.

Referring to FIG. 4, a block boundary for deblocking filtering (hereinafter, referred to as an edge) among block boundaries of a reconstructed picture may be specified (S400).

The reconstructed picture may be partitioned into a predetermined NxM pixel grid. The NxM pixel grid may mean a unit in which deblocking filtering is performed. Here, N and M may be integers of 4, 8, 16 or more. A pixel grid may be defined for each component type. For example, when the component type is a luminance component, N and M may be set to 4, and when the component type is a color difference component, N and M may be set to 8. Alternatively, a fixed-size NxM pixel grid may be used regardless of the component type.

The edge is a block boundary positioned on an NxM pixel grid, and may include at least one of a boundary of a transform block, a boundary of a prediction block, or a boundary of a sub-block. The sub-block may mean a sub-block according to the affine mode described above. Block boundaries to which the deblocking filter is applied will be described with reference to FIGS. 5 and 6.
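The grid test in step S400 can be sketched as follows. This is only an illustration under stated assumptions: the helper names `is_on_grid` and `candidate_edges` are hypothetical, and the grid sizes (4 for luminance, 8 for color difference) are taken from the example above rather than fixed by the invention.

```python
# Hypothetical helpers, not part of the described apparatus.
# Grid sizes follow the example in the text and are assumptions.
GRID_SIZE = {"luma": 4, "chroma": 8}


def is_on_grid(boundary_pos: int, component: str = "luma") -> bool:
    """Return True if a block boundary lies on the pixel grid.

    boundary_pos is the x offset of a vertical boundary or the
    y offset of a horizontal boundary, in pixels.
    """
    return boundary_pos % GRID_SIZE[component] == 0


def candidate_edges(boundaries, component="luma"):
    """Keep only the block boundaries that lie on the pixel grid."""
    return [b for b in boundaries if is_on_grid(b, component)]
```

With these assumptions, a boundary at offset 6 is never treated as an edge, and a boundary at offset 4 is an edge for the luminance component only.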

Referring to FIG. 4, a decision value for the specified edge may be derived (S410).

In this embodiment, it is assumed that the edge type is a vertical edge, and a 4x4 pixel grid is applied. Based on the edge, the left block and the right block will be referred to as P blocks and Q blocks, respectively. The P block and the Q block are pre-restored blocks, the Q block refers to a region in which deblocking filtering is currently performed, and the P block may refer to a block spatially adjacent to the Q block.

First, the decision value may be derived using a variable dSam for deriving the decision value. dSam may be derived for at least one of a first pixel line or a fourth pixel line of the P block and the Q block. Hereinafter, dSam for the first pixel line (row) of the P block and Q block is referred to as dSam0, and dSam for the fourth pixel line (row) is referred to as dSam3.

If at least one of the following conditions is satisfied, dSam0 may be set to 1, otherwise, dSam0 may be set to 0.

[Table 1]
No.	Condition
1	dpq < first threshold
2	(sp + sq) < second threshold
3	spq < third threshold

In Table 1, dpq may be derived based on at least one of a first pixel value linearity d1 of the first pixel line of the P block or a second pixel value linearity d2 of the first pixel line of the Q block. Here, the first pixel value linearity d1 may be derived using i pixels p belonging to the first pixel line of the P block. i can be 3, 4, 5, 6, 7 or more. The i pixels p may be continuous pixels adjacent to each other, or may be non-contiguous pixels spaced apart from each other. In this case, the pixel p may be i pixels closest to the edge among the pixels of the first pixel line. Similarly, the second pixel value linearity d2 can be derived using j pixels q belonging to the first pixel line of the Q block. j may be 3, 4, 5, 6, 7 or more. j is set to the same value as i, but is not limited thereto, and may be a value different from i. The j pixels q may be contiguous pixels adjacent to each other, or may be non-contiguous pixels separated by a predetermined interval. In this case, the pixel q may be j pixels closest to the edge among the pixels of the first pixel line.

For example, when three pixels p and three pixels q are used, the first pixel value linearity d1 and the second pixel value linearity d2 may be derived as in Equation 1 below.

[Equation 1]

d1 = Abs(p_{2,0} - 2 * p_{1,0} + p_{0,0})

d2 = Abs(q_{2,0} - 2 * q_{1,0} + q_{0,0})

Alternatively, when six pixels p and six pixels q are used, the first pixel value linearity d1 and the second pixel value linearity d2 can be derived as in Equation 2 below.

[Equation 2]

d1 = (Abs(p_{2,0} - 2 * p_{1,0} + p_{0,0}) + Abs(p_{5,0} - 2 * p_{4,0} + p_{3,0}) + 1) >> 1

d2 = (Abs(q_{2,0} - 2 * q_{1,0} + q_{0,0}) + Abs(q_{5,0} - 2 * q_{4,0} + q_{3,0}) + 1) >> 1
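As an illustration of Equations 1 and 2, the pixel value linearity may be sketched as below. The function names are hypothetical, and the representation of a pixel line as a list `px`, where `px[k]` is the k-th pixel counted from the edge, is an assumption made for the example. Because d1 and d2 have the same form, one function serves both the P block and the Q block.

```python
def linearity_3(px):
    """Equation 1: absolute second difference of the three pixels
    nearest the edge. px[k] is the k-th pixel from the edge."""
    return abs(px[2] - 2 * px[1] + px[0])


def linearity_6(px):
    """Equation 2: rounded average of the second differences of
    pixels 0..2 and pixels 3..5 of the same pixel line."""
    near = abs(px[2] - 2 * px[1] + px[0])
    far = abs(px[5] - 2 * px[4] + px[3])
    return (near + far + 1) >> 1
```

A perfectly linear ramp such as 10, 20, 30 yields a linearity of 0, so small values indicate a smooth region on that side of the edge.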

In Table 1, sp denotes a first pixel value gradient v1 of the first pixel line of the block P, and sq denotes a second pixel value gradient v2 of the first pixel line of the Q block. Here, the first pixel value gradient v1 may be derived using m pixels p belonging to the first pixel line of the P block. m may be 2, 3, 4, 5, 6, 7 or more. The m pixels p may be contiguous pixels adjacent to each other, or may be non-contiguous pixels spaced at predetermined intervals. Alternatively, some of the m pixels p may be contiguous pixels adjacent to each other, and the others may be non-contiguous pixels separated by a predetermined interval. Similarly, the second pixel value gradient v2 may be derived using n pixels q belonging to the first pixel line of the Q block. n may be 2, 3, 4, 5, 6, 7 or more. n is set to the same value as m, but is not limited thereto, and may be a value different from m. The n pixels q may be contiguous pixels adjacent to each other, or may be non-contiguous pixels separated by a predetermined interval. Alternatively, some of the n pixels q may be contiguous pixels adjacent to each other, and the others may be non-contiguous pixels separated by a predetermined interval.

For example, when two pixels p and two pixels q are used, a first pixel value gradient v1 and a second pixel value gradient v2 may be derived as in Equation 3 below.

[Equation 3]

v1 = Abs(p_{3,0} - p_{0,0})

v2 = Abs(q_{0,0} - q_{3,0})

Alternatively, when six pixels p and six pixels q are used, the first pixel value gradient v1 and the second pixel value gradient v2 may be derived as in Equation 4 below.

[Equation 4]

v1 = Abs(p_{3,0} - p_{0,0}) + Abs(p_{7,0} - p_{6,0} - p_{5,0} + p_{4,0})

v2 = Abs(q_{0,0} - q_{3,0}) + Abs(q_{4,0} - q_{5,0} - q_{6,0} + q_{7,0})
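Equations 3 and 4 may be sketched in the same way. The function names are hypothetical and `px[k]` again denotes the k-th pixel from the edge on one pixel line, which is an assumption of the example. Since the absolute value makes the P-side and Q-side expressions identical, a single function covers both v1 and v2.

```python
def gradient_2(px):
    """Equation 3: absolute difference between the pixel at the edge
    (px[0]) and the pixel three positions away (px[3])."""
    return abs(px[3] - px[0])


def gradient_6(px):
    """Equation 4: Equation 3 plus a second term built from the four
    pixels px[4]..px[7] located farther from the edge."""
    return abs(px[3] - px[0]) + abs(px[7] - px[6] - px[5] + px[4])
```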

The spq in Table 1 may be derived from the difference between the pixel p0,0 and the pixel q0,0 adjacent to the edge.

The first and second threshold values of Table 1 may be derived based on a predetermined parameter QP. Here, the QP may be determined using at least one of a first quantization parameter of the P block, a second quantization parameter of the Q block, or an offset for deriving the QP. The offset may be a value encoded and signaled by an encoding device. For example, QP may be derived by adding the offset to the average value of the first and second quantization parameters. The third threshold value of Table 1 may be derived based on the above-described quantization parameter (QP) and block boundary strength (BS). Here, the BS may be variably determined in consideration of a prediction mode of the P/Q block, an inter prediction mode, the presence or absence of a non-zero transform coefficient, and a difference in motion vectors. For example, when the prediction mode of at least one of the P block or the Q block is an intra mode, the BS may be set to 2. When at least one of the P block or the Q block is encoded in the combined prediction mode, the BS may be set to 2. When at least one of the P block or the Q block includes a non-zero transform coefficient, the BS may be set to 1. When the P block is coded in an inter prediction mode different from that of the Q block (for example, when the P block is coded in the current picture reference mode and the Q block is coded in the merge mode or AMVP mode), the BS may be set to 1. When both the P block and the Q block are coded in the current picture reference mode, and the difference between their block vectors is greater than or equal to a predetermined threshold difference, the BS may be set to 1. Here, the threshold difference may be a fixed value (e.g., 4, 8, 16) pre-agreed in the encoding/decoding apparatus.
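The QP and BS rules above may be sketched as follows. Several points are assumptions made only for illustration: the `Block` record and its field names, the rounding used for the average of the two quantization parameters, the component-wise comparison of block vectors, the order in which the rules are tested, and the fallback value 0 when no rule applies. The combined (joint) prediction mode is represented by the hypothetical label "combined", and the current picture reference mode by "cpr".

```python
from dataclasses import dataclass


@dataclass
class Block:
    # Hypothetical record; field names are not taken from the text.
    pred_mode: str                   # "intra" or "inter"
    inter_mode: str = ""             # e.g. "merge", "amvp", "cpr", "combined"
    has_nonzero_coeff: bool = False
    qp: int = 32
    bv: tuple = (0, 0)               # block vector, used in "cpr" mode


def derive_qp(p: Block, q: Block, offset: int = 0) -> int:
    """Average of the two quantization parameters plus the signaled
    offset. Rounding the average up on ties is an assumption."""
    return ((p.qp + q.qp + 1) >> 1) + offset


def boundary_strength(p: Block, q: Block, bv_threshold: int = 4) -> int:
    """Apply the listed BS rules in the order given in the text."""
    if p.pred_mode == "intra" or q.pred_mode == "intra":
        return 2
    if "combined" in (p.inter_mode, q.inter_mode):
        return 2
    if p.has_nonzero_coeff or q.has_nonzero_coeff:
        return 1
    if p.inter_mode != q.inter_mode:
        return 1
    if p.inter_mode == "cpr":
        # Both blocks use current picture reference here.
        if any(abs(a - b) >= bv_threshold for a, b in zip(p.bv, q.bv)):
            return 1
    return 0  # assumption: no filtering strength when no rule applies
```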

Since dSam3 is derived using one or more pixels belonging to the fourth pixel line through the same method as dSam0 described above, a detailed description will be omitted.

A decision value may be derived based on the derived dSam0 and dSam3. For example, if both dSam0 and dSam3 are 1, the decision value may be set to a first value (e.g., 3); otherwise, the decision value may be set to a second value (e.g., 1 or 2).
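Step S410 may be sketched as below. The function names are hypothetical, the three thresholds are passed in as plain integers because their derivation from QP and BS is not given in closed form, and the default first and second values (3 and 1) are simply the examples mentioned above.

```python
def d_sam(dpq, sp, sq, spq, thr1, thr2, thr3):
    """Return 1 if at least one condition of Table 1 holds
    (as stated in the text), otherwise 0."""
    return int(dpq < thr1 or (sp + sq) < thr2 or spq < thr3)


def decision_value(d_sam0, d_sam3, first_value=3, second_value=1):
    """First value only when both the first and the fourth pixel
    line satisfy the test; otherwise the second value."""
    if d_sam0 == 1 and d_sam3 == 1:
        return first_value
    return second_value
```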

Referring to FIG. 4, a filter type of a deblocking filter may be determined based on the derived determined value (S420).

In the encoding/decoding apparatus, a plurality of filter types having different filter lengths may be defined. As examples of the filter type, there may be a long filter having the longest filter length, a short filter having the shortest filter length, and one or more intermediate filters that are longer than the short filter and shorter than the long filter. The number of filter types defined in the encoding/decoding apparatus may be 2, 3, 4 or more.

For example, when the determined value is a first value, a long filter may be used, and when the determined value is a second value, a short filter may be used. Alternatively, when the determined value is a first value, one of a long filter or an intermediate filter may be selectively used, and when the determined value is a second value, a short filter may be used. Alternatively, when the determined value is the first value, a long filter is used, and when the determined value is not the first value, either a short filter or an intermediate filter may be selectively used. In particular, when the determination value is 2, an intermediate filter may be used, and when the determination value is 1, a short filter may be used.
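The last variant above (3 selects the long filter, 2 the intermediate filter, 1 the short filter) may be sketched as a simple mapping. The enumeration and function name are hypothetical, and the other variants described in this paragraph would use a different mapping.

```python
from enum import Enum


class FilterType(Enum):
    LONG = "long"
    INTERMEDIATE = "intermediate"
    SHORT = "short"


def select_filter_type(decision: int) -> FilterType:
    """Map the decision value of step S410 to a filter type,
    assuming the example values 3, 2 and 1 from the text."""
    if decision == 3:
        return FilterType.LONG
    if decision == 2:
        return FilterType.INTERMEDIATE
    return FilterType.SHORT
```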

Referring to FIG. 4, filtering may be performed on an edge of a reconstructed picture based on a deblocking filter according to the determined filter type (S430).

The deblocking filter may be applied to a plurality of pixels located in both directions based on an edge and located in the same pixel line. Here, a plurality of pixels to which the deblocking filter is applied is referred to as a filtering region, and the length (or number of pixels) of the filtering region may be different for each filter type. The length of the filtering region may be interpreted as having the same meaning as the filter length of the aforementioned filter type. Alternatively, the length of the filtering region may mean a sum of the number of pixels to which the deblocking filter is applied in the P block and the number of pixels to which the deblocking filter is applied in the Q block.

In this embodiment, it is assumed that three filter types, that is, a long filter, an intermediate filter, and a short filter, are defined in the encoding/decoding apparatus, and a deblocking filtering method for each filter type will be described. However, the present invention is not limited thereto, and only a long filter and an intermediate filter may be defined, only a long filter and a short filter may be defined, or only an intermediate filter and a short filter may be defined.

1. In case of long filter-based deblocking filtering

For convenience of explanation, it is assumed that the edge type is a vertical edge, and the currently filtered pixel (hereinafter, the current pixel q) belongs to the Q block unless otherwise stated. The filtered pixel fq may be derived through a weighted average of the first reference value and the second reference value.

Here, the first reference value may be derived using all or part of the pixels in the filtering area to which the current pixel q belongs. Here, the length (or number of pixels) of the filtering region may be an integer of 8, 10, 12, 14 or more. Some pixels in the filtering area may belong to the P block and the other pixels may belong to the Q block. For example, when the length of the filtering region is 10, 5 pixels may belong to the P block and 5 pixels may belong to the Q block. Alternatively, 3 pixels may belong to the P block and 7 pixels may belong to the Q block. Conversely, 7 pixels may belong to the P block and 3 pixels may belong to the Q block. In other words, the long filter-based deblocking filtering may be performed symmetrically or asymmetrically on the P block and the Q block.

Regardless of the location of the current pixel q, all pixels belonging to the same filtering area may share one and the same first reference value. That is, the same first reference value can be used regardless of whether the currently filtered pixel is located in the P block or the Q block. The same first reference value may be used regardless of the position of the currently filtered pixel in the P block or the Q block.

The second reference value may be derived using at least one of a pixel farthest from an edge (hereinafter, referred to as a first pixel) among pixels of the filtering area belonging to the Q block or surrounding pixels of the filtering area. The surrounding pixel may mean at least one pixel adjacent to the right side of the filtering area. For example, the second reference value may be derived as an average value between one first pixel and one neighboring pixel. Alternatively, the second reference value may be derived as an average value between two or more first pixels and two or more neighboring pixels adjacent to the right side of the filtering area.

For the weighted average, predetermined weights f1 and f2 may be applied to the first reference value and the second reference value, respectively. Specifically, the encoding/decoding apparatus may define a plurality of weight sets, and may set the weight f1 by selectively using any one of the plurality of weight sets. The selection may be performed in consideration of the length (or number of pixels) of the filtering region belonging to the Q block. For example, the encoding/decoding apparatus may define a weight set as shown in Table 2 below. Each weight set may consist of one or more weights corresponding to each location of the pixel to be filtered. Accordingly, from among a plurality of weights belonging to the selected weight set, a weight corresponding to the position of the current pixel q may be selected and applied to the current pixel q. The number of weights constituting the weight set may be the same as the length of the filtering region belonging to the Q block. A plurality of weights constituting one weight set may be sampled at regular intervals within a range of an integer greater than 0 and less than 64. Here, 64 is only an example, and may be larger or smaller than 64. The predetermined interval may be 9, 13, 17, 21, 25 or more. The interval may be variably determined according to the length L of the filtering region included in the Q block. Alternatively, a fixed spacing may be used regardless of L.

[Table 2]
Length (L) of filtering region belonging to Q block	Weight set
L > 5	{ 59, 50, 41, 32, 23, 14, 5 }
L = 5	{ 58, 45, 32, 19, 6 }
L < 5	{ 53, 32, 11 }

Referring to Table 2, when the length (L) of the filtering region belonging to the Q block is greater than 5, {59, 50, 41, 32, 23, 14, 5} is selected among the three weight sets; when L is 5, {58, 45, 32, 19, 6} is selected; and when L is less than 5, {53, 32, 11} may be selected. However, Table 2 is only an example of a weight set, and the number of weight sets defined in the encoding/decoding apparatus may be 2, 4 or more.

Also, when L is 7 and the current pixel is the first pixel q0 based on the edge, a weight 59 may be applied to the current pixel. When the current pixel is the second pixel q1 based on the edge, a weight 50 may be applied to the current pixel, and when the current pixel is the seventh pixel q6 based on the edge, a weight 5 may be applied to the current pixel.

The weight f2 may be determined based on the pre-determined weight f1. For example, the weight f2 may be determined as a value obtained by subtracting the weight f1 from a pre-defined constant. Here, the pre-defined constant is a fixed value pre-defined in the encoding/decoding apparatus, and may be 64. However, this is only an example, and an integer greater than or less than 64 may be used.
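The long filter described above may be sketched as follows, using the weight sets of Table 2. The function names are hypothetical. The rounding term and the division by the constant 64 are assumptions, since the text only states that a weighted average is used, and the rounded average in `second_reference` is likewise an assumption.

```python
def select_weight_set(length_q: int):
    """Select a weight set from Table 2 by the length L of the
    filtering region belonging to the Q block."""
    if length_q > 5:
        return [59, 50, 41, 32, 23, 14, 5]
    if length_q == 5:
        return [58, 45, 32, 19, 6]
    return [53, 32, 11]


def second_reference(first_pixel: int, outside_pixel: int) -> int:
    """Average of the filtered pixel farthest from the edge and one
    neighboring pixel just outside the filtering region."""
    return (first_pixel + outside_pixel + 1) >> 1


def long_filter_pixel(pos, length_q, ref1, ref2, total=64):
    """Weighted average of the first and second reference values for
    the pixel at index pos (0 is the pixel adjacent to the edge)."""
    f1 = select_weight_set(length_q)[pos]
    f2 = total - f1  # f2 is the pre-defined constant minus f1
    return (f1 * ref1 + f2 * ref2 + total // 2) // total
```

In this sketch the pixel nearest the edge is dominated by the shared first reference value (weight 59 of 64 when L is 7), while the farthest filtered pixel is dominated by the second reference value, which gives a gradual transition across the filtering region.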

2. In case of intermediate filter-based deblocking filtering

The filter length of the intermediate filter may be smaller than the filter length of the long filter. The length (or number of pixels) of the filtering region according to the intermediate filter may be smaller than the length of the filtering region according to the aforementioned long filter.

For example, the length of the filtering area according to the intermediate filter may be 6, 8 or more. Here, the length of the filtering region belonging to the P block may be the same as the length of the filtering region belonging to the Q block. However, the present invention is not limited thereto, and the length of the filtering region belonging to the P block may be longer or shorter than the length of the filtering region belonging to the Q block.

Specifically, the filtered pixel fq may be derived using the current pixel q and at least one neighboring pixel adjacent to the current pixel q. Here, the neighboring pixels may include at least one of one or more pixels adjacent to the current pixel q in the left direction (hereinafter, left peripheral pixels) or one or more pixels adjacent to the current pixel q in the right direction (hereinafter, right peripheral pixels).

For example, when the current pixel q is q0, two left peripheral pixels p0 and p1 and two right peripheral pixels q1 and q2 may be used. When the current pixel q is q1, two left peripheral pixels p0 and q0 and one right peripheral pixel q2 may be used. When the current pixel q is q2, three left peripheral pixels p0, q0, and q1 and one right peripheral pixel q3 may be used.

3. In case of short filter-based deblocking filtering

The filter length of the short filter may be smaller than that of the intermediate filter. The length (or number of pixels) of the filtering region according to the short filter may be smaller than the length of the filtering region according to the above-described intermediate filter. For example, the length of the filtering region according to the short filter may be 2, 4 or more.

Specifically, the filtered pixel fq may be derived by adding or subtracting a predetermined first offset (offset1) to the current pixel q. Here, the first offset may be determined based on a difference value between the pixels of the P block and the pixels of the Q block. For example, as shown in Equation 5 below, the first offset may be determined based on a difference value between the pixel p0 and the pixel q0 and a difference value between the pixel p1 and the pixel q1. However, filtering for the current pixel q may be performed only when the first offset is smaller than a predetermined threshold. Here, the threshold value is derived based on the above-described quantization parameter (QP) and block boundary strength (BS), and a detailed description thereof will be omitted.

[Equation 5]

offset1 = (9 * (q0 - p0) - 3 * (q1 - p1) + 8) >> 4

Alternatively, the filtered pixel fq may be derived by adding a predetermined second offset (offset2) to the current pixel q. Here, the second offset may be determined in consideration of at least one of a difference (or change amount) or a first offset between the current pixel q and the neighboring pixels. Here, the surrounding pixels may include at least one of a left pixel or a right pixel of the current pixel q. For example, the second offset may be determined as in Equation 6 below.

[Equation 6]

offset2 = (((q2 + q0 + 1) >> 1) - q1 - offset1) >> 1
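Equations 5 and 6 may be sketched as below. The function names are hypothetical. Two points are assumptions made for the example: the first offset is subtracted from q0 (the text allows adding or subtracting), and the threshold test compares the magnitude of the first offset with the threshold. Python's `>>` on negative integers rounds toward negative infinity, which matches an arithmetic right shift.

```python
def short_filter_offsets(p0, p1, q0, q1, q2):
    """Return (offset1, offset2) according to Equations 5 and 6."""
    offset1 = (9 * (q0 - p0) - 3 * (q1 - p1) + 8) >> 4
    offset2 = (((q2 + q0 + 1) >> 1) - q1 - offset1) >> 1
    return offset1, offset2


def short_filter_q0(p0, p1, q0, q1, threshold):
    """Filter q0 only when the first offset is below the threshold.
    Using the magnitude and subtracting on the Q side are assumptions."""
    offset1 = (9 * (q0 - p0) - 3 * (q1 - p1) + 8) >> 4
    if abs(offset1) >= threshold:
        return q0  # filtering skipped for this pixel
    return q0 - offset1
```

For a step edge with p0 = p1 = 100 and q0 = q1 = q2 = 110, the first offset is 4, so q0 moves from 110 toward the P side when the threshold permits filtering.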

A method of performing deblocking filtering on a block boundary of a reconstructed picture will be described in detail with reference to FIGS. 7 and 8. The above-described filtering method is not limited to the deblocking filter, and may be applied in the same/similar manner to other in-loop filters such as a sample adaptive offset (SAO) and an adaptive loop filter (ALF).

As shown in FIG. 5, prediction or transformation may be performed by dividing one coding block into two or four in either a horizontal direction or a vertical direction. The coding block may also be understood as a decoding block. In this case, only prediction may be performed by dividing the coding block into two or four in either a horizontal direction or a vertical direction, both prediction and transformation may be performed by dividing it into two or four, or only transformation may be performed by dividing it into two or four.

In this case, by dividing the one coding block into two or four divisions in a horizontal direction or a vertical direction, intra prediction and transformation may be performed in each division unit.

FIG. 5 does not limit the number of divisions, and a block may be divided into 3, 5, or more. Here, the number of divisions may be variably determined based on block properties. Block properties may mean encoding parameters such as block size/shape, component type (luminance, color difference), prediction mode (intra prediction or inter prediction), inter prediction mode (merge mode, AMVP mode, affine mode, etc.), prediction/transformation unit, and the position or length of a block boundary.

Alternatively, either non-divided or two-divided may be selectively used, and either non-divided or four-divided may be selectively used. Alternatively, any one of non-divided, two-divided, or four-divided may be selectively used.

According to the exemplary embodiment shown in FIG. 5, in the case of vertically dividing one coding block 510 into two sub-blocks, the width (W) 511 of the block may be equally divided into two so that each divided sub-block has a width of W/2 (513), and in the case of horizontally dividing one coding block 520 into two sub-blocks, the height (H) 522 of the block may be equally divided into two so that each divided sub-block has a height of H/2 (523).

In addition, according to another embodiment illustrated in FIG. 5, in the case of vertically dividing one coding block 530 into four sub-blocks, the width (W) 531 of the block may be equally divided into four so that each divided sub-block has a width of W/4 (533). In the case of horizontally dividing one coding block 540 into four sub-blocks, the height (H) 542 of the block may be equally divided into four so that each divided sub-block has a height of H/4 (543).

In addition, in an embodiment of the present invention, among modes in which prediction is performed by dividing a current coding block into a plurality of sub-blocks, in the case of a sub-block intra prediction mode, prediction and transformation are performed on the current coding block in the same form. In this case, a transform unit may be divided to have the same size/shape as the sub-block, or a plurality of transform units may be merged. Alternatively, conversely, a sub-block unit may be determined based on a transform unit, and intra prediction may be performed in a sub-block unit. The above-described embodiment can be applied in the same/similar manner to the embodiments described later.

When the one coding block is divided into two or four sub-blocks to perform intra prediction and transform according to the corresponding sub-block, blocking artifacts may occur at the boundary of the corresponding sub-block. Accordingly, in the case of performing intra prediction in units of sub-blocks, a deblocking filter may be performed at the boundary of each sub-block. The deblocking filter may be selectively performed, and flag information may be used for this. The flag information may indicate whether to perform filtering at the boundary of the sub-block. The flag information may be encoded by the encoding device and signaled to the decoding device, or may be derived by the decoding device based on at least one block attribute of the current block or the neighboring block. Block properties are the same as described above, and detailed descriptions will be omitted.

When performing a deblocking filter on one coding block, if the current coding block is a block in which prediction and transformation have been performed through intra prediction in the sub-block unit, a deblocking filter in the sub-block unit may be applied inside the current block.

When performing a deblocking filter on one coding block, if the current coding block is a block in which prediction and transformation have been performed through intra prediction in the sub-block unit, but the boundary of the corresponding sub-block does not exist on the block grid (NxM sample grid) for performing the deblocking filter, the deblocking filter step is skipped at the boundary of the sub-block. The deblocking filter step may be performed at the boundary of the sub-block only if the boundary of the sub-block exists on the block grid for performing the deblocking filter.
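The rule above may be sketched for an equally divided coding block as follows. The function name is hypothetical, the default grid size of 8 is only an example, and positions are measured in pixels along the division direction from the picture origin, which is an assumption of the example.

```python
def subblock_edges_to_filter(block_start, block_size, num_parts, grid=8):
    """Return the internal sub-block boundaries of an equally divided
    coding block that lie on the deblocking block grid.

    block_start: offset of the coding block along the split direction.
    block_size:  width (vertical split) or height (horizontal split).
    """
    step = block_size // num_parts
    edges = [block_start + k * step for k in range(1, num_parts)]
    # Boundaries off the grid are skipped by the deblocking filter.
    return [e for e in edges if e % grid == 0]
```

For example, a block of width 16 divided into four sub-blocks has internal boundaries at 4, 8 and 12; with an 8x8 grid only the boundary at 8 is filtered, whereas with a 4x4 grid all three are filtered.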

Here, the block grid for performing the deblocking filter means the minimum block boundary unit to which deblocking filtering can be applied, and may mean the minimum pixel interval between one block boundary and the next block boundary. In general, an 8x8 block grid may be used. However, the block grid is not limited to 8x8, and 4x4 or 16x16 may also be used. Block grids of different sizes may be used depending on the component type. For example, a block grid smaller than that of the chrominance component may be used for the luminance component. A block grid having a fixed size for each component type may be used.

As shown in FIG. 5, when one coding block is divided into two in either a horizontal direction or a vertical direction, it can be divided into asymmetric sub-blocks.

FIG. 5 illustrates an embodiment in which one coding block is asymmetrically divided into two in either the horizontal or the vertical direction.

The current coding block 550 is vertically divided into two sub-blocks, so that the first sub-block has a height of H and a width of W/4, and the second sub-block has a height of H and a width of 3*W/4.

In addition, the current coding block 560 is horizontally divided into two sub-blocks, so that the first sub-block has a height of H/4 and a width of W, and the second sub-block has a height of 3*H/4 and a width of W.
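The asymmetric two-way division described for FIG. 5 can be sketched as follows. This is a minimal illustration only; the function name `asymmetric_split` and the (width, height) tuple layout are assumptions introduced here and do not appear in the disclosure.

```python
def asymmetric_split(width, height, vertical):
    """Return (w, h) of the first and second sub-blocks for a 1:3 split.

    Hypothetical helper: the first sub-block takes one quarter of the split
    dimension and the second sub-block takes the remaining three quarters.
    """
    if vertical:
        # Vertical division: widths W/4 and 3*W/4, height unchanged.
        return (width // 4, height), (3 * width // 4, height)
    # Horizontal division: heights H/4 and 3*H/4, width unchanged.
    return (width, height // 4), (width, 3 * height // 4)


first, second = asymmetric_split(32, 16, vertical=True)
print(first, second)  # (8, 16) (24, 16)
```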

Independent transformation and/or inverse transformation may be performed in units of the sub-blocks, and transformation and/or inverse transformation may be performed only in some sub-blocks of the current coding block. Here, some sub-blocks may mean N sub-blocks located on the left or top of the current target block. N may be 1, 2, 3, etc. N may be a fixed value pre-defined in the decoding apparatus, or may be variably determined in consideration of the above-described block attribute.

The transformation process for some sub-blocks may be limited to be performed only when the shape of the current target block is a rectangle (W>H, W<H). Alternatively, it may be limited to be performed only when the size of the current target block is greater than or equal to the threshold size.

In the case of a coding block that performs transformation and inverse transformation only in some sub-blocks of the current coding blocks, the above-described sub-block unit deblocking filter process may be applied in the same manner.

The embodiment illustrated in FIG. 6 shows a case in which one 16x8 coding block 600 is vertically divided into four 4x8 sub-blocks 610-613. In this case, a total of three sub-block boundaries may occur: for example, a first sub-block boundary 620 between the first sub-block 610 and the second sub-block 611, a second sub-block boundary 621 between the second sub-block 611 and the third sub-block 612, and a third sub-block boundary 622 between the third sub-block 612 and the fourth sub-block 613.

In this case, among the sub-block boundaries, only the second sub-block boundary 621 lies on the deblocking filter grid to which the deblocking filter is applied.
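The grid test for the 16x8 example can be sketched as follows. The sketch assumes that the coding block starts at horizontal position 0, that the grid spacing is 8 samples, and that a boundary lies on the grid when its position is a multiple of the grid spacing; the function names are hypothetical.

```python
def sub_block_boundaries(block_x, block_width, num_sub_blocks):
    """Horizontal positions of the internal vertical sub-block boundaries."""
    sub_width = block_width // num_sub_blocks
    return [block_x + i * sub_width for i in range(1, num_sub_blocks)]


def on_deblocking_grid(position, grid_size=8):
    """Assumed test: a boundary is on the grid if it is a multiple of grid_size."""
    return position % grid_size == 0


boundaries = sub_block_boundaries(block_x=0, block_width=16, num_sub_blocks=4)
print(boundaries)                                   # [4, 8, 12]
print([on_deblocking_grid(x) for x in boundaries])  # [False, True, False]
```

Under these assumptions only the middle boundary, at position 8, coincides with the 8-sample grid.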

Therefore, when the current coding block is a block that has been predicted and transformed through intra prediction in sub-block division units, the deblocking filter may be performed only at block boundaries that lie on the deblocking filter grid. Alternatively, different deblocking filters may be applied to the second sub-block boundary and the first sub-block boundary. That is, at least one of the filter coefficients, the number of taps, and the strength of the deblocking filter may be different.

The block grid for performing the deblocking filter may be configured in units of N pixels, where N is a specific predetermined integer, and one or more of 4, 8, 16, 32, etc. may be used adaptively.

For a detailed description of the present invention, the embodiment shown in FIG. 6 is used.

FIG. 6 shows an embodiment in which prediction and transformation are performed through intra prediction in sub-block units by vertically dividing one 16x8 coding block into four sub-blocks.

In this case, the single 16x8 coded block is divided into four 4x8 sub-blocks, reconstructed by performing intra prediction and transforming in a sub-block unit, and input to the deblocking filter step.

The coding block input to the deblocking filter step includes a first vertical boundary 621 between the first sub-block 610 and the second sub-block 611, a second vertical boundary 620 between the second sub-block 611 and the third sub-block 612, and a third vertical boundary 622 between the third sub-block 612 and the fourth sub-block 613.

In this case, when the block grid for the current deblocking filter is 8x8, the second vertical boundary 620 is the only sub-block boundary existing on the deblocking filter grid among the sub-block boundaries.

According to an embodiment of the present invention, when the current coding block is a block that has been predicted and/or transformed through intra prediction in sub-block units, the deblocking filter may be performed on all sub-block boundaries: both sub-block boundaries that lie on the deblocking filter grid, such as the second vertical boundary 620, and sub-block boundaries that do not lie on the deblocking filter grid, such as the first vertical boundary 621 and the third vertical boundary 622. In this case, the properties of the deblocking filter applied to each boundary (e.g., at least one of strength, number of taps, coefficients, position/number of input pixels, etc.) may be different.

According to another embodiment of the present invention, when the current coding block is a block in which prediction and/or transformation is performed through intra prediction in sub-block units, the deblocking filter is performed only at sub-block boundaries that lie on the deblocking filter grid, such as the second vertical boundary 620, and may be omitted at sub-block boundaries that do not lie on the deblocking filter grid, such as the first vertical boundary 621 and the third vertical boundary 622.
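The two embodiments above differ only in how off-grid sub-block boundaries are handled, which can be sketched as a single selection routine. The labels `"on_grid_filter"` and `"off_grid_filter"`, the flag `filter_off_grid`, and the boundary positions 4, 8, and 12 (for a 16-sample-wide block on an 8-sample grid) are assumptions used only for illustration.

```python
def plan_sub_block_filtering(boundaries, grid_size=8, filter_off_grid=True):
    """Map each sub-block boundary position to a filter label, or None if skipped."""
    plan = {}
    for x in boundaries:
        if x % grid_size == 0:
            plan[x] = "on_grid_filter"
        elif filter_off_grid:
            # First embodiment: filtered, possibly with different strength or taps.
            plan[x] = "off_grid_filter"
        else:
            # Second embodiment: filtering is omitted off the grid.
            plan[x] = None
    return plan


print(plan_sub_block_filtering([4, 8, 12], filter_off_grid=True))
print(plan_sub_block_filtering([4, 8, 12], filter_off_grid=False))
```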

In addition, when performing the deblocking filter on one coding block, if the current coding block is a block that has been predicted and/or transformed through intra prediction in sub-block units, the deblocking filter proposed in the present invention may filter different pixels, use a different deblocking filter strength, or apply a different number of deblocking filter taps compared with a coding block of the same size as the current coding block.

In an embodiment of the present invention, when the current coding block is smaller than a specific block size M, a deblocking filter may be performed on N pixels located in the Q block (the block currently subjected to the deblocking filter) at a boundary of at least one of a coding block (CB), a prediction block (PB), or a transform block (TB) that lies on the block grid for the deblocking filter.

In this case, when the current coding block is a block in which prediction and/or transformation is performed through intra prediction in sub-block units, a deblocking filter may be performed on (N+K) pixels located in the Q block.

In this case, M may mean the width or height of the block and may be 16, 32, 64, or 128.

In addition, in this case, N denotes the number of pixels adjacent to the block boundary that are included in each of the P block (the block adjacent, across the block boundary, to the block currently subjected to the deblocking filter) and the Q block (the block currently subjected to the deblocking filter), and may be an integer of 1, 2, 3, 4, 5, 6, 7 or more.

In addition, K denotes the number of pixels to be additionally subjected to the deblocking filter among the pixels adjacent to the block boundary in the Q block when the current coding block is a block that has been predicted and/or transformed through intra prediction in sub-block units, and may have a value from 0 to the width or height of the current block minus N.
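The relationship between N, K, and the block size can be sketched as follows. The function name and the argument `block_size` (the width or height of the current block in the direction across the boundary) are assumptions made for this illustration.

```python
def q_block_filtered_pixels(n, k, block_size, sub_block_intra):
    """Number of Q-block pixels filtered on each line crossing the boundary.

    n: pixels filtered for an ordinary block.
    k: additional pixels filtered when the block uses sub-block intra prediction.
    """
    if not 0 <= k <= block_size - n:
        raise ValueError("k must lie in the range 0 .. block_size - n")
    return n + k if sub_block_intra else n


print(q_block_filtered_pixels(3, 2, 8, sub_block_intra=True))   # 5
print(q_block_filtered_pixels(3, 2, 8, sub_block_intra=False))  # 3
```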

In another embodiment of the present invention, when the current coding block is a block that has been predicted and/or transformed through intra prediction in sub-block units, a deblocking filter strength different from that of the filter applied to the N pixels may be used for the K pixels located in the Q block.

In addition, when the current coding block is a block that has been predicted and transformed through intra prediction in sub-block units, a deblocking filter different from the filter applied to the N pixels may be applied to the K pixels located in the Q block. Here, different filters may mean that at least one of the filter strength, coefficient values, number of taps, number/position of input pixels, and the like is different.

FIG. 7 is a diagram illustrating an example of a pixel to be subjected to deblocking filtering and a reference pixel used for filtering at a boundary between a P block and a Q block in applying a deblocking filter to a horizontal boundary.

In FIG. 7, a Q block 700, which is the block currently subjected to deblocking filtering, a P block 710 spatially adjacent above it, and a boundary 720 between the P block and the Q block are shown. In the P block, the target pixels to which deblocking filtering is applied are a total of three pixel rows adjacent to the block boundary 720, and the reference pixels used for filtering each pixel row are illustrated in 713, 712, and 711 of FIG. 7. FIG. 7 shows an embodiment for a horizontal boundary; when the deblocking filter is applied to a vertical boundary, all concepts of the present invention described above or below are applied to pixel columns instead of pixel rows.

In 713 of FIG. 7, the target pixel for deblocking filtering is the p2 pixel, which is the third pixel from the boundary, and the pixels referred to in filtering the p2 pixel are the p3, p2, p1, p0, and q0 pixels. In this case, the p2 pixel may be set to a value p2', which is a weighted average of the five pixels p3, p2, p1, p0, and q0 using predefined weights. However, the weighted average p2' is limited to a value within the range obtained by adding a specific offset value to, or subtracting it from, the p2 value.

In 712 of FIG. 7, the target pixel for deblocking filtering is the p1 pixel, which is the second pixel from the boundary, and the pixels referred to in filtering the p1 pixel are the p2, p1, p0, and q0 pixels. In this case, the p1 pixel may be set to a value p1', which is a weighted average of the four pixels p2, p1, p0, and q0 using predefined weights. However, the weighted average p1' is limited to a value within the range obtained by adding a specific offset value to, or subtracting it from, the p1 value.

In 711 of FIG. 7, the target pixel for deblocking filtering is the p0 pixel, which is the first pixel from the boundary, and the pixels referred to in filtering the p0 pixel are the p2, p1, p0, q0, and q1 pixels. In this case, the p0 pixel may be set to a value p0', which is a weighted average of the five pixels p2, p1, p0, q0, and q1 using predefined weights. However, the weighted average p0' is limited to a value within the range obtained by adding a specific offset value to, or subtracting it from, the p0 value.

Similarly, the target pixels to which deblocking filtering is applied in the Q block are a total of three pixel rows adjacent to the block boundary 720, and the reference pixels used for filtering each pixel row are shown in 703, 702, and 701 of FIG. 7. FIG. 7 shows an embodiment for a horizontal boundary; when the deblocking filter is applied to a vertical boundary, all concepts of the present invention described above or below are applied to pixel columns instead of pixel rows.

In 703 of FIG. 7, the target pixel for deblocking filtering is the q2 pixel, which is the third pixel from the boundary, and the pixels referred to in filtering the q2 pixel are the q3, q2, q1, q0, and p0 pixels. In this case, the q2 pixel may be set to a value q2', which is a weighted average of the five pixels q3, q2, q1, q0, and p0 using predefined weights. However, the weighted average q2' is limited to a value within the range obtained by adding a specific offset value to, or subtracting it from, the q2 value.

In 702 of FIG. 7, the target pixel for deblocking filtering is the q1 pixel, which is the second pixel from the boundary, and the pixels referred to in filtering the q1 pixel are the q2, q1, q0, and p0 pixels. In this case, the q1 pixel may be set to a value q1', which is a weighted average of the four pixels q2, q1, q0, and p0 using predefined weights. However, the weighted average q1' is limited to a value within the range obtained by adding a specific offset value to, or subtracting it from, the q1 value.

In 701 of FIG. 7, the target pixel for deblocking filtering is the q0 pixel, which is the first pixel from the boundary, and the pixels referred to in filtering the q0 pixel are the q2, q1, q0, p0, and p1 pixels. In this case, the q0 pixel may be set to a value q0', which is a weighted average of the five pixels q2, q1, q0, p0, and p1 using predefined weights. However, the weighted average q0' is limited to a value within the range obtained by adding a specific offset value to, or subtracting it from, the q0 value.
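The weighted averaging and clipping described for FIG. 7 can be sketched as follows. The description only states that the weights are predefined; the weights used here are those of the HEVC strong luma filter, chosen as an assumption because they use exactly the reference pixels listed above. The function names and the `offset` parameter are likewise hypothetical.

```python
def clip(value, low, high):
    """Limit value to the closed range [low, high]."""
    return max(low, min(high, value))


def filter_side(s3, s2, s1, s0, o0, o1, offset):
    """Filter the three pixels s2, s1, s0 on one side of the boundary.

    s0..s3 are pixels on the side being filtered (s0 nearest the boundary);
    o0 and o1 are the two nearest pixels on the opposite side.
    Weights are assumed (HEVC strong luma filter), not taken from the text.
    """
    s0_f = (s2 + 2 * s1 + 2 * s0 + 2 * o0 + o1 + 4) >> 3
    s1_f = (s2 + s1 + s0 + o0 + 2) >> 2
    s2_f = (2 * s3 + 3 * s2 + s1 + s0 + o0 + 4) >> 3
    # Keep each result within +/- offset of the original pixel value.
    return (clip(s2_f, s2 - offset, s2 + offset),
            clip(s1_f, s1 - offset, s1 + offset),
            clip(s0_f, s0 - offset, s0 + offset))


p = [60, 60, 60, 60]      # p0, p1, p2, p3
q = [100, 100, 100, 100]  # q0, q1, q2, q3
print(filter_side(p[3], p[2], p[1], p[0], q[0], q[1], offset=8))  # (65, 68, 68)
print(filter_side(q[3], q[2], q[1], q[0], p[0], p[1], offset=8))  # (95, 92, 92)
```

The same routine serves both the P side and the Q side because the reference pixel patterns are mirror images of each other.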

FIG. 8 is a diagram illustrating the concept of pixels to be subjected to deblocking filtering and the reference pixels used for filtering at the boundary between a P block and a Q block and at a sub-block boundary within the Q block.

Unlike the method shown in FIG. 7, in which deblocking filtering is performed only at the boundary between the P block and the Q block, when the Q block is a block on which intra prediction or transformation has been performed in sub-block units, blocking artifacts may occur at a sub-block boundary even though that boundary does not lie on the block grid of the deblocking filter. FIG. 8 shows an embodiment in which, in order to remove such artifacts effectively, a deblocking filter is additionally performed on pixels located at the sub-block boundary inside the Q block.

In FIG. 8, in addition to what is shown in FIG. 7, the pixels to be subjected to deblocking filtering are the three pixel rows adjacent to the block boundary 720 and N additional pixel rows at the sub-block boundary 800 inside the block. The filtering of the N additional pixel rows at the sub-block boundary and the associated reference pixels are illustrated in 801 and 802 of FIG. 8. FIG. 8 shows an embodiment for a horizontal boundary; when the deblocking filter is applied to a vertical boundary, all concepts of the present invention described above or below are applied to pixel columns instead of pixel rows.

In addition, FIG. 8 illustrates the case where N is 2 as an example of the N additional pixel rows at the sub-block boundary 800, but the present invention is not limited thereto, and N may also be 4, 8, etc.

In 801 of FIG. 8, the target pixel for deblocking filtering is the q3 pixel, which is the first pixel above or to the left of the sub-block boundary 800, and the pixels referred to in filtering the q3 pixel are the q2, q3, q4, and q5 pixels. In this case, the q3 pixel may be set to a value q3', which is a weighted average of the four pixels q2, q3, q4, and q5 using predefined weights. However, the weighted average q3' is limited to a value within the range obtained by adding a specific offset value to, or subtracting it from, the q3 value. In addition, the q2, q4, and q5 pixels are not limited to those shown in FIG. 8, and may mean pixels at predefined locations at an integer pixel distance from the sub-block boundary, such as +1, +2, -1, -2, or +N, -N.

In 802 of FIG. 8, the target pixel for deblocking filtering is the q4 pixel, which is the first pixel below or to the right of the sub-block boundary 800, and the pixels referred to in filtering the q4 pixel are the q2, q3, q4, and q5 pixels. In this case, the q4 pixel may be set to a value q4', which is a weighted average of the four pixels q2, q3, q4, and q5 using predefined weights. However, the weighted average q4' is limited to a value within the range obtained by adding a specific offset value to, or subtracting it from, the q4 value. In addition, the q2, q3, and q5 pixels are not limited to those shown in FIG. 8, and may mean pixels at predefined locations at an integer pixel distance from the sub-block boundary, such as +1, +2, -1, -2, or +N, -N.

In the above embodiment, filtering using four pixels has been described as an example of a deblocking filter for a sub-block boundary, but in the present invention five pixels or three pixels may also be used as reference pixels for the target pixel of the deblocking filter. Also, the number of reference pixels for deblocking filtering may differ according to position.
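A generic form of the sub-block boundary filter, which accepts any number of reference pixels, can be sketched as follows. The weights `[1, 4, 2, 1]` and `[1, 2, 4, 1]`, the sample values, and the `offset` parameter are purely illustrative assumptions; the description only states that the weights are predefined.

```python
def clipped_weighted_average(pixels, weights, target, offset):
    """Rounded weighted average of the reference pixels, clipped around target."""
    total = sum(weights)
    average = (sum(p * w for p, w in zip(pixels, weights)) + total // 2) // total
    return max(target - offset, min(target + offset, average))


# Reference pixels around a sub-block boundary lying between q3 and q4.
q2, q3, q4, q5 = 50, 52, 80, 82
q3_f = clipped_weighted_average([q2, q3, q4, q5], [1, 4, 2, 1], target=q3, offset=6)
q4_f = clipped_weighted_average([q2, q3, q4, q5], [1, 2, 4, 1], target=q4, offset=6)
print(q3_f, q4_f)  # 58 74
```

Passing three or five reference pixels with matching weight lists covers the alternative reference pixel counts without changing the routine.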

FIG. 9 is a diagram for explaining the basic concept of the current picture referencing mode.

As shown in FIG. 9, the current picture reference technique is a technique for predicting the current block from an already reconstructed region of the same picture to which the current block belongs.

In encoding or decoding the current block 910 of the current picture 900 of FIG. 9, a region 901 previously reconstructed according to the encoding and decoding order exists, and it may contain areas with pixel similarity to the current block 910. Accordingly, the technique in which, based on the pixel similarity, a reference block 930 similar to the current block 910 is located in the previously reconstructed region 901 of the current picture and prediction is performed using the reference block 930 is defined as the current picture reference technique.

Information on whether to refer to the current picture may be encoded and signaled by an encoding device, or may be derived by a decoding device. In this case, the derivation may be performed based on the size, shape, location, division type (e.g., quad tree, binary tree, ternary tree), and prediction mode of the block, and the location/type of the tile or tile group to which the block belongs. Here, the block may mean at least one of a current block or a neighboring block adjacent to the current block.

In this case, the pixel distance between the current block 910 and the reference block 930 is defined as a vector, and the vector is called a block vector.
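How a block vector locates the reference block can be sketched as follows. The picture is modeled as a list of sample rows, and the function name and argument layout are assumptions for illustration. The sketch only checks that the reference block does not start outside the picture; checking that it lies entirely inside the already reconstructed region is left out.

```python
def predict_from_current_picture(picture, x, y, width, height, block_vector):
    """Copy the reference block displaced by block_vector from position (x, y)."""
    bvx, bvy = block_vector
    ref_x, ref_y = x + bvx, y + bvy
    if ref_x < 0 or ref_y < 0:
        raise ValueError("reference block starts outside the picture")
    return [row[ref_x:ref_x + width] for row in picture[ref_y:ref_y + height]]


# 8x8 test picture whose sample value encodes its row and column.
picture = [[r * 10 + c for c in range(8)] for r in range(8)]
prediction = predict_from_current_picture(picture, x=4, y=4, width=2, height=2,
                                          block_vector=(-4, -2))
print(prediction)  # [[20, 21], [30, 31]]
```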

In encoding the current picture reference mode, the block vector information, which is the prediction information of the current block, may be signaled using methods such as skip, merge, and differential signal transmission through a block vector prediction method, similar to inter prediction. For example, the block vector may be derived from a neighboring block of the current block. In this case, the neighboring block may be limited to a block coded in the current picture reference mode, or may be limited to a block coded in the merge mode (or skip mode). Alternatively, it may be limited to blocks encoded with other prediction modes (AMVP mode, affine mode, etc.) pre-defined in the decoding device. Alternatively, the block vector may be derived based on information (e.g., block index) specifying the position of the reference block.

FIG. 10 illustrates in detail an embodiment in which the current block belongs to a CTU located at the leftmost position of a picture, a tile group, or a tile, that is, at its left boundary.

FIG. 9 shows the search range of the current picture reference for the current block, that is, the reference region. FIG. 10 illustrates an embodiment of the CTU including the current block and the region previously reconstructed according to the encoding and decoding order.

In FIG. 10, an example in which the current picture 1000 is divided into a tile group A 1050 and a tile group B 1060 is illustrated. In this case, the tile group is one of methods of dividing one picture, and a tile may also correspond to this. The tile group may be composed of one or more tiles. Hereinafter, the tile group may be understood as one tile.

In FIG. 10, tile group A 1050 is a tile group composed of CTUs 1001, 1002, 1006, 1007, 1011, and 1012, and tile group B 1060 is a tile group composed of CTUs 1003, 1004, 1005, 1008, 1009, 1010, 1013, 1014, and 1015.

The predefined reference area for referencing the current picture may mean a partial area of an already reconstructed area 901 in the current picture 900 shown in FIG. 9. In addition, the partial region may be at least one of a current CTU including a current block, a left CTU spatially adjacent to the current CTU, or a CTU positioned at the top.

In particular, when the CTU including the current block is located at the far left of the current picture, the current tile group, or the current tile, the CTU including the current block and the CTU spatially adjacent above it may be referred to.

BLOCK A 1051 shown in FIG. 10 is an embodiment of the case in which the current block is included in a CTU located at the far left of the current picture 1000. If the current block is included in BLOCK A 1051, the area that the current block can refer to in performing the current picture reference may correspond to the area previously reconstructed according to the encoding and decoding order inside the current CTU (BLOCK A).

In addition, in the present invention, when the CTU including the current block is located at the leftmost position of the current picture 1000 and a spatially adjacent upper CTU exists, the spatially adjacent upper CTU 1006 may be used as a reference region.

If there is no available CTU to the left of the CTU to which the current block belongs (hereinafter referred to as the current CTU), the current picture reference may be performed using only the previously reconstructed region within the current CTU, or the current picture reference may not be performed. Alternatively, a specific region reconstructed before the current CTU may be referred to. The specific region may belong to the same tile or tile group, or may be P CTUs decoded immediately before the current CTU. P may be 1, 2, 3, or more. The value of P may be a fixed value pre-defined in the encoding/decoding apparatus, or may be variably determined according to the position of the current block and/or the current CTU.

In addition, it may include using the CTU 1007, which has been previously encoded and decoded according to the encoding and decoding order, as a reference region.

BLOCK B 1061 shown in FIG. 10 is a CTU included in tile group B, the second tile group of the current picture 1000, and is located at the far left of tile group B, at the tile group boundary. If the current block is included in BLOCK B 1061, the area that the current block can refer to in performing the current picture reference may correspond to the area previously reconstructed according to the encoding and decoding order inside the current CTU (BLOCK B).

In addition, in the present invention, when the CTU including the current block is located at the far left of the current tile group 1060 and a spatially adjacent upper CTU exists, the spatially adjacent upper CTU 1003 may be used as a reference region.

In addition, it may include using the CTU 1005, which has been previously encoded and decoded according to the encoding and decoding order, as a reference region.
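The selection of the upper CTU and the previously decoded CTU for a CTU on the left boundary of a tile group can be sketched as follows. The sketch assumes that the CTUs of FIG. 10 are numbered in raster order over five columns, that tile group A covers columns 0 and 1 and tile group B covers columns 2 to 4 (as the CTU lists suggest), and that CTUs inside a tile group are processed in raster order. All names are hypothetical.

```python
COLUMNS = 5        # assumed number of CTU columns in the picture
FIRST_CTU = 1001   # reference numeral of the top-left CTU
TILE_GROUP_COLUMNS = {"A": range(0, 2), "B": range(2, 5)}


def position(ctu):
    """Return the (row, column) of a CTU reference numeral."""
    index = ctu - FIRST_CTU
    return index // COLUMNS, index % COLUMNS


def tile_group_of(ctu):
    _, col = position(ctu)
    return next(name for name, cols in TILE_GROUP_COLUMNS.items() if col in cols)


def reference_ctus(ctu):
    """Upper CTU and previously decoded CTU for a CTU on the left boundary."""
    row, col = position(ctu)
    cols = TILE_GROUP_COLUMNS[tile_group_of(ctu)]
    if col != cols[0]:
        raise ValueError("CTU is not on the left boundary of its tile group")
    upper = ctu - COLUMNS if row > 0 else None
    # Previous CTU in raster order inside the tile group: last CTU of the row above.
    previous = FIRST_CTU + (row - 1) * COLUMNS + cols[-1] if row > 0 else None
    return upper, previous


print(reference_ctus(1008))  # (1003, 1005)
```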

In addition, in the present invention, the concept referred to as a tile group in the above description may be replaced with the concept of a tile.

In addition, in the present invention, when the CTU including the current block is located at the left boundary of a tile or tile group, the CTU located to the left of the current CTU may be used as a reference area, provided that the current tile or tile group is one for which prediction between tiles or tile groups is allowed. To this end, information on a prediction/referencing relationship between tiles or tile groups may be encoded. For example, the information may include at least one of whether reference between tiles is allowed, whether a current tile refers to another tile, the number of tiles belonging to a picture, an index specifying the location of a tile, the number/location of referenced tiles, and the like. The information may be signaled at at least one level of a video sequence, a picture, a tile group, or a tile.

FIG. 11 illustrates an embodiment of a region including a current block and a search and reference available region of a current picture reference (CPR).

FIG. 11 shows the current CTU divided into a plurality of VPDUs. A VPDU is the maximum unit on which encoding and decoding can be performed at one time, introduced to reduce the hardware implementation cost that grows as the size of the CTU increases. Here, the VPDU may mean a block having at least one of a width or a height smaller than that of the CTU. When the split depth of the CTU is k, the VPDU may be defined as a block having a split depth of (k+1) or (k+2). The shape of the VPDU may be square or rectangular, but may be limited to square as needed.

In addition, the size of the VPDU may use a predefined arbitrary size, or may use a size of a quarter of the CTU. At this time, the predefined arbitrary size may be 64x64, 32x32, or 128x128.

In FIG. 11, a CTU 1110 including a current block and a CTU 1100 spatially adjacent to the left are shown. In this case, a search range for referencing the current picture of the current block, that is, a reference region may be predefined.

In particular, when the current block is included in the first VPDU 1111 of the current CTU 1110, all or only some of the pixel areas of the CTU 1100 adjacent to the left may be used as the reference area. According to the embodiment shown in FIG. 11, the second VPDU 1102, the third VPDU 1103, and the fourth VPDU 1104, excluding the first VPDU 1101 of the CTU 1100 adjacent to the left, and a region reconstructed in advance according to the encoding and decoding order within the VPDU 1111 including the current block may be used as the reference area.

As shown in FIG. 11, an embodiment of the present invention includes using only some areas of spatially adjacent CTUs when spatially adjacent CTUs are used as the search and reference area of the current picture reference (CPR).

According to the encoding/decoding order, only N CTUs encoded/decoded immediately before the current CTU may be used. Alternatively, only M VPDUs encoded/decoded immediately before the current VPDU according to the encoding/decoding order may be used. Here, N and M may be 1, 2, 3, 4, or more integers, and N and M may be the same or different from each other. The number may be a value pre-defined in the encoding/decoding apparatus or may be variably determined based on availability of a block. The M VPDUs may be limited to belonging to the same CTU (or tile, tile group). Alternatively, at least one of the M VPDUs may be limited to belonging to a CTU (or tile or tile group) different from the others. At least one of the aforementioned restrictions may be set in consideration of the location and/or scan order of the current block or current VPDU. Here, the location may be interpreted in various meanings, such as a location within the CTU, a location within a tile, and a location within a tile group. The above-described embodiments can be applied in the same/similar manner to the following embodiments.

FIG. 11 also shows how the reference area in the CTU 1100 adjacent to the left changes for each case in which the current block is included in the second VPDU 1112, the third VPDU 1113, or the fourth VPDU 1114 of the current CTU 1110.

When the current block is included in the second VPDU 1112 of the current CTU 1110, not all pixel areas of the CTU 1100 adjacent to the left but only some areas may be used as the reference area. According to the embodiment shown in FIG. 11, the third VPDU 1103 and the fourth VPDU 1104, excluding the first VPDU 1101 and the second VPDU 1102 of the CTU 1100 adjacent to the left, may be used as the reference area. In addition, a region reconstructed in advance according to the encoding and decoding order within the first VPDU 1111 of the current CTU 1110 and the second VPDU 1112 including the current block may be used as the reference area.

In addition, when the current block is included in the third VPDU 1113 of the current CTU 1110, not all pixel areas of the CTU 1100 adjacent to the left but only some areas may be used as the reference area. According to the embodiment shown in FIG. 11, the fourth VPDU 1104, excluding the first VPDU 1101, the second VPDU 1102, and the third VPDU 1103 of the CTU 1100 adjacent to the left, may be used as the reference area. In addition, a region reconstructed in advance according to the encoding and decoding order within the first VPDU 1111 and the second VPDU 1112 of the current CTU 1110 and the third VPDU 1113 including the current block may be used as the reference area.

In addition, when the current block is included in the fourth VPDU 1114 of the current CTU 1110, the CTU 1100 adjacent to the left is not used as a reference area, and only the previously reconstructed area inside the current CTU 1110 may be used as the reference area.
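For the z-scan case of FIG. 11, the left-CTU VPDUs that remain available as reference can be sketched as follows. The function name and the use of 1-based VPDU indices are assumptions made for this illustration.

```python
def left_ctu_reference_vpdus(current_vpdu_index, vpdus_per_ctu=4):
    """1-based indices of left-CTU VPDUs usable as reference area.

    current_vpdu_index is the 1-based index, in z-scan order, of the VPDU of
    the current CTU that contains the current block.
    """
    return list(range(current_vpdu_index + 1, vpdus_per_ctu + 1))


for index in range(1, 5):
    print(index, left_ctu_reference_vpdus(index))
# 1 [2, 3, 4]
# 2 [3, 4]
# 3 [4]
# 4 []
```

Each VPDU of the current CTU that is started removes the co-indexed VPDU of the left CTU from the reference area, so the total amount of reference data stays bounded.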

FIG. 12 illustrates another embodiment of an area including a current block and a search and reference available area for a current picture reference (CPR).

FIG. 12 illustrates an additional embodiment for the case in which the VPDU execution order differs from the existing z-scan order according to the partition type of the current CTU 1210.

When the current CTU 1210 is vertically divided so that the VPDUs are processed in the order of the first VPDU 1211, the third VPDU 1213, the second VPDU 1212, and the fourth VPDU 1214, and the current block is included in the third VPDU 1213 of the current CTU 1210, the second VPDU 1202 and the fourth VPDU 1204, excluding the first VPDU 1201 and the third VPDU 1203 within the CTU 1200 adjacent to the left, may be used as the reference area. In addition, a region reconstructed in advance according to the encoding and/or decoding order within the first VPDU 1211 of the current CTU 1210 and the third VPDU 1213 including the current block may be used as the reference area. In this embodiment, the region of the left CTU that is spatially adjacent to the current VPDU is referred to preferentially.

In addition, when the VPDU execution order in the left CTU is the first VPDU 1211, the third VPDU 1213, the second VPDU 1212, and the fourth VPDU 1214 due to vertical division, the reference area may be designated as in the above method (FIG. 12(a)).

Unlike the embodiment of FIG. 12(a), when the current CTU 1210 is vertically divided so that the VPDUs are processed in the order of the first VPDU 1211, the third VPDU 1213, the second VPDU 1212, and the fourth VPDU 1214, and the current block is included in the third VPDU 1213 of the current CTU 1210, the third VPDU 1203 and the fourth VPDU 1204, excluding the first VPDU 1201 and the second VPDU 1202 of the CTU 1200 adjacent to the left, may be used as reference areas. In addition, a region reconstructed in advance according to the encoding and decoding order within the first VPDU 1211 of the current CTU 1210 and the third VPDU 1213 including the current block may be used as the reference area.

This means that the reference area may be designated as in the above method (FIG. 12(b)) when the order of VPDU execution in the left CTU is the first VPDU 1211, the second VPDU 1212, the third VPDU 1213, and the fourth VPDU 1214.

In this embodiment, the regions encoded and decoded later according to the VPDU execution order of the left CTU are referred to preferentially.

In addition, when the current CTU 1210 is vertically divided so that the VPDUs are processed in the order of the first VPDU 1211, the third VPDU 1213, the second VPDU 1212, and the fourth VPDU 1214, and the current block is included in the second VPDU 1212 of the current CTU 1210, the fourth VPDU 1204, excluding the first VPDU 1201, the second VPDU 1202, and the third VPDU 1203 of the CTU 1200 adjacent to the left, may be used as the reference area. In addition, a region reconstructed in advance according to the encoding and decoding order within the first VPDU 1211 and the third VPDU 1213 of the current CTU 1210 and the second VPDU 1212 including the current block may be used as the reference area.

FIG. 13 illustrates another embodiment of a region including a current block and a search and reference available region of a current picture reference (CPR).

FIG. 13 shows an embodiment in which, when the CTU including the current block shown in FIG. 10 is located at the far left of the current picture, the current tile group, and the current tile, and the upper CTU can be referred to, the CTU located above the current CTU among the CTUs spatially adjacent to it is referred to.

When the current block is included in the first VPDU 1311 of the current CTU 1310, not all pixel areas of the CTU 1300 adjacent to the top are used as a reference area; only a partial area may be used. According to the embodiment shown in FIG. 13, the second VPDU 1302, the third VPDU 1303, and the fourth VPDU 1304 of the CTU 1300 adjacent to the top, excluding its first VPDU 1301, may be used as a reference region. In addition, a region reconstructed in advance according to the encoding and decoding order within the first VPDU 1311 including the current block may be used as a reference region.

When the current block is included in the second VPDU 1312 of the current CTU 1310, not all pixel areas of the CTU 1300 adjacent to the top are used as a reference area; only a partial area may be used. According to the embodiment shown in FIG. 13, the third VPDU 1303 and the fourth VPDU 1304 of the CTU 1300 adjacent to the top, excluding its first VPDU 1301 and second VPDU 1302, may be used as reference regions. In addition, a region reconstructed in advance according to the encoding and decoding order within the first VPDU 1311 of the current CTU 1310 and the second VPDU 1312 including the current block may be used as a reference region.

In addition, when the current block is included in the third VPDU 1313 of the current CTU 1310, not all pixel areas of the CTU 1300 adjacent to the top are used as a reference area; only a partial area may be used. According to the embodiment shown in FIG. 13, the fourth VPDU 1304 of the CTU 1300 adjacent to the top, excluding its first VPDU 1301, second VPDU 1302, and third VPDU 1303, may be used as a reference area. In addition, a region reconstructed in advance according to the encoding and decoding order within the first VPDU 1311 and the second VPDU 1312 of the current CTU 1310 and the third VPDU 1313 including the current block can be used as a reference region.

In addition, when the current block is included in the fourth VPDU 1314 of the current CTU 1310, the CTU 1300 adjacent to the top is not used as a reference area, and only the previously reconstructed area inside the current CTU 1310 can be used as the reference area.

FIG. 14 illustrates another embodiment of an area including a current block and a search and reference available area for a current picture reference (CPR).

In addition to FIG. 13, FIG. 14 shows an additional embodiment that refers to a partial region of the CTU 1400 spatially adjacent to the top when the VPDU execution order differs from the existing z-scan order according to the block division type of the current CTU 1410.

When the current CTU 1410 is vertically divided so that the VPDU execution order is the first VPDU 1411, the third VPDU 1413, the second VPDU 1412, and the fourth VPDU 1414, and the current block is included in the third VPDU 1413 of the current CTU 1410, the third VPDU 1403 and the fourth VPDU 1404 of the CTU 1400 adjacent to the top, excluding its first VPDU 1401 and second VPDU 1402, can be used as a reference area. In addition, a region reconstructed in advance according to the encoding and decoding order within the first VPDU 1411 of the current CTU 1410 and the third VPDU 1413 including the current block may be used as a reference region.

In addition, when the current CTU 1410 is vertically divided so that the VPDUs are processed in the order of the first VPDU 1411, the third VPDU 1413, the second VPDU 1412, and the fourth VPDU 1414, and the current block is included in the second VPDU 1412 of the current CTU 1410, only the fourth VPDU 1404 of the CTU 1400 adjacent to the top, excluding its first VPDU 1401, second VPDU 1402, and third VPDU 1403, can be used as a reference area. In addition, a region reconstructed in advance according to the encoding and decoding order within the first VPDU 1411 and the third VPDU 1413 of the current CTU 1410 and the second VPDU 1412 including the current block can be used as a reference region.

As described above, the reference region of the current block may be set based on the current CTU and on a CTU located on at least one of its left or top. That is, the left or upper CTU may be selectively used, and the selection may be performed based on predetermined encoding information. The encoding information may include information on whether to refer to the left or upper CTU, whether the left or upper CTU is available, the scan order, and the position of the current VPDU within the current CTU.
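The cases enumerated for FIGS. 12 to 14 appear to follow one pattern: a VPDU of the neighboring (left or upper) CTU stays referable only if it comes later in the neighboring CTU's execution order than the current VPDU's position in the current CTU's execution order. The following Python sketch encodes that reading. The function names and the list representation of an execution order are hypothetical, and the rule itself is inferred from the examples rather than stated in the text.

```python
# VPDUs are numbered 1..4 (first..fourth) as in the description.
Z_SCAN = [1, 2, 3, 4]          # existing z-scan execution order
VERTICAL_SPLIT = [1, 3, 2, 4]  # execution order after a vertical split


def neighbor_reference_vpdus(current_order, neighbor_order, current_vpdu):
    """Return the neighboring-CTU VPDUs assumed to remain referable.

    Assumption: a neighboring VPDU is kept when its position in
    neighbor_order is later than the position of current_vpdu in
    current_order.
    """
    position = current_order.index(current_vpdu)
    return sorted(neighbor_order[position + 1:])


def current_ctu_candidate_vpdus(current_order, current_vpdu):
    """VPDUs of the current CTU processed up to and including the current one.

    Only the part already reconstructed inside these VPDUs is referable.
    """
    position = current_order.index(current_vpdu)
    return current_order[:position + 1]
```

For example, `neighbor_reference_vpdus(VERTICAL_SPLIT, VERTICAL_SPLIT, 3)` yields the second and fourth VPDUs, which matches the FIG. 12(a) case, while using `Z_SCAN` for the neighboring CTU yields the third and fourth VPDUs, as in FIG. 12(b).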

The merge mode uses the motion information of a neighboring block, unchanged, as the motion information of the current block and, unlike the AMVP mode, does not require encoding/decoding a separate motion vector difference value. However, even in the merge mode, a predetermined motion vector difference value (MVD) may be used to improve the accuracy of the motion vector. In the present invention, motion information may be understood as including at least one of a motion vector, a reference picture index, and prediction direction information.

The MVD may be selectively used based on a predetermined flag (hereinafter, MVD_flag). MVD_flag may indicate whether a motion vector difference value (MVD) is used in the merge mode. For example, when the flag is the first value, the motion vector difference value is used in the merge mode, and if not, the motion vector difference value is not used in the merge mode. That is, when the flag is the first value, the motion vector derived according to the merge mode may be corrected by using the motion vector difference value. Otherwise, the motion vector according to the merge mode may not be corrected.

MVD_flag may be encoded/decoded only when at least one of the width or height of the current block is greater than or equal to 8. Alternatively, MVD_flag may be encoded/decoded only when the number of pixels belonging to the current block is greater than or equal to 64. Alternatively, MVD_flag may be encoded/decoded only when the sum of the width and height of the current block is greater than 12.
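The three alternative size conditions above can be written as a single predicate. This is a minimal sketch: the function name and the `rule` selector are hypothetical and only serve to switch between the alternatives listed in the text.

```python
def mvd_flag_is_signaled(width, height, rule="side"):
    """Decide whether MVD_flag is encoded/decoded for a block.

    rule="side": at least one of width or height is >= 8
    rule="area": the block contains at least 64 pixels
    rule="sum":  width + height is greater than 12
    """
    if rule == "side":
        return width >= 8 or height >= 8
    if rule == "area":
        return width * height >= 64
    if rule == "sum":
        return width + height > 12
    raise ValueError(f"unknown rule: {rule}")
```

Note that the alternatives are not equivalent: a 4x8 block passes the "side" rule but fails both the "area" and "sum" rules.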

Referring to FIG. 15, a merge candidate list of a current block may be configured (S1500).

The merge candidate list may include one or a plurality of merge candidates available to derive motion information of the current block. The size of the merge candidate list may be variably determined based on information indicating the maximum number of merge candidates constituting the merge candidate list (hereinafter, size information). The size information may be encoded and signaled by the encoding device, or may be a fixed value (e.g., an integer of 2, 3, 4, 5, 6, or more) predefined in the decoding device.

The plurality of merge candidates belonging to the merge candidate list may include at least one of a spatial merge candidate or a temporal merge candidate.

The spatial merge candidate may mean a neighboring block spatially adjacent to the current block or motion information of the neighboring block. Here, the neighboring block may include at least one of a lower left block A0, a left block A1, an upper right block B0, an upper block B1, or an upper left block B2 of the current block. According to a predetermined priority order, available neighboring blocks among the neighboring blocks may be sequentially added to the merge candidate list. For example, the priority order may be defined as B1->A1->B0->A1->B2, A1->B1->A0->B1->B2, A1->B1->B0->A0->B2, etc., but is not limited thereto.
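Adding available neighbors in priority order can be sketched as follows. The dictionary representation of the neighbors, the tuple form of motion information, and the function name are assumptions made for illustration; the default order is the third example order given above.

```python
DEFAULT_PRIORITY = ["A1", "B1", "B0", "A0", "B2"]


def build_spatial_candidates(neighbors, max_candidates, priority=DEFAULT_PRIORITY):
    """Collect motion information of available neighbors in priority order.

    neighbors maps a position name (e.g. "A1") to its motion information,
    or to None (or no entry at all) when that neighbor is unavailable.
    """
    candidates = []
    for name in priority:
        if len(candidates) >= max_candidates:
            break  # the list is limited by the signaled size information
        motion = neighbors.get(name)
        if motion is None:
            continue  # skip unavailable neighbors
        candidates.append(motion)
    return candidates
```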

The temporal merge candidate may mean one or more collocated blocks belonging to a collocated picture or motion information of the collocated block. Here, the collocated picture is any one of a plurality of reference pictures included in the reference picture list, and may be a picture different from the picture to which the current block belongs. The collocated picture may be the first picture or the last picture in the reference picture list. Alternatively, the collocated picture may be specified based on an index coded to indicate the collocated picture. The collocated block may include at least one of a block C1 including a center position of the current block or a neighboring block C0 adjacent to the lower right corner of the current block. According to a predetermined priority order, an available block among C0 and C1 may be sequentially added to the merge candidate list. For example, C0 may have a higher priority than C1. However, the present invention is not limited thereto, and C1 may have a higher priority than C0.

The encoding/decoding apparatus may include a buffer that stores motion information of one or more blocks (hereinafter, referred to as previous blocks) for which encoding/decoding has been completed before the current block. In other words, the buffer may store a list consisting of motion information of a previous block (hereinafter, a motion information list).

The motion information list may be initialized in units of any one of a picture, a slice, a tile, a CTU row, or a CTU. Initialization may mean a state in which the motion information list is empty. Motion information of the previous block is sequentially added to the motion information list according to the encoding/decoding order of the previous block, and the motion information list may be updated in a first-in first-out (FIFO) manner in consideration of the size of the motion information list. For example, when the most recently encoded/decoded motion information (hereinafter, the latest motion information) is the same as motion information already in the motion information list, the latest motion information may not be added to the motion information list. Alternatively, motion information identical to the latest motion information may be removed from the motion information list, and the latest motion information may be added to the motion information list. In this case, the latest motion information may be added to the last position of the motion information list or may be added to the position of the removed motion information.
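One of the update variants described above (remove an identical entry, append the latest motion information at the last position, and drop the oldest entry when the list is full) might look like the following sketch. The function name and the list-of-tuples representation are hypothetical.

```python
def update_motion_info_list(motion_list, latest, max_size):
    """Return the motion information list after adding `latest`.

    Entries are ordered from oldest (index 0) to newest (last index).
    """
    # Remove an entry identical to the latest motion information, if any.
    updated = [motion for motion in motion_list if motion != latest]
    # Add the latest motion information at the last position.
    updated.append(latest)
    # FIFO: discard the oldest entries when the size limit is exceeded.
    if len(updated) > max_size:
        updated = updated[len(updated) - max_size:]
    return updated
```

The other variant in the text, which leaves the list unchanged when a duplicate is found, would simply return `motion_list` when `latest` is already present.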

The previous block may include at least one of one or more neighboring blocks spatially adjacent to the current block or one or more neighboring blocks not spatially adjacent to the current block.

A previous block belonging to the buffer or the motion information list, or the motion information of that previous block, may be further added to the merge candidate list as a merge candidate.

Specifically, a redundancy check between the motion information list and the merge candidate list may be performed. The redundancy check may be performed on all or some of the merge candidates belonging to the merge candidate list and all or some of the previous blocks in the motion information list. However, for convenience of explanation, it is assumed that the redundancy check of the present invention is performed on some of the merge candidates belonging to the merge candidate list and some of the previous blocks in the motion information list. Here, some merge candidates in the merge candidate list may include at least one of a left block or an upper block among the spatial merge candidates. However, the present invention is not limited thereto, and some merge candidates may be limited to any one block among the spatial merge candidates, and may further include at least one of a lower left block, an upper right block, an upper left block, or a temporal merge candidate. Some previous blocks of the motion information list may mean the K previous blocks most recently added to the motion information list. Here, K may be 1, 2, 3 or more, and may be a fixed value predefined in the encoding/decoding device.

For example, it is assumed that 5 previous blocks (or motion information of the previous blocks) are stored in the motion information list, and indices 1 to 5 are allocated to the respective previous blocks. The larger the index, the more recently the previous block was stored. In this case, redundancy of motion information may be checked between the previous blocks having indices 5, 4, and 3 and some merge candidates of the merge candidate list. Alternatively, redundancy between the previous blocks having indices 5 and 4 and some merge candidates of the merge candidate list may be checked. Alternatively, redundancy between the previous blocks having indices 4 and 3 and some merge candidates of the merge candidate list may be checked, excluding the most recently added previous block of index 5.

As a result of the redundancy check, if even one previous block having the same motion information exists, that previous block of the motion information list may not be added to the merge candidate list. On the other hand, if there is no previous block having the same motion information, all or some of the previous blocks of the motion information list may be added to the last position of the merge candidate list. In this case, the previous blocks may be added to the merge candidate list starting from the one most recently added to the motion information list (that is, in order from the largest index to the smallest). However, the previous block most recently added to the motion information list (i.e., the previous block having the largest index) may be restricted so that it is not added to the merge candidate list. The addition of the previous block may be performed in consideration of the size of the merge candidate list. For example, it is assumed that the merge candidate list has at most T merge candidates according to the size information of the merge candidate list described above. In this case, the addition of the previous block may be limited to be performed only until the number of merge candidates included in the merge candidate list reaches (T-n). Here, n may be an integer of 1, 2 or more. Alternatively, the addition of the previous block may be repeatedly performed until the number of merge candidates belonging to the merge candidate list reaches T.
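The following sketch shows one possible reading of the procedure above: previous blocks are visited from newest to oldest, the K newest ones are compared against a chosen subset of merge candidates (for example the left and upper spatial candidates), duplicates are skipped, and additions stop at (T-n) candidates. All names and default values are hypothetical, and treating the redundancy check as a per-entry skip is an assumption.

```python
def add_previous_blocks(merge_list, motion_list, checked_candidates,
                        max_size, k=2, n=1, skip_latest=False):
    """Append previous-block motion information to a merge candidate list.

    merge_list          existing merge candidates (spatial/temporal)
    motion_list         previous blocks, ordered oldest to newest
    checked_candidates  subset of merge candidates used in the redundancy check
    max_size            T, the maximum number of merge candidates
    k                   number of newest previous blocks that are checked
    n                   additions stop once the list holds (T - n) candidates
    skip_latest         if True, the newest previous block is never added
    """
    result = list(merge_list)
    limit = max_size - n
    newest_first = list(reversed(motion_list))
    if skip_latest:
        newest_first = newest_first[1:]
    for rank, motion in enumerate(newest_first):
        if len(result) >= limit:
            break
        # Only the k newest previous blocks take part in the redundancy check.
        if rank < k and motion in checked_candidates:
            continue
        result.append(motion)
    return result
```

Setting `n=0` corresponds to the alternative in which additions continue until the list holds T candidates.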

Referring to FIG. 15, motion information of a current block may be derived based on a merge candidate list and a merge index (merge_idx) (S1510).

The merge index may specify any one of a plurality of merge candidates belonging to the merge candidate list. The motion information of the current block may be set as motion information of the merge candidate specified by the merge index.

According to the value of MVD_flag, the maximum number of merge candidates available to the current block may be adaptively determined. If MVD_flag is 0, up to M merge candidates are used, while if MVD_flag is 1, N merge candidates may be used. Here, N may be a natural number smaller than M.

For example, when MVD_flag is 1, the signaled merge index may have a value of 0 or 1. That is, when a motion vector difference value is used in the merge mode, the motion information of the current block may be derived using only one of the first merge candidate (merge_idx=0) or the second merge candidate (merge_idx=1) of the merge candidate list.

Accordingly, even when the maximum number of merge candidates in the merge candidate list is M, when a motion vector difference value is used in the merge mode, the maximum number of merge candidates available to the current block may be two.

Alternatively, when MVD_flag is 1, the merge index is not encoded/decoded, and instead, the merge index is set to 0, whereby the use of the first merge candidate may be forced. Alternatively, when MVD_flag is 1, the merge index may have a value between 0 and i, i may be an integer of 2, 3 or more, and i may be the same as (M-1).

Referring to FIG. 15, a motion vector difference value (MVD) for a merge mode of a current block may be derived (S1520).

The MVD of the current block may be derived based on a merge offset vector (offsetMV). The MVD includes at least one of MVD (MVD0) in the L0 direction and MVD (MVD1) in the L1 direction, and each of MVD0 and MVD1 may be derived using a merge offset vector.

The merge offset vector may be determined based on the length (mvdDistance) and direction (mvdDirection) of the merge offset vector. For example, the merge offset vector (offsetMV) may be determined as in Equation 7 below.

[Equation 7]

offsetMV[ x0 ][ y0 ][ 0] = (mvdDistance[ x0 ][ y0] << 2) * mvdDirection[ x0 ][ y0 ][0]

offsetMV[ x0 ][ y0 ][ 1] = (mvdDistance[ x0 ][ y0] << 2) * mvdDirection[ x0 ][ y0 ][1]

Here, mvdDistance may be determined in consideration of at least one of a distance index (distance_idx) or a predetermined flag (pic_fpel_mmvd_enabled_flag). The distance index (distance_idx) may mean an index encoded to specify the length or distance of the motion vector difference value MVD. pic_fpel_mmvd_enabled_flag may indicate whether the motion vector uses integer pixel precision in the merge mode of the current block. For example, when pic_fpel_mmvd_enabled_flag is the first value, the merge mode of the current block uses integer pixel precision. That is, this may mean that the motion vector resolution of the current block is an integer pel. On the other hand, when pic_fpel_mmvd_enabled_flag is the second value, the merge mode of the current block may use fractional pixel precision. In other words, when pic_fpel_mmvd_enabled_flag is the second value, the merge mode of the current block may use integer pixel precision or fractional pixel precision. Alternatively, when pic_fpel_mmvd_enabled_flag is the second value, the merge mode of the current block may be limited to use only fractional pixel precision. Examples of fractional pixel precision may include 1/2 pel, 1/4 pel, and 1/8 pel.

For example, mvdDistance may be determined as shown in Table 3 below.

[Table 3]

distance_idx[ x0 ][ y0 ]	MmvdDistance[ x0 ][ y0 ] (pic_fpel_mmvd_enabled_flag == 0)	MmvdDistance[ x0 ][ y0 ] (pic_fpel_mmvd_enabled_flag == 1)
0	1	4
1	2	8
2	4	16
3	8	32
4	16	64
5	32	128
6	64	256
7	128	512

Also, mvdDirection indicates the direction of the merge offset vector, and may be determined based on a direction index (direction_idx). Here, the direction may include at least one of left, right, top, bottom, top left, bottom left, top right, or bottom right. For example, mvdDirection may be determined as shown in Table 4 below.

[Table 4]

direction_idx[ x0 ][ y0 ]	mvdDirection[ x0 ][ y0 ][0]	mvdDirection[ x0 ][ y0 ][1]
0	+1	0
1	-1	0
2	0	+1
3	0	-1

In Table 4, mvdDirection[ x0 ][ y0 ][0] may mean the sign of the x component of the motion vector difference value, and mvdDirection[ x0 ][ y0 ][1] may mean the sign of the y component of the motion vector difference value. When direction_idx is 0, the direction of the motion vector difference value may be determined as the right direction; when direction_idx is 1, as the left direction; when direction_idx is 2, as the downward direction; and when direction_idx is 3, as the upward direction.
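Equation 7 combined with Tables 3 and 4 can be illustrated with a short Python sketch. The function name and the tuple return type are hypothetical; the table values and the left shift by 2 are taken from the description.

```python
# Table 3: distance values indexed by pic_fpel_mmvd_enabled_flag, then distance_idx.
MMVD_DISTANCE = {
    0: (1, 2, 4, 8, 16, 32, 64, 128),
    1: (4, 8, 16, 32, 64, 128, 256, 512),
}

# Table 4: (x sign, y sign) indexed by direction_idx.
MVD_DIRECTION = ((+1, 0), (-1, 0), (0, +1), (0, -1))


def merge_offset_vector(distance_idx, direction_idx, pic_fpel_mmvd_enabled_flag=0):
    """Compute offsetMV = (mvdDistance << 2) * mvdDirection per component."""
    distance = MMVD_DISTANCE[pic_fpel_mmvd_enabled_flag][distance_idx]
    sign_x, sign_y = MVD_DIRECTION[direction_idx]
    magnitude = distance << 2
    return (magnitude * sign_x, magnitude * sign_y)
```

For instance, distance_idx = 2 and direction_idx = 1 with pic_fpel_mmvd_enabled_flag equal to 0 give a distance of 4, a shifted magnitude of 16, and an offset vector of (-16, 0), that is, a displacement to the left.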

The above-described distance index and direction index may be encoded/decoded only when MVD_flag is the first value.

Meanwhile, the motion vector difference value MVD may be set equal to the previously determined merge offset vector. Alternatively, the merge offset vector may be corrected in consideration of a POC difference (PocDiff) between a reference picture of the current block and the current picture to which the current block belongs, and the corrected merge offset vector may be set as the motion vector difference value (MVD). In this case, the current block is encoded/decoded with bidirectional prediction, and the reference picture of the current block may include a first reference picture (a reference picture in the L0 direction) and a second reference picture (a reference picture in the L1 direction). For convenience of explanation, the POC difference between the first reference picture and the current picture is hereinafter referred to as PocDiff0, and the POC difference between the second reference picture and the current picture is referred to as PocDiff1.

When PocDiff0 and PocDiff1 are the same, MVD0 and MVD1 of the current block may each be set equal to the merge offset vector.

When PocDiff0 and PocDiff1 are not the same and the absolute value of PocDiff0 is greater than or equal to the absolute value of PocDiff1, MVD0 may be set equal to the merge offset vector, and MVD1 may be derived based on the MVD0 thus set. For example, when the first and second reference pictures are long-term reference pictures, MVD1 may be derived by applying a first scaling factor to MVD0. The first scaling factor may be determined based on PocDiff0 and PocDiff1. On the other hand, when at least one of the first or second reference pictures is a short-term reference picture, MVD1 may be derived by applying a second scaling factor to MVD0. The second scaling factor may be a fixed value (e.g., -1/2, -1, etc.) predefined in the encoding/decoding device. However, the second scaling factor may be applied only when the sign of PocDiff0 and the sign of PocDiff1 are different from each other. If the sign of PocDiff0 and the sign of PocDiff1 are the same, MVD1 is set equal to MVD0, and separate scaling may not be performed.

Meanwhile, when PocDiff0 and PocDiff1 are not the same and the absolute value of PocDiff0 is less than the absolute value of PocDiff1, MVD1 may be set equal to the merge offset vector, and MVD0 may be derived based on the MVD1 thus set. For example, when the first and second reference pictures are long-term reference pictures, MVD0 may be derived by applying a first scaling factor to MVD1. The first scaling factor may be determined based on PocDiff0 and PocDiff1. On the other hand, when at least one of the first or second reference pictures is a short-term reference picture, MVD0 may be derived by applying a second scaling factor to MVD1. The second scaling factor may be a fixed value (e.g., -1/2, -1, etc.) predefined in the encoding/decoding device. However, the second scaling factor may be applied only when the sign of PocDiff0 and the sign of PocDiff1 are different from each other. If the sign of PocDiff0 and the sign of PocDiff1 are the same, MVD0 is set equal to MVD1, and separate scaling may not be performed.
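The derivation of MVD0 and MVD1 described above can be summarized in a sketch. The text only says that the first scaling factor is "determined based on PocDiff0 and PocDiff1"; the ratio of the two POC differences used below is an assumption, as are the function names and the default second scaling factor of -1. Exact fractions are used so that no rounding rule has to be invented.

```python
from fractions import Fraction


def _same_sign(a, b):
    return (a > 0) == (b > 0)


def _derive_other(base_mvd, poc_base, poc_other, both_long_term, second_factor):
    """Derive the MVD of the other direction from the one set to offsetMV."""
    if both_long_term:
        # Assumed form of the first scaling factor (ratio of POC differences).
        factor = Fraction(poc_other, poc_base)
    elif not _same_sign(poc_base, poc_other):
        # Fixed second scaling factor, applied only when the signs differ.
        factor = Fraction(second_factor)
    else:
        return base_mvd  # same sign: copy without scaling
    return tuple(component * factor for component in base_mvd)


def derive_mvds(offset_mv, poc_diff0, poc_diff1,
                both_long_term=False, second_factor=-1):
    """Return (MVD0, MVD1) for a bidirectionally predicted block."""
    if poc_diff0 == poc_diff1:
        return offset_mv, offset_mv
    if abs(poc_diff0) >= abs(poc_diff1):
        mvd0 = offset_mv
        mvd1 = _derive_other(mvd0, poc_diff0, poc_diff1, both_long_term, second_factor)
    else:
        mvd1 = offset_mv
        mvd0 = _derive_other(mvd1, poc_diff1, poc_diff0, both_long_term, second_factor)
    return mvd0, mvd1
```

With reference pictures on opposite sides of the current picture (for example PocDiff0 = 2 and PocDiff1 = -2) and the default second scaling factor, the L1 difference is the mirrored L0 difference.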

Referring to FIG. 15, a motion vector of a current block may be corrected using a motion vector difference value (MVD) (S1530), and motion compensation of a current block may be performed based on the corrected motion vector (S1540).

The present invention relates to a method and apparatus for parsing encoding information related to a merge mode from a coding block encoded in a skip mode and/or a merge mode among video coding techniques.

When a current encoding and/or decoding block is encoded and/or decoded in a skip or merge mode, a plurality of prediction methods may be used, and a method of efficiently signaling the plurality of prediction methods is required. In signaling and parsing the merge mode-related encoding information of the current encoding and/or decoding block, the order of the signaled and parsed syntax may be determined in descending order of occurrence frequency among the plurality of prediction methods.

The plurality of prediction methods may include at least one of a block unit merge mode, a general CU unit merge mode (regular merge mode or CU merge mode), MMVD (MVD-based merge mode), a subblock unit merge mode, a combined prediction mode, a non-square prediction mode, or a current picture reference mode.

In addition, a method of signaling and parsing each corresponding syntax, a condition for this, or a case in which the corresponding syntax is not expressed (or not signaled) will be described below through the syntax table and semantics for each syntax. However, redundant descriptions will be omitted.

Referring to FIG. 16, regular_merge_flag may indicate whether a general CU-based merge mode is used to generate an inter prediction parameter of a current block. When regular_merge_flag is not signaled, regular_merge_flag may be set to 0.

mmvd_flag[x0][y0] may indicate whether the MVD-based merge mode is used to generate the inter prediction parameter of the current block. Here, mmvd_flag[x0][y0] may be interpreted as having the same meaning as MVD_flag described above. When mmvd_flag is not signaled, mmvd_flag may be derived based on at least one of regular_merge_flag or whether the current block is a block coded in the current picture reference mode. For example, when the current block is not a block encoded in the current picture reference mode and regular_merge_flag is not 1, mmvd_flag is derived as 1, otherwise, mmvd_flag may be derived as 0.

The merge_subblock_flag may indicate whether an inter prediction parameter in a subblock unit for the current block is derived from a neighboring block. When merge_subblock_flag is not signaled, merge_subblock_flag may be derived based on at least one of sps_ciip_enabled_flag or sps_triangle_enabled_flag. Here, sps_ciip_enabled_flag indicates whether encoding information (e.g., ciip_flag) about the combined prediction mode exists, and sps_triangle_enabled_flag may indicate whether motion compensation based on non-rectangular partitions can be used.

For example, when at least one of sps_ciip_enabled_flag or sps_triangle_enabled_flag is 0, merge_subblock_flag is derived as 1, otherwise, merge_subblock_flag may be derived as 0.

ciip_flag may indicate whether the combined prediction mode is applied to the current block. When ciip_flag is not signaled, ciip_flag may be derived based on sps_triangle_enabled_flag. For example, when sps_triangle_enabled_flag is 0, ciip_flag may be derived as 1, otherwise, ciip_flag may be derived as 0.

merge_triangle_flag may indicate whether motion compensation based on non-rectangular partitions is used for the current block. When merge_triangle_flag is not signaled, merge_triangle_flag may be derived based on at least one of sps_triangle_enabled_flag and ciip_flag. For example, when sps_triangle_enabled_flag is 1 and ciip_flag is 0, merge_triangle_flag is derived as 1, otherwise, merge_triangle_flag may be derived as 0.
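The inference rules given above for flags that are not signaled can be collected in one helper. This is a sketch only: the function name and the dictionary of parsed flags are hypothetical, each rule is applied literally as written, and the conditions under which a flag is actually absent from the bitstream (the syntax table of FIG. 16) are not modeled.

```python
def infer_fig16_flags(parsed, is_cpr_block,
                      sps_ciip_enabled_flag, sps_triangle_enabled_flag):
    """Fill in merge-related flags that were not signaled.

    parsed maps flag names to values actually read from the bitstream;
    a flag missing from the dictionary is treated as not signaled.
    """
    flags = dict(parsed)
    # regular_merge_flag is set to 0 when not signaled.
    flags.setdefault("regular_merge_flag", 0)
    if "mmvd_flag" not in flags:
        flags["mmvd_flag"] = int(
            not is_cpr_block and flags["regular_merge_flag"] != 1)
    if "merge_subblock_flag" not in flags:
        flags["merge_subblock_flag"] = int(
            sps_ciip_enabled_flag == 0 or sps_triangle_enabled_flag == 0)
    if "ciip_flag" not in flags:
        flags["ciip_flag"] = int(sps_triangle_enabled_flag == 0)
    if "merge_triangle_flag" not in flags:
        flags["merge_triangle_flag"] = int(
            sps_triangle_enabled_flag == 1 and flags["ciip_flag"] == 0)
    return flags
```

The rules are evaluated in the order of the description, so the inferred merge_triangle_flag sees a ciip_flag that has already been parsed or inferred.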

cu_skip_flag may indicate whether the current block is a block encoded in the skip mode. For example, when cu_skip_flag=1, no syntax other than the following syntax is parsed for the current block. When cu_skip_flag is not signaled, cu_skip_flag may be derived as 0.

-Flag indicating the current picture reference mode (pred_mode_ibc_flag)

-Flag indicating MVD-based merge mode (mmvd_flag)

-Merge index (mmvd_merge_flag) in MVD-based merge mode

-Distance index (mmvd_distance_idx) in MVD-based merge mode

-Direction index in MVD-based merge mode (mmvd_direction_idx)

-Merge index (merge_idx)

-merge_subblock_flag

-Merge index (merge_subblock_idx) in merge mode in sub-block units

-Split direction indicator for non-rectangular partitions (merge_triangle_split_dir)

-Merge index of non-rectangular partition (merge_triangle_idx)

Referring to FIG. 17, regular_merge_flag may indicate whether a general CU-based merge mode is used to generate an inter prediction parameter of a current block. When the regular_merge_flag is not signaled, the regular_merge_flag may be derived in consideration of whether the current block is a block coded in the current picture reference mode. For example, when the current block is a block coded in the current picture reference mode, regular_merge_flag may be derived as 1, otherwise, regular_merge_flag may be derived as 0.

Referring to FIG. 18, regular_merge_flag may indicate whether a general CU-based merge mode is used to generate an inter prediction parameter of a current block. When regular_merge_flag is not signaled, regular_merge_flag may be set to 0.

mmvd_flag[x0][y0] may indicate whether the MVD-based merge mode is used to generate the inter prediction parameter of the current block. Here, mmvd_flag[x0][y0] may be interpreted as having the same meaning as MVD_flag described above.

When mmvd_flag is not signaled, mmvd_flag may be derived based on at least one of regular_merge_flag or whether the current block is a block coded in the current picture reference mode. For example, when the current block is not a block encoded in the current picture reference mode and regular_merge_flag is not 1, mmvd_flag is derived as 1, otherwise, mmvd_flag may be derived as 0.

Alternatively, when mmvd_flag is not signaled, mmvd_flag may be derived based on at least one of whether the current block is a block encoded in the current picture reference mode, regular_merge_flag, or the size of the current block. For example, if the current block is not a block encoded in the current picture reference mode, regular_merge_flag is not 1, and the sum of the width and height of the current block is less than or equal to 12, mmvd_flag is derived as 1; otherwise, mmvd_flag may be derived as 0.

ciip_flag may indicate whether the combined prediction mode is applied to the current block. When ciip_flag is not signaled, ciip_flag may be derived based on at least one of sps_triangle_enabled_flag or slice type. For example, if sps_triangle_enabled_flag is 0 or the slice to which the current block belongs is not a B slice, ciip_flag may be derived as 1, otherwise, ciip_flag may be derived as 0.

-Flag indicating the current picture reference mode (pred_mode_ibc_flag)

-Flag indicating MVD-based merge mode (mmvd_flag)

-Merge index (mmvd_merge_flag) in MVD-based merge mode

-Distance index (mmvd_distance_idx) in MVD-based merge mode

-Direction index in MVD-based merge mode (mmvd_direction_idx)

-Merge index (merge_idx)

-merge_subblock_flag

-Merge index (merge_subblock_idx) in merge mode in sub-block units

-Merge index of non-rectangular partition (merge_triangle_idx)

Referring to FIG. 19, regular_merge_flag may indicate whether a general CU-based merge mode is used to generate an inter prediction parameter of a current block. When the regular_merge_flag is not signaled, the regular_merge_flag may be derived in consideration of whether the current block is a block coded in the current picture reference mode. For example, when the current block is a block coded in the current picture reference mode, regular_merge_flag may be derived as 1, otherwise, regular_merge_flag may be derived as 0.

The encoding apparatus may generate a bitstream by encoding at least one of the above-described merge mode-related encoding information according to a predetermined priority. The decoding apparatus may decode the bitstream to obtain encoding information related to a merge mode, and perform inter prediction according to the obtained encoding information.

The various embodiments of the present disclosure do not list all possible combinations but are intended to describe representative aspects of the present disclosure, and the matters described in the various embodiments may be applied independently or in combination of two or more.

In addition, various embodiments of the present disclosure may be implemented by hardware, firmware, software, or a combination thereof. For implementation by hardware, they may be implemented by one or more ASICs (Application Specific Integrated Circuits), DSPs (Digital Signal Processors), DSPDs (Digital Signal Processing Devices), PLDs (Programmable Logic Devices), FPGAs (Field Programmable Gate Arrays), general-purpose processors, controllers, microcontrollers, microprocessors, or the like.

The scope of the present disclosure includes software or machine-executable instructions (e.g., operating systems, applications, firmware, programs, etc.) that cause operations according to the methods of the various embodiments to be executed on a device or computer, and a non-transitory computer-readable medium that stores such software or instructions and is executable on a device or computer.

The present invention can be used to encode/decode an image signal.

Claims

Reconstructing a current picture based on at least one of intra prediction and inter prediction;

Specifying a block boundary to which a deblocking filter is applied in the reconstructed current picture; and

Applying the deblocking filter to the specified block boundary based on a filter type pre-defined in a decoding apparatus.
The method of claim 1,

The deblocking filter is applied in a unit of a predetermined MxN sample grid, where M and N are integers of 4, 8 or more.
The method of claim 1,

The decoding device defines a plurality of filter types having different filter lengths,

The plurality of filter types include at least one of a long filter, a middle filter, and a short filter.
The method of claim 3,

The filter length of the long filter is 8, 10, 12 or 14,

The filter length of the middle filter is 6,

The filter length of the short filter is 2 or 4.
The method of claim 4,

The number of pixels to which the deblocking filter is applied in the P block and the number of pixels to which the deblocking filter is applied in the Q block are different from each other,

The P block and the Q block are blocks adjacent to each other on either side of the specified block boundary.
The method of claim 5,

The number of pixels to which the deblocking filter is applied in the P block is three, and the number of pixels to which the deblocking filter is applied in the Q block is seven.
The method of claim 1, wherein reconstructing the current picture comprises:

Constructing a merge candidate list of the current block;

Deriving motion information of the current block from the merge candidate list based on a merge index of the current block, wherein the motion information includes at least one of a motion vector, a reference picture index, and prediction direction information;

Correcting the motion vector of the current block by using a motion vector difference value for a merge mode of the current block; and

Performing motion compensation of the current block based on the corrected motion vector.
The method of claim 7,

The step of correcting the motion vector is performed only when the size of the current block is larger than a predetermined threshold size.
Reconstructing a current picture based on at least one of intra prediction and inter prediction;

Specifying a block boundary to which a deblocking filter is applied in the reconstructed current picture; and

Applying the deblocking filter to the specified block boundary based on a filter type pre-defined in an encoding apparatus.
The method of claim 9,

The deblocking filter is applied in a unit of a predetermined MxN sample grid, wherein M and N are integers of 4, 8 or more.
The method of claim 1,

The encoding device defines a plurality of filter types having different filter lengths,

The plurality of filter types include at least one of a long filter, a middle filter, and a short filter.
The method of claim 11,

The filter length of the long filter is 8, 10, 12 or 14,

The filter length of the middle filter is 6,

The filter length of the short filter is 2 or 4.
The method of claim 12,

The number of pixels to which the deblocking filter is applied in the P block and the number of pixels to which the deblocking filter is applied in the Q block are different from each other,

The P block and the Q block are blocks adjacent to each other on either side of the specified block boundary.
The method of claim 13,

The number of pixels to which the deblocking filter is applied in the P block is three, and the number of pixels to which the deblocking filter is applied in the Q block is seven.