
CN121040068A - Image encoding/decoding methods and apparatus, and recording media for storing bit streams - Google Patents

Image encoding/decoding methods and apparatus, and recording media for storing bit streams

Info

Publication number
CN121040068A
CN121040068A (application CN202480026782.8A)
Authority
CN
China
Prior art keywords
block
block vector
vector
information
candidate list
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202480026782.8A
Other languages
Chinese (zh)
Inventor
许镇
崔祯娥
朴胜煜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hyundai Motor Co
Kia Corp
Original Assignee
Hyundai Motor Co
Kia Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hyundai Motor Co and Kia Corp
Priority claimed from PCT/KR2024/007749 (WO2024258110A1)
Publication of CN121040068A


Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 — … using predictive coding
    • H04N19/593 — … using predictive coding involving spatial prediction techniques
    • H04N19/10 — … using adaptive coding
    • H04N19/102 — … characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 — Selection of coding mode or of prediction mode
    • H04N19/105 — Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/134 — … characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 — Incoming video signal characteristics or properties
    • H04N19/137 — Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 — Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H04N19/169 — … characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 — … the unit being an image region, e.g. an object
    • H04N19/176 — … the region being a block, e.g. a macroblock
    • H04N19/85 — … using pre-processing or post-processing specially adapted for video compression
    • H04N19/88 — … involving rearrangement of data among different coding units, e.g. shuffling, interleaving, scrambling or permutation of pixel data or permutation of transform coefficient data among different blocks

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Provided are an image encoding/decoding method and apparatus, a recording medium storing a bitstream, and a transmission method. The image decoding method may include the steps of determining a prediction mode of a current block as an intra block copy merge mode, determining a block vector merge candidate list of the current block, deriving a block vector of the current block on the basis of the block vector merge candidate list, correcting the block vector by using differential block vector information, and generating a prediction block of the current block on the basis of the corrected block vector.

Description

Image encoding/decoding method and apparatus, and recording medium storing bit stream
Technical Field
The present invention relates to a method and apparatus for encoding/decoding an image and a recording medium for storing a bitstream. In particular, the present invention relates to a method and apparatus for encoding/decoding an image based on improved intra block copy merge mode prediction and a recording medium for storing a bitstream.
Background
Recently, demand for high-resolution, high-quality images such as ultra high definition (UHD) images is increasing in various application fields. As image data reaches higher resolution and quality, the amount of data increases relative to existing image data. Therefore, when such image data is transmitted over an existing medium such as a wired or wireless broadband channel, or stored on an existing storage medium, both transmission and storage costs increase. To solve these problems that arise as image data becomes higher in resolution and quality, efficient image encoding/decoding techniques are required for images of higher resolution and image quality.
Intra block copy prediction has high prediction accuracy for picture content in which similar shapes repeat. In particular, when there is no reference picture and only intra prediction is applicable, intra block copy prediction may be applied, and coding efficiency may thereby be improved. Accordingly, various tools for intra block copy prediction are being discussed in order to improve coding efficiency for the case where only intra prediction is applicable.
Disclosure of Invention
Technical problem
It is an object of the present invention to provide a method and apparatus for encoding/decoding an image with improved encoding/decoding efficiency.
It is another object of the present invention to provide a recording medium for storing a bitstream generated by the method or apparatus for decoding an image according to the present invention.
In order to solve the above problems, another object of the present invention is to provide a method for correcting a block vector derived by an intra block copy merge mode.
Technical proposal
A method for decoding an image according to an embodiment of the present invention may include determining a prediction mode of a current block as an intra block copy merge mode, determining a block vector merge candidate list of the current block, deriving a block vector of the current block based on the block vector merge candidate list, correcting the block vector by using differential block vector information, and generating a prediction block of the current block based on the corrected block vector.
In the method for decoding an image, a block vector merge candidate list may be determined by using block vector information of neighboring blocks of a current block.
In the method for decoding an image, block vector merge candidates of the determined block vector merge candidate list may be reordered.
In the method for decoding an image, the block vector merging candidates may be reordered based on a similarity between a template of a reference block represented by the block vector merging candidates and a template of a current block.
In the method for decoding an image, a block vector merge candidate whose reference-block template has high similarity to the template of the current block may be reordered to have high priority in the block vector merge candidate list.
In the method for decoding an image, the similarity may be determined by one of a Sum of Absolute Differences (SAD) method and a Sum of Squared Errors (SSE) method.
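The SAD/SSE template comparison above can be sketched as follows. This is a minimal illustration using hypothetical flattened template arrays; the function and variable names are not from the patent:

```python
import numpy as np

def template_cost(cur_template, ref_template, metric="SAD"):
    """Compare the current block's template with the template of the
    reference block indicated by a merge candidate (both hypothetical
    L-shaped regions flattened into arrays). Lower cost = higher
    similarity, so candidates would be reordered by ascending cost."""
    d = cur_template.astype(np.int64) - ref_template.astype(np.int64)
    if metric == "SAD":      # Sum of Absolute Differences
        return int(np.abs(d).sum())
    if metric == "SSE":      # Sum of Squared Errors
        return int((d * d).sum())
    raise ValueError("unknown metric")
```

A reordering step would then sort the candidate list by this cost, placing the most similar candidate first.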
In the method for decoding an image, the block vector may be derived from one of a predetermined number of high-priority block vector merge candidates in the block vector merge candidate list.
In the method for decoding an image, the differential block vector information may include direction information and distance information.
In the method for decoding an image, the direction information may include vertical direction information and horizontal direction information.
In the method for decoding an image, the distance information may include vertical distance information and horizontal distance information.
In the method for decoding an image, whether to correct the derived block vector may be determined; when it is determined that the derived block vector is to be corrected, the method may further include obtaining differential block vector information, and the derived block vector may be corrected based on the differential block vector information.
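The decoding steps described above (derive a block vector from the merge candidate list, optionally correct it with direction and distance information, then generate the prediction block from the reconstructed area) can be sketched roughly as follows. This is an illustrative toy, not the patent's normative process; the direction table, function names, and integer-pixel copying are assumptions:

```python
import numpy as np

# Hypothetical 4-direction table for the differential information (dx, dy).
DIRECTIONS = {0: (1, 0), 1: (-1, 0), 2: (0, 1), 3: (0, -1)}

def decode_ibc_merge_block(recon, x, y, w, h, cand_list, merge_idx,
                           refine, dir_idx=0, dist=0):
    """Pick a block vector from the merge candidate list, optionally
    correct it with differential (direction + distance) information,
    then copy the prediction block from the already-reconstructed
    region of the current picture."""
    bvx, bvy = cand_list[merge_idx]            # derive block vector
    if refine:                                 # correct it if signaled
        dx, dy = DIRECTIONS[dir_idx]
        bvx, bvy = bvx + dx * dist, bvy + dy * dist
    rx, ry = x + bvx, y + bvy                  # reference position
    return recon[ry:ry + h, rx:rx + w].copy()  # prediction block

recon = np.arange(64, dtype=np.int16).reshape(8, 8)
pred = decode_ibc_merge_block(recon, 4, 4, 2, 2, [(-4, -4)], 0,
                              refine=True, dir_idx=0, dist=2)
```

A real decoder would additionally validate that the corrected vector points inside the valid IBC reference area.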
A method for encoding an image according to an embodiment of the present invention may include determining a prediction mode of a current block as an intra block copy merge mode, determining a block vector merge candidate list of the current block, deriving a block vector of the current block based on the block vector merge candidate list, correcting the block vector by using differential block vector information, and generating a prediction block of the current block based on the corrected block vector.
The non-transitory computer readable recording medium according to an embodiment of the present invention may store a bitstream generated by a method for encoding an image.
The bit stream transmission method according to an embodiment of the present invention may transmit a bit stream generated by a method for encoding an image.
The features briefly summarized above with respect to the present invention are provided as examples only to explain the detailed description and should not be construed as limiting the scope of the invention.
Advantageous effects
According to the present invention, a method and apparatus for encoding/decoding an image with improved encoding/decoding efficiency can be provided.
In addition, according to the present invention, a method for correcting a block vector derived by an intra block copy merge mode may be provided.
In addition, according to the present invention, prediction accuracy can be improved by generating a prediction block based on a corrected block vector.
The effects obtainable from the present invention are not limited to the above-described effects, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.
Drawings
Fig. 1 is a block diagram showing the configuration of an encoding apparatus according to an embodiment of the present invention.
Fig. 2 is a block diagram showing a configuration of a decoding apparatus according to an embodiment of the present invention.
Fig. 3 is a schematic diagram schematically showing a video codec system to which the present invention is applicable.
Fig. 4 is a schematic diagram for describing an intra block copy method according to an embodiment of the present invention.
Fig. 5 is a schematic diagram for describing a method for reordering block vector merge candidates of a block vector merge candidate list according to an embodiment of the present invention.
Fig. 6 is a schematic diagram for describing information about 4 directions included in differential block vector information according to an embodiment of the present invention.
Fig. 7 is a schematic diagram for describing information about 8 directions included in differential block vector information according to an embodiment of the present invention.
Fig. 8 is a schematic diagram for describing information on 16 directions included in differential block vector information according to an embodiment of the present invention.
Fig. 9 is a flowchart illustrating a method for correcting a block vector according to an embodiment of the present invention.
Fig. 10 is a schematic diagram for illustrating a content streaming system to which an embodiment according to the present invention is applicable.
Detailed Description
Best mode for carrying out the invention
A method for decoding an image according to an embodiment of the present invention may include determining a prediction mode of a current block as an intra block copy merge mode, determining a block vector merge candidate list of the current block, deriving a block vector of the current block based on the block vector merge candidate list, correcting the block vector by using differential block vector information, and generating a prediction block of the current block based on the corrected block vector.
Embodiments for practicing the invention
While the invention is susceptible to various modifications and embodiments, specific embodiments are shown in the drawings and are described in detail herein. However, this is not intended to limit the invention to the particular embodiments; the invention is to be construed to include all modifications, equivalents, and alternatives falling within its spirit and scope. The same reference numbers in the figures indicate the same or similar functionality in various respects. For clarity of description, the shapes and sizes of elements in the drawings may be exaggerated. The following detailed description of exemplary embodiments refers to the accompanying drawings, which illustrate specific embodiments by way of example. These embodiments are described in sufficient detail to enable those skilled in the art to practice them. It is to be understood that the various embodiments differ from one another but need not be mutually exclusive. For example, specific shapes, structures, and features described herein in connection with one embodiment may be implemented in other embodiments without departing from the spirit and scope of the invention. It is also to be understood that the location or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of the embodiments. Accordingly, the detailed description set forth below is not intended to be limiting, and the scope of the exemplary embodiments is defined only by the appended claims, appropriately interpreted, along with the full range of equivalents to which such claims are entitled.
In the present invention, the terms first, second, etc. may be used to describe various components, but the components should not be limited by these terms. The terms are used solely for the purpose of distinguishing one component from another. For example, a first component could be termed a second component, and, similarly, a second component could be termed a first component, without departing from the scope of the present invention. The term "and/or" includes any combination of a plurality of related listed items or any one of the plurality of related listed items.
The components shown in the embodiments of the present invention are depicted independently to indicate different characteristic functions; this does not mean that each component is formed of separate hardware or a single software unit. That is, each component is listed as a separate component for convenience of explanation, and at least two of the components may be combined into a single component, or one component may be divided into a plurality of components each performing part of a function. Embodiments in which components are integrated and embodiments in which a component is divided are both included in the scope of the present invention.
The terminology used in the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. Singular forms include plural forms unless the context clearly indicates otherwise. Furthermore, some components of the present invention are not essential components performing essential functions, but optional components used merely for improving performance. The present invention may be implemented with only the components essential for realizing its gist, excluding components used merely for performance improvement, and a structure including only the essential components without the optional performance-improving components is also included in the scope of the present invention.
In embodiments, the term "at least one" may denote a number greater than or equal to 1, such as 1, 2, 3, and 4. In embodiments, the term "plurality" may denote a number greater than or equal to 2, such as 2, 3, and 4.
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. In describing the embodiments of the present specification, if it is determined that detailed description of related known configurations or functions will obscure the subject matter of the present specification, detailed description will be omitted, like reference numerals will be used for like components in the drawings, and repeated description of the same components will be omitted.
Description of the terms
Hereinafter, "image" may mean one picture constituting a video, and may also refer to the video itself. For example, "encoding and/or decoding of an image" may mean "encoding and/or decoding of video", and may also mean "encoding and/or decoding of one of the images constituting video".
Hereinafter, "moving image" and "video" may be used in the same sense and may be used interchangeably. In addition, the target image may be an encoding target image that is a target of encoding and/or a decoding target image that is a target of decoding. In addition, the target image may be an input image input to the encoding apparatus, and may be an input image input to the decoding apparatus. Here, the target image may have the same meaning as the current image.
Hereinafter, the encoder and the image encoding apparatus may be used in the same sense and may be used interchangeably.
Hereinafter, the decoder and the image decoding apparatus may be used in the same sense and may be used interchangeably.
Hereinafter, "image", "picture" and "frame" may be used in the same sense and may be used interchangeably.
Hereinafter, the "target block" may be an encoding target block that is a target of encoding and/or a decoding target block that is a target of decoding. In addition, the target block may be a current block that is a target of current encoding and/or decoding. For example, "target block" and "current block" may be used in the same sense and may be used interchangeably.
Hereinafter, "block" and "unit" may be used in the same sense and may be used interchangeably. In addition, "unit" may indicate that it includes a luminance component block and the chrominance component blocks corresponding thereto, in order to distinguish it from "block". For example, a coding tree unit (CTU) may be composed of one luminance component (Y) coding tree block (CTB) and two associated chrominance component (Cb, Cr) coding tree blocks.
Hereinafter, "sample", "picture element", and "pixel" may be used in the same sense and may be used interchangeably. In this context, a sample may represent the basic unit constituting a block.
Hereinafter, "inter-picture" and "inter" may be used in the same sense and may be used interchangeably.
Hereinafter, "intra-picture" and "intra" may be used in the same sense and may be used interchangeably.
Fig. 1 is a block diagram showing the configuration of an encoding apparatus according to an embodiment of the present invention.
The encoding apparatus 100 may be an encoder, a video encoding apparatus, or an image encoding apparatus. The video may include one or more images. The encoding apparatus 100 may sequentially encode one or more images.
Referring to fig. 1, the encoding apparatus 100 may include an image partition unit 110, an intra prediction unit 120, a motion prediction unit 121, a motion compensation unit 122, a switcher 115, a subtractor 113, a transform unit 130, a quantization unit 140, an entropy encoding unit 150, an inverse quantization unit 160, an inverse transform unit 170, an adder 117, a filtering unit 180, and a reference picture buffer 190.
In addition, the encoding apparatus 100 may generate a bitstream including information encoded by encoding of an input image, and output the generated bitstream. The generated bit stream may be stored in a computer readable recording medium or may be streamed via a wired/wireless transmission medium.
The image partition unit 110 may partition an input image into various forms to improve the efficiency of video encoding/decoding. That is, the input video is composed of a plurality of pictures, and one picture may be hierarchically partitioned and processed for compression efficiency, parallel processing, and the like. For example, a picture may be partitioned into one or more tiles or slices, and then partitioned again into multiple coding tree units (CTUs). Alternatively, one picture may first be partitioned into multiple sub-pictures, each defined as a rectangular group of slices, and each sub-picture may be partitioned into tiles/slices. Here, sub-pictures may be used to support partially independent encoding/decoding and transmission of a picture. Since multiple sub-pictures can be reconstructed separately, this has the advantage of easy editing in applications where a multi-channel input is composed into one picture. In addition, a tile may be divided horizontally to generate bricks. Here, a brick may be used as a basic unit for parallel processing within a tile. In addition, one CTU may be recursively partitioned by a quadtree (QT), and an end node of the partitioning may be defined as a coding unit (CU). A CU may be partitioned into prediction units (PUs) and transform units (TUs) to perform prediction and transformation. On the other hand, a CU itself may be used as a prediction unit and/or a transform unit. Here, for flexible partitioning, each CTU may be recursively partitioned by a multi-type tree (MTT) as well as a quadtree (QT). Partitioning of a CTU by the multi-type tree may begin at an end node of the QT, and the MTT may consist of a binary tree (BT) and a ternary tree (TT).
For example, the MTT structure may be classified into a vertical binary split mode (SPLIT_BT_VER), a horizontal binary split mode (SPLIT_BT_HOR), a vertical ternary split mode (SPLIT_TT_VER), and a horizontal ternary split mode (SPLIT_TT_HOR). In addition, during partitioning, the minimum block size (MinQTSize) of the quadtree for luminance blocks may be set to 16×16, the maximum block size (MaxBtSize) of the binary tree may be set to 128×128, and the maximum block size (MaxTtSize) of the ternary tree may be set to 64×64. In addition, the minimum block size (MinBtSize) of the binary tree and the minimum block size (MinTtSize) of the ternary tree may be specified as 4×4, and the maximum depth (MaxMttDepth) of the multi-type tree may be specified as 4. In addition, in order to improve the coding efficiency of I slices, a dual tree that uses different CTU partition structures for the luminance component and the chrominance component may be applied. On the other hand, in P slices and B slices, the luminance and chrominance coding tree blocks (CTBs) within a CTU may be partitioned as a single tree sharing one coding tree structure.
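The split-size constraints quoted above can be illustrated with a hypothetical helper that lists which splits remain allowed for a given block. Real encoders track much more state (slice type, chroma format, pipeline limits) than this sketch:

```python
def allowed_splits(w, h, mtt_depth,
                   min_qt=16, max_bt=128, max_tt=64,
                   min_bt=4, min_tt=4, max_mtt_depth=4):
    """Return the split modes still permitted for a w x h luma block,
    using the example thresholds from the text (MinQTSize=16,
    MaxBtSize=128, MaxTtSize=64, MinBtSize=MinTtSize=4, MaxMttDepth=4)."""
    splits = []
    # QT halves both dimensions; children must not fall below MinQTSize.
    if w == h and w // 2 >= min_qt:
        splits.append("QT")
    if mtt_depth < max_mtt_depth:
        # BT halves one dimension; TT produces 1/4, 1/2, 1/4 parts.
        if max(w, h) <= max_bt and w // 2 >= min_bt:
            splits.append("SPLIT_BT_VER")
        if max(w, h) <= max_bt and h // 2 >= min_bt:
            splits.append("SPLIT_BT_HOR")
        if max(w, h) <= max_tt and w // 4 >= min_tt:
            splits.append("SPLIT_TT_VER")
        if max(w, h) <= max_tt and h // 4 >= min_tt:
            splits.append("SPLIT_TT_HOR")
    return splits
```

For a 128×128 CTU at depth 0 this yields QT and both binary splits but no ternary split (128 exceeds MaxTtSize), matching the thresholds above.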
The encoding apparatus 100 may perform encoding on an input image in an intra mode and/or an inter mode. Alternatively, the encoding apparatus 100 may perform encoding on the input image in a third mode (e.g., IBC mode, palette mode, etc.) other than the intra mode and the inter mode. However, if the third mode has similar functional characteristics to the intra mode or the inter mode, it may be classified as the intra mode or the inter mode for convenience of explanation. In the present invention, the third mode is classified and described separately only in the case where a specific explanation of the third mode is required.
The switcher 115 may switch to intra prediction when an intra mode is used as the prediction mode, and may switch to inter prediction when an inter mode is used. Here, the intra mode may represent an intra prediction mode, and the inter mode may represent an inter prediction mode. The encoding apparatus 100 may generate a prediction block for an input block of an input image. In addition, after generating the prediction block, the encoding apparatus 100 may encode a residual block using the difference between the input block and the prediction block. The input image may be referred to as the current image, i.e., the current encoding target. The input block may be referred to as the current block or the encoding target block.
When the prediction mode is an intra mode, the intra prediction unit 120 may use samples of already encoded/decoded blocks surrounding the current block as reference samples. The intra prediction unit 120 may perform spatial prediction of the current block by using the reference samples, and generate prediction samples of the input block through the spatial prediction. Herein, intra prediction may represent intra-picture prediction.
As intra prediction methods, non-directional prediction modes such as the DC mode and the planar mode, and directional prediction modes (e.g., 65 directions) may be applied. Here, an intra prediction method may be expressed as an intra prediction mode or an intra mode.
When the prediction mode is the inter mode, the motion prediction unit 121 may search a reference image for the region that best matches the input block during the motion prediction process, and derive a motion vector using the found region. In this case, a search region may be used as the region to be searched. The reference image may be stored in the reference picture buffer 190; that is, once encoding/decoding of the reference image has been performed, it may be stored in the reference picture buffer 190.
The motion compensation unit 122 may generate a prediction block of the current block by performing motion compensation using the motion vector. Herein, inter prediction may represent inter-picture prediction or motion compensation.
When the value of the motion vector is not an integer, the motion prediction unit 121 and the motion compensation unit 122 may generate a prediction block by applying an interpolation filter to a partial region of the reference picture. In order to perform inter prediction or motion compensation, whether the motion prediction and motion compensation mode of a prediction unit included in a coding unit is the skip mode, the merge mode, the advanced motion vector prediction (AMVP) mode, or the intra block copy (IBC) mode may be determined based on the coding unit, and inter prediction or motion compensation may be performed according to each mode.
In addition, based on the above inter prediction method, an affine mode based on sub-PU prediction, a subblock-based temporal motion vector prediction (SbTMVP) mode, a PU-based merge with MVD (MMVD) mode, and a geometric partitioning mode (GPM) may be applied. In addition, in order to improve the performance of each mode, history-based MVP (HMVP), pairwise average MVP (PAMVP), combined intra/inter prediction (CIIP), adaptive motion vector resolution (AMVR), bi-directional optical flow (BDOF), bi-prediction with CU weights (BCW), local illumination compensation (LIC), template matching (TM), overlapped block motion compensation (OBMC), and the like may be applied.
Among these, the affine mode is a technique used in both the AMVP and merge modes, and has high coding efficiency. In existing video codec standards, since motion compensation (MC) is performed considering only translational movement of blocks, motion occurring in reality (e.g., zoom in/out and rotation) cannot be properly compensated. To supplement this, a four-parameter affine motion model using two control point motion vectors (CPMVs) and a six-parameter affine motion model using three control point motion vectors may be used and applied to inter prediction. Here, a CPMV is a vector representing the affine motion model at one of the top-left, top-right, and bottom-left corners of the current block.
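The four-parameter model with two CPMVs can be illustrated as follows. This sketch uses the commonly cited per-position formula in floating point, whereas real codecs derive sub-block motion vectors in fixed-point arithmetic:

```python
def affine_mv_4param(cpmv0, cpmv1, w, x, y):
    """Motion vector at position (x, y) inside a block of width w,
    given the top-left (cpmv0) and top-right (cpmv1) control-point
    motion vectors of the four-parameter affine model.
    The shared terms a and b encode scaling and rotation."""
    a = (cpmv1[0] - cpmv0[0]) / w
    b = (cpmv1[1] - cpmv0[1]) / w
    mvx = a * x - b * y + cpmv0[0]
    mvy = b * x + a * y + cpmv0[1]
    return mvx, mvy
```

At (0, 0) the formula returns cpmv0 itself, and along the top edge it interpolates linearly toward cpmv1, which is the expected behavior of the model.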
The subtractor 113 may generate a residual block by using a difference between the input block and the prediction block. The residual block may be referred to as a residual signal. The residual signal may represent a difference between the original signal and the predicted signal. Alternatively, the residual signal may be a signal generated by transforming or quantizing, or transforming and quantizing a difference between the original signal and the predicted signal. The residual block may be a residual signal of a block unit.
The transform unit 130 may generate transform coefficients by performing a transform on the residual block, and output the generated transform coefficients. Herein, the transform coefficient may be a coefficient value generated by performing a transform on the residual block. When the transform skip mode is applied, the transform unit 130 may skip the transform of the residual block.
The quantization level may be generated by applying quantization to the transform coefficients or the residual signal. Hereinafter, the quantization level may also be referred to as a transform coefficient in an embodiment.
For example, a 4×4 luma residual block generated by intra prediction may be transformed with a discrete sine transform (DST)-based basis vector, and the remaining residual blocks may be transformed with a discrete cosine transform (DCT)-based basis vector. In addition, using the residual quad tree (RQT) technique, a transform block may be partitioned from one block into a quadtree shape, and after transform and quantization are performed on each transform block partitioned by the RQT, a coded block flag (cbf) may be transmitted when all coefficients become 0 to improve coding efficiency.
As another alternative, a multiple transform selection (MTS) technique that selectively uses multiple transform bases to perform the transform may be applied. In addition, instead of partitioning a CU into TUs by RQT, a function similar to TU partitioning may be performed by the sub-block transform (SBT) technique. Specifically, SBT is applied only to inter prediction blocks; unlike RQT, the current block may be partitioned to a 1/2 or 1/4 size in the vertical or horizontal direction, and the transform may then be performed on only one of the resulting blocks. For example, if the current block is partitioned vertically, the transform may be performed on the leftmost or rightmost block, and if the current block is partitioned horizontally, the transform may be performed on the uppermost or lowermost block.
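The SBT partitioning rule described above can be sketched as follows. The function name, parameter names, and the (x, y, width, height) return convention are illustrative assumptions, not the normative signaling.

```python
def sbt_region(w, h, split='vertical', frac=2, pos='first'):
    """Return (x, y, sub_w, sub_h) of the one sub-block that is transformed.

    split: 'vertical' partitions along the width, 'horizontal' along the height.
    frac:  2 for a 1/2 split, 4 for a 1/4 split.
    pos:   'first' keeps the left/top sub-block, 'last' the right/bottom one.
    """
    if split == 'vertical':
        sw = w // frac
        x = 0 if pos == 'first' else w - sw
        return (x, 0, sw, h)
    sh = h // frac
    y = 0 if pos == 'first' else h - sh
    return (0, y, w, sh)
```

The residual outside the returned region is simply not transformed, which is the key difference from RQT, where every partitioned block is transformed.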
Furthermore, a low-frequency non-separable transform (LFNST) may be applied, which is a secondary transform technique that additionally transforms, in the frequency domain, the residual signal already transformed by DCT or DST. LFNST additionally performs a transform on the 4×4 or 8×8 low-frequency region at the top-left side so that the residual coefficients can be concentrated at the top-left side.
The quantization unit 140 may generate a quantization level by quantizing a transform coefficient or a residual signal according to a Quantization Parameter (QP), and output the generated quantization level. Herein, the quantization unit 140 may quantize the transform coefficient by using a quantization matrix.
For example, a quantizer using QP values of 0 to 51 may be used. Alternatively, if the image size is large and high coding efficiency is required, QPs of 0 to 63 may be used. Furthermore, a dependent quantization (DQ) method using two quantizers instead of one may be applied. In DQ, quantization is performed with two quantizers (e.g., Q0 and Q1), and the quantizer for the next transform coefficient can be selected from the current state through a state-transition model, even without signaling information indicating which quantizer is used.
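The state-driven quantizer selection can be sketched as below, using a four-state transition table of the kind commonly described for dependent quantization. The concrete table values and the convention that states 0–1 select Q0 and states 2–3 select Q1 are assumptions for illustration.

```python
# Assumed state-transition table: NEXT_STATE[state][level & 1]
NEXT_STATE = [[0, 2], [2, 0], [1, 3], [3, 1]]

def dq_quantizer_sequence(levels):
    """For each coefficient level, report which quantizer (Q0/Q1) applies,
    then advance the state by the parity of the level — no per-coefficient
    signaling is needed because encoder and decoder track the same state."""
    state, out = 0, []
    for k in levels:
        out.append('Q0' if state < 2 else 'Q1')
        state = NEXT_STATE[state][k & 1]
    return out
```

Because the next state depends only on the parity of the already-coded level, the decoder can reproduce the exact quantizer sequence from the decoded levels alone.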
The entropy encoding unit 150 may generate a bitstream by performing entropy encoding on the values calculated by the quantization unit 140 or on the codec parameter values calculated when encoding is performed according to the probability distribution, and output the bitstream. The entropy encoding unit 150 may perform entropy encoding on information about samples of an image and information for decoding the image. For example, the information for decoding the image may include a syntax element.
When entropy encoding is applied, symbols are represented such that fewer bits are allocated to symbols with a high occurrence probability and more bits are allocated to symbols with a low occurrence probability, and thus the size of the bitstream for the symbols to be encoded can be reduced. The entropy encoding unit 150 may perform entropy encoding using an encoding method such as exponential Golomb coding, context-adaptive variable length coding (CAVLC), or context-adaptive binary arithmetic coding (CABAC). For example, the entropy encoding unit 150 may perform entropy encoding by using a variable length coding (VLC) table. In addition, the entropy encoding unit 150 may derive a binarization method for a target symbol and a probability model for the target symbol/bin, and perform arithmetic coding by using the derived binarization method and context model.
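As a concrete instance of "fewer bits for more probable symbols", the following sketch produces the order-0 exponential-Golomb code word for an unsigned value; small (typically more probable) values receive shorter code words.

```python
def exp_golomb_encode(n):
    """Unsigned order-0 exponential-Golomb code word for n >= 0,
    returned as a bit string: (leading zeros) + binary of (n + 1)."""
    code_num = n + 1
    bits = code_num.bit_length()
    return '0' * (bits - 1) + format(code_num, 'b')
```

For example, 0 encodes to a single bit, while larger values grow logarithmically in length, matching the probability-ordered bit allocation described above.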
In connection with this, when CABAC is applied, in order to reduce the size of the probability tables stored in the decoding apparatus, the table-based probability update method may be replaced with an update method using a simple equation. In addition, two different probability models may be used to obtain more accurate symbol probability values.
In order to encode the transform coefficient level (quantization level), the entropy encoding unit 150 may change coefficients in the form of a two-dimensional block into a one-dimensional vector form by a transform coefficient scanning method.
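A minimal sketch of turning a two-dimensional coefficient block into a one-dimensional vector; the anti-diagonal order used here is a simplified stand-in for the codec's actual scan patterns, and the function name is an assumption.

```python
def diagonal_scan(block):
    """Flatten an NxN coefficient block along anti-diagonals,
    reading each diagonal from bottom-left to top-right."""
    n = len(block)
    out = []
    for d in range(2 * n - 1):          # one pass per anti-diagonal
        for y in range(d, -1, -1):
            x = d - y
            if y < n and x < n:
                out.append(block[y][x])
    return out
```

Scanning the low-frequency (top-left) coefficients first groups the likely non-zero values at the start of the vector, which benefits the subsequent entropy coding.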
The codec parameters may include information (flags, indexes, etc.) encoded in the encoding apparatus 100 and signaled to the decoding apparatus 200, such as syntax elements, and information derived in the encoding or decoding process, and may represent information required when encoding or decoding an image.
Herein, signaling a flag or index may mean that the corresponding flag or index is entropy encoded and included in a bitstream in an encoder, and may mean that the corresponding flag or index is entropy decoded from the bitstream in a decoder.
The encoded current image may be used as a reference image for another image to be processed later. Accordingly, the encoding apparatus 100 may reconstruct or decode the encoded current image again, and store the reconstructed or decoded image as a reference image in the reference picture buffer 190.
The quantization level may be inverse quantized in the inverse quantization unit 160 or may be inverse transformed in the inverse transformation unit 170. The inverse quantized and/or inverse transformed coefficients may be added to the prediction block by the adder 117. Herein, the inverse quantized and/or inverse transformed coefficients may represent coefficients on which at least one of inverse quantization and inverse transformation has been performed, and may represent a reconstructed residual block. The inverse quantization unit 160 and the inverse transformation unit 170 may perform the inverse processes of the quantization unit 140 and the transform unit 130, respectively.
The reconstructed block may pass through the filtering unit 180. The filtering unit 180 may apply all or some filtering techniques, such as a deblocking filter, a sample adaptive offset (SAO) filter, an adaptive loop filter (ALF), a bilateral filter (BIF), and a luma mapping with chroma scaling (LMCS) filter, to the reconstructed samples, the reconstructed block, or the reconstructed image. The filtering unit 180 may be referred to as an in-loop filter. In some contexts, the name in-loop filter may also be used to refer to the filters excluding LMCS.
The deblocking filter may remove block distortion generated in boundaries between blocks. To determine whether to apply the deblocking filter, whether to apply the deblocking filter to the current block may be determined based on samples included in several rows or columns included in the block. When a deblocking filter is applied to a block, different filters may be applied according to the required deblocking filter strength.
To compensate for coding errors with sample adaptive offset, an appropriate offset value may be added to the sample value. The sample adaptive offset may correct the offset from the original image by sample unit for the deblocking image. A method of partitioning samples included in an image into a predetermined number of areas, determining an area to which an offset is applied, and applying the offset to the determined area, or a method of applying the offset in consideration of edge information about each sample may be used.
The bilateral filter (BIF) can also correct an offset from the original image on a sample-by-sample basis for an image that has been deblocked.
The adaptive loop filter may perform filtering based on a comparison of the reconstructed image with the original image. Samples included in the image may be partitioned into predetermined groups, a filter to be applied to each group may be determined, and filtering may be performed differently for each group. Information on whether to apply ALF may be signaled per coding unit (CU), and the shape and coefficients of the adaptive loop filter to be applied may vary for each block.
In luma mapping with chroma scaling (LMCS), luma mapping (LM) refers to remapping luma values through a piecewise linear model, and chroma scaling (CS) refers to a technique of scaling the residual values of the chroma components according to the average luma value of the prediction signal. In particular, LMCS can be used as an HDR correction technique reflecting the characteristics of high dynamic range (HDR) images.
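The piecewise linear luma remapping can be sketched as follows; the pivot-pair representation, function name, and sample pivot values are assumptions for illustration.

```python
def luma_map(x, pivots_in, pivots_out):
    """Forward piecewise-linear luma mapping: map sample value x through
    linear segments defined by matching input/output pivot lists."""
    for i in range(len(pivots_in) - 1):
        x0, x1 = pivots_in[i], pivots_in[i + 1]
        if x0 <= x <= x1:
            y0, y1 = pivots_out[i], pivots_out[i + 1]
            return y0 + (x - x0) * (y1 - y0) / (x1 - x0)
    raise ValueError("x outside mapped range")
```

Segments with a slope above 1 stretch their luma range (allocating more code values), while segments with a slope below 1 compress it, which is how the mapping can redistribute code values for HDR content.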
The reconstructed block or the reconstructed image that has passed through the filtering unit 180 may be stored in the reference picture buffer 190. The reconstructed block that has passed through the filtering unit 180 may be a part of a reference image. That is, the reference image is a reconstructed image composed of reconstructed blocks that have passed through the filtering unit 180. The stored reference pictures may be later used for inter prediction or motion compensation.
Fig. 2 is a block diagram showing a configuration of a decoding apparatus according to an embodiment of the present invention.
The decoding apparatus 200 may be a decoder, a video decoding apparatus, or an image decoding apparatus.
Referring to fig. 2, the decoding apparatus 200 may include an entropy decoding unit 210, an inverse quantization unit 220, an inverse transformation unit 230, an intra prediction unit 240, a motion compensation unit 250, an adder 201, a switch 203, a filtering unit 260, and a reference picture buffer 270.
The decoding apparatus 200 may receive the bitstream output from the encoding apparatus 100. The decoding apparatus 200 may receive a bit stream stored in a computer readable recording medium, or may receive a bit stream streamed over a wired/wireless transmission medium. The decoding apparatus 200 may decode the bitstream in an intra mode or an inter mode. In addition, the decoding apparatus 200 may generate a reconstructed image or a decoded image generated by decoding and output the reconstructed image or the decoded image.
When the prediction mode for decoding is an intra mode, the switch 203 may switch to intra. Alternatively, when the prediction mode for decoding is an inter mode, the switch 203 may switch to an inter mode.
The decoding apparatus 200 may obtain a reconstructed residual block by decoding an input bitstream and generate a prediction block. When the reconstructed residual block and the prediction block are obtained, the decoding apparatus 200 may generate a reconstructed block that is a decoding target by adding the reconstructed residual block and the prediction block. The decoding target block may be referred to as a current block.
The entropy decoding unit 210 may generate symbols by entropy decoding the bitstream according to a probability distribution. The generated symbols may include symbols in the form of quantization levels. Herein, the entropy decoding method may be an inverse of the entropy encoding method described above.
The entropy decoding unit 210 may change the coefficients of the one-dimensional vector shape into the coefficients of the two-dimensional block shape by a transform coefficient scanning method to decode the transform coefficient level (quantization level).
The quantization level may be inverse quantized in the inverse quantization unit 220 or inverse transformed in the inverse transformation unit 230. The result of the inverse quantization and/or inverse transformation may be generated as a reconstructed residual block. Herein, the inverse quantization unit 220 may apply a quantization matrix to the quantization level. The inverse quantization unit 220 and the inverse transformation unit 230 applied to the decoding apparatus may apply the same techniques as the inverse quantization unit 160 and the inverse transformation unit 170 of the foregoing encoding apparatus.
When the intra mode is used, the intra prediction unit 240 may generate a prediction block by performing spatial prediction on the current block using sample values of blocks that have been decoded around the decoding target block. The intra prediction unit 240 applied to the decoding apparatus may apply the same technique as the intra prediction unit 120 applied to the foregoing encoding apparatus.
When the inter mode is used, the motion compensation unit 250 may generate a prediction block by performing motion compensation on the current block using the motion vector and the reference image stored in the reference picture buffer 270. When the value of the motion vector is not an integer value, the motion compensation unit 250 may generate a prediction block by applying an interpolation filter to a partial region within the reference image. In order to perform motion compensation, it may be determined whether a motion compensation mode of a prediction unit included in a corresponding codec unit is a skip mode, a merge mode, an AMVP mode, or a current picture reference mode based on the codec unit, and motion compensation may be performed according to each mode. The motion compensation unit 250 applied to the decoding apparatus may apply the same technique as the motion compensation unit 122 applied to the encoding apparatus described above.
The adder 201 may generate a reconstructed block by adding the reconstructed residual block and the prediction block. The filtering unit 260 may apply at least one of an inverse LMCS, a deblocking filter, a sample adaptive offset, and an adaptive loop filter to the reconstructed block or the reconstructed image. The filtering unit 260 applied to the decoding apparatus may apply the same filtering technique as the filtering unit 180 applied to the aforementioned encoding apparatus.
The filtering unit 260 may output the reconstructed image. The reconstructed block or reconstructed image may be stored in a reference picture buffer 270 and used for inter prediction. The reconstructed block that has passed through the filtering unit 260 may be a part of a reference image. That is, the reference image may be a reconstructed image composed of reconstructed blocks that have passed through the filtering unit 260. The stored reference pictures may be later used for inter prediction or motion compensation.
Fig. 3 is a schematic diagram schematically showing a video codec system to which the present invention is applicable.
The video codec system according to the embodiment may include an encoding apparatus 10 and a decoding apparatus 20. Encoding device 10 may send encoded video and/or image information or data in file or streaming form over a digital storage medium or network to decoding device 20.
The encoding apparatus 10 according to the embodiment may include a video source generating unit 11, an encoding unit 12, and a transmitting unit 13. The decoding apparatus 20 according to an embodiment may include a receiving unit 21, a decoding unit 22, and a rendering unit 23. The encoding unit 12 may be referred to as a video/image encoding unit, and the decoding unit 22 may be referred to as a video/image decoding unit. The transmitting unit 13 may be included in the encoding unit 12. The receiving unit 21 may be included in the decoding unit 22. The rendering unit 23 may include a display unit, and the display unit may be configured as a separate device or an external component.
The video source generating unit 11 may obtain video/images through a process of capturing, synthesizing, or generating the video/images. The video source generating unit 11 may comprise video/image capturing means and/or video/image generating means. The video/image capturing means may comprise, for example, one or more cameras, video/image archives comprising previously captured video/images, etc. The video/image generating means may comprise, for example, a computer, a tablet computer, a smart phone, etc., and may (electronically) generate the video/image. For example, a virtual video/image may be generated by a computer or the like, in which case the video/image capturing process may be replaced with a process of generating related data.
The encoding unit 12 may encode the input video/image. For compression and coding efficiency, the coding unit 12 may perform a series of processes such as prediction, transformation, and quantization. The encoding unit 12 may output encoded data (encoded video/image information) in the form of a bitstream. The detailed configuration of the encoding unit 12 may also be configured in the same manner as the encoding apparatus 100 of fig. 1 described above.
The transmitting unit 13 may transmit the encoded video/image information or data output in the form of a bitstream to the receiving unit 21 of the decoding apparatus 20 through a digital storage medium or a network in the form of a file or stream. The digital storage medium may include various storage media such as USB, SD, CD, DVD, blu-ray, HDD, SSD, etc. The transmitting unit 13 may include an element for generating a media file through a predetermined file format, and may include an element for transmission through a broadcast/communication network. The receiving unit 21 may extract/receive the bit stream from the storage medium or the network and transmit it to the decoding unit 22.
The decoding unit 22 may decode the video/image by performing a series of processes such as inverse quantization, inverse transformation, and prediction corresponding to the operation of the encoding unit 12. The detailed configuration of the decoding unit 22 may also be configured in the same manner as the above-described decoding apparatus 200 of fig. 2.
The rendering unit 23 may render the decoded video/image. The rendered video/images may be displayed by a display unit.
In the present invention, a block vector derived by intra block copy prediction merge mode may be corrected.
The intra block copy prediction method is a method of searching for a best prediction block in a reconstructed region of a current picture by using a block vector, and then generating a prediction block of the current block by copying the best prediction block. To improve the accuracy of intra block copy prediction, the encoder and/or decoder may correct the block vector.
Before describing a method for correcting a block vector of a current block, a method for intra block copy prediction according to an embodiment of the present invention will be described with reference to fig. 4.
Fig. 4 is a schematic diagram for describing a method for intra block copy prediction according to an embodiment of the present invention.
Referring to fig. 4, based on the block vector 420 of the current block 410, the matching block 440 may be derived within predefined search ranges R1, R2, R3, and R4 of the reconstructed region 430 of the current picture 400. In addition, a prediction block of the current block 410 may be generated based on the matching block 440.
The predefined search ranges R1, R2, R3, and R4 in fig. 4 may be defined as the current coding tree unit (CTU) including the current block, the upper-left CTU, and the left CTU.
In addition, the reference blocks may be searched based on a predefined search order within the predefined search range. As an example, the search may proceed in a zigzag order of R1, R4, R3, and R2.
On the other hand, information about the search range may be determined in the encoder and transmitted to the decoder. In addition, the search range may be set to a predetermined value in the encoder/decoder.
On the other hand, in fig. 4, the derivation of the matching block within a predefined search range is described, but the matching block may also be derived based on a block vector in the reconstructed region within the current picture.
On the other hand, the block vector 420 of fig. 4 is a block vector that has not been corrected, but a matching block may be derived based on the corrected block vector of the current block, and a prediction block of the current block may be generated based on the derived matching block.
On the other hand, the reference block in intra block copy may be a matching block.
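A minimal sketch of intra block copy prediction as described above: the prediction block is a straight copy from the reconstructed area at the position offset by the block vector. The list-of-rows picture representation and the (bvx, bvy) ordering are illustrative assumptions, and no search-range or availability checking is performed here.

```python
def ibc_predict(recon, x0, y0, w, h, bv):
    """Copy the w x h reference (matching) block pointed to by block
    vector bv from the already-reconstructed area of the current picture.

    recon: reconstructed picture as a list of sample rows.
    (x0, y0): top-left position of the current block.
    """
    bvx, bvy = bv
    rx, ry = x0 + bvx, y0 + bvy          # top-left of the matching block
    return [row[rx:rx + w] for row in recon[ry:ry + h]]
```

In a real codec the encoder additionally verifies that (rx, ry) lies inside the valid reconstructed region (e.g., the allowed CTU ranges) before using the vector.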
The intra block copy merge mode may be a mode in which a block vector of a current block is derived from information on block vectors of neighboring blocks of the current block.
When the intra block copy merge mode is applied, a block vector merge candidate list may be determined by using information on block vectors of neighboring blocks of the current block, which are predicted by the intra block copy merge mode or the intra template matching prediction mode included in the reconstructed region. Herein, the block vector merge candidate list may be a list storing information on a block vector of a reference block of the current block.
On the other hand, the reference block in the intra block copy merge mode may be a matching block.
The block vector merge candidates of the block vector merge candidate list may be reordered. Specifically, the block vector merge candidates in the block vector merge candidate list may be reordered in descending order of priority so that the block vector merge candidate list may be determined, and the block vector of the current block may be derived based on the determined block vector merge candidate list. Herein, having a higher priority may mean matching with a lower index number in the block vector merging candidate list.
Fig. 5 is a schematic diagram for describing a method for reordering block vector merge candidates of a block vector merge candidate list according to an embodiment of the present invention. Specifically, fig. 5 is a schematic diagram for describing a template matching-based adaptive reordering method (adaptive reordering of merge candidates with template matching, ARMC-TM).
The ARMC-TM is a method of reordering block vector merge candidates of a block vector merge candidate list based on a similarity between a template of a reference block represented by each block vector merge candidate and a template of a current block.
As an example, the block vector merge candidates may be reordered such that block vector merge candidates corresponding to the template of the reference block (which has high similarity with the template of the current block) have high priority in the block vector merge candidate list.
Referring to fig. 5, in the intra block copy merge mode, in predefined search ranges R1, R2, R3, and R4 of the reconstructed region 510 of the current picture 500, a matching block M1 550 may be derived by a block vector V1 530, and a matching block M2 560 may be derived by a block vector V2 540. On the other hand, the block vector merging candidate list may include information about the block vectors V1 and V2.
In addition, adjacent L-shaped areas (i.e., left, upper, and left upper areas) of the current block 520 may be defined as a template (current template) 570 of the current block, and adjacent L-shaped areas of M1 and the matching block M2 may be defined as a template (template of M1) 580 of the reference block M1 and a template (template of M2) 590 of the reference block M2, respectively.
Herein, the similarity between the template 570 of the current block and the template 580 of M1 and the similarity between the template 570 of the current block and the template 590 of M2 may be determined, and the block vector merge candidate list may be reordered based on the similarity.
For example, when the similarity between the template 570 of the current block and the template 580 of M1 is high, the block vector merge candidates may be reordered so that the block vector merge candidates corresponding to the template 580 of M1 have a high priority in the block vector merge candidate list. Herein, having a higher priority may mean that the corresponding block vector merging candidate has a lower index number in the block vector merging candidate list.
On the other hand, the similarity may be determined by any one of the SAD method and the SSE method.
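The SAD-based reordering described above can be sketched as follows, with templates flattened to one-dimensional sample lists for simplicity; the function names and representation are illustrative assumptions.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized templates."""
    return sum(abs(p - q) for p, q in zip(a, b))

def reorder_candidates(cur_template, cand_templates):
    """Return candidate indices sorted so that the reference template most
    similar to the current block's template gets index 0 (lowest cost first),
    mirroring the ARMC-TM-style priority assignment."""
    costs = [sad(cur_template, t) for t in cand_templates]
    return sorted(range(len(costs)), key=lambda i: costs[i])
```

Replacing `sad` with a sum of squared errors would give the SSE variant mentioned above without changing the reordering logic.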
The predefined search ranges R1, R2, R3, and R4 in fig. 5 may be defined as the current coding tree unit (CTU) including the current block, the upper-left CTU, and the left CTU.
In addition, the reference templates may be searched based on a predefined search order within a predefined search scope. As an example, the reference templates may be searched in a zigzag order of R1, R4, R3, and R2.
On the other hand, information about the search range and the size and shape of the template may be determined in the encoder and transmitted to the decoder. In addition, the search range and the size and shape of the template may be set to predetermined values in the encoder/decoder.
As an example, the size of the current template may be determined by (W×L2) + (L1×H) + (L1×L2), as shown in fig. 5. Here, W and H may represent the width and height of the current block, and the values of L1 and L2 may be determined as any positive integer.
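A one-line check of the template-size expression above: the three terms count the top strip, the left strip, and the top-left corner of the L-shaped template. The function name is an illustrative assumption.

```python
def template_size(w, h, l1, l2):
    """Number of samples in the L-shaped template around a w x h block:
    top strip (w * l2) + left strip (l1 * h) + top-left corner (l1 * l2)."""
    return w * l2 + l1 * h + l1 * l2
```

For example, a 16×16 block with L1 = L2 = 4 gives 16×4 + 4×16 + 4×4 = 144 template samples.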
On the other hand, in fig. 5, the derivation of the matching block within a predefined search range is described, but the matching block may also be derived based on a block vector in the reconstructed region within the current picture.
Further, fig. 5 depicts reordering of 2 block vector merging candidates, but this is one example, and L block vector merging candidates for L block vectors may be reordered. Herein, L is any positive integer.
In addition, fig. 5 depicts a block vector merging candidate list having a higher priority matching a lower index number, but this is one example, and the block vector merging candidate may be matched with a predetermined index according to priority.
In addition, fig. 5 describes a case where block vector merge candidates corresponding to a template of a reference block (which has high similarity to a template of a current block) are reordered to have high priority in a block vector merge candidate list, but this is one example, and block vector merge candidates may be reordered to have arbitrary priority based on similarity.
On the other hand, the block vector of the current block may be derived based on the block vector merge candidate list. Specifically, when one block vector merging candidate is determined among the block vector merging candidates included in the block vector merging candidate list, the block vector of the current block may be derived from the determined block vector merging candidate.
In addition, the best block vector merge candidate may be determined in the block vector merge candidate list, and the block vector of the current block may be derived from the best block vector merge candidate. Herein, the best block vector merging candidate may be a block vector merging candidate having the lowest cost value among the block vector merging candidates included in the block vector merging candidate list.
In addition, when reordering block vector merge candidates of the block vector merge candidate list to determine the block vector merge candidate list, a block vector of the current block may be derived based on the determined block vector merge candidate list. Specifically, when one block vector merging candidate is determined among the reordered block vector merging candidates, the block vector of the current block may be derived from the determined block vector merging candidate.
In addition, the block vector of the current block may be derived from one of n block vector merging candidates corresponding to a high priority in the block vector merging candidate list. Specifically, one block vector merging candidate may be determined among n block vector merging candidates having a high priority in the block vector merging candidate list, and a block vector of the current block may be derived from the determined block vector merging candidate. Herein, n is any positive integer equal to or smaller than the total number of block vector merging candidates included in the block vector merging candidate list.
On the other hand, the priority may be determined based on the cost value. Specifically, the block vector merging candidates in the block vector merging candidate list may have higher priority as their cost values decrease.
On the other hand, having a higher priority may indicate a match with a lower index number in the block vector merge candidate list.
Alternatively, the cost value may be calculated by a predefined cost function. In addition, the method for calculating the cost value may be any one of the SAD method or the SSE method.
In the present invention, a block vector of a current block, which is derived through an intra block copy prediction merge mode, may be corrected, and a prediction block of the current block may be generated based on the corrected block vector. Specifically, the derived block vector of the current block may be corrected by using the differential block vector information, a matching block may be derived based on the corrected block vector, and a prediction block of the current block may be generated based on the derived matching block.
As an example, the corrected block vector of the current block may be calculated as shown in equation 1.
[Equation 1]
Final block vector = initial block vector + differential block vector
In equation 1, the initial block vector may be a block vector of the current block derived by the intra block copy prediction merging mode, and the final block vector is a corrected block vector.
On the other hand, the differential block vector may be derived by using differential block vector information.
According to an embodiment of the present invention, when it is determined that the derived block vector is to be corrected, the block vector may be corrected based on differential block vector information. Specifically, whether to correct the derived block vector may first be determined; when it is determined that the derived block vector is to be corrected, differential block vector information may be obtained, and the derived block vector may be corrected based on the differential block vector information.
For example, when it is determined that the derived block vector is not corrected, differential block vector information may not be obtained, and a prediction block of the current block may be generated based on the derived block vector.
The differential block vector information may include direction information and distance information. Herein, the direction information may be information about the direction of the differential block vector, and the distance information may be information about the magnitude of the differential block vector.
On the other hand, the direction information may include horizontal direction information and vertical direction information.
Fig. 6 to 8 are schematic diagrams for describing information about directions included in differential block vector information according to an embodiment of the present invention. Referring to fig. 6 to 8, a circle indicates an initial position of a differential block vector, and a square indicates a direction of the differential block vector.
Fig. 6 is a schematic diagram for describing information about 4 directions included in differential block vector information according to an embodiment of the present invention. Specifically, fig. 6 shows that the direction of the differential block vector corresponds to one of 4 directions (right horizontal direction, left horizontal direction, upward vertical direction, and downward vertical direction).
TABLE 1
Referring to table 1, direction information indicating the direction of the differential block vector among the four directions is matched with an index and transmitted/parsed. Herein, the direction information may include horizontal direction information and vertical direction information.
For example, direction information including horizontal direction (+1) information and vertical direction (0) information may be matched with index 0. In addition, direction information including horizontal direction (0) information and vertical direction (+1) information may be matched with index 2. Herein, when the direction information is transmitted/parsed with index 0, the direction of the differential block vector may be the right horizontal direction 600, and when the direction information is transmitted/parsed with index 2, the direction of the differential block vector may be the upward vertical direction 610.
On the other hand, table 1 is an example, and the direction information of the differential block vector may be matched with an arbitrary index and transmitted/parsed. Herein, the arbitrary index may be a preset index.
Fig. 7 is a schematic diagram for describing information about 8 directions included in differential block vector information according to an embodiment of the present invention. Specifically, fig. 7 shows that the direction of the differential block vector is one of 8 directions (the right horizontal, left horizontal, upward vertical, downward vertical, upper-right diagonal, upper-left diagonal, lower-right diagonal, and lower-left diagonal directions).
Fig. 8 is a schematic diagram for describing information about 16 directions included in differential block vector information according to an embodiment of the present invention. Specifically, fig. 8 shows that the direction of the differential block vector is one of 16 directions obtained by adding 8 additional directions to the above 8 directions.
Regarding the 8-direction and 16-direction information described in fig. 7 and 8, the direction information of the differential block vector may also be matched with an arbitrary index and transmitted/parsed in the same manner as described for table 1.
On the other hand, fig. 6 to 8 show 4, 8, and 16 directions, respectively, but these are only examples, and the direction of the differential block vector may be one of N directions. Herein, N is any positive integer.
The distance information included in the differential block vector information may be matched with an arbitrary index and transmitted/parsed in a range from 1/4 pixel to 32 pixels.
TABLE 2
Referring to table 2, distances from 1/4 pixel to 32 pixels may be matched with 8 indexes, and distance information may be transmitted/parsed using the index matched with the magnitude of the differential block vector. For example, when the magnitude of the differential block vector corresponds to a 1-pixel distance, the distance information may be transmitted/parsed using index 2, which matches that distance, and when the magnitude corresponds to an 8-pixel distance, the distance information may be transmitted/parsed using index 5 accordingly.
On the other hand, table 2 is an example; a pixel distance ranging from 1/4 pixel to 32 pixels may be matched with K indexes, and the distance information of the differential block vector may be transmitted/parsed with the index matched with the pixel distance corresponding to the magnitude of the differential block vector. Herein, K is any positive integer.
According to an embodiment, the distance information may include vertical distance information and horizontal distance information. The vertical distance information indicates an absolute value of a vertical component of the differential block vector. The horizontal distance information indicates an absolute value of a horizontal component of the differential block vector.
On the other hand, the differential block vector may be derived by using the differential block vector information. Specifically, the direction of the differential block vector may be determined from the transmitted/parsed direction information, and the magnitude of the differential block vector may be determined from the transmitted/parsed distance information.
For example, when the direction of the differential block vector is the right horizontal direction and its magnitude corresponds to a 1-pixel distance, differential block vector information including the index of the right horizontal direction as direction information and the index of the 1-pixel distance as distance information may be transmitted/parsed, and the block vector of the current block may be corrected by using the differential block vector having that magnitude and direction.
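The derivation of a differential block vector from transmitted/parsed indices can be sketched as follows. The direction and distance tables here are assumptions modelled on the examples above (direction index 0 mapped to the right horizontal direction and index 2 to the upward vertical direction; distance index 2 mapped to 1 pixel and index 5 to 8 pixels, with power-of-two spacing); the actual tables 1 and 2 of the invention may differ.

```python
# Hypothetical index-to-value tables; the assignments for direction indices
# 1 and 3 and the power-of-two distance spacing are assumptions, chosen to
# be consistent with the examples given in the description.
DIRECTIONS = {0: (1, 0), 1: (-1, 0), 2: (0, 1), 3: (0, -1)}  # (horizontal, vertical)
DISTANCES = [0.25, 0.5, 1, 2, 4, 8, 16, 32]                  # pixels, indices 0..7

def derive_differential_bv(direction_idx, distance_idx):
    """Combine a parsed direction index and distance index into a
    differential block vector (horizontal, vertical)."""
    dx, dy = DIRECTIONS[direction_idx]
    step = DISTANCES[distance_idx]
    return (dx * step, dy * step)

# Direction index 0 (right horizontal) with distance index 2 (1 pixel)
# gives the differential block vector (1, 0), as in the example above.
print(derive_differential_bv(0, 2))  # (1, 0)
```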
Fig. 9 is a schematic diagram for describing a method for correcting a block vector according to an embodiment of the present invention. The method for correcting a block vector in fig. 9 may be performed by an image decoding apparatus.
The image decoding apparatus may determine a prediction mode of the current block as an intra block copy merge mode (S900).
In addition, the image decoding apparatus may determine a block vector merge candidate list of the current block (S910).
On the other hand, the block vector merge candidate list may be determined by using block vector information of neighboring blocks of the current block.
In addition, the block vector merge candidates of the determined block vector merge candidate list may be reordered.
In addition, the block vector merge candidates may be reordered based on the similarity between the template of the reference block represented by the block vector merge candidates and the template of the current block.
In addition, a block vector merge candidate whose reference-block template has high similarity to the template of the current block may be reordered to have high priority in the block vector merge candidate list.
In addition, the similarity may be determined by any one of a sum of absolute differences (SAD) method and a sum of squared errors (SSE) method.
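The template-based reordering described above can be sketched as follows. Templates are modelled as flat lists of sample values; in a real decoder they would be taken from reconstructed neighboring samples, and all names here are illustrative.

```python
# SAD and SSE costs between two templates (lower cost = higher similarity).
def sad(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def sse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def reorder_candidates(candidates, cur_template, ref_templates, cost=sad):
    """Sort merge candidates so that the candidate whose reference-block
    template best matches the current block's template comes first."""
    costs = [cost(cur_template, ref_templates[i]) for i in range(len(candidates))]
    order = sorted(range(len(candidates)), key=lambda i: costs[i])
    return [candidates[i] for i in order]

cur = [10, 20, 30, 40]
refs = [[0, 0, 0, 0], [11, 19, 31, 39], [10, 20, 30, 40]]
# Candidate 'C' matches exactly (cost 0), 'B' closely (cost 4), 'A' poorly.
print(reorder_candidates(['A', 'B', 'C'], cur, refs))  # ['C', 'B', 'A']
```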
In addition, the image decoding apparatus may derive a block vector of the current block based on the block vector merge candidate list (S920).
On the other hand, the derived block vector may be derived from one of a predetermined number of block vector merging candidates corresponding to a high priority in the block vector merging candidate list.
In addition, the image decoding apparatus may correct the block vector by using the differential block vector information (S930).
On the other hand, the differential block vector information may include direction information and distance information.
In addition, the direction information may include horizontal direction information and vertical direction information.
In addition, the distance information may include vertical distance information and horizontal distance information.
In addition, the image decoding apparatus may generate a prediction block of the current block based on the corrected block vector (S940).
In addition, the image decoding apparatus may determine whether to correct the derived block vector and, when determining to correct it, may obtain the differential block vector information. The derived block vector may then be corrected based on the differential block vector information in accordance with that determination.
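Under the same illustrative assumptions as the earlier sketches, the overall flow of fig. 9 can be condensed as follows; the merge index, correction flag, and differential block vector stand in for information that would be parsed from the bitstream, and all names are hypothetical.

```python
# Compact sketch of steps S920-S930 of Fig. 9: pick the block vector merge
# candidate signalled by a merge index, then optionally correct it with the
# parsed differential block vector (Equation 1). The corrected vector would
# then locate the matching block used for prediction (S940).
def decode_block_vector(candidate_list, merge_idx, correct_flag,
                        differential_bv=(0, 0)):
    bv = candidate_list[merge_idx]          # S920: derive initial block vector
    if correct_flag:                        # differential info parsed only when set
        bv = (bv[0] + differential_bv[0],   # S930: apply Equation 1
              bv[1] + differential_bv[1])
    return bv

cands = [(-12, -4), (-20, 0)]
print(decode_block_vector(cands, 0, True, (4, 0)))  # (-8, -4)
print(decode_block_vector(cands, 1, False))         # (-20, 0)
```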
On the other hand, the steps described in fig. 9 may be similarly performed in the image encoding method. In addition, the bitstream may be generated by an image encoding method including the steps described in fig. 9. The bit stream may be stored in a non-volatile computer readable recording medium and may also be transmitted (or streamed).
Fig. 10 exemplarily shows a content streaming system to which an embodiment according to the present invention can be applied.
As shown in fig. 10, a content streaming system to which an embodiment of the present invention is applied may mainly include an encoding server, a streaming server, a web server, a media storage device, a user device, and a multimedia input device.
The encoding server compresses content received from a multimedia input device such as a smart phone, a video camera, a CCTV, etc. into digital data to generate a bitstream and transmits it to the streaming server. As another example, if a multimedia input device such as a smart phone, a video camera, a CCTV, etc. directly generates a bitstream, the encoding server may be omitted.
The bitstream may be generated by the image encoding method and/or the image encoding apparatus to which the embodiments of the present invention are applied, and the streaming server may temporarily store the bitstream during transmission or reception of the bitstream.
The streaming server transmits multimedia data to the user device based on a user request via the web server, and the web server may act as an intermediary informing the user of available services. When a user requests a desired service from the web server, the web server transmits the request to the streaming server, and the streaming server may transmit multimedia data to the user. At this time, the content streaming system may include a separate control server, and in this case, the control server may control commands/responses between devices within the content streaming system.
The streaming server may receive content from the media storage device and/or the encoding server. For example, when content is received from the encoding server, the content may be received in real time. In this case, in order to provide a smooth streaming service, the streaming server may store the bitstream for a predetermined period of time.
Examples of user devices may include mobile phones, smart phones, laptops, digital broadcast terminals, personal digital assistants (PDAs), portable multimedia players (PMPs), navigation devices, tablet PCs, ultrabooks, wearable devices (e.g., smart watches, smart glasses, head-mounted displays (HMDs)), digital televisions, desktop computers, digital signage, and the like.
Each server in the content streaming system described above may operate as a distributed server, in which case the data received from each server may be distributed and processed.
The above embodiments may be performed in the same or corresponding manner in the encoding means and decoding means. In addition, at least one of the above embodiments or a combination of at least one of the above embodiments may be utilized to encode/decode an image.
The order in which the above embodiments are applied may be different in the encoding apparatus and the decoding apparatus. Alternatively, the order in which the above embodiments are applied may be the same in the encoding apparatus and the decoding apparatus.
The above-described embodiments may be performed for each of the luminance signal and the chrominance signal. Alternatively, the above-described embodiments may be performed identically for the luminance signal and the chrominance signal.
In the above-described embodiments, the method is described based on a flowchart having a series of steps or units, but the present invention is not limited to the order of the steps, and instead, some steps may be performed simultaneously with other steps or in a different order. In addition, those of ordinary skill in the art will understand that steps in the flowcharts are not mutually exclusive and that other steps may be added to the flowcharts or some steps may be deleted from the flowcharts without affecting the scope of the present invention.
Embodiments may be implemented in the form of program instructions executable by various computer components and recorded in a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded in the computer-readable recording medium may be specially designed and constructed for the present invention, or they may be well known to those having ordinary skill in the computer software arts.
The bit stream generated by the encoding method according to the above-described embodiments may be stored in a nonvolatile computer-readable recording medium. In addition, the bit stream stored in the nonvolatile computer readable recording medium may be decoded by the decoding method according to the above-described embodiments.
Examples of the computer-readable recording medium include magnetic recording media such as hard disks, floppy disks, and magnetic tapes; optical data storage media such as CD-ROMs and DVD-ROMs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as read-only memories (ROMs), random access memories (RAMs), and flash memories. Examples of program instructions include not only machine language code produced by a compiler but also high-level language code that may be executed by a computer using an interpreter. The hardware devices may be configured to operate as one or more software modules, or vice versa, to perform the processes according to the present invention.
While the invention has been described in terms of specific items such as detailed elements, as well as limited embodiments and figures, they are provided solely to aid in a more complete understanding of the invention and the invention is not limited to the embodiments described above. Those skilled in the art to which the invention pertains will appreciate that various modifications and variations may be made in light of the foregoing description.
Therefore, the spirit of the invention should not be limited to the above-described embodiments, and the full scope of the appended claims and equivalents thereof should be within the scope and spirit of the invention.
Industrial applicability
The present invention can be used in an apparatus for encoding/decoding an image and a recording medium for storing a bitstream.

Claims (14)

1. A method for decoding an image, the method comprising:
determining a prediction mode of a current block as an intra block copy merge mode;
determining a block vector merge candidate list of the current block;
deriving a block vector of the current block based on the block vector merge candidate list;
correcting the block vector by using differential block vector information; and
generating a prediction block of the current block based on the corrected block vector.
2. The method of claim 1, wherein the block vector merge candidate list is determined by using block vector information of neighboring blocks of the current block.
3. The method of claim 1, wherein block vector merge candidates of the determined block vector merge candidate list are reordered.
4. The method of claim 3, wherein the block vector merge candidates are reordered based on a similarity between a template of a reference block represented by a block vector merge candidate and a template of the current block.
5. The method of claim 4, wherein a block vector merge candidate corresponding to a template of a reference block having high similarity to the template of the current block is reordered to have high priority in the block vector merge candidate list.
6. The method of claim 4, wherein the similarity is determined by one of a sum of absolute differences (SAD) method and a sum of squared errors (SSE) method.
7. The method of claim 1, wherein the derived block vector is derived from one of a predetermined number of block vector merge candidates corresponding to a high priority in the block vector merge candidate list.
8. The method of claim 1, wherein the differential block vector information includes direction information and distance information.
9. The method of claim 8, wherein the direction information includes vertical direction information and horizontal direction information.
10. The method of claim 8, wherein the distance information includes vertical distance information and horizontal distance information.
11. The method of claim 1, further comprising:
determining whether to correct the derived block vector; and
obtaining the differential block vector information when it is determined to correct the derived block vector,
wherein the derived block vector is corrected based on the differential block vector information in accordance with the determination to correct the derived block vector.
12. A method for encoding an image, the method comprising:
determining a prediction mode of a current block as an intra block copy merge mode;
determining a block vector merge candidate list of the current block;
deriving a block vector of the current block based on the block vector merge candidate list;
correcting the block vector by using differential block vector information; and
generating a prediction block of the current block based on the corrected block vector.
13. A non-volatile computer-readable recording medium for storing a bitstream, the bitstream being generated by a method for encoding an image,
wherein the method for encoding an image comprises:
determining a prediction mode of a current block as an intra block copy merge mode;
determining a block vector merge candidate list of the current block;
deriving a block vector of the current block based on the block vector merge candidate list;
correcting the block vector by using differential block vector information; and
generating a prediction block of the current block based on the corrected block vector.
14. A method for transmitting a bitstream, the bitstream being generated by a method for encoding an image,
wherein the method for transmitting a bitstream comprises transmitting the bitstream, and
wherein the method for encoding an image comprises:
determining a prediction mode of a current block as an intra block copy merge mode;
determining a block vector merge candidate list of the current block;
deriving a block vector of the current block based on the block vector merge candidate list;
correcting the block vector by using differential block vector information; and
generating a prediction block of the current block based on the corrected block vector.
CN202480026782.8A 2023-06-12 2024-06-05 Image encoding/decoding methods and apparatus, and recording media for storing bit streams Pending CN121040068A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
KR20230074967 2023-06-12
KR10-2023-0074967 2023-06-12
PCT/KR2024/007749 WO2024258110A1 (en) 2023-06-12 2024-06-05 Image encoding/decoding method and device, and recording medium storing bitstream

Publications (1)

Publication Number Publication Date
CN121040068A true CN121040068A (en) 2025-11-28

Family

ID=94082217

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202480026782.8A Pending CN121040068A (en) 2023-06-12 2024-06-05 Image encoding/decoding methods and apparatus, and recording media for storing bit streams

Country Status (2)

Country Link
KR (1) KR20240175310A (en)
CN (1) CN121040068A (en)

Also Published As

Publication number Publication date
KR20240175310A (en) 2024-12-19

Similar Documents

Publication Publication Date Title
CN119183659A (en) Image encoding/decoding method, device and recording medium storing bit stream
CN119422377A (en) Image encoding/decoding method, device and recording medium for storing bit stream
CN119278624A (en) Image encoding/decoding method, device and recording medium storing bit stream
CN118947122A (en) Method, apparatus and recording medium storing bit stream for encoding/decoding image
CN119174174A (en) Image encoding/decoding method and device and recording medium storing bit stream
US20250240437A1 (en) Method and apparatus for encoding/decoding image and recording medium for storing bitstream
CN120345257A (en) Image encoding/decoding method and device and recording medium storing bit stream
CN119156813A (en) Image encoding/decoding method and apparatus, and recording medium for storing bit stream
CN121040068A (en) Image encoding/decoding methods and apparatus, and recording media for storing bit streams
US20250373809A1 (en) Method and apparatus for encoding/decoding image and recording medium for storing bitstream
US20260019597A1 (en) Method and apparatus for encoding/decoding image and recording medium for storing bitstream
US20250184496A1 (en) Method and apparatus for encoding/decoding image and recording medium for storing bitstream
US20250193393A1 (en) Method and apparatus for encoding/decoding image and recording medium for storing bitstream
US20250392703A1 (en) Method and apparatus for encoding/decoding an image and a recording medium for storing bitstream
CN120266480A (en) Image encoding/decoding method and device and recording medium storing bit stream
CN120752916A (en) Image encoding/decoding method and apparatus, and recording medium storing bit stream
CN121336407A (en) Image encoding/decoding method and apparatus, and recording medium for storing bit stream
CN120712780A (en) Image encoding/decoding method, device, and recording medium storing bit stream
CN121312139A (en) Image encoding/decoding methods and apparatus, and recording media for storing bit streams.
CN119137951A (en) Image encoding/decoding method and device and recording medium storing bit stream
CN120530636A (en) Image encoding/decoding method and apparatus, and recording medium storing bit stream
CN120604512A (en) Image encoding/decoding method and apparatus, and recording medium storing bit stream
CN119137941A (en) Image encoding/decoding method and device and recording medium storing bit stream
CN121286010A (en) Image encoding/decoding method, apparatus, and recording medium storing bit stream
CN119563314A (en) Image encoding/decoding method, device and recording medium with stored bit stream

Legal Events

Date Code Title Description
PB01 Publication