
CN114071161A - Image coding method, image decoding method and related device - Google Patents


Info

Publication number
CN114071161A
CN114071161A
Authority
CN
China
Prior art keywords
block
prediction
current
indication information
intra
Prior art date
Legal status
Granted
Application number
CN202010748923.0A
Other languages
Chinese (zh)
Other versions
CN114071161B
Inventor
谢志煌
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202010748923.0A
Priority to TW110123862A (TW202209879A)
Publication of CN114071161A
Application granted
Publication of CN114071161B
Status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiments of the present application disclose an image encoding method, an image decoding method, and a related device. The image decoding method includes: partitioning an image and determining an intra prediction mode of a target component of a current coding block, where the target component is a luma component or a chroma component; determining a prediction block of the target component of the current coding block according to the intra prediction mode of the target component; performing first filtering, according to the intra prediction mode of the target component, on the reference pixels used to correct the prediction block, to obtain filtered reference pixels; and performing second filtering on the prediction block of the target component using the filtered reference pixels, to obtain a corrected prediction block. In the embodiments of the present application, before the prediction block of the current pixel block is corrected using the spatial correlation between neighboring pixel blocks and the current pixel block, the boundary pixels of the neighboring pixel blocks are first filtered, which avoids sharpening and improves intra prediction accuracy and coding efficiency.

Description

Image encoding method, image decoding method and related device
Technical Field
The present application relates to the field of electronic device technologies, and in particular, to an image encoding method, an image decoding method, and a related apparatus.
Background
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, Personal Digital Assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, video conferencing devices, video streaming devices, and so forth.
Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and ITU-T H.265 High Efficiency Video Coding (HEVC), and in the extensions of those standards, to transmit and receive digital video information more efficiently. Video devices can transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing these video codec techniques.
With the proliferation of internet video, ever-higher requirements are placed on the video compression ratio, even though digital video compression technology continues to evolve.
Disclosure of Invention
The embodiment of the application provides an image coding method, an image decoding method and a related device, so that before a prediction block of a current pixel block is corrected by utilizing the spatial relevance between an adjacent pixel block and the current pixel block, the boundary pixels of the adjacent pixel block of the current pixel block are filtered, sharpening is avoided, and the intra-frame prediction accuracy and the coding efficiency are improved.
In a first aspect, an embodiment of the present application provides an image encoding method, including:
dividing the image, and determining coding information of a current coding block, wherein the coding information comprises first indication information and second indication information, the first indication information is used for indicating whether intra-prediction filtering is allowed or not, and the second indication information is used for indicating whether intra-prediction smoothing filtering is allowed or not;
determining an optimal prediction mode of the current coding block according to the coding information, setting third indication information according to the optimal prediction mode, and transmitting the index of the optimal prediction mode and the third indication information through a code stream;
and superposing the prediction block of the current coding block and a residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block.
Compared with the prior art, the scheme of the application performs smooth filtering on the prediction block obtained by calculating the intra-frame prediction mode, improves intra-frame prediction precision, and effectively improves coding efficiency.
In a second aspect, an embodiment of the present application provides an image decoding method, including:
analyzing a code stream, and determining decoding information of a current decoding block, wherein the decoding information comprises first indication information and third indication information, the first indication information is used for indicating whether intra-frame prediction filtering is allowed or not, and the third indication information is used for indicating whether intra-frame prediction smooth filtering is used or not;
determining a prediction block of the current decoding block according to the first indication information and the third indication information;
and superposing the restored residual error information on the prediction block to obtain a reconstructed block of the current decoding unit.
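The superposition step above can be sketched as follows. This is a generic reconstruction sketch assuming 8-bit samples; the function name and clipping details are illustrative choices, not the patent's normative procedure:

```python
import numpy as np

def reconstruct_block(prediction: np.ndarray, residual: np.ndarray,
                      bit_depth: int = 8) -> np.ndarray:
    # Superimpose the restored residual on the prediction block and
    # clip to the valid sample range for the given bit depth.
    max_val = (1 << bit_depth) - 1
    rec = prediction.astype(np.int32) + residual.astype(np.int32)
    return np.clip(rec, 0, max_val).astype(np.uint8)
```

The clipping matters: prediction plus residual can leave the representable sample range, so codecs clip rather than wrap.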
Compared with the prior art, the scheme of the application performs smooth filtering on the prediction block obtained by calculating the intra-frame prediction mode, improves intra-frame prediction precision, and effectively improves coding efficiency.
In a third aspect, an embodiment of the present application provides an image encoding apparatus, including:
the device comprises a dividing unit, a decoding unit and a smoothing unit, wherein the dividing unit is used for dividing an image and determining coding information of a current coding block, the coding information comprises first indication information and second indication information, the first indication information is used for indicating whether intra-prediction filtering is allowed or not, and the second indication information is used for indicating whether intra-prediction smoothing filtering is allowed or not;
a determining unit, configured to determine an optimal prediction mode of the current coding block according to the coding information, set third indication information according to the optimal prediction mode, and transmit the index of the optimal prediction mode and the third indication information via a code stream;
and the superposition unit is used for superposing the prediction block of the current coding block and the residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block.
In a fourth aspect, an embodiment of the present application provides an image decoding apparatus, including:
the decoding unit is used for decoding the code stream and determining decoding information of a current decoding block, wherein the decoding information comprises first indication information and third indication information, the first indication information is used for indicating whether intra-frame prediction filtering is allowed or not, and the third indication information is used for indicating whether intra-frame prediction smoothing filtering is used or not;
a determining unit, configured to determine a prediction block of the current decoded block according to the first indication information and the third indication information;
and the superposition unit is used for superposing the restored residual error information on the prediction block to obtain a reconstructed block of the current decoding unit.
In a fifth aspect, an embodiment of the present application provides an encoder, including: a processor and a memory coupled to the processor; the processor is configured to perform the method of the first aspect.
In a sixth aspect, an embodiment of the present application provides a decoder, including: a processor and a memory coupled to the processor; the processor is configured to perform the method of the second aspect.
In a seventh aspect, an embodiment of the present application provides a terminal, including one or more processors, a memory, and a communication interface, where the memory and the communication interface are coupled to the one or more processors. The terminal communicates with other devices through the communication interface; the memory is used to store computer program code comprising instructions which, when executed by the one or more processors, perform the method according to the first or second aspect.
In an eighth aspect, the present invention provides a computer-readable storage medium, having stored therein instructions, which, when executed on a computer, cause the computer to perform the method of the first or second aspect.
In a ninth aspect, embodiments of the present application provide a computer program product comprising instructions that, when executed on a computer, cause the computer to perform the method of the first or second aspect.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic block diagram of a coding tree unit in an embodiment of the present application;
FIG. 2 is a schematic block diagram of a CTU and a coding block CU in an embodiment of the present application;
FIG. 3 is a schematic block diagram of a color format in an embodiment of the present application;
FIG. 4 is a schematic diagram of an IPF in an embodiment of the present application;
FIG. 5 is a diagram illustrating intra prediction filtering according to an embodiment of the present application;
FIG. 6 is a schematic block diagram of a video coding system in an embodiment of the present application;
FIG. 7 is a schematic block diagram of a video encoder in an embodiment of the present application;
FIG. 8 is a schematic block diagram of a video decoder in an embodiment of the present application;
FIG. 9 is a flowchart illustrating an image encoding method according to an embodiment of the present application;
FIG. 10 is a flowchart illustrating an image decoding method according to an embodiment of the present application;
FIG. 11A is a schematic diagram of a first padding of a prediction block in the embodiment of the present application;
FIG. 11B is a second padding diagram of a prediction block in the embodiment of the present application;
FIG. 11C is a third padding diagram of a prediction block in the embodiment of the present application;
FIG. 12 is a block diagram of a functional unit of an image encoding apparatus according to an embodiment of the present application;
FIG. 13 is a block diagram showing another functional unit of the image encoding apparatus according to the embodiment of the present application;
FIG. 14 is a block diagram of a functional unit of an image decoding apparatus according to an embodiment of the present application;
fig. 15 is a block diagram of another functional unit of the image decoding apparatus in the embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It will be understood that, as used herein, the terms "first," "second," and the like may be used herein to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another. For example, a first client may be referred to as a second client, and similarly, a second client may be referred to as a first client, without departing from the scope of the present invention. Both the first client and the second client are clients, but they are not the same client.
First, terms used in the embodiments of the present application will be described.
For the partition of images, in order to more flexibly represent Video contents, a Coding Tree Unit (CTU), a Coding Unit (CU), a Prediction Unit (PU), and a Transform Unit (TU) are defined in the High Efficiency Video Coding (HEVC) technology. The CTU, CU, PU, and TU are all image blocks.
Coding tree unit (CTU): an image is composed of multiple CTUs. A CTU generally corresponds to a square image area containing the luma pixels and chroma pixels of that area (or only luma pixels, or only chroma pixels). The CTU also contains syntax elements indicating how the CTU is divided into at least one coding unit (CU) and the method of decoding each coding block to obtain a reconstructed image. As shown in fig. 1, the picture 10 is composed of a plurality of CTUs (including CTU A, CTU B, CTU C, and so on). The coding information corresponding to a CTU includes the luma values and/or chroma values of the pixels in the square image area corresponding to the CTU, and may also contain syntax elements indicating how to divide the CTU into at least one CU and how to decode each CU to obtain the reconstructed image. The image area corresponding to one CTU may contain 64 × 64, 128 × 128, or 256 × 256 pixels. In one example, a CTU of 64 × 64 pixels comprises a rectangular pixel lattice of 64 columns of 64 pixels each, where each pixel comprises a luma component and/or a chroma component. A CTU may also correspond to a rectangular image area or an area of another shape, in which the number of pixels in the horizontal direction differs from the number in the vertical direction, for example 64 × 128 pixels.
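As a small illustration of the CTU grid described above, the number of CTU columns and rows needed to cover a picture can be computed with ceiling division (the helper below is our sketch, not code from the patent):

```python
def ctu_grid(width: int, height: int, ctu_size: int = 64) -> tuple:
    # Ceiling division: pictures whose dimensions are not multiples of
    # the CTU size still need a partial CTU column/row at the border.
    cols = (width + ctu_size - 1) // ctu_size
    rows = (height + ctu_size - 1) // ctu_size
    return cols, rows
```

For a 1920 × 1080 picture with 64 × 64 CTUs this gives 30 columns and 17 rows, the last row covering only 56 of its 64 pixel rows.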
Coding block (CU): as shown in fig. 2, a CTU may be further divided into coding blocks (CUs). Each CU generally corresponds to an A × B rectangular area in the image, containing A × B luma pixels and/or the corresponding chroma pixels, where A is the width of the rectangle and B its height; A and B may be equal or different, and both usually take values that are integer powers of 2, such as 128, 64, 32, 16, 8, or 4. Here, the width referred to in the embodiments of the present application is the length along the X-axis (horizontal direction) of the two-dimensional rectangular coordinate system XoY shown in fig. 1, and the height is the length along the Y-axis (vertical direction) of the same system. The reconstructed image of a CU is obtained by adding a predicted image and a residual image: the predicted image is generated by intra prediction or inter prediction and may consist of one or more prediction blocks (PBs), while the residual image is generated by inverse quantization and inverse transformation of transform coefficients and may consist of one or more transform blocks (TBs). Specifically, a CU contains coding information such as the prediction mode and transform coefficients, and decoding operations such as the corresponding prediction, inverse quantization, and inverse transform are performed on the CU according to this coding information to generate its reconstructed image. The relationship between coding tree units and coding blocks is shown in fig. 2.
Prediction unit (PU): the basic unit of intra prediction and inter prediction. The motion information of an image block includes the inter prediction direction, reference frame, motion vector, and so on. An image block undergoing encoding is called a current coding block (CCB), and an image block undergoing decoding is called a current decoding block (CDB); for example, when an image block is undergoing prediction, the current coding block or current decoding block is a prediction block, and when it is undergoing residual processing, it is a transform block. The picture containing the current coding block or current decoding block is called the current frame. Within the current frame, image blocks to the left of or above the current block may already have completed encoding/decoding and yielded reconstructed images; these are called reconstructed blocks, and information such as their coding modes and reconstructed pixels is available. A frame whose encoding/decoding was completed before that of the current frame is called a reconstructed frame. When the current frame is a uni-directionally predicted frame (P frame) or a bi-directionally predicted frame (B frame), it has one or two reference frame lists, denoted L0 and L1 respectively, each containing at least one reconstructed frame, called a reference frame of the current frame. Reference frames provide reference pixels for inter prediction of the current frame.
Transform unit (TU): the unit used to process the residual between the original image block and the predicted image block.
Pixel (also called a pixel point): a picture element in an image, such as a pixel in a coding block, a pixel in a luma-component pixel block (also called a luma pixel), or a pixel in a chroma-component pixel block (also called a chroma pixel).
Sample (also called a pixel value or sample value): the value of a pixel. In the luma component domain, the pixel value is a luminance (i.e., gray-scale) value; in the chroma component domain, it is a chrominance value (i.e., color and saturation). Depending on the processing stage, a sample of a pixel may specifically be an original sample, a predicted sample, or a reconstructed sample.
Directions: the horizontal direction is along the X-axis of the two-dimensional rectangular coordinate system XoY shown in fig. 1; the vertical direction is along the negative Y-axis of the same coordinate system.
Intra prediction: generating a predicted image of the current block from its spatially neighboring pixels. Each intra prediction mode corresponds to one method of generating the predicted image. The division of an intra prediction unit includes a 2N × 2N mode and an N × N mode: in the 2N × 2N mode the image block is not divided; in the N × N mode the image block is divided into four equally sized sub-blocks.
Typically, digital video compression techniques operate on video sequences whose color coding method is YCbCr (also referred to as YUV) in a color format of 4:2:0, 4:2:2, or 4:4:4. Here Y denotes luminance (Luma), i.e., the gray-scale value, Cb denotes the blue chrominance component, Cr denotes the red chrominance component, and U and V denote chrominance (Chroma), describing color and saturation. In these color formats, 4:2:0 means 4 luma components and 2 chroma components per 4 pixels (YYYYCbCr), 4:2:2 means 4 luma components and 4 chroma components per 4 pixels (YYYYCbCrCbCr), and 4:4:4 means full-resolution chroma (YYYYCbCrCbCrCbCrCbCr). Fig. 3 shows the component distributions for the different color formats, where circles are the Y component and triangles are the UV components.
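The sample counts implied by the three color formats above can be sketched as follows. This helper is illustrative only (not from the patent) and assumes even picture dimensions for the subsampled formats:

```python
def plane_sizes(width: int, height: int, color_format: str) -> tuple:
    # Sample counts of the Y, Cb and Cr planes for one picture.
    luma = width * height
    if color_format == "4:2:0":    # chroma halved horizontally and vertically
        chroma = (width // 2) * (height // 2)
    elif color_format == "4:2:2":  # chroma halved horizontally only
        chroma = (width // 2) * height
    elif color_format == "4:4:4":  # full-resolution chroma
        chroma = luma
    else:
        raise ValueError(f"unknown color format: {color_format}")
    return luma, chroma, chroma
```

In 4:2:0 the two chroma planes together hold half as many samples as the luma plane, which is why this format dominates consumer video.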
In a digital video encoding process, the encoder reads pixels and encodes the raw video sequence in one of these color formats. A typical digital encoder comprises prediction, transform and quantization, inverse transform and inverse quantization, loop filtering, entropy coding, and so on, used to eliminate spatial, temporal, visual, and statistical redundancy. Since the human eye is more sensitive to changes in the luma component and reacts less strongly to changes in the chroma components, the original video sequence is usually encoded in the YUV 4:2:0 color format. In the intra coding part, digital video encoders also apply different prediction processes to the luma and chroma components: luma prediction is finer and more complex, while chroma prediction is generally simpler. The cross-component prediction (CCP) mode is an existing digital video coding technique that acts on the luma and chroma components to increase the video compression ratio.
The intra prediction part of digital video coding and decoding mainly predicts the current coding unit block with reference to the image information of neighboring blocks in the current frame, computes the residual between the prediction block and the original image block to obtain residual information, and transmits the residual information to the decoder through transform, quantization, and related processes. After receiving and parsing the bitstream, the decoder obtains the residual information through inverse transform, inverse quantization, and related steps, and superimposes it on the predicted image block obtained by its own prediction to obtain the reconstructed image block. In this process, intra prediction usually predicts the current coding block with each angular and non-angular mode to obtain prediction blocks, selects the optimal prediction mode of the current coding unit according to rate-distortion information computed from the prediction block and the original block, and then transmits that prediction mode to the decoder in the bitstream. The decoder parses the prediction mode, predicts the image of the current decoding block, and superimposes the residual pixels carried in the bitstream to obtain the reconstructed image.
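The mode screening mentioned above can be illustrated with a deliberately simplified stand-in that ranks candidate modes by sum of absolute differences (SAD) only; a real rate-distortion search would also weigh the bit cost of signalling each mode:

```python
import numpy as np

def best_intra_mode(original: np.ndarray, candidate_predictions: dict):
    # Pick the candidate mode whose prediction has the lowest SAD
    # against the original block (distortion-only toy criterion).
    best_mode, best_cost = None, None
    for mode, pred in candidate_predictions.items():
        cost = int(np.abs(original.astype(np.int32) -
                          pred.astype(np.int32)).sum())
        if best_cost is None or cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost
```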
Through successive generations of digital video coding standards, the non-angular modes have remained relatively stable: there is an average (DC) mode and a planar mode. The number of angular modes, by contrast, has grown as the standards evolved. Taking the international H-series standards as an example, H.264/AVC has only 8 angular prediction modes and 1 non-angular prediction mode; H.265/HEVC extends this to 33 angular and 2 non-angular prediction modes; and the latest versatile video coding standard, H.266/VVC, adopts 67 prediction modes, keeping 2 non-angular modes and expanding the angular modes from the 33 of H.265 to 65. As the number of angular modes grows, intra prediction becomes more accurate, better meeting the demand for high-definition and ultra-high-definition video. Like the international standards, the Chinese digital audio and video coding standard AVS3 also continues to extend its angular and non-angular modes. The development of ultra-high-definition digital video places higher requirements on intra prediction, and coding efficiency cannot be improved merely by adding angular prediction modes and widening angles. The AVS3 standard therefore adopts an intra prediction filter (IPF) technique. The IPF technique recognizes that not all reference pixels are used in current intra angular prediction, so the correlation between some pixels and the current coding unit is easily ignored; through point-to-point filtering, IPF improves pixel prediction precision and effectively strengthens spatial correlation, thereby improving intra prediction accuracy.
The IPF technique takes the prediction mode from top right to bottom left in AVS3 as an example, as shown in fig. 4, where URB represents the boundary pixel of the left neighboring block near the current coding unit, MRB represents the boundary pixel of the upper neighboring block near the current coding unit, and filter direction represents the filtering direction. In the prediction mode direction from top right to bottom left, the generated prediction value of the current coding unit mainly uses the reference pixel points of the adjacent block in the row of the MRB above, that is, the prediction pixel of the current coding unit does not refer to the reconstructed pixel of the adjacent block on the left side, however, the current coding unit and the reconstructed block on the left side are in a spatial adjacent relationship, and if only the MRB pixel on the upper side is referred to and the URB pixel on the left side is not referred to, spatial correlation is easily lost, which results in poor prediction effect.
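The correction described above — blending predicted pixels near the block edge with the neighboring reconstructed boundary column — might be sketched as below. The weights, their fall-off with distance, and the 6-bit normalization are placeholders of our own, not the actual AVS3 IPF coefficients:

```python
import numpy as np

def ipf_horizontal_blend(pred: np.ndarray, left_ref: np.ndarray,
                         weights=(12, 6, 3)) -> np.ndarray:
    # Blend predicted samples near the left edge with the left
    # neighboring reconstructed column (the URB pixels); the weight
    # (out of 64) shrinks as the distance to the reference grows.
    out = pred.astype(np.int32).copy()
    h, w = pred.shape
    for x in range(min(len(weights), w)):
        f = weights[x]
        for y in range(h):
            out[y, x] = (f * int(left_ref[y]) + (64 - f) * out[y, x] + 32) >> 6
    return out.astype(np.uint8)
```

Columns beyond the weight list are left untouched, reflecting that spatial correlation with the left reconstructed block decays with distance.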
The IPF technology is applied to all prediction modes of intra-frame prediction, and is a filtering method for improving intra-frame prediction precision. The IPF technology is mainly realized by the following processes:
a) the IPF technique judges the current prediction mode of the coding unit, classifying it as a horizontal-class angular prediction mode, a vertical-class angular prediction mode, or a non-angular prediction mode;
b) according to the type of prediction mode, the IPF technique uses a different filter to filter the input pixels;
c) according to the distance from the current pixel to the reference pixel, the IPF technique uses different filter coefficients to filter the input pixels;
the input pixel of the IPF technique is a predicted pixel obtained in each prediction mode, and the output pixel is a final predicted pixel after IPF.
The IPF technique has an enable flag, ipf_enable_flag, a binary variable whose value of '1' indicates that intra prediction filtering may be used and whose value of '0' indicates that intra prediction filtering shall not be used. The IPF technique also uses a usage flag, ipf_flag, a binary variable whose value of '1' indicates that intra prediction filtering shall be used and whose value of '0' indicates that it shall not be used; if ipf_flag is not present in the code stream, it defaults to 0.
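The interaction of the two flags can be sketched as follows (a non-normative illustration; the Bitstream helper and the function name are assumptions, not actual AVS3 parsing code):

```python
# Non-normative sketch: how ipf_enable_flag and ipf_flag interact.
# The Bitstream helper and function names are illustrative assumptions.

class Bitstream:
    """Minimal stand-in for a bitstream reader."""
    def __init__(self, bits):
        self.bits = list(bits)

    def has_next(self):
        return bool(self.bits)

    def read_bit(self):
        return self.bits.pop(0)

def read_ipf_flag(bs, ipf_enable_flag):
    """Return whether IPF is used for the current coding unit."""
    if not ipf_enable_flag:
        return 0  # sequence-level switch off: ipf_flag is not coded
    if bs.has_next():
        return bs.read_bit()  # 1 = use IPF, 0 = do not use IPF
    return 0  # ipf_flag absent from the code stream -> defaults to 0
```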
The syntax element ipf_flag is coded as follows (the syntax table is reproduced only as images in the original publication).
The IPF technique classifies prediction modes 0, 1, and 2 as non-angular prediction modes, and filters the prediction pixels using the first three-tap filter;
prediction modes 3 to 18 and 34 to 50 are classified as vertical-class angular prediction modes, and the prediction pixels are filtered using the first two-tap filter;
prediction modes 19 to 32 and 51 to 65 are classified as horizontal-class angular prediction modes, and the prediction pixels are filtered using the second two-tap filter.
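The mode-to-filter grouping above can be sketched as follows (a non-normative illustration; the function name and the returned labels are assumptions):

```python
# Non-normative sketch of the mode-to-filter grouping described above.
# The function name and returned labels are illustrative assumptions.

def ipf_filter_class(mode):
    """Map an intra prediction mode index to its IPF filter class."""
    if mode in (0, 1, 2):
        return "three-tap"            # non-angular modes
    if 3 <= mode <= 18 or 34 <= mode <= 50:
        return "first-two-tap"        # vertical-class angular modes
    if 19 <= mode <= 32 or 51 <= mode <= 65:
        return "second-two-tap"       # horizontal-class angular modes
    raise ValueError(f"mode {mode} is outside the grouping described here")
```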
The first three-tap filter applicable to the IPF technique has the following filtering formula:
P′(x,y)=f(x)·P(-1,y)+f(y)·P(x,-1)+(1-f(x)-f(y))·P(x,y)
the first two-tap filter applicable to the IPF technique has the following filtering formula:
P′(x,y)=f(x)·P(-1,y)+(1-f(x))·P(x,y)
the second two-tap filter suitable for the IPF technique has the following filtering formula:
P′(x,y)=f(y)·P(x,-1)+(1-f(y))·P(x,y)
In the above equations, P′(x, y) is the final prediction value of the pixel at position (x, y) of the current chroma prediction block; f(x) and f(y) are, respectively, the horizontal filter coefficient applied to the reconstructed pixel of the referenced left neighboring block and the vertical filter coefficient applied to the reconstructed pixel of the referenced upper neighboring block; P(-1, y) and P(x, -1) are, respectively, the reconstructed pixel to the left of row y and the reconstructed pixel above column x; and P(x, y) is the original prediction pixel value in the current chroma component prediction block. The values of x and y do not exceed the width and height of the current coding unit block.
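As a minimal, non-normative sketch, the three filtering formulas can be written as below (the function name and mode-class strings are illustrative assumptions, and the real coefficients f(x) and f(y) come from the standard's coefficient table, here passed in as plain numbers):

```python
# Non-normative sketch of the three per-pixel IPF filtering formulas.
# The real coefficients f(x), f(y) come from the standard's coefficient
# tables (Table 1); here they are supplied directly by the caller.

def ipf_filter_pixel(pred, left_rec, top_rec, fx, fy, mode_class):
    """pred     -- original prediction value P(x, y)
    left_rec -- reconstructed pixel P(-1, y) of the left neighboring block
    top_rec  -- reconstructed pixel P(x, -1) of the upper neighboring block
    fx, fy   -- horizontal / vertical filter coefficients f(x), f(y)
    """
    if mode_class == "non-angular":   # first three-tap filter
        return fx * left_rec + fy * top_rec + (1 - fx - fy) * pred
    if mode_class == "vertical":      # first two-tap filter
        return fx * left_rec + (1 - fx) * pred
    if mode_class == "horizontal":    # second two-tap filter
        return fy * top_rec + (1 - fy) * pred
    raise ValueError(mode_class)
```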
The values of the horizontal and vertical filter coefficients are related to the size of the current coding unit block and to the distance from the predicted pixel in the current prediction block to the left and upper reconstructed pixels; the coefficients are divided into different filter coefficient groups according to the size of the current coding unit block.
Table 1 gives the filter coefficients for the IPF technique.
Table 1 intra chroma prediction filter coefficients
(The coefficient values of Table 1 are reproduced only as an image in the original publication.)
Fig. 5 shows the three filtering cases of intra prediction filtering: (a) filtering the prediction values in the current coding unit with reference only to the upper reference pixels; (b) filtering the prediction values in the current coding unit with reference only to the left reference pixels; and (c) filtering the prediction values in the current coding unit block with reference to both the upper and left reference pixels, where Distance denotes the distance from the currently processed pixel to the reference pixel.
FIG. 6 is a block diagram of a video coding system 1 of one example described in an embodiment of the present application. As used herein, the term "video coder" generally refers to both video encoders and video decoders. In this application, the term "video coding" or "coding" may generally refer to video encoding or video decoding. The video encoder 100 and the video decoder 200 of the video coding system 1 are used to implement the image encoding method proposed by the present application.
As shown in fig. 6, video coding system 1 includes a source device 10 and a destination device 20. Source device 10 generates encoded video data. Accordingly, source device 10 may be referred to as a video encoding device. Destination device 20 may decode the encoded video data generated by source device 10. Accordingly, the destination device 20 may be referred to as a video decoding device. Various implementations of source device 10, destination device 20, or both may include one or more processors and memory coupled to the one or more processors. The memory can include, but is not limited to, RAM, ROM, EEPROM, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures that can be accessed by a computer, as described herein.
Source device 10 and destination device 20 may comprise a variety of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, or the like.
Destination device 20 may receive encoded video data from source device 10 via link 30. Link 30 may comprise one or more media or devices capable of moving encoded video data from source device 10 to destination device 20. In one example, link 30 may comprise one or more communication media that enable source device 10 to transmit encoded video data directly to destination device 20 in real-time. In this example, source device 10 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to destination device 20. The one or more communication media may include wireless and/or wired communication media such as a Radio Frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (e.g., the internet). The one or more communication media may include a router, switch, base station, or other apparatus that facilitates communication from source device 10 to destination device 20. In another example, encoded data may be output from output interface 140 to storage device 40.
The image codec techniques of this application may be applied to video codecs to support a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, streaming video transmissions (e.g., via the internet), encoding for video data stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications. In some examples, video coding system 1 may be used to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
The video coding system 1 illustrated in fig. 6 is merely an example, and the techniques of this application may be applied to video coding settings (e.g., video encoding or video decoding) that do not necessarily include any data communication between an encoding device and a decoding device. In other examples, the data is retrieved from local storage, streamed over a network, and so forth. A video encoding device may encode and store data to a memory, and/or a video decoding device may retrieve and decode data from a memory. In many examples, the encoding and decoding are performed by devices that do not communicate with each other, but merely encode data to and/or retrieve data from memory and decode data.
In the example of fig. 6, source device 10 includes video source 120, video encoder 100, and output interface 140. In some examples, output interface 140 may include a modulator/demodulator (modem) and/or a transmitter. Video source 120 may comprise a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video feed interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources of video data.
Video encoder 100 may encode video data from video source 120. In some examples, source device 10 transmits the encoded video data directly to destination device 20 via output interface 140. In other examples, encoded video data may also be stored onto storage device 40 for later access by destination device 20 for decoding and/or playback.
In the example of fig. 6, destination device 20 includes input interface 240, video decoder 200, and display device 220. In some examples, input interface 240 includes a receiver and/or a modem. Input interface 240 may receive encoded video data via link 30 and/or from storage device 40. The display device 220 may be integrated with the destination device 20 or may be external to the destination device 20. In general, display device 220 displays decoded video data. The display device 220 may include a variety of display devices, such as a Liquid Crystal Display (LCD), a plasma display, an Organic Light Emitting Diode (OLED) display, or other types of display devices.
Although not shown in fig. 6, in some aspects, video encoder 100 and video decoder 200 may each be integrated with an audio encoder and decoder, and may include appropriate multiplexer-demultiplexer units or other hardware and software to handle encoding of both audio and video in a common data stream or separate data streams.
Video encoder 100 and video decoder 200 may each be implemented as any of a variety of circuits such as: one or more microprocessors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), discrete logic, hardware, or any combinations thereof. If the present application is implemented in part in software, a device may store instructions for the software in a suitable non-volatile computer-readable storage medium and may execute the instructions in hardware using one or more processors to implement the techniques of the present application. Any of the foregoing, including hardware, software, a combination of hardware and software, etc., may be considered one or more processors. Each of video encoder 100 and video decoder 200 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (codec) in a respective device.
Fig. 7 is an exemplary block diagram of a video encoder 100 described in embodiments of the present application. The video encoder 100 is used to output the video to the post-processing entity 41. Post-processing entity 41 represents an example of a video entity, such as a media-aware network element (MANE) or a splicing/editing device, that may process the encoded video data from video encoder 100. In some cases, post-processing entity 41 may be an instance of a network entity. In some video encoding systems, post-processing entity 41 and video encoder 100 may be parts of separate devices, while in other cases, the functionality described with respect to post-processing entity 41 may be performed by the same device that includes video encoder 100. In some examples, post-processing entity 41 is an example of storage device 40 of fig. 6.
In the example of fig. 7, the video encoder 100 includes a prediction processing unit 108, a filter unit 106, a memory 107, a summer 112, a transformer 101, a quantizer 102, and an entropy encoder 103. The prediction processing unit 108 includes an inter predictor 110 and an intra predictor 109. For image block reconstruction, the video encoder 100 further includes an inverse quantizer 104, an inverse transformer 105, and a summer 111. Filter unit 106 represents one or more loop filters, such as deblocking filters, Adaptive Loop Filters (ALF), and Sample Adaptive Offset (SAO) filters. Although filter unit 106 is shown in fig. 7 as an in-loop filter, in other implementations, filter unit 106 may be implemented as a post-loop filter. In one example, the video encoder 100 may further include a video data memory, a partitioning unit (not shown).
Video encoder 100 receives video data and stores the video data in a video data memory. The partitioning unit partitions the video data into image blocks and these image blocks may be further partitioned into smaller blocks, e.g. image block partitions based on a quadtree structure or a binary tree structure. Prediction processing unit 108 may select one of a plurality of possible coding modes for the current image block, such as one of a plurality of intra coding modes or one of a plurality of inter coding modes. Prediction processing unit 108 may provide the resulting intra, inter coded block to summer 112 to generate a residual block and to summer 111 to reconstruct the encoded block used as the reference picture. An intra predictor 109 within prediction processing unit 108 may perform intra-predictive encoding of a current block of video relative to one or more neighboring encoded blocks of the current block to be encoded in the same frame or slice to remove spatial redundancy. Inter predictor 110 within prediction processing unit 108 may perform inter-predictive encoding of the current block relative to one or more prediction blocks in one or more reference pictures to remove temporal redundancy. The prediction processing unit 108 provides information indicating the selected intra or inter prediction mode of the current image block to the entropy encoder 103 so that the entropy encoder 103 encodes the information indicating the selected inter prediction mode.
After prediction processing unit 108 generates a prediction block for the current image block via inter/intra prediction, video encoder 100 forms a residual image block by subtracting the prediction block from the current image block to be encoded. Summer 112 represents one or more components that perform this subtraction operation. The residual video data in the residual block may be included in one or more TUs and applied to transformer 101. The transformer 101 transforms the residual video data into residual transform coefficients using a transform such as a Discrete Cosine Transform (DCT) or a conceptually similar transform. Transformer 101 may convert residual video data from a pixel value domain to a transform domain, e.g., the frequency domain.
The transformer 101 may send the resulting transform coefficients to the quantizer 102. Quantizer 102 quantizes the transform coefficients to further reduce the bit rate. In some examples, quantizer 102 may then perform a scan of a matrix that includes quantized transform coefficients. Alternatively, the entropy encoder 103 may perform a scan.
After quantization, the entropy encoder 103 entropy encodes the quantized transform coefficients. For example, the entropy encoder 103 may perform Context Adaptive Variable Length Coding (CAVLC), Context Adaptive Binary Arithmetic Coding (CABAC), syntax-based context adaptive binary arithmetic coding (SBAC), Probability Interval Partition Entropy (PIPE) coding, or another entropy encoding method or technique. After entropy encoding by the entropy encoder 103, the encoded codestream may be transmitted to the video decoder 200, or archived for later transmission or retrieved by the video decoder 200. The entropy encoder 103 may also entropy encode syntax elements of the current image block to be encoded.
Inverse quantizer 104 and inverse transformer 105 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain, e.g., for later use as a reference block for a reference image. The summer 111 adds the reconstructed residual block to the prediction block produced by the inter predictor 110 or the intra predictor 109 to produce a reconstructed image block. The filter unit 106 may be adapted to reconstruct the image block to reduce distortions, such as block artifacts. This reconstructed image block is then stored in memory 107 as a reference block, which may be used by inter predictor 110 as a reference block to inter predict a block in a subsequent video frame or image.
The video encoder 100 divides the input video into a number of coding tree units, each of which is in turn divided into a number of rectangular or square coding blocks. When the current coding block is coded in an intra prediction mode, a plurality of prediction modes are traversed for the luma component of the current coding block and the optimal one is selected according to the rate-distortion cost; likewise, a plurality of prediction modes are traversed for the chroma component and the optimal one is selected according to the rate-distortion cost. The residual between the original video block and the prediction block is then calculated; one path of the residual forms the output code stream through transform, quantization, entropy coding, and so on, while the other path forms reconstructed samples through inverse transform, inverse quantization, loop filtering, and so on, to serve as reference information for subsequent video compression.
The present IPF technique is implemented in the video encoder 100 as follows.
The input digital video information is divided at the encoding end into a plurality of coding tree units, each coding tree unit is divided into a plurality of rectangular or square coding units, and each coding unit undergoes the intra prediction process to calculate a prediction block.
In the current coding unit,
① if the IPF enable flag is '1', all of the following steps are performed;
② if the IPF enable flag is '0', only steps a1), b1), f1) and g1) are performed.
a1) The intra-frame prediction firstly traverses all prediction modes, calculates prediction pixels under each intra-frame prediction mode, and calculates the rate distortion cost according to the original pixels;
b1) selecting the optimal prediction mode of the current coding unit according to the principle of minimum rate distortion cost of all prediction modes, and recording the optimal prediction mode information and the rate distortion cost information corresponding to the optimal prediction mode information;
c1) traversing all intra-frame prediction modes again, starting an IPF technology in the process, firstly calculating prediction pixels under each intra-frame prediction mode to obtain a prediction block of the current coding unit;
d1) IPF is performed on the prediction block of the current coding unit: the corresponding filter is selected according to the current prediction mode and the corresponding filter coefficient group is selected according to the size of the current coding unit, with the specific correspondence given in Table 1;
e1) calculating rate distortion cost information of each prediction mode according to the final prediction pixel obtained by the IPF technology and the original pixel, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
f1) if the IPF enable flag is '0', the prediction mode index recorded in b1) is transmitted to the decoding end through the code stream;
if the IPF enable flag is '1', the minimum cost value recorded in b1) is compared with the minimum cost value recorded in e1):
if the rate-distortion cost in b1) is lower, the prediction mode index recorded in b1) is coded as the optimal prediction mode of the current coding unit and transmitted to the decoding end through the code stream, and the IPF usage flag of the current coding unit is set to false, indicating that the IPF technique is not used, and is likewise transmitted to the decoding end through the code stream;
if the rate-distortion cost in e1) is lower, the prediction mode index recorded in e1) is coded as the optimal prediction mode of the current coding unit and transmitted to the decoding end through the code stream, and the IPF usage flag of the current coding unit is set to true, indicating that the IPF technique is used, and is likewise transmitted to the decoding end through the code stream.
g1) The prediction values are then superimposed with the residual information recovered after operations such as transform and quantization to obtain the reconstructed block of the current coding unit, which serves as reference information for subsequent coding units.
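The rate-distortion decision of steps a1) to f1) can be sketched as follows (a non-normative illustration; the function name and the dictionary-based cost inputs are assumptions):

```python
# Non-normative sketch of the decision in steps a1)-f1): the encoder keeps
# the best mode found without IPF and, if IPF is enabled, the best mode
# found with IPF, then signals whichever has the lower rate-distortion
# cost. Inputs are {mode_index: rd_cost} dicts; all names are assumptions.

def choose_intra_mode(costs_no_ipf, costs_ipf, ipf_enable_flag):
    best = min(costs_no_ipf, key=costs_no_ipf.get)     # steps a1), b1)
    if not ipf_enable_flag:
        return best, False                             # no ipf_flag coded
    best_ipf = min(costs_ipf, key=costs_ipf.get)       # steps c1)-e1)
    if costs_ipf[best_ipf] < costs_no_ipf[best]:
        return best_ipf, True                          # signal ipf_flag = 1
    return best, False                                 # signal ipf_flag = 0
```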
The intra predictor 109 may also provide information indicating the selected intra prediction mode of the current encoding block to the entropy encoder 103 so that the entropy encoder 103 encodes the information indicating the selected intra prediction mode.
Fig. 8 is an exemplary block diagram of a video decoder 200 described in embodiments of the present application. In the example of fig. 8, the video decoder 200 includes an entropy decoder 203, a prediction processing unit 208, an inverse quantizer 204, an inverse transformer 205, a summer 211, a filter unit 206, and a memory 207. The prediction processing unit 208 may include an inter predictor 210 and an intra predictor 209. In some examples, video decoder 200 may perform a decoding process that is substantially reciprocal to the encoding process described with respect to video encoder 100 from fig. 7.
In the decoding process, video decoder 200 receives an encoded video bitstream representing an image block and associated syntax elements of an encoded video slice from video encoder 100. Video decoder 200 may receive video data from network entity 42 and, optionally, may store the video data in a video data memory (not shown). The video data memory may store video data, such as an encoded video bitstream, to be decoded by components of video decoder 200. The video data stored in the video data memory may be obtained, for example, from storage device 40, from a local video source such as a camera, via wired or wireless network communication of video data, or by accessing a physical data storage medium. The video data memory may serve as a coded picture buffer (CPB) for storing encoded video data from the encoded video bitstream.
Network entity 42 may be, for example, a server, a MANE, a video editor/splicer, or other such device for implementing one or more of the techniques described above. Network entity 42 may or may not include a video encoder, such as video encoder 100. Network entity 42 may implement portions of the techniques described in this application before network entity 42 sends the encoded video bitstream to video decoder 200. In some video decoding systems, network entity 42 and video decoder 200 may be part of separate devices, while in other cases, the functionality described with respect to network entity 42 may be performed by the same device that includes video decoder 200.
The entropy decoder 203 of the video decoder 200 entropy decodes the code stream to generate quantized coefficients and some syntax elements. The entropy decoder 203 forwards the syntax elements to the prediction processing unit 208. Video decoder 200 may receive syntax elements at the video slice level and/or the picture block level. When a video slice is decoded as an intra-decoded (I) slice, intra predictor 209 of prediction processing unit 208 generates a prediction block for an image block of the current video slice based on the signaled intra prediction mode and data from previously decoded blocks of the current frame or picture. When a video slice is decoded as an inter-decoded (i.e., B or P) slice, the inter predictor 210 of the prediction processing unit 208 may determine an inter prediction mode for decoding a current image block of the current video slice based on syntax elements received from the entropy decoder 203, decode the current image block (e.g., perform inter prediction) based on the determined inter prediction mode.
The inverse quantizer 204 inversely quantizes, i.e., dequantizes, the quantized transform coefficients provided in the codestream and decoded by the entropy decoder 203. The inverse quantization process may include: the quantization parameter calculated by the video encoder 100 for each image block in the video slice is used to determine the degree of quantization that should be applied and likewise the degree of inverse quantization that should be applied. Inverse transformer 205 applies an inverse transform, such as an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process, to the transform coefficients in order to generate a block of residues in the pixel domain.
After the inter predictor 210 generates a prediction block for the current image block or a sub-block of the current image block, the video decoder 200 obtains a reconstructed block, i.e., a decoded image block, by summing the residual block from the inverse transformer 205 with the corresponding prediction block generated by the inter predictor 210. Summer 211 represents the component that performs this summation operation. A loop filter (in or after the decoding loop) may also be used to smooth pixel transitions or otherwise improve video quality, if desired. Filter unit 206 may represent one or more loop filters, such as deblocking filters, Adaptive Loop Filters (ALF), and Sample Adaptive Offset (SAO) filters. Although the filter unit 206 is shown in fig. 8 as an in-loop filter, in other implementations, the filter unit 206 may be implemented as a post-loop filter.
The image decoding method specifically performed by the video decoder 200 includes obtaining the prediction mode index of the current coding block after the input code stream is parsed, inverse transformed, and inverse quantized. If the prediction mode index of the chroma component of the current coding block indicates the enhanced two-step cross-component prediction mode, reconstructed samples are selected only from the upper or left neighboring pixels of the current coding block according to the index value to calculate a linear model; a reference prediction block of the chroma component of the current coding block is obtained from the linear model and down-sampled, and prediction correction based on the correlation of boundary neighboring pixels in the orthogonal direction is performed on the down-sampled prediction block to obtain the final chroma prediction block. One path of the subsequent output is used as reference information for subsequent video decoding, and the other path is subjected to post-filtering processing to output the video signal.
The IPF technique is currently implemented at the video decoder 200 as follows.
The decoding end acquires and parses the code stream to obtain the digital video sequence information, and parses out the IPF enable flag of the current video sequence, the coding mode of the current decoding unit (here, an intra prediction coding mode), and the IPF usage flag of the current decoding unit.
In the current decoding unit,
① if the IPF enable flag is '1', all of the following steps are performed;
② if the IPF enable flag is '0', only steps a2), b2) and e2) are performed:
a2) acquiring code stream information, analyzing residual error information of a current decoding unit, and obtaining time domain residual error information through inverse transformation and inverse quantization processes;
b2) analyzing the code stream and acquiring a prediction mode index of the current decoding unit, and calculating to obtain a prediction block of the current decoding unit according to the adjacent reconstruction block and the prediction mode index;
c2) the IPF usage flag is parsed from the code stream; if the usage flag is '0', no additional operation is performed on the current prediction block; if the usage flag is '1', d2) is executed;
d2) selecting a corresponding filter according to the prediction mode classification information of the current decoding unit, selecting a corresponding filter coefficient group according to the size of the current decoding unit, and filtering each pixel in a prediction block to obtain a final prediction block;
e2) superposing the restored residual error information on the prediction block to obtain a reconstructed block of the current decoding unit, and outputting the reconstructed block after post-processing;
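The decoder-side flow of steps a2) to e2) can be sketched as follows (a non-normative illustration treating blocks as flat pixel lists; all names are assumptions):

```python
# Non-normative sketch of steps a2)-e2), treating blocks as flat pixel
# lists; apply_ipf stands in for the per-pixel filtering of step d2).
# All names are illustrative assumptions.

def decode_intra_cu(pred_block, residual, ipf_flag, apply_ipf):
    if ipf_flag:  # steps c2), d2): filter each predicted pixel
        pred_block = [apply_ipf(p) for p in pred_block]
    # step e2): superimpose the recovered residual on the prediction
    return [p + r for p, r in zip(pred_block, residual)]
```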
it should be understood that other structural variations of the video decoder 200 may be used to decode the encoded video stream. For example, the video decoder 200 may generate an output video stream without processing by the filter unit 206; alternatively, for some image blocks or image frames, the entropy decoder 203 of the video decoder 200 does not decode quantized coefficients and accordingly does not need to be processed by the inverse quantizer 204 and the inverse transformer 205.
In intra prediction, the existing IPF technique can effectively improve coding efficiency, greatly strengthening the spatial correlation of intra prediction and addressing the problem that only a single reference pixel row or column is used in the intra prediction process while the influence of other pixels on the prediction value is ignored. However, when the intra prediction process requires smoothing, neither the IPF technique nor the current intra prediction modes handle this well: pixel-by-pixel filtering based on the reference pixels can improve the correlation between the prediction block and the reference block, but it cannot solve the smoothing problem inside the prediction block.
A prediction block calculated from a single prediction mode usually shows a good prediction effect for images with clear texture, making the residual smaller and smaller and thus improving coding efficiency. In image blocks with blurred texture, however, an overly sharp prediction may increase and amplify the residual, resulting in a poor prediction effect and reduced coding efficiency.
In view of the above problems, the embodiments of the present application directly filter the prediction block for image blocks that need smoothing; for ease of understanding and distinction, this filtering technique is hereinafter referred to as the intra prediction smoothing filtering technique.
The following detailed description is made with reference to the accompanying drawings.
Fig. 9 is a flowchart illustrating an image encoding method in an embodiment of the present application; the image encoding method can be applied to the source device 10 in the video coding system 1 shown in fig. 6 or to the video encoder 100 shown in fig. 7. The flow shown in fig. 9 is described taking the video encoder 100 shown in fig. 7 as the execution subject. As shown in fig. 9, an image encoding method provided in an embodiment of the present application includes:
step 110, dividing the image, and determining coding information of a current coding block, wherein the coding information comprises first indication information and second indication information, the first indication information is used for indicating whether intra-prediction filtering is allowed, and the second indication information is used for indicating whether intra-prediction smoothing filtering is allowed;
step 120, determining an optimal prediction mode of the current coding block according to the coding information, setting third indication information according to the optimal prediction mode, and transmitting the optimal prediction mode index and the third indication information in the code stream;
and step 130, superposing the prediction block of the current coding block and the residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block.
The filtering technical scheme 1 is specifically implemented in the intra prediction part at the encoding end as follows:
the encoder acquires encoding information including an intra-frame prediction filtering allowable identification bit, an intra-frame prediction smoothing filtering allowable identification bit and the like, divides an image into a plurality of CTUs after acquiring image information, further divides the image into a plurality of CUs, and performs intra-frame prediction on each independent CU. It should be noted that the minimum processing unit may also be a custom coding block, and the CU is only an example and is not limited to the only example.
In the intra prediction process:
① if the intra prediction smoothing filtering allowed flag is '1' and the area of the current CU is greater than or equal to 64 and less than 2048, all of the following steps are performed;
② if the intra prediction smoothing filtering allowed flag is '0', only a3), b3), f3) and g3) are performed;
a3) the current coding unit traverses all intra-frame prediction modes, calculates to obtain a prediction block under each prediction mode, and calculates to obtain rate distortion cost information of the current prediction mode according to an original pixel block;
b3) selecting the optimal prediction mode of the current coding unit according to the principle of minimum rate distortion cost of all prediction modes, and recording the optimal prediction mode information and the rate distortion cost information corresponding to the optimal prediction mode information;
c3) traversing all intra-frame prediction modes again, starting an intra-frame prediction smoothing filtering technology in the process, firstly calculating prediction pixels under each intra-frame prediction mode to obtain a prediction block of the current coding unit;
d3) carrying out intra-frame prediction smoothing filtering on all pixels in a prediction block of a current coding unit to obtain a final prediction block;
e3) calculating rate distortion cost information of each prediction mode according to the final prediction pixel obtained by the intra-frame prediction smoothing filtering technology and the original pixel, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
f3) if the intra prediction smoothing filtering allowed flag is '0', the prediction mode index recorded in b3) is transmitted to the decoding end in the code stream;
if the intra prediction smoothing filtering allowed flag is '1', the minimum cost value recorded in b3) is compared with the minimum cost value recorded in e3):
if the rate-distortion cost in b3) is lower, the prediction mode index recorded in b3) is encoded as the optimal prediction mode of the current coding unit and transmitted to the decoding end in the code stream, and the intra prediction smoothing filtering use flag of the current coding unit is set to '0', indicating that the intra prediction smoothing filtering technique is not used, and is likewise transmitted to the decoding end in the code stream;
if the rate-distortion cost in e3) is lower, the prediction mode index recorded in e3) is encoded as the optimal prediction mode of the current coding unit and transmitted to the decoding end in the code stream, and the intra prediction smoothing filtering use flag of the current coding unit is set to '1', indicating that the intra prediction smoothing filtering technique is used, and is likewise transmitted to the decoding end in the code stream.
g3) And then, superposing the prediction block and the residual after inverse transformation and inverse quantization to obtain a reconstructed coding unit block which is used as a prediction reference block of the next coding unit.
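The mode decision in steps a3)-f3) can be sketched roughly as follows. The helpers `predict`, `smooth_filter` and `rd_cost` are hypothetical stand-ins for the encoder's real prediction, filtering and rate-distortion routines; the sketch only illustrates the cost comparison, not an actual encoder implementation.

```python
# Hedged sketch of the encoder-side decision in steps a3)-f3); the helper
# functions are hypothetical stand-ins, not real encoder APIs.

def choose_mode(modes, original, allow_smooth, predict, smooth_filter, rd_cost):
    """Return (best_mode, smooth_use_flag) by minimum rate-distortion cost."""
    # a3)/b3): plain intra prediction for every mode, keep the cheapest.
    plain_cost, plain_mode = min((rd_cost(predict(m), original), m) for m in modes)
    if not allow_smooth:
        # f3) with allowed flag '0': only the plain mode index is signalled.
        return plain_mode, 0
    # c3)-e3): traverse the modes again with the smoothing filter enabled.
    smooth_cost, smooth_mode = min(
        (rd_cost(smooth_filter(predict(m)), original), m) for m in modes
    )
    # f3) with allowed flag '1': signal whichever variant costs less.
    if plain_cost <= smooth_cost:
        return plain_mode, 0
    return smooth_mode, 1
```

The function returns both the signalled mode index and the use flag that, per f3), would also be written to the code stream.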
The filtering technical scheme 2 is specifically implemented in the intra prediction part at the encoding end as follows:
the encoder acquires encoding information including an intra-frame prediction filtering allowable identification bit, an intra-frame prediction smoothing filtering allowable identification bit and the like, divides an image into a plurality of CTUs after acquiring image information, further divides the image into a plurality of CUs, and performs intra-frame prediction on each independent CU.
In the intra prediction process:
① if the intra prediction smoothing filtering allowed flag is '1' and the area of the current CU is greater than or equal to 64 and less than 2048, all of the following steps are performed;
② if the intra prediction smoothing filtering allowed flag is '0', only a4), b4), f4) and g4) are performed;
a4) the current coding unit traverses all intra-frame prediction modes, calculates to obtain a prediction block under each prediction mode, and calculates to obtain rate distortion cost information of the current prediction mode according to an original pixel block;
b4) selecting the optimal prediction mode of the current coding unit according to the principle of minimum rate distortion cost of all prediction modes, and recording the optimal prediction mode information and the rate distortion cost information corresponding to the optimal prediction mode information;
c4) traversing all intra-frame prediction modes again, starting an intra-frame prediction smoothing filtering technology in the process, firstly calculating prediction pixels under each intra-frame prediction mode to obtain a prediction block of the current coding unit;
d4) filtering the prediction block of the current coding unit twice, performing intra-frame prediction smoothing filtering on all pixels in the prediction block for the first time, and performing intra-frame prediction smoothing filtering on left boundary pixels and upper boundary pixels in the filtered prediction block for the second time to obtain a final prediction block;
e4) calculating rate distortion cost information of each prediction mode according to the final prediction pixel obtained by the intra-frame prediction smoothing filtering technology and the original pixel, and recording the prediction mode and the corresponding cost value of the minimum rate distortion cost information;
f4) if the intra prediction smoothing filtering allowed flag is '0', the prediction mode index recorded in b4) is transmitted to the decoding end in the code stream;
if the intra prediction smoothing filtering allowed flag is '1', the minimum cost value recorded in b4) is compared with the minimum cost value recorded in e4):
if the rate-distortion cost in b4) is lower, the prediction mode index recorded in b4) is encoded as the optimal prediction mode of the current coding unit and transmitted to the decoding end in the code stream, and the intra prediction smoothing filtering use flag of the current coding unit is set to '0', indicating that the intra prediction smoothing filtering technique is not used, and is likewise transmitted to the decoding end in the code stream;
if the rate-distortion cost in e4) is lower, the prediction mode index recorded in e4) is encoded as the optimal prediction mode of the current coding unit and transmitted to the decoding end in the code stream, and the intra prediction smoothing filtering use flag of the current coding unit is set to '1', indicating that the intra prediction smoothing filtering technique is used, and is likewise transmitted to the decoding end in the code stream.
g4) And then, superposing the prediction block and the residual after inverse transformation and inverse quantization to obtain a reconstructed coding unit block which is used as a prediction reference block of the next coding unit.
Corresponding to the image encoding method described in fig. 9, fig. 10 is a flowchart illustrating an image decoding method in an embodiment of the present application, which can be applied to the destination device 20 in the video decoding system 1 shown in fig. 6 or the video decoder 200 shown in fig. 8. The flow shown in fig. 10 is described taking the video decoder 200 shown in fig. 8 as the execution subject. As shown in fig. 10, an image decoding method provided in an embodiment of the present application includes:
step 210, parsing the code stream, and determining decoding information of a current decoded block, where the decoding information includes first indication information and third indication information, the first indication information is used to indicate whether intra-frame prediction filtering is allowed, and the third indication information is used to indicate whether intra-frame prediction smoothing filtering is used;
step 220, determining a prediction block of the current decoding block according to the first indication information and the third indication information;
and step 230, overlapping the restored residual information with the prediction block to obtain a reconstructed block of the current decoding unit.
The specific flow of intra-frame prediction at the decoding end in filtering technical scheme 1 is as follows:
the decoder obtains a code stream, analyzes the code stream to obtain an intra-frame prediction smooth filtering allowable identification bit of the current video sequence, analyzes the code stream and carries out inverse transformation and inverse quantization on the obtained residual error information.
In the intra prediction decoding process:
① if the intra prediction smoothing filtering allowed flag is '1' and the area of the current CU is greater than or equal to 64 and less than 2048, all of the following steps are performed;
② if the intra prediction smoothing filtering allowed flag is '0', only steps a5), b5) and e5) are performed:
a5) acquiring a code stream, decoding to obtain residual information, and performing inverse transformation, inverse quantization and other processes to obtain time domain residual information;
b5) analyzing the code stream to obtain a prediction mode of a current decoding unit, and calculating according to the prediction mode of the current decoding unit and an adjacent reconstruction block to obtain a prediction block;
c5) the intra prediction smoothing filtering use flag is parsed from the code stream;
if the intra prediction smoothing filtering use flag is '0', no additional operation is performed on the current prediction block;
if the intra prediction smoothing filtering use flag is '1', d5) is executed;
d5) filtering the input prediction block by using an intra-frame prediction smoothing filter to obtain a filtered current decoding unit prediction block;
e5) and superposing the restored residual error information on the prediction block to obtain a reconstructed block of the current decoding unit, and outputting the reconstructed block after post-processing.
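As a rough illustration of steps b5)-e5), the decoder-side reconstruction can be sketched as below; `smooth_filter` is a hypothetical stand-in for the intra prediction smoothing filter, and the blocks are plain lists of pixel rows.

```python
# Hedged sketch of decoder steps b5)-e5): optionally smooth the prediction
# block according to the parsed use flag, then add the restored residual.

def decode_block(pred, residual, smooth_used, smooth_filter):
    """Return the reconstructed block = (optionally filtered) prediction + residual."""
    if smooth_used:          # c5)/d5): use flag '1' -> filter the prediction
        pred = smooth_filter(pred)
    # e5): reconstruction is a pixel-by-pixel superposition.
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, residual)]
```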
The specific flow of intra-frame prediction at the decoding end in filtering technical scheme 2 is as follows:
the decoder obtains a code stream, analyzes the code stream to obtain an intra-frame prediction smooth filtering allowable identification bit of the current video sequence, analyzes the code stream and performs inverse transformation and inverse quantization on the obtained residual information.
In the intra prediction decoding process:
① if the intra prediction smoothing filtering allowed flag is '1' and the area of the current CU is greater than or equal to 64 and less than 2048, all of the following steps are performed;
② if the intra prediction smoothing filtering allowed flag is '0', only steps a6), b6) and e6) are performed:
a6) acquiring a code stream, decoding to obtain residual information, and performing inverse transformation, inverse quantization and other processes to obtain time domain residual information;
b6) analyzing the code stream to obtain a prediction mode of a current decoding unit, and calculating according to the prediction mode of the current decoding unit and an adjacent reconstruction block to obtain a prediction block;
c6) the intra prediction smoothing filtering use flag is parsed from the code stream;
if the intra prediction smoothing filtering use flag is '0', no additional operation is performed on the current prediction block;
if the intra prediction smoothing filtering use flag is '1', d6) is executed;
d6) filtering the input prediction block twice, performing intra-frame prediction smoothing filtering on all prediction pixels in the prediction block for the first time, and performing intra-frame prediction smoothing filtering on left boundary pixels and upper boundary pixels in the prediction block after filtering for the second time to obtain a filtered prediction block of the current decoding unit;
e6) and superposing the restored residual error information on the prediction block to obtain a reconstructed block of the current decoding unit, and outputting the reconstructed block after post-processing.
When the intra prediction filtering technique filters the current coding unit or decoding unit, the current block needs to be padded first, and the following three padding schemes are possible.
Pixel filling scheme 1: pad with reconstructed pixels where they are available, otherwise pad with predicted pixels:
a7) if the reference pixels on the left side and the upper side outside the current prediction block are available, namely the reconstructed pixels are arranged on the left side and the upper side, the two rows on the left side and the upper side outside the prediction block are filled by the reconstructed pixels;
b7) if the reference pixels on the left side or the upper side outside the current prediction block are unavailable, namely the reconstructed pixels are not arranged on the left side or the upper side, filling the outer two rows or two columns by using the row or the column which is closest to the side in the current prediction block on the side without the reconstructed pixels;
c7) filling the outer two columns of the right adjacent columns outside the current prediction block by using the rightmost column prediction value in the current prediction block;
d7) filling the outer two rows of the lower adjacent rows outside the current prediction block by using the lowest predicted value in the current prediction block;
e7) and filling the upper right-corner pixel points outside the current prediction block by using the filled rightmost pixel points on the upper side outside the current prediction block, filling the lower right-corner pixel points outside the current prediction block by using the filled rightmost pixel points on the lower side outside the current prediction block, and filling the lower left-corner pixel points outside the current prediction block by using the filled bottommost pixel points on the left side outside the current prediction block.
Fig. 11A gives a schematic diagram of the prediction block under padding scheme 1, where pred.pixel denotes a pixel of the prediction block and recon.pixel denotes a reconstructed pixel used for padding.
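Under the assumption that the prediction block and the neighboring reconstructed pixels are available as arrays, padding scheme 1 (steps a7)-e7)) can be sketched as follows. The text does not spell out how the top-left corner is filled, so replicating the nearest filled top pixels there is an assumption of this sketch.

```python
import numpy as np

def pad_scheme1(pred, top_recon=None, left_recon=None):
    """Pad an HxW prediction block by 2 pixels on every side (steps a7)-e7)).

    top_recon: 2 x W reconstructed rows above the block, or None if unavailable;
    left_recon: H x 2 reconstructed columns left of the block, or None.
    """
    h, w = pred.shape
    out = np.zeros((h + 4, w + 4), dtype=pred.dtype)
    out[2:h + 2, 2:w + 2] = pred
    # a7)/b7): top two rows from reconstructed pixels if available,
    # otherwise replicate the topmost prediction row.
    out[0:2, 2:w + 2] = top_recon if top_recon is not None else pred[0]
    # a7)/b7): left two columns, same rule.
    out[2:h + 2, 0:2] = left_recon if left_recon is not None else pred[:, :1]
    # c7): right two columns replicate the rightmost prediction column.
    out[2:h + 2, w + 2:] = pred[:, -1:]
    # d7): bottom two rows replicate the bottom prediction row.
    out[h + 2:, 2:w + 2] = pred[-1]
    # e7): corners replicate the nearest already-filled edge pixels
    # (the top-left rule below is an assumption, see lead-in).
    out[0:2, w + 2:] = out[0:2, w + 1:w + 2]      # top-right
    out[h + 2:, w + 2:] = out[h + 2:, w + 1:w + 2]  # bottom-right
    out[h + 2:, 0:2] = out[h + 1:h + 2, 0:2]      # bottom-left
    out[0:2, 0:2] = out[0:2, 2:3]                 # top-left (assumption)
    return out
```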
Pixel filling scheme 2: all padding uses predicted pixels:
a8) the two outer rows adjacent above the current prediction block are filled using the topmost row of prediction values in the current prediction block;
b8) filling the outer two columns of the left adjacent columns outside the current prediction block by using the leftmost column of prediction values in the current prediction block;
c8) filling the outer two columns of the right adjacent columns outside the current prediction block by using the rightmost column of prediction values in the current prediction block;
d8) filling the outer two rows of the lower adjacent rows outside the current prediction block by using the prediction value of the lowest row in the current prediction block;
e8) and filling the upper right-corner pixel points outside the current prediction block by using the filled rightmost pixel points on the upper side outside the current prediction block, filling the lower right-corner pixel points outside the current prediction block by using the filled rightmost pixel points on the lower side outside the current prediction block, and filling the lower left-corner pixel points outside the current prediction block by using the filled bottommost pixel points on the left side outside the current prediction block.
Fig. 11B gives a schematic diagram of the prediction block under padding scheme 2, where pred.pixel denotes a pixel of the prediction block and recon.pixel denotes a padded pixel.
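Since scheme 2 replicates the outermost prediction rows, columns and corners outward, steps a8)-e8) reduce to plain edge padding; a minimal sketch:

```python
import numpy as np

def pad_scheme2(pred):
    """Steps a8)-e8): replicate the outermost prediction rows/columns (and
    corners) outward by two pixels, i.e. plain edge padding of the block."""
    return np.pad(pred, 2, mode="edge")
```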
Pixel filling scheme 3: where reconstructed pixels are available, the nearest single row or column of reconstructed pixels is used to fill the two outer rows or columns; otherwise predicted pixels are used for padding:
a9) if the reference pixels on the left side and the upper side outside the current prediction block are available, namely the reconstructed pixels are arranged on the left side and the upper side, two rows or two columns needing to be filled outside the prediction block are filled by using a first row on the left side and a first row on the upper side of the adjacent reconstructed pixels outside the prediction block;
b9) if the reference pixels on the left side or the upper side outside the current prediction block are unavailable, namely the reconstructed pixels are not arranged on the left side or the upper side, one side without the reconstructed pixels uses the row or the column which is closest to the side in the current prediction block to fill the outer two rows or two columns, and the two rows or two columns which need to be filled on one side with the reconstructed pixels are filled by using the reconstructed pixels on the first row or the first column;
c9) filling the outer two columns of the right adjacent columns outside the current prediction block by using the rightmost column prediction value in the current prediction block;
d9) filling the outer two rows of the lower adjacent rows outside the current prediction block by using the lowest predicted value in the current prediction block;
e9) and filling the upper right-corner pixel points outside the current prediction block by using the filled rightmost pixel points on the upper side outside the current prediction block, filling the lower right-corner pixel points outside the current prediction block by using the filled rightmost pixel points on the lower side outside the current prediction block, and filling the lower left-corner pixel points outside the current prediction block by using the filled bottommost pixel points on the left side outside the current prediction block.
Fig. 11C gives a schematic diagram of a padding scheme 3 prediction block.
The intra prediction filtering technique described above employs a simplified Gaussian convolution kernel to filter the prediction block. Several filter schemes are proposed herein, and a set of filter coefficients is provided in each scheme.
Filter scheme 1: 5 × 5 size, 25 taps
c1 c2 c3 c2 c1
c2 c4 c5 c4 c2
c3 c5 c6 c5 c3
c2 c4 c5 c4 c2
c1 c2 c3 c2 c1
Filtering each prediction pixel in the prediction block, the filtering formula is as follows:
P′(x,y) = c1·P(x-2,y-2) + c2·P(x-1,y-2) + c3·P(x,y-2) + c2·P(x+1,y-2) + c1·P(x+2,y-2)
+ c2·P(x-2,y-1) + c4·P(x-1,y-1) + c5·P(x,y-1) + c4·P(x+1,y-1) + c2·P(x+2,y-1)
+ c3·P(x-2,y) + c5·P(x-1,y) + c6·P(x,y) + c5·P(x+1,y) + c3·P(x+2,y)
+ c2·P(x-2,y+1) + c4·P(x-1,y+1) + c5·P(x,y+1) + c4·P(x+1,y+1) + c2·P(x+2,y+1)
+ c1·P(x-2,y+2) + c2·P(x-1,y+2) + c3·P(x,y+2) + c2·P(x+1,y+2) + c1·P(x+2,y+2)
in the above equation, P′(x,y) is the final prediction value at position (x,y) of the current coding unit; c1, c2, c3, c4, c5 and c6 are the coefficients of the approximate Gaussian convolution kernel above, where c1 is 0.0030, c2 is 0.0133, c3 is 0.0219, c4 is 0.0596, c5 is 0.0983 and c6 is 0.1621. P(x,y), P(x-1,y-1) and the other such parameters are the prediction values at positions (x,y), (x-1,y-1) and so on of the current coding unit, where the value ranges of x and y do not exceed the width and height of the current coding unit block.
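Arranging the coefficients according to the term positions in the formula gives the 5 × 5 kernel layout below; the sum check confirms that the listed coefficients are normalized up to rounding of the published values.

```python
# 5x5 approximate Gaussian kernel implied by the filter formula, using the
# coefficient values given in the text.
c1, c2, c3, c4, c5, c6 = 0.0030, 0.0133, 0.0219, 0.0596, 0.0983, 0.1621
kernel = [
    [c1, c2, c3, c2, c1],
    [c2, c4, c5, c4, c2],
    [c3, c5, c6, c5, c3],
    [c2, c4, c5, c4, c2],
    [c1, c2, c3, c2, c1],
]
total = sum(sum(row) for row in kernel)  # ~0.9997, normalized up to rounding
```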
The convolution kernel coefficients adopted by the intra prediction filtering technique can be approximated by integers whose sum is a power of 2, so that floating-point calculation and division can be avoided, greatly reducing computational complexity, as shown below:
(The integer-approximated convolution kernel follows the same symmetric 5 × 5 layout as above, with the coefficients scaled to integers whose sum is 1024.)
the sum of the filter coefficients is 1024, i.e. the calculated prediction value needs to be shifted to the right by 10 bits.
Filter scheme 2: 5 × 5 size, 13 taps
Each prediction pixel in the prediction block is filtered. The adopted filter is a diamond filter of size 5 × 5 with 13 taps. To save computing resources and reduce computational complexity, all filter kernel coefficients are integers and their sum is a power of 2. The filter kernel is as follows:
 0  0 13  0  0
 0 18 25 18  0
13 25 32 25 13
 0 18 25 18  0
 0  0 13  0  0
the sum of the filter coefficients is 256, i.e. the calculated prediction value needs to be shifted to the right by 8 bits.
The filter formula is as follows:
P′(x,y) = c1·P(x,y-2)
+ c2·P(x-1,y-1) + c3·P(x,y-1) + c2·P(x+1,y-1)
+ c1·P(x-2,y) + c3·P(x-1,y) + c4·P(x,y) + c3·P(x+1,y) + c1·P(x+2,y)
+ c2·P(x-1,y+1) + c3·P(x,y+1) + c2·P(x+1,y+1)
+ c1·P(x,y+2)
in the above equation, P′(x,y) is the final prediction value at position (x,y) of the current coding unit; c1, c2, c3 and c4 are the coefficients of the approximate Gaussian convolution kernel above, where c1 is 13, c2 is 18, c3 is 25 and c4 is 32. P(x,y), P(x-1,y-1) and the other such parameters are the prediction values at positions (x,y), (x-1,y-1) and so on of the current coding unit, where the value ranges of x and y do not exceed the width and height of the current coding unit block.
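A sketch of the 13-tap diamond filtering, under the assumption that the prediction block has already been padded by two pixels on each side; the kernel layout is reconstructed from the term positions in the formula, with zero entries marking unused taps.

```python
import numpy as np

# 13-tap diamond kernel built from the coefficients in the text
# (c1 = 13, c2 = 18, c3 = 25, c4 = 32); coefficients sum to 256 = 2**8.
DIAMOND = np.array([
    [ 0,  0, 13,  0,  0],
    [ 0, 18, 25, 18,  0],
    [13, 25, 32, 25, 13],
    [ 0, 18, 25, 18,  0],
    [ 0,  0, 13,  0,  0],
])

def filter_diamond(padded):
    """Filter a block padded by 2 pixels per side; the right shift by 8
    replaces division by the coefficient sum 256."""
    h, w = padded.shape[0] - 4, padded.shape[1] - 4
    out = np.empty((h, w), dtype=np.int64)
    for y in range(h):
        for x in range(w):
            out[y, x] = int((padded[y:y + 5, x:x + 5] * DIAMOND).sum()) >> 8
    return out
```

On a constant block the filter returns the same constant, since the integer coefficients sum exactly to 256.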
Filter scheme 3: 5 × 5 size, 21 taps
Each prediction pixel in the prediction block is filtered. The adopted filter is an octagonal filter of size 5 × 5 with 21 taps. To save computing resources and reduce computational complexity, all filter kernel coefficients are integers and their sum is a power of 2. The filter kernel is as follows:
  0  24  27  24   0
 24  62  88  62  24
 27  88 124  88  27
 24  62  88  62  24
  0  24  27  24   0
the sum of the filter coefficients is 1024, i.e. the calculated prediction value needs to be shifted to the right by 10 bits.
The filter formula is as follows:
P′(x,y) = c1·P(x-1,y-2) + c2·P(x,y-2) + c1·P(x+1,y-2)
+ c1·P(x-2,y-1) + c3·P(x-1,y-1) + c4·P(x,y-1) + c3·P(x+1,y-1) + c1·P(x+2,y-1)
+ c2·P(x-2,y) + c4·P(x-1,y) + c5·P(x,y) + c4·P(x+1,y) + c2·P(x+2,y)
+ c1·P(x-2,y+1) + c3·P(x-1,y+1) + c4·P(x,y+1) + c3·P(x+1,y+1) + c1·P(x+2,y+1)
+ c1·P(x-1,y+2) + c2·P(x,y+2) + c1·P(x+1,y+2)
in the above equation, P′(x,y) is the final prediction value at position (x,y) of the current coding unit; c1, c2, c3, c4 and c5 are the coefficients of the approximate Gaussian convolution kernel above, where c1 is 24, c2 is 27, c3 is 62, c4 is 88 and c5 is 124. P(x,y), P(x-1,y-1) and the other such parameters are the prediction values at positions (x,y), (x-1,y-1) and so on of the current coding unit, where the value ranges of x and y do not exceed the width and height of the current coding unit block.
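The octagonal kernel layout reconstructed from the term positions, with a check that the coefficients sum to 1024 = 2^10, which is what permits the 10-bit right shift in place of a division:

```python
# 21-tap octagonal kernel laid out from the term positions in the formula
# (c1 = 24, c2 = 27, c3 = 62, c4 = 88, c5 = 124); zeros mark unused taps.
c1, c2, c3, c4, c5 = 24, 27, 62, 88, 124
OCTAGON = [
    [ 0, c1, c2, c1,  0],
    [c1, c3, c4, c3, c1],
    [c2, c4, c5, c4, c2],
    [c1, c3, c4, c3, c1],
    [ 0, c1, c2, c1,  0],
]
total = sum(sum(row) for row in OCTAGON)  # 1024 == 2**10 -> 10-bit right shift
```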
The technique provided by the present application applies to the intra prediction encoding and decoding parts and provides an option for smoothing or locally blurring intra prediction. For parts of the image texture that should not be too sharp, using this technique makes the prediction pixels smoother and the prediction block closer to the original image, ultimately improving coding efficiency.
The combination of filtering scheme 1, pixel filling scheme 1 and filter scheme 1 was tested on HPM8.0, the official simulation platform of AVS, with smoothing filtering applied to the intra prediction block; the test results under the all-intra test condition and the random access condition are shown in Table 2.
TABLE 2 All Intra test results

Class                     Y        U        V
4K                     -0.62%   -0.51%   -0.70%
1080P                  -0.41%   -0.05%   -0.17%
720P                   -0.20%   -0.97%   -0.26%
Average performance    -0.41%   -0.51%   -0.38%
As can be seen from Table 2, this scheme achieves a good performance improvement under both test conditions.
Under the AI test condition, the luminance component achieves a BDBR saving of 0.41%, and the U and V components achieve BDBR savings of 0.51% and 0.38% respectively, a clear gain that effectively improves the coding efficiency of the encoder.
Viewed by resolution, the scheme yields a larger coding performance improvement on 4K video, which benefits the development of future ultra-high-definition video and saves more bit rate and bandwidth for ultra-high-resolution video.
According to the scheme, in the intra-frame prediction process, smooth filtering is carried out on the prediction block obtained by calculating the intra-frame prediction mode, the intra-frame prediction precision is improved, and the coding efficiency is effectively improved, specifically as follows:
1. two technical schemes of smoothing filtering for intra-frame prediction are provided, wherein the first filtering scheme uses a smoothing filtering technology for all pixels of an intra-frame prediction block; the second filtering scheme employs two filtering passes, the first pass using the smoothing filter for all pixels in the prediction block and the second pass using the smoothing filter only at the boundaries;
2. three filling schemes for the pixels outside the smoothing filter boundary are provided: in the first scheme, if reconstructed pixels of adjacent blocks are available, the reconstructed pixels are used to fill the two outer rows or columns; the second scheme fills the two outer rows or columns with the nearest boundary row or column inside the prediction block; in the third scheme, if reconstructed pixels of adjacent blocks are available, the nearest single row or column of reconstructed pixels is used to fill the two outer rows or columns, and on sides where reconstructed pixels are unavailable the nearest boundary row or column inside the prediction block is used;
three filter schemes are proposed, each filter providing a specific set of filter coefficients. The first filter scheme is a 25-tap square filter of size 5 × 5; the second filter scheme is a 13-tap diamond filter of 5 × 5 size; the third filtering scheme is a 21-tap octagon filter of 5 x 5 size.
The present application can be further extended from the following directions.
Extension scheme 1: adjust the filter shape and the number of taps in the filter schemes, reducing the number of taps to lower the computational load;
Extension scheme 2: adjust the filter coefficients in the filter schemes, filtering the prediction block with asymmetric coefficients to improve coding efficiency;
Extension scheme 3: filter prediction blocks of different prediction modes with different filters, improving the filtering effect on different textures through different frequency characteristics, thereby improving coding efficiency.
The embodiment of the present application provides an image encoding apparatus, which may be a video encoder. Specifically, the image encoding apparatus is configured to perform the steps performed by the video encoder in the above encoding method. The image encoding apparatus provided by the embodiment of the present application may include modules corresponding to the respective steps.
The present embodiment may divide the functional modules of the image encoding apparatus according to the above method, for example, each functional module may be divided according to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The division of the modules in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
Fig. 12 is a schematic diagram showing a possible configuration of the image encoding apparatus according to the above embodiment, in a case where each functional module is divided in correspondence with each function. As shown in fig. 12, the image encoding device 12 includes a dividing unit 120, a determining unit 121, and an superimposing unit 122.
A dividing unit 120, configured to divide an image, and determine coding information of a current coding block, where the coding information includes first indication information and second indication information, the first indication information is used to indicate whether intra prediction filtering is allowed, and the second indication information is used to indicate whether intra prediction smoothing filtering is allowed;
a determining unit 121, configured to determine an optimal prediction mode of the current coding block according to the coding information, set third indication information according to the optimal prediction mode, and transmit the optimal prediction mode index and the third indication information in a code stream;
and a superposition unit 122, configured to superpose the prediction block of the current coding block and a residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block.
All relevant contents of each step related to the above method embodiment may be referred to the functional description of the corresponding functional module, and are not described herein again. Of course, the image encoding apparatus provided in the embodiments of the present application includes, but is not limited to, the above modules, for example: the image encoding apparatus may further include a storage unit. The storage unit may be used to store program codes and data of the image encoding apparatus.
In the case of using an integrated unit, a schematic structural diagram of an image encoding device provided in an embodiment of the present application is shown in fig. 13. In fig. 13, the image encoding device 13 includes: a processing module 130 and a communication module 131. The processing module 130 is used for controlling and managing actions of the image encoding apparatus, for example, performing steps performed by the dividing unit 120, the determining unit 121, the superimposing unit 122, and/or other processes for performing the techniques described herein. The communication module 131 is used to support interaction between the image encoding apparatus and other devices. As shown in fig. 13, the image encoding apparatus may further include a storage module 132, and the storage module 132 is used for storing program codes and data of the image encoding apparatus, for example, contents stored in the storage unit.
The processing module 130 may be a processor or a controller, for example, a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination implementing computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 131 may be a transceiver, an RF circuit, a communication interface, or the like. The storage module 132 may be a memory.
For all relevant details of the scenarios in the above method embodiment, reference may be made to the functional description of the corresponding functional modules; they are not repeated here. The image encoding apparatus can perform the above image encoding method, and may specifically be a video image encoding apparatus or other equipment with a video encoding function.
The application also provides a video encoder, which comprises a nonvolatile storage medium and a central processing unit, wherein the nonvolatile storage medium stores an executable program, and the central processing unit is connected with the nonvolatile storage medium and executes the executable program to realize the image encoding method of the embodiment of the application.
The embodiment of the present application provides an image decoding apparatus, which may be a video decoder or a video decoding device. Specifically, the image decoding apparatus is configured to perform the steps performed by the video decoder in the above decoding method. The image decoding apparatus provided by the embodiment of the present application may include modules corresponding to the corresponding steps.
In the embodiment of the present application, the image decoding apparatus may be divided into functional modules according to the above method; for example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. The division of the modules in the embodiment of the present application is schematic and is only a logical function division; there may be other division manners in actual implementation.
In a case where each functional module is divided in correspondence with each function, fig. 14 shows a schematic diagram of a possible structure of the image decoding apparatus according to the above-described embodiment. As shown in fig. 14, the image decoding apparatus 14 includes a parsing unit 140, a determining unit 141, and a superimposing unit 142.
A parsing unit 140, configured to parse the code stream, and determine decoding information of a current decoded block, where the decoding information includes first indication information and third indication information, the first indication information is used to indicate whether intra-prediction filtering is allowed, and the third indication information is used to indicate whether intra-prediction smoothing filtering is used;
a determining unit 141, configured to determine a prediction block of the current decoded block according to the first indication information and the third indication information;
and a superposition unit 142, configured to superpose the restored residual information on the prediction block to obtain a reconstructed block of the current decoding unit.
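As a rough sketch of how units 140 to 142 interact: the decoder gates intra prediction smoothing on the two parsed flags, then superposes the restored residual. All names below are illustrative assumptions, not syntax elements defined by the patent, and the 3-tap filter merely stands in for whatever smoothing filter the codec actually defines.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class DecodingInfo:
    allow_intra_filtering: bool  # first indication information
    use_smoothing: bool          # third indication information

def smooth(pred: np.ndarray) -> np.ndarray:
    """Toy 3-tap horizontal [1, 2, 1]/4 smoothing filter standing in for
    the intra prediction smoothing filter selected by the third flag."""
    padded = np.pad(pred.astype(np.int32), ((0, 0), (1, 1)), mode='edge')
    out = (padded[:, :-2] + 2 * padded[:, 1:-1] + padded[:, 2:] + 2) >> 2
    return out.astype(pred.dtype)

def decode_block(info: DecodingInfo, pred: np.ndarray,
                 residual: np.ndarray) -> np.ndarray:
    # Smoothing is applied only when the high-level flag permits intra
    # prediction filtering AND the block-level flag selects smoothing.
    if info.allow_intra_filtering and info.use_smoothing:
        pred = smooth(pred)
    # Superpose the restored residual and clip to the 8-bit sample range.
    return np.clip(pred.astype(np.int32) + residual, 0, 255).astype(np.uint8)
```

Note the AND: when the first indication information disallows intra prediction filtering, the block-level smoothing flag has no effect.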
For all relevant details of the steps in the above method embodiment, reference may be made to the functional description of the corresponding functional modules; they are not repeated here. Of course, the image decoding apparatus provided in the embodiments of the present application includes, but is not limited to, the above modules; for example, the image decoding apparatus may further include a storage unit. The storage unit may be used to store program codes and data of the image decoding apparatus.
In the case of using an integrated unit, a schematic structural diagram of an image decoding apparatus provided in an embodiment of the present application is shown in fig. 15. In fig. 15, the image decoding apparatus includes: a processing module 150 and a communication module 151. The processing module 150 is used for controlling and managing actions of the image decoding apparatus, for example, performing steps performed by the parsing unit 140, the determining unit 141, and the superimposing unit 142, and/or other processes for performing the techniques described herein. The communication module 151 is used to support interaction between the image decoding apparatus and other devices. As shown in fig. 15, the image decoding apparatus may further include a storage module 152, and the storage module 152 is used for storing program codes and data of the image decoding apparatus, for example, contents stored in the storage unit.
The processing module 150 may be a processor or a controller, for example, a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination implementing computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 151 may be a transceiver, an RF circuit, a communication interface, or the like. The storage module 152 may be a memory.
For all relevant details of the scenarios in the above method embodiment, reference may be made to the functional description of the corresponding functional modules; they are not repeated here. The image decoding apparatus can perform the above image decoding method, and may specifically be a video image decoding apparatus or other equipment with a video decoding function.
The application also provides a video decoder, which comprises a nonvolatile storage medium and a central processing unit, wherein the nonvolatile storage medium stores an executable program, and the central processing unit is connected with the nonvolatile storage medium and executes the executable program to realize the image decoding method of the embodiment of the application.
The present application further provides a terminal, including one or more processors, a memory, and a communication interface. The memory and the communication interface are connected to the one or more processors. The memory is used to store computer program code, the computer program code including instructions which, when executed by the one or more processors, cause the terminal to perform the image encoding and/or image decoding methods of the embodiments of the present application. The terminal may be a video display device, a smart phone, a portable computer, or other equipment that can process or play video.
Another embodiment of the present application further provides a computer-readable storage medium including one or more program codes, where the one or more program codes include instructions; when a processor in a decoding device executes the program codes, the decoding device performs the image encoding method and the image decoding method of the embodiments of the present application.
In another embodiment of the present application, there is also provided a computer program product comprising computer executable instructions stored in a computer readable storage medium; the at least one processor of the decoding device may read the computer executable instructions from the computer readable storage medium, and the execution of the computer executable instructions by the at least one processor causes the terminal to implement the image encoding method and the image decoding method of the embodiments of the present application.
In the above embodiments, all or part of the implementation may be realized by software, hardware, firmware, or any combination thereof. When implemented using a software program, the implementation may take the form of a computer program product, in whole or in part. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the present application are produced, in whole or in part.
The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via a wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) connection.
The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that incorporates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of functional modules is merely an example; in practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical functional division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another device, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or a plurality of physical units, that is, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for enabling a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program codes, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. An image encoding method, comprising:
dividing an image, and determining coding information of a current coding block, wherein the coding information comprises first indication information and second indication information, the first indication information being used to indicate whether intra prediction filtering is allowed, and the second indication information being used to indicate whether intra prediction smoothing filtering is allowed;
determining an optimal prediction mode of the current coding block according to the coding information, setting third indication information according to the optimal prediction mode, and transmitting an index of the optimal prediction mode and the third indication information in a code stream; and
superposing the prediction block of the current coding block and a residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block.

2. An image decoding method, comprising:
parsing a code stream, and determining decoding information of a current decoded block, wherein the decoding information comprises first indication information and third indication information, the first indication information being used to indicate whether intra prediction filtering is allowed, and the third indication information being used to indicate whether intra prediction smoothing filtering is used;
determining a prediction block of the current decoded block according to the first indication information and the third indication information; and
superposing restored residual information on the prediction block to obtain a reconstructed block of the current decoding unit.

3. An image encoding apparatus, comprising:
a dividing unit, configured to divide an image and determine coding information of a current coding block, wherein the coding information comprises first indication information and second indication information, the first indication information being used to indicate whether intra prediction filtering is allowed, and the second indication information being used to indicate whether intra prediction smoothing filtering is allowed;
a determining unit, configured to determine an optimal prediction mode of the current coding block according to the coding information, set third indication information according to the optimal prediction mode, and transmit an index of the optimal prediction mode and the third indication information in a code stream; and
a superposition unit, configured to superpose the prediction block of the current coding block and a residual block obtained after inverse transformation and inverse quantization to obtain a reconstructed block.

4. An image decoding apparatus, comprising:
a parsing unit, configured to parse a code stream and determine decoding information of a current decoded block, wherein the decoding information comprises first indication information and third indication information, the first indication information being used to indicate whether intra prediction filtering is allowed, and the third indication information being used to indicate whether intra prediction smoothing filtering is used;
a determining unit, configured to determine a prediction block of the current decoded block according to the first indication information and the third indication information; and
a superposition unit, configured to superpose restored residual information on the prediction block to obtain a reconstructed block of the current decoding unit.

5. An encoder, comprising a non-volatile storage medium and a central processing unit, wherein the non-volatile storage medium stores an executable program, the central processing unit is connected to the non-volatile storage medium, and when the central processing unit executes the executable program, the encoder performs the bidirectional inter prediction method according to claim 1 or 2.

6. A decoder, comprising a non-volatile storage medium and a central processing unit, wherein the non-volatile storage medium stores an executable program, the central processing unit is connected to the non-volatile storage medium, and when the central processing unit executes the executable program, the decoder performs the bidirectional inter prediction method according to claim 1 or 2.

7. A terminal, comprising one or more processors, a memory, and a communication interface, wherein the memory and the communication interface are connected to the one or more processors, and the terminal communicates with other devices through the communication interface; the memory is configured to store computer program code, the computer program code comprising instructions; and when the one or more processors execute the instructions, the terminal performs the method according to claim 1 or 2.

8. A computer program product containing instructions, wherein when the computer program product runs on a terminal, the terminal is caused to perform the method according to claim 1 or 2.

9. A computer-readable storage medium comprising instructions, wherein when the instructions run on a terminal, the terminal is caused to perform the method according to claim 1 or 2.
CN202010748923.0A 2020-07-29 2020-07-29 Image encoding method, image decoding method and related devices Active CN114071161B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010748923.0A CN114071161B (en) 2020-07-29 2020-07-29 Image encoding method, image decoding method and related devices
TW110123862A TW202209879A (en) 2020-07-29 2021-06-29 Image encoding method, image decoding method and related devices capable of improving the accuracy of intra-frame prediction and encoding efficiency

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010748923.0A CN114071161B (en) 2020-07-29 2020-07-29 Image encoding method, image decoding method and related devices

Publications (2)

Publication Number Publication Date
CN114071161A true CN114071161A (en) 2022-02-18
CN114071161B CN114071161B (en) 2023-03-31

Family

ID=80227085

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010748923.0A Active CN114071161B (en) 2020-07-29 2020-07-29 Image encoding method, image decoding method and related devices

Country Status (2)

Country Link
CN (1) CN114071161B (en)
TW (1) TW202209879A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116156180A (en) * 2023-04-19 2023-05-23 北京中星微人工智能芯片技术有限公司 Intra prediction method, image encoding method, image decoding method, and device
WO2024108931A1 (en) * 2022-11-23 2024-05-30 华为技术有限公司 Video encoding and decoding methods and apparatus
WO2024145850A1 (en) * 2023-01-04 2024-07-11 Oppo广东移动通信有限公司 Coding method, decoding method, code stream, coder, decoder, and storage medium
WO2025011415A1 (en) * 2023-07-11 2025-01-16 维沃移动通信有限公司 Intra-frame prediction method and apparatus, reference object determination method and apparatus, and electronic device
WO2025073085A1 (en) * 2023-10-04 2025-04-10 Oppo广东移动通信有限公司 Encoding method, decoding method, encoder, decoder and storage medium

Citations (10)

Publication number Priority date Publication date Assignee Title
US20120147955A1 (en) * 2010-12-10 2012-06-14 Madhukar Budagavi Mode Adaptive Intra Prediction Smoothing in Video Coding
CN103581647A (en) * 2013-09-29 2014-02-12 北京航空航天大学 Depth map sequence fractal coding method based on motion vectors of color video
CN103609124A (en) * 2011-06-15 2014-02-26 华为技术有限公司 Mode dependent intra smoothing filter table mapping methods for non-square prediction units
CN104125473A (en) * 2014-07-31 2014-10-29 南京理工大学 3D (three dimensional) video depth image intra-frame predicting mode selecting method and system
CN109889852A (en) * 2019-01-22 2019-06-14 四川大学 An optimization method for HEVC intra-frame coding based on neighboring values
CN110267041A (en) * 2019-06-28 2019-09-20 Oppo广东移动通信有限公司 Image encoding method, image encoding device, electronic device, and computer-readable storage medium
WO2019245261A1 (en) * 2018-06-18 2019-12-26 세종대학교 산학협력단 Method and apparatus for encoding/decoding image
CN111294592A (en) * 2020-05-13 2020-06-16 腾讯科技(深圳)有限公司 Video information processing method, multimedia information processing method and device
CN111327904A (en) * 2018-12-15 2020-06-23 华为技术有限公司 Image reconstruction method and device
WO2020130889A1 (en) * 2018-12-21 2020-06-25 Huawei Technologies Co., Ltd. Method and apparatus of mode- and size-dependent block-level restrictions

Patent Citations (10)

Publication number Priority date Publication date Assignee Title
US20120147955A1 (en) * 2010-12-10 2012-06-14 Madhukar Budagavi Mode Adaptive Intra Prediction Smoothing in Video Coding
CN103609124A (en) * 2011-06-15 2014-02-26 华为技术有限公司 Mode dependent intra smoothing filter table mapping methods for non-square prediction units
CN103581647A (en) * 2013-09-29 2014-02-12 北京航空航天大学 Depth map sequence fractal coding method based on motion vectors of color video
CN104125473A (en) * 2014-07-31 2014-10-29 南京理工大学 3D (three dimensional) video depth image intra-frame predicting mode selecting method and system
WO2019245261A1 (en) * 2018-06-18 2019-12-26 세종대학교 산학협력단 Method and apparatus for encoding/decoding image
CN111327904A (en) * 2018-12-15 2020-06-23 华为技术有限公司 Image reconstruction method and device
WO2020130889A1 (en) * 2018-12-21 2020-06-25 Huawei Technologies Co., Ltd. Method and apparatus of mode- and size-dependent block-level restrictions
CN109889852A (en) * 2019-01-22 2019-06-14 四川大学 An optimization method for HEVC intra-frame coding based on neighboring values
CN110267041A (en) * 2019-06-28 2019-09-20 Oppo广东移动通信有限公司 Image encoding method, image encoding device, electronic device, and computer-readable storage medium
CN111294592A (en) * 2020-05-13 2020-06-16 腾讯科技(深圳)有限公司 Video information processing method, multimedia information processing method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张洪彬; 伏长虹; 陈锐霖; 萧允治; 苏卫民: "Fast intra-frame coding method for 3D-HEVC depth images" *

Cited By (6)

Publication number Priority date Publication date Assignee Title
WO2024108931A1 (en) * 2022-11-23 2024-05-30 华为技术有限公司 Video encoding and decoding methods and apparatus
WO2024145850A1 (en) * 2023-01-04 2024-07-11 Oppo广东移动通信有限公司 Coding method, decoding method, code stream, coder, decoder, and storage medium
CN116156180A (en) * 2023-04-19 2023-05-23 北京中星微人工智能芯片技术有限公司 Intra prediction method, image encoding method, image decoding method, and device
CN116156180B (en) * 2023-04-19 2023-06-23 北京中星微人工智能芯片技术有限公司 Intra prediction method, image encoding method, image decoding method, and device
WO2025011415A1 (en) * 2023-07-11 2025-01-16 维沃移动通信有限公司 Intra-frame prediction method and apparatus, reference object determination method and apparatus, and electronic device
WO2025073085A1 (en) * 2023-10-04 2025-04-10 Oppo广东移动通信有限公司 Encoding method, decoding method, encoder, decoder and storage medium

Also Published As

Publication number Publication date
CN114071161B (en) 2023-03-31
TW202209879A (en) 2022-03-01

Similar Documents

Publication Publication Date Title
WO2021238540A1 (en) Image encoding method, image decoding method, and related apparatuses
CN113497937B (en) Image coding method, image decoding method and related device
CN114071161B (en) Image encoding method, image decoding method and related devices
CN115002485B (en) Image encoding method, image decoding method and related devices
WO2022037300A1 (en) Encoding method, decoding method, and related devices
CN115398906B (en) Method for signaling video encoding data
US12010325B2 (en) Intra block copy scratch frame buffer
CN118101967B (en) Method for position-dependent spatially varying transformation for video coding and decoding
WO2022022622A1 (en) Image coding method, image decoding method, and related apparatus
CN114868393B (en) Method for performing surround motion compensation
US20250039437A1 (en) Methods and devices for multi-hypothesis-based prediction
CN113965764B (en) Image encoding method, image decoding method and related device
TWI882138B (en) Image encoding method, image decoding method and related device
TWI882120B (en) Image encoding method, image decoding method and related device
CN117581547A (en) Side-window bilateral filtering for video encoding and decoding
WO2025098769A1 (en) Joint adaptive in-loop and output filter
CN119605183A (en) Film grain compositing using coded information
CN119325706A (en) System and method for angular intra-mode coding
CN119999202A (en) System and method for transform selection of intra prediction modes based on extrapolation filters
CN120077654A (en) System and method for intra prediction mode based on adaptive extrapolation filter

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant