
CN117979031A - Image filtering method, storage medium, electronic device and product

Info

Publication number: CN117979031A
Application number: CN202410169233.8A
Authority: CN (China)
Prior art keywords: gradient, coding unit, pixel, values, image block
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 马学睿, 李一鸣, 郭宇
Current Assignee: Tencent Technology Shenzhen Co Ltd
Original Assignee: Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd


Classifications

All within H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals.

    • H04N19/82: Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation, involving filtering within a prediction loop
    • H04N19/122: Selection of transform size, e.g. 8x8 or 2x4x8 DCT; selection of sub-band transforms of varying structure or type
    • H04N19/126: Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
    • H04N19/192: Adaptive coding characterised by the adaptation method, adaptation tool or adaptation type being iterative or recursive
    • H04N19/198: Adaptive coding specially adapted for the computation of encoding parameters, including smoothing of a sequence of encoding parameters, e.g. by averaging, by choice of the maximum, minimum or median value
    • H04N19/61: Transform coding in combination with predictive coding
    • H04N19/88: Pre-processing or post-processing involving rearrangement of data among different coding units, e.g. shuffling, interleaving, scrambling or permutation of pixel data or permutation of transform coefficient data among different blocks
    • H04N19/96: Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application discloses an image filtering method, a storage medium, an electronic device and a computer program product. The image filtering method comprises: calculating a gradient matrix based on the pixel reconstruction values of the pixels of a reconstructed image block; determining gradient information of each coding unit of the reconstructed image block based on the gradient matrix; and determining the filter parameters of each coding unit based on the gradient information of that coding unit. Because the gradient matrix corresponding to a reconstructed image block is computed directly from the pixel reconstruction values of all of its pixels, the gradient value of every pixel can be computed in parallel up to the maximum parallelism the device allows, quickly producing a complete gradient matrix; the filter parameters of each coding unit are then computed solely from this complete gradient matrix, which improves the computation efficiency of the filter parameters and thus the image filtering efficiency.

Description

Image filtering method, storage medium, electronic device and product
Technical Field
The present application relates to encoding and decoding technologies, and in particular, to an image filtering method, a computer readable storage medium, an electronic device, and a computer program product.
Background
Adaptive Loop Filtering (ALF), a post-processing technique in video encoding and decoding, is applied in the decoding loop to reduce the block artifacts and noise generated during video encoding.
When adaptive loop filtering calculates the filter parameters, each coding unit is traversed and its filter parameters are computed one by one; during the traversal of each coding unit, it must continually be checked whether the gradient of every pixel that coding unit requires has already been calculated. This reduces the parallelism of the computation, so the filter parameters are computed inefficiently.
Disclosure of Invention
The present application aims to solve at least one of the technical problems in the prior art. To this end, the application provides an image filtering method, a computer-readable storage medium, an electronic device and a computer program product that separate the calculation of the gradient matrix from the calculation of the filter parameters, so the per-coding-unit check can be omitted; this improves the parallelism of the gradient matrix computation and thereby the computation efficiency of the filter parameters.
In a first aspect, the present application provides an image filtering method comprising calculating a gradient matrix based on pixel reconstruction values of respective pixels of a reconstructed image block; determining gradient information of each coding unit of each reconstructed image block based on the gradient matrix; and determining filter parameters of each coding unit based on gradient information of each coding unit.
In a second aspect, another image filtering method provided by the present application includes calculating a gradient matrix based on pixel values of respective pixels of an encoded image block; determining gradient information of each coding unit of each coded image block based on the gradient matrix; and determining filter parameters of each coding unit based on gradient information of each coding unit.
In a third aspect, the present application provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described image filtering method.
In a fourth aspect, the present application provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the above image filtering method when executing the program.
According to the image filtering method, computer-readable storage medium and electronic device provided by the embodiments of the application, the computation of the gradient matrix is separated from the per-coding-unit traversal that computes the filter parameters. First, the gradient matrix corresponding to a reconstructed image block is calculated from the pixel reconstruction values of all of its pixels; no per-pixel check is needed, so the gradient values of all pixels can be computed in parallel up to the maximum parallelism the device allows, and the complete gradient matrix is generated quickly. Then, the gradient information of each coding unit is calculated solely from the complete gradient matrix, and the filter parameters of each coding unit are derived from that gradient information. This improves the computation efficiency of the filter parameters and therefore the image filtering efficiency.
Drawings
The foregoing and/or additional aspects and advantages of the application will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
Fig. 1 is a schematic diagram of a video coding principle according to an embodiment of the present application;
Fig. 2 is a schematic diagram of the inter prediction principle provided by an embodiment of the present application;
Fig. 3 is a schematic view of a slice structure provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of the filtering principle of a filter for versatile video coding according to an embodiment of the present application;
Fig. 5 is a schematic diagram of the shape of a filter according to an embodiment of the present application;
Fig. 6 is a schematic diagram of the shape of a filter provided by an embodiment of the present application;
Fig. 7 is a schematic diagram of a video codec system according to an embodiment of the present application;
Fig. 8 is a schematic flow chart of an image filtering method according to an embodiment of the present application;
Fig. 9 is a schematic diagram of an image filtering method according to an embodiment of the present application;
Fig. 10 is a schematic flow chart of an image filtering method according to an embodiment of the present application;
Fig. 11 is a schematic diagram of an image filtering method according to an embodiment of the present application;
Figs. 12 and 13 are schematic flow diagrams of an image filtering method according to an embodiment of the present application;
Fig. 14 is a schematic diagram of an image filtering method according to an embodiment of the present application;
Fig. 15 is a schematic flow chart of an image filtering method according to an embodiment of the present application;
Fig. 16 is a flowchart of another image filtering method according to an embodiment of the present application;
Fig. 17 is a schematic structural diagram of an image filtering device according to an embodiment of the present application;
Fig. 18 is a schematic structural diagram of an image filtering device according to an embodiment of the present application;
Fig. 19 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
Fig. 20 is a schematic diagram of the hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the application.
The following describes some technical terms involved in the present application:
1. video coding:
From the viewpoint of signal acquisition, video signals can be divided into two kinds: those captured by a camera and those generated by a computer. Because their statistical properties differ, the corresponding compression coding modes may also differ.
Referring to fig. 1, modern mainstream video coding technologies, taking the international video coding standards HEVC/H.265 (High Efficiency Video Coding), VVC and AVS3 as examples, adopt a hybrid coding framework that performs the following series of operations and processes on the input original video signal:
(1) Block partition structure: the input image (input video pictures) is divided into a number of non-overlapping processing units of equal size, and each processing unit undergoes a similar compression operation. This processing unit is called a CTU (Coding Tree Unit) or LCU (Largest Coding Unit). A CTU may be further divided into finer partitions to obtain one or more basic coding units, called CUs. Each CU is the most basic element of an encoding pass. Described below are the various coding schemes that may be employed for each CU.
CTU rows: in the codec process, the picture is divided into CTU blocks of equal size, which are then processed one by one. A CTU row is a horizontal row of these CTU blocks. A codec may process multiple CTU rows in parallel, thereby exploiting the performance of a multi-core processor.
For example, assume an image of size 1920x1080 with a CTU size of 64x64. The image is divided into 30 (1920/64) CTU columns and 17 (1080/64, rounded up) CTU rows. The codec processes these CTU blocks row by row: first the CTU blocks of the first row, then those of the second row, and so on. In a multi-threaded environment, multiple CTU rows may be processed simultaneously to increase codec speed.
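As a small illustration of the arithmetic above, the grid dimensions can be computed with a rounding-up division (the rounding-up for pictures whose dimensions are not multiples of the CTU size is an assumption of this sketch):

```cpp
// Number of CTU columns and rows for a given picture size, rounding up
// so that partial CTUs at the right and bottom edges are counted
// (1920x1080 with 64x64 CTUs gives 30 columns and 17 rows).
int ctuCols(int width, int ctuSize)  { return (width  + ctuSize - 1) / ctuSize; }
int ctuRows(int height, int ctuSize) { return (height + ctuSize - 1) / ctuSize; }
```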
(2) Predictive coding: this includes intra-frame prediction (intra-picture prediction), inter-frame prediction (motion-compensated prediction) and other modes. A prediction of the original video signal is formed from selected reconstructed video signals, and subtracting it from the original signal yields the residual video signal. The encoding side needs to decide, among the many possible predictive coding modes, the most suitable one for the current CU and inform the decoding side.
Currently, the mainstream video coding standards, such as HEVC, VVC and AVS3, as well as AV1 (Alliance for Open Media Video 1), the first-generation video coding standard established by the Alliance for Open Media, and AV2 (Alliance for Open Media Video 2), its second-generation standard, all adopt a block-based hybrid coding framework. This framework divides the original video data into a series of coding blocks and combines video coding methods such as prediction, transform and entropy coding to compress the video data. Among them, motion compensation is a class of prediction methods commonly used in video coding: based on the redundancy of video content in the time domain or the space domain, it derives the prediction value of the current coding block from a coded region. Such prediction methods include inter prediction, intra block copy prediction, intra string copy prediction, etc.; these prediction methods may be used alone or in combination in a particular coding implementation. For coding blocks using these prediction methods, it is usually necessary to explicitly or implicitly encode one or more two-dimensional displacement vectors in the code stream, indicating the displacement of the current block (or of co-located blocks of the current block) relative to its one or more reference blocks.
It should be noted that the displacement vectors may have different names under different prediction modes and implementations, which are described herein as follows: 1) the displacement vector in inter prediction is called a Motion Vector (MV); 2) the displacement vector in intra block copy is called a Block Vector (BV); 3) the displacement vector in intra string copy is called a String Vector (SV). Techniques related to inter prediction and intra block copy prediction are described below.
A. intra prediction: the predicted signal comes from the already encoded reconstructed region within the same image.
B. Inter prediction: the predicted signal comes from an already encoded picture other than the current picture (called the reference picture).
As shown in fig. 2, inter prediction uses the correlation of the video time domain to predict the pixels of the current image from the pixels of neighboring encoded images, which effectively removes video temporal redundancy and saves bits for encoding the residual data. Here, P is the current frame, Pr is the reference frame, B is the current block to be encoded, and Br is the reference block of B. B' has the same coordinate position in its image as B; Br's coordinates are (xr, yr) and B's coordinates are (x, y). The displacement between the current coding block and its reference block is called the Motion Vector (MV), i.e. MV = (xr - x, yr - y).
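The motion-vector definition above can be transcribed directly; the struct and function names below are illustrative only:

```cpp
// MV = (xr - x, yr - y): displacement of the current block B at (x, y)
// relative to its reference block Br at (xr, yr).
struct MotionVector { int dx, dy; };

MotionVector motionVector(int x, int y, int xr, int yr) {
    return {xr - x, yr - y};
}
```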
Considering that the temporal or spatial neighboring blocks have a strong correlation, MV prediction techniques can be used to further reduce the bits required to encode MVs. In h.265/HEVC, inter prediction includes both Merge and AMVP MV prediction techniques.
(3) Transform coding and quantization: the residual video signal undergoes a transform operation such as the Discrete Fourier Transform (DFT) or the Discrete Cosine Transform (DCT) to convert the signal into the transform domain, producing what are called transform coefficients. The signal in the transform domain is then subjected to a lossy quantization operation, which discards some information so that the quantized signal is easier to compress. In some video coding standards there may be more than one transform to choose from, so the encoding side also needs to select one of the transforms for the current coding CU and inform the decoding side. The fineness of quantization is usually determined by the quantization parameter (QP): with a larger QP value, coefficients over a larger range of values are quantized to the same output, which usually brings greater distortion and a lower code rate; conversely, with a smaller QP value, coefficients over a smaller range are quantized to the same output, which usually brings less distortion while corresponding to a higher code rate.
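To make the QP behavior concrete, the sketch below uses the exponential step-size convention of H.264/HEVC, Qstep = 2^((QP-4)/6); this convention is an assumption brought in for illustration and is not stated in this document:

```cpp
#include <cmath>

// Approximate quantization step size under the H.264/HEVC convention:
// every increase of QP by 6 doubles the step, i.e., coarser quantization,
// more distortion, and a lower code rate.
double quantStep(int qp) { return std::pow(2.0, (qp - 4) / 6.0); }

// Scalar quantization of one transform coefficient (sketch only).
int quantize(double coeff, int qp) {
    return static_cast<int>(std::lround(coeff / quantStep(qp)));
}
```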
(4) Entropy coding or statistical coding: the quantized transform-domain signal is statistically compressed according to the frequency of occurrence of each value, and finally a binary (0 or 1) compressed code stream is output. Meanwhile, the encoding process generates other information, such as the selected mode and motion vectors, which also requires entropy coding to reduce the code rate. Statistical coding is a lossless coding mode that can effectively reduce the code rate required to express the same signal. Common statistical coding schemes are Variable Length Coding (VLC) and Context-based Adaptive Binary Arithmetic Coding (CABAC).
(5) Loop filtering: the encoded image, after inverse quantization, inverse transform and prediction compensation (the inverse operations of (2) to (4) above), yields a reconstructed decoded image. Compared with the original image, part of the information of the reconstructed image differs from the original because of the quantization effect, producing distortion. Filtering the reconstructed image, for example with deblocking, Sample Adaptive Offset (SAO) or ALF, can effectively reduce the distortion introduced by quantization. Since these filtered reconstructed images serve as references for subsequently encoded images to predict future signals, the above filtering operations are also called loop filtering, i.e., filtering operations inside the coding loop.
2. Video decoding: as the inverse process of video coding, and following the above coding process, at the decoding end the decoder first performs entropy decoding for each CU after obtaining the video bitstream, obtaining the various mode information and the quantized transform coefficients. The coefficients undergo inverse quantization and inverse transform to obtain the residual signal. On the other hand, from the known coding mode information, the prediction signal corresponding to the CU can be obtained; adding the residual signal and the prediction signal yields the reconstructed signal. Finally, the reconstructed values of the decoded image require a loop filtering operation to produce the final output signal.
3. Video code stream structure
(1) Video sequence
The video sequence is the highest-level syntax structure of the bitstream (i.e., the video bitstream). A video sequence starts with its first sequence header, and a sequence end code or video edit code indicates its end. Sequence headers between the first sequence header of the video sequence and the first occurring sequence end code or video edit code are repeated sequence headers. Each sequence header is followed by one or more encoded pictures, each picture preceded by a picture header. The encoded pictures are arranged in bitstream order, which should be the same as the decoding order. The decoding order may differ from the display order.
(2) Image
An image may be a frame or a field whose encoded data starts with an image start code and ends with a sequence start code, a sequence end code or the next image start code.
The image types include: I image, P image and B image. A fully intra-coded frame is called an I frame; a frame generated by referring to a previous frame and encoding only the difference is called a P frame; and a frame coded with reference to both preceding and following frames is called a B frame.
(3) Slice
A slice is a rectangular region in an image (e.g., region A, region B, etc.) that contains part of the largest coding units within the image; slices should not overlap with each other. The slice structure is shown in fig. 3.
(4) Maximum coding unit, coding tree and coding unit
The image is divided into maximum coding units (such as Coding Tree Units (CTUs) in video coding). The maximum coding units should not overlap; samples at the upper-left corner of a maximum coding unit should not exceed the image boundary, while samples at its lower-right corner may exceed the image boundary.
The coding tree determines how the maximum coding unit is divided into a plurality of coding units (e.g., CUs in video coding), for example by binary tree, quadtree or enhanced quadtree partitioning; a coding unit may be referred to as an image block.
The coding unit is divided into one or more transform blocks, which are basic units in performing transform coding.
4. ALF and Cross-Component Adaptive Loop Filtering (CCALF) in Versatile Video Coding (VVC)
As loop filters newly adopted in VVC, ALF and CCALF are Wiener filters that adaptively determine their filter coefficients according to the video content, thereby reducing the mean square error (MSE) between the reconstructed component and the original component. The input of ALF is the reconstructed pixel values before ALF processing, and the output is an enhanced reconstructed luminance image and reconstructed chrominance images. As an adaptive filter, the Wiener filter can produce different filter coefficients for video content with different characteristics. Accordingly, ALF first classifies the video content and uses a corresponding filter for each class of video content. In the VVC design, each 4x4 block is classified into one of 25 classes according to its own directionality and activity. Corresponding filter coefficients are calculated for each class of video content.
For the luminance component, in addition to adaptation at the 4x4 block level, VVC supports ALF adaptive switching at the CTU level. Each CTU may use the filter bank generated by the current slice, a filter bank generated by an encoded slice, or a set of 16 fixed filters from an offline-trained fixed filter bank. Inside the CTU, each 4x4 block selects a filter of the corresponding class from the filter bank for filtering according to its own class. The filter coefficients of the Adaptation Parameter Set (APS) and the corresponding clipping indices are transmitted to the decoding end via alf_aps. One alf_aps may contain one luma filter bank (comprising up to 25 filters) and up to 8 chroma filters.
CCALF uses the luminance component to correct the chrominance component. The process flows of ALF and CCALF are shown in FIG. 4. CCALF uses the luminance component as input and outputs a correction value for the chrominance component. The two chrominance components may each independently control whether the corresponding correction value is used. The correction value and the output of the chrominance ALF together constitute the final chrominance component.
4.1 ALF Filter shape
In VVC [1], ALF uses diamond filters of two different shapes as shown in fig. 5. Wherein the luminance component uses a 7x7 diamond filter and the chrominance component uses a 5x5 diamond filter.
4.2 Pixel Block Classification and geometric transformation
For the luminance component, ALF adaptively uses different filters at the sub-block level (4x4), i.e., each 4x4 pixel block is classified into one of 25 classes. For the chrominance components, ALF does not classify at the sub-block level; all chroma pixels within one CTU use the same filter. The classification index C of a luma pixel block is jointly derived from the directionality D and the quantized activity Â of the block, as follows:
C = 5D + Â   (2-1)
To calculate D and Â, the gradient values in the horizontal, vertical, diagonal and anti-diagonal directions of each pixel within the 4x4 pixel block first need to be calculated:
H_{k,l} = |2R(k,l) - R(k-1,l) - R(k+1,l)|   (2-2)
V_{k,l} = |2R(k,l) - R(k,l-1) - R(k,l+1)|   (2-3)
D0_{k,l} = |2R(k,l) - R(k-1,l-1) - R(k+1,l+1)|   (2-4)
D1_{k,l} = |2R(k,l) - R(k-1,l+1) - R(k+1,l-1)|   (2-5)
Based on these pixel gradients, the overall horizontal, vertical, diagonal and anti-diagonal gradients of each 4x4 block are calculated as follows:
g_h = Σ_{k=i-2..i+3} Σ_{l=j-2..j+3} H_{k,l},  g_v = Σ_{k=i-2..i+3} Σ_{l=j-2..j+3} V_{k,l}   (2-6)
g_{d0} = Σ_{k=i-2..i+3} Σ_{l=j-2..j+3} D0_{k,l},  g_{d1} = Σ_{k=i-2..i+3} Σ_{l=j-2..j+3} D1_{k,l}   (2-7)
where i and j are the coordinates of the top-left pixel of the 4x4 pixel block, and R(k,l) is the reconstructed pixel value at position (k,l) before ALF filtering.
After the gradient values of the pixel block are obtained, the maximum and minimum of the horizontal and vertical gradient values are:
g^max_{h,v} = max(g_h, g_v),  g^min_{h,v} = min(g_h, g_v)   (2-8)
and the maximum and minimum of the diagonal and anti-diagonal gradient values are:
g^max_{d0,d1} = max(g_{d0}, g_{d1}),  g^min_{d0,d1} = min(g_{d0}, g_{d1})   (2-9)
The directionality D is derived by comparing the maxima and minima obtained in formulas (2-8) and (2-9) against two thresholds t1 and t2:
Step 1: if g^max_{h,v} ≤ t1 · g^min_{h,v} and g^max_{d0,d1} ≤ t1 · g^min_{d0,d1}, D is set to 0.
Step 2: if g^max_{h,v} / g^min_{h,v} > g^max_{d0,d1} / g^min_{d0,d1}, go to Step 3; otherwise go to Step 4.
Step 3: if g^max_{h,v} > t2 · g^min_{h,v}, D is set to 2; otherwise D is set to 1.
Step 4: if g^max_{d0,d1} > t2 · g^min_{d0,d1}, D is set to 4; otherwise D is set to 3.
The activity A is calculated from the following formula:
A = Σ_{k=i-2..i+3} Σ_{l=j-2..j+3} (V_{k,l} + H_{k,l})   (2-10)
A is further quantized to the interval [0, 4]; the quantized value is taken as the quantized activity Â.
Before each 4x4 luma block is filtered, the filter coefficients and the corresponding clipping values are geometrically transformed according to the gradient values of the current block, following the rules of Table 1; the transformations include no transformation, diagonal transform, vertical flip and rotation. Applying a geometric transformation to the filter coefficients is equivalent to keeping the coefficients unchanged, applying the geometric transformation to the pixel values, and then filtering. The purpose of the geometric transformation is to align the directionality of different block contents as much as possible, thereby reducing the number of classes ALF requires so that different pixels can share the same filter coefficients. Using geometric transformations raises the effective classification from 25 classes to 100 classes without increasing the number of ALF filters, improving its adaptivity.
Table 1: Geometric transformations based on pixel block gradient values

Gradient values                        Transformation
g_{d1} < g_{d0} and g_h < g_v          No transformation
g_{d1} < g_{d0} and g_v ≤ g_h          Diagonal
g_{d0} ≤ g_{d1} and g_h < g_v          Vertical flip
g_{d0} ≤ g_{d1} and g_v ≤ g_h          Rotation
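Table 1 can be transcribed directly into a selection function; the enum names here are chosen for illustration:

```cpp
enum class GeoTransform { None, Diagonal, VerticalFlip, Rotation };

// Transform selection per Table 1, driven by the four block gradients.
GeoTransform selectTransform(int gh, int gv, int gd0, int gd1) {
    if (gd1 < gd0) {
        return (gh < gv) ? GeoTransform::None : GeoTransform::Diagonal;
    }
    return (gh < gv) ? GeoTransform::VerticalFlip : GeoTransform::Rotation;
}
```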
4.3 ALF Filtering Process in VVC
At the decoding end, if the CTU-level ALF flag is true, each pixel R(i,j) in the current CTU is filtered, and the filtering process and output are as follows:
R'(i,j) = R(i,j) + ((Σ_{(k,l)≠(0,0)} f(k,l) · K(R(i+k,j+l) - R(i,j), c(k,l)) + 64) >> 7)   (2-11)
where f(k,l) are the filter tap coefficients (the primary function of the filter tap coefficients is to weight the input signal for filtering; different filter designs require different tap coefficients to meet the filter's performance requirements), K(x,y) is a clipping function, and c(k,l) are the parameters of the clipping operation. k and l range from -L/2 to L/2, where L is the length of the filter. The clipping function is defined as K(x,y) = min(y, max(-y, x)). The clipping operation adds a nonlinearity to ALF, which reduces the influence of neighboring pixels that differ too much from the current pixel.
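A sketch of formula (2-11) with the clipping function K; the pixel accessor and the flat tap/clip arrays are organizational assumptions of this example, not the reference decoder's data layout:

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Nonlinear clipping K(x, y) = min(y, max(-y, x)).
int clipALF(int x, int y) { return std::min(y, std::max(-y, x)); }

// Filter one pixel per formula (2-11). offsets holds the non-center
// (k, l) positions of the diamond filter, taps holds f(k, l), and
// clip holds c(k, l); rec(i, j) returns a reconstructed pixel value.
template <typename PixelAccessor>
int filterPixel(PixelAccessor rec, int i, int j,
                const std::vector<std::pair<int, int>>& offsets,
                const std::vector<int>& taps,
                const std::vector<int>& clip) {
    const int cur = rec(i, j);
    long long sum = 0;
    for (std::size_t n = 0; n < offsets.size(); ++n) {
        const auto [k, l] = offsets[n];
        sum += static_cast<long long>(taps[n]) *
               clipALF(rec(i + k, j + l) - cur, clip[n]);
    }
    // 64 is the rounding offset for the 7-bit coefficient precision
    // implied by the >> 7 in formula (2-11).
    return cur + static_cast<int>((sum + 64) >> 7);
}
```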
5. ALF in enhanced compression model (Enhanced Compression Model, ECM) 8.0
ECM-8.0 [2] removes the restriction of virtual boundaries and the downsampling operations when computing gradients during ALF classification. Meanwhile, the basic unit of the ALF classification operation is changed from a 4x4 sub-block to a 2x2 sub-block, and the shapes of the luminance and chrominance filters are changed accordingly.
5.1 Enhanced fixed Filter
For the luminance component, ECM-8.0 uses 3 different classifiers (C0, C1 and C2) and three different filter sets (F0, F1 and F2). The filter banks F0 and F1 comprise fixed filters whose coefficients are generated by offline training for classifiers C0 and C1. F2 contains filter coefficients generated online from the content to be encoded, which need to be written into the code stream and transmitted to the decoding end.
5.2 Classification Process
In ECM-8.0, each 2x2 sub-block generates a corresponding class index C_i according to its directionality D_i and activity Â_i, as shown in the following formula:
C_i = M_{D,i} · Â_i + D_i   (2-12)
where i denotes the classifier index and M_{D,i} denotes the total number of directionality values D_i used by the corresponding classifier.
Similar to the VVC calculation process, the horizontal, vertical, diagonal and anti-diagonal gradients of each pixel are generated using a 1-D Laplace operator (the Laplacian is a differential operator given by the divergence of the gradient of a function in Euclidean space, both mathematically and physically).
For classifier C_0, the gradient of a sub-block is generated by summing the pixel gradient values at all positions in the 4x4 region covering the target 2x2 sub-block. For classifiers C_1 and C_2, the gradients of a sub-block are generated by summing the pixel gradient values at all positions in the 12x12 region covering the target 2x2 sub-block. Denote the horizontal, vertical, diagonal and anti-diagonal sub-block gradients as g^i_h, g^i_v, g^i_{d0} and g^i_{d1}. The directionality D_i is obtained by comparing the following two values with a set of thresholds:
R^i_{h,v} = g^{i,max}_{h,v} / g^{i,min}_{h,v},  R^i_{d0,d1} = g^{i,max}_{d0,d1} / g^{i,min}_{d0,d1}
The directionality D_2 uses the same thresholds, 2 and 4.5, as VVC. For D_0 and D_1, the horizontal/vertical edge strength E^i_{h,v} and the diagonal edge strength E^i_{d0,d1} are first calculated, using the threshold list Th = [1.25, 1.5, 2, 3, 4.5, 8]. When R^i_{h,v} ≤ Th[0], the edge strength E^i_{h,v} is set to 0; otherwise, E^i_{h,v} is the largest integer k satisfying R^i_{h,v} > Th[k-1]. When R^i_{d0,d1} ≤ Th[0], the edge strength E^i_{d0,d1} is set to 0; otherwise, E^i_{d0,d1} is the largest integer k satisfying R^i_{d0,d1} > Th[k-1]. When E^i_{h,v} ≥ E^i_{d0,d1}, i.e., the horizontal/vertical edge is stronger, the directionality D_i is generated from Table 2(a); otherwise, D_i is generated from Table 2(b).
Table 2: Mapping relation between the edge strengths E^i_{h,v}, E^i_{d0,d1} and the directionality D_i
The quantized activity Â_i is generated by quantizing the sub-block-level gradient accumulation A_i in the horizontal and vertical directions, with a quantized value range of 0 to n. For Â_0, n is set to 4; for Â_1 and Â_2, n is set to 15. An alf_aps may transmit up to 4 luma-component filter sets, each filter set containing up to 25 filters.
5.3 Band classifier based on 2x2 sub-blocks
In ECM-8.0, the ALF classification process adds a new classifier. For a filter bank transmitted to the decoding end, a flag bit indicates whether the original classifier or the new classifier is used. The new classifier cannot use geometric transformations. When the new classifier is used, the pixels at all positions in a 2x2 sub-block are added, and classification is performed according to the sum of the pixel values.
classIndex = (sum × 25) >> (sampleBitDepth + 2)   (2-13)
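Formula (2-13) translates one-to-one into code; here sum is assumed to be the sum of the four pixel values of the 2x2 sub-block:

```cpp
// Band classifier of Section 5.3, formula (2-13).
int bandClassIndex(int sum, int sampleBitDepth) {
    return (sum * 25) >> (sampleBitDepth + 2);
}
```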
5.4 Filtering Process
First, two fixed filters F_0 and F_1, which use 13x13 diamond shapes, generate two intermediate values R_0(x,y) and R_1(x,y) for the current pixel to be filtered. The online-generated filter F_2 is then applied to R_0(x,y), R_1(x,y) and the surrounding pixels to generate the filtered pixel, as shown in formula (2-14), where f_i denotes the clipped difference between a surrounding pixel and the current pixel R(x,y), and g_i denotes the clipped difference between R_{i-20}(x,y) and the current pixel. The filter coefficients c_i, i = 0, ..., 21, need to be transmitted to the decoding end.
In ECM-8.0, the luminance filter generated by online training has a total of 4 types of inputs: spatially neighboring samples, reconstructed samples before deblocking, intermediate samples generated by fixed-filter filtering, and residual components. Its shape is shown in fig. 6, where #0-#19 denote spatially neighboring samples, #20-#25 and #28-#29 denote samples generated by fixed-filter filtering, #26, #27 and #30 denote reconstructed samples before deblocking, and #31 and #32 denote residual components. According to the various types of inputs, the filtering process is as shown in formula (2-15), where f_{i,j} denotes the clipped difference between a spatially surrounding pixel sample and the current sample R(x,y), g_i denotes the clipped difference between a pixel sample generated by fixed-filter filtering and the current sample R(x,y), and h_{i,j} denotes the clipped difference between a reconstructed sample before deblocking and the current sample R(x,y). r_i denotes a clipped residual component, and rFiltered_i denotes a clipped residual component generated by fixed-filter filtering. The residual component uses the same fixed filter as the reconstructed component.
In alf_aps, a flag bit is used to indicate whether only the residual component is used, or both the residual component and the fixed-filter-filtered residual component are used.
5.5 Classifier based on residual error component
A classifier based on the luminance residual component serves as the third classifier for ALF. For any 2x2 luma sub-block, the sum of the absolute values of the residual components at all positions within the 8x8 region covering the current 2x2 sub-block is first calculated, and classification is then performed on this accumulated sum using the following formula:
classIdx = sum >> (sampleBitDepth - 4)   (2-16)
The final class index classIdx ranges from 0 to 24. In alf_aps, the classifier used must be indicated for each transmitted filter bank.
5.6 ALF coefficient and nonlinear clipping index transmission process
For the current image to be filtered, its ALF coefficients may be selected from the filter coefficients generated by the current image, the filter coefficients generated by encoded images, and pre-trained filter coefficients. If the filter coefficients generated by the current image are selected, the filter-related parameters need to be transmitted to the decoding end. Exemplary ALF filter syntax elements and their transmission procedure are as follows:
alf_comp_num_alt_filters_minus1 + 1 represents the number of filter banks used for the current color component (e.g., luma component, chroma component, cross component); in ECM, the luma component supports at most 4 filter banks and the chroma component supports at most 8 filter banks. For each filter bank, alf_comp_clip_flag indicates whether the filters contained in the current filter bank use nonlinear clipping.
alf_comp_num_filters_signalled_minus1 + 1 represents the number of filters contained in the current filter bank (in ECM, each luma filter bank contains at most 25 filters and each chroma filter bank contains at most 1 filter).
alf_comp_coeff() represents the tap coefficients of the transmitted filter. If nonlinear clipping is used, a nonlinear clipping index alf_comp_clip_idx[altIdx][sfIdx][j] is transmitted for each tap coefficient of the filter.
The image filtering method provided by the embodiments of the application can be applied to a video codec system, which may comprise a content production device (corresponding to the encoding device) and a content presentation device (corresponding to the decoding device). The content production device may be an electronic device used by the provider of the video data (such as its content producer), and this electronic device may be a terminal (such as a PC (Personal Computer) or a smart mobile device such as a smartphone) or a server.
The server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, big data and artificial intelligence platforms. The content presentation device may be an electronic device used by a user of the video data (such as a viewer of the video data, i.e., a business object); it may be a terminal (such as a PC, a smart mobile device such as a smartphone, a VR device such as a VR headset or VR glasses, a smart home appliance, an in-vehicle terminal, or an aircraft) integrated with a client.
The client may be a client with a function of displaying text, image, audio, video and other data information, including but not limited to a multimedia client (e.g., a video client), a social client (e.g., an instant messaging client), an information application (e.g., a news client), an entertainment client (e.g., a game client), a shopping client, a vehicle client, a browser, etc. The client may be a stand-alone client, or may be an embedded sub-client integrated into a client (e.g., a social client), which is not limited herein.
The content creation device and the content presentation device may be the same device or different devices, each of which includes a plurality of modules, and different modules are used to implement different functions, and these modules may be integrated into the same electronic device or may be located in different electronic devices. The content production device can be used for realizing the functions of video data acquisition, encoding and the like, and the content presentation device can be used for realizing the functions of decoding, rendering, displaying and the like of the packaged file. Referring to fig. 7, fig. 7 is a schematic diagram of a video encoding and decoding system for video data according to an embodiment of the application.
In fig. 7, on the content production device side, a real-world visual scene is captured by a group of cameras or a camera device with multiple lenses and sensors, and the captured camera images A form video data B comprising multiple frames. Alternatively, on the content production device side, multiple frames of screen content images A may be used as the video data B. After content acquisition is completed, the video data B is encoded to obtain a video code stream C, which is sent to the content presentation device.
On the content presentation device side, the video code stream C is decoded, then the decoded video data B' is rendered, and the rendered video D is obtained, so that the video D is displayed.
It can be understood that the codec technology involved in the application can be implemented by means of cloud technology; for example, a cloud server is used as the content production device. Cloud technology refers to a hosting technology that unifies hardware, software, network and other resources in a wide area network or local area network to realize the computation, storage, processing and sharing of data.
Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology and application technology applied on the basis of the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. Background services of technical network systems require large amounts of computing and storage resources, such as video websites, picture websites and more portal websites. With the rapid development and application of the internet industry, every article may have its own identification mark in the future, which needs to be transmitted to a background system for logical processing; data at different levels will be processed separately, and all kinds of industry data require strong backend system support, which can only be realized through cloud computing.
The image filtering method of the application can be implemented based on cloud computing. Cloud computing is a computing model that distributes computing tasks across a resource pool formed by a large number of computers, enabling various application systems to obtain computing power, storage space and information services as needed. The network that provides the resources is called the "cloud". From the user's point of view, the resources in the cloud are infinitely expandable and can be obtained at any time, used on demand, expanded at any time and paid for per use.
As a basic capability provider of cloud computing, a cloud computing resource pool (abbreviated as cloud platform, generally called an IaaS (Infrastructure as a Service) platform) is established, in which multiple types of virtual resources are deployed for external clients to select and use.
According to logical function division, a PaaS (Platform as a Service) layer can be deployed on the IaaS (Infrastructure as a Service) layer, and a SaaS (Software as a Service) layer can be deployed above the PaaS layer; SaaS can also be deployed directly on IaaS. PaaS is a platform on which software runs, such as a database or a web container. SaaS covers a wide variety of business software, such as web portals and SMS mass senders. Generally, SaaS and PaaS are upper layers relative to IaaS.
Based on the basic concepts and the description of related scenes, the embodiment of the application provides an image filtering method, an image filtering device, a computer storage medium and electronic equipment.
Referring to fig. 8, an embodiment of the present application provides an image filtering method applied to a decoding end, including the following steps:
Step 011: calculating a gradient matrix based on pixel reconstruction values of each pixel of the reconstructed image block;
Specifically, a bitstream, i.e., a video code stream, is obtained after encoding and compressing video data (e.g., at least one of images captured by a camera and computer-generated screen content images).
The decoding end can acquire the video code stream sent by the encoding end; the video code stream contains decoding indication information, according to which the decoding end can decode the video code stream to obtain a reconstructed image.
Then, a gradient matrix of each reconstructed image block is calculated based on the pixel reconstruction values of the pixels of each reconstructed image block of the reconstructed image.
As shown in fig. 9, the reconstructed image is a whole image. The reconstructed image is divided into a plurality of CTUs of the same size as mentioned above, and the CTUs may then optionally be further divided based on the settings of the current device to obtain image blocks of a preset size (such as 32x32 or 64x64), i.e., the reconstructed image blocks finally used for calculating the gradient matrix; a reconstructed image block is one of the plurality of image blocks obtained by dividing a CTU. Alternatively, the CTUs may be used directly as reconstructed image blocks without further division.
The gradient matrix comprises a plurality of gradient matrices respectively corresponding to different preset gradient directions. The preset gradient directions include a horizontal direction, a vertical direction, a first diagonal direction (i.e., the diagonal direction mentioned above) and a second diagonal direction (i.e., the anti-diagonal direction mentioned above).
Referring to fig. 10, optionally, step 011 includes the steps of:
Step 0111: calculating gradient values of each pixel of the reconstructed image in a horizontal direction, a vertical direction, a first diagonal direction, and a second diagonal direction, respectively, based on pixel reconstruction values of each pixel of the reconstructed image;
Step 0112: and generating a gradient matrix corresponding to the horizontal direction, a gradient matrix corresponding to the vertical direction, a gradient matrix corresponding to the first diagonal direction and a gradient matrix corresponding to the second diagonal direction respectively based on gradient values of each pixel in the horizontal direction, the vertical direction, the first diagonal direction and the second diagonal direction.
Specifically, the gradient value of each pixel of the reconstructed image block in the horizontal direction may be calculated based on the pixel reconstruction value of each pixel of the reconstructed image block, so that a gradient matrix corresponding to the horizontal direction is generated according to the gradient value of each pixel of the reconstructed image block in the horizontal direction; calculating a gradient value of each pixel of the reconstructed image block in the vertical direction based on the pixel reconstruction value of each pixel of the reconstructed image block, thereby generating a gradient matrix corresponding to the vertical direction according to the gradient value of each pixel of the reconstructed image block in the vertical direction; calculating a gradient value of each pixel of the reconstructed image block in a first diagonal direction based on a pixel reconstruction value of each pixel of the reconstructed image block, thereby generating a gradient matrix corresponding to the first diagonal direction according to the gradient value of each pixel of the reconstructed image block in the first diagonal direction; and calculating the gradient value of each pixel of the reconstructed image block in the second diagonal direction based on the pixel reconstruction value of each pixel of the reconstructed image block, so as to generate a gradient matrix corresponding to the second diagonal direction according to the gradient value of each pixel of the reconstructed image block in the second diagonal direction.
Optionally, the gradient value of any target pixel in any target gradient direction is determined according to the pixel reconstruction value of the target pixel and the pixel reconstruction values of the adjacent pixels of the target pixel in the target gradient direction, where the target gradient direction is the horizontal direction, the vertical direction, the first diagonal direction or the second diagonal direction.
The adjacent pixels of the target pixel in the target gradient direction are pixels which are located in the target gradient direction of the target pixel and have a distance from the target pixel smaller than a preset distance threshold (such as a distance of 1 pixel, a distance of 2 pixels, etc.). For example, the adjacent pixels of the target pixel in the horizontal direction may be pixels that are in the same row as the target pixel and have a distance from the target pixel smaller than a preset distance threshold.
Referring to fig. 11, the reconstructed image block is rectangular, a horizontal axis (i.e., x-axis) and a vertical axis (i.e., y-axis) are respectively established with an upper left corner of the reconstructed image block as an origin of coordinates, a horizontal direction of the reconstructed image block is a horizontal axis direction, a vertical direction of the reconstructed image block is a vertical axis direction, a first diagonal direction of the reconstructed image block is a direction of one diagonal of the rectangle, and a second diagonal direction of the reconstructed image block is a direction of the other diagonal of the rectangle.
Taking the coordinates of any target pixel as (x, y) as an example, the gradient of the target pixel in the horizontal direction can be calculated from the pixel values of the pixels whose coordinates are (x, y), (x+1, y) and (x-1, y), respectively. For example, V(x, y) = 2*L(x, y) - L(x-1, y) - L(x+1, y), where V(x, y) is the gradient in the horizontal direction of the pixel at coordinates (x, y), and L(x, y) is the pixel value of the pixel at coordinates (x, y).
The gradient of the target pixel in the vertical direction can be calculated from the pixel values of the pixels whose coordinates are (x, y), (x, y+1) and (x, y-1), respectively. For example, H(x, y) = 2*L(x, y) - L(x, y-1) - L(x, y+1), where H(x, y) is the gradient of the pixel with coordinates (x, y) in the vertical direction.
The gradient of the target pixel in the first diagonal direction may be calculated from the pixel values of the pixels whose coordinates are (x, y), (x+1, y+1) and (x-1, y-1), respectively. For example, D1(x, y) = 2*L(x, y) - L(x-1, y-1) - L(x+1, y+1), where D1(x, y) is the gradient of the pixel with coordinates (x, y) in the first diagonal direction.
The gradient of the target pixel in the second diagonal direction may be calculated from the pixel values of the pixels whose coordinates are (x, y), (x-1, y+1) and (x+1, y-1), respectively. For example, D2(x, y) = 2*L(x, y) - L(x-1, y+1) - L(x+1, y-1), where D2(x, y) is the gradient of the pixel with coordinates (x, y) in the second diagonal direction.
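A minimal sketch of the four per-pixel gradients just described, assuming L is stored row-major and that the caller keeps (x, y) one pixel away from the borders (border handling is outside this sketch):

```cpp
#include <cstdint>

struct PixelGradients { int v, h, d1, d2; };

// Per-pixel gradients in the four preset directions, following the
// formulas above: V (horizontal), H (vertical), D1 and D2 (diagonals).
PixelGradients pixelGradients(const uint8_t* L, int stride, int x, int y) {
    auto at = [&](int px, int py) { return static_cast<int>(L[py * stride + px]); };
    const int c2 = 2 * at(x, y);
    PixelGradients g;
    g.v  = c2 - at(x - 1, y)     - at(x + 1, y);      // horizontal direction
    g.h  = c2 - at(x, y - 1)     - at(x, y + 1);      // vertical direction
    g.d1 = c2 - at(x - 1, y - 1) - at(x + 1, y + 1);  // first diagonal
    g.d2 = c2 - at(x - 1, y + 1) - at(x + 1, y - 1);  // second diagonal
    return g;
}
```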
After the gradient values of each pixel in each preset gradient direction are calculated, a gradient matrix of the reconstructed image in each preset gradient direction can be generated based on the calculated gradient values in each preset gradient direction.
In this way, the complete gradient matrix (i.e., the gradient matrix corresponding to the entire reconstructed image block) is generated first, without calculating the filter parameters and without judging whether all the gradient values required for the filter parameter calculation have been computed. The gradient computation of all pixels can therefore be executed in parallel, making full use of the parallel capability of the device and quickly generating the complete gradient matrix.
Referring to fig. 12, optionally, step 011 may include the steps of:
Step 0113: the gradient values of the pixels in the odd-numbered rows of the reconstructed image block are determined based on the pixel reconstruction values of the pixels of the reconstructed image block.
Specifically, to reduce the computational effort of gradient computation, the gradient information of a coding unit is generally computed from the gradient values of the pixels in odd rows. Therefore, when calculating the gradient values of the pixels, only the gradient values of the pixels in the odd rows need to be calculated, and the gradient values of the even rows can be set to a preset value (e.g., 0). This nearly halves the amount of computation and greatly improves the efficiency of generating the gradient matrix.
Optionally, the image filtering method of the present application may be implemented using the computing power of a graphics processor (Graphics Processing Unit, GPU). Owing to the high parallelism of the GPU, each pixel of the reconstructed image block can be processed by its own thread, so that the gradient values of all pixels are calculated in parallel and the gradient matrix is generated rapidly.
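As a sketch of this data-parallel, odd-rows-only computation (assuming, for illustration, the horizontal direction only, and taking the 1-based odd rows as the even indices of a 0-based NumPy array), element-wise NumPy operations stand in for the per-pixel GPU threads described above:

```python
import numpy as np

def horizontal_gradient_odd_rows(rec: np.ndarray) -> np.ndarray:
    """Sketch: vectorized horizontal-direction gradient, computed only
    for the odd-numbered rows (1st, 3rd, ... in 1-based numbering);
    even-numbered rows keep the preset value 0."""
    r = rec.astype(np.int32)        # avoid uint8 wrap-around
    V = np.zeros_like(r)
    # interior columns of every other row, all computed at once
    V[::2, 1:-1] = 2 * r[::2, 1:-1] - r[::2, :-2] - r[::2, 2:]
    return V
```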
Step 012: gradient information of each coding unit of each reconstructed image block is determined based on the gradient matrix.
Wherein the gradient information may be used to characterize gradient characteristics of the coding unit. The directional characteristic (such as the aforementioned directional characteristic D) and the activity characteristic (such as the aforementioned activity characteristic a) of the coding unit may be determined based on the gradient information, thereby determining a classification index of the coding unit, and finally obtaining the filter parameter of the coding unit based on the classification index.
As shown in fig. 11, a coding unit (the CU mentioned above) is a basic coding unit obtained by dividing the reconstructed image block by a preset size (4×4, 2×2, etc.).
Specifically, after the complete gradient matrix of the reconstructed image block has been calculated, each coding unit in the reconstructed image block can be traversed. During the traversal there is no need to check whether the gradient values required by a coding unit have already been computed, so the gradient information of each coding unit can be calculated rapidly from the gradient matrix corresponding to the reconstructed image block.
Referring to fig. 13, optionally, step 012 includes the steps of:
Step 0121: determining a lateral gradient sum of the coding unit according to a plurality of associated gradient values of the coding unit in the horizontal direction of the gradient matrix;
Step 0122: determining a longitudinal gradient sum of the coding unit according to a plurality of associated gradient values of the coding unit in the vertical direction of the gradient matrix;
Step 0123: determining gradient information of the coding unit according to the lateral gradient sum and the longitudinal gradient sum of the coding unit.
Specifically, the gradient values in the gradient matrix correspond one-to-one to the pixels of the reconstructed image block. When any coding unit is traversed, the target gradient value corresponding to the pixel at a preset position in the coding unit (e.g., the pixel in the upper left corner) can be located in the gradient matrix, and the gradient values associated with the target gradient value in the horizontal and vertical directions are taken as the associated gradient values of the coding unit in those directions.
Fig. 14 is a schematic diagram of a portion of a reconstructed image block. In one example, the pixel at the preset position of a coding unit (e.g., the pixel in the upper left corner) has coordinates (x, y) in the gradient matrix; the plurality of associated gradient values of the coding unit in the horizontal direction may include at least one of the gradient values with coordinates (x, y), (x+2, y), (x+4, y) and (x+6, y), and the plurality of associated gradient values of the coding unit in the vertical direction may include at least one of the gradient values with coordinates (x, y), (x, y+2), (x, y+4) and (x, y+6).
Then, the lateral gradient sum of the coding unit can be determined from a plurality of associated gradient values of the coding unit in the horizontal direction of the gradient matrix.
Optionally, the lateral gradient sum of the coding unit is a sum of a plurality of associated gradient values of the coding unit in the horizontal direction.
The longitudinal gradient sum of the coding unit is determined from a plurality of associated gradient values of the coding unit in the vertical direction of the gradient matrix.
Optionally, the longitudinal gradient sum of the coding unit is a sum of a plurality of associated gradient values of the coding unit in the vertical direction.
Finally, the gradient information of the coding unit is determined according to the lateral gradient sum and the longitudinal gradient sum of the coding unit.
Optionally, the gradient information of the coding unit is a sum of a lateral gradient sum and a longitudinal gradient sum of the coding unit.
Therefore, based on the complete gradient matrix, the lateral and longitudinal gradient sums are computed simultaneously in a single traversal. Compared with first traversing to obtain the lateral gradient sum of each coding unit and then traversing again to obtain the longitudinal gradient sum, this reduces the number of traversals and improves the efficiency of computing the gradient information.
Moreover, the parallel capability of the device can be fully utilized to compute the gradient information of multiple coding units in parallel, further improving the efficiency of computing the filter parameters.
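For illustration, a minimal sketch of this single-pass accumulation (assuming expanded gradient matrices V and H as described below, so that the stride-2 offsets of fig. 14 stay in bounds; the function name cu_gradient_info is hypothetical):

```python
import numpy as np

def cu_gradient_info(V: np.ndarray, H: np.ndarray,
                     cu_x: int, cu_y: int) -> int:
    """Sketch: gradient information of one coding unit whose
    preset-position pixel (e.g. its upper-left pixel) maps to
    (cu_x, cu_y) in the gradient matrices V (horizontal) and
    H (vertical). Both sums are accumulated in the same pass,
    as in steps 0121-0123."""
    offsets = (0, 2, 4, 6)             # stride-2, as in fig. 14
    lateral = 0
    longitudinal = 0
    for o in offsets:                  # one traversal for both sums
        lateral += int(V[cu_y, cu_x + o])
        longitudinal += int(H[cu_y + o, cu_x])
    return lateral + longitudinal
```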
In some embodiments, for a coding unit located in one of the last rows or columns of a reconstructed image block, some of its associated gradient values would fall outside the block, limiting their number. Therefore, when generating the gradient matrix, the reconstructed image block may also be expanded to obtain a reconstructed image block of a larger size. For example, if a CTU of size 128×128 is divided into a plurality of reconstructed image blocks of size 32×32, each reconstructed image block can be expanded to obtain a reconstructed image block of size 36×36.
If an expanded pixel does not coincide with a pixel of an adjacent reconstructed image block, its pixel value is directly determined as a preset pixel value (e.g., 128).
After the expanded reconstructed image block is obtained, the corresponding gradient matrix can be generated based on the pixel values of the expanded reconstructed image block; the size of the gradient matrix is the same as that of the expanded reconstructed image block.
Therefore, even for coding units located at the edge of the reconstructed image block, the associated gradient values exist, so that the gradient information of every coding unit can be calculated.
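A minimal sketch of such an expansion (assuming, for illustration, that the padding is appended to the right and bottom edges and that no neighboring block is available, so every expanded pixel takes the preset value; the helper name expand_block is hypothetical):

```python
import numpy as np

def expand_block(block: np.ndarray, pad: int = 4,
                 fill: int = 128) -> np.ndarray:
    """Sketch: expand a reconstructed image block (e.g. 32x32 -> 36x36)
    before generating its gradient matrix. Every expanded pixel is set
    to the preset value `fill`; pixels that coincide with an adjacent
    reconstructed image block would instead copy that block's values."""
    h, w = block.shape
    out = np.full((h + pad, w + pad), fill, dtype=block.dtype)
    out[:h, :w] = block    # original block in the upper-left corner
    return out
```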
Optionally, referring to fig. 14 again, let the image coordinates of the pixel at the preset position of the coding unit be (m, n) and its coordinates in the gradient matrix be (x, y). In the case where n equals a first preset threshold, the plurality of associated gradient values of the coding unit in the vertical direction include the gradient values with coordinates (x, y), (x, y+2) and (x, y+4). In the case where n equals a second preset threshold, they include the gradient values with coordinates (x, y+2), (x, y+4) and (x, y+6). In the case where n is any other coordinate value, they include the gradient values with coordinates (x, y), (x, y+2), (x, y+4) and (x, y+6).
The image coordinates are the coordinates of a pixel within the corresponding CTU. Taking a CTU of size 128×128 whose origin of coordinates is its first pixel as an example, the coordinates of the first pixel of the CTU are (1, 1); that is, the image coordinates of a pixel in the coding unit are at most (128, 128).
In the case where the CTU size is 128×128, the first preset threshold may be 120, and the second preset threshold may be 124.
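A minimal sketch of this case selection (the function name vertical_offsets is hypothetical; the default thresholds are the values given above for a 128×128 CTU):

```python
def vertical_offsets(n: int, first_thr: int = 120,
                     second_thr: int = 124) -> tuple:
    """Sketch: choose the vertical offsets of the associated gradient
    values from the image row n of the coding unit's preset-position
    pixel, following the three cases above."""
    if n == first_thr:
        return (0, 2, 4)       # (x, y), (x, y+2), (x, y+4)
    if n == second_thr:
        return (2, 4, 6)       # (x, y+2), (x, y+4), (x, y+6)
    return (0, 2, 4, 6)        # all other rows
```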
Referring to fig. 15, in some embodiments, step 012 includes the steps of:
Step 0124: determining gradient information of each coding unit in the target gradient direction based on the completed gradient matrix in the case where the gradient matrix calculation in the target gradient direction is completed, the target gradient direction being a horizontal direction, a vertical direction, a first diagonal direction, or a second diagonal direction.
Specifically, in order to obtain the gradient information of the coding units as early as possible, the gradient information in any target gradient direction can be calculated as soon as the gradient matrix for that direction has been completed, while the gradient matrices for the preset gradient directions not yet finished continue to be computed in parallel. Compared with waiting until the gradient matrices for all preset gradient directions are complete and only then computing the gradient information for each direction, this makes fuller use of the parallelism of the device in the later stage of the computation, so the gradient information for each preset gradient direction is obtained quickly. A possible scheduling of this pipeline is sketched below.
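As an illustration, a minimal Python sketch of such a pipeline; compute_matrix(rec, d) and info_from_matrix(matrix, d) are hypothetical helpers standing for the matrix-generation and per-coding-unit steps described above:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def gradient_info_pipeline(rec, compute_matrix, info_from_matrix,
                           directions=("V", "H", "D1", "D2")):
    """Sketch: launch the four gradient-matrix computations in parallel
    and derive per-coding-unit gradient information for a direction as
    soon as that direction's matrix completes, instead of waiting for
    all four matrices."""
    info = {}
    with ThreadPoolExecutor(max_workers=len(directions)) as pool:
        futures = {pool.submit(compute_matrix, rec, d): d
                   for d in directions}
        for fut in as_completed(futures):   # react in completion order
            d = futures[fut]
            info[d] = info_from_matrix(fut.result(), d)
    return info
```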
Step 013: filter parameters of the respective coding units are determined based on gradient information of the respective coding units, respectively.
The filter parameters are the parameters of the filter used when filtering the image. The filter is tied to the classification of the coding units: for example, in the design of VVC, each 4×4 block may be classified into one of 25 classes based on its own directionality and activity, and corresponding filter parameters are calculated for each class.
Specifically, when determining the filter parameters, the directionality characteristic and the activity characteristic may be calculated from the gradient information of the coding unit (the manner of calculating the directionality characteristic and the activity characteristic has been described above and is not repeated here). The classification index of the coding unit (e.g., the classification index C mentioned above) is then determined based on the directionality characteristic and the activity characteristic, and finally the filter parameter corresponding to the classification index C is obtained, so that the filter parameter of the coding unit is determined.
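For illustration, a minimal sketch of deriving the classification index, assuming the ALF-style mapping used in VVC (C = 5*D + A, with the directionality D and the quantized activity A each in 0..4); the function name is hypothetical:

```python
def classification_index(directionality: int, activity: int) -> int:
    """Sketch: combine the directionality characteristic D (0..4) and
    the quantized activity characteristic A (0..4) into one of the 25
    classification indices, C = 5*D + A."""
    assert 0 <= directionality <= 4 and 0 <= activity <= 4
    return 5 * directionality + activity
```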
Alternatively, the filter parameters of each coding unit may be determined based on gradient information of each coding unit in each preset gradient direction, including a horizontal direction, a vertical direction, a first diagonal direction, and a second diagonal direction, respectively.
The filter parameters of the coding unit are determined, for example, from gradient information of the coding unit in the horizontal direction, the vertical direction, the first diagonal direction and the second diagonal direction, respectively.
During encoding, the filter parameters that will be needed during decoding are determined from the filter parameters of the coding units, so that only those filter parameters, rather than all filter parameters, are transmitted in the bitstream, thereby reducing the bit rate. During decoding, after the classification index of each coding unit is determined from its gradient information, the filter parameters corresponding to the coding unit can be quickly located among the filter parameters carried in the bitstream.
Alternatively, the tap coefficients of the filter may be determined according to the classification index of the coding unit. Different filters use different tap coefficients, so the filter to be used when filtering the coding unit can be determined based on the tap coefficients, as sketched below.
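A minimal sketch of this lookup; the table TAPS and the placeholder coefficients in it are hypothetical and illustrative only, not coefficients from the embodiment:

```python
# Hypothetical table: one tuple of tap coefficients per classification
# index; a real codec derives these or signals them in the bitstream.
TAPS = {c: (1, 2, 1) for c in range(25)}

def taps_for_cu(class_index: int) -> tuple:
    """Sketch: different classes map to different tap coefficients, so
    the filter applied to a coding unit follows from its class index."""
    return TAPS[class_index]
```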
According to the method described in the foregoing embodiments, the embodiment of the present application further provides an image filtering method for the encoding end. Referring to fig. 16, fig. 16 is a schematic flowchart of the image filtering method of the encoding end provided in the embodiment of the present application, which specifically includes:
step 021: calculating a gradient matrix based on pixel values of respective pixels of the encoded image block;
step 022: determining gradient information of each coding unit of each coding image block based on the gradient matrix;
step 023: filter parameters of the respective coding units are determined based on gradient information of the respective coding units, respectively.
The encoded image block refers to an image block with a preset size (such as 32×32, 64×64, etc.) obtained by dividing the image to be encoded into a plurality of CTUs of the same size, as mentioned above, and then optionally further dividing the CTUs according to the configuration of the current device. Alternatively, the encoded image block is a CTU itself.
It should be noted that the video encoding process and the video decoding process mirror each other: when performing image filtering, the decoding end operates on the reconstructed image blocks of the reconstructed image, while the encoding end operates on the encoded image blocks of the image to be encoded. The image filtering method of the encoding end provided by the embodiment of the present application can therefore refer to the embodiment of the image filtering method of the decoding end, and is not described again here.
According to the method described in the above embodiment, the embodiment of the present application further provides an image filtering device, which is configured to perform the steps in the image filtering method at the decoding end. Referring to fig. 17, fig. 17 is a schematic diagram of an image filtering apparatus 300 according to an embodiment of the application. The image filtering apparatus 300 comprises a first calculation module 301 and a first determination module 302, wherein:
A first calculation module 301, configured to calculate a gradient matrix based on pixel reconstruction values of pixels of the reconstructed image block;
A first determining module 302, configured to determine gradient information of each encoding unit of each reconstructed image block based on the gradient matrix; and determining filter parameters of each coding unit based on the gradient information of each coding unit.
It should be noted that, the specific details of each module unit in the image filtering apparatus 300 are described in detail in the embodiment of the image filtering method, and are not described herein.
According to the method described in the above embodiment, the embodiment of the present application further provides an image filtering apparatus at the encoding end, which is configured to perform the steps in the image filtering method of the encoding end. Referring to fig. 18, fig. 18 is a schematic structural diagram of an image filtering apparatus 400 according to an embodiment of the application. The image filtering apparatus 400 comprises a second calculation module 401 and a second determination module 402, wherein:
a second calculation module 401 for calculating a gradient matrix based on pixel values of respective pixels of the encoded image block;
a second determining module 402, configured to determine gradient information of each coding unit of each encoded image block based on the gradient matrix; and determine filter parameters of each coding unit based on the gradient information of each coding unit.
It should be noted that, the specific details of each module unit in the image filtering apparatus 400 are described in detail in the above embodiment of the image filtering method, and are not described herein again.
In the present embodiment, the term "module" or "unit" refers to a computer program or a part of a computer program having a predetermined function and working together with other relevant parts to achieve a predetermined object, and may be implemented in whole or in part by using software, hardware (such as a processing circuit or a memory), or a combination thereof. Also, a processor (or multiple processors or memories) may be used to implement one or more modules or units. Furthermore, each module or unit may be part of an overall module or unit that incorporates the functionality of the module or unit.
In some embodiments, the image filtering apparatus in the embodiments of the present application may be an electronic device, or may be a component in an electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. For example, the electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted electronic device, a mobile Internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a personal digital assistant (PDA), and may also be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, a self-service machine, etc., which are not particularly limited in the embodiments of the present application.
In some embodiments, as shown in fig. 19, an electronic device 500 is further provided in the embodiments of the present application, which includes a processor 501, a memory 502, and a computer program stored in the memory 502 and capable of running on the processor 501, where the program when executed by the processor 501 implements the respective processes of the embodiments of the image filtering method described above, and the same technical effects are achieved, so that repetition is avoided and no further description is given here.
The electronic devices in the embodiments of the present application include both mobile and non-mobile electronic devices.
Fig. 20 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 600 includes, but is not limited to: radio frequency unit 601, network module 602, audio output unit 603, input unit 604, sensor 605, display unit 606, user input unit 607, interface unit 608, memory 609, and processor 610.
Those skilled in the art will appreciate that the electronic device 600 may further include a power source (e.g., a battery) for powering the various components, which may be logically connected to the processor 610 by a power management system to perform functions such as managing charge, discharge, and power consumption by the power management system. The electronic device structure shown in fig. 20 does not constitute a limitation of the electronic device, and the electronic device may include more or less components than those shown in the drawings, or may combine some components, or may be arranged in different components, which will not be described in detail herein.
It should be appreciated that in embodiments of the present application, the input unit 604 may include a graphics processor (Graphics Processing Unit, GPU) 6041 and a microphone 6042, with the graphics processor 6041 processing image data of still pictures or video obtained by an image capturing apparatus (e.g., a camera) in a video capturing mode or an image capturing mode. The display unit 606 may include a display panel 6061, and the display panel 6061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 607 includes at least one of a touch panel 6071 and other input devices 6072. The touch panel 6071 is also called a touch screen. The touch panel 6071 may include two parts of a touch detection device and a touch controller. Other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and so forth, which are not described in detail herein.
The memory 609 may be used to store software programs as well as various data. The memory 609 may mainly include a first storage area storing programs or instructions and a second storage area storing data, wherein the first storage area may store an operating system, and application programs or instructions required for at least one function (such as a sound playing function, an image playing function, etc.). Further, the memory 609 may include volatile memory or nonvolatile memory, or both. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), a static RAM (SRAM), a dynamic RAM (DRAM), a synchronous DRAM (SDRAM), a double data rate SDRAM (DDR SDRAM), an enhanced SDRAM (ESDRAM), a synclink DRAM (SLDRAM), or a direct Rambus RAM (DRRAM). The memory 609 in the embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.
The processor 610 may include one or more processing units. Optionally, the processor 610 integrates an application processor, which primarily handles operations involving the operating system, user interface, applications, and the like, and a modem processor, such as a baseband processor, which primarily handles wireless communication signals. It will be appreciated that the modem processor may also not be integrated into the processor 610.
The embodiment of the application also provides a non-transitory computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the processes of the embodiment of the image filtering method, and can achieve the same technical effects, and in order to avoid repetition, the description is omitted here.
The processor is the processor in the electronic device in the above embodiment. The computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.
The embodiment of the application also provides a computer program product, which comprises a computer program, and the computer program realizes the image filtering method when being executed by a processor. Wherein the processor may be a processor in the electronic device in the above embodiment. The computer program, when executed by the processor, implements the processes of the embodiments of the image filtering method described above, and can achieve the same technical effects, and for avoiding repetition, will not be described herein.
While embodiments of the present application have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the application, the scope of which is defined by the claims and their equivalents.

Claims (15)

1. An image filtering method, comprising:
calculating a gradient matrix based on pixel reconstruction values of each pixel of the reconstructed image block;
determining gradient information of each coding unit of each reconstructed image block based on the gradient matrix;
And determining filter parameters of each coding unit based on gradient information of each coding unit.
2. The method of claim 1, wherein the computing a gradient matrix based on pixel reconstruction values for individual pixels of the reconstructed image block comprises:
Respectively calculating gradient values of each pixel of the reconstructed image block in the horizontal direction, the vertical direction, the first diagonal direction and the second diagonal direction based on pixel reconstruction values of each pixel of the reconstructed image block;
And generating a gradient matrix corresponding to the horizontal direction, a gradient matrix corresponding to the vertical direction, a gradient matrix corresponding to the first diagonal direction and a gradient matrix corresponding to the second diagonal direction respectively based on gradient values of each pixel in the horizontal direction, the vertical direction, the first diagonal direction and the second diagonal direction.
3. The method of claim 2, wherein the gradient value of any target pixel in any target gradient direction is determined based on the pixel value of the target pixel and the pixel values of neighboring pixels of the target pixel in the target gradient direction, the target gradient direction being a horizontal direction, a vertical direction, a first diagonal direction, or a second diagonal direction.
4. The method of claim 1, wherein the gradient values of the even rows of the gradient matrix are preset values, the computing the gradient matrix based on pixel reconstruction values of individual pixels of the reconstructed image block, comprising:
The gradient values of the pixels of the odd rows of the reconstructed image block are determined based on the pixel reconstruction values of the pixels of the reconstructed image block.
5. The method of claim 1, wherein determining gradient information for each coding unit of each reconstructed image block based on the gradient matrix comprises:
determining a lateral gradient sum of the coding unit according to a plurality of associated gradient values of the coding unit in the horizontal direction of the gradient matrix;
Determining a longitudinal gradient sum of the coding unit according to a plurality of associated gradient values of the coding unit in the vertical direction of the gradient matrix;
And determining gradient information of the coding unit according to the sum of the transverse gradient and the longitudinal gradient of the coding unit.
6. The method of claim 5, wherein the lateral gradient sum of the coding unit is a sum of a plurality of associated gradient values of the coding unit in a horizontal direction, the longitudinal gradient sum of the coding unit is a sum of a plurality of associated gradient values of the coding unit in a vertical direction, and the gradient information of the coding unit is a sum of the lateral gradient sum and the longitudinal gradient sum of the coding unit.
7. The method according to claim 5 or 6, wherein the coordinates in the gradient matrix of the pixel at the preset position of the coding unit are (x, y), the plurality of associated gradient values of the coding unit in the horizontal direction include at least one of the gradient values with coordinates (x, y), (x+2, y), (x+4, y), (x+6, y), and the plurality of associated gradient values of the coding unit in the vertical direction include at least one of the gradient values with coordinates (x, y), (x, y+2), (x, y+4), (x, y+6).
8. The method according to claim 7, wherein the image coordinates of the pixels at the preset positions of the encoding unit are (m, n), the coordinates of the pixels at the preset positions in the gradient matrix are (x, y), and the plurality of associated gradient values of the encoding unit in the vertical direction include gradient values having coordinates of (x, y), (x, y+2), (x, y+4), respectively, in the case where n is the first preset threshold value; under the condition that n is a second preset threshold value, the plurality of associated gradient values of the coding unit in the vertical direction comprise gradient values with coordinates (x, y+2), (x, y+4) and (x, y+6) respectively; in the case where n is a coordinate value outside the first preset threshold and the second preset threshold, the plurality of associated gradient values of the encoding unit in the vertical direction include gradient values having coordinates of (x, y), (x, y+2), (x, y+4), (x, y+6), respectively.
9. The method of claim 7, wherein the pixel at the predetermined position is a pixel in an upper left corner of the coding unit.
10. The method of claim 1, wherein determining gradient information for each coding unit of each reconstructed image block based on the gradient matrix comprises:
And determining gradient information of each coding unit in the target gradient direction based on the completed gradient matrix when the gradient matrix calculation in the target gradient direction is completed, wherein the target gradient direction is a horizontal direction, a vertical direction, a first diagonal direction or a second diagonal direction.
11. The method according to claim 1, wherein the determining filter parameters of each coding unit based on gradient information of each coding unit, respectively, comprises:
And respectively determining filter parameters of each coding unit based on the gradient information of each coding unit in each preset gradient direction, wherein the preset gradient directions comprise a horizontal direction, a vertical direction, a first diagonal direction and a second diagonal direction.
12. An image filtering method, comprising:
calculating a gradient matrix based on pixel values of respective pixels of the encoded image block;
Determining gradient information of each coding unit of each coded image block based on the gradient matrix;
And determining filter parameters of each coding unit based on gradient information of each coding unit.
13. A non-transitory computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the image filtering method according to any one of claims 1-12.
14. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the image filtering method of any of claims 1-12 when the program is executed by the processor.
15. A computer program product comprising a computer program which, when executed by a processor, implements the image filtering method according to any of claims 1-12.
CN202410169233.8A 2024-02-04 2024-02-04 Image filtering method, storage medium, electronic device and product Pending CN117979031A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410169233.8A CN117979031A (en) 2024-02-04 2024-02-04 Image filtering method, storage medium, electronic device and product


Publications (1)

Publication Number Publication Date
CN117979031A true CN117979031A (en) 2024-05-03

Family

ID=90857869

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410169233.8A Pending CN117979031A (en) 2024-02-04 2024-02-04 Image filtering method, storage medium, electronic device and product

Country Status (1)

Country Link
CN (1) CN117979031A (en)

Similar Documents

Publication Publication Date Title
US11838509B2 (en) Video coding method and apparatus
CN113766249B (en) Loop filtering method, device, equipment and storage medium in video coding and decoding
CN107071440B (en) Motion vector prediction using previous frame residuals
US11991399B2 (en) Apparatus and method for de-blocking filtering
EP3711302B1 (en) Spatially adaptive quantization-aware deblocking filter
JP2023159145A (en) Apparatus and method for deblocking filter in video coding
CN104883566B (en) The fast algorithm that a kind of intra prediction block size suitable for HEVC standard divides
US20240015310A1 (en) Multimedia data processing method, apparatus, device, computer-readable storage medium, and computer program product
KR20240032707A (en) Distributed computing system and method of artificial neural network
JP2024531738A (en) Image filtering method, device, equipment and program
CN117979031A (en) Image filtering method, storage medium, electronic device and product
WO2023133888A1 (en) Image processing method and apparatus, remote control device, system, and storage medium
CN120321416A (en) Video decoding method, video encoding method, storage medium, electronic device, and product
CN112468826A (en) VVC loop filtering method and system based on multilayer GAN
CN115280772A (en) Double Standard Block Partition Heuristic for Lossy Compression
RU2789030C2 (en) Device and method for deblocking filter in video encoding
US12167047B2 (en) Neural network-based deblocking filters
CN111770338B (en) Method, device and equipment for determining index value of coding unit and storage medium
WO2025018155A1 (en) Systems and methods for reducing distortion in end-to-end feature compession in coding of multi-dimensional data
Afsana Efficient video and point cloud compression through common information exploitation
KR20240030922A (en) Npu for distributing artificial neural networks based on mpeg-vcm and method thereof
CN120128717A (en) Video decoding method, video encoding method, device, storage medium and electronic device
JP2024543310A (en) Image filtering method and apparatus, and device
CN117119182A (en) Video data processing method, device, equipment and medium
CN118803236A (en) Adaptive quantization method, device, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination