
CN112153385B - Encoding processing method, device, equipment and storage medium - Google Patents


Info

Publication number
CN112153385B
Authority
CN
China
Prior art keywords
coding
coding mode
mode
target
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011343013.0A
Other languages
Chinese (zh)
Other versions
CN112153385A (en
Inventor
张宏顺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202011343013.0A priority Critical patent/CN112153385B/en
Publication of CN112153385A publication Critical patent/CN112153385A/en
Application granted granted Critical
Publication of CN112153385B publication Critical patent/CN112153385B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiment of the invention discloses an encoding processing method, device, equipment and storage medium based on cloud technology, and in particular relates to file reading and storage in cloud technology. The method comprises the following steps: determining a plurality of coding modes for coding a current coding block in an image to be coded, and determining an estimation parameter corresponding to each of the plurality of coding modes; grouping the plurality of coding modes according to the estimation parameter corresponding to each coding mode; selecting at least one candidate coding mode from the plurality of coding modes based on the grouping result; and determining a mode rate-distortion cost corresponding to each of the at least one candidate coding mode, and selecting the candidate coding mode whose mode rate-distortion cost meets a rate-distortion cost condition to code the current coding block. The embodiment of the invention can save the resources required by the encoding device during encoding.

Description

Encoding processing method, device, equipment and storage medium
Technical Field
The present application relates to the field of video encoding and decoding, and in particular, to a method, an apparatus, a device, and a storage medium for encoding processing.
Background
Video is a continuous sequence of images made up of successive frames, each frame being one image. Owing to the persistence of vision, when a sequence of frames is played at a certain rate, the human eye perceives a video with continuous motion. Because consecutive frames are extremely similar, the original video generally needs to be encoded and compressed to remove spatial and temporal redundancy before storage and transmission.
In general, video is encoded in units of one frame image. Specifically, a frame of image is sent to an encoder, the frame of image is first divided into a plurality of coding blocks, each coding block is subjected to coding processing, and finally, coding of the frame of image is completed. It can be seen that the coding block is the smallest coding unit in video coding. At present, when any coding block is coded, the general flow can be summarized as follows: determining a plurality of coding modes corresponding to the coding blocks; calculating the complete rate distortion cost of each coding mode for coding the coding block one by one; and finally, selecting the coding mode with the minimum complete rate-distortion cost to code the coding block.
Because the number of coding modes is large, calculating the complete rate-distortion cost for each coding mode in turn consumes considerable resources of the encoding device. Therefore, how to encode a coding block efficiently has become an active research topic in the field of video coding.
Disclosure of Invention
The embodiment of the invention provides a coding processing method, a coding processing device, coding processing equipment and a storage medium, which can save resources required by the coding processing equipment during coding.
In one aspect, an embodiment of the present invention provides an encoding processing method, where the encoding processing method includes:
determining a plurality of coding modes for coding a current coding block in an image to be coded, and determining an estimation parameter corresponding to each coding mode in the plurality of coding modes;
grouping the plurality of coding modes according to the estimation parameters corresponding to each coding mode;
selecting at least one candidate coding mode from the plurality of coding modes based on the grouping result;
and determining a mode rate distortion cost corresponding to each candidate coding mode in the at least one candidate coding mode, selecting a candidate coding mode with a mode rate distortion cost meeting a rate distortion cost condition, and coding the current coding block.
In one aspect, an embodiment of the present invention provides an encoding processing apparatus, including:
a determining unit, configured to determine a plurality of coding modes for coding a current coding block in an image to be coded, and to determine an estimation parameter corresponding to each coding mode in the plurality of coding modes;
a processing unit, configured to group the plurality of coding modes according to the estimation parameter corresponding to each coding mode;
a selection unit, configured to select at least one candidate coding mode from the plurality of coding modes based on the grouping result;
the determining unit is further configured to determine a mode rate-distortion cost corresponding to each candidate coding mode in the at least one candidate coding mode;
the selection unit is further configured to select a candidate coding mode whose mode rate-distortion cost meets a rate-distortion cost condition, and to code the current coding block with it.
In one aspect, an embodiment of the present invention provides an encoding processing apparatus, including:
a processor adapted to implement one or more instructions; and
a computer storage medium storing one or more instructions adapted to be loaded and executed by the processor to:
determining a plurality of coding modes for coding a current coding block in an image to be coded, and determining an estimation parameter corresponding to each coding mode in the plurality of coding modes;
grouping the plurality of coding modes according to the estimation parameters corresponding to each coding mode;
selecting at least one candidate coding mode from the plurality of coding modes based on the grouping result;
and determining a mode rate distortion cost corresponding to each candidate coding mode in the at least one candidate coding mode, selecting a candidate coding mode with a mode rate distortion cost meeting a rate distortion cost condition, and coding the current coding block.
In one aspect, an embodiment of the present invention provides a computer storage medium, where the computer storage medium stores computer program instructions, where the computer program instructions are executed by a processor, and are configured to perform:
determining a plurality of coding modes for coding a current coding block in an image to be coded, and determining an estimation parameter corresponding to each coding mode in the plurality of coding modes;
grouping the plurality of coding modes according to the estimation parameters corresponding to each coding mode;
selecting at least one candidate coding mode from the plurality of coding modes based on the grouping result;
and determining a mode rate distortion cost corresponding to each candidate coding mode in the at least one candidate coding mode, selecting a candidate coding mode with a mode rate distortion cost meeting a rate distortion cost condition, and coding the current coding block.
In one aspect, an embodiment of the present invention provides a computer program product or a computer program, where the computer program product or the computer program includes computer instructions stored in a computer-readable storage medium; the processor of the encoding processing device reads the computer instructions from the computer storage medium, and executes the encoding processing method.
In the embodiment of the invention, when a current coding block in an image to be coded is coded, after a plurality of coding modes corresponding to the current coding block are obtained, instead of directly calculating the mode rate distortion cost corresponding to each coding mode and then selecting the coding mode for coding the current coding block from the plurality of coding modes based on the mode rate distortion cost corresponding to each coding mode, the plurality of coding modes are grouped according to estimation parameters corresponding to each coding mode, and further, at least one candidate coding mode is screened out from the plurality of coding modes according to the grouping processing result; in turn, a coding mode for coding the current coding block is selected from the at least one candidate coding mode based on the mode rate distortion cost of each candidate coding mode. Therefore, as the number of coding modes needing to calculate the mode rate distortion cost is reduced, resources consumed by coding the current coding block can be saved, and coding processing resources of the image to be coded are saved.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings based on these drawings without creative effort.
Fig. 1a is a diagram of an encoding framework based on the AV1 video compression standard according to an embodiment of the present invention;
Fig. 1b is a schematic diagram of coding block division according to an embodiment of the present invention;
Fig. 1c is a diagram illustrating motion vector prediction derivation according to an embodiment of the present invention;
Fig. 1d is a schematic diagram of inter prediction mode selection according to an embodiment of the present invention;
Fig. 2 is a flowchart of an encoding processing method according to an embodiment of the present invention;
Fig. 3 is a flowchart of another encoding processing method according to an embodiment of the present invention;
Fig. 4 is a flowchart of yet another encoding processing method according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an encoding processing apparatus according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of an encoding processing device according to an embodiment of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
Video is a continuous sequence of images made up of successive frames, each frame being one image. Owing to the persistence of vision, when a sequence of frames is played at a certain rate, the human eye perceives a video with continuous motion. Because consecutive frames are extremely similar, the original video generally needs to be encoded and compressed to remove spatial and temporal redundancy before storage and transmission. For example, when a video is uploaded to the cloud, the video needs to be encoded in order to save transmission time and improve transmission efficiency; after the encoded video is downloaded from the cloud, it can be played after being decoded.
The original video is encoded and compressed according to a video compression standard. The development trend of future video is high definition, high frame rate and high compression rate, which requires video compression standards to be upgraded continuously. AV1 is the first-generation video coding standard developed by the Alliance for Open Media (AOM). It was finalized in 2018 and has received great attention and support from the industry since its introduction.
Compared with the traditional HEVC and AVC technologies, AV1 has clear advantages, such as a better compression ratio that saves more than 30% of bandwidth at the same quality. It also supports a wider range of content and can be used for streaming media, pictures, screen sharing, video game streaming and the like. Fig. 1a is an encoding framework diagram based on the AV1 video compression standard according to an embodiment of the present invention.
The process of encoding the current frame image with the encoding framework shown in Fig. 1a can be summarized as follows: after the current frame image is sent to the encoder, it is divided into coding tree units (CTUs) according to a preset block size; each CTU is then further partitioned into a plurality of coding units (CUs), and one coding unit may be referred to as one coding block. Each CU includes multiple prediction modes and transform units (TUs). Each CU is encoded, and a reconstructed frame image corresponding to the current frame image is obtained based on the encoding of each CU. After filtering, the reconstructed frame image is added to a reference frame queue to serve as a reference image for the next frame image, and subsequent frames are encoded in turn to achieve encoding compression of the video.
In one embodiment, there are 10 types of CU partitions in the AV1 video compression standard, and fig. 1b is a schematic diagram of a coding block CU partition according to an embodiment of the present invention. The 10 CU partition types correspond to 22 block sizes, which may be 4x4, 4x8, 8x4, 8x8, 8x16, 16x8, 16x16, 16x32, 32x16, 32x32, 32x64, 64x32, 64x64, 64x128, 128x64, 128x128, 4x16, 16x4, 8x32, 32x8, 16x64, and 64x16, respectively.
In one embodiment, each CU includes two prediction types, intra prediction and inter prediction, respectively, each prediction type including multiple prediction modes. The encoding process for a CU is implemented based on intra prediction and inter prediction.
Optionally, the intra prediction may include the following prediction modes: DC_PRED, the average of the top and left pixels; SMOOTH_PRED, which combines horizontal and vertical residuals; SMOOTH_V_PRED, vertical residual; SMOOTH_H_PRED, horizontal residual; PAETH_PRED, prediction from the minimum gradient direction; and the main directional modes V_PRED (vertical direction), H_PRED (horizontal direction), D45_PRED (45 degree direction), D135_PRED (135 degree direction), D113_PRED (113 degree direction), D157_PRED (157 degree direction), D203_PRED (203 degree direction) and D67_PRED (67 degree direction). Each main direction supports 6 angular offsets: plus or minus 3 degrees, plus or minus 6 degrees and plus or minus 9 degrees. The intra prediction may also include a palette prediction mode and an intra block copy prediction mode.
Optionally, the inter prediction may include 4 single prediction modes and 8 combined prediction modes. The 4 single prediction modes may be: NEARESTMV, NEARMV, GLOBALMV and NEWMV. The 8 combined prediction modes may be: NEAREST_NEARESTMV, NEAR_NEARMV, NEAREST_NEWMV, NEW_NEARESTMV, NEAR_NEWMV, NEW_NEARMV, GLOBAL_GLOBALMV and NEW_NEWMV. NEARESTMV and NEARMV mean that the motion vector (mv) of the prediction block is derived from the information of adjacent blocks, so the motion vector difference (mvd) does not need to be transmitted; NEWMV means that the mvd needs to be transmitted; GLOBALMV means that the mv information of the prediction block is derived from global motion.
Inter prediction can be divided into two processes: motion Estimation (ME) and Motion Compensation (MC). In most video sequences, the contents of adjacent frame images are very similar, and the change of background pictures is very small, so that all information of each frame image is not required to be coded, only the motion information of a moving object in the current frame image is required to be transmitted to a decoder, and the current frame image can be recovered by utilizing the contents of the previous frame image and the motion information of the current frame image.
The purpose of motion estimation is to find, in the reference frame image, the coding block that best matches the current coding block, and to calculate the motion vector (MV) between the best matching coding block and the current coding block. The reference frame image refers to an encoded and reconstructed frame image.
The basic principle of motion compensation is as follows: when the encoder processes the Nth frame image in the image sequence, a predicted frame image of the Nth frame image is obtained by using ME. During actual encoding and transmission, the Nth frame image itself is generally not transmitted; instead, the residual between the Nth frame image and the predicted frame image is transmitted.
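The residual idea described above can be illustrated with a minimal sketch. The function names `residual_block` and `reconstruct_block` and the sample pixel values are hypothetical, and the sketch deliberately ignores the transform, quantization and entropy coding stages that a real encoder applies to the residual.

```python
def residual_block(current, predicted):
    """Element-wise difference between the current block and its prediction."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, predicted)]


def reconstruct_block(predicted, residual):
    """Decoder side: add the received residual back onto the prediction."""
    return [[p + r for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(predicted, residual)]


# Hypothetical 2x2 luma samples, for illustration only.
current = [[120, 121], [119, 118]]
predicted = [[118, 121], [120, 118]]

residual = residual_block(current, predicted)
restored = reconstruct_block(predicted, residual)
```

Because the prediction is close to the current block, the residual contains mostly small values, which is what makes it cheaper to transmit than the block itself.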
Among the multiple prediction modes included in the inter prediction, NEARESTMV and NEARMV mean that the mv of the best matching coding block is derived from the neighboring block information of the current coding block, so there is no need to transmit the motion vector difference (MVD), which is equal to the difference between the motion vector and the predicted motion vector; NEWMV means that the MVD needs to be transmitted; GLOBALMV means that the mv of the best matching coding block is derived from global motion.
NEARESTMV, NEARMV and NEWMV rely on the derivation of the motion vector prediction (MVP). For a given reference frame image, the AV1 standard calculates 4 MVPs according to its rules.
The derivation procedure of the MVPs can be summarized as follows: scan, in a skipping pattern, the information of the coding blocks in the 1st, 3rd and 5th columns to the left and the 1st, 3rd and 5th rows above; first select the coding blocks that use the same reference frame image as the current coding block, and calculate the motion vector mv between each selected coding block and the current coding block; remove duplicate mvs; if fewer than 8 distinct mvs remain, select coding blocks whose reference frame image lies in the same direction as the reference frame image used by the current coding block, and continue to calculate the mv between the current coding block and each selected coding block; if there are still fewer than 8, fill the remainder with global motion vectors; after 8 mvs are selected, sort them by importance and keep the 4 most important mvs. The 0th mv corresponds to NEARESTMV, and the 1st to 3rd mvs correspond to NEARMV. NEWMV uses one of the 0th to 2nd mvs as its MVP. This is shown in Fig. 1c.
In one embodiment, there are 7 reference frame images provided by the embodiment of the present invention, which are respectively expressed as:
LAST_FRAME, LAST2_FRAME, LAST3_FRAME, GOLDEN_FRAME, BWDREF_FRAME, ALTREF2_FRAME, and ALTREF_FRAME.
For each of the 4 single prediction modes in inter prediction, there are 7 reference frame images; for each of the 8 combined prediction modes in inter prediction, there are 16 reference frame combinations, which are: {LAST_FRAME, ALTREF_FRAME}, {LAST2_FRAME, ALTREF_FRAME}, {LAST3_FRAME, ALTREF_FRAME}, {GOLDEN_FRAME, ALTREF_FRAME}, {LAST_FRAME, BWDREF_FRAME}, {LAST2_FRAME, BWDREF_FRAME}, {LAST3_FRAME, BWDREF_FRAME}, {GOLDEN_FRAME, BWDREF_FRAME}, {LAST_FRAME, ALTREF2_FRAME}, {LAST2_FRAME, ALTREF2_FRAME}, {LAST3_FRAME, ALTREF2_FRAME}, {GOLDEN_FRAME, ALTREF2_FRAME}, {LAST_FRAME, LAST2_FRAME}, {LAST_FRAME, LAST3_FRAME}, {LAST_FRAME, GOLDEN_FRAME}, and {BWDREF_FRAME, ALTREF_FRAME}.
Thus, there are 4x7+8x16=156 combinations for inter prediction. Any one combination corresponds to at most 3 mvps, and 4 processes are carried out for each mvp: motion estimation, inter_inter selection, interpolation mode selection and motion mode selection. A plurality of coding modes is obtained in this way.
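The count of 156 can be checked by enumerating the combinations. The following is a minimal sketch: the mode and reference frame names come from the text, while the variable names are illustrative, and the 16 compound reference combinations are represented only by an index so that the sketch does not depend on their exact membership.

```python
from itertools import product

SINGLE_MODES = ["NEARESTMV", "NEARMV", "GLOBALMV", "NEWMV"]
COMPOUND_MODES = [
    "NEAREST_NEARESTMV", "NEAR_NEARMV", "NEAREST_NEWMV", "NEW_NEARESTMV",
    "NEAR_NEWMV", "NEW_NEARMV", "GLOBAL_GLOBALMV", "NEW_NEWMV",
]
SINGLE_REFS = [
    "LAST_FRAME", "LAST2_FRAME", "LAST3_FRAME", "GOLDEN_FRAME",
    "BWDREF_FRAME", "ALTREF2_FRAME", "ALTREF_FRAME",
]
NUM_COMPOUND_REF_COMBOS = 16  # count taken from the text

# Every single prediction mode pairs with every single reference frame.
single_combos = list(product(SINGLE_MODES, SINGLE_REFS))
# Every combined prediction mode pairs with every reference frame combination.
compound_combos = list(product(COMPOUND_MODES, range(NUM_COMPOUND_REF_COMBOS)))

total_combos = len(single_combos) + len(compound_combos)
print(total_combos)
```

With at most 3 mvps per combination and several selection stages per mvp, the number of coding modes to evaluate grows well beyond the 156 base combinations.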
Fig. 1d is a schematic diagram of selecting the best result for any one combination according to an embodiment of the present invention. In Fig. 1d, N denotes the number of loop iterations, with N = 0 before the first iteration, and the iteration threshold is ref_set, which equals the number of mvps. At the start of each iteration, it is determined whether N is smaller than ref_set; if not, the loop ends. If so, the current mvp is acquired and N is incremented by 1. It is then determined whether the prediction mode of the current inter prediction includes NEWMV; if so, motion estimation is performed. If not, it is determined whether dual reference frame images are used: if not, interpolation mode selection and motion mode selection are performed under bestmv; if so, inter-inter fusion is performed, followed by interpolation mode selection and motion mode selection under bestmv. At the end of each iteration, the step of determining whether N is smaller than ref_set is repeated.
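The loop of Fig. 1d can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `evaluate_combination` and the stage labels are hypothetical, the stages are recorded as strings rather than executed, and it is assumed that after motion estimation the flow continues with the dual reference check, which the text does not state explicitly.

```python
def evaluate_combination(mvps, has_newmv, is_dual_reference):
    """Record which stages run for each mvp of one inter prediction combination."""
    ref_set = len(mvps)  # iteration threshold equals the number of mvps
    n = 0
    trace = []
    while n < ref_set:
        mvp = mvps[n]
        n += 1
        stages = []
        if has_newmv:
            # Modes containing NEWMV need a motion search.
            stages.append("motion_estimation")
        if is_dual_reference:
            # Assumption: this check also runs after motion estimation.
            stages.append("inter_inter_fusion")
        stages.append("interpolation_mode_selection")
        stages.append("motion_mode_selection")
        trace.append((mvp, stages))
    return trace


trace = evaluate_combination(["mvp0", "mvp1"], has_newmv=False, is_dual_reference=True)
for mvp, stages in trace:
    print(mvp, stages)
```

Each entry of the trace corresponds to one pass through the loop, so a combination with 3 mvps would yield three passes.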
In the prior art, in order to select a better coding mode from a plurality of coding modes for the current coding block, the method generally adopted is to perform coding prediction on the current coding block with each coding mode to obtain the complete rate-distortion cost corresponding to each coding mode, and then to select the coding mode with the minimum complete rate-distortion cost to encode the current coding block. The disadvantage of this method is that the number of coding modes is too large, so a large amount of redundant calculation exists and the encoding speed is reduced.
In view of this, the embodiment of the present invention observes that, among the plurality of coding modes, some coding modes produce the same predicted pixels but differ in syntax. In this case, the estimated distortion and the estimated residual bit number, both of which are derived from the residual between the predicted value and the current coding block, are the same; however, since the bit numbers of other elements are also added when the estimated rate-distortion cost is calculated, the estimated rate-distortion costs of these coding modes differ. Moreover, if the predicted pixels are the same, the distortion and the residual bit number obtained when the complete rate-distortion cost is calculated are also the same. Therefore, an embodiment of the present invention provides an encoding processing scheme, specifically: determining a plurality of coding modes for coding a current coding block in an image to be coded, and determining an estimation parameter corresponding to each coding mode in the plurality of coding modes; grouping the plurality of coding modes according to the estimation parameter corresponding to each coding mode; selecting at least one candidate coding mode from the plurality of coding modes based on the grouping result; and determining a mode rate-distortion cost corresponding to each candidate coding mode in the at least one candidate coding mode, and selecting the candidate coding mode whose mode rate-distortion cost meets a rate-distortion cost condition to code the current coding block. Because the plurality of coding modes is screened before the complete rate-distortion cost (hereinafter called the mode rate-distortion cost) is calculated, the number of coding modes for which the complete rate-distortion cost must be calculated is reduced, which saves the resources consumed in coding the current coding block and, in turn, in coding the image to be coded.
Based on the above coding processing scheme, an embodiment of the present invention provides a coding processing method, and referring to fig. 2, a flowchart of the coding processing method provided by the embodiment of the present invention is shown. The encoding processing method shown in fig. 2 may be executed by an encoding processing apparatus, and may specifically be executed by a processor of the encoding processing apparatus. The encoding processing device can be a terminal or a server. The terminal can be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart sound box, a smart watch, a smart car, and the like, but is not limited thereto; the server can be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, and can also be a cloud server for providing basic cloud computing services such as cloud service, a cloud database, cloud computing, cloud functions, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN, big data and artificial intelligence platform and the like. The encoding processing method shown in fig. 2 may include the steps of:
step S201, determining a plurality of coding modes for coding a current coding block in an image to be coded, and determining an estimation parameter corresponding to each coding mode in the plurality of coding modes.
In one embodiment, the current coding block in the image to be coded refers to any coding block in the image to be coded that has not yet been coded. As can be seen from the foregoing, intra prediction and inter prediction need to be performed when each coding block is coded. When inter prediction is performed on the current coding block, 156 inter prediction combinations can be obtained by selecting among the multiple prediction modes and multiple reference frame images, and a plurality of coding modes can be obtained by applying the selection flow of Fig. 1d to each combination.
In one embodiment, the estimation parameter corresponding to each coding mode may be determined according to the residual between the predicted value corresponding to that coding mode and the current coding block. The estimation parameter corresponding to each coding mode may include the estimated distortion and the estimated residual bit number corresponding to that coding mode. Taking a target coding mode included in the plurality of coding modes as an example, the estimation parameter is determined as follows: performing coding prediction on the current coding block with the target coding mode to obtain a predicted value corresponding to the target coding mode; determining the residual between the predicted value corresponding to the target coding mode and the current coding block, and determining, based on the residual, an estimated distortion rate and an estimated residual bit rate corresponding to the target coding mode; multiplying the total number of pixels in the current coding block by the estimated distortion rate to obtain the estimated distortion corresponding to the target coding mode; and multiplying the total number of pixels by the estimated residual bit rate to obtain the estimated residual bit number corresponding to the target coding mode.
For example, assuming that the estimated distortion corresponding to the target coding mode is denoted as dist', the estimated residual bit number corresponding to the target coding mode is denoted as bit_coff', the estimated distortion rate is denoted as dist_ratio, the estimated residual bit rate is denoted as rate_ratio, and the total number of pixels in the current coding block is denoted as num_samples, the estimated distortion and the estimated residual bit number can be calculated as shown in the following formula (1) and formula (2):
dist' = dist_ratio * num_samples    (1)
bit_coff' = rate_ratio * num_samples    (2)
According to formula (1) and formula (2), multiplying the estimated distortion rate by the total number of pixels in the current coding block gives the estimated distortion, and multiplying the estimated residual bit rate by the total number of pixels in the current coding block gives the estimated residual bit number.
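Formulas (1) and (2) can be written as a small helper. This is a minimal sketch: the function name `estimate_parameters` and the numeric inputs are hypothetical, and how dist_ratio and rate_ratio are derived from the residual is not modeled here.

```python
def estimate_parameters(dist_ratio, rate_ratio, num_samples):
    """Return (estimated distortion, estimated residual bit number)."""
    est_dist = dist_ratio * num_samples      # formula (1)
    est_bit_coff = rate_ratio * num_samples  # formula (2)
    return est_dist, est_bit_coff


# Hypothetical per-pixel rates for a 16x16 coding block.
num_samples = 16 * 16
est_dist, est_bit_coff = estimate_parameters(1.5, 0.25, num_samples)
print(est_dist, est_bit_coff)
```

Both estimates scale linearly with the block size, so two coding modes evaluated on the same block have equal estimates exactly when their per-pixel rates are equal.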
Step S202, grouping a plurality of coding modes according to the estimation parameters corresponding to each coding mode.
In one embodiment, grouping the plurality of coding modes according to the estimation parameter corresponding to each coding mode includes: comparing the estimated distortion and the estimated residual bit number corresponding to each coding mode with those of the other coding modes; if there are other coding modes having the same estimated distortion and the same estimated residual bit number as any coding mode, forming a first-type coding mode group from that coding mode and the other coding modes; and if there is no other coding mode having the same estimated distortion and the same estimated residual bit number as that coding mode, adding that coding mode to the second-type coding mode group. It can be seen that the number of first-type coding mode groups is at least one, that coding modes located in different first-type coding mode groups have different estimated distortions and different estimated residual bit numbers, and that the number of second-type coding mode groups is one. That is, coding modes having the same estimated distortion and the same estimated residual bit number among the plurality of coding modes are grouped into a first-type coding mode group, and the remaining coding modes among the plurality of coding modes are grouped into the second-type coding mode group.
For example, the plurality of coding modes includes 7 coding modes: coding mode 1, coding mode 2, coding mode 3, coding mode 4, coding mode 5, coding mode 6, and coding mode 7. Assume that coding mode 1, coding mode 2, and coding mode 3 have the same estimated distortion and the same estimated residual bit number; that coding mode 4 and coding mode 5 have the same estimated distortion and the same estimated residual bit number, which differ from those of the other coding modes; that the estimated distortion and the estimated residual bit number of coding mode 6 differ from those of the other coding modes; and that, similarly, the estimated distortion and the estimated residual bit number of coding mode 7 differ from those of the other coding modes.
Based on the above assumption, the grouping process for the plurality of coding modes may be: coding mode 1, coding mode 2, and coding mode 3 form one first-type coding mode group; coding mode 4 and coding mode 5 form another first-type coding mode group; and coding mode 6 and coding mode 7 are added to the second-type coding mode group.
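The grouping in this example can be sketched as follows. This is only an illustration under stated assumptions: the function name group_modes, the dictionary layout, and the numeric estimates are hypothetical, and equality of the estimates is tested by exact comparison of the (estimated distortion, estimated residual bit number) pair.

```python
from collections import defaultdict


def group_modes(estimates):
    """Group coding modes by identical (dist', bit_coff') pairs.

    estimates: dict mapping mode id -> (dist_est, bit_coff_est).
    Returns (first_type_groups, second_type_group).
    """
    buckets = defaultdict(list)
    for mode, key in estimates.items():
        buckets[key].append(mode)
    # Two or more modes sharing the same pair form a first-type group.
    first_type_groups = [m for m in buckets.values() if len(m) > 1]
    # Modes whose pair is unique go to the single second-type group.
    second_type_group = [m[0] for m in buckets.values() if len(m) == 1]
    return first_type_groups, second_type_group


# Hypothetical estimates reproducing the seven-mode example.
estimates = {
    1: (100.0, 40.0), 2: (100.0, 40.0), 3: (100.0, 40.0),
    4: (80.0, 55.0), 5: (80.0, 55.0),
    6: (120.0, 30.0),
    7: (90.0, 50.0),
}
first, second = group_modes(estimates)
print(first)   # [[1, 2, 3], [4, 5]]
print(second)  # [6, 7]
```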
Step S203, selecting at least one candidate coding mode from the plurality of coding modes based on the grouping processing result.
In an embodiment, before performing step S203, the encoding processing device may calculate an estimated rate-distortion cost corresponding to each coding mode according to the estimation parameter corresponding to that coding mode, specifically: acquiring the number of syntax element bits in the current coding block; and inputting the number of syntax element bits, the estimated distortion corresponding to each coding mode, and the estimated residual bit number corresponding to that coding mode into an estimated rate-distortion cost determination rule for operation to obtain the estimated rate-distortion cost corresponding to that coding mode. Assume that the estimated rate-distortion cost determination rule is as shown in the following formula (3):
rdcost' = dist' + (bit_sum + bit_coff') × λ    (3)
In formula (3), rdcost' represents the estimated rate-distortion cost corresponding to any coding mode, bit_sum represents the number of syntax element bits of the current coding block, dist' represents the estimated distortion corresponding to that coding mode, bit_coff' represents the estimated residual bit number corresponding to that coding mode, and λ represents the Lagrangian constant. The estimated rate-distortion cost corresponding to each coding mode can be calculated according to formula (3).
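Formula (3) can be illustrated with the following minimal sketch. The function name and the numeric inputs are hypothetical; the value used for λ is not taken from the source.

```python
def estimated_rd_cost(dist_est, bit_sum, bit_coff_est, lam):
    """Formula (3): rdcost' = dist' + (bit_sum + bit_coff') * lambda."""
    return dist_est + (bit_sum + bit_coff_est) * lam


# Hypothetical inputs: dist' = 100, bit_sum = 8, bit_coff' = 40, lambda = 0.5.
print(estimated_rd_cost(100.0, 8, 40.0, 0.5))  # 124.0
```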
After the estimated rate-distortion cost corresponding to each coding mode is determined, at least one candidate coding mode is selected from the plurality of coding modes by combining the estimated rate-distortion cost corresponding to each coding mode with the grouping processing result obtained by grouping the plurality of coding modes. In a specific implementation, selecting at least one candidate coding mode from the plurality of coding modes based on the grouping processing result may include: obtaining the estimated rate-distortion cost corresponding to each coding mode included in each first-type coding mode group; determining the coding mode with the minimum estimated rate-distortion cost in each first-type coding mode group as a candidate coding mode; and determining each coding mode in the second-type coding mode group as a candidate coding mode.
For example, based on the above example, assuming that the estimated rate-distortion cost of coding mode 1 is the smallest in the first-type coding mode group consisting of coding mode 1, coding mode 2, and coding mode 3, coding mode 1 is taken as a candidate coding mode; and in the first-type coding mode group consisting of coding mode 4 and coding mode 5, if the estimated rate-distortion cost of coding mode 4 is the smallest, coding mode 4 is taken as a candidate coding mode. Coding mode 6 and coding mode 7 in the second-type coding mode group are also taken as candidate coding modes. Thus, the at least one candidate coding mode finally determined from the plurality of coding modes comprises: coding mode 1, coding mode 4, coding mode 6, and coding mode 7.
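The candidate selection described above can be sketched as follows. The function name select_candidates and the estimated costs are hypothetical values chosen so that coding mode 1 and coding mode 4 are the cheapest members of their groups, as in the example.

```python
def select_candidates(first_type_groups, second_type_group, est_cost):
    """Keep the cheapest mode of each first-type group and every mode
    of the second-type group."""
    candidates = [min(group, key=lambda m: est_cost[m])
                  for group in first_type_groups]
    candidates.extend(second_type_group)
    return candidates


# Hypothetical estimated rate-distortion costs for the seven modes.
est_cost = {1: 120.0, 2: 125.0, 3: 130.0, 4: 110.0, 5: 118.0,
            6: 140.0, 7: 115.0}
print(select_candidates([[1, 2, 3], [4, 5]], [6, 7], est_cost))  # [1, 4, 6, 7]
```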
Step S204, determining a mode rate distortion cost corresponding to each candidate coding mode in at least one candidate coding mode, and selecting the candidate coding mode with the mode rate distortion cost meeting a rate distortion cost condition to perform coding processing on the current coding block.
In an embodiment, the mode rate distortion cost corresponding to each candidate coding mode is calculated by the same method, and how to calculate it is described below by taking a target candidate coding mode in the at least one candidate coding mode as an example. Specifically, the method comprises the following steps: acquiring the number of syntax element bits included in the current coding block; performing coding prediction on the current coding block in the target candidate coding mode to obtain a predicted value corresponding to the target candidate coding mode; performing preset transformation processing on the predicted value corresponding to the target candidate coding mode and the current coding block to obtain a reconstruction value corresponding to the target candidate coding mode; and determining a mode rate distortion cost corresponding to the target candidate coding mode based on a residual between the reconstruction value and the current coding block.
In a specific implementation, performing the preset transformation processing on the predicted value corresponding to the target candidate coding mode and the current coding block to obtain the reconstruction value corresponding to the target candidate coding mode may include: subtracting the predicted value from the input pixel values of the current coding block to obtain a residual; performing transformation, quantization, inverse quantization, and inverse transformation on the residual to obtain restored residual data; and adding the restored residual data to the predicted value to obtain the reconstruction value corresponding to the target candidate coding mode.
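The reconstruction chain above (residual, transformation, quantization, inverse quantization, inverse transformation, addition of the predicted value) can be sketched as follows. This is a simplified illustration under several assumptions that are not stated in the source: a floating-point orthonormal DCT-II stands in for the codec's transform, a single uniform quantization step qstep is applied to every coefficient, and the block is square. All function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows (assumed transform)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(r) for r in zip(*a)]


def reconstruct(src, pred, qstep):
    """Return the reconstruction value of a square block."""
    c = dct_matrix(len(src))
    ct = transpose(c)
    # Residual = input pixel values minus predicted values.
    residual = [[s - p for s, p in zip(sr, pr)] for sr, pr in zip(src, pred)]
    coeffs = matmul(matmul(c, residual), ct)                      # transformation
    levels = [[round(v / qstep) for v in row] for row in coeffs]  # quantization
    dequant = [[lv * qstep for lv in row] for row in levels]      # inverse quantization
    restored = matmul(matmul(ct, dequant), c)                     # inverse transformation
    # Reconstruction = restored residual plus predicted values.
    return [[p + r for p, r in zip(pr, rr)] for pr, rr in zip(pred, restored)]


src = [[10, 12], [14, 16]]
pred = [[8, 8], [8, 8]]
recon_fine = reconstruct(src, pred, 0.001)     # fine step: close to src
recon_coarse = reconstruct(src, pred, 1000.0)  # coarse step: residual is lost
```

With a very small step the reconstruction approaches the input block, while with a very large step all coefficients quantize to zero and the reconstruction equals the prediction.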
Optionally, determining, based on the residual between the reconstruction value and the current coding block, the mode rate distortion cost corresponding to the target candidate coding mode includes: determining a target distortion and a target residual bit number corresponding to the target candidate coding mode based on the residual between the reconstruction value and the current coding block; and inputting the number of syntax element bits, the target distortion corresponding to the target candidate coding mode, and the target residual bit number corresponding to the target candidate coding mode into a mode rate distortion cost determination rule for operation to obtain the mode rate distortion cost corresponding to the target candidate coding mode.
For example, the mode rate distortion cost determination rule may be expressed as the following formula (4):
rdcost = dist + (bit_sum + bit_coff) × λ    (4)
In formula (4), rdcost represents the mode rate distortion cost of the target candidate coding mode, dist represents the target distortion corresponding to the target candidate coding mode, bit_coff represents the target residual bit number corresponding to the target candidate coding mode, bit_sum represents the number of syntax element bits corresponding to the current coding block, and λ represents the Lagrangian constant.
The mode rate distortion cost for each candidate coding mode can be obtained according to formula (4). Further, a candidate coding mode whose mode rate distortion cost satisfies the rate distortion cost condition is selected from the plurality of candidate coding modes to code the current coding block. A mode rate distortion cost satisfies the rate distortion cost condition when it is smaller than a certain rate distortion cost threshold, or when it is the smallest among the mode rate distortion costs of the plurality of candidate coding modes.
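A minimal sketch of formula (4) combined with the minimum-cost variant of the rate distortion cost condition is given below. The function names, the tuple layout, and all numeric values are hypothetical.

```python
def mode_rd_cost(dist, bit_sum, bit_coff, lam):
    """Formula (4): rdcost = dist + (bit_sum + bit_coff) * lambda."""
    return dist + (bit_sum + bit_coff) * lam


def pick_best_mode(candidates, lam):
    """candidates: dict mode id -> (dist, bit_sum, bit_coff).
    Returns the mode with the smallest mode rate distortion cost."""
    costs = {mode: mode_rd_cost(d, s, c, lam)
             for mode, (d, s, c) in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs


# Hypothetical target distortions and residual bit numbers.
candidates = {1: (100.0, 8, 40), 4: (90.0, 8, 52),
              6: (130.0, 8, 20), 7: (95.0, 8, 60)}
best, costs = pick_best_mode(candidates, 0.5)
print(best, costs[best])  # 4 120.0
```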
In one embodiment, selecting the candidate coding mode whose mode rate distortion cost satisfies the rate distortion cost condition to code the current coding block may include: selecting the candidate coding mode with the minimum mode rate distortion cost from the plurality of candidate coding modes, and determining a residual between the predicted value corresponding to the selected candidate coding mode and the current coding block to obtain a residual coefficient; and coding the current coding block according to the residual coefficient. In a specific implementation, coding the current coding block according to the residual coefficient may mean sending the residual coefficient to an entropy coding module to output a code stream.
In the embodiment of the present invention, when a current coding block in an image to be coded is coded, after a plurality of coding modes corresponding to the current coding block are obtained, instead of directly calculating a mode rate distortion cost corresponding to each coding mode and then selecting a coding mode for coding the current coding block from the plurality of coding modes based on the mode rate distortion cost corresponding to each coding mode, the plurality of coding modes are grouped according to an estimation parameter corresponding to each coding mode, further, at least one candidate coding mode is screened out from the plurality of coding modes according to a grouping processing result, and further, the coding mode for coding the current coding block is selected from the at least one candidate coding mode based on the mode rate distortion cost of each candidate coding mode. Therefore, as the number of coding modes needing to calculate the mode rate distortion cost is reduced, resources consumed by coding the current coding block can be saved, and coding processing resources of the image to be coded are saved.
Based on the above coding processing method, another coding processing method is provided in the embodiment of the present invention, referring to fig. 3, which is a flowchart of another coding processing method provided in the embodiment of the present invention, and the coding processing method shown in fig. 3 may be executed by a coding processing device, and specifically may be executed by a processor of the coding processing device. The encoding processing device can be a terminal or a server. The terminal can be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart sound box, a smart watch, a smart car, and the like, but is not limited thereto; the server can be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, and can also be a cloud server for providing basic cloud computing services such as cloud service, a cloud database, cloud computing, cloud functions, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN, big data and artificial intelligence platform and the like. The encoding processing method shown in fig. 3 may include the following steps:
Step S301, determining a plurality of coding modes for coding a current coding block in an image to be coded, and determining an estimated distortion and an estimated residual bit number corresponding to each coding mode in the plurality of coding modes.
As can be seen from the foregoing, the estimated distortion corresponding to each coding mode is determined according to the total number of pixel points in the current coding block and the estimated distortion rate corresponding to that coding mode, and the estimated residual bit number corresponding to each coding mode is determined according to the total number of pixel points in the current coding block and the estimated residual bit rate corresponding to that coding mode. It can be seen that, in order to obtain the estimated distortion and the estimated residual bit number corresponding to each coding mode, the estimated distortion rate and the estimated residual bit rate corresponding to each coding mode need to be determined.
The following describes how to determine the estimated distortion rate and the estimated residual bit rate corresponding to each coding mode by taking the target coding mode of the plurality of coding modes as an example. Specifically, the method comprises the following steps: acquiring size information of the current coding block; inputting the size information of the current coding block and the residual between the predicted value corresponding to the target coding mode and the current coding block into a mean square error calculation rule for operation to obtain a mean square error value corresponding to the target coding mode; inputting the mean square error value corresponding to the target coding mode into an estimation factor determination rule for calculation to obtain an estimation factor corresponding to the target coding mode; obtaining the estimated distortion rate corresponding to the target coding mode based on an estimated distortion rate determination rule and the estimation factor corresponding to the target coding mode; and obtaining the estimated residual bit rate corresponding to the target coding mode based on an estimated residual bit rate determination rule and the estimation factor corresponding to the target coding mode.
The size information of the current coding block may refer to width information and height information of the current coding block; the width information may refer to the number of pixel points included in the width direction, and the height information may refer to the number of pixel points included in the height direction. Let m denote the width information of the current coding block and n denote the height information of the current coding block. The residual between the predicted value corresponding to the target coding mode and the current coding block may refer to the residual between the pixel value of the pixel point at each position in the current coding block and the predicted pixel value predicted for the corresponding position in the target coding mode.
Optionally, assume that the position of any pixel point in the current coding block is represented as [i][j], the pixel value of that pixel point in the current coding block is represented as src[i][j], and the predicted pixel value predicted for the corresponding position in the target coding mode is represented as dst[i][j]. Then inputting the size information of the current coding block and the residual between the predicted value corresponding to the target coding mode and the current coding block into the mean square error calculation rule for operation to obtain the mean square error value corresponding to the target coding mode can be represented by the following formula (5), in which sse_norm represents the mean square error value corresponding to the target coding mode:
sse_norm = (1/(m×n)) × Σ_{i=1}^{m} Σ_{j=1}^{n} (dst[i][j] - src[i][j])^2    (5)
Further, as can be seen from the above steps, after the mean square error value corresponding to the target coding mode is obtained, the estimation factor corresponding to the target coding mode can be obtained based on the mean square error value and the estimation factor determination rule. As an alternative embodiment, the estimation factor determination rule may be expressed as shown in the following formula (6):
xqr = log2(sse_norm / (qstep * qstep))    (6)
In formula (6), xqr represents the estimation factor and qstep represents the quantization step size of the AC component of the current coding block. It should be understood that after a predicted value is obtained in any coding mode, transformation and quantization need to be performed to obtain a reconstructed value, and the current coding block is then coded based on the residual between the reconstructed value and the current coding block. After transformation, the energy is concentrated in the upper-left corner of the block. For example, if the size of the current coding block is 32x32, the first position point (0, 0) after transformation represents the DC component and the remaining position points are referred to as AC components; the quantization step size of the DC component is different from that of the AC components.
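Formula (5) and formula (6) can be sketched together as follows. The function names are hypothetical, the block is assumed to be stored as a list of rows, and returning negative infinity when the mean square error value is zero is an assumption made here so that the logarithm is defined; the source does not describe that case.

```python
import math


def mean_square_error(src, dst):
    """Formula (5): mean of squared differences over all pixel points."""
    rows, cols = len(src), len(src[0])
    total = sum((dst[i][j] - src[i][j]) ** 2
                for i in range(rows) for j in range(cols))
    return total / (rows * cols)


def estimation_factor(sse_norm, qstep):
    """Formula (6): xqr = log2(sse_norm / (qstep * qstep))."""
    if sse_norm == 0:
        return float("-inf")  # assumption: the source does not cover this case
    return math.log2(sse_norm / (qstep * qstep))


src = [[10, 12], [14, 16]]   # hypothetical original pixel values
dst = [[9, 12], [16, 16]]    # hypothetical predicted pixel values
print(mean_square_error(src, dst))   # 1.25
print(estimation_factor(16.0, 2.0))  # 2.0
```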
In video compression coding, in order to control the bandwidth, the coding coefficients are reduced by quantization, where quantization can be understood as turning a continuous line into a stepped line. The strength of quantization is mainly determined by the quantization parameter qp. The quantization step size qstep is related to qp: qstep increases as qp increases, and qstep doubles each time qp increases by 6.
In one embodiment, after the estimation factor corresponding to the target coding mode is obtained, the estimation factor may be corrected before the estimated distortion rate and the estimated residual bit rate corresponding to the target coding mode are determined based on it. The purpose of the correction is to clip the range of xqr so that xqr lies between -14.999999 and 14.999999.
In one embodiment, the corrected estimation factor can be expressed by the following formula (7), wherein x' represents the corrected estimation factor in formula (7):
x' = 2 * xqr + 31    (7)
Optionally, after the estimation factor is corrected, the estimated distortion rate corresponding to the target coding mode may be obtained based on the corrected estimation factor and the estimated distortion rate determination rule. In a specific implementation, this includes: correcting the estimation factor and obtaining the fractional part value of the corrected estimation factor; obtaining estimated distortion rate parameters according to the distortion rate parameter comparison table; and inputting the fractional part value and the estimated distortion rate parameters into the estimated distortion rate determination rule for operation to obtain the estimated distortion rate corresponding to the target coding mode.
The embodiment of obtaining the fractional part value in the estimation factor after the correction processing may be: obtaining the maximum integer less than or equal to the estimation factor after the correction processing; and performing subtraction operation on the corrected estimation factor and the maximum integer to obtain an operation result, namely the fractional part numerical value in the corrected estimation factor. For example, obtaining the largest integer less than or equal to the estimation factor after the correction processing can be achieved by the following equation (8):
xi = floor(x')    (8)
In formula (8), floor(x') represents taking the largest integer not greater than x', that is, obtaining the largest integer less than or equal to the corrected estimation factor. Further, based on formula (8), the fractional part value of the corrected estimation factor can be expressed by the following formula (9):
x = x' - xi    (9)
where x represents the fractional value of the estimation factor after the correction process.
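The correction of the estimation factor and formulas (7) to (9) can be sketched as follows. The sketch assumes that the clipping of xqr is applied first and that formula (7) is then applied to the clipped value; the function name and the constant names are hypothetical.

```python
import math

XQR_MIN = -14.999999  # clipping bounds stated in the text
XQR_MAX = 14.999999


def split_estimation_factor(xqr):
    """Return (xi, x): integer part and fractional part of x'."""
    xqr = max(XQR_MIN, min(XQR_MAX, xqr))  # clip the range of xqr
    x_prime = 2 * xqr + 31                 # formula (7)
    xi = math.floor(x_prime)               # formula (8)
    x = x_prime - xi                       # formula (9)
    return xi, x


print(split_estimation_factor(0.25))  # (31, 0.5)
```

Under this reading, x' always lies between 1 and 61, so xi - 1 and the three positions after it stay inside a 65-element table row.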
After the fractional part value of the corrected estimation factor is obtained through formula (8) and formula (9), the estimated distortion rate parameters are obtained according to the distortion rate parameter comparison table, and the fractional part value and the estimated distortion rate parameters are then input into the estimated distortion rate determination rule for operation to obtain the estimated distortion rate corresponding to the target coding mode.
In one embodiment, the estimated distortion rate determination rule may be expressed as shown in the following equation (10):
f(x) = a0 + a1*x + a2*x^2 + a3*x^3    (10)
In formula (10), f(x) represents the estimated distortion rate, a0, a1, a2, and a3 represent the estimated distortion rate parameters, and x represents the fractional part value of the corrected estimation factor. Here, a0, a1, a2, and a3 are determined according to the distortion rate parameters included in the distortion rate parameter comparison table.
Optionally, the distortion rate parameter comparison table may be a two-dimensional array. For example, the distortion rate parameter comparison table is a 2×65 two-dimensional array, that is, the table consists of 2 arrays each including 65 elements, which can be expressed as:
p_dist_curv[2][65]={{16.000000,15.962891,15.925174,15.886888,15.848074,15.808770,15.769015,15.728850,15.688313,15.647445,15.606284,15.564870,15.525918,15.483820,15.373330,15.126844,14.637442,14.184387,13.560070,12.880717,12.165995,11.378144,10.438769,9.130790,7.487633,5.688649,4.267515,3.196300,2.434201,1.834064,1.369920,1.035921,0.775279,0.574895,0.427232,0.233236,0.171440,0.128188,0.092762,0.067569,0.049324,0.036330,0.027008,0.019853,0.015539,0.011093,0.008733,0.007624,0.008105,0.005427,0.004065,0.003427,0.002848,0.002328,0.001865,0.001457,0.001103,0.000801,0.000550,0.000348,0.000193,0.000085,0.000021,0.000000},
{16.000000,15.996116,15.984769,15.966413,15.941505,15.910501,15.873856,15.832026,15.785466,15.734633,15.679981,15.621967,15.560961,15.460157,15.288367,15.052462,14.466922,13.921212,13.073692,12.222005,11.237799,9.985848,8.898823,7.423519,5.995325,4.773152,3.744032,2.938217,2.294526,1.762412,1.327145,1.020728,0.765535,0.570548,0.425833,0.313825,0.232959,0.171324,0.128174,0.092750,0.067558,0.049319,0.036330,0.027008,0.019853,0.015539,0.011093,0.008733,0.007624,0.008105,0.005427,0.004065,0.003427,0.002848,0.002328,0.001865,0.001457,0.001103,0.000801,0.000550,0.000348,0.000193,0.000085,0.000021,0.000000}}。
In one embodiment, determining the estimated distortion rate parameters according to the distortion rate parameter comparison table may be represented by the following formulas (11) to (14):
a0 = p[1]    (11)
a1 = (p[2] - p[0]) / 2    (12)
a2 = 2×p[0] - 5×p[1] + 4×p[2] - p[3]    (13)
a3 = 3×(p[1] - p[2]) + p[3] - p[0]    (14)
Specifically, p[0] = p_dist_curv[dist_index][xi-1], and p[1], p[2], and p[3] are the values located one, two, and three positions after p[0], respectively, in the same row of the above distortion rate parameter comparison table. Then p[0], p[1], p[2], and p[3] are substituted into formulas (11) to (14) to obtain the estimated distortion rate parameters of the target coding mode.
Here, xi is obtained according to formula (8), and dist_index is determined according to the mean square error value corresponding to the target coding mode, as shown in the following formula (15):
dist_index = 0, if sse_norm ≤ 16; dist_index = 1, otherwise    (15)
In formula (15), sse_norm represents the mean square error value corresponding to the target coding mode.
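Formulas (10) to (15) can be sketched as follows. The coefficients are implemented exactly as written in formulas (11) to (14). To keep the example short, a small hypothetical table toy_table with two five-element rows is used in place of the 2×65 distortion rate parameter comparison table; the function names are also hypothetical.

```python
def cubic_params(p):
    """Formulas (11)-(14): coefficients from four consecutive table values."""
    a0 = p[1]
    a1 = (p[2] - p[0]) / 2
    a2 = 2 * p[0] - 5 * p[1] + 4 * p[2] - p[3]
    a3 = 3 * (p[1] - p[2]) + p[3] - p[0]
    return a0, a1, a2, a3


def estimated_distortion_rate(table, sse_norm, xi, x):
    """Formula (10), with the row chosen by formula (15)."""
    dist_index = 0 if sse_norm <= 16 else 1   # formula (15)
    row = table[dist_index]
    p = row[xi - 1: xi + 3]                   # p[0] .. p[3]
    a0, a1, a2, a3 = cubic_params(p)
    return a0 + a1 * x + a2 * x ** 2 + a3 * x ** 3


# Hypothetical stand-in table; each row is linear, so a2 and a3 vanish.
toy_table = [
    [8.0, 6.0, 4.0, 2.0, 0.0],
    [16.0, 12.0, 8.0, 4.0, 0.0],
]
print(estimated_distortion_rate(toy_table, 10.0, 1, 0.5))   # 5.0
print(estimated_distortion_rate(toy_table, 20.0, 2, 0.25))  # 7.0
```

For x = 0 the result equals p[1], the table value at position xi, and for linearly spaced table values the rule reduces to linear interpolation between p[1] and p[2].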
In one embodiment, obtaining the estimated residual bit rate corresponding to the target coding mode based on the estimation factor corresponding to the target coding mode and the estimated residual bit rate determination rule includes: obtaining estimated residual bit rate parameters according to the residual bit rate parameter comparison table; and inputting the fractional part value and the estimated residual bit rate parameters into the estimated residual bit rate determination rule for operation to obtain the estimated residual bit rate corresponding to the target coding mode.
Alternatively, the estimated residual bit rate determination rule may be expressed as shown in the following equation (16):
w(x) = b0 + b1*x + b2*x^2 + b3*x^3    (16)
In formula (16), w(x) represents the estimated residual bit rate, b0, b1, b2, and b3 are the estimated residual bit rate parameters, and x represents the fractional part value of the corrected estimation factor. Here, b0, b1, b2, and b3 are determined according to the residual bit rate parameters included in the residual bit rate parameter comparison table.
Optionally, the residual bit rate parameter comparison table may be a two-dimensional array. For example, the residual bit rate parameter comparison table may be a 4×65 two-dimensional array, that is, the table consists of 4 arrays each including 65 elements, which may be specifically expressed as:
q_rate_curv[4][65]={{0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,118.257702,120.210658,121.434853,122.100487,122.377758,122.436865,72.290102,96.974289,101.652727,126.830141,140.417377,157.644879,184.315291,215.823873,262.300169,335.919859,420.624173,519.185032,619.854243,726.053595,827.663369,933.127475,1037.988755,1138.839609,1233.342933,1333.508064,1428.760126,1533.396364,1616.952052,1744.539319,1803.413586,1951.466618,1994.227838,2086.031680,2148.635443,2239.068450,2222.590637,2338.859809,2402.929011,2418.727875,2435.342670,2471.159469,2523.187446,2591.183827,2674.905840,2774.110714,2888.555675,3017.997952,3162.194773,3320.903365,3493.880956,3680.884773,3881.672045,4096.000000},
{0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,13.087244,15.919735,25.930313,24.412411,28.567417,29.924194,30.857010,32.742979,36.382570,39.210386,42.265690,47.378572,57.014850,82.740067,137.346562,219.968084,316.781856,415.643773,516.706538,614.914364,714.303763,815.512135,911.210485,1008.501528,1109.787854,1213.772279,1322.922561,1414.752579,1510.505641,1615.741888,1697.989032,1780.123933,1847.453790,1913.742309,1960.828122,2047.500168,2085.454095,2129.230668,2158.171824,2182.231724,2217.684864,2269.589211,2337.264824,2420.618694,2519.557814,2633.989178,2763.819779,2908.956609,3069.306660,3244.776927,3435.274401,3640.706076,3860.978945,4096.000000},
{0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,4.656893,5.123633,5.594132,6.162376,6.918433,7.768444,8.739415,10.105862,11.477328,13.236604,15.421030,19.093623,25.801871,46.724612,98.841054,181.113466,272.586364,359.499769,445.546343,525.944439,605.188743,681.793483,756.668359,838.486885,926.950356,1015.482542,1113.353926,1204.897193,1288.871992,1373.464145,1455.746628,1527.796460,1588.475066,1658.144771,1710.302500,1807.563351,1863.197608,1927.281616,1964.450872,2022.719898,2100.041145,2185.205712,2280.993936,2387.616216,2505.282950,2634.204540,2774.591385,2926.653884,3090.602436,3266.647443,3454.999303,3655.868416,3869.465182,4096.000000},
{0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.337370,0.391916,0.468839,0.566334,0.762564,1.069225,1.384361,1.787581,2.293948,3.251909,4.412991,8.050068,11.606073,27.668092,65.227758,128.463938,202.097653,262.715851,312.464873,355.601398,400.609054,447.201352,495.761568,552.871938,619.067625,691.984883,773.753288,860.628503,946.262808,1019.805896,1106.061360,1178.422145,1244.852258,1302.173987,1399.650266,1548.092912,1545.928652,1670.817500,1694.523823,1779.195362,1882.155494,1990.662097,2108.325181,2235.456119,2372.366287,2519.367059,2676.769812,2844.885918,3024.026754,3214.503695,3416.628115,3630.711389,3857.064892,4096.000000}}。
Optionally, determining the estimated residual bit rate parameters according to the residual bit rate parameter comparison table may be represented by the following formulas (17) to (20):
b0 = q[1]    (17)
b1 = (q[2] - q[0]) / 2    (18)
b2 = 2×q[0] - 5×q[1] + 4×q[2] - q[3]    (19)
b3 = 3×(q[1] - q[2]) + q[3] - q[0]    (20)
Here, q[0] = q_rate_curv[rate_index][xi-1], and q[1], q[2], and q[3] are the values located one, two, and three positions after q[0], respectively, in the same row of the above residual bit rate parameter comparison table. After q[0], q[1], q[2], and q[3] are obtained, they are substituted into formulas (17) to (20) to obtain the estimated residual bit rate parameters of the target coding mode.
rate_index is determined according to the total number of pixel points in the current coding block, as shown in the following formula (21):
rate_index = 0, if num_samples ≤ 32;
rate_index = 1, if 64 ≤ num_samples ≤ 128;
rate_index = 2, if 256 ≤ num_samples ≤ 512;
rate_index = 3, if 1024 ≤ num_samples ≤ 16384    (21)
In formula (21), num_samples represents the total number of pixel points in the current coding block.
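The row selection of formula (21) and the estimated residual bit rate determination rule can be sketched as follows. The thresholds assume that the four categories cover 32 or fewer, 64 to 128, 256 to 512, and 1024 to 16384 pixel points, and that block sizes are powers of two so that no pixel count falls between categories. The small four-row table toy_rate_table is a hypothetical stand-in for the 4×65 residual bit rate parameter comparison table, and the function names are hypothetical.

```python
def rate_index(num_samples):
    """Row of the residual bit rate table, by total pixel count."""
    if num_samples <= 32:
        return 0
    if num_samples <= 128:
        return 1
    if num_samples <= 512:
        return 2
    return 3


def estimated_residual_bit_rate(table, num_samples, xi, x):
    """Cubic rule w(x) = b0 + b1*x + b2*x^2 + b3*x^3 on the selected row."""
    q = table[rate_index(num_samples)][xi - 1: xi + 3]   # q[0] .. q[3]
    b0 = q[1]
    b1 = (q[2] - q[0]) / 2
    b2 = 2 * q[0] - 5 * q[1] + 4 * q[2] - q[3]
    b3 = 3 * (q[1] - q[2]) + q[3] - q[0]
    return b0 + b1 * x + b2 * x ** 2 + b3 * x ** 3


# Hypothetical stand-in table: row k holds k*100, k*100 + 10, ...
toy_rate_table = [[k * 100.0 + 10.0 * i for i in range(5)] for k in range(4)]
print(rate_index(256))                                             # 2
print(estimated_residual_bit_rate(toy_rate_table, 256, 1, 0.5))    # 215.0
```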
Step S302, acquiring the number of syntax element bits in the current coding block, and inputting the number of syntax element bits, the estimated distortion corresponding to each coding mode, and the estimated residual bit number corresponding to that coding mode into the estimated rate-distortion cost determination rule for operation to obtain the estimated rate-distortion cost corresponding to that coding mode.
Step S303, sorting the plurality of coding modes in ascending order of the estimated rate-distortion cost corresponding to each coding mode.
In an embodiment, after the estimated distortion and the estimated residual bit number corresponding to each coding mode are obtained in step S301, the estimated distortion, the estimated residual bit number, and the syntax element bit number are substituted into the estimated rate-distortion cost determination rule, so that the estimated rate-distortion cost corresponding to each coding mode can be obtained. The estimated rate-distortion cost determination rule may be as shown in formula (3) in the embodiment of fig. 2, and will not be described herein again.
After obtaining the estimated rate-distortion cost corresponding to each coding mode, the coding modes may be arranged from small to large according to the estimated rate-distortion cost.
Step S304, grouping the arranged coding modes according to the estimation distortion and the estimation residual bit number corresponding to each coding mode.
Optionally, the implementation idea of step S304 may be: sequentially judging the estimated distortion and the estimated residual bit number of the coding modes according to the sequence from front to back for the sequenced coding modes, and determining the coding modes with the same estimated distortion and estimated residual bit number as a first-class coding mode group; the other coding modes are added to the set of coding modes of the second type.
Step S305, adding a candidate identifier to each coding mode based on the grouping processing result, and determining each coding mode whose candidate identifier is the first identifier as a candidate coding mode.
In one embodiment, adding a candidate identifier to each coding mode based on the grouping processing result includes: if any coding mode belongs to the second-class coding mode group, setting the candidate identifier of that coding mode as the first identifier; if any coding mode belongs to a target first-class coding mode group among the at least one first-class coding mode group, obtaining the arrangement position of that coding mode in the target first-class coding mode group; if the arrangement position is the first position, setting the candidate identifier of that coding mode as the first identifier; and if the arrangement position is any position other than the first position, setting the candidate identifier of that coding mode as the second identifier.
Optionally, the encoding processing device may record the candidate identifiers of the coding modes in an array, for example an array mask[512]. The array may be made larger or smaller depending on the number of coding modes, but its size must be larger than the total number of the plurality of coding modes.
Put simply, adding a candidate identifier to each coding mode may proceed as follows: each coding mode is examined from front to back. If a coding mode belongs to a first-class coding mode group and is the first coding mode to appear in that group, its candidate identifier is set as the first identifier; for example, if the candidate identifier of the k-th coding mode is the first identifier, this can be represented as mask[k] = 1. If the coding mode is not the first coding mode to appear in its first-class coding mode group, its candidate identifier is set as the second identifier, that is, mask[k] = 0. If a coding mode belongs to the second-class coding mode group, its candidate identifier is set as the first identifier.
After the candidate identifiers have been set for all the coding modes, each coding mode whose candidate identifier is the first identifier may be determined as a candidate coding mode.
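The marking rule can be written as a single pass over the sorted modes. A second-class mode is by definition the only mode with its (estimated distortion, estimated residual bits) pair, so both rules reduce to "the first occurrence of a pair gets 1, later occurrences get 0". The sketch below relies on that observation; the function name `build_candidate_mask` and the list-of-pairs input are hypothetical.

```python
def build_candidate_mask(sorted_pairs):
    """Return mask[k] = 1 for candidate modes and 0 for skipped modes.

    sorted_pairs: list of (est_distortion, est_residual_bits), one entry per
    coding mode, ordered by ascending estimated rate-distortion cost.
    """
    mask = [0] * len(sorted_pairs)
    seen = set()
    for k, pair in enumerate(sorted_pairs):
        if pair not in seen:
            # First mode with this pair: either the head of a first-class
            # group or a second-class mode, so it becomes a candidate.
            mask[k] = 1
            seen.add(pair)
    return mask


mask = build_candidate_mask([(50, 8), (50, 8), (60, 7), (50, 8), (70, 9)])
```

For this input the first, third, and fifth modes are marked as candidates, and the two repeats of `(50, 8)` are marked 0.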
Step S306, determining a mode rate distortion cost corresponding to each candidate coding mode in the at least one candidate coding mode, and selecting the candidate coding mode with the smallest mode rate distortion cost to perform coding processing on the current coding block.
In one embodiment, steps S304 and S305 may be performed in real time. That is, after the candidate identifier has been added to each coding mode, all coding modes are traversed in sequence, and for each traversed coding mode it is determined whether the mode rate-distortion cost needs to be calculated. Assuming the currently traversed coding mode is the i-th coding mode, it is determined whether its candidate identifier mask[i] is equal to 1; if mask[i] is equal to 1, the mode rate-distortion cost is calculated for the i-th coding mode; otherwise, it is not calculated. Then i is incremented by 1, and the same check is repeated for the remaining coding modes until all coding modes have been traversed. Based on this, the flow of the encoding process for the current coding block can be as shown in fig. 4.
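The traversal can be sketched as a loop that calls the expensive cost function only where mask[i] equals 1. The callable `full_rd_cost` stands in for the mode rate-distortion cost calculation and is a hypothetical placeholder, as is the returned `evaluated` counter, which is included only to show how many full calculations were performed.

```python
def pick_best_mode(modes, mask, full_rd_cost):
    """Return (index, cost, evaluated) for the cheapest candidate mode."""
    best_index, best_cost = None, float("inf")
    evaluated = 0
    for i in range(len(modes)):
        if mask[i] != 1:
            continue  # not a candidate: skip the full cost calculation
        cost = full_rd_cost(modes[i])
        evaluated += 1
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index, best_cost, evaluated


# Toy costs; mode "b" is never evaluated because its mask entry is 0.
toy_costs = {"a": 30, "b": 5, "c": 20, "d": 25}
result = pick_best_mode(["a", "b", "c", "d"], [1, 0, 1, 1], toy_costs.__getitem__)
```

In this toy run only three of the four modes are evaluated, and mode index 2 is selected with cost 20.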
In the embodiment of the invention, when a current coding block in an image to be coded is coded, after a plurality of coding modes corresponding to the current coding block are obtained, the estimated distortion corresponding to each coding mode and the estimated residual bit number corresponding to each coding mode are calculated; determining an estimated rate-distortion cost corresponding to each coding mode based on the estimated distortion corresponding to each coding mode and the estimated residual bit number corresponding to each coding mode; and arranging the coding modes according to the estimated rate-distortion cost corresponding to each coding mode from low to high.
Then, the sorted coding modes are grouped according to the estimated distortion corresponding to each coding mode and the estimated residual bit number corresponding to each coding mode; a candidate identifier is then added to each coding mode based on the grouping processing result, and the coding modes whose candidate identifier is the first identifier are determined as candidate coding modes. In this way, the plurality of coding modes are pre-screened in an orderly manner, which reduces the number of coding modes for which the mode rate-distortion cost is subsequently calculated, thereby saving the resource consumption of the encoding processing device and improving the encoding efficiency to a certain extent.
After the candidate coding modes are obtained, the candidate coding mode with the minimum mode rate distortion cost is selected from the candidate coding modes based on the mode rate distortion cost corresponding to each candidate coding mode to code the current coding block, and the coding accuracy can be improved.
Based on the above coding processing method embodiment, an embodiment of the present invention provides a coding processing apparatus. Fig. 5 is a schematic structural diagram of an encoding processing apparatus according to an embodiment of the present invention. The encoding processing apparatus shown in fig. 5 may operate as follows:
a determining unit 501, configured to determine multiple coding modes for coding a current coding block in an image to be coded, and determine an estimation parameter corresponding to each coding mode in the multiple coding modes;
a processing unit 502, configured to perform grouping processing on the multiple coding modes according to the estimation parameter corresponding to each coding mode;
a selecting unit 503, configured to select at least one candidate coding mode from the plurality of coding modes based on a grouping processing result;
the determining unit 501 is further configured to determine a mode rate distortion cost corresponding to each candidate encoding mode in the at least one candidate encoding mode;
the selecting unit 503 is further configured to select a candidate coding mode with a mode rate-distortion cost that meets a rate-distortion cost condition, and perform coding processing on the current coding block.
In an embodiment, the estimation parameter corresponding to any one of the coding modes includes an estimated distortion corresponding to the any one coding mode and an estimated residual bit number corresponding to the any one coding mode, the plurality of coding modes include a target coding mode, and the determining unit 501, when determining the estimation parameter corresponding to each of the plurality of coding modes, performs the following steps:
adopting the target coding mode to carry out coding prediction on the current coding block to obtain a predicted value corresponding to the target coding mode;
determining a residual between a predicted value corresponding to the target coding mode and the current coding block, and determining an estimated distortion rate corresponding to the target coding mode and an estimated residual bit rate corresponding to the target coding mode based on the residual;
multiplying the total number of pixel points in the current coding block by the estimated distortion rate corresponding to the target coding mode to obtain the estimated distortion corresponding to the target coding mode;
and multiplying the total number of the pixel points by the estimated residual error bit rate corresponding to the target coding mode to obtain the estimated residual error bit number corresponding to the target coding mode.
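The two multiplications above amount to scaling per-pixel rates by the block's pixel count. A minimal sketch, with hypothetical function and parameter names, assuming a rectangular block described by its width and height:

```python
def estimate_mode_parameters(width, height, est_distortion_rate, est_residual_bit_rate):
    """Scale per-pixel estimates to whole-block estimates."""
    num_samples = width * height  # total number of pixel points in the block
    est_distortion = num_samples * est_distortion_rate
    est_residual_bits = num_samples * est_residual_bit_rate
    return est_distortion, est_residual_bits


params = estimate_mode_parameters(16, 16, 0.5, 0.25)
```

For a 16x16 block with the example rates 0.5 and 0.25, the block-level estimates are 128.0 and 64.0.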
In one embodiment, when determining the estimated distortion rate and the estimated residual bit rate corresponding to the target encoding mode based on the residual, the determining unit 501 performs the following steps:
acquiring size information of the current coding block;
inputting the size information of the current coding block and the residual error into a mean square error calculation rule for operation to obtain a mean square error value corresponding to the target coding mode;
inputting the mean square error value corresponding to the target coding mode into an estimation factor determination rule for calculation to obtain an estimation factor corresponding to the target coding mode;
obtaining the estimated distortion rate corresponding to the target coding mode based on an estimated distortion rate determination rule and the estimation factor corresponding to the target coding mode;
and obtaining the estimated residual bit rate corresponding to the target coding mode based on an estimated residual bit rate determination rule and the estimation factor corresponding to the target coding mode.
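Only the first of these steps is sketched here, because the estimation factor rule and the two rate determination rules are defined elsewhere in the description. The sketch assumes the mean square error calculation rule is the usual sum of squared residuals divided by the number of pixels given by the block size; the function name `residual_mse` and the nested-list block layout are hypothetical.

```python
def residual_mse(block, prediction):
    """Mean square error between a block and its prediction.

    block, prediction: lists of rows with identical dimensions.
    Assumes the standard definition: sum of squared residuals / pixel count.
    """
    height = len(block)
    width = len(block[0])
    squared_sum = sum(
        (block[y][x] - prediction[y][x]) ** 2
        for y in range(height)
        for x in range(width)
    )
    return squared_sum / (width * height)


mse = residual_mse([[10, 12], [14, 16]], [[9, 12], [16, 16]])
```

The residuals here are 1, 0, -2, and 0, so the squared sum is 5 and the result is 1.25.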
In one embodiment, when the determining unit 501 obtains the estimated distortion rate corresponding to the target encoding mode based on the estimated distortion rate determining rule and the estimation factor corresponding to the target encoding mode, the following steps are performed: correcting the estimation factor, and acquiring a fractional part value in the corrected estimation factor;
obtaining an estimated distortion rate parameter according to the distortion rate parameter comparison table;
and inputting the fractional part numerical value and the estimated distortion rate parameter into the estimated distortion rate determination rule for operation to obtain the estimated distortion rate corresponding to the target encoding mode.
In one embodiment, when obtaining the estimated residual bit rate corresponding to the target coding mode based on the estimated residual bit rate determination rule and the estimation factor corresponding to the target coding mode, the determining unit 501 performs the following steps: obtaining an estimated residual bit rate parameter according to the residual bit rate parameter comparison table;
and inputting the fractional part numerical value and the estimated residual bit rate parameter into the estimated residual bit rate determination rule for operation to obtain the estimated residual bit rate corresponding to the target coding mode.
In an embodiment, the estimation parameter corresponding to any one coding mode includes an estimated distortion corresponding to the any one coding mode and an estimated residual bit number corresponding to the any one coding mode, where the coding modes include a first type of coding mode and a second type of coding mode, and when the processing unit 502 performs grouping processing on the plurality of coding modes according to the estimation parameter corresponding to each coding mode, the following steps are performed:
respectively comparing the estimated distortion corresponding to each coding mode with the estimated residual bit number corresponding to each coding mode;
if other coding modes with the same estimated distortion and estimated residual bit number as any coding mode in the plurality of coding modes exist, the any coding mode and the other coding modes form a first type coding mode group;
and if no other coding mode with the same estimated distortion and estimated residual bit number as the any coding mode exists, adding the any coding mode to a second type coding mode group.
In one embodiment, the encoding processing apparatus further includes an obtaining unit 504, where the obtaining unit 504 is configured to obtain the number of bits of a syntax element in the current encoding block; the processing unit 502 is further configured to input the syntax element bit number, the estimated distortion corresponding to each coding mode, and the estimated residual bit number corresponding to the corresponding coding mode into an estimated rate-distortion cost determination rule for operation, so as to obtain an estimated rate-distortion cost corresponding to the corresponding coding mode.
In one embodiment, the number of the first type coding mode group is at least one, the number of the second type coding mode group is one, and the selecting unit 503 performs the following steps when selecting at least one candidate coding mode from the plurality of coding modes based on the grouping processing result: obtaining an estimated rate-distortion cost corresponding to each coding mode included in each first-class coding mode group;
determining the coding mode with the minimum estimated rate-distortion cost in each first-class coding mode group as a candidate coding mode;
and determining each coding mode in the second-class coding mode group as a candidate coding mode.
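This selection can be sketched as taking the minimum-cost member of each first-class group and then appending every second-class mode. The names `select_candidates` and `est_cost` (a mapping from mode name to estimated rate-distortion cost) are hypothetical.

```python
def select_candidates(first_class_groups, second_class_group, est_cost):
    """Pick one candidate per first-class group plus all second-class modes."""
    # min() returns the first minimal element, so ties keep the earlier mode.
    candidates = [min(group, key=est_cost.__getitem__) for group in first_class_groups]
    candidates.extend(second_class_group)
    return candidates


costs = {"m0": 12, "m1": 10, "m4": 7, "m5": 9, "m2": 8}
chosen = select_candidates([["m0", "m1"], ["m4", "m5"]], ["m2"], costs)
```

With these example costs the candidates are m1, m4, and m2. When each group is already sorted by ascending estimated cost, the minimum is simply the group's first element, which matches the position-based marking rule.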
In one embodiment, the selecting unit 503, when selecting at least one candidate encoding mode from the plurality of encoding modes based on the grouping processing result, performs the steps of: adding a candidate identifier for each coding mode based on the grouping processing result;
and determining each coding mode, among the plurality of coding modes, whose candidate identifier is the first identifier as a candidate coding mode.
In an embodiment, the number of the first type coding mode groups is at least one, the number of the second type coding mode groups is one, a plurality of coding modes included in any first type coding mode group are arranged in an order from small to large according to an estimated rate-distortion cost corresponding to each coding mode, and the selecting unit 503, when adding the candidate identifier to each coding mode based on the grouping processing result, performs the following steps:
if any coding mode belongs to the second-class coding mode group, setting the candidate identifier of that coding mode as a first identifier;
if any coding mode belongs to a target first-class coding mode group among the at least one first-class coding mode group, obtaining the arrangement position of that coding mode in the target first-class coding mode group;
if the arrangement position is the first position, setting the candidate identifier of that coding mode as the first identifier;
and if the arrangement position is any position other than the first position, setting the candidate identifier of that coding mode as a second identifier.
In one embodiment, the at least one candidate coding mode includes a target candidate coding mode, and the determining unit 501 performs the following steps when determining a mode rate distortion cost corresponding to each candidate coding mode in the at least one candidate coding mode:
obtaining the syntax element bit number of the current coding block;
adopting the target candidate coding mode to carry out coding prediction on the current coding block to obtain a predicted value corresponding to the target candidate coding mode;
performing preset transformation processing on the predicted value corresponding to the target candidate coding mode and the current coding block to obtain a reconstruction value corresponding to the target candidate coding mode;
and determining a mode rate distortion cost corresponding to the target candidate coding mode based on a residual error between the reconstruction value and the current coding block.
In one embodiment, the determining unit 501, when determining the mode rate distortion cost corresponding to the target candidate coding mode based on the residual between the reconstructed value and the current coding block, performs the following steps:
determining a target distortion corresponding to the target candidate coding mode and a target residual bit number corresponding to the target candidate coding mode based on a residual between the reconstruction value and the current coding block;
and inputting the syntax element bit number, the target distortion corresponding to the target candidate coding mode, and the target residual bit number corresponding to the target candidate coding mode into a mode rate-distortion cost determination rule for operation to obtain the mode rate-distortion cost corresponding to the target candidate coding mode.
In one embodiment, when selecting a candidate coding mode with a mode rate-distortion cost satisfying a rate-distortion cost condition to perform coding processing on the current coding block, the selecting unit 503 performs the following steps: selecting a candidate coding mode with the smallest mode rate distortion cost from the at least one candidate coding mode;
determining a residual between a predicted value corresponding to the selected candidate coding mode and the current coding block to obtain a residual coefficient;
and coding the current coding block according to the residual error coefficient.
According to an embodiment of the present invention, the steps involved in the encoding processing methods shown in fig. 2 and 3 may be performed by units in the encoding processing apparatus shown in fig. 5. For example, step S201 shown in fig. 2 may be performed by the determining unit 501 in the encoding processing apparatus shown in fig. 5, step S202 may be performed by the processing unit 502, step S203 may be performed by the selecting unit 503, and step S204 may be performed by the determining unit 501 and the selecting unit 503. As another example, step S301 in the encoding processing method shown in fig. 3 may be performed by the determining unit 501 in the encoding processing apparatus shown in fig. 5, step S302 may be performed by the obtaining unit 504 and the processing unit 502, steps S303 to S305 may be performed by the processing unit 502, and step S306 may be performed by the determining unit 501 and the selecting unit 503.
According to another embodiment of the present invention, the units in the encoding processing apparatus shown in fig. 5 may be separately or entirely combined into one or several other units, or one or more of the units may be further split into multiple functionally smaller units; either arrangement can achieve the same operation without affecting the technical effect of the embodiment of the present invention. The units are divided based on logical functions; in practical applications, the function of one unit may be realized by a plurality of units, or the functions of a plurality of units may be realized by one unit. In other embodiments of the present invention, the encoding processing apparatus may also include other units, and in practical applications these functions may also be implemented with the assistance of other units or through the cooperation of a plurality of units.
According to another embodiment of the present invention, the encoding processing apparatus shown in fig. 5 may be constructed by running a computer program (including program codes) capable of executing the steps involved in the respective methods shown in fig. 2 and 3 on a general-purpose computing device such as a computer including a processing element such as a Central Processing Unit (CPU), a random access storage medium (RAM), a read-only storage medium (ROM), and a storage element, and implementing the encoding processing method of the embodiment of the present invention. The computer program may be embodied on a computer-readable storage medium, for example, and loaded into and executed by the above-described computing apparatus via the computer-readable storage medium.
In the embodiment of the present invention, when a current coding block in an image to be coded is coded, after a plurality of coding modes corresponding to the current coding block are obtained, the mode rate-distortion cost corresponding to each coding mode is not calculated directly in order to select, from the plurality of coding modes, the coding mode for coding the current coding block. Instead, grouping processing is performed on the plurality of coding modes according to the estimation parameter corresponding to each coding mode, at least one candidate coding mode is selected from the plurality of coding modes according to the grouping processing result, and the coding mode for coding the current coding block is then selected from the at least one candidate coding mode based on the mode rate-distortion cost of each candidate coding mode. Because the number of coding modes for which the mode rate-distortion cost must be calculated is reduced, the resources consumed in coding the current coding block, and therefore in coding the image to be coded, are saved.
Based on the above method and apparatus embodiments, an embodiment of the present invention provides an encoding processing device. Fig. 6 is a schematic structural diagram of an encoding processing device according to an embodiment of the present invention. The encoding processing device shown in fig. 6 may include at least a processor 601, an input interface 602, an output interface 603, and a computer storage medium 604. The processor 601, the input interface 602, the output interface 603, and the computer storage medium 604 may be connected by a bus or other means.
The computer storage medium 604 may be stored in the memory of the encoding processing device, the computer storage medium 604 being for storing a computer program comprising program instructions, and the processor 601 being for executing the program instructions stored by the computer storage medium 604. The processor 601 (or CPU) is the computing core and control core of the encoding processing device, and is adapted to implement one or more instructions, and specifically adapted to load and execute:
determining a plurality of coding modes for coding a current coding block in an image to be coded, and determining an estimation parameter corresponding to each coding mode in the plurality of coding modes; grouping the plurality of coding modes according to the estimation parameters corresponding to each coding mode; selecting at least one candidate coding mode from the plurality of coding modes based on a grouping processing result; and determining a mode rate-distortion cost corresponding to each candidate coding mode in the at least one candidate coding mode, and selecting the candidate coding mode whose mode rate-distortion cost meets a rate-distortion cost condition to perform coding processing on the current coding block.
An embodiment of the present invention further provides a computer storage medium (Memory), which is a memory device in the encoding processing device and is used to store programs and data. It is understood that the computer storage medium herein may include both the built-in storage medium of the encoding processing device and any extended storage medium supported by the encoding processing device. The computer storage medium provides a storage space that stores the operating system of the encoding processing device. Also stored in this storage space are one or more instructions, which may be one or more computer programs (including program code), suitable for loading and execution by the processor 601. The computer storage medium may be a high-speed RAM memory, or may be a non-volatile memory, such as at least one disk memory; optionally, it may also be at least one computer storage medium located remotely from the processor.
In one embodiment, the computer storage medium may be loaded with one or more instructions and executed by processor 601 to implement the corresponding steps described above with respect to the encoding methods shown in fig. 2 and 3. In particular implementations, one or more instructions in the computer storage medium are loaded and executed by processor 601 to perform the steps of:
determining a plurality of coding modes for coding a current coding block in an image to be coded, and determining an estimation parameter corresponding to each coding mode in the plurality of coding modes; grouping the plurality of coding modes according to the estimation parameters corresponding to each coding mode; selecting at least one candidate coding mode from the plurality of coding modes based on a grouping processing result; and determining a mode rate-distortion cost corresponding to each candidate coding mode in the at least one candidate coding mode, selecting a candidate coding mode whose mode rate-distortion cost meets a rate-distortion cost condition, and coding the current coding block.
In one embodiment, the estimated parameters corresponding to any one of the coding modes include an estimated distortion corresponding to the any one coding mode and an estimated residual bit number corresponding to the any one coding mode, the plurality of coding modes include a target coding mode, and the processor 601, when determining the estimated parameters corresponding to each of the plurality of coding modes, performs the following steps:
adopting the target coding mode to carry out coding prediction on the current coding block to obtain a predicted value corresponding to the target coding mode;
determining a residual between a predicted value corresponding to the target coding mode and the current coding block, and determining an estimated distortion rate corresponding to the target coding mode and an estimated residual bit rate corresponding to the target coding mode based on the residual; multiplying the total number of the pixel points in the current coding block by the estimation distortion rate corresponding to the target coding mode to obtain the estimation distortion corresponding to the target coding mode; and multiplying the total number of the pixel points by the estimated residual error bit rate corresponding to the target coding mode to obtain the estimated residual error bit number corresponding to the target coding mode.
In one embodiment, the processor 601, when determining the estimated distortion rate corresponding to the target encoding mode and the estimated residual bit rate corresponding to the target encoding mode based on the residual, performs the following steps:
acquiring size information of the current coding block; inputting the size information of the current coding block and the residual into a mean square error calculation rule for operation to obtain a mean square error value corresponding to the target coding mode; inputting the mean square error value corresponding to the target coding mode into an estimation factor determination rule for calculation to obtain an estimation factor corresponding to the target coding mode; obtaining the estimated distortion rate corresponding to the target coding mode based on an estimated distortion rate determination rule and the estimation factor corresponding to the target coding mode; and obtaining the estimated residual bit rate corresponding to the target coding mode based on an estimated residual bit rate determination rule and the estimation factor corresponding to the target coding mode.
In one embodiment, when the estimated distortion rate corresponding to the target encoding mode is obtained based on the estimated distortion rate determination rule and the estimation factor corresponding to the target encoding mode, the processor 601 performs the following steps:
correcting the estimation factor, and acquiring a fractional part value in the corrected estimation factor; obtaining an estimated distortion rate parameter according to the distortion rate parameter comparison table; and inputting the fractional part numerical value and the estimated distortion rate parameter into the estimated distortion rate determination rule for operation to obtain the estimated distortion rate corresponding to the target encoding mode.
In one embodiment, the processor 601, when obtaining the estimated residual bit rate corresponding to the target coding mode based on the estimated residual bit rate determination rule and the estimation factor corresponding to the target coding mode, performs the following steps:
obtaining an estimated residual bit rate parameter according to the residual bit rate parameter comparison table; and inputting the fractional part numerical value and the estimated residual bit rate parameter into the estimated residual bit rate determination rule for operation to obtain the estimated residual bit rate corresponding to the target coding mode.
In an embodiment, the estimation parameter corresponding to any one of the coding modes includes an estimated distortion corresponding to the any one coding mode and an estimated residual bit number corresponding to the any one coding mode, and the processor 601, when performing grouping processing on the plurality of coding modes according to the estimation parameter corresponding to each coding mode, performs the following steps:
respectively comparing the estimated distortion corresponding to each coding mode with the estimated residual bit number corresponding to each coding mode; if other coding modes with the same estimated distortion and estimated residual bit number as any coding mode in the plurality of coding modes exist, the any coding mode and the other coding modes form a first type coding mode group; and if no other coding mode with the same estimated distortion and estimated residual bit number as the any coding mode exists, adding the any coding mode to a second type coding mode group.
In one embodiment, the processor 601 is further configured to: acquiring the bit number of a syntax element in the current coding block; and inputting the bit number of the syntax element, the estimated distortion corresponding to each coding mode and the estimated residual bit number corresponding to the corresponding coding mode into an estimated rate-distortion cost determination rule for operation to obtain the estimated rate-distortion cost corresponding to the corresponding coding mode.
In one embodiment, the number of the first type coding mode group is at least one, the number of the second type coding mode group is one, and the processor 601, when selecting at least one candidate coding mode from the plurality of coding modes based on the grouping processing result, performs the following steps: obtaining an estimated rate-distortion cost corresponding to each coding mode included in each first-class coding mode group; determining the coding mode with the minimum estimated rate-distortion cost in each first-class coding mode group as a candidate coding mode; and determining each coding mode in the second-class coding mode group as a candidate coding mode.
In one embodiment, the processor 601, when selecting at least one candidate coding mode from the plurality of coding modes based on the grouping processing result, performs the following steps: adding a candidate identifier to each coding mode based on the grouping processing result; and determining each coding mode, among the plurality of coding modes, whose candidate identifier is the first identifier as a candidate coding mode.
In an embodiment, the number of the first type coding mode groups is at least one, the number of the second type coding mode groups is one, a plurality of coding modes included in any first type coding mode group are arranged in an order from a small estimation rate-distortion cost to a large estimation rate-distortion cost corresponding to each coding mode, and the processor 601, when adding a candidate identifier to each coding mode based on a grouping processing result, performs the following steps:
if any coding mode belongs to the second-class coding mode group, setting the candidate identifier of that coding mode as a first identifier; if any coding mode belongs to a target first-class coding mode group among the at least one first-class coding mode group, obtaining the arrangement position of that coding mode in the target first-class coding mode group; if the arrangement position is the first position, setting the candidate identifier of that coding mode as the first identifier; and if the arrangement position is any position other than the first position, setting the candidate identifier of that coding mode as a second identifier.
In one embodiment, the at least one candidate coding mode includes a target candidate coding mode, and the processor 601, when determining a mode rate distortion cost corresponding to each candidate coding mode in the at least one candidate coding mode, performs the following steps:
acquiring the number of syntax element bits included in the current coding block; adopting the target candidate coding mode to carry out coding prediction on the current coding block to obtain a predicted value corresponding to the target candidate coding mode;
performing preset transformation processing on the predicted value corresponding to the target candidate coding mode and the current coding block to obtain a reconstruction value corresponding to the target candidate coding mode; and determining a mode rate distortion cost corresponding to the target candidate coding mode based on a residual error between the reconstruction value and the current coding block.
In one embodiment, the processor 601, when determining the mode rate distortion cost corresponding to the target candidate coding mode based on the residual between the reconstructed value and the current coding block, performs the following steps:
determining a target distortion corresponding to the target candidate coding mode and a target residual bit number corresponding to the target candidate coding mode based on a residual between the reconstruction value and the current coding block; and inputting the number of syntax element bits, the target distortion corresponding to the target candidate coding mode and the target residual bit number corresponding to the target candidate coding mode into a mode rate distortion cost determination rule for operation to obtain the mode rate distortion cost corresponding to the target candidate coding mode.
In one embodiment, when selecting the candidate coding mode whose mode rate-distortion cost satisfies the rate-distortion cost condition to encode the current coding block, the processor 601 performs the following steps:
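As an illustration of how the three inputs might be combined, the sketch below assumes the conventional Lagrangian form J = D + λ·(R_syntax + R_residual) with the sum of squared errors as the distortion. The passage does not state the mode rate distortion cost determination rule, so the formula, the function names, and the `lagrange_multiplier` parameter are assumptions, not the rule used by the invention.

```python
from typing import List


def sum_squared_error(original: List[List[int]], reconstructed: List[List[int]]) -> int:
    """Distortion between the current block and its reconstruction (assumed metric: SSE)."""
    return sum(
        (o - r) ** 2
        for o_row, r_row in zip(original, reconstructed)
        for o, r in zip(o_row, r_row)
    )


def mode_rd_cost(
    distortion: float,
    syntax_bits: int,
    residual_bits: int,
    lagrange_multiplier: float,
) -> float:
    """Assumed Lagrangian rule: distortion plus lambda times the total bit count."""
    return distortion + lagrange_multiplier * (syntax_bits + residual_bits)


if __name__ == "__main__":
    block = [[10, 12], [8, 9]]
    recon = [[9, 12], [8, 11]]
    d = sum_squared_error(block, recon)
    print(mode_rd_cost(d, syntax_bits=3, residual_bits=7, lagrange_multiplier=2.0))
```

The residual bit count is taken as an input here because obtaining it normally requires transforming, quantizing, and entropy-coding the residual, which is outside the scope of this sketch.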
selecting a candidate coding mode with the smallest mode rate distortion cost from the at least one candidate coding mode;
determining a residual between a predicted value corresponding to the selected candidate coding mode and the current coding block to obtain a residual coefficient; and coding the current coding block according to the residual error coefficient.
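The final selection step can be sketched as below. The helper names are hypothetical, and the residual is shown only as a sample-wise difference; any transform or quantization applied before the residual coefficients are coded is omitted.

```python
from typing import Dict, List


def choose_mode(mode_costs: Dict[str, float]) -> str:
    """Return the candidate mode with the smallest mode rate-distortion cost."""
    return min(mode_costs, key=mode_costs.get)


def residual_block(block: List[List[int]], prediction: List[List[int]]) -> List[List[int]]:
    """Sample-wise difference between the current block and the chosen prediction."""
    return [
        [b - p for b, p in zip(b_row, p_row)]
        for b_row, p_row in zip(block, prediction)
    ]
```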
In the embodiment of the invention, when a current coding block in an image to be coded is encoded, the mode rate distortion cost is not calculated directly for every coding mode obtained for the block. Instead, the coding modes are grouped according to the estimation parameter corresponding to each coding mode, at least one candidate coding mode is screened out from the plurality of coding modes according to the grouping processing result, and the coding mode used to encode the current coding block is selected from the candidate coding modes based on the mode rate distortion cost of each candidate. Because fewer coding modes require a mode rate distortion cost calculation, the resources consumed in encoding the current coding block, and therefore in encoding the image to be coded, are reduced.
According to an aspect of the present application, an embodiment of the present invention also provides a computer program product or a computer program, which includes computer instructions stored in a computer-readable storage medium. The processor 601 reads the computer instructions from the computer-readable storage medium and executes them, so that the encoding processing device performs the encoding processing method shown in FIG. 2 and FIG. 3, specifically: determining a plurality of coding modes for coding a current coding block in an image to be coded, and determining an estimation parameter corresponding to each coding mode in the plurality of coding modes; grouping the plurality of coding modes according to the estimation parameters corresponding to each coding mode; selecting at least one candidate coding mode from the plurality of coding modes based on a grouping processing result; and determining a mode rate distortion cost corresponding to each candidate coding mode in the at least one candidate coding mode, and selecting the candidate coding mode with the mode rate distortion cost meeting a rate distortion cost condition to perform coding processing on the current coding block.

Claims (15)

1. An encoding processing method, comprising:
determining a plurality of coding modes for coding a current coding block in an image to be coded, and determining an estimation parameter corresponding to each coding mode in the plurality of coding modes;
grouping the plurality of coding modes according to the estimation parameters corresponding to each coding mode;
selecting at least one candidate coding mode from the plurality of coding modes based on a grouping processing result;
and determining a mode rate distortion cost corresponding to each candidate coding mode in the at least one candidate coding mode, selecting a candidate coding mode with a mode rate distortion cost meeting a rate distortion cost condition, and coding the current coding block.
2. The method of claim 1, wherein the estimation parameters corresponding to any coding mode include an estimated distortion corresponding to that coding mode and an estimated residual bit number corresponding to that coding mode, wherein the plurality of coding modes includes a target coding mode, and wherein determining the estimation parameters corresponding to each of the plurality of coding modes comprises:
adopting the target coding mode to carry out coding prediction on the current coding block to obtain a predicted value corresponding to the target coding mode;
determining a residual between a predicted value corresponding to the target coding mode and the current coding block, and determining an estimated distortion rate corresponding to the target coding mode and an estimated residual bit rate corresponding to the target coding mode based on the residual;
multiplying the total number of the pixel points in the current coding block by the estimation distortion rate corresponding to the target coding mode to obtain the estimation distortion corresponding to the target coding mode;
and multiplying the total number of the pixel points by the estimated residual error bit rate corresponding to the target coding mode to obtain the estimated residual error bit number corresponding to the target coding mode.
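The two multiplications in claim 2 amount to scaling per-pixel estimates by the block's pixel count. A minimal sketch, with a hypothetical function name and illustrative values:

```python
from typing import Tuple


def scale_estimates(
    width: int,
    height: int,
    estimated_distortion_rate: float,
    estimated_residual_bit_rate: float,
) -> Tuple[float, float]:
    """Convert per-pixel rates into block-level estimated distortion and residual bits."""
    pixel_count = width * height  # total number of pixel points in the block
    estimated_distortion = pixel_count * estimated_distortion_rate
    estimated_residual_bits = pixel_count * estimated_residual_bit_rate
    return estimated_distortion, estimated_residual_bits
```

For an 8x8 block with a distortion rate of 0.5 and a residual bit rate of 0.25 per pixel, this yields an estimated distortion of 32.0 and an estimated residual bit number of 16.0.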
3. The method of claim 2, wherein the determining an estimated distortion rate corresponding to the target encoding mode and an estimated residual bitrate corresponding to the target encoding mode based on the residual comprises:
acquiring size information of the current coding block;
inputting the size information of the current coding block and the residual error into a mean square error calculation rule for operation to obtain a mean square error value corresponding to the target coding mode;
inputting the mean square error value corresponding to the target coding mode into an estimation factor determination rule for calculation to obtain an estimation factor corresponding to the target coding mode;
obtaining the estimated distortion rate corresponding to the target coding mode based on an estimated distortion rate determination rule and the estimation factor corresponding to the target coding mode;
and obtaining the estimated residual bit rate corresponding to the target coding mode based on an estimated residual bit rate determination rule and the estimation factor corresponding to the target coding mode.
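The mean square error step of claim 3 can be illustrated as below, assuming the usual definition (sum of squared residual samples divided by the block area). The estimation factor determination rule is not given in this passage, so it is deliberately left out of the sketch.

```python
from typing import List


def mean_square_error(residual: List[List[int]], width: int, height: int) -> float:
    """Assumed rule: sum of squared residual samples divided by the block area."""
    total = sum(value * value for row in residual for value in row)
    return total / (width * height)
```

For example, the 2x2 residual `[[1, -2], [0, 3]]` has a squared sum of 14 and therefore a mean square error of 3.5.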
4. The method of claim 3, wherein the obtaining the estimated distortion rate corresponding to the target coding mode based on the estimated distortion rate determination rule and the estimation factor corresponding to the target coding mode comprises:
correcting the estimation factor, and acquiring a fractional part value in the corrected estimation factor;
obtaining an estimated distortion rate parameter according to the distortion rate parameter comparison table;
and inputting the fractional part numerical value and the estimated distortion rate parameter into the estimated distortion rate determination rule for operation to obtain the estimated distortion rate corresponding to the target encoding mode.
5. The method of claim 4, wherein the deriving the estimated residual bit rate for the target coding mode based on the estimated residual bit rate determination rule and the estimation factor for the target coding mode comprises:
obtaining an estimated residual bit rate parameter according to the residual bit rate parameter comparison table;
and inputting the fractional part numerical value and the estimated residual bit rate parameter into the estimated residual bit rate determination rule for operation to obtain the estimated residual bit rate corresponding to the target coding mode.
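One plausible reading of claims 4 and 5 is a table lookup refined by the fractional part of the estimation factor. The sketch below assumes that "correcting" the factor means clamping it to the table range and that the determination rule is linear interpolation between adjacent table entries; both are assumptions, and the table contents are made up for illustration.

```python
from typing import Sequence


def interpolate_from_table(estimation_factor: float, table: Sequence[float]) -> float:
    """Look up a rate parameter and refine it with the factor's fractional part."""
    # Assumed correction: clamp the factor so it indexes a valid table interval.
    corrected = min(max(estimation_factor, 0.0), float(len(table) - 1))
    index = min(int(corrected), len(table) - 2)  # lower entry of the interval
    fraction = corrected - index                 # fractional part used as weight
    # Assumed rule: linear interpolation between the two neighbouring entries.
    return table[index] * (1.0 - fraction) + table[index + 1] * fraction


if __name__ == "__main__":
    # Hypothetical comparison tables; real tables would come from the encoder.
    distortion_rate_table = [0.0, 10.0, 30.0]
    residual_bit_rate_table = [8.0, 4.0, 1.0]
    factor = 1.25
    print(interpolate_from_table(factor, distortion_rate_table))
    print(interpolate_from_table(factor, residual_bit_rate_table))
```

The same helper serves both the distortion rate table and the residual bit rate table, mirroring the way claim 5 reuses the fractional part obtained in claim 4.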
6. The method according to claim 1, wherein the estimation parameters corresponding to any one of the coding modes include an estimated distortion corresponding to the any one of the coding modes and an estimated residual bit number corresponding to the any one of the coding modes, and the grouping of the plurality of coding modes according to the estimation parameters corresponding to each of the coding modes comprises:
comparing the estimated distortions corresponding to the coding modes with one another, and comparing the estimated residual bit numbers corresponding to the coding modes with one another;
for any coding mode in the plurality of coding modes, if other coding modes with the same estimated distortion and the same estimated residual bit number as that coding mode exist in the plurality of coding modes, forming a first-type coding mode group from that coding mode and the other coding modes;
and if no other coding mode with the same estimated distortion and the same estimated residual bit number as that coding mode exists, adding that coding mode to a second-type coding mode group.
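The grouping in claim 6 can be sketched by bucketing modes on the pair (estimated distortion, estimated residual bit number). The function name, mode names, and values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_modes(
    estimates: Dict[str, Tuple[float, float]],
) -> Tuple[List[List[str]], List[str]]:
    """Split modes into first-type groups (shared estimates) and one second-type group."""
    buckets = defaultdict(list)
    for mode, key in estimates.items():
        # key is (estimated distortion, estimated residual bit number)
        buckets[key].append(mode)
    # Modes sharing both estimates with at least one other mode form a first-type group.
    first_type_groups = [modes for modes in buckets.values() if len(modes) > 1]
    # Modes whose estimates are unique all go into the single second-type group.
    second_type_group = [modes[0] for modes in buckets.values() if len(modes) == 1]
    return first_type_groups, second_type_group


if __name__ == "__main__":
    estimates = {"DC": (100, 20), "PLANAR": (100, 20), "ANGULAR_10": (90, 25)}
    print(group_modes(estimates))
```

Using the pair as a dictionary key finds all matches in a single pass instead of comparing every mode with every other mode.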
7. The method of claim 6, wherein the method further comprises:
acquiring the bit number of a syntax element in the current coding block;
and inputting the bit number of the syntax element, the estimated distortion corresponding to each coding mode and the estimated residual bit number corresponding to the corresponding coding mode into an estimated rate-distortion cost determination rule for operation to obtain the estimated rate-distortion cost corresponding to the corresponding coding mode.
8. The method of claim 7, wherein the number of first-type coding mode groups is at least one, the number of second-type coding mode groups is one, and the selecting at least one candidate coding mode from the plurality of coding modes based on the grouping processing result comprises:
obtaining an estimated rate-distortion cost corresponding to each coding mode included in each first-class coding mode group;
determining the coding mode with the minimum estimated rate-distortion cost in each first-class coding mode group as a candidate coding mode;
and determining each coding mode in the second-class coding mode group as a candidate coding mode.
9. The method of claim 6, wherein the selecting at least one candidate coding mode from the plurality of coding modes based on the grouping processing result comprises:
adding a candidate identifier for each coding mode based on the grouping processing result;
and determining, as the candidate coding modes, the coding modes among the plurality of coding modes whose candidate identifier is the first identifier.
10. The method of claim 9, wherein the number of first-type coding mode groups is at least one, the number of second-type coding mode groups is one, the coding modes included in any first-type coding mode group are arranged in ascending order of the estimated rate-distortion cost corresponding to each coding mode, and the adding a candidate identifier for each coding mode based on the grouping processing result comprises:
if any coding mode belongs to the second-type coding mode group, setting the candidate identifier of that coding mode to a first identifier;
if any coding mode belongs to a target first-type coding mode group among the first-type coding mode groups, acquiring the arrangement position of that coding mode in the target first-type coding mode group;
if the arrangement position is the first position, setting the candidate identifier of that coding mode to the first identifier;
and if the arrangement position is any position other than the first position, setting the candidate identifier of that coding mode to a second identifier.
11. The method of claim 1, wherein the at least one candidate coding mode comprises a target candidate coding mode, and wherein the determining the mode rate-distortion cost for each of the at least one candidate coding mode comprises:
acquiring the number of syntax element bits included in the current coding block;
adopting the target candidate coding mode to carry out coding prediction on the current coding block to obtain a predicted value corresponding to the target candidate coding mode;
performing preset transformation processing on the predicted value corresponding to the target candidate coding mode and the current coding block to obtain a reconstruction value corresponding to the target candidate coding mode;
and determining a mode rate distortion cost corresponding to the target candidate coding mode based on a residual error between the reconstruction value and the current coding block.
12. The method of claim 11, wherein the determining a mode rate-distortion cost for the target candidate coding mode based on a residual between the reconstructed value and the current coding block comprises:
determining a target distortion corresponding to the target candidate coding mode and a target residual bit number corresponding to the target candidate coding mode based on a residual between the reconstruction value and the current coding block;
and inputting the number of syntax element bits, the target distortion corresponding to the target candidate coding mode and the target residual bit number corresponding to the target candidate coding mode into a mode rate distortion cost determination rule for operation to obtain the mode rate distortion cost corresponding to the target candidate coding mode.
13. The method of claim 1, wherein the selecting the candidate coding mode whose mode rate-distortion cost satisfies the rate-distortion cost condition, and performing coding processing on the current coding block comprises:
selecting a candidate coding mode with the smallest mode rate distortion cost from the at least one candidate coding mode;
determining a residual between a predicted value corresponding to the selected candidate coding mode and the current coding block to obtain a residual coefficient;
and coding the current coding block according to the residual error coefficient.
14. An encoding processing apparatus characterized by comprising:
a determining unit, configured to determine a plurality of coding modes for coding a current coding block in an image to be coded, and to determine an estimation parameter corresponding to each coding mode in the plurality of coding modes;
a processing unit, configured to group the plurality of coding modes according to the estimation parameters corresponding to each coding mode;
a selection unit, configured to select at least one candidate coding mode from the plurality of coding modes based on a grouping processing result;
the determining unit is further configured to determine a mode rate distortion cost corresponding to each candidate coding mode in the at least one candidate coding mode;
the selection unit is further configured to select a candidate coding mode with a mode rate-distortion cost that meets a rate-distortion cost condition, and perform coding processing on the current coding block.
15. An encoding processing apparatus characterized by comprising:
a processor adapted to implement one or more instructions; and
a computer storage medium having stored thereon one or more instructions adapted to be loaded by the processor and to perform the method of any of claims 1-13.
CN202011343013.0A 2020-11-25 2020-11-25 Encoding processing method, device, equipment and storage medium Active CN112153385B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011343013.0A CN112153385B (en) 2020-11-25 2020-11-25 Encoding processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011343013.0A CN112153385B (en) 2020-11-25 2020-11-25 Encoding processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112153385A CN112153385A (en) 2020-12-29
CN112153385B true CN112153385B (en) 2021-03-02

Family

ID=73887397

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011343013.0A Active CN112153385B (en) 2020-11-25 2020-11-25 Encoding processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112153385B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20230091169A (en) * 2021-09-15 2023-06-22 텐센트 아메리카 엘엘씨 Method and Apparatus for Enhanced Signaling of Motion Vector Difference
CN114143545A (en) * 2021-11-30 2022-03-04 成都爱奇艺智能创新科技有限公司 A video compression method, device, electronic device and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7792188B2 (en) * 2004-06-27 2010-09-07 Apple Inc. Selecting encoding types and predictive modes for encoding video data
US9264710B2 (en) * 2012-07-06 2016-02-16 Texas Instruments Incorporated Method and system for video picture intra-prediction estimation
CN105530518B (en) * 2014-09-30 2019-04-26 联想(北京)有限公司 A kind of Video coding, coding/decoding method and device
CN107846593B (en) * 2016-09-21 2020-01-03 中国移动通信有限公司研究院 Rate distortion optimization method and device
AU2016426405B2 (en) * 2016-10-14 2021-11-25 Huawei Technologies Co., Ltd Devices and methods for video coding
CN108156458B (en) * 2017-12-28 2020-04-10 北京奇艺世纪科技有限公司 Method and device for determining coding mode
CN110381311B (en) * 2019-07-01 2023-06-30 腾讯科技(深圳)有限公司 Video frame encoding method, video frame encoding device, computer readable medium and electronic equipment

Also Published As

Publication number Publication date
CN112153385A (en) 2020-12-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40036250; Country of ref document: HK)