Secondary coding optimization method
Technical Field
The invention belongs to the technical field of coding, and particularly relates to a secondary coding optimization method.
Background
With the rapid development of electronic information technology and the proliferation of video data acquisition devices, digital video has become the main carrier of multimedia information. However, the data volume of uncompressed digital video is enormous: for example, 8-bit RGB color video with a resolution of 1920 × 1080 and a frame rate of 30 Hz amounts to as much as 4.89 TB per hour. Such a large amount of data poses a great challenge to the transmission and storage of video, so video compression has remained a hot field of research and application worldwide since the 1980s. With the development of digital video encoding and decoding technology, digital video applications now cover fields such as television broadcasting, digital cinema, distance education, telemedicine, video surveillance, video conferencing, and streaming-media transmission, and a number of well-known video application enterprises have emerged. To ensure interoperability between the encoding and decoding products of different manufacturers, corresponding video coding standards have been promulgated.
Since digital video contains many kinds of redundant data, no single encoding tool can by itself achieve efficient video compression. The first-generation video coding standard H.261, released in the 1980s, adopted a hybrid video coding framework containing compression tools such as prediction, transform, quantization, and entropy coding, which can effectively remove temporal, spatial, visual, and statistical (information-entropy) redundancy in digital video and realize efficient compression. The hybrid video coding structure has therefore been retained by every subsequent generation of video coding standards. The first edition of the High Efficiency Video Coding (HEVC) standard was published in 2013 by the Video Coding Experts Group (VCEG) of the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) together with the Moving Picture Experts Group (MPEG) of the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC); its compression performance roughly doubled that of the H.264/AVC standard, and HEVC coding methods have been further optimized in the years since.
In video coding, the quantization parameter (QP) and the Lagrange multiplier λ jointly determine the distortion of the coded video and the number of bits required for coding, and there is a close relationship between them. The HEVC encoder employs a hierarchical coding structure: the quantization parameter of a coded frame is determined by its position in a group of pictures (GOP) and the encoder input quantization parameter QP_0:

QP_HM = QP_0 + ΔQP.

In the HEVC low-delay coding configuration the GOP size is 4, and the corresponding ΔQP values are 3, 2, 3, and 1, respectively. The Lagrangian multiplier used in encoding is then computed from QP_HM:

λ_HM = w_L × 2^((QP_HM − 12)/3),

where w_L is a weighting factor related to the level to which the coded frame belongs; its value accounts, to a certain extent, for the differing rate-distortion importance of coded frames at different levels. In addition, the R-λ rate-control algorithm of HEVC first determines the Lagrange multiplier of the frame or coding tree unit (CTU) to be encoded, and then calculates the quantization parameter from it: QP = 4.2005 × ln(λ) + 13.7122.
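The QP/λ mappings above can be sketched in a few lines of Python. Here `w_level` stands in for the level-dependent weight w_L (0.57 is only an illustrative default, not the value HM actually derives from slice type and hierarchy level), and QP_0 = 32 is just an example encoder input.

```python
import math

def qp_to_lambda(qp, w_level=0.57):
    """HM-style Lagrange multiplier from a quantization parameter.

    w_level stands in for the level-dependent weight w_L; 0.57 is an
    illustrative placeholder, not the value HM actually computes.
    """
    return w_level * 2.0 ** ((qp - 12) / 3.0)

def lambda_to_qp(lmbda):
    """Inverse mapping used by the R-lambda rate control (formula from the text)."""
    return 4.2005 * math.log(lmbda) + 13.7122

# Frame-level QPs in the low-delay configuration (GOP size 4, offsets 3, 2, 3, 1)
QP0 = 32                                      # example encoder input QP
frame_qps = [QP0 + d for d in (3, 2, 3, 1)]   # -> [35, 34, 35, 33]
```

Note that the two formulas are not exact inverses of each other; the logarithmic fit is what the R-λ rate-control algorithm uses in practice.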
In the hybrid video coding framework, predictive coding contributes greatly to efficient compression: intra-frame prediction and inter-frame prediction effectively remove spatial and temporal redundancy in video data. However, predictive coding also creates strong rate-distortion dependencies between coded frames and between basic coding units; that is, the current coding decision limits the rate-distortion performance achievable by the subsequent coding process. Exploiting this rate-distortion dependency for adaptive bit-resource allocation can further improve the compression performance of the encoder, and the allocation can be realized by adjusting the quantization parameter and the Lagrange multiplier during encoding.
Disclosure of Invention
The invention optimizes coded-bit resource allocation by exploiting the rate-distortion dependency and provides a secondary coding optimization method.
The technical scheme of the invention is as follows:
the secondary coding optimization method comprises the following steps:
S1, set the start frame of the video sequence as an I frame at Level 0, treating the I frame as an independent GOP; optimize the start I frame by setting the I-frame quantization parameter QP_I according to the range of the encoder input quantization parameter QP_0;
S2, the encoder reads in the frames to be encoded of one GOP;
S3, first encoding: encode the current frame once using the quantization parameter QP_HM,i of the HEVC default setting, obtaining the frame-level temporal impact factor k_i of the current frame and the block-level temporal impact factor k_B,j of every 16 × 16 pixel block within the frame, where D_i and ε_i are the coding distortion and motion-compensated prediction error of the current frame, respectively, and D_B,j and ε_B,j are the coding distortion and motion-compensated prediction error of the j-th 16 × 16 pixel block in the current frame, respectively;
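The exact expressions for k_i and k_B,j do not survive in this text, so the sketch below assumes the simplest reading consistent with the definitions above, a ratio of coding distortion to motion-compensated prediction error, purely as an illustration; `temporal_impact_factor` and the ε notation are placeholders, not the patent's own.

```python
def temporal_impact_factor(distortion, pred_error):
    """Hypothetical form of the temporal impact factor.

    The patent defines k_i from the coding distortion D_i and the
    motion-compensated prediction error eps_i of the frame (and k_B,j
    likewise per 16x16 block); the exact formula is not reproduced in
    this text, so a simple distortion-to-error ratio is used here as
    a placeholder.
    """
    return distortion / pred_error if pred_error > 0 else 0.0

# Frame level: one factor per coded frame.
k_i = temporal_impact_factor(1200.0, 3000.0)    # -> 0.4

# Block level: one factor per 16x16 block in the frame (example values).
block_D = [50.0, 80.0, 0.0]
block_E = [100.0, 160.0, 0.0]
k_B = [temporal_impact_factor(d, e) for d, e in zip(block_D, block_E)]
```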
S4, after the first encoding finishes, restore the reference-list information of the encoder: the first encoding neither outputs the bitstream of the current frame nor stores the reconstructed image, and after encoding the picture linked list in the encoder is reset to its state before the current frame was encoded, including restoring the reference-frame identifiers in the picture linked list;
S5, judge whether a scene switch occurs; if so, go to step S6, otherwise go to step S7. The judgment is as follows: if p_i is sufficiently large relative to p̄ and p_i > 10, the current frame is judged to contain a scene switch, where p_i is the average motion-compensated prediction absolute error of the current frame and p̄ is the mean of the average motion-compensated prediction absolute errors of the previous 6 frames;
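A minimal sketch of the scene-switch test follows. The p_i > 10 condition is stated in the text, while the comparison of p_i against p̄ (the mean error of the previous 6 frames) is assumed here to be a ratio test with a hypothetical threshold `ratio`.

```python
from collections import deque

def is_scene_cut(p_i, history, ratio=2.0):
    """Scene-switch test sketched from the patent's description.

    Only the p_i > 10 condition is legible in the source; comparing
    p_i against the 6-frame mean via `ratio` is an assumption made
    for illustration.
    """
    if len(history) < 6:
        return False
    p_bar = sum(history) / len(history)
    return p_i > ratio * p_bar and p_i > 10

# Average MCP absolute error of the last 6 frames (example values).
history = deque([4.0, 4.2, 3.9, 4.1, 4.0, 4.3], maxlen=6)

print(is_scene_cut(12.5, history))   # large jump in prediction error -> True
print(is_scene_cut(4.4, history))    # normal inter-frame motion -> False
```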
S6, set the current-frame quantization parameter QP_i = QP_0 and encode; output the bitstream and store the reconstructed image in the encoder's default manner, then go to step S8;
S7, set the quantization parameter QP_i of the current frame, where QP_HM,i is the quantization parameter the current frame would be assigned in the original encoder and Round(·) is the rounding operator. If the current frame is a key frame, increasing QP_i would multiply the resulting loss of coding quality, because distortion in a key frame propagates directly to many subsequent frames; the bits saved would then not offset the total quality loss, and coding performance could ultimately drop. The quantization parameter of key frames is therefore not adjusted in this step.

From QP_i, compute the frame-level Lagrange multiplier λ_p; then obtain the Lagrange multiplier λ_n and quantization parameter QP_n of each CTU in the current frame from:
QP_n = 4.2005 × ln(λ_n) + 13.7122
where M is the number of 16 × 16 pixel blocks contained in the n-th CTU, N is the number of CTUs in the coded frame, and w_n and W_n are intermediate variables. Encode each CTU of the current frame with the computed Lagrange multiplier λ_n and quantization parameter QP_n, and output the bitstream and store the reconstructed image in the encoder's default manner;
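Of the CTU-level computation, only the λ_n → QP_n mapping is fully legible in this text, so the sketch below covers just that conversion; rounding and clamping to the valid HEVC QP range [0, 51] are practical additions, not stated in the source.

```python
import math

def ctu_qp_from_lambda(lambda_n, qp_min=0, qp_max=51):
    """Per-CTU QP from its Lagrange multiplier:
    QP_n = 4.2005 * ln(lambda_n) + 13.7122 (formula from the text).

    Rounding and clamping to the valid HEVC QP range are practical
    additions made here for robustness.
    """
    qp = 4.2005 * math.log(lambda_n) + 13.7122
    return max(qp_min, min(qp_max, round(qp)))
```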
s8, judging whether the coded frame is the last frame of the video sequence, if so, ending the coding; otherwise, continuously determining whether the encoded frame is the last frame in the current GOP, if so, returning to step S2, otherwise, returning to step S3.
The scheme of the invention differs from conventional methods as follows. A frame-level temporal impact factor k_i and a block-level temporal impact factor k_B,j are proposed to measure rate-distortion dependency. The initial I frame of the video sequence is optimized: the coding quality of the I frame bounds the rate-distortion performance achievable by the subsequent P/B frames, so moderately raising I-frame quality improves rate-distortion performance for the whole coded video. Because the rate-distortion dependency is weaker at high bit rates and stronger at low bit rates, the invention sets the I-frame quantization parameter QP_I according to the range of the encoder input quantization parameter QP_0. The frame-level factor k_i and block-level factor k_B,j are obtained with a simplified coding pass: to reduce computational complexity, the RDO mode-selection process of the first encoding skips most mode decisions and uses only the 64 × 64, 32 × 32, and 16 × 16 inter-prediction modes. Different optimization strategies are applied to a coded frame depending on whether a scene switch occurs.
The invention has the beneficial effects that:
the rate-distortion dependence strength between the current frame and the reference frame and the rate-distortion importance degree of each coding block in the current frame to the subsequent coding process are obtained through simplified first-time coding, and the rate-distortion dependence strength and the rate-distortion importance degree are further used for guiding the optimization of the coding resources of the second-time coding frame level and the CTU level. In HEVC encoder HM-16.7, the present invention achieves code rate savings of on average 5.1% and 5.3% in both low latency B frame (LDB) and low latency P frame (LDP) coding configurations, respectively, with an average increase in coding complexity of 23%.
Drawings
FIG. 1 is a principal flow diagram of the present invention;
fig. 2 is an exemplary diagram of reference relationships in HEVC low-latency coding;
FIG. 3 is a statistical chart of the frame-level temporal impact factor k_i in the present invention;
FIG. 4 is a schematic visualization example of the block-level temporal impact factor k_B,j in the present invention;
FIG. 5 is a graph comparing rate-distortion curves;
fig. 6 is a diagram illustrating the coding time increase and rate saving of the present invention with respect to an HEVC encoder.
Detailed Description
The invention is further explained below, and its effectiveness demonstrated, with reference to the drawings and a simulation example.
Examples
The embodiment is developed in Visual Studio 2013 and implemented on the HEVC reference software HM-16.7.
Fig. 1 is a flow chart of the main steps of a secondary coding optimization method, specifically including:
step 1: and optimizing the starting frame of the video sequence. According to input quantization parameter QP of encoder0Range setting I frame quantization parameter QPI:
Step 2: the frames to be encoded of a GOP are read in the HM-16.7 default way.
Step 3: perform the first encoding with a simplified encoding process. The current frame is first encoded using the HEVC default quantization-parameter setting and Lagrange-multiplier calculation, yielding frame-level and 16 × 16-pixel-block-level motion-compensated prediction errors and coding distortions; the frame-level temporal impact factor k_i of the current frame and the block-level temporal impact factor k_B,j of all 16 × 16 pixel blocks within the frame are then computed from these quantities.
Step 4: restore the reference list and related encoder state. The first encoding neither outputs the bitstream of the current frame nor stores the reconstructed image; after it completes, the picture linked list in the encoder is reset to its state before the current frame was encoded, including restoring the reference-frame identifiers in the picture linked list.
Step 5: judge whether a scene switch occurs; if so, go to step 6, otherwise go to step 7. The judgment is as follows: if p_i is sufficiently large relative to p̄ and p_i > 10, a scene switch is judged to occur at the i-th frame, where p_i is the average motion-compensated prediction absolute error of the current frame and p̄ is the mean of the average motion-compensated prediction absolute errors of the previous 6 frames.
Step 6: set the current-frame quantization parameter QP_i = QP_0 and encode; output the bitstream and store the reconstructed image in the encoder's default manner.
Step 7: set the current-frame quantization parameter QP_i, where QP_HM,i is the quantization parameter the current frame is assigned in the original HEVC encoder HM and Round(·) is the rounding operator. Note that this quantization-parameter adjustment is applied only to coded frames at levels 2 and 3; the quantization parameter of key frames keeps its setting in the original HEVC encoder.
From QP_i, compute the frame-level Lagrange multiplier λ_p; then obtain the Lagrange multiplier λ_n and quantization parameter QP_n of each CTU in the current frame from:

QP_n = 4.2005 × ln(λ_n) + 13.7122

where M is the number of 16 × 16 pixel blocks contained in the n-th CTU and N is the number of CTUs in the coded frame. Finally, when no scene switch occurs, each CTU of the current frame is encoded with the computed λ_n and QP_n, and the bitstream is output and the reconstructed image stored in the encoder's default manner.
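The key-frame exemption in step 7 amounts to a simple gate on the coding level. In this sketch, `delta` is a hypothetical stand-in for the QP adjustment term, whose exact form is not reproduced in this text.

```python
def frame_qp(qp_hm_i, level, delta=0.0):
    """Step-7 QP gating sketch.

    Only level-2 and level-3 frames receive the adjustment; key frames
    keep the original HM quantization parameter. `delta` is a stand-in
    for the elided adjustment term.
    """
    if level in (2, 3):
        return round(qp_hm_i + delta)   # adjusted and rounded, per step 7
    return qp_hm_i                      # key frame: unchanged
```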
Step 8: determine whether the coded frame is the last frame of the video sequence. If so, end encoding; if not, determine whether it is the last frame of the current GOP. If so, jump to step 2 to read the next GOP; if not, jump to step 3 to encode the next frame of the current GOP.
The bitstream produced by the coding of the invention conforms to the syntax of the HEVC standard and can be decoded by a standard HEVC decoder. Coding experiments were performed under the HEVC common test conditions with the two encoder configurations LDB and LDP; the reference relationships are shown in fig. 2. As can be seen from FIGS. 3 and 4, the proposed frame-level temporal impact factor k_i and block-level temporal impact factor k_B,j effectively characterize the rate-distortion dependency.
Fig. 5 compares the rate-distortion curves of the test sequence ParkScene under the LDB and LDP coding configurations; it can be seen that the rate-distortion performance of the invention is better than that of the original HEVC encoder HM-16.7 at both low and high bit rates.
Fig. 6 shows the coding-time increase and bit-rate saving of the invention relative to the original HEVC encoder HM-16.7: with an average 23% increase in coding complexity, the invention achieves average bit-rate savings of 5.1% and 5.3% under the LDB and LDP configurations, respectively, a substantial rate-distortion performance improvement.