CN114640849B - Live video encoding method, device, computer equipment and readable storage medium
- Publication number: CN114640849B (application CN202210288184.0A)
- Authority: CN (China)
- Legal status: Active
Classifications
- H04N19/124 — Quantisation (adaptive coding of digital video signals)
- H04N19/176 — Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
- H04N19/567 — Motion estimation based on rate distortion criteria
Abstract
The application relates to the technical fields of network live streaming and video coding, and provides a live video encoding method, apparatus, computer device and readable storage medium. The method comprises the following steps: obtaining the quantization parameter corresponding to each frame of live image; quantizing the live image in each prediction mode to obtain the quantization information corresponding to each coding block; obtaining the distortion information corresponding to each coding block in each prediction mode from the quantization information corresponding to the coding block, where the distortion information of each coding block in a target live image is obtained directly from the quantization information corresponding to the coding block and the corresponding inverse-quantization information; obtaining the target prediction mode corresponding to each coding block; and encoding each coding block according to its target prediction mode and the corresponding quantization information to obtain the encoded live video. Compared with the prior art, the method improves the encoding efficiency of live video and the live-viewing experience of users.
Description
Technical Field
The embodiment of the application relates to the technical field of network live broadcasting and video coding, in particular to a live video coding method, a live video coding device, computer equipment and a readable storage medium.
Background
With the rapid development of the live streaming industry, more and more internet platforms provide live streaming services to attract users to interact in live rooms.
Live services include video live streaming and voice live streaming. In a video live service, the video content a user watches at the client is called the live video, and the clarity and fluency of live video playback directly affect the user's live experience.
Clients in a live streaming scenario can be divided into anchor clients and viewer clients. After the anchor starts a broadcast, the anchor client is triggered to capture the live video, encodes it, and sends the encoded live video to the server; the viewer client pulls the encoded live video from the server, decodes it, and plays it. In this process, encoding the live video is a key link in guaranteeing the clarity and smoothness of the live video and improving the user's live experience.
Currently, video coding is mostly developed based on the HEVC standard, for example the x265 encoder. However, because HEVC-based video coding is relatively complex and inefficient, it struggles to meet users' growing requirements for the clarity and smoothness of live video playback, and the user's live experience cannot be further improved.
Disclosure of Invention
The embodiments of the application provide a live video encoding method, apparatus, computer device and readable storage medium, which address the technical problems that live video encoding is relatively complex and inefficient, and that playback fluency cannot be improved while playback clarity is guaranteed. The technical scheme is as follows:
in a first aspect, an embodiment of the present application provides a live video encoding method, including the steps of:
acquiring a quantization parameter corresponding to each frame of live image in a live video;
quantizing the live image according to the corresponding quantization parameters in each prediction mode to obtain quantization information corresponding to each coding block in the live image in each prediction mode; the encoding block is obtained by dividing the live image;
Obtaining distortion information corresponding to each coding block in each prediction mode according to quantization information corresponding to each coding block in each prediction mode; the distortion information corresponding to each coding block in the target live image is obtained according to quantization information corresponding to the coding block and inverse quantization information corresponding to the coding block; the frame type of the target live image is a non-reference bidirectional difference frame;
obtaining rate-distortion optimization information corresponding to each coding block in each prediction mode; the rate-distortion optimization information comprises the distortion information and prediction bit information, where the prediction bit information is the predicted bit information required to encode the coding block;
obtaining target prediction modes corresponding to the coding blocks according to rate distortion optimization information corresponding to the coding blocks in the prediction modes; the rate distortion optimization information corresponding to the coding block in the target prediction mode is minimum;
and encoding each encoding block according to a target prediction mode corresponding to each encoding block and quantization information corresponding to the encoding block in the target prediction mode to obtain the encoded live video.
In a second aspect, an embodiment of the present application provides a live video encoding apparatus, including:
the first acquisition unit is used for acquiring live video and quantization parameters corresponding to each frame of live image in the live video;
the first quantization unit is used for quantizing the live image according to the corresponding quantization parameters in each prediction mode to obtain quantization information corresponding to each coding block in the live image in each prediction mode; the encoding block is obtained by dividing the live image;
the second obtaining unit is used for obtaining distortion information corresponding to each coding block in each prediction mode according to quantization information corresponding to each coding block in each prediction mode; the distortion information corresponding to each coding block in the target live image is obtained according to quantization information corresponding to the coding block and inverse quantization information corresponding to the coding block; the frame type of the target live image is a non-reference bidirectional difference frame;
a third obtaining unit, configured to obtain rate-distortion optimization information corresponding to each coding block in each prediction mode; the rate-distortion optimization information comprises the distortion information and prediction bit information, where the prediction bit information is the predicted bit information required to encode the coding block;
The first determining unit is used for obtaining a target prediction mode corresponding to each coding block according to rate distortion optimization information corresponding to each coding block in each prediction mode; the rate distortion optimization information corresponding to the coding block in the target prediction mode is minimum;
and the first coding unit is used for coding each coding block according to the target prediction mode corresponding to the coding block and the quantization information corresponding to the coding block in the target prediction mode to obtain the coded live video.
In a third aspect, embodiments of the present application provide a computer device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method according to the first aspect when executing the computer program.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method according to the first aspect.
In the embodiments of the application, a live video and the quantization parameter corresponding to each frame of live image in the live video are obtained; the live image is quantized according to the corresponding quantization parameter in each prediction mode to obtain the quantization information corresponding to each coding block in the live image in each prediction mode, the coding blocks being obtained by dividing the live image; the distortion information corresponding to each coding block in each prediction mode is obtained from the quantization information corresponding to the coding block in that mode, where the distortion information of each coding block in a target live image is obtained from the quantization information corresponding to the coding block and the inverse-quantization information corresponding to the coding block, the frame type of the target live image being a non-reference bidirectional difference frame; the rate-distortion optimization information corresponding to each coding block in each prediction mode is obtained, comprising the distortion information and the prediction bit information, where the prediction bit information is the predicted bit information required to encode the coding block; the target prediction mode corresponding to each coding block is obtained from the rate-distortion optimization information of the coding block in each prediction mode, the rate-distortion optimization information in the target prediction mode being the minimum; and each coding block is encoded according to its target prediction mode and the quantization information corresponding to the coding block in that mode, to obtain the encoded live video. In determining the target prediction mode for each coding block, the method simplifies, for target live images whose frame type is a non-reference bidirectional difference frame, the process of obtaining the distortion information corresponding to each coding block: the distortion information is obtained directly from the quantization information and the inverse-quantization information corresponding to the coding block. This shortens the time needed to determine the target prediction mode for each coding block in the target live image, thereby shortening the encoding time of the live video and improving its encoding efficiency. Moreover, because a target live image whose frame type is a non-reference bidirectional difference frame is never used as a reference frame for encoding blocks in other live images, simplifying the process of obtaining its distortion information affects neither the encoding quality nor the resolution of the live video, so the clarity and fluency of the live video during playback are guaranteed and the user's live experience is improved.
For a better understanding and implementation, the technical solutions of the present application are described in detail below with reference to the accompanying drawings.
Drawings
Fig. 1 is an application scenario schematic diagram of a live video encoding method provided in an embodiment of the present application;
fig. 2 is a schematic diagram of another application scenario of the live video encoding method provided in the embodiment of the present application;
fig. 3 is a flow chart of a live video encoding method according to a first embodiment of the present application;
fig. 4 is a schematic flowchart of S101 in the live video encoding method provided in the first embodiment of the present application;
fig. 5 is a schematic flowchart of S102 in the live video encoding method provided in the first embodiment of the present application;
fig. 6 is a schematic flowchart of S103 in the live video encoding method provided in the first embodiment of the present application;
fig. 7 is a schematic flowchart of S104 in the live video encoding method provided in the first embodiment of the present application;
fig. 8 is a schematic structural diagram of a live video encoding device according to a second embodiment of the present application;
fig. 9 is a schematic structural diagram of a computer device according to a third embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.
The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the present application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, without departing from the scope of the present application, a first message may also be referred to as a second message, and similarly, a second message may also be referred to as a first message. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
As will be appreciated by those skilled in the art, the terms "client", "terminal", and "terminal device" as used herein cover both devices that contain only a wireless signal receiver with no transmitting capability and devices whose receiving and transmitting hardware supports two-way communication over a two-way communication link. Such devices may include: a cellular or other communication device, such as a personal computer or tablet, with a single-line display, a multi-line display, or no multi-line display; a PCS (Personal Communications Service) device that may combine voice, data processing, facsimile and/or data communication capabilities; a PDA (Personal Digital Assistant) that may include a radio frequency receiver, pager, internet/intranet access, web browser, notepad, calendar and/or GPS (Global Positioning System) receiver; and a conventional laptop and/or palmtop computer or other appliance that has and/or includes a radio frequency receiver. As used herein, a "client" or "terminal device" may be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or adapted and/or configured to operate locally and/or in a distributed fashion at any other location on earth and/or in space. A "client" or "terminal device" may also be a communication terminal, an internet terminal, or a music/video playing terminal, for example a PDA, a MID (Mobile Internet Device), and/or a mobile phone with music/video playing functions, or a device such as a smart TV or set-top box.
The hardware referred to by the names "server", "client", "service node", etc. in this application is essentially computer equipment with the performance of a personal computer: a hardware device having the essential components of the von Neumann architecture, such as a central processing unit (including an arithmetic unit and a controller), memory, and input and output devices. A computer program is stored in the memory; the central processing unit loads the program from external storage into memory, executes its instructions, and interacts with the input and output devices to complete a specific function.
It should be noted that the concept of "server" as referred to in this application is equally applicable to the case of a server farm. The servers should be logically partitioned, physically separate from each other but interface-callable, or integrated into a physical computer or group of computers, according to network deployment principles understood by those skilled in the art. Those skilled in the art will appreciate this variation and should not be construed as limiting the implementation of the network deployment approach of the present application.
Referring to fig. 1, fig. 1 is a schematic diagram of an application scenario of the live video encoding method provided in an embodiment of the present application. The scenario includes an anchor client 101, a server 102 and a viewer client 103, where the anchor client 101 and the viewer client 103 interact through the server 102.
The clients proposed in the embodiment of the present application include the anchor client 101 and the audience client 103.
It should be noted that the concept of "client" has various interpretations in the prior art; for example, it may be understood as an application installed in a computer device, or as a hardware device corresponding to a server.
In the embodiments of the present application, the term "client" refers to a hardware device corresponding to a server, more specifically, refers to a computer device, for example: smart phones, smart interactive tablets, personal computers, etc.
When the client is a mobile device such as a smart phone and an intelligent interaction tablet, a user can install a matched mobile terminal application program on the client, and can access a Web terminal application program on the client.
When the client is a non-mobile device such as a Personal Computer (PC), the user can install a matched PC application program on the client, and can access a Web application program on the client.
The mobile terminal application program refers to an application program which can be installed in mobile equipment, the PC terminal application program refers to an application program which can be installed in non-mobile equipment, and the Web terminal application program refers to an application program which needs to be accessed through a browser.
Specifically, the Web application may be further divided into a mobile version and a PC version according to the difference of client types, and there may be a difference between the page layout manner and the available server support of the two.
In the embodiment of the present application, the types of live broadcast applications provided to the user are classified into a mobile-side live broadcast application, a PC-side live broadcast application, and a Web-side live broadcast application. The user can autonomously select the mode of participating in the network live broadcast according to different types of the client.
Depending on the identity of the user employing them, the present application divides clients into the anchor client 101 and the viewer client 103.
The anchor client 101 is the end that transmits the live video, and is generally the client used by the anchor (i.e., the broadcasting user) in video live streaming.
The viewer client 103 is the end that receives and watches the live video, and is generally the client used by a viewer (i.e., a live viewing user) in video live streaming.
The hardware pointed to by the anchor client 101 and the audience client 103 essentially refers to computer devices, which may be, as shown in fig. 1, in particular, smart phones, smart interactive tablets, personal computers, and the like. Both the anchor client 101 and the spectator client 103 may access the internet via known network access means to establish a data communication link with the server 102.
The server 102 acts as a service server and may be responsible for further interfacing with related audio data servers, video streaming servers, and other servers providing related support, etc., to form a logically associated service cluster for serving related end devices, such as the anchor client 101 and the viewer client 103 shown in fig. 1.
In this embodiment of the present application, the anchor client 101 and the viewer client 103 can join the same live room (i.e., live channel). A live room is a chat room implemented by means of internet technology, generally with audio/video playback control functions. The anchor broadcasts in the live room through the anchor client 101, and a viewer using the viewer client 103 can log into the server 102 to watch the live broadcast in that room.
Specifically, the anchor logs into the server 102 through the anchor client 101, triggering the anchor client 101 to load a broadcast-start interface in which a broadcast-start control is displayed. The anchor can start the live broadcast by clicking this control; if the anchor client 101 is in video live mode, the anchor client 101 is triggered to capture live video.
The live video is video data collected by a camera that establishes data connection with the anchor client 101, and the camera may be a self-contained camera of the anchor client 101 or an external camera of the anchor client 101.
The anchor client 101 encodes the collected live video, and pushes the encoded live video to the server 102.
If a viewer enters the live room created by the anchor through the viewer client 103, the viewer client 103 is triggered to pull the encoded live video from the server 102, decode it, and output it to the live room interface, so that the viewer can watch the live video in the live room.
The manner of entering the live room created by the host is not limited herein, and the viewer may enter the live room created by the host by way of live room recommendation pages, manually searching the live room, and sliding the live room interface up and down.
Referring to fig. 2, fig. 2 is a schematic diagram of another application scenario of the live video encoding method according to the embodiment of the present application. In fig. 2, the server 102 is a server cluster, where the server cluster at least includes a service server 1021 and a streaming media server 1022, the service server 1021 is responsible for providing services related to live service logic, and the streaming media server 1022 is responsible for providing services related to streaming media data, where live video is a streaming media data.
The video camera which establishes data connection with the anchor client 101 collects live video, the anchor client 101 encodes the live video, then pushes the encoded live video to the streaming media server 1022, and the audience client 103 pulls the encoded live video from the streaming media server 1022 after joining the live room created by the anchor.
In the embodiment of the application, since the encoding quality and encoding efficiency of live video directly influence the clarity and fluency of live video playback, the embodiments of the application provide a live video encoding method to address the technical problems that live video encoding is relatively complex and inefficient. Referring to fig. 3, fig. 3 is a flowchart of a live video encoding method according to a first embodiment of the present application, where the method includes the following steps:
S101: and acquiring quantization parameters corresponding to each frame of live image in the live video.
S102: quantizing the live image according to the corresponding quantization parameters in each prediction mode to obtain quantization information corresponding to each coding block in the live image in each prediction mode; the encoding block is obtained by dividing the live image.
S103: obtaining distortion information corresponding to each coding block in each prediction mode according to quantization information corresponding to each coding block in each prediction mode; the distortion information corresponding to each coding block in the target live image is obtained according to the quantization information corresponding to the coding block and the inverse quantization information corresponding to the coding block; the frame type of the target live image is a non-reference bi-directional difference frame.
S104: obtaining rate distortion optimization information corresponding to each coding block in each prediction mode; wherein the rate-distortion optimization information includes distortion information and prediction bit information, the prediction bit information being bit information required for the predicted encoding of the encoded block.
S105: obtaining target prediction modes corresponding to all the coding blocks according to rate distortion optimization information corresponding to all the coding blocks in all the prediction modes; and the rate distortion optimization information corresponding to the coding block in the target prediction mode is minimum.
S106: and encoding each encoding block according to the target prediction mode corresponding to each encoding block and the quantization information corresponding to the encoding block in the target prediction mode to obtain the encoded live video.
In this embodiment, the live video encoding method is described with the anchor client as the execution body. Meanwhile, in order to more clearly illustrate each step in the live video coding method, a description of a server angle is also assisted to help understand the overall scheme.
Regarding step S101, the anchor client acquires the live video and quantization parameters corresponding to each frame of live image in the live video.
The live video is video data collected by a camera which establishes data connection with the anchor client, wherein the camera can be a self-contained camera of the anchor client or an external camera of the anchor client.
In the embodiment of the present application, the live video includes a plurality of frames of live images, and the live video is quantized, that is, a plurality of frames of live images are quantized.
Quantization is explained below. Quantization refers to the process of mapping continuous signal values (or a large number of discrete values) onto a finite number of discrete amplitudes, i.e., a many-to-one mapping of signal values. Quantization thereby shrinks the value space of the signal, yielding a better compression effect.
In this embodiment, the nature of the live video is also a signal, and after a plurality of frames of live images in the live video are quantized, the live images are encoded, so that redundant information to be encoded can be reduced and the length of image encoding can be shortened on the premise of not reducing visual effects, thereby improving the encoding effect of the live video.
Because quantization maps signal values many-to-one, it causes loss in the pixel values of the pixel points in the live image, which in turn distorts the live image; the quantization parameter therefore needs to be set reasonably.
When the quantization parameter is small, more detail of the live image is retained and distortion is weaker, but the bit rate required to encode the live image rises; when the quantization parameter is large, detail is lost and distortion is stronger, but the required bit rate falls.
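To make the trade-off concrete, the following is a minimal Python sketch of quantization and inverse quantization, assuming the approximate HEVC relationship in which the quantization step size doubles every 6 QP values; the rounding offsets and scaling lists of a real encoder are omitted, and the function names are illustrative.

```python
import numpy as np

def qstep_from_qp(qp: int) -> float:
    # In HEVC the quantization step size roughly doubles every 6 QP values.
    return 2.0 ** ((qp - 4) / 6.0)

def quantize(coeffs: np.ndarray, qp: int) -> np.ndarray:
    # Many-to-one mapping: a whole range of coefficient values collapses
    # onto each integer level, shrinking the value space of the signal.
    return np.round(coeffs / qstep_from_qp(qp)).astype(np.int32)

def dequantize(levels: np.ndarray, qp: int) -> np.ndarray:
    # Inverse quantization recovers only an approximation of the input;
    # the gap between the two is the quantization distortion.
    return levels.astype(np.float64) * qstep_from_qp(qp)
```

A larger QP gives a larger step, so more values collapse together and fewer bits are needed, at the cost of detail.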
In the embodiment of the application, the anchor client acquires the quantization parameter corresponding to each frame of live image, and the quantization parameter is not a fixed value, but is adjusted according to the live images of different frames.
In an alternative embodiment, referring to fig. 4, step S101 includes steps S1011-S1012, which are specifically as follows:
S1011: and acquiring the first bit rate information, the complexity degree information of each frame of live image and the importance degree information of each frame of live image.
S1012: obtaining quantization parameters corresponding to each frame of live broadcast image according to the first bit rate information, the complexity information of each frame of live broadcast image and the importance information of each frame of live broadcast image; the larger the quantization parameter corresponding to the live image is, the smaller the bit rate information allocated to the live image is, and the average value of the bit rate information allocated to each frame of live image does not exceed the first bit rate information.
With respect to step S1011, the anchor client may obtain the first bit rate information. The first bit rate information is the number of bits transmitted per unit time, the unit is bps, and in the art, the bit rate and the code rate are the same concept.
The first bit rate information is used for macroscopically regulating and controlling bit rate information which can be distributed to each frame of live image.
The corresponding importance and complexity of each frame of live image are different due to the different positions of each frame of live image in the live video and the content of the information carried by each frame of live image. In order to more reasonably adjust the quantization parameters, complexity information of each frame of live image and importance information of each frame of live image need to be acquired.
Regarding step S1012, the anchor client obtains quantization parameters corresponding to each frame of live image according to the first bit rate information, the complexity information of each frame of live image, and the importance information of each frame of live image.
The smaller the complexity information and importance information of a live image, the larger its corresponding quantization parameter and the smaller the bit rate information allocated to it; in average-bit-rate mode, the average of the bit rate information allocated to the frames does not exceed the first bit rate information.
In this embodiment, the anchor client adjusts the quantization parameter of each frame according to the first bit rate information and each frame's complexity and importance information, so that more important and more complex live images receive smaller quantization parameters, which reduces the loss of detail during quantization and increases the bit rate information allocated to those images.
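The embodiment does not give a concrete formula for combining the first bit rate information with the complexity and importance information, so the following heuristic is purely illustrative (the scoring scheme, value ranges, and names are assumptions): frames that are more complex and more important receive a smaller quantization parameter.

```python
def allocate_qp(base_qp: int, complexity: float, importance: float,
                qp_min: int = 10, qp_max: int = 51) -> int:
    """Illustrative per-frame QP adjustment; complexity and importance are
    assumed normalized to [0, 1]."""
    # Less complex / less important frames get a positive offset (larger QP,
    # fewer bits); more complex / more important frames get a negative one.
    offset = round(3 * (1.0 - complexity) + 3 * (1.0 - importance)) - 3
    return max(qp_min, min(qp_max, base_qp + offset))
```

A real rate controller would additionally rescale the resulting per-frame bit allocations so that their average does not exceed the first bit rate information.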
Regarding step S102, the anchor client quantizes the live image according to the corresponding quantization parameters in each prediction mode, so as to obtain quantization information corresponding to each coding block in the live image in each prediction mode.
Before quantization, the anchor client divides the live image to obtain a plurality of coding blocks.
In an alternative embodiment, the size of the division may be 64x64, resulting in a 64x64 encoded block, i.e. an encoded block consisting of 64 rows and 64 columns of pixels. It will be appreciated that, in a specific quantization process, the anchor client may further divide the 64×64 encoded block into smaller encoded blocks for quantization, which is not limited in detail herein.
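A minimal sketch of this division for a single plane, assuming edge padding when the frame dimensions are not multiples of the block size (the embodiment does not specify a padding policy):

```python
import numpy as np

def split_into_coding_blocks(plane: np.ndarray, size: int = 64):
    # Pad so height and width are multiples of `size`, then yield each
    # size x size coding block together with its top-left position.
    h, w = plane.shape
    padded = np.pad(plane, ((0, -h % size), (0, -w % size)), mode="edge")
    for y in range(0, padded.shape[0], size):
        for x in range(0, padded.shape[1], size):
            yield (y, x), padded[y:y + size, x:x + size]
```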
If the live image is in YUV format, pixel values are represented by a luminance component Y and chrominance components U and V, so a coding block comprises a luminance block and chrominance blocks that are quantized separately. In this embodiment, transforming, quantizing, and determining the target prediction mode are described collectively for the coding block, without distinguishing luminance and chrominance blocks in detail.
In this embodiment, after dividing the live image into a plurality of encoding blocks, the anchor client quantizes each encoding block in the live image according to the corresponding quantization parameter in each prediction mode.
The prediction mode will be explained first.
The prediction modes are classified into an inter prediction mode and an intra prediction mode. Each prediction mode is used to find the reference coding block closest to the coding block, so that the predicted pixel value of the coding block can be obtained from the pixel value of the reference coding block.
The inter prediction mode finds the closest reference coding block in live images encoded before or after the current one, while the intra prediction mode finds the closest reference coding block within the current live image.
The closer the obtained reference coding block is to the coding block, the smaller the residual information between the predicted pixel values and the original pixel values of the coding block; after the residual information is transformed and quantized, less bit information is needed to encode the resulting quantization information, which effectively improves the compression rate.
The manner in which different prediction modes find the closest reference coding block is different, and in order to improve the quality of video coding, it is necessary to determine the optimal prediction mode corresponding to each coding block. Therefore, in S102 of the present embodiment, the anchor client needs to obtain the quantization information corresponding to each encoding block in the live image in different prediction modes.
Referring to fig. 5, step S102 includes steps S1021 to S1024, which are specifically described as follows:
s1021: and acquiring a reference coding block corresponding to each coding block in each prediction mode.
S1022: and obtaining the predicted pixel value of each pixel point in the coding block in each prediction mode according to the pixel value of each reference pixel point in the reference coding block.
S1023: obtaining residual information corresponding to the coding blocks in each prediction mode according to original pixel values of all the pixel points in the coding blocks and predicted pixel values of all the pixel points in the coding blocks in each prediction mode; the residual information corresponding to the coding block comprises residual values corresponding to all pixel points in the coding block.
S1024: and carrying out transformation operation and quantization operation on residual information corresponding to the coding blocks in each prediction mode in sequence to obtain quantization information corresponding to each coding block in each prediction mode.
Regarding steps S1021 to S1022, the anchor client obtains the reference coding blocks corresponding to each coding block in each prediction mode, and obtains the predicted pixel values of each pixel point in the coding block in each prediction mode according to the pixel values of each reference pixel point in the reference coding block.
The prediction modes are divided into an Inter prediction mode and an intra prediction mode, and in an alternative embodiment, the Inter prediction mode includes a Merge mode and an Inter mode, and the intra prediction mode includes 35 types of modes, namely a DC mode, a Planar mode and 33 types of angle modes.
It will be appreciated that numerous inter and intra prediction modes already exist; since the present application does not improve the process by which the various prediction modes determine the reference coding block, the individual prediction modes are not described in detail.
The pixel value of each reference pixel point in the reference coding block refers to the reconstructed pixel value of each reference pixel point.
How the reconstructed pixel values of the reference pixel points in the reference coding block are obtained is reflected in step S103: since the reference coding block has already been encoded, its target prediction mode must have been determined before encoding, which means the distortion information corresponding to the reference coding block in that mode has already been calculated, and the reconstructed pixel values of the reference pixel points are obtained during that calculation.
The anchor client predicts the pixel values of each pixel point in the coding block according to the pixel values of each reference pixel point in the reference coding block to obtain the predicted pixel values of each pixel point in the coding block, and the specific prediction process is also different based on different prediction modes, which is not limited herein.
Regarding step S1023, the anchor client obtains residual information corresponding to the coding block in each prediction mode according to the original pixel values of each pixel in the coding block and the prediction pixel values of each pixel in the coding block in each prediction mode.
The residual information corresponding to the coding block comprises residual values corresponding to all pixel points in the coding block. The residual value of the pixel is the difference between the original pixel value of the pixel and the predicted pixel value of the pixel.
Regarding step S1024, the transformation operation and the quantization operation are sequentially performed on the residual information corresponding to the coding blocks in each prediction mode, so as to obtain the quantization information corresponding to each coding block in each prediction mode.
Transform operations under the HEVC standard comprise the discrete cosine transform (DCT) and the discrete sine transform (DST); the DST is used only for 4x4 luminance blocks in intra prediction modes (see the foregoing for the meaning of a luminance block).
Specifically, the anchor client performs transformation operation on residual information corresponding to the coding block, and then performs quantization operation, so as to obtain quantization information corresponding to the coding block.
The transformation process and quantization process are both prior art in the field of video coding and are not described here.
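Steps S1021 to S1024 amount to the standard predict-transform-quantize pipeline. The sketch below strings them together for one coding block, using a floating-point orthonormal DCT as a stand-in for HEVC's integer transforms (a deliberate simplification, not the encoder's actual arithmetic):

```python
import numpy as np
from scipy.fftpack import dct

def quantization_info(block: np.ndarray, predicted: np.ndarray, qp: int) -> np.ndarray:
    # S1023: residual information = original pixel values - predicted pixel values.
    residual = block.astype(np.float64) - predicted.astype(np.float64)
    # S1024, transform operation: 2-D DCT of the residual.
    coeffs = dct(dct(residual, axis=0, norm="ortho"), axis=1, norm="ortho")
    # S1024, quantization operation: divide by the step size and round.
    step = 2.0 ** ((qp - 4) / 6.0)
    return np.round(coeffs / step).astype(np.int32)
```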
Regarding step S103, the anchor client obtains distortion information corresponding to each coding block in each prediction mode according to the quantization information corresponding to each coding block in each prediction mode.
Before explaining step S103, I frames, P frames, and B frames in the field of video encoding are explained.
An I frame is a key frame; no reference is made to other pictures when it is encoded. A P frame is a difference frame that references the previous frame when encoded, and a B frame is a bidirectional difference frame that references both previous and subsequent frames when encoded.
In this embodiment, the key frame, the difference frame, and the bi-directional difference frame are all different frame types. The frame type of the target live image is a non-reference bidirectional difference frame, namely a non-reference B frame, and the non-reference B frame is not used as a reference image corresponding to other live images and is not used for searching a reference coding block.
Therefore, the anchor client can obtain the distortion information corresponding to the coding block in each prediction mode according to the quantization information corresponding to the coding block in each prediction mode and the inverse quantization information corresponding to the coding block in each prediction mode aiming at the target live image.
Specifically, the anchor client computes, for each prediction mode, the difference between the quantization information corresponding to the coding block and the inverse-quantization information corresponding to the coding block to obtain a first difference value, performs a Hadamard transform on the first difference value, and sums the resulting transform coefficients to obtain the distortion information corresponding to the coding block in that prediction mode. The distortion information corresponding to a coding block is a distortion value.
And performing inverse quantization operation on the quantization information corresponding to each coding block in each prediction mode to obtain inverse quantization information corresponding to each coding block in each prediction mode.
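A sketch of this shortcut, following the description above: distortion for a coding block of a non-reference B frame is measured directly from the quantization round-trip in the transform domain, with no inverse transform and no pixel reconstruction. Summing absolute Hadamard coefficients (SATD) is assumed here to match step S1035, and block sides are assumed to be powers of two.

```python
import numpy as np
from scipy.linalg import hadamard

def satd(diff: np.ndarray) -> float:
    # Sum of absolute transformed differences: 2-D Hadamard transform of a
    # square difference block, then the sum of absolute coefficients.
    h = hadamard(diff.shape[0])
    return float(np.abs(h @ diff @ h.T).sum())

def simplified_distortion(quant_info: np.ndarray, dequant_info: np.ndarray) -> float:
    # First difference value: quantization information minus the corresponding
    # inverse-quantization information for the same coding block.
    return satd(quant_info.astype(np.float64) - dequant_info)
```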
The following expands on why the process of obtaining the distortion information corresponding to a coding block can be simplified for the target live image. In an alternative embodiment, referring to fig. 6, step S103 includes steps S1031 to S1035, as follows:
s1031: and performing inverse quantization operation on the quantization information corresponding to each coding block in each prediction mode to obtain inverse quantization information corresponding to each coding block in each prediction mode.
S1032: and judging whether the coding blocks are obtained by dividing the target live image, if so, obtaining distortion information corresponding to each coding block in each prediction mode according to the difference value of quantization information corresponding to the coding block in each prediction mode and inverse quantization information corresponding to the coding block in each prediction mode.
S1033: if the coding block is obtained by non-dividing the target live image, performing inverse transformation operation on inverse quantization information corresponding to the coding block in each prediction mode to obtain inverse transformation information corresponding to the coding block in each prediction mode; the inverse transformation information corresponding to the coding block comprises inverse transformation values corresponding to all pixel points in the coding block.
S1034: and obtaining the reconstructed pixel value of each pixel point in the coding block in each prediction mode according to the inverse transformation value corresponding to each pixel point in the coding block in each prediction mode and the predicted pixel value of each pixel point in the coding block in each prediction mode.
S1035: and obtaining distortion information corresponding to each coding block in each prediction mode according to the difference value between the reconstructed pixel value of each pixel point in the coding block in each prediction mode and the original pixel value of each pixel point in the coding block in each prediction mode.
Regarding step S1031, the anchor client performs inverse quantization operation on the quantization information corresponding to each coding block in each prediction mode, to obtain inverse quantization information corresponding to each coding block in each prediction mode.
The inverse quantization operation is the inverse process of the quantization operation.
Regarding step S1032, it is determined whether the encoded block is obtained by dividing the target live image, if so, distortion information corresponding to each encoded block in each prediction mode is obtained according to a difference between quantization information corresponding to the encoded block in each prediction mode and inverse quantization information corresponding to the encoded block in each prediction mode.
The target live image has been explained above; here it is mainly explained why simplifying the calculation of the distortion information for each coding block in the target live image does not affect the encoding quality of the live video while improving encoding efficiency.
Because the frame type of the target live image is a non-reference bidirectional difference frame, it is not used as a reference image for other live images or searched for reference coding blocks. Consequently, the closest reference coding block is never taken from the target live image, and the reconstructed pixel values of its reference pixel points are never needed to predict the pixel values of another coding block.
The calculation in steps S1033 to S1035 below does produce the reconstructed pixel values, but it is complex and time-consuming and tends to reduce encoding efficiency. Therefore, in this embodiment, for the target live image, the distortion information corresponding to each coding block is obtained directly from the difference between the quantization information and the inverse-quantization information corresponding to the coding block, which simplifies the process. Admittedly, the reconstructed pixel values of the pixel points in the coding blocks of the target live image are then unavailable, but this does not affect the encoding quality of the live video.
Regarding step S1033, if the encoding block is not obtained by dividing the target live image, performing an inverse transform operation on the inverse quantization information corresponding to the encoding block in each prediction mode, to obtain the inverse transform information corresponding to the encoding block in each prediction mode.
The inverse transformation information corresponding to the coding block comprises inverse transformation values corresponding to all pixel points in the coding block.
The inverse transformation operation is the inverse of the transformation operation.
Regarding step S1034, the anchor client obtains the reconstructed pixel values of each pixel point in the encoding block in each prediction mode according to the inverse transform values corresponding to each pixel point in the encoding block in each prediction mode and the predicted pixel values of each pixel point in the encoding block in each prediction mode.
Specifically, the anchor client obtains the reconstructed pixel value of each pixel point in the coding block in each prediction mode according to the sum of the inverse transformation value corresponding to each pixel point in the coding block in each prediction mode and the predicted pixel value of each pixel point in the coding block in each prediction mode.
Regarding step S1035, the anchor client obtains distortion information corresponding to each encoding block in each prediction mode according to the difference between the reconstructed pixel value of each pixel point in the encoding block in each prediction mode and the original pixel value of each pixel point in the encoding block in each prediction mode.
Specifically, the anchor client obtains a second difference value for the coding block in each prediction mode from the difference between the reconstructed pixel values and the original pixel values of the pixel points in the coding block in that mode, performs a Hadamard transform on the second difference value, and sums the absolute values of the resulting transform coefficients to obtain the distortion information corresponding to the coding block in each prediction mode. The distortion information corresponding to a coding block is a distortion value.
In this embodiment, the process of obtaining distortion information corresponding to each coding block in the target live image is simplified, and the distortion information corresponding to the coding block is directly obtained according to quantization information corresponding to the coding block and inverse quantization information corresponding to the coding block, so that the time for determining the target prediction mode corresponding to each coding block in the target live image is shortened, the coding time of the live video is shortened, and the coding efficiency of the live video is improved.
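For contrast, a sketch of the full path (S1031 and S1033 to S1035) used for coding blocks outside the target live image, again with an orthonormal DCT standing in for HEVC's integer transforms:

```python
import numpy as np
from scipy.fftpack import idct
from scipy.linalg import hadamard

def full_distortion(levels: np.ndarray, predicted: np.ndarray,
                    original: np.ndarray, qp: int) -> float:
    step = 2.0 ** ((qp - 4) / 6.0)
    dequant = levels.astype(np.float64) * step            # S1031: inverse quantization
    residual = idct(idct(dequant, axis=1, norm="ortho"),
                    axis=0, norm="ortho")                 # S1033: inverse transform
    recon = residual + predicted                          # S1034: reconstructed pixels
    diff = recon - original                               # S1035: versus original pixels
    h = hadamard(diff.shape[0])
    return float(np.abs(h @ diff @ h.T).sum())            # SATD of the second difference
```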
Regarding step S104, the anchor client acquires rate-distortion optimization information corresponding to each coding block in each prediction mode.
The rate-distortion optimization (RDO) information comprises the distortion information and the prediction bit information.
RDO = D + λ·R, where D is the distortion information, R is the prediction bit information, and λ is an adjustment parameter determined experimentally.
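In code form, the mode decision of steps S104 and S105 reduces to evaluating D + λ·R for every candidate prediction mode and keeping the minimum; the candidate tuple layout below is an assumption for illustration:

```python
def rd_cost(distortion: float, bits: int, lam: float) -> float:
    # Rate-distortion cost: RDO = D + lambda * R.
    return distortion + lam * bits

def pick_target_mode(candidates, lam: float) -> str:
    # candidates: iterable of (mode_name, distortion, predicted_bits) tuples;
    # the target prediction mode is the one with the minimum RD cost.
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))[0]

# Example: pick_target_mode([("merge", 1200.0, 40), ("intra_dc", 900.0, 95)], 6.5)
```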
How the distortion information is obtained has been explained; how the prediction bit information is obtained is explained below.
The prediction bit information is the predicted bit information required to encode the coding block. Because each prediction mode searches for the reference coding block in its own way and may find a different reference coding block, the data required to encode the coding block differs between prediction modes, and so does the bit information required to encode it.
In an alternative embodiment, referring to fig. 7, fig. 7 is a schematic flow chart of step S104 in the live video encoding method provided in the first embodiment of the present application, and step S104 includes steps S1041 to S1042, specifically as follows:
s1041: acquiring first parameters to be coded corresponding to each coding block in each prediction mode; the first parameter to be coded is used for confirming a reference coding block corresponding to the coding block; the pixel values of each reference pixel point in the reference coding block are used to obtain predicted pixel values of each pixel point in the coding block.
S1042: and obtaining bit information required by coding the coding block in each prediction mode according to the first to-be-coded parameter corresponding to each coding block in each prediction mode and the quantization information corresponding to each coding block in each prediction mode.
Regarding step S1041, the anchor client obtains the first parameters to be encoded corresponding to each encoding block in each prediction mode.
The first parameter to be coded is used for confirming a reference coding block corresponding to the coding block, and pixel values of all reference pixel points in the reference coding block are used for obtaining predicted pixel values of all pixel points in the coding block.
The first parameters to be encoded that need to be transmitted may be different for different prediction modes.
For example, when an inter prediction mode is used to find the reference coding block corresponding to a coding block, the first parameters to be encoded that must be transmitted include which frame of live image (i.e., the reference frame) contains the reference coding block, which specific inter prediction mode is used, the motion vector, and so on. A motion vector can be understood as the data needed to locate the reference coding block within the reference frame.
It will be appreciated that, due to different specific manners of determining the reference coding block corresponding to the coding block, the content included in the first parameter to be coded may be increased or decreased accordingly, which is not limited in detail herein.
For intra prediction modes, the first parameter to be coded mainly indicates which intra prediction mode is used, and so on.
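The first parameter to be coded can be pictured as a small record whose fields vary with the prediction mode. A hypothetical sketch follows; the field names (`ref_frame_index`, `motion_vector`, and so on) are illustrative, not taken from the text:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class FirstCodingParams:
    """Parameters that let the decoder locate the reference coding block."""
    is_inter: bool                                    # inter vs. intra prediction
    prediction_mode: int                              # which specific inter/intra mode
    ref_frame_index: Optional[int] = None             # reference frame (inter only)
    motion_vector: Optional[Tuple[int, int]] = None   # offset within the reference frame (inter only)
```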
Regarding step S1042, the anchor client obtains the bit information required for encoding the encoding block in each prediction mode according to the first to-be-encoded parameter corresponding to each encoding block in each prediction mode and the quantization information corresponding to each encoding block in each prediction mode.
The encoding process converts the data to be encoded into a binary code stream; therefore, the bit information required for encoding the coding block in step S1042 can be understood as the number of bits occupied when transmitting that binary code stream.
Because the first to-be-encoded parameter corresponding to each coding block and the quantization information corresponding to each coding block may differ between prediction modes, the bit information required for encoding the coding block may also differ. The bit information required for encoding the coding block in each prediction mode therefore needs to be obtained from the first to-be-encoded parameter corresponding to each coding block in each prediction mode and the quantization information corresponding to each coding block in each prediction mode.
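A toy way to estimate this bit count is to run the first parameters and the quantized coefficients through a simple entropy code and measure the output length. The sketch below uses a signed exponential-Golomb code purely as a stand-in, since the text does not specify the actual entropy coder:

```python
from typing import Iterable

def _ue_bits(v: int) -> int:
    # Bit length of the unsigned exponential-Golomb code for v >= 0.
    return 2 * (v + 1).bit_length() - 1

def _se_bits(v: int) -> int:
    # Map a signed value to an unsigned code number, then measure it.
    return _ue_bits(2 * abs(v) - 1 if v > 0 else -2 * v)

def predicted_block_bits(first_params: Iterable[int], quant_levels: Iterable[int]) -> int:
    """Toy estimate of the bits needed to encode one coding block:
    the first parameters to be encoded plus the quantized coefficients."""
    return sum(_se_bits(v) for v in first_params) + sum(_se_bits(q) for q in quant_levels)
```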
In an optional embodiment, the anchor client may further quantize quantization parameters corresponding to the coding block to obtain quantization information corresponding to the quantization parameters and a second parameter to be coded corresponding to the coding block.
The second parameter to be encoded corresponding to the coding block indicates how the quantization parameter corresponding to the coding block can be predicted from the quantization parameters corresponding to the reference coding block.
Then the first parameter to be coded corresponding to the coding block, the quantization information corresponding to the quantization parameter, and the second parameter to be coded corresponding to the coding block are encoded together to obtain the bit information required for encoding the coding block. It will be appreciated that performing the above procedure separately in each prediction mode yields the bit information required to encode the coding block in each prediction mode.
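One plausible form for the second parameter to be encoded is a QP delta relative to the reference coding block, sketched below; the delta representation is an assumption, as the text only says the quantization parameter is predicted from that of the reference coding block:

```python
def second_coding_param(block_qp: int, ref_block_qp: int) -> int:
    # Transmit only the delta; the decoder recovers block_qp as ref_block_qp + delta.
    return block_qp - ref_block_qp
```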
Regarding step S105, the anchor client obtains the target prediction mode corresponding to each coding block according to the rate distortion optimization information corresponding to each coding block in each prediction mode.
The rate-distortion optimization information corresponding to the coding block in the target prediction mode is the minimum among all the prediction modes.
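In code, selecting the target prediction mode is an argmin over the per-mode rate-distortion costs; a minimal sketch (the mode names and cost values are illustrative):

```python
from typing import Dict

def select_target_mode(rdo_by_mode: Dict[str, float]) -> str:
    """Return the prediction mode whose rate-distortion cost is minimal."""
    return min(rdo_by_mode, key=rdo_by_mode.get)

# Illustrative per-mode costs for one coding block (values are made up).
target = select_target_mode({"intra_dc": 412.5, "inter_uni": 301.0, "inter_bi": 288.75})
```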
Regarding step S106, the anchor client encodes each encoding block according to the target prediction mode corresponding to each encoding block and the quantization information corresponding to the encoding block in the target prediction mode, so as to obtain the encoded live video.
After determining the target prediction mode corresponding to each coding block, the anchor client encodes each coding block according to the target prediction mode corresponding to each coding block and quantization information corresponding to the coding block in the target prediction mode to obtain encoded coding blocks, thereby obtaining encoded live images and further obtaining encoded live videos.
Specifically, the anchor client acquires the target parameters to be coded corresponding to each coding block in the target prediction mode. The target parameter to be coded is used for confirming the reference coding block corresponding to the coding block in the target prediction mode; the pixel values of each reference pixel point in the reference coding block are used to obtain predicted pixel values of each pixel point in the coding block. The anchor client then encodes the target parameters to be coded corresponding to each coding block in the target prediction mode together with the quantization information corresponding to the coding block in the target prediction mode to obtain the encoded live video.
In an optional embodiment, the anchor client may further quantize quantization parameters corresponding to the coding block to obtain quantization information corresponding to the quantization parameters and a second parameter to be coded corresponding to the coding block.
The second parameter to be coded corresponding to the coding block is used for determining the quantization parameter corresponding to the coding block according to the quantization parameters corresponding to the reference coding block.
And then, the anchor client encodes the target to-be-encoded parameters corresponding to the encoding blocks in the target prediction mode, the quantization information corresponding to the quantization parameters and the second to-be-encoded parameters corresponding to the encoding blocks together to obtain encoded encoding blocks, so as to obtain encoded live images and further obtain encoded live videos.
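Putting the pieces together, encoding one block in its target mode could be sketched as follows; the payload layout and the `entropy_encode` callback are assumptions about interfaces the text leaves unspecified:

```python
def encode_block(target_params, quant_info, second_param, entropy_encode):
    """Encode one coding block in its target prediction mode.

    target_params : parameters locating the reference coding block
    quant_info    : quantized coefficients of the block's residual
    second_param  : QP-prediction delta for the block
    entropy_encode: caller-supplied entropy coder (assumed interface)
    """
    payload = list(target_params) + list(quant_info) + [second_param]
    return entropy_encode(payload)
```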
In the process of determining the target prediction mode corresponding to each coding block, for a target live image whose frame type is a non-reference bidirectional difference frame, the method simplifies the process of obtaining the distortion information corresponding to each coding block: the distortion information is obtained directly from the quantization information corresponding to the coding block and the inverse quantization information corresponding to the coding block. This shortens the time needed to determine the target prediction mode for each coding block in the target live image, which shortens the encoding time of the live video and improves its encoding efficiency. In addition, because a target live image whose frame type is a non-reference bidirectional difference frame is not used as a reference frame for encoding blocks in other live images, simplifying how its distortion information is obtained neither affects the encoding quality of the live video nor reduces its resolution, so that both the definition and the fluency of the live video during playback are guaranteed and the user's live-streaming experience is improved.
Fig. 8 is a schematic structural diagram of a live video encoding device according to a second embodiment of the present application. The apparatus may be implemented as all or part of a computer device by software, hardware, or a combination of both. The device 8 comprises:
A first obtaining unit 81, configured to obtain a live video and quantization parameters corresponding to live images of each frame in the live video;
the first quantization unit 82 is configured to quantize the live image according to the quantization parameters in each prediction mode, so as to obtain quantization information corresponding to each coding block in the live image in each prediction mode; the encoding block is obtained by dividing the live image;
a second obtaining unit 83, configured to obtain distortion information corresponding to each coding block in each prediction mode according to quantization information corresponding to each coding block in each prediction mode; the distortion information corresponding to each coding block in the target live image is obtained according to quantization information corresponding to the coding block and inverse quantization information corresponding to the coding block; the frame type of the target live image is a non-reference bidirectional difference frame;
a third obtaining unit 84, configured to obtain rate distortion optimization information corresponding to each of the encoding blocks in each of the prediction modes; the rate-distortion optimization information comprises the distortion information and prediction bit information, wherein the prediction bit information is bit information required by the predicted coding of the coding block;
A first determining unit 85, configured to obtain a target prediction mode corresponding to each coding block according to rate distortion optimization information corresponding to each coding block in each prediction mode; the rate distortion optimization information corresponding to the coding block in the target prediction mode is minimum;
the first encoding unit 86 is configured to encode each of the encoding blocks according to a target prediction mode corresponding to the encoding block and quantization information corresponding to the encoding block in the target prediction mode, so as to obtain the encoded live video.
It should be noted that, when the live video encoding apparatus provided in the foregoing embodiment performs the live video encoding method, the division into the functional modules described above is only an example; in practical applications, the functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the live video encoding apparatus and the live video encoding method provided in the foregoing embodiments belong to the same concept; the detailed implementation process is embodied in the method embodiments and is not repeated here.
Fig. 9 is a schematic structural diagram of a computer device according to a third embodiment of the present application. As shown in fig. 9, the computer device 9 may include: a processor 90, a memory 91 and a computer program 92 stored in the memory 91 and executable on the processor 90, for example: live video encoding program; the processor 90, when executing the computer program 92, implements the steps of the first embodiment described above.
Wherein the processor 90 may include one or more processing cores. The processor 90 connects various parts of the computer device 9 using various interfaces and wiring, and performs the various functions of the computer device 9 and processes data by running or executing the instructions, programs, code sets or instruction sets stored in the memory 91 and by invoking data in the memory 91. Optionally, the processor 90 may be implemented in at least one hardware form among digital signal processing (Digital Signal Processing, DSP), field-programmable gate array (Field-Programmable Gate Array, FPGA) and programmable logic array (Programmable Logic Array, PLA). The processor 90 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem and the like. The CPU mainly handles the operating system, user interface, application programs and the like; the GPU renders and draws the content to be displayed on the touch display screen; the modem handles wireless communications. It will be appreciated that the modem may also not be integrated into the processor 90 and may instead be implemented by a separate chip.
The memory 91 may include a random access memory (Random Access Memory, RAM) or a read-only memory (Read-Only Memory, ROM). Optionally, the memory 91 includes a non-transitory computer-readable storage medium. The memory 91 may be used to store instructions, programs, code sets or instruction sets. The memory 91 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as touch functions), instructions for implementing the various method embodiments described above, and so on; the data storage area may store the data involved in the above method embodiments. Optionally, the memory 91 may also be at least one storage device located remotely from the aforementioned processor 90.
The embodiment of the present application further provides a computer storage medium storing a plurality of instructions adapted to be loaded and executed by a processor; for the specific implementation process, refer to the description of the foregoing embodiments, which is not repeated here.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts that are not described or illustrated in detail in a particular embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/terminal device embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the methods of the above embodiments may also be implemented by a computer program instructing the relevant hardware; the computer program may be stored in a computer-readable storage medium, and when executed by a processor, it implements the steps of each method embodiment described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, and so on.
The present invention is not limited to the above-described embodiments; any modifications or variations that do not depart from the spirit and scope of the present invention are intended to fall within the scope of the claims and their equivalents.
Claims (10)
1. A method of live video encoding, the method comprising the steps of:
acquiring a live video and quantization parameters corresponding to each frame of live image in the live video;
quantizing the live image according to the corresponding quantization parameters in each prediction mode to obtain quantization information corresponding to each coding block in the live image in each prediction mode; the encoding block is obtained by dividing the live image;
obtaining distortion information corresponding to each coding block in each prediction mode according to quantization information corresponding to each coding block in each prediction mode; the distortion information corresponding to each coding block in the target live image is obtained according to quantization information corresponding to the coding block and inverse quantization information corresponding to the coding block; the frame type of the target live image is a non-reference bidirectional difference frame;
Obtaining rate distortion optimization information corresponding to each coding block in each prediction mode; the rate-distortion optimization information comprises the distortion information and prediction bit information, wherein the prediction bit information is bit information required by the predicted coding of the coding block;
obtaining target prediction modes corresponding to the coding blocks according to rate distortion optimization information corresponding to the coding blocks in the prediction modes; the rate distortion optimization information corresponding to the coding block in the target prediction mode is minimum;
and encoding each encoding block according to a target prediction mode corresponding to each encoding block and quantization information corresponding to the encoding block in the target prediction mode to obtain the encoded live video.
2. The method for encoding live video according to claim 1, wherein the step of obtaining the live video and quantization parameters corresponding to each frame of live image in the live video comprises:
acquiring first bit rate information, complexity information of the live image of each frame and importance information of the live image of each frame;
obtaining quantization parameters corresponding to the live broadcast images of each frame according to the first bit rate information, the complexity information of the live broadcast images of each frame and the importance information of the live broadcast images of each frame; the larger the quantization parameter corresponding to the live image is, the smaller the bit rate information allocated to the live image is, and the average value of the bit rate information allocated to each frame of the live image does not exceed the first bit rate information.
3. The method for encoding live video according to claim 1, wherein the step of quantizing the live image according to the corresponding quantization parameters in each prediction mode to obtain quantization information corresponding to each coding block in the live image in each prediction mode includes the steps of:
acquiring a reference coding block corresponding to each coding block in each prediction mode;
obtaining predicted pixel values of all pixel points in the coding block in all the prediction modes according to the pixel values of all the reference pixel points in the reference coding block;
obtaining residual information corresponding to the coding block in each prediction mode according to original pixel values of the pixel points in the coding block and predicted pixel values of the pixel points in the coding block in each prediction mode; the residual information corresponding to the coding block comprises residual values corresponding to the pixel points in the coding block;
and carrying out transformation operation and quantization operation on residual information corresponding to the coding blocks in each prediction mode in sequence to obtain quantization information corresponding to each coding block in each prediction mode.
4. A method of encoding live video as claimed in any one of claims 1 to 3, wherein the obtaining distortion information corresponding to each of the encoded blocks in each of the prediction modes based on quantization information corresponding to each of the encoded blocks in each of the prediction modes includes the steps of:
performing inverse quantization operation on quantization information corresponding to each coding block in each prediction mode to obtain inverse quantization information corresponding to each coding block in each prediction mode;
and judging whether the coding block is obtained by dividing the target live image, if so, obtaining distortion information corresponding to each coding block in each prediction mode according to the difference value of quantization information corresponding to the coding block in each prediction mode and inverse quantization information corresponding to the coding block in each prediction mode.
5. The method of encoding live video as in claim 4, wherein if the coding block is not obtained by dividing the target live image, the method further comprises the steps of:
performing inverse transformation operation on inverse quantization information corresponding to the coding blocks in each prediction mode to obtain inverse transformation information corresponding to the coding blocks in each prediction mode; the inverse transformation information corresponding to the coding block comprises inverse transformation values corresponding to all pixel points in the coding block;
Obtaining a reconstructed pixel value of each pixel point in the coding block in each prediction mode according to the inverse transformation value corresponding to each pixel point in the coding block in each prediction mode and the predicted pixel value of each pixel point in the coding block in each prediction mode;
and obtaining distortion information corresponding to each coding block in each prediction mode according to the difference value between the reconstructed pixel value of each pixel point in the coding block in each prediction mode and the original pixel value of each pixel point in the coding block in each prediction mode.
6. A method of live video encoding according to any one of claims 1 to 3, wherein the step of obtaining rate-distortion optimization information corresponding to each of the coding blocks in each of the prediction modes comprises:
acquiring a first parameter to be coded corresponding to each coding block in each prediction mode; the first parameter to be coded is used for confirming a reference coding block corresponding to the coding block; the pixel value of each reference pixel point in the reference coding block is used for obtaining the predicted pixel value of each pixel point in the coding block;
And obtaining bit information required by coding the coding blocks in each prediction mode according to the first parameters to be coded corresponding to the coding blocks in each prediction mode and quantization information corresponding to the coding blocks in each prediction mode.
7. A method for encoding live video according to any one of claims 1 to 3, wherein the step of encoding each encoding block according to a target prediction mode corresponding to each encoding block and quantization information corresponding to the encoding block in the target prediction mode to obtain the encoded live video includes the steps of:
acquiring target parameters to be coded corresponding to each coding block in the target prediction mode; the target parameter to be coded is used for confirming a reference coding block corresponding to the coding block in the target prediction mode; the pixel value of each reference pixel point in the reference coding block is used for obtaining the predicted pixel value of each pixel point in the coding block;
and encoding target parameters to be encoded corresponding to the encoding blocks in the target prediction mode and quantization information corresponding to the encoding blocks in the target prediction mode to obtain the encoded live video.
8. A live video encoding apparatus, comprising:
the first acquisition unit is used for acquiring live video and quantization parameters corresponding to each frame of live image in the live video;
the first quantization unit is used for quantizing the live image according to the corresponding quantization parameters in each prediction mode to obtain quantization information corresponding to each coding block in the live image in each prediction mode; the encoding block is obtained by dividing the live image;
the second obtaining unit is used for obtaining distortion information corresponding to each coding block in each prediction mode according to quantization information corresponding to each coding block in each prediction mode; the distortion information corresponding to each coding block in the target live image is obtained according to quantization information corresponding to the coding block and inverse quantization information corresponding to the coding block; the frame type of the target live image is a non-reference bidirectional difference frame;
a third obtaining unit, configured to obtain rate distortion optimization information corresponding to each coding block in each prediction mode; the rate-distortion optimization information comprises the distortion information and prediction bit information, wherein the prediction bit information is bit information required by the predicted coding of the coding block;
The first determining unit is used for obtaining a target prediction mode corresponding to each coding block according to rate distortion optimization information corresponding to each coding block in each prediction mode; the rate distortion optimization information corresponding to the coding block in the target prediction mode is minimum;
and the first coding unit is used for coding each coding block according to the target prediction mode corresponding to the coding block and the quantization information corresponding to the coding block in the target prediction mode to obtain the coded live video.
9. A computer device, comprising: a processor, a memory and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210288184.0A | 2022-03-23 | 2022-03-23 | Live video encoding method, device, computer equipment and readable storage medium |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN114640849A | 2022-06-17 |
| CN114640849B | 2024-03-12 |
Family

ID=81949411

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210288184.0A | Live video encoding method, device, computer equipment and readable storage medium | 2022-03-23 | 2022-03-23 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN114640849B (en) |
Also Published As

| Publication Number | Publication Date |
|---|---|
| CN114640849A | 2022-06-17 |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |