Disclosure of Invention
The application provides a video data processing method, apparatus, device, and storage medium, which are used for solving the problem that forcibly reducing the encoding code rate according to network conditions causes an obvious loss in the encoded image, so that the effect and quality of video data encoding cannot be ensured.
In a first aspect, the present application provides a method for processing video data, applied to a front-end device, where the method includes:
if the quantization parameter (QP) values of n consecutive frames of data encoded before the current moment are all greater than a preset QP threshold, adjusting image signal processing (ISP) image parameters;
Processing the original video data acquired at the current moment based on the adjusted ISP image parameters to obtain YUV data corresponding to the original video;
in the process of encoding the YUV data based on a preset target code rate, if the QP values of n consecutive frames of encoded data are all greater than the QP threshold and the motion state of the front-end device indicates that the front-end device is stationary, adjusting the encoding reference relationship to a first encoding reference relationship until encoding is completed, so as to obtain code stream data, where the first encoding reference relationship includes: the first frame is encoded as an intra-coded I frame; an I frame is encoded after each first preset interval; after each second preset interval, a forward-predicted Pm frame is encoded with reference to the last previously encoded I frame; between the first frame and the P1 frame, and between the Pm-1 frame and the Pm frame, P frames are encoded with reference to the frame immediately preceding the current frame and the last previously encoded I frame; m denotes the m-th second preset interval and is an integer greater than 2, Pm-1 denotes the (m-1)-th P frame, and Pm denotes the m-th P frame.
With reference to the first aspect, in some embodiments, the method further includes:
And when the motion state indicates that the front-end device is moving, adjusting the encoding reference relationship to a second encoding reference relationship until encoding is completed, so as to obtain the code stream data, where the second encoding reference relationship includes: the first frame is encoded as an I frame; an I frame is encoded after each third preset interval; after each fourth preset interval, a Pn frame is encoded with reference to the last previously encoded I frame; between the first frame and the P1 frame, a plurality of bi-directionally predicted B frames are encoded with reference to the first frame and the P1 frame; between the Pn-1 frame and the Pn frame, a plurality of B frames are encoded with reference to the Pn-1 frame and the Pn frame; n denotes the n-th fourth preset interval and is an integer greater than 2, Pn-1 denotes the (n-1)-th P frame, and Pn denotes the n-th P frame.
With reference to the first aspect, in some embodiments, the method further includes:
And in the process of encoding with the first encoding reference relationship, if the QP values of n consecutive frames of encoded data are all greater than the QP threshold and the current network bandwidth is detected to be less than a preset first bandwidth threshold, adjusting the first encoding reference relationship to a third encoding reference relationship for encoding, where the third encoding reference relationship includes: only the first frame is encoded as an I frame; after each fifth preset interval, a Pa frame is encoded with reference to the first frame; between the first frame and the P1 frame, and between the Pa-1 frame and the Pa frame, P frames are encoded with reference to the frame immediately preceding the current frame and the first frame; a denotes the a-th fifth preset interval and is an integer greater than 2, Pa-1 denotes the (a-1)-th P frame, and Pa denotes the a-th P frame.
With reference to the first aspect, in some embodiments, the method further includes:
And in the process of encoding according to the third encoding reference relationship, adjusting the encoding code rate based on the motion duty ratio of the original video data, and encoding the next frame with the adjusted code rate, where the motion duty ratio indicates the proportion of macroblocks in the original video data whose pixel values have changed.
With reference to the first aspect, in some embodiments, before the adjusting of the encoding code rate based on the motion duty ratio of the original video data, the method further includes:
performing downsampling processing on the resolution of the original video data to generate a Bitmap table;
for each macroblock corresponding to the downsampled original video data, if the change in the pixel values of the macroblock is greater than a preset pixel value threshold, setting the position corresponding to the macroblock in the Bitmap table to 1;
if the change in the pixel values of the macroblock is less than the pixel value threshold, setting the position corresponding to the macroblock in the Bitmap table to 0;
and determining the proportion of 1s in the Bitmap table as the motion duty ratio.
With reference to the first aspect, in some embodiments, if QP values of the n consecutive frames of data encoded before the current time are all greater than a preset QP threshold, adjusting an image parameter of the image signal processing ISP includes:
Determining ISP target image parameters from the ISP image parameters based on the QP threshold, wherein the ISP target image parameters comprise noise reduction parameters and sharpening parameters;
Raising the level of the noise reduction parameter;
And reducing the level of the sharpening parameter.
With reference to the first aspect, in some embodiments, before the adjusting the image signal processing ISP image parameter, the method further comprises:
and when the network signal weakens and the network bandwidth cannot meet the data volume required for code stream transmission, adjusting the target code rate based on the network bandwidth at the current moment.
With reference to the first aspect, in some embodiments, the method further includes:
if the QP values of n consecutive frames of data encoded after the encoding reference relationship is adjusted are all greater than the QP threshold, reducing the encoding frame rate and resolution.
With reference to the first aspect, in some embodiments, the method further includes:
And when the motion state indicates that the front-end device is stationary and the current network bandwidth is detected to be greater than the first bandwidth threshold and less than a preset second bandwidth threshold, adjusting the encoding reference relationship to a fourth encoding reference relationship for encoding, where the fourth encoding reference relationship includes: the first frame is encoded as an I frame; an I frame is encoded after each sixth preset interval; between two I frames, P frames are encoded with the previous frame and the last previously encoded I frame as reference frames; the first bandwidth threshold is less than the second bandwidth threshold.
With reference to the first aspect, in some embodiments, the method further includes:
and in the process of encoding according to the second encoding reference relationship, if the QP values of n consecutive frames of encoded data are all greater than the QP threshold and the current network bandwidth is detected to be unable to meet the data volume required for code stream transmission, adjusting the motion state of the front-end device to stationary.
In a second aspect, the present application provides a processing apparatus for video data, comprising:
the first adjusting module is used for adjusting the image parameters of the image signal processing ISP if the QP values of the continuous n frames of data coded before the current moment are all larger than a preset QP threshold;
The processing module is used for processing the original video data acquired at the current moment based on the adjusted ISP image parameters to obtain YUV data corresponding to the original video;
And the second adjusting module is configured to, in the process of encoding the YUV data based on a preset target code rate, when the QP values of n consecutive frames of encoded data are all greater than the QP threshold and the motion state of the front-end device indicates that the front-end device is stationary, adjust the encoding reference relationship to a first encoding reference relationship until encoding is completed, so as to obtain code stream data, where the first encoding reference relationship includes: the first frame is encoded as an intra-coded I frame; an I frame is encoded after each first preset interval; after each second preset interval, a forward-predicted Pm frame is encoded with reference to the last previously encoded I frame; between the first frame and the P1 frame, and between the Pm-1 frame and the Pm frame, P frames are encoded with reference to the frame immediately preceding the current frame and the last previously encoded I frame; m denotes the m-th second preset interval and is an integer greater than 2, Pm-1 denotes the (m-1)-th P frame, and Pm denotes the m-th P frame.
In a third aspect, the application provides a front-end device comprising a processor, a memory communicatively connected with the processor, a communication interface, and an image sensor;
The memory stores computer-executable instructions;
The processor executes the computer-executable instructions stored in the memory to implement the method for processing video data according to any one of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium having stored therein computer-executable instructions which, when executed by a processor, are adapted to carry out the method of processing video data according to any one of the preceding aspects.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, implements the method of processing video data according to any of the preceding aspects.
According to the video data processing method, apparatus, device, and storage medium provided by the application, if the QP values of n consecutive frames of data encoded before the current moment are all greater than the preset QP threshold, the image signal processing ISP image parameters are adjusted, and the original video data acquired at the current moment is processed based on the adjusted ISP image parameters to obtain YUV data corresponding to the original video. In the process of encoding the YUV data based on the preset target code rate, if the QP values of n consecutive frames of encoded data are all greater than the QP threshold and the motion state of the front-end device indicates that the front-end device is stationary, the encoding reference relationship is adjusted to the first encoding reference relationship until encoding is completed, so as to obtain code stream data. By dynamically adjusting the ISP image parameters and the encoding reference relationship, the method effectively handles weakened network signals and insufficient bandwidth, improves the stability of video transmission, the encoding quality, and the user experience, and reduces bandwidth occupation and storage cost.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the application; rather, they are merely examples of apparatuses and methods consistent with aspects of the application as detailed in the appended claims.
With the advancement of society and the improvement of people's safety awareness, the demand for security monitoring keeps increasing. The 4G dome camera, by virtue of advantages such as wireless transmission, high-definition image quality, and intelligent analysis, is widely applied to security monitoring in fields such as public safety, traffic, education, and medical care. Encoding of video data is particularly important when the background area changes greatly during rotation/zooming of the dome camera and when the 4G or wireless WiFi signal is weak and unstable. In the prior art, the network bandwidth is generally estimated through a network protocol quality-of-service library, and the code rate is adjusted accordingly to ensure the encoding effect. However, forcibly reducing the encoding code rate according to network conditions causes an obvious loss in the encoded image, while forcibly reducing the frame rate without lowering the code rate increases the frame interval and reduces the smoothness of moving targets. If a balance between image quality and encoding efficiency is sought by linking image preprocessing with the encoding code rate, better image quality can be maintained at a lower code rate, but excessive preprocessing adjustment blurs the image before encoding, so that much information is lost before encoding and sharpness decreases; moreover, relying on a single adjustment means is limiting.
In view of the above problems, the present application provides a video data processing method, apparatus, device, and storage medium, which improve the stability and encoding quality of video transmission under poor network conditions. The inventor studied whether the code rate can be adjusted in real time based on the network bandwidth while the quantization parameter (Quantization Parameter, QP) value of each frame of encoded data is monitored in real time: when the QP values of multiple consecutive frames reach the QP threshold, the image signal processor (Image Signal Processor, ISP) parameters are adjusted to reduce the encoding pressure; and if the QP values of multiple consecutive frames encoded after the ISP parameters are adjusted still reach the QP threshold, the encoding reference relationship is adjusted based on the motion state of the front-end device in order to guarantee video quality, thereby reducing the encoding pressure while ensuring the encoding quality. The technical solution of the present application is provided on this basis.
Fig. 1 is a schematic structural diagram of a system to which the video data processing method provided by an embodiment of the present application is applied. As shown in Fig. 1, the system includes a front-end device, which may be a weak-network dome camera, a front-end camera, etc., where the weak network may be 4G, WiFi, a satellite link, etc. The front-end device includes an acquisition module, a processing module, an encoder, and an output module. The acquisition module is configured to acquire video data from the image sensor; the processing module is configured to perform image processing on the data acquired by the acquisition module to obtain YUV data; the encoder is configured to encode the YUV data; and the output module is configured to output the code stream data produced by the encoder, so that the code stream data is transmitted out through the 4G network. The data may be transmitted to a back-end device, such as a smartphone, a notebook computer, a desktop computer, a server, or a cloud used by a user.
In the above application system, in order to ensure the encoding effect and improve video quality, the network bandwidth can be monitored in real time and the code rate of the encoder adjusted in real time based on the network bandwidth, while the frame-level QP value of the data code stream is monitored. When the QP values of multiple consecutive frames reach the preset threshold, indicating that the current encoding pressure is large, the processing parameters of the processing module are adjusted, the encoding reference relationship is adjusted, or the encoding resolution and frame rate are reduced, until the network bandwidth is satisfied.
The following describes the technical scheme of the present application and how the technical scheme of the present application solves the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 2 is a flowchart of an embodiment of a method for processing video data according to an embodiment of the present application, as shown in fig. 2, where the method may be applied to a front-end device, and the method specifically includes:
S201, if the QP values of the continuous n frames of data coded before the current moment are all larger than a preset QP threshold, adjusting the image parameters of the image signal processing ISP.
In this step, in order to monitor the encoding pressure in the encoding process in real time, QP values of continuous n-frame data encoded before the current time may be acquired in real time.
Specifically, the QP value is a key parameter affecting video quality and file size, and thus can be used to determine the encoding pressure.
Illustratively, most encoders allow a detailed logging function to be enabled, generally by setting specific parameters in the encoding command. For example, in FFmpeg, -loglevel debug or -loglevel verbose may be used to enable more detailed log output, and these logs contain QP value information for each frame. The resulting log or output file can then be parsed, either manually or with a script that automatically extracts the relevant QP values. The above is merely one example of acquiring QP values; embodiments of the present application do not specifically limit the implementation used to acquire QP values.
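As a rough illustration of this log-based approach, the following Python sketch drives FFmpeg with debug logging and scans the output for per-frame QP information. The input file name, the choice of libx264, and the QP log-line pattern are assumptions for illustration only, since the exact log format varies across FFmpeg versions and encoders.

```python
import re
import subprocess

# Encode (to a null sink) with debug logging and capture the log text.
proc = subprocess.run(
    ["ffmpeg", "-loglevel", "debug", "-i", "input.mp4",
     "-c:v", "libx264", "-f", "null", "-"],
    capture_output=True, text=True)

# Extract per-frame QP readings; the pattern is an assumed placeholder to be
# adapted to the actual log lines produced by the encoder in use.
qp_values = [float(m.group(1))
             for m in re.finditer(r"QP[=:]\s*([\d.]+)", proc.stderr)]
print(qp_values[:10])
```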
The acquired QP values are used to determine the encoding pressure, and n is an integer greater than 1. Since the QP value of a single frame is not persuasive, n may be set to an integer of 3 or more in a practical application scenario; it can be set according to the actual application scenario and is not specifically limited in the embodiments of the present application.
After the QP values of the n consecutive frames are acquired, the QP value of each frame is compared with the preset QP threshold. If the QP values of the n consecutive frames are all greater than the preset QP threshold, the current encoding pressure is excessive; the ISP image parameters can then be adjusted to reduce image sharpness and thereby reduce the texture complexity of the data used for encoding, which relieves the encoding pressure.
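The comparison itself can be expressed as a sliding window over the most recent n QP values, as in the following minimal sketch; the QP values are assumed to have been extracted as described above, and the window length and threshold are illustrative.

```python
from collections import deque

def qp_pressure_flags(qp_stream, n=4, qp_threshold=40):
    """Yield True for each frame at which the n most recent QP values
    (including the current one) all exceed qp_threshold -- the condition
    that triggers the ISP parameter adjustment of S201."""
    window = deque(maxlen=n)
    for qp in qp_stream:
        window.append(qp)
        yield len(window) == n and all(q > qp_threshold for q in window)

# The 2nd-5th QP values (42-45) all exceed 40, so the flag first fires
# at the 5th frame and stays set while the window remains above threshold.
flags = list(qp_pressure_flags([38, 42, 43, 44, 45, 41]))
assert flags == [False, False, False, False, True, True]
```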
Specifically, based on the QP threshold, the ISP target image parameters are determined among the ISP image parameters, where the ISP target image parameters include a noise reduction parameter and a sharpening parameter; the level of the noise reduction parameter is then raised and the level of the sharpening parameter lowered.
S202, processing the original video data acquired at the current moment based on the adjusted ISP image parameters to obtain YUV data corresponding to the original video.
In this step, after the ISP image parameters are adjusted, the original video data is image-processed based on the adjusted ISP image parameters, so as to obtain YUV data, where the YUV data is used for encoding.
Specifically, the ISP is used to convert the raw video data captured by the sensor (typically in RAW format) into data more suitable for subsequent processing and display (e.g., YUV format). YUV is a color encoding method in which Y represents luminance (Luminance or Luma) and U and V represent the chrominance (Chrominance or Chroma) components; it is commonly used for video compression and transmission. The specific processing steps of the ISP include:
denoising (Denoising) reduces random noise in the image.
Defect correction (Defect Correction) repairs bad points or defects in the image sensor.
White Balance (White Balance) the image color is adjusted to reflect the true light source color in the scene.
Auto Exposure (Auto Exposure), which adjusts the brightness of an image to ensure that the image is not overexposed or underexposed.
Color Correction (Color Correction) is to adjust the Color saturation, contrast and hue of an image.
Sharpening SHARPENING enhances the edges and details of the image, making the image appear clearer.
Demosaicing Demosaicing converts RAW data (typically Bayer pattern) into full-color images.
After processing by the ISP, the data is typically in RGB format (although different color spaces may be used for processing within the ISP). The RGB data then needs to be converted to YUV format for further video processing, compression, or display. Specifically, RGB-to-YUV conversion is typically performed using standard mathematical formulas that convert the RGB components into the Y (luminance) and U, V (chrominance) components. These conversions may be linear or non-linear depending on the desired color accuracy and computational efficiency; the embodiments of the present application do not specifically limit this.
It should be noted that there are various variants of YUV formats (such as YUV420, YUV422, YUV444, etc.), which differ in sampling and storing manners of the chrominance components, and an appropriate format needs to be selected according to a specific application scenario.
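As a concrete illustration of the conversion step, the sketch below applies the full-range BT.601 (JFIF) matrix, one common instance of the "standard mathematical formulas" mentioned above, followed by a deliberately naive 4:2:0 chroma subsampling; a given ISP may use limited-range BT.601 or BT.709 matrices and filtered subsampling instead.

```python
import numpy as np

def rgb_to_yuv_bt601(rgb):
    """Convert an HxWx3 uint8 RGB image to full-range BT.601 (JFIF) YCbCr."""
    r, g, b = [rgb[..., i].astype(np.float32) for i in range(3)]
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128.0
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128.0
    return np.clip(np.stack([y, cb, cr], axis=-1), 0.0, 255.0).astype(np.uint8)

def to_yuv420(yuv):
    """Split into YUV420 planes: full-resolution Y, chroma decimated 2x2.
    Real pipelines typically low-pass filter the chroma before decimating."""
    return yuv[..., 0], yuv[::2, ::2, 1], yuv[::2, ::2, 2]
```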
S203, in the process of encoding the YUV data based on the preset target code rate, if the QP values of n consecutive frames of encoded data are all greater than the QP threshold and the motion state of the front-end device indicates that the front-end device is stationary, the encoding reference relationship is adjusted to the first encoding reference relationship until encoding is completed, so as to obtain the code stream data.
In this step, after the ISP parameters are adjusted, encoding continues; if the QP values of n consecutive frames of data are detected to still be greater than the QP threshold during encoding, the encoding reference relationship may be adjusted based on the motion state of the front-end device, and encoding is finally completed to output the code stream data. Here, the motion state of the front-end device indicates that the front-end device is stationary.
Specifically, when the motion state indicates that the front-end device is stationary, for example a weak-network dome camera, most of the scene area is unchanged while the device is stationary (not rotating), and the first encoding reference relationship may be selected in order to add more reference information. The first encoding reference relationship includes: the first frame is encoded as an I frame; an I frame is encoded after each first preset interval; after each second preset interval, a Pm frame is encoded with reference to the last previously encoded I frame; between the first frame and the P1 frame, and between the Pm-1 frame and the Pm frame, P frames are encoded with reference to the frame immediately preceding the current frame and the last previously encoded I frame; m denotes the m-th second preset interval and is an integer greater than 2, Pm-1 denotes the (m-1)-th P frame, and Pm denotes the m-th P frame.
For example, Fig. 3 is a schematic diagram of the first encoding reference relationship. As shown in Fig. 3, a P frame encoded under the first encoding reference relationship refers not only to the previous frame but also to an I or P frame, and a "big" P frame or an I frame is inserted at certain intervals to reduce the residual of subsequent P frames. The big P frames and ordinary P frames are encoded in the same way; only their reference frames differ. This encoding mode can optimize transmission efficiency while maintaining video quality. It should be noted that, when encoding according to the above reference relationship, I frames still need to be encoded at the preset interval. Taking the common frame rate of 25 fps as an example, a conventional I-frame interval is 50 (one I frame every 2 seconds), with 49 frames between two I frames; each I frame is intra-coded and references no other frame. Under the first encoding reference relationship, the I-frame interval is instead 200 and the big-P interval is 50: a big P frame references only the I frame, while the other frames are ordinary P frames referencing the previous frame and the I frame. With this reference structure, when a user clicks on an arbitrary frame, decoding only needs to start from the big P frame preceding that frame and proceed sequentially, decoding at most 49 frames before the clicked frame can be displayed.
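The pattern of Fig. 3 can be written down as a small mapping from frame index to frame type and reference list, as sketched below; the intervals follow the 25 fps example above (I-frame interval 200, big-P interval 50) and are illustrative, not values fixed by the application.

```python
def first_reference_structure(idx, i_interval=200, big_p_interval=50):
    """Frame type and reference-frame indices under the first encoding
    reference relationship (illustrative intervals)."""
    last_i = (idx // i_interval) * i_interval
    if idx % i_interval == 0:
        return "I", []                      # intra-coded, references nothing
    if idx % big_p_interval == 0:
        return "bigP", [last_i]             # big P references only the last I
    return "P", [idx - 1, last_i]           # ordinary P: previous frame + last I

# Frame 0 is an I frame; frame 50 is a big P referencing frame 0; frame 51
# is an ordinary P referencing frames 50 and 0. Random access to frame k
# starts decoding at the big P (or I) at k - k % 50, at most 49 frames back.
```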
In one possible implementation, if the QP values of the n consecutive frames of data encoded after adjusting the encoding reference relationship are still all greater than the QP threshold, the encoding frame rate and resolution may be reduced until the network bandwidth is met.
Optionally, when the network signal weakens and the network bandwidth cannot meet the data volume required for code stream transmission, the target code rate is adjusted based on the network bandwidth at the current moment.
In this step, the network bandwidth is monitored in real time. When weakening of the signal is detected and the network bandwidth of the communication link cannot meet the data volume requirement of code stream transmission, the code rate value configured for the encoder is reduced. In theory, reducing the code rate value increases the loss after encoding and degrades the image; therefore, while the code rate is reduced, the frame-level QP values of the encoded data are monitored in real time to ensure video quality.
Specifically, the network signal is monitored: the raw signal received from the network interface is analyzed and measured to obtain signal strength indicators, such as the received signal strength indicator (RSSI) and the signal-to-noise ratio (SNR). The system may preset a series of signal strength thresholds, determined by network standards, device performance, and application scenario requirements. The signal strength measured in real time is compared with the preset thresholds, and if it falls below a certain threshold, the signal is considered weakened.
For acquiring the network bandwidth, the current bandwidth can be estimated by measuring the amount of data successfully transmitted per unit time, which generally involves timing and counting the transmitted packets. A common measurement method is throughput testing (Throughput Testing), i.e., sending a large amount of data over a period of time and measuring how much is received.
The required code stream transmission data volume can be determined from the application's requirements; the network bandwidth actually needed must account for data redundancy, coding efficiency, and network protocol overhead.
The measured actual network bandwidth is then compared with the required code stream transmission data volume. If the actual bandwidth is lower than the required bandwidth, the network bandwidth of the communication link is considered unable to meet the data volume requirement of code stream transmission.
When a decrease in network bandwidth is detected, the encoding code rate is reduced to lower the amount of transmitted data, avoiding network congestion and buffering delay.
Conversely, when the network bandwidth increases, the encoding code rate may be raised appropriately to improve video quality.
An upper limit of the code rate is preset, and this upper limit must not be exceeded during dynamic adjustment.
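One possible realization of this dynamic adjustment is sketched below; the protocol-overhead factor, the 10% ramp-up step, and the function interface are assumptions for illustration rather than values prescribed by the application.

```python
def adjust_target_code_rate(measured_bw_bps, current_rate_bps, cap_bps,
                            overhead=1.2, ramp_up=1.1):
    """Track the measured bandwidth: back off when the stream (plus assumed
    protocol overhead) no longer fits, ramp up gently when headroom appears,
    and never exceed the preset upper limit."""
    usable_bps = measured_bw_bps / overhead
    if usable_bps < current_rate_bps:
        new_rate = usable_bps                      # bandwidth dropped: reduce
    else:
        new_rate = min(current_rate_bps * ramp_up, usable_bps)
    return min(new_rate, cap_bps)                  # respect the preset cap
```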
Optionally, in the process of encoding with the first encoding reference relationship, the QP values of n consecutive frames of encoded data are detected in real time; if they are still all greater than the QP threshold and the current network bandwidth is detected to be less than the preset first bandwidth threshold, the first encoding reference relationship is adjusted to the third encoding reference relationship.
Specifically, the third encoding reference relationship includes: only the first frame is encoded as an I frame; after each fifth preset interval, a Pa frame is encoded with reference to the first frame; between the first frame and the P1 frame, and between the Pa-1 frame and the Pa frame, P frames are encoded with reference to the frame immediately preceding the current frame and the first frame; a denotes the a-th fifth preset interval and is an integer greater than 2, Pa-1 denotes the (a-1)-th P frame, and Pa denotes the a-th P frame.
For example, Fig. 4 is a schematic diagram of the third encoding reference relationship. As shown in Fig. 4, the frame structure of Fig. 3 solves the problems of reducing the residual and of fast video review, but it still generates an I frame every 200 frames; in very-low-bandwidth scenarios this easily causes instantaneous bandwidth congestion and picture stuttering. Therefore, once the QP and network bandwidth thresholds are met, I frames are no longer generated periodically.
It should be noted that an I frame is generated immediately when the original frame structure cannot be decoded normally due to packet loss or similar causes, and an I frame is also generated immediately when the scene detection result over N consecutive seconds indicates that the current scene has changed, since the original I frame then no longer has reference value.
In one possible design, in the process of encoding with the third encoding reference relationship, the code rate value at the current moment may also be dynamically adjusted according to the motion duty ratio of the current scene, where the motion duty ratio indicates the proportion of macroblocks whose pixel values change in the original video data.
For example, Fig. 5 is a schematic diagram of adjusting the code rate in real time under the third encoding reference relationship. As shown in Fig. 5, when the motion duty ratio exceeds the set threshold, a high code rate value is applied immediately and takes effect from the very next frame, without waiting for the next I frame or big P frame. The resulting per-frame size structure is shown in Fig. 5: the frames inside the dashed box correspond to frames whose motion proportion exceeds the threshold and are encoded at the high code rate, so the encoding effect in that period is good, while the frames outside the dashed box correspond to frames with a low motion proportion.
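The immediate, next-frame switch of Fig. 5 amounts to a per-frame rate selection such as the following; the threshold and the two code rate values are illustrative assumptions.

```python
def code_rate_for_next_frame(motion_duty_ratio, base_bps, high_bps,
                             motion_threshold=0.3):
    """Under the third reference relationship: as soon as the motion duty
    ratio exceeds the threshold, the high code rate takes effect on the very
    next frame, without waiting for the next I frame or big P frame."""
    return high_bps if motion_duty_ratio > motion_threshold else base_bps
```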
According to the video data processing method provided by this embodiment, if the QP values of n consecutive frames of data encoded before the current moment are all greater than the preset QP threshold, the image signal processing ISP image parameters are adjusted, and the original video data acquired at the current moment is processed based on the adjusted ISP image parameters to obtain YUV data corresponding to the original video. In the process of encoding the YUV data based on the preset target code rate, if the QP values of n consecutive frames of encoded data are all greater than the QP threshold and the motion state of the front-end device indicates that the front-end device is stationary, the encoding reference relationship is adjusted to the first encoding reference relationship until encoding is completed, so as to obtain code stream data. By dynamically adjusting the ISP image parameters and the encoding reference relationship, the method effectively handles weakened network signals and insufficient bandwidth, improves the stability of video transmission, the encoding quality, and the user experience, and reduces bandwidth occupation and storage cost.
Fig. 6 is a flow chart of a second embodiment of a method for processing video data according to an embodiment of the present application, as shown in fig. 6, the method includes:
S601, if QP values of the continuous n frames of data coded before the current moment are all larger than a preset QP threshold, adjusting image parameters of the image signal processing ISP.
S602, processing the original video data acquired at the current moment based on the adjusted ISP image parameters to obtain YUV data corresponding to the original video.
The implementation of steps S601 and S602 is the same as that of steps S201 and S202 in the foregoing embodiment and is not repeated here.
S603, in the process of encoding the YUV data based on the preset target code rate, if the QP values of n consecutive frames of encoded data are all greater than the QP threshold and the motion state of the front-end device indicates that the front-end device is moving, the encoding reference relationship is adjusted to the second encoding reference relationship until encoding is completed, so as to obtain the code stream data.
In this step, when the motion state indicates that the front-end device is moving, for example a weak-network dome camera, most of the background area changes while the device is moving (rotating); a conventional reference relationship then yields large residuals, so the encoding pressure is high, the image quality is poor, and the amount of encoded code stream data during motion is large, causing network packet loss and other problems. The second encoding reference relationship is therefore used for encoding.
The second encoding reference relationship includes: the first frame is encoded as an I frame; an I frame is encoded after each third preset interval; after each fourth preset interval, a Pn frame is encoded with reference to the last previously encoded I frame; between the first frame and the P1 frame, a plurality of bi-directionally predicted B frames are encoded with reference to the first frame and the P1 frame; between the Pn-1 frame and the Pn frame, a plurality of B frames are encoded with reference to the Pn-1 frame and the Pn frame; n denotes the n-th fourth preset interval and is an integer greater than 2, Pn-1 denotes the (n-1)-th P frame, and Pn denotes the n-th P frame.
For example, Fig. 7 is a schematic diagram of the second encoding reference relationship. As shown in Fig. 7, a B frame references both the previous and the following anchor frame, so matching blocks are easier to find during motion and the residual is reduced; B frames are therefore particularly small, which lowers the encoding pressure and improves the overall video encoding effect. Meanwhile, all P frames reference only the I frame, so even if a P frame is lost, the following P frames can still be decoded; since transmission is over a 4G network where signal quality may be poor, this reference relationship also provides resistance to packet loss.
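The pattern of Fig. 7 can be sketched analogously to the first relationship (see the sketch after Fig. 3); the intervals are again illustrative, and each B frame references the anchor (I or P) frame on either side of it.

```python
def second_reference_structure(idx, i_interval=200, p_interval=50):
    """Frame type and reference-frame indices under the second encoding
    reference relationship (illustrative intervals)."""
    last_i = (idx // i_interval) * i_interval
    if idx % i_interval == 0:
        return "I", []
    if idx % p_interval == 0:
        return "P", [last_i]                 # every P references only the I frame
    left = (idx // p_interval) * p_interval  # anchor (I or P) before this frame
    return "B", [left, left + p_interval]    # B references both neighbouring anchors

# Losing the P frame at index 50 does not break the P frames at 100, 150, ...,
# since each references the I frame at 0 directly -- the anti-packet-loss
# property described above.
```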
During rotation of the dome camera, the background changes drastically and the amount of data to be encoded increases. Under low network bandwidth, slow transmission and severe packet loss can cause picture stuttering, screen corruption, and similar phenomena, giving an extremely poor user experience. In addition, the rotating slip ring of the dome camera has a limited service life (related to the number of rotations); rotating under low network bandwidth wastes slip-ring life while the preview experience remains very poor. Therefore, once the QP and network bandwidth thresholds are met, the dome camera may be restricted from rotating.
Note that the preset interval in this embodiment is determined according to the encoding frame rate and the actual application scenario.
According to the video data processing method provided by this embodiment, if the QP values of n consecutive frames of data encoded before the current moment are all greater than the preset QP threshold, the image signal processing ISP image parameters are adjusted, and the original video data acquired at the current moment is processed based on the adjusted ISP image parameters to obtain YUV data corresponding to the original video. In the process of encoding the YUV data based on the preset target code rate, if the QP values of n consecutive frames of encoded data are all greater than the QP threshold and the motion state of the front-end device indicates that the front-end device is moving, the encoding reference relationship is adjusted to the second encoding reference relationship until encoding is completed, so as to obtain code stream data. The method optimizes the video encoding process so that it adapts to different motion scenarios, increases encoding flexibility, and achieves the effects of improving encoding efficiency, reducing the data volume, and maintaining video quality.
Fig. 8 is a flow chart of a third embodiment of a video data processing method according to the embodiment of the present application, as shown in fig. 8, on the basis of the above embodiment, step S201 specifically includes:
S801, determining the ISP target image parameters among the ISP image parameters based on the QP threshold.
In this step, in order to manage the encoding pressure, the influence of each ISP image parameter on the average code rate of the output code stream may be analyzed under a fixed QP, thereby determining the primary ISP target image parameters. The ISP target image parameters thus determined include the noise reduction parameter and the sharpening parameter.
Illustratively, Fig. 9 is a schematic diagram of the influence of the noise reduction parameter on the code rate, and Fig. 10 is a schematic diagram of the influence of the sharpening parameter on the code rate. As shown in Fig. 9, evaluating the noise reduction parameter under a fixed QP shows that the higher the noise reduction level, the lower the code rate; as shown in Fig. 10, evaluating the sharpening parameter under a fixed QP shows that the higher the sharpening level, the higher the code rate.
S802, increasing the level of the noise reduction parameter.
S803, reducing the level of the sharpening parameter.
The level of the noise reduction parameter may be raised and/or the level of the sharpening parameter lowered in order to reduce the texture complexity of the YUV data and relieve the encoding pressure.
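A minimal sketch of the adjustment itself, assuming a dict-based parameter interface with levels on a 0-100 scale; real ISPs expose vendor-specific controls, so both the interface and the step size are assumptions.

```python
def relieve_encoding_pressure(isp_params, step=10):
    """Raise the noise-reduction level and lower the sharpening level by one
    step, reducing the texture complexity of the YUV fed to the encoder."""
    isp_params["denoise_level"] = min(100, isp_params["denoise_level"] + step)
    isp_params["sharpen_level"] = max(0, isp_params["sharpen_level"] - step)
    return isp_params

print(relieve_encoding_pressure({"denoise_level": 50, "sharpen_level": 60}))
# -> {'denoise_level': 60, 'sharpen_level': 50}
```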
Illustratively, the texture complexity of YUV data may be measured by the standard deviation. Taking a macroblock as the unit, the texture complexity of one macroblock corresponds to the standard deviation of the luminance component over its 16×16 pixel region: the larger the value, the more complex the texture; the smaller the value, the simpler the texture.
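This per-macroblock standard deviation can be computed directly on the Y plane, as in the following sketch (16×16 blocks, as described above):

```python
import numpy as np

def macroblock_texture_complexity(y_plane):
    """Standard deviation of the luminance component over each 16x16
    macroblock; larger values indicate more complex texture."""
    h, w = y_plane.shape
    trimmed = y_plane[:h - h % 16, :w - w % 16].astype(np.float32)
    blocks = trimmed.reshape(h // 16, 16, w // 16, 16)
    return blocks.std(axis=(1, 3))  # one value per macroblock, shape (h//16, w//16)
```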
Fig. 11 is a schematic diagram comparing texture complexity. As shown in Fig. 11, for the white wall (block 1), the values of the pixels within a macroblock are very close, and each pixel's deviation from the mean is very small, so the calculated standard deviation is also very small: the white wall texture is simple.
For the foliage/grass (block 2), the pixel values within a macroblock differ greatly, some bright and some dark, some white and some green, and each pixel's deviation from the mean is very large, so the calculated standard deviation is also very large: the foliage/grass texture is complex.
Fig. 12 is a schematic diagram comparing noise reduction level 0 with noise reduction level 100, and Fig. 13 is a schematic diagram comparing sharpening level 0 with sharpening level 100. As Figs. 12 and 13 show, adjusting the noise reduction or sharpening strength blurs the image; correspondingly, after processing by the noise reduction and sharpening algorithms, the differences between pixels decrease, and the corresponding texture complexity decreases as well.
Alternatively, when there is a large amount of random noise (e.g., brightness noise, dark noise, etc.) in the image, raising the noise reduction level can effectively reduce the noise, making the image cleaner. In an application scene needing to output high-quality video, such as a high-definition movie, professional monitoring and the like, the requirement on the purity of the image is high, and at the moment, the whole quality of the video can be improved by properly increasing the noise reduction level. When the performance of the used encoder is limited, the load of the encoder can be reduced and the encoding efficiency can be improved by increasing the noise reduction level and reducing the image complexity.
If the image is over-sharpened, jagged edges, false textures, and similar problems appear at edges, and the sharpening level needs to be reduced. In application scenarios with low requirements on detail preservation, such as live streaming and video conferencing, appropriately reducing the sharpening level reduces the high-frequency information in the image and thus the encoding pressure. In some cases, an excessively high sharpening parameter may introduce additional coding noise; reducing the sharpening level mitigates this and improves the quality of the encoded image.
When the image quality itself is not high and there is significant noise, the noise reduction level needs to be raised to suppress the noise while the sharpening level is appropriately lowered to avoid introducing extra noise or aliasing. When encoder performance is limited but a certain image quality must still be ensured, the noise reduction and sharpening parameters need to be adjusted together to find the optimal balance point. In some specific application scenarios (such as the night mode of security monitoring), the noise reduction and sharpening parameters may also need to be adjusted simultaneously to keep the sharpness and noise level of the image within acceptable ranges.
The embodiments of the present application do not specifically limit the implementation of the parameter adjustments used to reduce texture complexity, nor the specific manner of raising the noise reduction level and lowering the sharpening level.
According to the video data processing method provided by this embodiment, the ISP target image parameters are determined among the ISP image parameters based on the QP threshold, the level of the noise reduction parameter is raised, and the level of the sharpening parameter is lowered. This method of adjusting the ISP image parameters based on the QP threshold improves image quality, can adapt to different network environments, improves user experience, and saves bandwidth resources.
Fig. 14 is a flow chart of a fourth embodiment of a method for processing video data according to an embodiment of the present application, as shown in fig. 14, where on the basis of the foregoing embodiment, the method further includes:
S1401, performing downsampling processing on the resolution of the original video data to generate a Bitmap table.
S1402, for each macroblock corresponding to the downsampled original video data, if the change value of the pixel value of the macroblock is greater than a preset pixel value threshold, setting the corresponding position of the macroblock in the Bitmap table to be 1.
S1403, if the change value of the pixel value of the macro block is smaller than the pixel value threshold value, the corresponding position of the macro block in the Bitmap table is set to be 0.
And S1404, determining the duty ratio of 1 in the Bitmap table as the motion duty ratio.
Illustratively, the resolution acquired by the sensor is downsampled to 1/16 in both width and height, and a Bitmap table of (original width/16) × (original height/16) is generated, where each element in the Bitmap table occupies only 1 bit. A frame difference is computed on the downsampled image to judge the pixel-value change of each 16×16 macroblock; if the change reaches the set threshold, the macroblock is considered to be moving and the corresponding position in the Bitmap table is set to 1, otherwise it is set to 0. This finally produces a Bitmap table representing the motion/stillness of all 16×16 macroblocks, and the motion duty ratio is obtained as the proportion of elements whose value is 1 among all elements.
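Putting steps S1401-S1404 together on consecutive luminance frames gives the following sketch; the nearest-neighbour downsampling and the threshold value are illustrative choices, not parameters fixed by the application.

```python
import numpy as np

def motion_duty_ratio(prev_y, cur_y, pixel_threshold=15):
    """Downsample to one sample per 16x16 macroblock, frame-difference the
    result, build the 1-bit motion Bitmap, and return the proportion of 1s."""
    prev_small = prev_y[::16, ::16].astype(np.int16)
    cur_small = cur_y[::16, ::16].astype(np.int16)
    bitmap = (np.abs(cur_small - prev_small) > pixel_threshold).astype(np.uint8)
    return float(bitmap.mean())  # fraction of macroblocks flagged as moving
```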
According to the video data processing method provided by this embodiment, the resolution of the original video data is downsampled and a Bitmap table is generated; for each macroblock of the downsampled original video data, if the change in the macroblock's pixel values is greater than the preset pixel value threshold, the position corresponding to the macroblock in the Bitmap table is set to 1, and if the change is less than the pixel value threshold, it is set to 0; the proportion of 1s in the Bitmap table is determined as the motion duty ratio. The encoding code rate of the next frame is adjusted in real time according to the motion duty ratio, improving the encoding effect and ensuring video quality.
Optionally, on the basis of the foregoing embodiment, when the motion state indicates that the front-end device is stationary and detects that the current network bandwidth is greater than the first bandwidth threshold and less than the preset second bandwidth threshold, the coding reference relationship may be adjusted to be the fourth coding reference relationship.
Specifically, the fourth encoding reference relationship includes: the first frame is encoded as an I frame; an I frame is encoded after each sixth preset interval; between two I frames, P frames are encoded with the previous frame and the last previously encoded I frame as reference frames; the first bandwidth threshold is less than the second bandwidth threshold.
For example, Fig. 15 is a schematic diagram of the fourth encoding reference relationship. As shown in Fig. 15, under the original encoding reference relationship, when an originally still region is occluded by a moving object, a matching block cannot be found by referring only to part of the previous frame, so the encoding residual of the current frame becomes larger; if an earlier I frame can be referenced in addition to the previous frame, matching blocks are found more easily, reducing the residual.
Optionally, Fig. 16 is a schematic diagram of adjusting the fourth encoding reference relationship in real time. As shown in Fig. 16, the reference relationship of Fig. 15 reduces the encoding residual, but the I-frame interval is still 50, so a large I frame is still generated every 50 frames, producing periodic code rate fluctuation, which is unfriendly to weak networks or traffic-saving products. To reduce the code rate fluctuation, the I-frame interval can be enlarged, changing the original I-frame interval of 50 to 200. However, this reference mode makes the display time too long when reviewing video: if the user clicks frame N+199 during review, 199 frames must be decoded starting from frame N before the clicked image can be previewed, so the waiting time is too long and the user experience is poor. The frame reference relationship can then be adjusted to the reference relationship shown in Fig. 3, which solves the problem of fast decoding for review.
Fig. 17 is a schematic structural diagram of an embodiment of a video data processing apparatus according to an embodiment of the present application, and as shown in fig. 17, a video data processing apparatus 1700 includes:
The first adjusting module 1701 is configured to adjust image parameters of the image signal processing ISP if QP values of n consecutive frames of data encoded before the current time are all greater than a preset QP threshold.
The processing module 1702 is configured to process, based on the adjusted ISP image parameter, the original video data acquired at the current moment, to obtain YUV data corresponding to the original video.
And a second adjustment module 1703, configured to, in the process of encoding the YUV data based on a preset target code rate, when the QP values of n consecutive frames of encoded data are all greater than the QP threshold and the motion state of the front-end device indicates that the front-end device is stationary, adjust the encoding reference relationship to a first encoding reference relationship until encoding is completed, so as to obtain code stream data, where the first encoding reference relationship includes: the first frame is encoded as an intra-coded I frame; an I frame is encoded after each first preset interval; after each second preset interval, a forward-predicted Pm frame is encoded with reference to the last previously encoded I frame; between the first frame and the P1 frame, and between the Pm-1 frame and the Pm frame, P frames are encoded with reference to the frame immediately preceding the current frame and the last previously encoded I frame; m denotes the m-th second preset interval and is an integer greater than 2, Pm-1 denotes the (m-1)-th P frame, and Pm denotes the m-th P frame.
Optionally, the second adjustment module 1703 may also be configured to:
When the motion state indicates that the front-end device is moving, adjust the encoding reference relationship to a second encoding reference relationship until encoding is completed, so as to obtain the code stream data, where the second encoding reference relationship includes: the first frame is encoded as an I frame; an I frame is encoded after each third preset interval; after each fourth preset interval, a Pn frame is encoded with reference to the last previously encoded I frame; between the first frame and the P1 frame, a plurality of B frames are encoded with reference to the first frame and the P1 frame; between the Pn-1 frame and the Pn frame, a plurality of B frames are encoded with reference to the Pn-1 frame and the Pn frame; n denotes the n-th fourth preset interval and is an integer greater than 2, Pn-1 denotes the (n-1)-th P frame, and Pn denotes the n-th P frame.
Optionally, the second adjustment module 1703 may also be configured to:
In the process of encoding with the first encoding reference relationship, if the QP values of n consecutive frames of encoded data are all greater than the QP threshold and the current network bandwidth is detected to be less than the preset first bandwidth threshold, adjust the first encoding reference relationship to a third encoding reference relationship for encoding, where the third encoding reference relationship includes: only the first frame is encoded as an I frame; after each fifth preset interval, a Pa frame is encoded with reference to the first frame; between the first frame and the P1 frame, and between the Pa-1 frame and the Pa frame, P frames are encoded with reference to the frame immediately preceding the current frame and the first frame; a denotes the a-th fifth preset interval and is an integer greater than 2, Pa-1 denotes the (a-1)-th P frame, and Pa denotes the a-th P frame.
Optionally, the second adjustment module 1703 may also be configured to:
In the process of encoding according to the third encoding reference relationship, adjust the encoding code rate based on the motion duty ratio of the original video data, and encode the next frame with the adjusted code rate, where the motion duty ratio indicates the proportion of macroblocks in the original video data whose pixel values have changed.
Optionally, the processing module 1702 may be further configured to:
Downsampling the resolution of the original video data to generate a Bitmap table;
For each macro block corresponding to the original video data after downsampling, if the change value of the pixel value of the macro block is larger than a preset pixel value threshold value, setting the corresponding position of the macro block in a Bitmap table as 1;
if the change value of the pixel value of the macro block is smaller than the pixel value threshold value, the corresponding position of the macro block in the Bitmap table is set to be 0;
The proportion of 1s in the Bitmap table is determined as the motion duty ratio.
Optionally, the first adjusting module 1701 is specifically configured to:
determining ISP target image parameters in the ISP image parameters based on the QP threshold, wherein the ISP target image parameters comprise noise reduction parameters and sharpening parameters;
Raising the level of the noise reduction parameter;
The level of sharpening parameters is reduced.
Optionally, the first adjustment module 1701 is further configured to:
And when the network signal weakens and the network bandwidth cannot meet the data volume required for code stream transmission, adjust the target code rate based on the network bandwidth at the current moment.
If the QP values of the continuous n frames of data coded after the coding reference relation is adjusted are all larger than the QP threshold, the coding frame rate and resolution are reduced.
Optionally, the second adjustment module 1703 may also be configured to:
And when the motion state indicates that the front-end device is stationary and the current network bandwidth is detected to be greater than a first bandwidth threshold and less than a preset second bandwidth threshold, adjust the encoding reference relationship to a fourth encoding reference relationship for encoding, where the fourth encoding reference relationship includes: the first frame is encoded as an I frame; an I frame is encoded after each sixth preset interval; between two I frames, P frames are encoded with the previous frame and the last previously encoded I frame as reference frames; the first bandwidth threshold is less than the second bandwidth threshold.
Optionally, the second adjustment module 1703 may also be configured to:
and in the process of encoding according to the second encoding reference relationship, if the QP values of n consecutive frames of encoded data are all greater than the QP threshold and the current network bandwidth is detected to be unable to meet the data volume required for code stream transmission, adjust the motion state of the front-end device to stationary.
Fig. 18 is a schematic structural diagram of a front-end device according to an embodiment of the present application, as shown in fig. 18, the front-end device 1800 includes a processor 1802, a memory 1801 communicatively connected to the processor 1802, a communication interface 1803, and an image sensor 1804;
memory 1801 stores computer-executable instructions;
The processor 1802 executes computer-executable instructions stored in the memory 1801 to implement a method of processing video data in any of the method embodiments.
The communication interface 1803 is used for communication connection with an external device.
The image sensor 1804 is used to acquire video data in real time.
The embodiment of the application also provides a computer readable storage medium, wherein computer executable instructions are stored in the computer readable storage medium, and the computer executable instructions are used for executing the video data processing method provided by the various embodiments when being executed by a processor.
The computer readable storage medium described above may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as static random access memory, electrically erasable programmable read-only memory, magnetic memory, flash memory, magnetic disk or optical disk. A readable storage medium can be any available medium that can be accessed by a general purpose or special purpose computer.
In the alternative, a readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. In the alternative, the readable storage medium may be integral to the processor. The processor and the readable storage medium may reside in an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or the processor and the readable storage medium may reside as discrete components in a device.
The embodiment of the application also provides a computer program product, which comprises a computer program, the computer program is stored in a computer readable storage medium, at least one processor can read the computer program from the computer readable storage medium, and the technical scheme provided by any one of the method embodiments can be realized when the at least one processor executes the computer program.
In the present application, "at least one" means one or more, and "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates three possible relationships; for example, "A and/or B" may mean: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects; in a formula, the character "/" indicates a "division" relationship. "At least one of the following" or similar expressions refer to any combination of the listed items, including any combination of single items or plural items. For example, "at least one of a, b, or c" may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple.
It will be appreciated that the various numerical numbers referred to in the embodiments of the present application are merely for ease of description and are not intended to limit the scope of the embodiments of the present application. In the embodiment of the present application, the sequence number of each process does not mean the sequence of the execution sequence, and the execution sequence of each process should be determined by the function and the internal logic, and should not limit the implementation process of the embodiment of the present application in any way.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.