CN113850879A

CN113850879A - Method for improving compression ratio of static background video based on background modeling technology

Info

Publication number: CN113850879A
Application number: CN202110609798.XA
Authority: CN
Inventors: 曹靖城; 吕超; 沈文琦; 史国杰
Original assignee: Tianyi Smart Family Technology Co Ltd
Current assignee: Tianyi Digital Life Technology Co Ltd
Priority date: 2021-06-01
Filing date: 2021-06-01
Publication date: 2021-12-28
Anticipated expiration: 2041-06-01
Also published as: CN113850879B

Abstract

The present invention provides a method for improving the compression rate of static background video based on background modeling technology. Background modeling is performed on the video picture, and continuous differentiated encoding is performed according to the probability that each pixel is determined to be background, so that regions with a high probability of being background tend to receive fewer bits in rate control, thereby increasing the compression rate. Because the probability function is continuous, the coding strategy can be adjusted continuously as well, which eliminates visible differences in picture quality and improves user perception while preserving compression performance.

Description

A method for improving the compression ratio of static background video based on background modeling technology

Technical Field

The present invention relates to video coding, and in particular to a method for improving the compression rate of static background video based on background modeling technology.

Background

Roughly 70% of the information humans acquire comes through vision, and video information has advantages such as being intuitive and credible. However, as video technology is applied ever more widely, for example through the widespread use of cameras in homes, the volume of transmitted data keeps growing. Simply expanding storage capacity and raising the transmission rate of communication trunks is expensive and hard to achieve, so video coding and compression is the effective solution.

At present, mainstream video codecs fall into three families: VPx (VP8, VP9), H.26x (H.264, H.265, H.266), and AVS (AVS1.0, AVS2.0). They only define standards and specifications and do not differentiate between application scenarios. In practice, therefore, further gains in compression efficiency require improvement and deep optimization tailored to the characteristics of the scene. For scenarios such as home cameras and road cameras, where the background is relatively static and changes little, the compression rate achievable with mainstream video coding technology alone has essentially reached a bottleneck; only a more adaptive optimization scheme targeted at the characteristics of the video can improve compression efficiency.

The patent "ROI-based video coding method and system, and video transmission and coding system" (CN202010249206.3A) discloses an ROI-based video coding method comprising: acquiring a video frame of a video to be encoded, the video frame comprising a plurality of coding blocks; dividing the video frame into an ROI region and a non-ROI region; generating a mask of the video frame that distinguishes the ROI region from the non-ROI region; obtaining a difference value of the quantization parameter of at least one channel of the color space of the video frame; for each coding block of the video frame, selecting a prediction mode for the at least one channel according to the mask; for each coding block of the video frame, adjusting the quantization parameter of the at least one channel according to the difference value of the quantization parameter of the at least one channel and according to whether the coding block includes the ROI region and/or the non-ROI region; and encoding the video frame according to the prediction mode and the quantization parameter of the at least one channel.

The patent "ROI-based video coding method and video coding system" (CN202010366816.1A) discloses an ROI-based video coding method comprising: S101: acquiring a video frame of a video to be encoded; S102: extracting the ROI region of the video frame with a neural network model; S103: encoding the ROI region of the video frame with a first encoding mode and the non-ROI region of the video frame with a second encoding mode, wherein the encoded image quality level of the first encoding mode is higher than that of the second encoding mode.

However, both schemes divide the picture into only two types of region, the ROI (region of interest) and the non-ROI region. Differentiated coding of the two regions based on a piecewise function produces a clearly visible quality difference between them within the same picture, which degrades the viewing experience.

Therefore, how to raise the compression rate of video with little background change, such as home camera video, without affecting the subjective visual quality of the picture is a problem that merits further optimization.

Summary of the Invention

This Summary is provided to introduce, in simplified form, some concepts that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

According to one embodiment of the present invention, a method for improving the compression rate of static background video based on background modeling technology is provided, comprising: establishing a background model using a foreground extraction algorithm, the background model characterizing each pixel in an image frame; initializing the established background model; updating the background model based on an acquired new image frame; performing foreground or background estimation for each pixel in the new image frame; adjusting a dynamic rate control factor for each coding block according to the probability that the pixels in that coding block of the new image frame are determined to be background; and differentially encoding each coding block based on its dynamic rate control factor.

According to another embodiment of the present invention, a system for improving the compression rate of static background video based on background modeling technology is provided, comprising a background modeling module and a dynamic encoding module. The background modeling module is configured to: establish a background model using a foreground extraction algorithm, the background model characterizing each pixel in an image frame; initialize the established background model; update the background model based on an acquired new image frame; and perform foreground or background estimation for each pixel in the new image frame. The dynamic encoding module is configured to: adjust a dynamic rate control factor for each coding block according to the probability that the pixels in that coding block of the new image frame are determined to be background; and differentially encode each coding block based on its dynamic rate control factor.

According to yet another embodiment of the present invention, a computing device for image super-resolution reconstruction is provided, comprising: a processor; and a memory storing instructions that, when executed by the processor, perform the method described above.

These and other features and advantages will become apparent upon reading the following detailed description with reference to the associated drawings. It is to be understood that both the foregoing general description and the following detailed description are illustrative only and do not limit the claimed aspects.

Description of the Drawings

So that the above features of the present invention can be understood in detail, the content briefly summarized above is described more specifically with reference to various embodiments, some aspects of which are illustrated in the accompanying drawings. It should be noted, however, that the drawings illustrate only certain typical aspects of the invention and are therefore not to be considered limiting of its scope, since the description may admit other equally effective aspects.

FIG. 1 shows a block diagram of a video coding system 100 based on background modeling technology according to one embodiment of the present invention;

FIG. 2 shows a flowchart of a video coding method 200 based on background modeling technology according to one embodiment of the present invention;

FIG. 3 shows a block diagram of a computing device 300, a hardware device applicable to aspects of the present invention, according to one embodiment of the present invention.

Detailed Description

The present invention is described in detail below with reference to the accompanying drawings, and its features will become more apparent from the following description.

The purpose of the present invention is to combine background modeling technology to obtain a higher compression rate than mainstream coding (such as H.264 and H.265) while also improving the subjective visual quality of the picture. In the present invention, background modeling is performed on the video picture, and continuous differentiated encoding is performed according to the probability that each pixel is judged to be background, so that regions with a high probability of being background tend to receive fewer bits in rate control, thereby increasing the compression rate. Because the probability function is continuous, the coding strategy can be adjusted continuously as well, which eliminates visible differences in picture quality and improves user perception while preserving compression performance.

FIG. 1 shows a block diagram of a video coding system 100 based on background modeling technology according to one embodiment of the present invention. As shown in FIG. 1, the system 100 is divided into modules, which communicate and exchange data in ways known in the art. In the present invention, each module may be implemented in software, hardware, or a combination of the two. The system 100 may include a background modeling module 101 and a dynamic encoding module 102.

In general, referring to FIG. 1, the background modeling module 101 is configured to obtain the probability that a pixel belongs to the background through background subtraction/foreground extraction based on a background modeling method. According to one embodiment of the present invention, this mainly includes: modeling the background environment of the video picture, extracting the foreground of the picture through background subtraction, and at the same time computing, for each pixel, the probability that the model judges it to be background.

The dynamic encoding module 102 is configured to perform differentiated encoding based on dynamic rate control. According to one embodiment of the present invention, this mainly includes: based on the per-pixel background probability output by the background modeling module 101, applying a dynamic rate control strategy to the pixels so that pixels with a high probability of being background tend to receive fewer bits in rate control, which raises the video compression rate while preserving visual quality.

The present invention is especially suitable for scenarios with a static background, such as those using monitoring devices like home cameras, conference recording equipment, and road cameras. Such devices can photograph and film a scene, store the acquired image data locally for processing (for example, encoding), and, when needed, send the processed data to a remote device (for example, a smart home control platform, a central control platform, or another computing device) for subsequent processing (for example, playback or editing). According to one embodiment of the present invention, the background modeling module 101 and the dynamic encoding module 102 may be implemented in such a monitoring device or in another computing device 300 as described with reference to FIG. 3.

FIG. 2 shows a flowchart of a video coding method 200 based on background modeling technology according to one embodiment of the present invention. The method 200 mainly comprises two stages, a background modeling stage 201 and a dynamic encoding stage 202. According to one embodiment of the present invention, the background modeling stage 201 may be implemented by the background modeling module 101 shown in FIG. 1, and the dynamic encoding stage 202 may be implemented by the dynamic encoding module 102 shown in FIG. 1.

In the present invention, pixels in an image frame determined to be background and pixels determined to be foreground are subsequently encoded with a conventional codec (such as H.264 or H.265) using different rate control factors λ, so that pixels with a high probability of being background tend to receive fewer bits in rate control. This raises the overall compression rate of the image frame and relieves both the storage pressure on the device and the transmission pressure on the communication trunk. A more detailed description follows with reference to FIG. 2.

The background modeling stage 201 uses a background modeling method to obtain the probability that a pixel belongs to the background. After the foreground has been identified, stage 201 uses the parameters of the background model to derive a parameter representing how probable it is that a pixel belongs to the background, so that the function that adjusts the rate control factor in the subsequent conventional encoding (such as H.264 or H.265) is continuous. As those skilled in the art know, conventional encoding mostly uses a rate-distortion optimization strategy to balance bit rate against image quality: the Lagrangian method gives the cost function J = D + λ·R, where D is the image distortion, R is the bit rate, and the Lagrangian factor λ, also called the rate control factor, controls the relative weight of distortion and bit rate. The larger λ is, the greater the weight of the bit rate, and the more the encoder tends to sacrifice video quality to obtain a smaller bit rate.
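The role of λ in the cost function J = D + λ·R can be illustrated with a minimal Python sketch. The candidate modes, their distortion and rate figures, and the function names `rd_cost` and `pick_mode` are hypothetical values chosen for illustration; they do not come from any real encoder.

```python
def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def pick_mode(candidates, lam):
    """Return the (name, D, R) candidate with the smallest cost J."""
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))


# Hypothetical candidates: (mode name, distortion D, rate R in bits).
CANDIDATES = [
    ("fine", 10.0, 200.0),
    ("medium", 40.0, 80.0),
    ("coarse", 120.0, 20.0),
]

if __name__ == "__main__":
    for lam in (0.1, 1.0, 5.0):
        print(lam, pick_mode(CANDIDATES, lam)[0])
```

With these made-up numbers, a small λ favors the low-distortion candidate and a large λ favors the low-rate candidate, which is the trade-off the paragraph describes.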

First, the background modeling stage 201 begins at step 201-1. In step 201-1, a foreground extraction algorithm is used to establish a background model that characterizes each pixel in the image frame. Based on the common assumption that a background picture free of intruding objects can be described by a statistical model, a background model is built for every pixel in the picture. The modeling method may draw on, without being limited to, current mainstream foreground extraction algorithms such as GMM, ViBe, SACON, and PBAS. Other types of foreground extraction algorithm are, of course, also within the scope of the present invention.

According to one embodiment of the present invention, step 201-1 is illustrated using a Gaussian mixture model (GMM) as an example. As shown in equation (1), a mixture of K Gaussian distributions with weights ω is used to characterize each pixel in the image frame:

$$P(X_t) = \sum_{i=1}^{K} \omega_{i,t}\,\eta(X_t \mid \mu_{i,t}, \Sigma_{i,t}) \tag{1}$$

where X is the historical value of any pixel, and ω, μ, and Σ are the weight, mean, and covariance of each Gaussian distribution, respectively.
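As a minimal sketch of the weighted mixture described above, the following Python code evaluates the mixture density for a single grayscale pixel. It assumes scalar pixel values, so each covariance reduces to a variance; the function names `gaussian_pdf` and `mixture_density` are illustrative, not taken from the patent.

```python
import math


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_density(x, weights, means, variances):
    """Weighted sum of K Gaussian densities evaluated at pixel value x."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))


if __name__ == "__main__":
    # Hypothetical two-component model for one pixel.
    weights, means, variances = [0.7, 0.3], [120.0, 40.0], [25.0, 100.0]
    print(mixture_density(121.0, weights, means, variances))
    print(mixture_density(200.0, weights, means, variances))
```

A value close to a heavily weighted mean yields a much higher density than a value far from every component, which is what lets the model separate typical background values from intruding objects.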

In step 201-2, the background model established in step 201-1 is initialized. In theory, given an image that contains only background and no foreground, the foreground objects could be obtained simply by subtracting that background from each new image. In many cases, however, no such background image is available, so, depending on the algorithm, the first frame or the first N frames are usually used to initialize the selected/established background model.

According to one embodiment of the present invention, continuing the GMM example above, in step 201-2 the background model is initialized with the first frame: each Gaussian distribution takes the pixel value of the first frame as its mean, all weights are 1/K, and all standard deviations are set to a large value. Those skilled in the art will understand that the standard deviation expresses the dispersion of the data. At initialization there is only one sample, so the dispersion cannot be computed; it is therefore assumed to be large and is updated once the next frame supplies new data.
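The initialization rule just described can be sketched in a few lines of Python. The dictionary layout, the function name `init_pixel_model`, the default K = 3, and the initial variance of 900 (a standard deviation of 30 gray levels) are illustrative assumptions; the text only says that the initial standard deviation is "large".

```python
def init_pixel_model(first_value: float, k: int = 3, initial_var: float = 900.0):
    """Build a K-component mixture for one pixel from its first-frame value.

    Every component starts at the first-frame value with weight 1/K and a
    deliberately large variance, since dispersion cannot be estimated from
    a single sample.
    """
    return {
        "weights": [1.0 / k] * k,
        "means": [float(first_value)] * k,
        "variances": [float(initial_var)] * k,
    }


if __name__ == "__main__":
    print(init_pixel_model(128))
```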

In step 201-3, the background model is updated based on the acquired new image frame. According to one embodiment of the present invention, when a new image frame is acquired, the model parameters are updated according to the update strategy of the chosen model algorithm and the pixel values in the new image frame. Those skilled in the art will understand that different models use different parameters (such as the thresholds used in matching conditions); which parameters are updated is not specifically limited here.

According to one embodiment of the present invention, continuing the GMM example above, in step 201-3, when a new image frame is acquired, the model parameters are updated according to the following update strategy:

$$\omega_{i,t} = (1-\alpha)\,\omega_{i,t-1} + \alpha M_{i,t} \tag{2}$$

$$\mu_{i,t} = (1-\rho)\,\mu_{i,t-1} + \rho X_t \tag{3}$$

where

$$\rho = \alpha\,\eta(X_t \mid \mu_{i,t}, \sigma_{i,t}) \tag{6}$$

and α is the learning rate.
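The update strategy can be sketched in Python for one scalar pixel as follows. Equations (2), (3), and (6) are taken from the text; everything else is an assumption borrowed from the common Stauffer-Grimson style GMM and is labeled as such in the comments: M is 1 for the matched component and 0 otherwise, a match means lying within 2.5 standard deviations, the variance follows the same running-average form, and an unmatched value replaces the weakest component.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def update_pixel(model, x, alpha=0.01, match_sigmas=2.5, reset_var=900.0):
    """Update one pixel's mixture in place; return True if x matched a component."""
    w, mu, var = model["weights"], model["means"], model["variances"]
    # Assumed matching rule: first component within match_sigmas std deviations.
    matched = next((i for i in range(len(w))
                    if abs(x - mu[i]) <= match_sigmas * math.sqrt(var[i])), None)
    for i in range(len(w)):
        m = 1.0 if i == matched else 0.0             # assumed meaning of M
        w[i] = (1.0 - alpha) * w[i] + alpha * m      # equation (2)
    if matched is None:
        # Assumed fallback: recycle the weakest component around the new value.
        j = min(range(len(w)), key=lambda i: w[i])
        mu[j], var[j] = float(x), reset_var
    else:
        rho = alpha * gaussian_pdf(x, mu[matched], var[matched])   # equation (6)
        mu[matched] = (1.0 - rho) * mu[matched] + rho * x          # equation (3)
        # Assumed variance update; its formula is not given in the text.
        var[matched] = (1.0 - rho) * var[matched] + rho * (x - mu[matched]) ** 2
    total = sum(w)
    w[:] = [wi / total for wi in w]                  # keep weights summing to 1
    return matched is not None


if __name__ == "__main__":
    model = {"weights": [0.5, 0.5], "means": [100.0, 200.0],
             "variances": [100.0, 100.0]}
    print(update_pixel(model, 105.0, alpha=0.1), model)
```

Because ρ scales with the likelihood of the new value, a value that sits close to the component mean pulls the mean more strongly than one near the edge of the match window.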

In step 201-4, foreground/background estimation is performed for each pixel in the new image frame acquired in step 201-3. According to one embodiment of the present invention, the current value of each pixel in the new image frame is matched against its corresponding background model (for example, each pixel's own GMM background model). Pixels that fail to match are classified as foreground; for pixels that match, the probability p that the pixel belongs to the background is computed from the model parameters updated in step 201-3. According to one embodiment of the present invention, a matching threshold may be used in this matching.

According to one embodiment of the present invention, continuing the GMM example above, in step 201-4 the K Gaussian distributions are sorted, and the first B Gaussian distributions are taken as the current background model:

$$B = \arg\min_{b}\left(\sum_{i=1}^{b} \omega_{i,t} > T\right) \tag{7}$$

where T is the minimum proportion of the picture occupied by the background. Equation (7) yields the parameter B: as stated above, in the GMM model updated in step 201-3, the model formed by the first B Gaussian distributions is taken as the current background model. The current value of each pixel is matched against the current background model. If matching fails, the pixel is classified as foreground; otherwise, the B values of a are normalized to obtain a', and a parameter p representing the probability that the pixel belongs to the background is computed:

$$p = a'(X_t - \mu) \tag{8}$$
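The selection of the first B distributions and the subsequent matching can be sketched as follows for a scalar pixel. Two points are assumptions rather than statements from the text: the sort key (weight divided by standard deviation, as in the common GMM background subtractor) and the 2.5-standard-deviation matching threshold. The sketch deliberately stops at the foreground/background decision and does not attempt to compute p.

```python
import math


def background_components(weights, variances, t=0.7):
    """Indices of the first B components whose cumulative weight exceeds t.

    Components are ranked by weight / sigma (assumed sort key), so stable,
    frequently observed components come first.
    """
    order = sorted(range(len(weights)),
                   key=lambda i: weights[i] / math.sqrt(variances[i]),
                   reverse=True)
    chosen, total = [], 0.0
    for i in order:
        chosen.append(i)
        total += weights[i]
        if total > t:
            break
    return chosen


def is_background(x, means, variances, chosen, match_sigmas=2.5):
    """True if x lies within match_sigmas std deviations of a chosen component."""
    return any(abs(x - means[i]) <= match_sigmas * math.sqrt(variances[i])
               for i in chosen)


if __name__ == "__main__":
    weights, means, variances = [0.6, 0.3, 0.1], [100.0, 150.0, 30.0], [25.0, 25.0, 400.0]
    bg = background_components(weights, variances, t=0.7)
    print(bg, is_background(102.0, means, variances, bg),
          is_background(30.0, means, variances, bg))
```

In this example the third component matches the value 30 but is not among the first B components, so that pixel is still treated as foreground.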

Once the probability that each pixel is judged to be background has been computed, the method enters the dynamic encoding stage 202, which uses that probability to adjust the encoding strategy. According to one embodiment of the present invention, the dynamic encoding stage 202 proceeds coding block by coding block: steps 202-1 to 202-2 below are repeated for each coding block in the current image frame until all coding blocks have been encoded. Those skilled in the art will understand that the partitioning of the image frame into coding blocks and the order in which the blocks are encoded can be configured according to the specific codec used (such as H.264 or H.265); the specific partitioning and/or coding order is outside the scope of protection of the present invention.

Based on the background picture quality acceptable to the user, stage 202 explores the mapping between the distortion/bit-rate trade-off value in the rate-distortion optimization strategy and the probability that a pixel is a background point, in order to obtain an adjustment strategy better matched to the attention mechanism of the human eye.

In step 202-1, the dynamic rate control factor of the current coding block is adjusted based on the probability that the pixels in the block are determined to be background. According to one embodiment of the present invention, if all pixels in the current coding block are background points, the average p_avg of the p values of all pixels in the block is taken. By exploring the mapping between the probability that a pixel belongs to the background and the background picture quality acceptable to the user, the rate control factor in the rate-distortion optimization strategy is obtained as λ = f(p_avg), where f(·) denotes the correspondence between the distortion/bit-rate trade-off in the rate-distortion optimization strategy and the probability that a pixel belongs to the background. This correspondence is a continuous function rather than the discrete piecewise function of the prior art. For example, in a conventional x264 encoder, λ corresponds to the quantization parameter QP; QP is one of the encoder's inputs and is always an integer, and λ is obtained from QP by table lookup. The correspondence between λ and QP in conventional encoding is therefore not a continuous function. In addition, in the present invention, under this correspondence, the greater the probability that a pixel belongs to the background, the larger λ is, and the more the encoder tends to sacrifice video quality to obtain a smaller bit rate.
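The text does not give a concrete form for f(·), only that it is continuous and increasing in p_avg. The Python sketch below uses a purely hypothetical linear mapping, λ = base_lambda · (1 + gain · p_avg), to illustrate the idea; `block_lambda`, `base_lambda`, and `gain` are illustrative names and values, not parameters from the patent or from any encoder.

```python
def block_lambda(p_values, base_lambda, gain=2.0):
    """Map the mean background probability of a block to a rate control factor.

    Hypothetical choice of f: a linear, continuous, increasing function of
    p_avg, clamped to [0, 1]. A block with p_avg = 0 keeps base_lambda.
    """
    p_avg = sum(p_values) / len(p_values)
    p_avg = min(1.0, max(0.0, p_avg))
    return base_lambda * (1.0 + gain * p_avg)


if __name__ == "__main__":
    for p in (0.0, 0.50, 0.51, 1.0):
        print(p, block_lambda([p], base_lambda=10.0))
```

Because the mapping is continuous, two neighboring blocks with nearly equal p_avg receive nearly equal λ, which is the property the text relies on to avoid a visible quality step between regions.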

In step 202-2, the current coding block is differentially encoded using the dynamic rate control factor. According to one embodiment of the present invention, if the pixels in the current coding block include foreground or are all foreground, the block is encoded normally using the original conventional method (such as H.264 or H.265); if all pixels in the current coding block are background points, the block is encoded with the λ obtained in step 202-1. Because pixels with a large p value can be encoded with a larger λ, the background portion of the video picture is compressed further.

In step 202-3, it is determined whether the current image frame still contains coding blocks to be encoded. If so, the flow returns to step 202-1; if not, it proceeds to step 203.

In step 203, it is determined whether there is a new image frame. If so, the flow returns to step 201-3 to acquire the next new image frame; if not, the flow ends. According to one embodiment of the present invention, the method may be set to acquire and encode the image frames within a predetermined time period. According to another embodiment of the present invention, the flow may end if the background changes significantly (for example, if the difference between the background of the new image frame and that of the previous image frame reaches a certain threshold).
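The per-block decision of steps 202-1 to 202-3 can be sketched as a short Python loop. The block representation (a dict with a per-pixel foreground flag list `is_fg` and a per-pixel probability list `p`), the function names, and the example mapping `f` are all hypothetical; the sketch returns the λ chosen for each block rather than invoking a real encoder.

```python
def choose_block_lambda(is_fg, p_values, base_lambda, f):
    """Blocks containing any foreground keep base_lambda; pure-background
    blocks use f(p_avg), following the rule in step 202-2."""
    if any(is_fg):
        return base_lambda
    return f(sum(p_values) / len(p_values))


def plan_frame(blocks, base_lambda, f):
    """Visit every coding block of a frame and return its rate control factor."""
    return [choose_block_lambda(b["is_fg"], b["p"], base_lambda, f) for b in blocks]


if __name__ == "__main__":
    base = 10.0
    f = lambda p_avg: base * (1.0 + 2.0 * p_avg)   # hypothetical continuous mapping
    frame = [
        {"is_fg": [False, True], "p": [0.9, 0.0]},   # contains foreground
        {"is_fg": [False, False], "p": [0.8, 0.6]},  # pure background
    ]
    print(plan_frame(frame, base, f))
```

In a real pipeline the returned factors would be handed to the codec's rate-distortion decision for each block, and the outer loop over frames would call the background model update before planning each frame.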

Compared with the prior art, the present invention has the following advantages:

(1) Higher video compression rate. While preserving the quality of the foreground, the parts of the video more likely to belong to the background are compressed further, saving additional bit rate and thereby raising the compression rate of the video.

(2) Dynamically adjustable compression efficiency. The compression efficiency can be adjusted dynamically to suit the requirements of a specific scenario by modifying the parameters of the formula that computes the rate control factor.

(3) Better subjective visual quality. Based on the attention mechanism of the human eye, a more reasonable coding strategy is adopted: the parts of the picture identified as background are compressed according to the probability that they really are background points, so that the formula adjusting the rate control factor is a continuous function rather than a discrete piecewise function, which avoids visible quality layering between foreground and background regions.

(4) High practicality and ease of implementation for static backgrounds. For scenes whose background is mostly static or changes little, such as home cameras, conference recordings, and road cameras, the method delivers marked improvements in compression performance and visual quality.

FIG. 3 shows a block diagram 300 of an exemplary computing device according to one embodiment of the present invention; the computing device is one example of a hardware device applicable to aspects of the present invention.

Referring to FIG. 3, a computing device 300 will now be described, which is one example of a hardware device applicable to aspects of the present invention. Computing device 300 may be any machine that can be configured to perform processing and/or computation, and may be, but is not limited to, a workstation, server, desktop computer, laptop computer, tablet computer, personal digital assistant, smartphone, on-board computer, home camera, conference recording device, road camera, or any combination thereof. The various methods, apparatuses, servers, and client devices described above may be implemented in whole or at least in part by computing device 300 or a similar device or system.

Computing device 300 may include components that may be connected to or communicate with bus 302 via one or more interfaces. For example, computing device 300 may include a bus 302, one or more processors 304, one or more input devices 306, and one or more output devices 308. The one or more processors 304 may be any type of processor and may include, but are not limited to, one or more general-purpose processors and/or one or more special-purpose processors (e.g., specialized processing chips). Input device 306 may be any type of device capable of inputting information to the computing device and may include, but is not limited to, a mouse, keyboard, touch screen, microphone, and/or remote controller. Output device 308 may be any type of device capable of presenting information and may include, but is not limited to, a display, speaker, video/audio output terminal, vibrator, and/or printer. Computing device 300 may also include, or be connected to, a non-transitory storage device 310, which may be any storage device that is non-transitory and capable of storing data, including, but not limited to, a magnetic disk drive, optical storage device, solid-state memory, floppy disk, flexible disk, hard disk, magnetic tape or any other magnetic medium, optical disc or any other optical medium, ROM (read-only memory), RAM (random-access memory), cache memory, and/or any memory chip or cartridge, and/or any other medium from which a computer can read data, instructions, and/or code. Non-transitory storage device 310 may be detachable from an interface. Non-transitory storage device 310 may hold data/instructions/code for implementing the methods and steps described above. Computing device 300 may also include a communication device 312. Communication device 312 may be any type of device or system capable of communicating with internal devices and/or with a network and may include, but is not limited to, a modem, network card, infrared communication device, wireless communication device, and/or chipset, such as a Bluetooth device, IEEE 802.11 device, WiFi device, WiMax device, cellular communication device, and/or similar device.

Bus 302 may include, but is not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.

Computing device 300 may also include working memory 314, which may be any type of working memory capable of storing instructions and/or data that facilitate the operation of processor 304, and may include, but is not limited to, random-access memory and/or read-only memory devices.

Software components may reside in working memory 314, including, but not limited to, an operating system 316, one or more application programs 318, drivers, and/or other data and code. Instructions for implementing the above-described methods and steps of the present invention may be contained in the one or more application programs 318, and the above-described method 200 of the present invention may be carried out by processor 304 reading and executing the instructions of the one or more application programs 318.

It should also be recognized that variations may be made according to specific needs. For example, custom hardware may be used, and/or particular components may be implemented in hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. In addition, connections to other computing devices, such as network input/output devices, may be employed. For example, some or all of the disclosed methods and devices may be implemented by programming hardware (e.g., programmable logic circuits including field-programmable gate arrays (FPGAs) and/or programmable logic arrays (PLAs)) in assembly language or a hardware programming language (e.g., VERILOG, VHDL, C++), using the logic and algorithms according to the present invention.

Although aspects of the present invention have so far been described with reference to the accompanying drawings, the above-described methods, systems, and devices are merely examples, and the scope of the present invention is not limited to these aspects but is defined only by the appended claims and their equivalents. Various components may be omitted or replaced by equivalent components. In addition, the steps may be performed in an order different from that described in the present invention. Furthermore, the various components may be combined in various ways. Importantly, as technology develops, many of the described components may be replaced by equivalent components that appear later.

Claims

1. A method for improving the compression ratio of static background video based on background modeling technology, comprising:

establishing a background model using a foreground extraction algorithm, the background model characterizing the features of each pixel in an image frame;

initializing the established background model;

updating the background model based on an acquired new image frame;

performing foreground or background estimation on each pixel in the new image frame;

adjusting a dynamic rate control factor for each coding block according to the probability that the pixels in each coding block in the new image frame are determined to be background; and

differentially encoding each coding block based on the dynamic rate control factor for that coding block.

2. The method of claim 1, wherein the foreground extraction algorithm comprises one of GMM, ViBe, SACON, or PBAS.

3. The method of claim 1, wherein initializing the established background model further comprises:

initializing the background model using the first frame image or the first N frame images.

4. The method of claim 1, wherein performing foreground or background estimation on each pixel in the new image frame further comprises:

matching the current pixel value of each pixel in the new image frame to its corresponding updated background model;

if the match fails, determining the pixel to be foreground; and

if the match succeeds, calculating the probability p that the pixel belongs to the background according to the updated background model.

5. The method of claim 4, wherein adjusting the dynamic rate control factor for each coding block according to the probability that the pixels in each coding block in the new image frame are determined to be background further comprises:

if all the pixels in the current coding block are determined to be background, taking the average value p_avg of the p values of all the pixels in the current coding block, and obtaining the rate control factor λ=f(p_avg) in the rate-distortion optimization strategy.

6. The method of claim 5, wherein differentially encoding each coding block based on the dynamic rate control factor for each coding block further comprises:

if all the pixels in the current coding block are determined to be background, encoding the block using the rate control factor λ.

7. A system for improving the compression ratio of static background video based on background modeling technology, comprising:

a background modeling module configured to:

initialize the established background model;

update the background model based on an acquired new image frame;

perform foreground or background estimation on each pixel in the new image frame; and

a dynamic encoding module configured to:

8. The system of claim 7, wherein performing foreground or background estimation on each pixel in the new image frame further comprises:

If the match fails, the pixel is determined to be the foreground;

9. The system of claim 8,

wherein adjusting the dynamic rate control factor for each coding block according to the probability that the pixels in each coding block in the new image frame are determined to be background further comprises: if all the pixels in the current coding block are determined to be background, taking the average value p_avg of the p values of all the pixels in the current coding block, and obtaining the rate control factor λ=f(p_avg) in the rate-distortion optimization strategy; and

wherein differentially encoding each coding block based on the dynamic rate control factor for each coding block further comprises: if all the pixels in the current coding block are determined to be background, encoding the block using the rate control factor λ.

10. A computing device for image super-resolution reconstruction, comprising:

a processor; and

a memory storing instructions which, when executed by the processor, perform the method of any one of claims 1-6.