CN103974071A - Video coding method and equipment on basis of regions of interest

Info

Publication number: CN103974071A
Application number: CN201310034633.XA
Authority: CN
Inventors: 王琪; 伍健荣
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2013-01-29
Filing date: 2013-01-29
Publication date: 2014-08-06

Abstract

The present invention discloses a region-of-interest-based video coding method and device. The video coding method includes: an object detection step for detecting a specific object in an input video frame, so as to identify a region of interest and a non-interest region in the video frame based on the detected specific object; a quantization parameter calculation step for calculating quantization parameter values of the macroblocks in the identified region of interest and non-interest region; a macroblock type selection step for selecting the types of the macroblocks in the region of interest and the non-interest region based on the type of the video frame; and a video encoding step for encoding the video frame based on the calculated quantization parameter values of the macroblocks in the region of interest and the non-interest region and on the selected macroblock types. According to the present invention, video compression can be implemented simply and efficiently, ensuring high resolution in the region of interest without making the non-interest region too blurred to be distinguished.

Description

Region-of-interest-based video coding method and device

Technical field

The present disclosure relates to video processing technology and, more specifically, to a region-of-interest (ROI)-based video coding method and device that can implement video compression simply and efficiently, maintaining good video quality in the ROI region of a video frame while not making the image in the non-interest (non-ROI) region too blurred to be distinguished.

Background

In recent years, applications based on intelligent video processing have been widely used in various electronic devices, and most of them need to transmit compressed video to a back-end server or another communication terminal for further processing. The high demand for video resolution therefore conflicts with the limited network bandwidth that constrains video quality, and a new video coding method is needed to resolve this conflict.

The concept of ROI has therefore been proposed in video processing, so that an application attends only to the important regions or objects in a video frame that the user cares about, while the other regions of the frame or picture (that is, the non-interest regions) can be processed coarsely or even ignored. Applied to video compression, the ROI concept usually means that, during encoding, more bits are allocated to the regions of the frame that the user cares about most and fewer bits to the other regions (for example, background and irrelevant objects), which reduces the size of the coded video stream. For example, in an intelligent surveillance application, regions containing people, cars, faces and the like are typically ROI regions and are encoded with more bits so that the necessary details remain clearly visible, while the remaining non-ROI regions are encoded with fewer bits to convey only some basic information.

Many ROI-based video coding methods have appeared in the field of video coding. For example, according to the video coding method disclosed in Non-Patent Document 1 below, the size of the coded video stream can be reduced by setting different quantization parameter values for the macroblocks in the ROI and non-ROI regions of a video frame. In that method, however, the quantization parameters set for the macroblocks in the ROI and non-ROI regions are fixed rather than adaptive, so the requirements on video quality may not be met.

Furthermore, according to the video coding method disclosed in Non-Patent Document 2 below, the size of the coded video stream is reduced by restricting, for every frame, the macroblock types of the ROI and non-ROI regions in it. However, this may make non-ROI regions such as the background too blurred to be distinguished.

The video coding method disclosed in Non-Patent Document 3 below uses an improved rate control algorithm to calculate the quantization parameter values of the macroblocks in the ROI and non-ROI regions and thereby achieve video compression, but that algorithm is relatively complex.

Citations

[Non-Patent Document 1]: Lino F, Luís C, Pedro A., "H.264/SVC ROI Encoding with Spatial Scalability", Proceedings of International Conference on Signal Processing and Multimedia Applications. Porto, Portugal: [s.n.], 2008: 212-215

[Non-Patent Document 2]: Taeyoung Na et al, "A Fast Macroblock Mode Decision Scheme Using ROI-Based Coding of H.264|MPEG-4 Part 10 AVC for Mobile Video Telephony Applications", Applications of Digital Image Processing XXXI, edited by Andrew G. Tescher, Proceedings of the SPIE, Volume 7073, pp. 70730I-70730I-10 (2008)

[Non-Patent Document 3]: Sira Rao and Nikil Jayant, "Optimizing Algorithms for Region-of-Interest Video Compression, with Application to Mobile Telehealth", ICME, pp. 513-516, IEEE (2006)

Summary of the invention

A brief summary of the invention is given below in order to provide a basic understanding of some aspects of the invention. It should be understood, however, that this summary is not exhaustive. It is not intended to identify key or critical parts of the invention, nor to limit its scope. Its sole purpose is to present some concepts of the invention in simplified form as a prelude to the more detailed description given later.

In view of the above problems, an object of the present invention is to provide a simple and efficient video coding method that adaptively sets the quantization parameter values of the macroblocks in the ROI and non-ROI regions of a video frame and selects the macroblock types in the ROI and non-ROI regions based on the type of the video frame, so that video compression is achieved while the high resolution of the ROI region is preserved and the non-ROI region is not made too blurred to be distinguished.

According to one aspect of the embodiments of the present invention, a region-of-interest-based video coding method is provided, including: an object detection step for detecting a specific object in an input video frame, so as to identify a region of interest and a non-interest region in the video frame based on the detected specific object; a quantization parameter calculation step for calculating quantization parameter values of the macroblocks in the identified region of interest and non-interest region; a macroblock type selection step for selecting the types of the macroblocks in the region of interest and the non-interest region based on the type of the video frame; and a video encoding step for encoding the video frame based on the calculated quantization parameter values of the macroblocks in the region of interest and the non-interest region and on the selected macroblock types.

According to a preferred embodiment of the present invention, in the macroblock type selection step, when the video frame is an intra frame, all macroblock types may be selected for both the region of interest and the non-interest region of the video frame; when the video frame is an inter frame, only inter macroblock types are selected for the non-interest region, while all macroblock types are selected for the region of interest.

According to another preferred embodiment of the present invention, in the quantization parameter calculation step, the quantization parameter value of the macroblocks in the region of interest may be calculated according to the size of the region of interest in the video frame.

According to another preferred embodiment of the present invention, in the quantization parameter calculation step, calculating the quantization parameter value of the macroblocks in the region of interest according to the size of the region of interest further includes calculating that value based on the ratio of the number of macroblocks in the region of interest to the number of macroblocks in the video frame.

According to another preferred embodiment of the present invention, the video coding method further includes a sampling step for sampling the video frame before the object detection step, wherein, in the object detection step, the region of interest and the non-interest region in the video frame are identified by identifying the region of interest and the non-interest region in the sampled video frame.

According to another aspect of the embodiments of the present invention, a region-of-interest-based video coding device is also provided, including: an object detection unit configured to detect a specific object in an input video frame, so as to identify a region of interest and a non-interest region in the video frame based on the detected specific object; a quantization parameter calculation unit configured to calculate quantization parameter values of the macroblocks in the identified region of interest and non-interest region; a macroblock type selection unit configured to select the types of the macroblocks in the region of interest and the non-interest region based on the type of the video frame; and a video encoding unit configured to encode the video frame based on the calculated quantization parameter values of the macroblocks in the region of interest and the non-interest region and on the selected macroblock types.

According to yet another aspect of the embodiments of the present invention, a storage medium is also provided. The storage medium includes machine-readable program code which, when executed on an information processing device, causes the information processing device to perform the region-of-interest-based video coding method according to the present invention.

In addition, according to a further aspect of the embodiments of the present invention, a program product is also provided. The program product includes machine-executable instructions which, when executed on an information processing device, cause the information processing device to perform the region-of-interest-based video coding method according to the present invention.

Therefore, according to the embodiments of the present invention, by limiting the range of macroblock types for the ROI and non-ROI regions according to the frame type during macroblock mode prediction, and by using different quantization parameter values for the ROI and non-ROI regions during quantization, video compression can be achieved simply and efficiently. High video quality is ensured in the ROI region without making the non-ROI region too blurred, which makes the method suitable for content-aware video coding applications such as face recognition in access control systems and remote monitoring of parking lots.

Other aspects of the embodiments of the present invention are given in the description below, in which the detailed description serves to fully disclose preferred embodiments of the present invention without limiting them.

Brief description of the drawings

The present invention can be better understood by referring to the detailed description given below in conjunction with the accompanying drawings, in which the same or similar reference numerals are used throughout to designate the same or similar parts. The drawings, together with the detailed description below, are incorporated in and form a part of this specification, and serve to further illustrate preferred embodiments of the invention and to explain its principles and advantages. In the drawings:

FIG. 1 is a flowchart showing an example of an ROI-based video coding method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram showing an example of ROI region identification in the object detection step of the ROI-based video coding method shown in FIG. 1;

FIG. 3 is a schematic diagram showing an example implementation of the macroblock type selection step in the ROI-based video coding method shown in FIG. 1;

FIG. 4 is a schematic diagram showing an example of the reduction in the number of bits per frame achieved by the ROI-based video coding method according to an embodiment of the present invention compared with a prior-art video coding method;

FIG. 5 is a block diagram showing an example of the functional configuration of an ROI-based video coding device according to an embodiment of the present invention; and

FIG. 6 is a block diagram showing an exemplary structure of a personal computer serving as the information processing device employed in an embodiment of the present invention.

Detailed description

Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual implementation are described in this specification. It should be understood, however, that in developing any such actual embodiment, many implementation-specific decisions must be made in order to achieve the developer's specific goals, such as complying with system-related and business-related constraints, and that these constraints may vary from one implementation to another. It should also be understood that, although such development work may be complex and time-consuming, it is merely a routine task for those skilled in the art having the benefit of this disclosure.

It should also be noted that, in order to avoid obscuring the present invention with unnecessary detail, the drawings show only the device structures and/or processing steps closely related to the solution according to the present invention, and other details of little relevance to the invention are omitted.

An ROI-based video coding method and video coding device according to embodiments of the present invention are described below with reference to FIGS. 1 to 6.

First, an example of the processing flow of the ROI-based video coding method according to an embodiment of the present invention is described with reference to FIG. 1.

Specifically, referring to FIG. 1, the ROI-based video coding method according to an embodiment of the present invention includes an object detection step S114, a quantization parameter calculation step S116, a macroblock type selection step S118, and a video encoding step S120. Preferably, the method may further include a denoising step S110 and a sampling step S112. The processing in each step is described in detail below.

First, in the object detection step S114, a specific object in the input video frame may be detected, so as to identify the region of interest and the non-interest region in the video frame based on the detected specific object. Specifically, an object detection algorithm suited to the needs of the actual application may be used to detect target objects (for example, people or cars), and the ROI and non-ROI regions of the video frame are then defined based on the detected objects. The object detection algorithm may be any algorithm known in the field of video processing and is not described in detail here.

Preferably, the input original video frame may also be preprocessed, for example by filtering, to remove environmental noise contained in the frame and improve the quality of the compressed video. In the denoising step S110, noise contained in the input video frame may be removed. Preferably, to further improve processing efficiency, the denoising step S110 and the object detection step S114 may be executed in parallel.

In addition, to improve the efficiency of video processing, the input video frame may first be sampled before the object detection step S114, so as to reduce the amount of computation in the subsequent object detection. Preferably, the sampling rate may be set by the user according to the complexity of the object detection algorithm used in the object detection step S114.

Preferably, in the object detection step S114, the ROI and non-ROI regions in the original video frame may be identified by identifying the ROI and non-ROI regions in the sampled video frame.

Specifically, this process can be explained in detail with reference to FIG. 2. An object detection algorithm is applied to the sampled video frame to detect the specific objects it contains, and the ROI and non-ROI regions in the sampled frame are identified based on the detected objects, yielding the position, size and similar information of the ROI region in the sampled frame. Then, by rescaling the ROI region obtained in the sampled frame and mapping it back onto the original input video frame, information (for example, position and size) about the ROI and non-ROI regions in the original input frame is obtained. This information is used in the subsequent encoding. The process is shown schematically in FIG. 2.
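The mapping from the sampled frame back to the original frame can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the ROI is an axis-aligned box `(x, y, width, height)` and that the frame was downsampled by the same integer factor in both dimensions. The function name `map_roi_to_original` and its parameters are hypothetical.

```python
def map_roi_to_original(bbox, factor, frame_width, frame_height):
    """Scale an ROI box found in a downsampled frame back to the original frame.

    bbox is (x, y, width, height) in sampled-frame pixels. factor is the
    downsampling factor (assumed to be an integer, equal in both dimensions).
    """
    x, y, w, h = bbox
    # Scale the two opposite corners back to original-frame pixel coordinates.
    x0, y0 = x * factor, y * factor
    x1, y1 = (x + w) * factor, (y + h) * factor
    # Clamp to the original frame so the ROI never extends past its borders.
    x0 = max(0, min(x0, frame_width))
    y0 = max(0, min(y0, frame_height))
    x1 = max(0, min(x1, frame_width))
    y1 = max(0, min(y1, frame_height))
    return (x0, y0, x1 - x0, y1 - y0)


# A box detected in a 160x120 sampled frame, mapped back to a 640x480 original.
print(map_roi_to_original((5, 4, 10, 6), 4, 640, 480))  # (20, 16, 40, 24)
```

Clamping is a design choice made here for safety; a real detector may already guarantee in-frame boxes.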

As an example, since processing in video coding is usually performed at the macroblock level, the ROI region here may refer to the region made up of all macroblocks that contain object pixels. Accordingly, as an example, the size of the ROI region in the input video may be expressed as the ratio of the number of macroblocks in the ROI region to the total number of macroblocks in the video frame, that is, k = N_ROI/N_TOTAL.
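A small sketch of how k might be computed is given below. It makes two assumptions that the text does not state: the detected object is approximated by a pixel bounding box lying inside the frame, and macroblocks are 16x16 pixels as in H.264. The name `roi_macroblock_ratio` is hypothetical.

```python
MB_SIZE = 16  # H.264 macroblocks cover 16x16 luma samples


def roi_macroblock_ratio(bbox, frame_width, frame_height, mb_size=MB_SIZE):
    """Return (N_ROI, N_TOTAL, k) for a pixel box (x, y, width, height).

    Every macroblock that contains at least one pixel of the box counts
    as part of the ROI. The box is assumed to lie inside the frame.
    """
    x, y, w, h = bbox
    # Macroblock grid size; partial blocks at the right/bottom edge count too.
    mb_cols = -(-frame_width // mb_size)   # ceiling division
    mb_rows = -(-frame_height // mb_size)
    # First and last macroblock column/row touched by the box.
    first_col, last_col = x // mb_size, (x + w - 1) // mb_size
    first_row, last_row = y // mb_size, (y + h - 1) // mb_size
    n_roi = (last_col - first_col + 1) * (last_row - first_row + 1)
    n_total = mb_cols * mb_rows
    return n_roi, n_total, n_roi / n_total


# A 20x10 pixel box at (10, 10) in a 64x48 frame touches 2x2 of the 4x3 macroblocks.
n_roi, n_total, k = roi_macroblock_ratio((10, 10, 20, 10), 64, 48)
print(n_roi, n_total)  # 4 12
```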

Next, once the information about the ROI region has been obtained, the quantization parameter values of the macroblocks in the ROI and non-ROI regions identified in step S114 may be calculated in the quantization parameter calculation step S116. Preferably, the quantization parameter value of the macroblocks in the ROI region may be calculated according to the size of the ROI region in the video frame.

As is well known to those skilled in the art of video compression, the ROI region is the region the user cares about most, whereas the non-ROI region contains unimportant information (such as the background). It is therefore desirable to encode the ROI region with more bits, so that the coded video keeps a high resolution there, and to encode the non-ROI region with fewer bits, so that the coded video stream is smaller and fits within the limited network bandwidth.

Therefore, in the quantization parameter calculation step S116, the quantization parameters of the macroblocks in the ROI region and in the non-ROI region are calculated separately to meet their different coding requirements.

A specific example of calculating the quantization parameter values of the macroblocks in the ROI and non-ROI regions is given below. For the macroblocks in the non-ROI region, a quantization parameter value that is a predetermined fixed constant is usually used. This constant can be set in advance according to the specific requirements on video quality. As an example, the constant may be 45 here.

For the macroblocks in the ROI region, in order to control the number of bits, the quantization parameter value is preferably not fixed but is updated adaptively according to the size of the ROI region in the video frame.

Preferably, the quantization parameter value of the macroblocks in the ROI region is calculated from the size of the ROI region by using the above ratio k of the number of macroblocks in the ROI region to the total number of macroblocks in the input video frame. This can be done, for example, with the following equations.

QP_ROI = Round(F + R*k)    (1)

QP'_ROI = MIN(QP_ROI, C)    (2)

Here, QP'_ROI denotes the quantization parameter value of the macroblocks in the ROI region, k denotes the above ratio of the number of macroblocks in the ROI region to the number of macroblocks in the video frame, Round() denotes rounding to an integer, and MIN() denotes taking the minimum. F, R and C denote the base value, the adjustment value and the upper limit of the quantization parameter value of the macroblocks in the ROI region, respectively, and are all predetermined constants. Preferably, as an example, F may be 22, R may be 50, and C may be 32.
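Equations (1) and (2) can be checked with a short sketch using the example constants F = 22, R = 50 and C = 32. The text does not say how Round() treats ties, so rounding half up is an assumption here (Python's built-in `round` rounds ties to even instead). The function name `roi_qp` is hypothetical.

```python
import math

F = 22  # base value of the ROI quantization parameter (example from the text)
R = 50  # adjustment value (example from the text)
C = 32  # upper limit (example from the text)


def roi_qp(k, base=F, adjust=R, cap=C):
    """Quantization parameter for ROI macroblocks, given the ROI ratio k."""
    # Equation (1): round to the nearest integer (ties rounded up, an assumption).
    qp = math.floor(base + adjust * k + 0.5)
    # Equation (2): never exceed the upper limit C.
    return min(qp, cap)


print(roi_qp(0.1))  # 27
print(roi_qp(0.5))  # 32
```

With these constants the upper limit takes effect once the ROI covers roughly a fifth of the frame, so a larger ROI is quantized more coarsely only up to QP 32, well below the non-ROI example value of 45.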

It should be understood that the calculation method and values given above are only examples and not limitations. Those skilled in the art may use other values or modify the above formulas as needed, as long as they follow the video coding principle that a larger quantization parameter value means fewer bits are needed for encoding and hence a lower resolution.
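Combining the fixed non-ROI value with the adaptive ROI value, a per-macroblock quantization parameter map for one frame might be built as sketched below. The map representation (a list of rows), the function name `build_qp_map`, the representation of the ROI as a set of `(row, col)` macroblock positions, and the round-half-up convention are all assumptions made for illustration; the constants are the example values from the text.

```python
import math

NON_ROI_QP = 45            # fixed example value for non-ROI macroblocks
F, R, C = 22, 50, 32       # example base, adjustment and upper-limit values


def build_qp_map(mb_rows, mb_cols, roi_blocks):
    """Return a mb_rows x mb_cols grid of quantization parameter values.

    roi_blocks is a set of (row, col) macroblock positions inside the ROI.
    """
    k = len(roi_blocks) / (mb_rows * mb_cols)       # k = N_ROI / N_TOTAL
    roi_qp = min(math.floor(F + R * k + 0.5), C)    # equations (1) and (2)
    return [
        [roi_qp if (r, c) in roi_blocks else NON_ROI_QP for c in range(mb_cols)]
        for r in range(mb_rows)
    ]


# A 4x5 macroblock frame with a two-block ROI (k = 0.1).
for row in build_qp_map(4, 5, {(1, 1), (1, 2)}):
    print(row)
```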

Next, to further reduce the number of bits needed to encode the non-ROI region, and thereby further reduce the size of the coded video stream, the macroblock types allowed in the non-ROI region during macroblock prediction can be restricted.

Specifically, in the macroblock type selection step S118, the types of the macroblocks in the ROI and non-ROI regions may be selected based on the type of the video frame.

As an example, referring to FIG. 3, the following rules may be applied in the macroblock type selection step S118 to select the macroblock types in the ROI and non-ROI regions based on the type of the video frame. If the video frame is an intra frame, all macroblock types are selected for both the ROI and non-ROI regions, because intra frames generally need more bits to encode. If the video frame is an inter frame, all macroblock types are selected for the ROI region to ensure high resolution, while only inter macroblock types are selected for the non-ROI region to reduce the bits needed for encoding.
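The selection rule can be written as a small lookup, sketched below. The two-way split into intra and inter macroblock types is a simplification made for this example (H.264 defines many more specific types), and "all macroblock types" is modeled as the full set, which a real encoder would further restrict to what the slice type permits. All names are hypothetical.

```python
from enum import Enum


class FrameType(Enum):
    INTRA = "intra"
    INTER = "inter"


class MacroblockType(Enum):
    INTRA = "intra"
    INTER = "inter"


ALL_TYPES = frozenset(MacroblockType)


def allowed_macroblock_types(frame_type, in_roi):
    """Macroblock types the encoder may choose from for one macroblock."""
    if frame_type is FrameType.INTRA:
        # Intra frame: no restriction in either region.
        return ALL_TYPES
    if in_roi:
        # Inter frame, ROI: keep every type available to preserve quality.
        return ALL_TYPES
    # Inter frame, non-ROI: inter types only, to save bits.
    return frozenset({MacroblockType.INTER})


print(sorted(t.value for t in allowed_macroblock_types(FrameType.INTER, in_roi=False)))
# ['inter']
```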

For the definition of macroblock types in video processing, see the relevant provisions of "Recommendation ITU-T H.264, Series H: Audiovisual and Multimedia Systems, Infrastructure of Audiovisual Services - Coding of Moving Video (2009)"; they are not described in detail here.

Then, in the video encoding step S120, the video frame may be encoded based on the quantization parameter values of the macroblocks in the ROI and non-ROI regions calculated in the quantization parameter calculation step S116 and on the macroblock types selected in the macroblock type selection step S118. Preferably, the encoding method may be entropy coding, so as to achieve lossless video compression.

Preferably, if the video frame has been denoised, the denoised video frame may be encoded in the video encoding step S120.

It should be understood that the above rules for selecting macroblock types in the ROI and non-ROI regions based on the frame type are only examples and not limitations, and that those skilled in the art may, following the principles of the present invention, adopt other suitable rules to reduce the number of bits needed for encoding.

FIG. 4 schematically shows the reduction in the number of bits per frame achieved by the ROI-based video coding method according to an embodiment of the present invention compared with a prior-art video coding method. In the graph shown in FIG. 4, the abscissa represents the frame number and the ordinate represents the number of bits.

In this example, the input video used for testing may be, for example, a video sequence for face detection in an office, so a face detection engine can be applied to obtain the ROI region information. The sequence is encoded in an IPPPP pattern, where I denotes an intra frame and P denotes a predicted frame. As can be seen from FIG. 4, applying the ROI-based video coding method according to an embodiment of the present invention greatly reduces the number of bits needed for encoding (by about 30% to 40%) compared with the prior-art video coding method. The desired coded video, which keeps the ROI region at high resolution without making the non-ROI region too blurred, can therefore be delivered to the user terminal device even over limited network bandwidth. In addition, compared with the prior art, the ROI-based video coding method according to the present invention is simpler and more efficient to implement, which also lowers the implementation cost of video processing.

Although an example of the ROI-based video coding method according to an embodiment of the present invention has been described in detail above with reference to FIGS. 1 to 4, those skilled in the art should understand that the flowchart shown in the drawings is only exemplary and that the above method flow may be modified according to the actual application and specific requirements. For example, the execution order of some steps in the above method may be adjusted, or some processing steps may be omitted or added, as needed. For example, the denoising step S110 and the sampling step S112, shown in dashed boxes in FIG. 1, are optional steps performed to improve video quality and processing efficiency, and the video coding method according to the present invention can be implemented without these two steps. In addition, it should be understood that the above examples do not limit the present invention, and that those skilled in the art may, based on the principles taught, modify the above process appropriately and apply it to other applications.

Corresponding to the ROI-based video coding method according to the embodiment of the present invention, an embodiment of the present invention further provides an ROI-based video encoding device. A functional configuration example of the ROI-based video encoding device according to the present invention is described in detail below with reference to FIG. 5.

As shown in FIG. 5, the ROI-based video encoding device 500 according to the present invention may include an object detection unit 514, a quantization parameter calculation unit 516, a macroblock type selection unit 518, and a video encoding unit 520. Preferably, the video encoding device 500 may further include a denoising unit 510 and a sampling unit 512. A functional configuration example of each unit is described in detail below with reference to FIG. 5.

The object detection unit 514 may be configured to detect a specific object in an input video frame, so as to identify the ROI region and the non-ROI region in the video frame based on the detected specific object. Any object detection algorithm known in the art may be used to detect the object, according to actual needs.

The quantization parameter calculation unit 516 may be configured to calculate quantization parameter values of the macroblocks in the identified ROI region and non-ROI region.

Specifically, for the non-ROI region in a video frame, the quantization parameter value can usually be a predetermined fixed constant, for example 45 here. For the ROI region in the video frame, in order to control the number of bits, the quantization parameter value is not fixed but can be changed adaptively according to actual needs. Preferably, the quantization parameter calculation unit 516 may calculate the quantization parameter value of the macroblocks in the ROI region according to the size of the ROI region in the video frame.

As an example, the quantization parameter calculation unit 516 may calculate the quantization parameter value of the macroblocks in the ROI region based on the ratio of the number of macroblocks in the ROI region to the total number of macroblocks in the video frame. For the specific calculation, see equations (1) and (2) described above for the ROI-based video coding method according to the embodiment of the present invention; the details are not repeated here.

It should be understood that the calculation method and the specific numerical values given here are only examples rather than limitations, and can be modified according to actual needs.
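The quantization parameter calculation can be sketched as follows. This is a minimal illustration, not the patented implementation: it uses the example constants quoted in this document (F = 22, R = 50, C = 32, and a fixed non-ROI value of 45), and it assumes that Round() means round-half-up, which the text does not specify. The function names `round_half_up`, `roi_qp`, and `non_roi_qp` are hypothetical.

```python
import math

# Example constants quoted in the document: base value F, adjustment value R,
# upper limit C, and the fixed quantization parameter for non-ROI macroblocks.
F, R, C = 22, 50, 32
QP_NON_ROI = 45


def round_half_up(x: float) -> int:
    """Assumed convention for Round(): round to nearest, halves go up."""
    return math.floor(x + 0.5)


def roi_qp(roi_macroblocks: int, total_macroblocks: int) -> int:
    """QP for ROI macroblocks: MIN(Round(F + R * k), C).

    k is the ratio of ROI macroblocks to all macroblocks in the frame.
    """
    if total_macroblocks <= 0:
        raise ValueError("total_macroblocks must be positive")
    k = roi_macroblocks / total_macroblocks
    return min(round_half_up(F + R * k), C)


def non_roi_qp() -> int:
    """QP for non-ROI macroblocks: a predetermined fixed constant."""
    return QP_NON_ROI
```

With these example constants, the ROI quantization parameter grows from 22 as the ROI occupies more of the frame and is clamped at 32 once the ROI covers 20% or more of the macroblocks, so a large ROI is quantized more coarsely to keep the bit count under control.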

The macroblock type selection unit 518 may be configured to select the types of macroblocks in the ROI region and the non-ROI region based on the type of the video frame. This can further reduce the number of bits required to encode the non-ROI region, thereby further reducing the size of the encoded video stream.

Specifically, the macroblock type can be selected based on the following simple rule: if the video frame is an intra frame, the macroblock type selection unit 518 may select all macroblock types for both the ROI region and the non-ROI region; if the video frame is an inter frame, the macroblock type selection unit 518 may select all macroblock types for the ROI region but only inter macroblock types for the non-ROI region.

Based on this rule, the number of bits required to encode the non-ROI region can be reduced, shrinking the encoded video stream, while the high resolution of the ROI region is preserved and the video image of the non-ROI region does not become excessively blurred. This rule is only one example of a rule for selecting macroblock types in macroblock mode prediction, and those skilled in the art may modify it appropriately based on the principles of the present invention.
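The selection rule above can be written as a small lookup. The sketch below is illustrative only: it collapses "all macroblock types" into two coarse categories (intra and inter), which is a simplifying assumption, and the names `FrameType`, `MacroblockType`, and `allowed_macroblock_types` are hypothetical.

```python
from enum import Enum


class FrameType(Enum):
    INTRA = "I"   # intra frame
    INTER = "P"   # inter (predicted) frame


class MacroblockType(Enum):
    # Assumption: real codecs define many sub-types; two coarse
    # categories are enough to illustrate the rule.
    INTRA = "intra"
    INTER = "inter"


ALL_TYPES = frozenset(MacroblockType)


def allowed_macroblock_types(frame_type: FrameType, in_roi: bool) -> frozenset:
    """Return the macroblock types the mode decision may consider.

    Intra frame: all types, for ROI and non-ROI alike.
    Inter frame: all types inside the ROI, inter types only outside it.
    """
    if frame_type is FrameType.INTER and not in_roi:
        return frozenset({MacroblockType.INTER})
    return ALL_TYPES
```

Restricting only the non-ROI macroblocks of inter frames keeps the mode search unchanged where quality matters while avoiding costly intra-coded macroblocks in the background.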

The video encoding unit 520 may be configured to encode the video frame based on the quantization parameter values of the macroblocks in the ROI region and the non-ROI region calculated by the quantization parameter calculation unit 516 and on the macroblock types selected by the macroblock type selection unit 518. Preferably, the encoding method may be entropy coding.

In addition, in order to reduce environmental noise and thereby improve video compression quality, the original video frame may also be input to the denoising unit 510 for denoising.

Specifically, the denoising unit 510 may be configured to remove noise included in the input video frame, which can be achieved, for example, by filtering known in the art. Preferably, in order to improve processing efficiency, the denoising unit 510 and the object detection unit 514 may be configured to perform their respective processing in parallel, so that the video encoding unit 520 can encode the denoised video.
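Running denoising and object detection side by side can be sketched with a thread pool. Everything here is a stand-in chosen for illustration: `denoise` is a placeholder 3-tap horizontal mean filter, `detect_roi` is a placeholder that returns the bounding box of bright pixels rather than a real face detector, and `preprocess` is a hypothetical name. The text prescribes neither the filter nor the detector.

```python
from concurrent.futures import ThreadPoolExecutor


def denoise(frame):
    """Placeholder filter: 3-tap horizontal mean with edge replication."""
    out = []
    for row in frame:
        n = len(row)
        out.append([
            (row[max(i - 1, 0)] + row[i] + row[min(i + 1, n - 1)]) // 3
            for i in range(n)
        ])
    return out


def detect_roi(frame, threshold=128):
    """Placeholder detector: bounding box (x, y, w, h) of bright pixels."""
    coords = [(x, y)
              for y, row in enumerate(frame)
              for x, value in enumerate(row)
              if value >= threshold]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


def preprocess(frame):
    """Submit both tasks on the same input frame and wait for both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        denoised = pool.submit(denoise, frame)
        roi = pool.submit(detect_roi, frame)
        return denoised.result(), roi.result()
```

Both tasks read the same unmodified input frame, so neither depends on the other's output. In CPython, threads do not speed up pure-Python CPU-bound work, so a real implementation would more likely rely on native libraries, processes, or dedicated hardware; the sketch only shows the data flow.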

In addition, in order to reduce the computational load of the object detection performed by the object detection unit 514 and thereby improve the efficiency of video processing, the video may also be sampled before being input to the object detection unit 514.

Specifically, the sampling unit 512 may be configured to sample the input video frame. Accordingly, the object detection unit 514 may be configured to identify the ROI region and the non-ROI region in the original video frame by identifying the ROI region and the non-ROI region in the sampled video frame. The ROI region information in the original video frame is obtained by mapping the position and size information of the ROI region in the sampled video frame back to the original video frame; for the specific implementation, see the description given with reference to FIG. 2 in the above method embodiment, which is not repeated here.
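One plausible way to map a detected ROI back to the original frame is sketched below. It is not the procedure of FIG. 2, which is not reproduced in this section: it assumes uniform downsampling by an integer factor `scale`, a rectangular ROI given as (x, y, w, h), and 16x16 macroblocks. The names `map_roi_to_original` and `roi_macroblocks` are hypothetical.

```python
MB_SIZE = 16  # assumed macroblock size in pixels


def map_roi_to_original(roi, scale):
    """Scale an (x, y, w, h) rectangle from the sampled frame back up."""
    x, y, w, h = roi
    return (x * scale, y * scale, w * scale, h * scale)


def roi_macroblocks(roi, frame_w, frame_h, mb_size=MB_SIZE):
    """Return the set of (col, row) macroblock indices the ROI overlaps."""
    x, y, w, h = roi
    cols = -(-frame_w // mb_size)  # ceiling division
    rows = -(-frame_h // mb_size)
    c0 = max(x // mb_size, 0)
    r0 = max(y // mb_size, 0)
    c1 = min((x + w - 1) // mb_size, cols - 1)
    r1 = min((y + h - 1) // mb_size, rows - 1)
    return {(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)}
```

For example, an ROI of (10, 5, 8, 8) found in a frame downsampled by 4 maps to (40, 20, 32, 32) in the original frame. The resulting macroblock set gives both the per-macroblock ROI flag and the ROI macroblock count needed for the ratio k.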

It should be noted that the ROI-based video encoding device described in the embodiment of the present invention corresponds to the foregoing method embodiment. Therefore, for parts not described in detail in the device embodiment, refer to the corresponding description in the method embodiment; the details are not repeated here.

In addition, the structure and functional configuration of the ROI-based video encoding device 500 described above are only examples and not limitations, and those skilled in the art may modify the above structure as required. For example, the denoising unit 510 and the sampling unit 512 shown in dashed boxes in FIG. 5 are optional units.

It should also be noted that the series of processes and devices described above may be implemented by software and/or firmware. In that case, a program constituting the software is installed from a storage medium or a network onto a computer having a dedicated hardware structure, such as the general-purpose personal computer 600 shown in FIG. 6, which is capable of performing various functions when various programs are installed.

In FIG. 6, a central processing unit (CPU) 601 executes various processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores, as needed, data required when the CPU 601 executes the various processes.

The CPU 601, the ROM 602, and the RAM 603 are connected to one another via a bus 604. An input/output interface 605 is also connected to the bus 604.

The following components are connected to the input/output interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display, such as a cathode ray tube (CRT) or a liquid crystal display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, and the like. The communication section 609 performs communication processing via a network such as the Internet.

A drive 610 is also connected to the input/output interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive 610 as needed, so that a computer program read from it is installed into the storage section 608 as needed.

When the above series of processes is implemented by software, the program constituting the software is installed from a network such as the Internet or from a storage medium such as the removable medium 611.

Those skilled in the art should understand that such a storage medium is not limited to the removable medium 611 shown in FIG. 6, which stores the program and is distributed separately from the device in order to provide the program to the user. Examples of the removable medium 611 include magnetic disks (including floppy disks (registered trademark)), optical disks (including compact disc read-only memory (CD-ROM) and digital versatile discs (DVD)), magneto-optical disks (including MiniDisc (MD) (registered trademark)), and semiconductor memories. Alternatively, the storage medium may be the ROM 602, a hard disk included in the storage section 608, or the like, which stores the program and is distributed to the user together with the device that contains it.

It should also be noted that the steps of the above series of processes may naturally be performed chronologically in the order described, but need not be. Certain steps may be performed in parallel or independently of one another.

Although the present invention and its advantages have been described in detail, it should be understood that various changes, substitutions, and alterations can be made without departing from the spirit and scope of the invention as defined by the appended claims. Moreover, the terms "comprise", "include", and any other variants thereof in the embodiments of the present invention are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

According to the embodiments of the present invention, the following supplementary notes are also disclosed:

1. A region-of-interest-based video coding method, comprising:

an object detection step of detecting a specific object in an input video frame, so as to identify a region of interest and a region of non-interest in the video frame based on the detected specific object;

a quantization parameter calculation step of calculating quantization parameter values of macroblocks in the identified region of interest and region of non-interest;

a macroblock type selection step of selecting types of macroblocks in the region of interest and the region of non-interest based on the type of the video frame; and

a video encoding step of encoding the video frame based on the calculated quantization parameter values of the macroblocks in the region of interest and the region of non-interest and on the selected macroblock types.

2. The video coding method according to supplementary note 1, wherein, in the macroblock type selection step, when the video frame is an intra frame, all macroblock types are selected for both the region of interest and the region of non-interest in the video frame; and when the video frame is an inter frame, only inter macroblock types are selected for the region of non-interest in the video frame, while all macroblock types are selected for the region of interest in the video frame.

3. The video coding method according to supplementary note 1, wherein, in the quantization parameter calculation step, the quantization parameter value of the macroblocks in the region of interest is calculated according to the size of the region of interest in the video frame.

4. The video coding method according to supplementary note 3, wherein, in the quantization parameter calculation step, calculating the quantization parameter value of the macroblocks in the region of interest according to the size of the region of interest in the video frame further comprises calculating the quantization parameter value of the macroblocks in the region of interest based on a ratio of the number of macroblocks in the region of interest to the number of macroblocks in the video frame.

5. The video coding method according to supplementary note 4, wherein, in the quantization parameter calculation step, the quantization parameter value of the macroblocks in the region of interest is calculated based on the following equations:

$$QP_{ROI} = \mathrm{Round}(F + R \cdot k)$$

$$QP'_{ROI} = \mathrm{MIN}(QP_{ROI},\, C)$$

where $QP'_{ROI}$ denotes the quantization parameter value of the macroblocks in the region of interest, $k$ denotes the ratio of the number of macroblocks in the region of interest to the number of macroblocks in the video frame, Round() denotes a rounding operation, MIN() denotes a minimum operation, and $F$, $R$, and $C$ denote, respectively, the base value, the adjustment value, and the upper limit of the quantization parameter value of the macroblocks in the region of interest, all of which are predetermined constants.

6. The video coding method according to supplementary note 5, wherein F is 22, R is 50, and C is 32.

7. The video coding method according to supplementary note 1, further comprising:

a denoising step of removing noise included in the video frame,

wherein, in the video encoding step, the denoised video frame is encoded.

8. The video coding method according to supplementary note 7, wherein the denoising step is performed in parallel with the object detection step.

9. The video coding method according to supplementary note 1, further comprising:

a sampling step of sampling the video frame before the object detection step,

wherein, in the object detection step, the region of interest and the region of non-interest in the video frame are identified by identifying the region of interest and the region of non-interest in the sampled video frame.

10. The video coding method according to any one of supplementary notes 1 to 9, wherein the quantization parameter value of the macroblocks in the region of non-interest is a predetermined fixed constant.

11. The video coding method according to supplementary note 10, wherein the fixed constant is 45.

12. The video coding method according to any one of supplementary notes 1 to 11, wherein, in the video encoding step, entropy coding is performed on the video frame.

13. A region-of-interest-based video encoding device, comprising:

an object detection unit configured to detect a specific object in an input video frame, so as to identify a region of interest and a region of non-interest in the video frame based on the detected specific object;

a quantization parameter calculation unit configured to calculate quantization parameter values of macroblocks in the identified region of interest and region of non-interest;

a macroblock type selection unit configured to select types of macroblocks in the region of interest and the region of non-interest based on the type of the video frame; and

a video encoding unit configured to encode the video frame based on the calculated quantization parameter values of the macroblocks in the region of interest and the region of non-interest and on the selected macroblock types.

14. The video encoding device according to supplementary note 13, wherein the macroblock type selection unit is further configured to: when the video frame is an intra frame, select all macroblock types for both the region of interest and the region of non-interest in the video frame; and when the video frame is an inter frame, select only inter macroblock types for the region of non-interest in the video frame and select all macroblock types for the region of interest in the video frame.

15. The video encoding device according to supplementary note 13, wherein the quantization parameter calculation unit is further configured to calculate the quantization parameter value of the macroblocks in the region of interest according to the size of the region of interest in the video frame.

16. The video encoding device according to supplementary note 15, wherein the quantization parameter calculation unit is further configured to calculate the quantization parameter value of the macroblocks in the region of interest based on a ratio of the number of macroblocks in the region of interest to the number of macroblocks in the video frame.

17. The video encoding device according to supplementary note 16, wherein the quantization parameter calculation unit is further configured to calculate the quantization parameter value of the macroblocks in the region of interest based on the following equations:

$$QP_{ROI} = \mathrm{Round}(F + R \cdot k)$$

$$QP'_{ROI} = \mathrm{MIN}(QP_{ROI},\, C)$$

18. The video encoding device according to supplementary note 17, wherein F is 22, R is 50, and C is 32.

19. The video encoding device according to supplementary note 13, further comprising:

a denoising unit configured to remove noise included in the video frame,

wherein the video encoding unit is further configured to encode the denoised video frame.

20. The video encoding device according to supplementary note 19, wherein the denoising unit and the object detection unit are configured to perform their processing in parallel.

21. The video encoding device according to supplementary note 13, further comprising:

a sampling unit configured to sample the video frame,

wherein the object detection unit is configured to identify the region of interest and the region of non-interest in the video frame by identifying the region of interest and the region of non-interest in the sampled video frame.

22. The video encoding device according to any one of supplementary notes 13 to 21, wherein the quantization parameter value of the macroblocks in the region of non-interest is a predetermined fixed constant.

23. The video encoding device according to supplementary note 22, wherein the fixed constant is 45.

24. The video encoding device according to any one of supplementary notes 13 to 23, wherein the video encoding unit is further configured to perform entropy coding on the video frame.

25. A storage medium comprising machine-readable program code which, when executed on an information processing device, causes the information processing device to perform the region-of-interest-based video coding method according to any one of supplementary notes 1 to 12.

26. A program product comprising machine-executable instructions which, when executed on an information processing device, cause the information processing device to perform the region-of-interest-based video coding method according to any one of supplementary notes 1 to 12.

Claims

1. A region-of-interest-based video coding method, comprising:

an object detection step of detecting a specific object in an input video frame, so as to identify a region of interest and a region of non-interest in the video frame based on the detected specific object;

a quantization parameter calculation step of calculating quantization parameter values of macroblocks in the identified region of interest and region of non-interest;

a macroblock type selection step of selecting types of macroblocks in the region of interest and the region of non-interest based on the type of the video frame; and

a video encoding step of encoding the video frame based on the calculated quantization parameter values of the macroblocks in the region of interest and the region of non-interest and on the selected macroblock types.

2. The video coding method according to claim 1, wherein, in the macroblock type selection step, when the video frame is an intra frame, all macroblock types are selected for both the region of interest and the region of non-interest in the video frame; and when the video frame is an inter frame, only inter macroblock types are selected for the region of non-interest in the video frame, while all macroblock types are selected for the region of interest in the video frame.

3. The video coding method according to claim 1, wherein, in the quantization parameter calculation step, the quantization parameter value of the macroblocks in the region of interest is calculated according to the size of the region of interest in the video frame.

4. The video coding method according to claim 3, wherein, in the quantization parameter calculation step, calculating the quantization parameter value of the macroblocks in the region of interest according to the size of the region of interest in the video frame further comprises calculating the quantization parameter value of the macroblocks in the region of interest based on a ratio of the number of macroblocks in the region of interest to the number of macroblocks in the video frame.

5. The video coding method according to claim 1, further comprising:

a sampling step of sampling the video frame before the object detection step,

wherein, in the object detection step, the region of interest and the region of non-interest in the video frame are identified by identifying the region of interest and the region of non-interest in the sampled video frame.

6. A region-of-interest-based video encoding device, comprising:

an object detection unit configured to detect a specific object in an input video frame, so as to identify a region of interest and a region of non-interest in the video frame based on the detected specific object;

a quantization parameter calculation unit configured to calculate quantization parameter values of macroblocks in the identified region of interest and region of non-interest;

a macroblock type selection unit configured to select types of macroblocks in the region of interest and the region of non-interest based on the type of the video frame; and

a video encoding unit configured to encode the video frame based on the calculated quantization parameter values of the macroblocks in the region of interest and the region of non-interest and on the selected macroblock types.

7. The video encoding device according to claim 6, wherein the macroblock type selection unit is further configured to: when the video frame is an intra frame, select all macroblock types for both the region of interest and the region of non-interest in the video frame; and when the video frame is an inter frame, select only inter macroblock types for the region of non-interest in the video frame and select all macroblock types for the region of interest in the video frame.

8. The video encoding device according to claim 6, wherein the quantization parameter calculation unit is further configured to calculate the quantization parameter value of the macroblocks in the region of interest according to the size of the region of interest in the video frame.

9. The video encoding device according to claim 8, wherein the quantization parameter calculation unit is further configured to calculate the quantization parameter value of the macroblocks in the region of interest based on a ratio of the number of macroblocks in the region of interest to the number of macroblocks in the video frame.

10. The video encoding device according to claim 6, further comprising:

a sampling unit configured to sample the video frame,

wherein the object detection unit is further configured to identify the region of interest and the region of non-interest in the video frame by identifying the region of interest and the region of non-interest in the sampled video frame.