CN103442226B - Fast multi-view color video coding method based on binocular just-noticeable distortion - Google Patents
Fast multi-view color video coding method based on binocular just-noticeable distortion
- Publication number
- CN103442226B CN103442226B CN201310325370.8A CN201310325370A CN103442226B CN 103442226 B CN103442226 B CN 103442226B CN 201310325370 A CN201310325370 A CN 201310325370A CN 103442226 B CN103442226 B CN 103442226B
- Authority
- CN
- China
- Prior art keywords
- macroblock
- frame
- viewpoint
- image
- binocular
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The invention discloses a fast coding method for multi-view color video based on binocular just-noticeable distortion (BJND). The method first uses the disparity information between the left-view video and the right-view video to determine a BJND value for every macroblock in the non-boundary region of each frame of the right-view video, and then terminates macroblock mode selection early according to the magnitude of that BJND value. Without degrading rate-distortion performance, this fast coding method effectively improves the coding efficiency of multi-view color video, saving 66.48% to 71.90% of the encoding time, 68.46% on average.
Description
Technical Field
The invention relates to a method for processing multi-view color video signals, and in particular to a fast coding method for multi-view color video based on binocular just-noticeable distortion.
Background Art
Three-dimensional television and free-viewpoint television make extensive use of multi-view color video for scene description. Multi-view color video comprises the color videos of several viewpoints; after encoding, transmission and decoding, virtual viewpoints are rendered at the display end. Within the multi-view color plus depth format, multi-view color video coding has been studied extensively, and the most widely applicable coding platforms are the Joint Multiview Video Model and Joint Multiview Video Coding (JMVC). However, because research on the characteristics of human vision is still developing, exploiting the perceptual properties of the human visual system in the compression of multi-view color video requires further study.
At present, among the many characteristics of the human visual system under study, just-noticeable distortion (JND) is one of the features researchers focus on most. JND characterizes the visibility threshold at which the human eye, viewing an image, can just perceive a change in the image pixels; it depends mainly on the luminance and contrast of the image. Liu et al. used a JND model to distinguish the edge and texture regions of an image. Recently, research on the visibility thresholds of human perception of three-dimensional images and video has become popular, for example depth just-noticeable distortion and binocular just-noticeable distortion (BJND). Depth JND characterizes the smallest perceivable threshold in a depth video, so pixels of the depth video below this threshold can be compressed further. BJND is a model obtained from binocular luminance-masking and contrast-masking experiments: when the distortion of the image or video of one viewpoint is below the BJND, the distortion is not perceived binocularly.
To further compress the huge amount of data in multi-view color plus depth video, the applicable coding platforms use full-search mode selection to determine the minimum rate-distortion cost of each macroblock and thereby its best prediction mode. To address the high computational complexity of the full search, researchers have proposed a number of fast mode-selection algorithms. Shen et al. proposed a low-complexity mode-selection algorithm comprising four effective techniques, namely early SKIP-mode decision, adaptive early termination, fast mode-size selection and selective intra coding; it effectively reduces the encoding time of multi-view color video while maintaining almost the same coding results as full-search mode selection. Zeng et al. used the relationship between the quantization step size and the rate-distortion cost as a threshold and computed the motion vector of the current block from the motion vectors of its neighboring blocks. These methods effectively reduce coding complexity without lowering coding quality; however, they do not fully exploit the visual characteristics of the human eye during multi-view color video coding, so perception-oriented multi-view color video coding still leaves considerable room for research.
Summary of the Invention
The technical problem to be solved by the invention is to provide a fast coding method for multi-view color video based on binocular just-noticeable distortion that can effectively reduce the encoding time of multi-view color video while maintaining the quality of the reconstructed viewpoint video.
The technical solution adopted by the invention to solve the above technical problem is a fast coding method for multi-view color video based on binocular just-noticeable distortion, characterized by comprising the following steps:
① Denote the left-view video of the multi-view color video as {C_L(k)} and its right-view video as {C_R(k)}, where C_L(k) is the k-th left-view frame in {C_L(k)}, C_R(k) is the k-th right-view frame in {C_R(k)}, 1 ≤ k ≤ K, and K is the number of frames contained in the left-view video and the right-view video.
② Partition each right-view frame in {C_R(k)} into a boundary region and a non-boundary region, where the boundary region consists of the first row, the last row, the first column and the last column of macroblocks of the right-view frame; then compute the binocular just-noticeable distortion value of every macroblock in the non-boundary region of each right-view frame in {C_R(k)}. Suppose the macroblock at coordinate position (i, j) in the k-th right-view frame C_R(k) of {C_R(k)} belongs to the non-boundary region of C_R(k); the binocular just-noticeable distortion value of that macroblock is then determined, where 2 ≤ i ≤ W/16 − 1, 2 ≤ j ≤ H/16 − 1, W is the width of every left-view frame in {C_L(k)} and every right-view frame in {C_R(k)}, and H is their height.
③ On the multi-view video coding verification model JMVC, code every macroblock of every left-view frame in {C_L(k)} and every right-view frame in {C_R(k)} with the hierarchical B-picture (HBP) prediction structure. During coding, the optimal coding mode of each macroblock is selected as follows (a sketch of this decision logic is given after step ③-3):

③-1. Define the macroblock currently to be coded as the current macroblock.

③-2. When the current macroblock belongs to {C_L(k)}, the encoder uses the H.264 mode-selection process to search the SKIP, Inter16×16, Inter16×8, Inter8×16, Inter8×8, Inter8×8Frext, Intra16×16, Intra8×8 and Intra4×4 macroblock coding modes and selects, among them, the mode with the minimum rate-distortion cost as the optimal coding mode of the current macroblock.

When the current macroblock belongs to {C_R(k)}, first determine whether it lies in the boundary region or the non-boundary region. If it lies in the boundary region, the encoder searches the full set of modes listed above with the H.264 mode-selection process and selects the mode with the minimum rate-distortion cost as the optimal coding mode of the current macroblock. If it lies in the non-boundary region, determine whether its binocular just-noticeable distortion value is greater than or equal to the set decision threshold: if so, the encoder again searches the full set of modes and selects the one with the minimum rate-distortion cost as the optimal coding mode; otherwise, the encoder searches only the SKIP and Inter16×16 modes and selects, of these two, the mode with the minimum rate-distortion cost as the optimal coding mode of the current macroblock.

③-3. Take the next macroblock to be coded as the current macroblock and return to step ③-2, until every macroblock of every left-view frame in {C_L(k)} and every right-view frame in {C_R(k)} has been coded.
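The early-termination decision of steps ③-1 to ③-3 can be summarized by the following Python-style sketch. It is only an illustration, not the JMVC implementation: the rate-distortion evaluation is passed in as a callable, the function and parameter names are hypothetical, and the threshold value 5 is the one selected experimentally later in the description.

```python
FULL_MODE_SET = ["SKIP", "Inter16x16", "Inter16x8", "Inter8x16", "Inter8x8",
                 "Inter8x8Frext", "Intra16x16", "Intra8x8", "Intra4x4"]
REDUCED_MODE_SET = ["SKIP", "Inter16x16"]
BJND_THRESHOLD = 5  # decision threshold selected experimentally in the description

def select_mode(mb, is_right_view, is_boundary, bjnd, rd_cost):
    """Return the candidate mode with the minimum rate-distortion cost.

    rd_cost(mb, mode) is assumed to evaluate the H.264 RD cost of one mode.
    Left-view macroblocks, boundary macroblocks of the right view and
    right-view macroblocks with BJND >= threshold use the full search;
    the remaining right-view macroblocks test only SKIP and Inter16x16.
    """
    if is_right_view and not is_boundary and bjnd < BJND_THRESHOLD:
        candidates = REDUCED_MODE_SET
    else:
        candidates = FULL_MODE_SET
    return min(candidates, key=lambda mode: rd_cost(mb, mode))
```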
In step ②, the binocular just-noticeable distortion value of the macroblock at coordinate position (i, j) in the k-th right-view frame C_R(k) of {C_R(k)} is computed from the following quantities: d denotes the disparity value of that macroblock; bg_L(i+d, j) denotes the mean luminance of all pixels of the macroblock at coordinate position (i+d, j) in the k-th left-view frame C_L(k) of {C_L(k)}, with bg_L(i+d, j) ∈ [0, 255]; bg_L(i×16+d×16+m, j×16+n) denotes the luminance of the pixel at coordinate position (i×16+d×16+m, j×16+n) in C_L(k), with 0 ≤ m ≤ 15 and 0 ≤ n ≤ 15; eh_L(i+d, j) denotes the mean edge strength of all pixels of the macroblock at coordinate position (i+d, j) in C_L(k); bg_L(i×16+d×16+m−3+h, j×16+n−3+v) denotes the luminance of the pixel at coordinate position (i×16+d×16+m−3+h, j×16+n−3+v) in C_L(k); G_H(h, v) denotes the element at position (h, v) of the 5×5 horizontal Sobel operator G_H and G_V(h, v) the element at position (h, v) of the 5×5 vertical Sobel operator G_V, with 1 ≤ h ≤ 5 and 1 ≤ v ≤ 5; A_limit(bg_L(i+d, j), eh_L(i+d, j)) = A_limit(bg_L(i+d, j)) + K(bg_L(i+d, j)) × eh_L(i+d, j), where K(bg_L(i+d, j)) = −10⁻⁶ × (0.7 × (bg_L(i+d, j))² + 32 × bg_L(i+d, j)) + 0.07; λ denotes the parameter controlling the influence of right-view noise; and n_L(i+d, j) denotes the noise amplitude of the macroblock at coordinate position (i+d, j) in the k-th left-view frame C_L(k) of {C_L(k)}.
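Of the quantities above, only K(bg_L) and the combined threshold A_limit(bg_L, eh_L) are given in closed form; the luminance-only term A_limit(bg_L) and the final combination with λ and n_L are not reproduced in the text. A minimal sketch of the two explicit formulas, with the missing luminance term passed in as a callable (all names are illustrative):

```python
def k_factor(bg):
    """K(bg) = -1e-6 * (0.7 * bg**2 + 32 * bg) + 0.07, with bg in [0, 255]."""
    return -1e-6 * (0.7 * bg * bg + 32.0 * bg) + 0.07

def a_limit(bg, eh, a_limit_luminance):
    """A_limit(bg, eh) = A_limit(bg) + K(bg) * eh.

    a_limit_luminance(bg) is the luminance-only threshold, whose closed
    form is not reproduced in the text; it must be supplied by the
    underlying BJND model.
    """
    return a_limit_luminance(bg) + k_factor(bg) * eh
```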
The parameter λ controlling the influence of right-view noise takes the value 1.25.

The noise amplitude n_L(i+d, j) of the macroblock at coordinate position (i+d, j) in the k-th left-view frame C_L(k) of {C_L(k)} takes the value 0.3.

The decision threshold set in step ③-2 takes the value 5.
Compared with the prior art, the advantage of the invention is as follows: the disparity information between the left-view and right-view videos is first used to determine the binocular just-noticeable distortion value of every macroblock in the non-boundary region of each right-view frame, and macroblock mode selection is then terminated early according to the magnitude of that value. Without degrading rate-distortion performance, this fast coding method effectively improves the coding efficiency of multi-view color video, saving 66.48% to 71.90% of the encoding time, 68.46% on average.
Brief Description of the Drawings
Figure 1 is the flow chart of the method of the invention;

Figure 2a is the first color frame of the color video of viewpoint 4 of the "Book Arrival" test sequence;

Figure 2b is the first color frame of the color video of viewpoint 5 of the "Book Arrival" test sequence;

Figure 2c is the image obtained by magnifying, by a factor of 20, the binocular just-noticeable distortion value of every macroblock in the non-boundary region of the color image shown in Figure 2b;
Figure 3a shows, for the non-boundary regions of the right-view images of the multi-view color video sequences "Alt Moabit", "Balloons", "Ballroom", "Kendo", "Race1", "Xmas3" and "Xmas9", the percentage of macroblocks for which SKIP is selected as the optimal macroblock coding mode among all macroblocks sharing the same binocular just-noticeable distortion value;

Figure 3b shows, for the same sequences and regions, the percentage of macroblocks for which Inter16×16 is selected as the optimal macroblock coding mode among all macroblocks sharing the same binocular just-noticeable distortion value;

Figure 3c shows the corresponding percentage for the Inter16×8 mode;

Figure 3d shows the corresponding percentage for the Inter8×16 mode;

Figure 3e shows the corresponding percentage for the Inter8×8 mode;

Figure 3f shows the corresponding percentage for the Others modes (Inter8×8Frext, Intra16×16, Intra8×8 and Intra4×4);
Figure 4a compares the coding rate-distortion performance of the multi-view color video sequences obtained by coding the "Exit" test sequence with the multi-view standard verification platform and with the method of the invention;

Figure 4b gives the corresponding comparison for the "Vassar" test sequence;

Figure 4c gives the corresponding comparison for the "Champagne tower" test sequence;
Figure 5a is the 10th frame of viewpoint 2 of the "Exit" test sequence reconstructed with the multi-view standard verification platform;

Figure 5b is the 10th frame of viewpoint 2 of the "Exit" test sequence reconstructed with the fast coding method of the invention;

Figure 5c is the 10th frame of viewpoint 2 of the "Vassar" test sequence reconstructed with the multi-view standard verification platform;

Figure 5d is the 10th frame of viewpoint 2 of the "Vassar" test sequence reconstructed with the fast coding method of the invention;

Figure 6 is a schematic diagram of the relationship between the coordinate position of a macroblock and the coordinate positions of the pixels within it.
Detailed Description
The invention is described in further detail below with reference to the drawings and embodiments.
Because multi-view color video coding does not fully exploit the perceptual characteristics of human vision for compression, multi-view color video contains a large amount of redundancy. To further reduce the encoding time of multi-view color video without affecting human visual perception, the invention proposes a fast coding method for multi-view color video based on binocular just-noticeable distortion, which computes the binocular just-noticeable distortion values of the macroblocks of the right-view video before the multi-view color video is coded, so as to reduce the encoding time.
The flow chart of the fast coding method for multi-view color video based on binocular just-noticeable distortion proposed by the invention is shown in Figure 1; the method comprises the following steps:
① Denote the left-view video of the multi-view color video as {C_L(k)} and its right-view video as {C_R(k)}, where C_L(k) is the k-th left-view frame in {C_L(k)}, C_R(k) is the k-th right-view frame in {C_R(k)}, 1 ≤ k ≤ K, and K is the number of frames contained in the left-view video and the right-view video.
② Because occlusion and newly exposed regions exist between the left-view and right-view images, the disparity between the two views cannot be obtained there, and such regions generally lie in the boundary areas of the left-view and right-view images. Therefore, when computing the binocular just-noticeable distortion values of the macroblocks of each right-view frame in {C_R(k)}, the boundary region of the right-view frame (the top and bottom rows of macroblocks and the leftmost and rightmost columns of macroblocks) must be excluded. The specific procedure is as follows: partition each right-view frame in {C_R(k)} into a boundary region and a non-boundary region, where the boundary region consists of the first row, the last row, the first column and the last column of macroblocks of the right-view frame, and the non-boundary region consists of all macroblocks whose horizontal coordinate satisfies 2 ≤ i ≤ W/16 − 1 and whose vertical coordinate satisfies 2 ≤ j ≤ H/16 − 1; then compute the binocular just-noticeable distortion value of every macroblock in the non-boundary region of each right-view frame in {C_R(k)}. Suppose the macroblock at coordinate position (i, j) in the k-th right-view frame C_R(k) of {C_R(k)} belongs to the non-boundary region of C_R(k); the binocular just-noticeable distortion value of that macroblock is then determined, where 2 ≤ i ≤ W/16 − 1, 2 ≤ j ≤ H/16 − 1, W is the width of every left-view frame in {C_L(k)} and every right-view frame in {C_R(k)}, and H is their height; d denotes the disparity value of the macroblock at coordinate position (i, j) in C_R(k), obtained with an existing disparity estimation algorithm based on the sum of squared differences (SSD) that finds, for every macroblock in the non-boundary region of each right-view frame in {C_R(k)}, the disparity to the corresponding macroblock of the left-view frame at the same time instant in {C_L(k)}; bg_L(i+d, j) denotes the mean luminance of all pixels of the macroblock at coordinate position (i+d, j) in the k-th left-view frame C_L(k) of {C_L(k)}, with bg_L(i+d, j) ∈ [0, 255]; bg_L(i×16+d×16+m, j×16+n) denotes the luminance of the pixel at coordinate position (i×16+d×16+m, j×16+n) in C_L(k) (Figure 6 illustrates the relationship between the coordinate position of a macroblock and the coordinate positions of the pixels within it), with 0 ≤ m ≤ 15 and 0 ≤ n ≤ 15; eh_L(i+d, j) denotes the mean edge strength of all pixels of the macroblock at coordinate position (i+d, j) in C_L(k); bg_L(i×16+d×16+m−3+h, j×16+n−3+v) denotes the luminance of the pixel at coordinate position (i×16+d×16+m−3+h, j×16+n−3+v) in C_L(k); G_H(h, v) denotes the element at position (h, v) of the 5×5 horizontal Sobel operator G_H and G_V(h, v) the element at position (h, v) of the 5×5 vertical Sobel operator G_V, with 1 ≤ h ≤ 5 and 1 ≤ v ≤ 5; A_limit(bg_L(i+d, j), eh_L(i+d, j)) = A_limit(bg_L(i+d, j)) + K(bg_L(i+d, j)) × eh_L(i+d, j), where K(bg_L(i+d, j)) = −10⁻⁶ × (0.7 × (bg_L(i+d, j))² + 32 × bg_L(i+d, j)) + 0.07; λ denotes the parameter controlling the influence of right-view noise and takes the value 1.25 in this embodiment; and n_L(i+d, j) denotes the noise amplitude of the macroblock at coordinate position (i+d, j) in the k-th left-view frame C_L(k) of {C_L(k)} and takes the value 0.3 in this embodiment.
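A minimal sketch of the block-level SSD disparity search mentioned above (illustrative only: it assumes horizontal-only disparity, a search to one side, and a disparity expressed in macroblock units so that it matches the bg_L(i+d, j) indexing used in the text; the function and parameter names are hypothetical):

```python
import numpy as np

def block_disparity_ssd(left_y, right_y, i, j, mb=16, max_disp_mb=4):
    """SSD disparity of the right-view macroblock at (i, j) (1-based indices).

    left_y and right_y are the luminance planes of the left-view and
    right-view frames at the same time instant. The left-view match is
    searched along the same macroblock row, and the returned d is the
    horizontal offset in macroblock units (a simplifying assumption;
    SSD disparity estimation normally works at pixel precision).
    """
    y0, x0 = (j - 1) * mb, (i - 1) * mb
    block = right_y[y0:y0 + mb, x0:x0 + mb].astype(np.int64)

    best_d, best_ssd = 0, None
    for d in range(0, max_disp_mb + 1):          # search towards the right only
        xl = x0 + d * mb
        if xl + mb > left_y.shape[1]:
            break
        cand = left_y[y0:y0 + mb, xl:xl + mb].astype(np.int64)
        ssd = int(np.sum((block - cand) ** 2))   # sum of squared differences
        if best_ssd is None or ssd < best_ssd:
            best_d, best_ssd = d, ssd
    return best_d
```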
Figure 2a shows the first color frame (left-view image) of the color video of viewpoint 4 of the "Book Arrival" test sequence, and Figure 2b shows the first color frame (right-view image) of the color video of viewpoint 5 of the same sequence. Since the H.264 encoding process operates on macroblocks, the invention computes a binocular just-noticeable distortion value for every macroblock in the non-boundary region of the color image shown in Figure 2b. Because these values are small and cannot be viewed directly, they are magnified by a factor of 20 for display; Figure 2c shows the resulting image.
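The display trick used for Figure 2c (scaling the per-macroblock values by 20 so that they become visible) can be reproduced with a few lines of NumPy; this is only an illustration, assuming the BJND values are stored in a 2-D array with one entry per macroblock:

```python
import numpy as np

def bjnd_visualization(bjnd_map, mb=16, gain=20):
    """Expand a per-macroblock BJND map to pixel resolution and scale it by 20.

    bjnd_map holds one BJND value per macroblock (boundary macroblocks can
    simply be left at zero). The result is an 8-bit grayscale image in which
    brighter blocks tolerate more distortion.
    """
    img = np.kron(bjnd_map, np.ones((mb, mb)))   # repeat each value over a 16x16 block
    return np.clip(img * gain, 0, 255).astype(np.uint8)
```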
③ On the multi-view video coding verification model JMVC, code every macroblock of every left-view frame in {C_L(k)} and every right-view frame in {C_R(k)} with the hierarchical B-picture (HBP) prediction structure. During coding, the optimal coding mode of each macroblock is selected as follows:

③-1. Define the macroblock currently to be coded as the current macroblock.

③-2. When the current macroblock belongs to {C_L(k)}, the encoder uses the existing H.264 mode-selection process to search the SKIP, Inter16×16, Inter16×8, Inter8×16, Inter8×8, Inter8×8Frext, Intra16×16, Intra8×8 and Intra4×4 macroblock coding modes and selects, among them, the mode with the minimum rate-distortion cost as the optimal coding mode of the current macroblock.

When the current macroblock belongs to {C_R(k)}, first determine whether it lies in the boundary region or the non-boundary region. If it lies in the boundary region, the encoder searches the full set of modes listed above with the existing H.264 mode-selection process and selects the mode with the minimum rate-distortion cost as the optimal coding mode of the current macroblock. If it lies in the non-boundary region, determine whether its binocular just-noticeable distortion value is greater than or equal to the set decision threshold: if so, the encoder again searches the full set of modes and selects the one with the minimum rate-distortion cost as the optimal coding mode; otherwise, the encoder searches only the SKIP and Inter16×16 modes and selects, of these two, the mode with the minimum rate-distortion cost as the optimal coding mode of the current macroblock.

③-3. Take the next macroblock to be coded as the current macroblock and return to step ③-2, until every macroblock of every left-view frame in {C_L(k)} and every right-view frame in {C_R(k)} has been coded.
In this embodiment, the decision threshold set in step ③-2 takes the value 5. This value was obtained experimentally, as follows:
③-2a. Take 46 frames of the left-view video and 46 frames of the right-view video of each of the multi-view color video sequences "Alt Moabit", "Balloons", "Ballroom", "Kendo", "Race1", "Xmas3" and "Xmas9", and compute the binocular just-noticeable distortion value of every macroblock in the non-boundary region of every right-view frame of each sequence. Record the minimum and maximum of these per-macroblock values as BJND_min and BJND_max, respectively. Then, for each sequence, count over the 46 right-view frames the number of macroblocks whose binocular just-noticeable distortion value equals each value from BJND_min to BJND_max, written as the set {N_BJND_min, …, N_BJND_max}, where N_BJND_min is the number of macroblocks in the 46 right-view frames of a sequence whose value equals BJND_min and N_BJND_max is the number whose value equals BJND_max.
③-2b. Using the same 46 left-view frames and 46 right-view frames of the sequences "Alt Moabit", "Balloons", "Ballroom", "Kendo", "Race1", "Xmas3" and "Xmas9", encode the left-view and right-view videos of these sequences on the multi-view video coding verification model JMVC with the HBP prediction structure, and record during encoding the optimal macroblock coding mode of every macroblock in the 46 right-view frames of each sequence.
③-2c. Consider all macroblocks in the non-boundary regions of the 46 right-view frames of the sequences "Alt Moabit", "Balloons", "Ballroom", "Kendo", "Race1", "Xmas3" and "Xmas9". When the binocular just-noticeable distortion value of a macroblock equals a given value BJND_x in the range from BJND_min to BJND_max, the corresponding element of the set {N_BJND_min, …, N_BJND_max} is written N_BJND_x; that is, the number of macroblocks whose just-noticeable distortion value equals BJND_x is N_BJND_x. Among these N_BJND_x macroblocks, the number whose optimal coding mode is SKIP is written N_SKIP, the number whose optimal coding mode is Inter16×16 is written N_Inter16×16, the number whose optimal coding mode is Inter16×8 is written N_Inter16×8, the number whose optimal coding mode is Inter8×16 is written N_Inter8×16, the number whose optimal coding mode is Inter8×8 is written N_Inter8×8, and the number whose optimal coding mode is one of the Others modes (Inter8×8Frext, Intra16×16, Intra8×8 and Intra4×4) is written N_Others.
③-2d. Compute N_SKIP / N_BJND_x, written P_SKIP, i.e. the percentage of macroblocks for which SKIP is selected as the optimal coding mode among the N_BJND_x macroblocks whose just-noticeable distortion value equals BJND_x. The analogous percentages for the other modes are computed in the same way below; a small sketch of how these statistics can be gathered follows.
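A sketch of this bookkeeping, assuming the encoder run has produced a list of (BJND value, optimal mode) pairs for the non-boundary macroblocks (names are illustrative):

```python
from collections import Counter, defaultdict

def mode_percentages(samples):
    """samples: iterable of (bjnd_value, best_mode) pairs.

    Returns {bjnd_value: {mode: percentage}}, i.e. for each BJND value x the
    ratio P_mode = N_mode / N_BJND_x expressed in percent, which is what
    Figures 3a to 3f plot per mode.
    """
    per_bjnd = defaultdict(Counter)
    for bjnd, mode in samples:
        per_bjnd[bjnd][mode] += 1
    result = {}
    for bjnd, counts in per_bjnd.items():
        n_bjnd_x = sum(counts.values())           # N_BJND_x
        result[bjnd] = {m: 100.0 * n / n_bjnd_x for m, n in counts.items()}
    return result
```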
Figure 3a shows, for the non-boundary regions of the 46 right-view frames of each of the multi-view color video sequences "Alt Moabit", "Balloons", "Ballroom", "Kendo", "Race1", "Xmas3" and "Xmas9", the percentage of macroblocks for which SKIP is selected as the optimal coding mode among all macroblocks sharing the same binocular just-noticeable distortion value, i.e. the variation of P_SKIP over the right-view images of these sequences. As Figure 3a shows, P_SKIP is generally above 50%, a large proportion; in other words, for all non-boundary macroblocks of the right-view images whose just-noticeable distortion value lies between BJND_min and BJND_max, the existing H.264 mode-selection process chooses the SKIP mode with high frequency. Therefore the SKIP mode must be searched for all such macroblocks.
Compute N_Inter16×16 / N_BJND_x, written P_Inter16×16, i.e. the percentage of macroblocks for which Inter16×16 is selected as the optimal coding mode among the N_BJND_x macroblocks whose just-noticeable distortion value equals BJND_x.

Figure 3b shows, for the non-boundary regions of the 46 right-view frames of each sequence, the percentage of macroblocks for which Inter16×16 is selected as the optimal coding mode among all macroblocks sharing the same binocular just-noticeable distortion value. As Figure 3b shows, P_Inter16×16 is generally above 20%, also a large proportion; that is, the Inter16×16 mode is frequently chosen for all non-boundary macroblocks of the right-view images whose just-noticeable distortion value lies between BJND_min and BJND_max, so the Inter16×16 mode must likewise be searched for all such macroblocks.
Compute N_Inter16×8 / N_BJND_x, written P_Inter16×8, i.e. the percentage of macroblocks for which Inter16×8 is selected as the optimal coding mode among the N_BJND_x macroblocks whose just-noticeable distortion value equals BJND_x.

Figure 3c shows, for the non-boundary regions of the 46 right-view frames of each sequence, the percentage of macroblocks for which Inter16×8 is selected as the optimal coding mode among all macroblocks sharing the same binocular just-noticeable distortion value. As Figure 3c shows, P_Inter16×8 generally accounts for only about 5%. As the binocular just-noticeable distortion value increases, the distortion tolerated by the human eye decreases and the macroblock needs to be partitioned more finely, so P_Inter16×8 increases accordingly: when 0 ≤ BJND ≤ 5, P_Inter16×8 stays at a small value of about 2%, which indicates that when the binocular just-noticeable distortion value of a macroblock is below a certain threshold the Inter16×8 mode search can be skipped during mode selection, whereas when BJND ≥ 6, P_Inter16×8 rises to about 5%, which indicates that when the value exceeds that threshold the macroblock needs the Inter16×8 mode search. Here BJND denotes the binocular just-noticeable distortion value.
Compute N_Inter8×16 / N_BJND_x, written P_Inter8×16, i.e. the percentage of macroblocks for which Inter8×16 is selected as the optimal coding mode among the N_BJND_x macroblocks whose just-noticeable distortion value equals BJND_x.

Figure 3d shows, for the non-boundary regions of the 46 right-view frames of each sequence, the percentage of macroblocks for which Inter8×16 is selected as the optimal coding mode among all macroblocks sharing the same binocular just-noticeable distortion value. As Figure 3d shows, P_Inter8×16 generally accounts for about 6%, and it increases as the binocular just-noticeable distortion value increases and the macroblock needs finer partitioning: when 0 ≤ BJND ≤ 5, P_Inter8×16 stays at a small value of about 3%, which indicates that below the threshold the Inter8×16 mode search can be skipped, whereas when BJND ≥ 6 it rises to about 6%, which indicates that above the threshold the macroblock needs the Inter8×16 mode search.
Compute N_Inter8×8 / N_BJND_x, written P_Inter8×8, i.e. the percentage of macroblocks for which Inter8×8 is selected as the optimal coding mode among the N_BJND_x macroblocks whose just-noticeable distortion value equals BJND_x.

Figure 3e shows, for the non-boundary regions of the 46 right-view frames of each sequence, the percentage of macroblocks for which Inter8×8 is selected as the optimal coding mode among all macroblocks sharing the same binocular just-noticeable distortion value. As Figure 3e shows, P_Inter8×8 generally accounts for about 4%, and it increases as the binocular just-noticeable distortion value increases and the macroblock needs finer partitioning: when 0 ≤ BJND ≤ 5, P_Inter8×8 stays at a small value of about 1.5%, which indicates that below the threshold the Inter8×8 mode search can be skipped, whereas when BJND ≥ 6 it rises to about 4%, which indicates that above the threshold the macroblock needs the Inter8×8 mode search.
Compute N_Others / N_BJND_x, written P_Others, i.e. the percentage of macroblocks for which one of the Others modes is selected as the optimal coding mode among the N_BJND_x macroblocks whose just-noticeable distortion value equals BJND_x.

Figure 3f shows, for the non-boundary regions of the 46 right-view frames of each sequence, the percentage of macroblocks for which an Others mode (Inter8×8Frext, Intra16×16, Intra8×8 or Intra4×4) is selected as the optimal coding mode among all macroblocks sharing the same binocular just-noticeable distortion value. As Figure 3f shows, P_Others generally accounts for about 6%, and it increases as the binocular just-noticeable distortion value increases and the macroblock needs finer partitioning: when 0 ≤ BJND ≤ 5, P_Others stays at a small value of about 5%, which indicates that below the threshold the Others mode search (Inter8×8Frext, Intra16×16, Intra8×8 and Intra4×4) can be skipped, whereas when BJND ≥ 6 it rises to about 8%, which indicates that above the threshold the macroblock needs the Others mode search.
From the above analysis, the decision threshold is set to 5.
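A minimal sketch of the early-termination idea described above is given next: macroblocks whose BJND value does not exceed the threshold of 5 skip the finer partition and intra mode searches, while the remaining macroblocks keep the full candidate set. The mode lists and the evaluate_mode cost function are placeholders, not the JMVC interface, and the exact candidate ordering of the disclosed method may differ.

```python
# Sketch of BJND-based early termination of macroblock mode selection.
# evaluate_mode(macroblock, mode) is assumed to return the rate-distortion cost
# of coding the macroblock with the given mode.
BJND_THRESHOLD = 5

FAST_MODES = ["SKIP", "Inter16x16", "Inter16x8", "Inter8x16"]
FINE_MODES = ["Inter8x8", "Inter8x8Frext", "Intra16x16", "Intra8x8", "Intra4x4"]

def select_mode(macroblock, bjnd_value, evaluate_mode):
    """Return the candidate mode with the lowest rate-distortion cost."""
    candidates = list(FAST_MODES)
    if bjnd_value > BJND_THRESHOLD:    # large BJND: keep the full candidate set
        candidates += FINE_MODES
    return min(candidates, key=lambda mode: evaluate_mode(macroblock, mode))
```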
To verify the effectiveness and feasibility of the fast encoding method of the present invention, the color video test sequences "Exit" and "Vassar" (viewpoints 0 and 2) from the MERL laboratory and "Champagne tower" (viewpoints 39 and 41) from Nagoya University, Japan, were first selected as the original color video sequences. The proposed fast encoding method was then applied to viewpoint 2 of "Exit" and "Vassar" and to viewpoint 41 of "Champagne tower" to obtain the color video sequences reconstructed after fast encoding.
The performance of the method of the present invention is measured here in terms of the encoding time of the color video sequence, the encoding bit rate, and the peak signal-to-noise ratio (PSNR) of the reconstructed color video sequence.
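For reference, the luma PSNR used in this comparison is commonly computed as below for 8-bit frames; this is the generic PSNR definition, not code taken from the verification platform.

```python
# Generic PSNR between an original and a reconstructed frame (8-bit samples),
# with frames assumed to be stored as NumPy arrays of equal shape.
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    mse = np.mean((original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")            # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)
```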
Table 1 lists, for the original color video sequences above, the encoding time, encoding bit rate and reconstruction quality obtained with the JMVC verification platform and with the method of the present invention under the same conditions (Basis QP = 22). From the data in Table 1, in terms of encoding time the method of the present invention saves 66.48% to 71.90% of the encoding time compared with the multi-view standard verification platform, 68.46% on average. The PSNR of the color video sequences reconstructed after fast encoding with the present invention is on average 0.037 dB lower than that of the sequences reconstructed by the JMVC verification platform, with a maximum loss of 0.04 dB. The bit rate after fast encoding with the present invention increases by 0.46% on average compared with the JMVC verification platform, with a maximum increase of 1.27% and a minimum of -0.20%, which is almost negligible.
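The three quantities reported in Table 1 (encoding-time saving, PSNR change and bit-rate change of the proposed method relative to the JMVC run) can be computed per sequence as in the sketch below; the units and the example values are assumptions for illustration, not measured data.

```python
# Per-sequence comparison metrics in the style of Table 1.
# Assumed units: time in seconds, PSNR in dB, bit rate in kbps.
def table1_metrics(ref_time, fast_time, ref_psnr, fast_psnr, ref_rate, fast_rate):
    time_saving = 100.0 * (ref_time - fast_time) / ref_time   # positive = faster
    delta_psnr = fast_psnr - ref_psnr                          # negative = quality loss
    delta_rate = 100.0 * (fast_rate - ref_rate) / ref_rate     # positive = more bits
    return time_saving, delta_psnr, delta_rate

# e.g. table1_metrics(3600, 1135, 40.12, 40.08, 512.0, 518.5)
```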
Figs. 4a, 4b and 4c compare the rate-distortion performance of the color video sequences obtained by encoding the "Exit", "Vassar" and "Champagne tower" test sequences with the JMVC verification platform and with the method of the present invention; the comparison covers both the PSNR of the encoded sequences and the bit rate used. For each of the three sequences, along the horizontal axis, when the bit rate used to encode the right-viewpoint images is the same, the PSNR of the color video sequences reconstructed by the JMVC verification platform and by the method of the present invention is essentially identical; along the vertical axis, when the PSNR of the reconstructed sequences is the same, the bit rate used to encode the right-viewpoint images is essentially the same. In summary, the fast encoding method of the present invention essentially preserves the rate-distortion performance of the JMVC verification platform.
Table 1. Comparison of encoding time, encoding bit rate and reconstructed color video quality for the original color video sequences encoded with the JMVC verification platform and with the method of the present invention.
Subjectively, the quality of the color video sequences reconstructed by the JMVC verification platform and by the method of the present invention is also compared. Fig. 5a shows frame 10 of viewpoint 2 of the "Exit" test sequence reconstructed with the multi-view standard verification platform, and Fig. 5b shows the same frame reconstructed with the fast encoding method of the present invention; visually, there is almost no difference between Fig. 5a and Fig. 5b. Fig. 5c shows frame 10 of viewpoint 2 of the "Vassar" test sequence reconstructed with the multi-view standard verification platform, and Fig. 5d shows the same frame reconstructed with the fast encoding method of the present invention; again, there is almost no visible difference between Fig. 5c and Fig. 5d. Therefore, the right-viewpoint images of the multi-view video sequences reconstructed by the fast encoding method of the present invention and by the JMVC verification platform are visually indistinguishable.
In summary, for the right-viewpoint images of a multi-view video sequence, the method of the present invention greatly reduces their encoding time while keeping the rate-distortion performance consistent with the JMVC verification platform, thereby reducing the overall encoding time of the multi-view video sequence.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310325370.8A CN103442226B (en) | 2013-07-30 | 2013-07-30 | The multiple views color video fast encoding method of distortion just can be perceived based on binocular |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103442226A CN103442226A (en) | 2013-12-11 |
CN103442226B true CN103442226B (en) | 2016-08-17 |
Family
ID=49695886
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310325370.8A Active CN103442226B (en) | 2013-07-30 | 2013-07-30 | The multiple views color video fast encoding method of distortion just can be perceived based on binocular |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103442226B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105611272B (en) * | 2015-12-28 | 2017-05-03 | 宁波大学 | Eye exactly perceptible stereo image distortion analyzing method based on texture complexity |
CN112969066B (en) * | 2021-01-29 | 2023-09-01 | 北京博雅慧视智能技术研究院有限公司 | Prediction unit selection method and device, electronic equipment and medium |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008136607A1 (en) * | 2007-05-02 | 2008-11-13 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding and decoding multi-view video data |
CN101404766A (en) * | 2008-11-05 | 2009-04-08 | 宁波大学 | Multi-view point video signal encoding method |
CN102724525A (en) * | 2012-06-01 | 2012-10-10 | 宁波大学 | Depth video coding method on basis of foveal JND (just noticeable distortion) model |
Non-Patent Citations (2)
Title |
---|
Zhao Y, Chen Z Z, Zhu C. Binocular Just-Noticeable-Difference Model for Stereoscopic Images. IEEE Signal Processing Letters, 2010-10-28, Vol. 18, No. 1, pp. 19-22. *
JND model and its application in video coding; Lian Fengzong; China Master's Theses Full-text Database (Information Science and Technology), 2012-05-15, No. 05, I136-364. *
Also Published As
Publication number | Publication date |
---|---|
CN103442226A (en) | 2013-12-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103179405B (en) | A kind of multi-view point video encoding method based on multi-level region-of-interest | |
CN102801996B (en) | Rapid depth map coding mode selection method based on JNDD (Just Noticeable Depth Difference) model | |
CN101374242B (en) | A Depth Image Coding and Compression Method Applied to 3DTV and FTV Systems | |
CN103237226B (en) | A kind of stereoscopic video macroblock loses error concealing method | |
CN101374243B (en) | Depth map encoding compression method for 3DTV and FTV system | |
CN101729891B (en) | Method for encoding multi-view depth video | |
CN104602028B (en) | A kind of three-dimensional video-frequency B frames entire frame loss error concealing method | |
CN102724525B (en) | Depth video coding method on basis of foveal JND (just noticeable distortion) model | |
CN103002306B (en) | Depth image coding method | |
CN103024381B (en) | A kind of macro block mode fast selecting method based on proper discernable distortion | |
CN105430415A (en) | A fast intra-frame coding method for 3D-HEVC depth video | |
CN102271254A (en) | Depth image preprocessing method | |
CN105306954B (en) | A kind of perception stereo scopic video coding based on parallax minimum appreciable error model | |
CN102307304A (en) | Image segmentation based error concealment method for entire right frame loss in stereoscopic video | |
CN106507116A (en) | A 3D‑HEVC Coding Method Based on 3D Saliency Information and View Synthesis Prediction | |
CN105120290A (en) | Fast coding method for depth video | |
CN102710949B (en) | Visual sensation-based stereo video coding method | |
CN103414889B (en) | A kind of method for controlling three-dimensional video code rates based on the proper discernable distortion of binocular | |
CN103442226B (en) | The multiple views color video fast encoding method of distortion just can be perceived based on binocular | |
CN106331707B (en) | Asymmetric perceptual video coding system and method based on just noticeable distortion model | |
CN111464805B (en) | Three-dimensional panoramic video rapid coding method based on panoramic saliency | |
CN102098516B (en) | Deblocking filtering method for multi-view video decoder | |
CN105141967B (en) | Based on the quick self-adapted loop circuit filtering method that can just perceive distortion model | |
CN105007494B (en) | Wedge-shaped Fractionation regimen selection method in a kind of frame of 3D video depths image | |
CN104394399B (en) | Three limit filtering methods of deep video coding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |