CN114567776B - Video low-complexity coding method based on panoramic visual perception characteristics - Google Patents
- Publication number: CN114567776B
- Application number: CN202210157533.5A
- Authority: CN (China)
- Prior art keywords: current frame, pixel point, pixel, coding unit
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- H04N19/172 — Adaptive coding characterised by the coding unit, the unit being a picture, frame or field
- H04N19/124 — Quantisation
- H04N19/14 — Coding unit complexity, e.g. amount of activity or edge presence estimation
- H04N19/182 — Adaptive coding characterised by the coding unit, the unit being a pixel
- H04N19/42 — Methods or arrangements characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
Description
Technical Field
The present invention relates to video coding, and in particular to a low-complexity video coding method based on panoramic visual-perception characteristics.
Background Art
In recent years, panoramic video systems have become widely popular for their immersive viewing experience and have broad application prospects in fields such as virtual reality and driving simulation. However, current panoramic video systems still suffer from excessively high coding complexity, which poses a major challenge to their deployment. Reducing coding complexity has therefore become a pressing technical problem in this field.

Existing low-complexity coding algorithms for panoramic video do not fully exploit the perceptual characteristics of the human visual system (Human Visual System, HVS) or the specific properties of panoramic video, and thus struggle to reach optimal coding performance. The main goal of video coding is to minimize the bit rate while maintaining a given video quality, or, when the bit rate is constrained, to encode with the least possible distortion. Consequently, combining the perceptual characteristics of the human visual system with the properties of panoramic video to guide the selection of coding parameters has become an important research direction for reducing coding complexity.
Summary of the Invention
The technical problem to be solved by the present invention is to provide a low-complexity video coding method based on panoramic visual-perception characteristics that can effectively save coding bit rate and thereby effectively reduce coding complexity.

The technical solution adopted by the present invention to solve the above problem is a low-complexity video coding method based on panoramic visual-perception characteristics, characterized by comprising the following steps:
Step 1: define the video frame currently to be encoded in the ERP-projection panoramic video as the current frame, where each video frame of the ERP-projection panoramic video has width W and height H.
Step 2: determine whether the current frame is the first video frame; if so, encode the current frame with the original algorithm of the HEVC video encoder and then go to step 10; otherwise, go to step 3.
Step 3: compute the spatial JND threshold of every pixel in the current frame to obtain the panoramic spatial JND threshold map of the current frame, denoted G1; the value of each pixel in G1 is the spatial JND threshold of the corresponding pixel in the current frame. Also compute the weighted gradient of every pixel in the current frame to obtain the weighted gradient map of the current frame, denoted G2; the value of each pixel in G2 is the weighted gradient value of the corresponding pixel in the current frame.
Step 4: compute the spatial perception factor of every pixel in the current frame, denoting the factor of the pixel at (x, y) as δA(x, y), with δA(x, y) = G1(x, y). Compute the motion perception factor of every pixel, denoting the factor of the pixel at (x, y) as δT(x, y); δT(x, y) is obtained from the weighted gradient value G2(x, y), the frame average SF, and the motion perception constant ε. Then compute the spatiotemporal weighted perception factor of every pixel, δ(x, y) = δA(x, y) × δT(x, y), and the average of the spatiotemporal weighted perception factors over all pixels of the current frame, denoted Sδ. Finally, compute the dimension weight of every pixel, denoting the weight of the pixel at (x, y) as wERP(x, y), obtained from the pixel's latitude via the cosine function. Here 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1; G1(x, y) is the value of the pixel at (x, y) in G1, i.e. the spatial JND threshold of that pixel in the current frame; G2(x, y) is the value of the pixel at (x, y) in G2, i.e. the weighted gradient value of that pixel in the current frame; SF is the average of the pixel values of G2, i.e. the average weighted gradient value of the current frame; ε is the motion perception constant, ε ∈ [1, 2]; cos() is the cosine function.
Step 5: define the largest coding unit currently to be processed in the current frame as the current largest coding unit.
Step 6: compute the average of the spatiotemporal weighted perception factors over all pixels of the current largest coding unit, denoted Sδ_LCU; then compute the Lagrange-multiplier adjustment factor of the current largest coding unit based on the spatiotemporal weighted perception factor, denoted ΨLCU, from Sδ_LCU and the adjustment parameters KLCU and BLCU; then compute the quantization-parameter change of the current largest coding unit based on the spatiotemporal weighted perception factor, denoted ΔQP1, with ΔQP1 = 3log2(ΨLCU), where KLCU ∈ (0, 1) and BLCU ∈ (0, 1).
Step 7: compute the average of the dimension weights over all pixels of the current largest coding unit, denoted SwERP_LCU; then compute the quantization-parameter change of the current largest coding unit based on the dimension weight, denoted ΔQP2, from SwERP_LCU and the adjustment parameters a and b, where a ∈ (0, 1), b ∈ (0, 1), and b < a.
Step 8: compute the new coding quantization parameter of the current largest coding unit, denoted QPnew, from the original coding quantization parameter QPorg and the two changes ΔQP1 and ΔQP2, applying floor rounding; then update the coding quantization parameter of the current largest coding unit to QPnew, and encode the current largest coding unit.
Step 9: take the next largest coding unit to be processed in the current frame as the current largest coding unit and return to step 6, until all largest coding units in the current frame have been processed; then go to step 10.
Step 10: take the next video frame to be encoded in the ERP-projection panoramic video as the current frame and return to step 2, until all video frames of the ERP-projection panoramic video have been encoded.
In step 3, G1 is obtained by applying a spatial just-noticeable-distortion model to compute the spatial JND threshold of every pixel in the current frame.
In step 3, G2 is obtained as follows: the value of the pixel at (x, y) in G2, denoted G2(x, y), is a weighted combination of the horizontal, vertical, and temporal gradient values of the pixel at (x, y) in the current frame, where 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1, and G2(x, y) is also the weighted gradient value of the pixel at (x, y) in the current frame. The directional gradient values are computed with the 3D-Sobel operator; α, β, and γ are the gradient adjustment factors of the horizontal, vertical, and temporal directions, respectively, with α + β + γ = 1.
Compared with the prior art, the present invention has the following advantages:
The method of the present invention fully accounts for the perceptual characteristics of the human visual system and the properties of panoramic video. It uses the spatial JND threshold (visual-perception information) as the spatial perception factor, derives the motion perception factor from the weighted gradient value (visual-perception information), and from these computes the average spatiotemporal weighted perception factor over all pixels of each largest coding unit. Based on rate-distortion optimization theory, it computes the Lagrange-multiplier adjustment factor of the largest coding unit from the spatiotemporal weighted perception factor, and from that the corresponding quantization-parameter change. At the same time, the method accounts for the dimension-weight characteristics of ERP-projection panoramic video and computes a second quantization-parameter change based on the dimension weight. A new coding quantization parameter is then computed from the two changes and applied during encoding. The method thus adaptively adjusts the coding quantization parameter according to the spatiotemporal and latitude characteristics of each largest coding unit. Experimental tests show that it effectively reduces the coding bit rate while preserving coding quality, thereby effectively reducing coding complexity, and significantly improves rate-distortion performance; the gain is especially pronounced when the initial coding quantization parameter is small.
Brief Description of the Drawings
FIG. 1 is the overall implementation block diagram of the method of the present invention.
Detailed Description
The present invention is described in further detail below with reference to the embodiment shown in the accompanying drawing.
The overall implementation block diagram of the proposed low-complexity video coding method based on panoramic visual-perception characteristics is shown in FIG. 1. The method comprises the following steps:
Step 1: define the video frame currently to be encoded in the ERP (Equirectangular Projection) panoramic video as the current frame, where each video frame of the ERP-projection panoramic video has width W and height H.
Step 2: determine whether the current frame is the first video frame; if so, encode the current frame with the original algorithm of the HEVC video encoder and then go to step 10; otherwise, go to step 3.
Step 3: compute the spatial JND (Just Noticeable Distortion) threshold of every pixel in the current frame to obtain the panoramic spatial JND threshold map G1, in which each pixel value is the spatial JND threshold of the corresponding pixel of the current frame; also compute the weighted gradient of every pixel to obtain the weighted gradient map G2, in which each pixel value is the weighted gradient value of the corresponding pixel of the current frame. A larger spatial JND threshold indicates a larger just-noticeable distortion, i.e. stronger spatial masking in the corresponding region; conversely, a smaller spatial JND threshold indicates weaker spatial masking.
In this embodiment, G1 is obtained by applying an existing classic spatial just-noticeable-distortion model to compute the spatial JND threshold of every pixel in the current frame.
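As a concrete illustration, the luminance-adaptation component of a classic spatial JND model (Chou and Li style) can be sketched as follows. The patent text only states that an existing classic spatial JND model is used, so the specific model, the 5×5 background window, and the piecewise thresholds below are assumptions for illustration, and a full model would add a texture-masking term:

```python
import numpy as np

def luminance_adaptation_jnd(frame):
    """Hedged sketch: background-luminance component of a classic spatial
    JND model. The exact model used by the patent is not specified here;
    this piecewise Chou-Li-style threshold is an illustrative assumption."""
    H, W = frame.shape
    # Background luminance: mean over a 5x5 neighbourhood (simple box filter).
    pad = np.pad(frame.astype(np.float64), 2, mode='edge')
    bg = np.zeros((H, W), dtype=np.float64)
    for dy in range(5):
        for dx in range(5):
            bg += pad[dy:dy + H, dx:dx + W]
    bg /= 25.0
    # Piecewise luminance-adaptation threshold: dark regions tolerate more
    # distortion; above mid-grey the threshold grows slowly and linearly.
    low = bg <= 127
    jnd = np.empty_like(bg)
    jnd[low] = 17.0 * (1.0 - np.sqrt(bg[low] / 127.0)) + 3.0
    jnd[~low] = 3.0 / 128.0 * (bg[~low] - 127.0) + 3.0
    return jnd
```

For a uniform mid-grey frame (value 127) every threshold collapses to the base value 3, while a black frame yields the maximum threshold 20, matching the intuition that dark regions mask distortion more strongly.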
In this embodiment, G2 is obtained as follows: the value of the pixel at (x, y) in G2, denoted G2(x, y), is a weighted combination of the horizontal, vertical, and temporal gradient values of the pixel at (x, y) in the current frame, where 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1, and G2(x, y) is also the weighted gradient value of the pixel at (x, y) in the current frame. The temporal gradient value is the gradient, along the temporal direction, between the pixel at (x, y) in the current frame and the pixel at (x, y) in the previous video frame. The directional gradient values are computed with the existing 3D-Sobel operator; α, β, and γ are the gradient adjustment factors of the horizontal, vertical, and temporal directions, with α + β + γ = 1. In this embodiment α = 0.25, β = 0.25, and γ = 0.5.
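The weighted gradient map can be sketched as below. Since the exact 3D-Sobel formulation is not reproduced in this text, the sketch approximates the horizontal and vertical gradients with plain 2D Sobel responses and the temporal gradient with a frame difference; only the weighted combination with α + β + γ = 1 (here 0.25, 0.25, 0.5) follows the text directly:

```python
import numpy as np

def weighted_gradient_map(cur, prev, alpha=0.25, beta=0.25, gamma=0.5):
    """Hedged sketch of G2. The patent uses a 3D-Sobel operator; here the
    horizontal/vertical gradients are 2D Sobel responses and the temporal
    gradient is a plain frame difference, an approximation rather than
    the patented operator."""
    assert abs(alpha + beta + gamma - 1.0) < 1e-9  # alpha + beta + gamma = 1
    f = cur.astype(np.float64)
    # 3x3 Sobel kernels for the horizontal and vertical directions.
    kh = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    kv = kh.T
    H, W = f.shape
    pad = np.pad(f, 1, mode='edge')
    gh = np.zeros_like(f)
    gv = np.zeros_like(f)
    for dy in range(3):
        for dx in range(3):
            win = pad[dy:dy + H, dx:dx + W]
            gh += kh[dy, dx] * win
            gv += kv[dy, dx] * win
    gt = f - prev.astype(np.float64)  # temporal direction (frame difference)
    # Weighted combination of the absolute directional gradient values.
    return alpha * np.abs(gh) + beta * np.abs(gv) + gamma * np.abs(gt)
```

On a flat frame the spatial Sobel responses vanish, so the map reduces to γ times the absolute frame difference, which is the behaviour one would expect from the γ = 0.5 temporal emphasis.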
Step 4: compute the spatial perception factor of every pixel in the current frame, δA(x, y) = G1(x, y); compute the motion perception factor δT(x, y) of every pixel from the weighted gradient value G2(x, y), the frame average SF, and the motion perception constant ε; then compute the spatiotemporal weighted perception factor δ(x, y) = δA(x, y) × δT(x, y) and its average over all pixels of the current frame, denoted Sδ; finally, compute the dimension weight wERP(x, y) of every pixel from its latitude via the cosine function. Here 0 ≤ x ≤ W−1 and 0 ≤ y ≤ H−1; G1(x, y) is the value of the pixel at (x, y) in G1, i.e. the spatial JND threshold of that pixel in the current frame; G2(x, y) is the value of the pixel at (x, y) in G2, i.e. the weighted gradient value of that pixel in the current frame; SF is the average pixel value of G2, i.e. the average weighted gradient value of the current frame; ε is the motion perception constant, ε ∈ [1, 2], and ε = 1 in this embodiment; cos() is the cosine function; π = 3.14….
In this embodiment, because the ERP projection samples each latitude with a different pixel density, different latitudes of the projected plane carry different amounts of pixel redundancy, and the extreme stretching near the two poles is the most pronounced. After the sphere is projected into the ERP format, with the sphere center as the base point, the longitude θ of the ERP projection corresponds to the longitude of the sphere's surface, θ ∈ [−π, π], and the latitude of the ERP projection corresponds to the latitude of the sphere's surface. Taking the characteristics of panoramic latitude into account, the dimension weight parameter wERP(x, y) of the ERP projection format is introduced.
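A minimal sketch of step 4, under stated assumptions: δA = G1 follows the text, but the exact δT formula and the exact wERP formula are equation images that did not survive extraction, so the normalized-gradient form (G2/SF)^ε and the WS-PSNR-style cosine-latitude weight used below are assumptions:

```python
import math
import numpy as np

def perception_factors(g1, g2, eps=1.0):
    """Hedged sketch of step 4. delta_A = G1 is given by the text; the
    exact delta_T formula is not reproduced here, so the normalised-
    gradient form (G2 / S_F)**eps is an assumption."""
    s_f = g2.mean()                      # S_F: frame-average weighted gradient
    delta_a = g1.astype(np.float64)      # spatial perception factor
    delta_t = (g2 / s_f) ** eps          # motion perception factor (assumed form)
    delta = delta_a * delta_t            # spatiotemporal weighted factor
    return delta, delta.mean()           # (delta map, S_delta)

def erp_dimension_weights(h, w):
    """Assumed cosine-latitude weight for ERP frames (the WS-PSNR-style
    weight); the patent's exact w_ERP formula is not reproduced in this
    text, so this form is an assumption."""
    y = np.arange(h, dtype=np.float64)
    row_w = np.cos((y + 0.5 - h / 2.0) * math.pi / h)  # per-row latitude weight
    return np.repeat(row_w[:, None], w, axis=1)
```

With the cosine weight, rows near the equator (image center) get weights near 1 and rows near the poles get weights near 0, mirroring the pixel redundancy the text describes.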
Step 5: define the largest coding unit (Largest Coding Unit, LCU) currently to be processed in the current frame as the current largest coding unit.
Step 6: compute the average of the spatiotemporal weighted perception factors over all pixels of the current largest coding unit, denoted Sδ_LCU; then compute the Lagrange-multiplier adjustment factor ΨLCU of the current largest coding unit based on the spatiotemporal weighted perception factor, from Sδ_LCU and the adjustment parameters KLCU and BLCU; then compute the corresponding quantization-parameter change ΔQP1 = 3log2(ΨLCU). Here 0 ≤ i ≤ 63 and 0 ≤ j ≤ 63; δLCU(i, j) is the spatiotemporal weighted perception factor of the pixel at in-block position (i, j) of the current largest coding unit; KLCU ∈ (0, 1) and BLCU ∈ (0, 1), and in this embodiment extensive experiments determined KLCU = BLCU = 0.5.
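A hedged sketch of step 6. The exact ΨLCU formula is not reproduced in this text; the linear normalization KLCU·(Sδ_LCU/Sδ) + BLCU below is an assumption, chosen so that with KLCU = BLCU = 0.5 an LCU of frame-average perceptual activity gets ΨLCU = 1 and ΔQP1 = 0. Only the relation ΔQP1 = 3log2(ΨLCU) is taken directly from the text:

```python
import math

def delta_qp1(s_delta_lcu, s_delta_frame, k_lcu=0.5, b_lcu=0.5):
    """Hedged sketch of step 6. Psi_LCU's exact formula is an assumption:
    a linear normalisation of the LCU's mean perceptual factor against the
    frame mean, with K_LCU, B_LCU in (0, 1) as the text requires."""
    psi_lcu = k_lcu * (s_delta_lcu / s_delta_frame) + b_lcu
    return 3.0 * math.log2(psi_lcu)  # dQP1 = 3 * log2(Psi_LCU), per the text
```

Under this assumed form, an LCU whose mean perceptual factor is three times the frame average gets ΨLCU = 2 and ΔQP1 = 3, i.e. a coarser quantizer where distortion is perceptually well masked.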
Step 7: compute the average of the dimension weights over all pixels of the current largest coding unit, denoted SwERP_LCU; then compute the dimension-weight-based quantization-parameter change ΔQP2 from SwERP_LCU and the adjustment parameters a and b. Here wERP_LCU(i, j) is the dimension weight of the pixel at in-block position (i, j) of the current largest coding unit; a ∈ (0, 1), b ∈ (0, 1), and b < a, and in this embodiment extensive experiments determined a = 0.85 and b = 0.3.
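A hedged sketch of step 7. The exact ΔQP2 formula is likewise missing from this text; the piecewise rule below, which treats a = 0.85 and b = 0.3 as thresholds on the mean ERP weight (both in (0, 1) with b < a, as the text requires), is an assumption: equatorial LCUs keep their QP while the heavily stretched polar LCUs get a coarser quantizer:

```python
def delta_qp2(s_werp_lcu, a=0.85, b=0.3):
    """Hedged sketch of step 7. The exact Delta_QP2 formula is not
    reproduced in this text; interpreting a and b as thresholds on the
    mean ERP dimension weight is an assumption for illustration."""
    if s_werp_lcu >= a:
        return 0  # near the equator: keep the original QP
    elif s_werp_lcu >= b:
        return 1  # mid latitudes: slightly coarser quantiser
    else:
        return 2  # near the poles (most redundancy): coarsest quantiser
```

The monotone direction of the rule (lower cosine weight, larger QP increase) follows the text's observation that polar regions of an ERP frame are the most redundant.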
Step 8: compute the new coding quantization parameter QPnew of the current largest coding unit from the original coding quantization parameter QPorg and the two changes ΔQP1 and ΔQP2, applying floor rounding; then update the coding quantization parameter of the current largest coding unit to QPnew, and encode the current largest coding unit with the HEVC video encoder. QPorg can be read from the encoder's initialization parameter list.
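Step 8 can be sketched as follows. The floor rounding is stated in the text; the additive combination of QPorg, ΔQP1, and ΔQP2 and the clamping to the HEVC QP range [0, 51] are assumptions:

```python
import math

def qp_new(qp_org, dqp1, dqp2, qp_min=0, qp_max=51):
    """Hedged sketch of step 8. QP_new is derived from QP_org and the two
    deltas with floor rounding, per the text; the additive form and the
    clamp to the HEVC 8-bit QP range [0, 51] are assumptions."""
    qp = math.floor(qp_org + dqp1 + dqp2)   # floor rounding, as stated
    return max(qp_min, min(qp_max, qp))     # keep QP inside the legal range
```

For example, QPorg = 32 with ΔQP1 = 1.2 and ΔQP2 = 1 yields floor(34.2) = 34 under this assumed combination.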
Step 9: take the next largest coding unit to be processed in the current frame as the current largest coding unit and return to step 6, until all largest coding units in the current frame have been processed; then go to step 10.
Step 10: take the next video frame to be encoded in the ERP-projection panoramic video as the current frame and return to step 2, until all video frames of the ERP-projection panoramic video have been encoded.
To further demonstrate its performance, the method of the present invention was tested.
The HEVC standard reference software HM16.14 was used as the experimental test platform, running on an Intel(R) Core(TM) i7-10700 CPU at 2.9 GHz with 32 GB of RAM under 64-bit Windows 10; VS2013 was used as the development tool. Four panoramic video sequences were selected as standard test sequences: two 4K sequences, "AerialCity" and "DrivingInCity", and two 6K sequences, "BranCastle2" and "Landing2". For each sequence, 100 frames were encoded with intra coding, SearchRange was set to 64, MaxPartitionDepth was set to 4, and the initial coding quantization parameter QP (i.e. the original coding quantization parameter QPorg) was set to 22, 27, 32, and 37, respectively.
Table 1 lists the parameters of the four panoramic video sequences "AerialCity", "DrivingInCity", "BranCastle2", and "Landing2".

Table 1. Parameters of the panoramic video sequences
表2列出了采用本发明方法对表1列出的全景视频序列进行编码,与采用HM16.14原始平台方法相比,编码码率的节省情况。定义采用本发明方法进行编码相比于采用HM16.14原始平台方法进行编码的编码码率节省率为ΔRPRO,ΔRPRO=(RORG-RPRO)/RORG×100(%),其中,RPRO表示采用本发明方法进行编码的编码码率,RORG表示采用HM16.14原始平台方法进行编码的编码码率。Table 2 lists the encoding bit rate savings of the panoramic video sequences listed in Table 1 by using the method of the present invention to encode the panoramic video sequences listed in Table 1, compared with using the HM16.14 original platform method. The encoding bit rate savings of the encoding method of the present invention compared with using the HM16.14 original platform method is defined as ΔR PRO , ΔR PRO =(R ORG -R PRO )/R ORG ×100(%), where R PRO represents the encoding bit rate of the encoding method of the present invention, and R ORG represents the encoding bit rate of the encoding method of the HM16.14 original platform method.
Table 2. Bit rate savings of the method of the present invention compared with the original HM16.14 platform
As can be seen from Table 2, the method of the present invention saves 12.9% of the bit rate on average. For all four panoramic video sequences, covering different scenes and motion characteristics, the method effectively reduces the bit rate; the gain is largest when the initial coding quantization parameter QP (i.e., the original coding quantization parameter QPorg) is small.
Table 3 lists the rate-distortion performance obtained when encoding the panoramic video sequences of Table 1 with the method of the present invention. The classical subjective quality assessment method MOS (Mean Opinion Score) was adopted as the quality index, and the rate-distortion performance index BDBRMOS of each panoramic video sequence under MOS was calculated to comprehensively evaluate the performance of the method of the present invention.
Table 3. Rate-distortion performance of encoding with the method of the present invention
As can be seen from Table 3, the BDBRMOS rate-distortion index shows that, at the same subjective quality, the method of the present invention saves about 7.4% of the bit rate on average (BDBRMOS ≈ −7.4%) compared with the original HM16.14 platform. Across the different scenes and motion conditions of the panoramic video sequences, the method effectively reduces the bit rate and significantly improves the rate-distortion performance.
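BDBRMOS follows the standard Bjøntegaard delta-bit-rate calculation, with MOS substituted for PSNR as the quality axis. A hedged sketch of that conventional cubic-fit procedure (this is not code from the patent, and the rate/MOS points in the usage example are made up):

```python
import numpy as np

def bd_rate(rates_anchor, q_anchor, rates_test, q_test):
    """Bjontegaard delta bit rate: average bit rate difference (%) between two
    encoders at equal quality. Here the quality axis is assumed to be MOS,
    following the BDBR_MOS index used in the text."""
    # Fit log10(rate) as a cubic polynomial of quality for each encoder.
    p_anchor = np.polyfit(np.asarray(q_anchor, float), np.log10(rates_anchor), 3)
    p_test = np.polyfit(np.asarray(q_test, float), np.log10(rates_test), 3)
    lo = max(min(q_anchor), min(q_test))
    hi = min(max(q_anchor), max(q_test))
    # Integrate both fitted curves over the overlapping quality interval.
    int_anchor = np.polyval(np.polyint(p_anchor), hi) - np.polyval(np.polyint(p_anchor), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
    avg_log_diff = (int_test - int_anchor) / (hi - lo)
    # Back from the log domain to a percentage bit rate difference.
    return (10.0 ** avg_log_diff - 1.0) * 100.0

# Made-up rate/MOS points: the test encoder spends 10% fewer bits at every
# quality level, so the BD-rate comes out at -10%.
mos = [3.0, 3.5, 4.0, 4.5]
anchor_rates = [1000.0, 2000.0, 4000.0, 8000.0]
test_rates = [r * 0.9 for r in anchor_rates]
print(round(bd_rate(anchor_rates, mos, test_rates, mos), 2))  # -10.0
```

A negative BD-rate means the test encoder needs fewer bits than the anchor for the same subjective quality, which is the sense in which the −7.4% figure above is reported.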
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210157533.5A CN114567776B (en) | 2022-02-21 | 2022-02-21 | Video low-complexity coding method based on panoramic visual perception characteristics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114567776A CN114567776A (en) | 2022-05-31 |
CN114567776B (en) | 2023-05-05
Family
ID=81714022
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116723330B (en) * | 2023-03-28 | 2024-02-23 | 成都师范学院 | Panoramic video coding method for self-adapting spherical domain distortion propagation chain length |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6366705B1 (en) * | 1999-01-28 | 2002-04-02 | Lucent Technologies Inc. | Perceptual preprocessing techniques to reduce complexity of video coders |
CN103096079A (en) * | 2013-01-08 | 2013-05-08 | 宁波大学 | Multi-view video rate control method based on exactly perceptible distortion |
CN104954778A (en) * | 2015-06-04 | 2015-09-30 | 宁波大学 | Objective stereo image quality assessment method based on perception feature set |
CN107147912A (en) * | 2017-05-04 | 2017-09-08 | 浙江大华技术股份有限公司 | A kind of method for video coding and device |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100086063A1 (en) * | 2008-10-02 | 2010-04-08 | Apple Inc. | Quality metrics for coded video using just noticeable difference models |
US9237343B2 (en) * | 2012-12-13 | 2016-01-12 | Mitsubishi Electric Research Laboratories, Inc. | Perceptually coding images and videos |
Non-Patent Citations (2)
Title |
---|
Yafen Xing et al. Spatiotemporal just noticeable difference modeling with heterogeneous temporal visual features. Displays, 2021. |
Du Baozhen. Fast stereoscopic video coding algorithm based on perceptual thresholds. 信息与电脑(理论版), 2020. |
Legal Events

Date | Code | Title
---|---|---
| PB01 | Publication
| SE01 | Entry into force of request for substantive examination
| GR01 | Patent grant
2024-01-18 | TR01 | Transfer of patent right

TR01 details — Effective date of registration: 2024-01-18. Patentee after: Zhejiang Chuanzhi Electronic Technology Co., Ltd., Room 166, Building 1, No. 8 Xingye Avenue, Ningbo Free Trade Zone, Zhejiang Province, 315800. Patentee before: Ningbo Polytechnic, No. 388 Lushan East Road, Ningbo Economic and Technological Development Zone, Zhejiang Province, 315800.