CN102045571B - Fast iterative search algorithm for stereo video coding - Google Patents
- Publication number
- CN102045571B (application CN201110007342A)
- Authority
- CN
- China
- Prior art keywords
- current block
- vector
- iteration
- block
- disparity
- Prior art date: 2011-01-13
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Compression Or Coding Systems Of Tv Signals (AREA)
Description
Technical Field
The invention relates to the field of video coding, and in particular to a fast search algorithm for motion vectors and disparity vectors in stereoscopic video coding.
Background Art
Stereoscopic video carries the depth information of a scene and therefore represents natural scenes more realistically. It shows broad application prospects in 3D television, stereoscopic vision systems for mobile devices, and immersive videoconferencing.
Stereoscopic video contains left and right video channels. A typical IPPPP prediction structure is shown in Figure 1: the horizontal axis is the temporal direction and the vertical axis is the view direction. Let the left view be the reference view, i.e. the left view is coded first. The first frame of the left view is an I frame; it is coded without reference to any other frame, going directly through DCT transform, linear quantization, and run-length coding before entering the arithmetic coder. All other frames of the left view are P frames, and motion estimation is performed with reference to the previously coded frame of the left view. The right view is the predicted view. Its first frame is a P frame that may either perform disparity estimation with reference to the first frame of the left view or use intra prediction; the better of the two coding modes is selected, which guarantees coding efficiency. Each remaining P frame of the right view has two reference frames: motion estimation is performed against the temporal reference frame (the previously coded frame of the right view), and disparity estimation is performed against the view-direction reference frame (the coded frame of the left view at the same moment).
Traditional stereoscopic video compression uses a full search with a large search window to perform motion estimation and disparity estimation separately, eliminating the temporal-spatial redundancy within each view and the cross-view redundancy between the left and right views. The rate-distortion costs of the motion vector and the disparity vector are compared, and the vector with the smaller cost is chosen as the final prediction vector of the current block. The rate-distortion cost is computed as
RDCost(mv) = SAD(c, r) + λ × R(mv − p)
where mv is the motion/disparity vector of the current block, c is the current block, r is the prediction block, λ is the Lagrange multiplier, p is the predictor of the motion/disparity vector of the current block, and R(mv − p) is the number of bits needed to code the difference between the vector and its predictor. SAD(c, r) is the sum of absolute differences between the current block and the prediction block,
SAD(c, r) = Σ_{i=1..B1} Σ_{j=1..B2} | c[i, j] − r[i − mvx, j − mvy] |
where B1 and B2 are the numbers of horizontal and vertical pixels of the block, [i, j] are pixel coordinates, c[i, j] is a pixel value of the current block, r[i − mvx, j − mvy] is a pixel value of the prediction block, and (mvx, mvy) are the horizontal and vertical components of the motion/disparity vector.
The traditional full search algorithm achieves high rate-distortion performance at the price of an enormous amount of computation, which limits real-time applications of stereoscopic video.
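The rate-distortion decision above can be sketched in a few lines. This is an illustrative sketch, not the JMVM implementation: the function names are invented here, and the bit-count model used for R(mv − p) is a rough Exp-Golomb-style estimate assumed for illustration.

```python
import numpy as np

def rd_cost(c, r, mv, p, lam):
    """RDCost(mv) = SAD(c, r) + lambda * R(mv - p).

    c   : current block, a B1 x B2 array of pixel values
    r   : motion/disparity-compensated prediction block, same shape
    mv  : candidate motion/disparity vector (mvx, mvy)
    p   : predictor of the vector
    lam : Lagrange multiplier
    """
    # SAD: sum of absolute differences between current and prediction block.
    sad = np.abs(c.astype(np.int64) - r.astype(np.int64)).sum()

    # R(mv - p): bits to code the vector difference. A log2-based proxy for
    # the signed Exp-Golomb code length (an assumption, not the exact table).
    def bits(v):
        return sum(1 if d == 0 else 2 * int(np.floor(np.log2(abs(d)))) + 3
                   for d in v)

    dx, dy = mv[0] - p[0], mv[1] - p[1]
    return sad + lam * bits((dx, dy))
```

The full search evaluates `rd_cost` at every displacement inside the search window and keeps the minimizer, which is what makes it so expensive.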
At present, fast stereoscopic video coding algorithms fall roughly into two categories. The first is prediction-vector-based coding: a full search first computes the disparity or motion vectors of one domain (the disparity or motion domain), and then, exploiting the consistency of disparity vectors at adjacent moments, or of motion vectors at adjacent views, within a "stereo image pair", a fast algorithm predicts the other domain [1-2]. These algorithms achieve good coding performance, but because the accuracy of the predicted vectors in the second domain depends on the results of the first, the first domain usually relies on an exhaustive full search to guarantee accuracy, so the coding speed still needs improvement. The second category is joint motion and disparity estimation: based on the sequence correlation of stereoscopic video, the information of the motion domain and the disparity domain can be used mutually, and the motion/disparity vector of the current block is predicted directly from the motion and disparity vector relationship of adjacent images, minimizing coding complexity [3-4]. However, most existing work of this kind targets only the pixel domain or is based on the MPEG standard, and is therefore incompatible with the current mainstream block-based H.264/AVC video coding standard; moreover, a prediction vector obtained directly from the vector relationship of adjacent images easily falls into a local minimum, so coding quality cannot be guaranteed.
Based on the H.264/AVC standard, the present invention proposes a fast iterative search algorithm for stereoscopic video coding built on a stereo-motion constraint model, which greatly reduces coding complexity while maintaining a high compression rate.
References
[1] Ding L F, Chien S Y, Chen L G. Joint prediction algorithm and architecture for stereo video hybrid coding systems [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2006, 16(11): 1324-1337.
[2] Lai P, Ortega A. Predictive fast motion/disparity search for multiview video coding [C]// Proceedings of SPIE: Visual Communications and Image Processing. San Jose: SPIE, 2006, 6077: 607709.
[3] Paras I, Alvertos N, Tziritas G. Joint disparity and motion field estimation in stereoscopic image sequences [C]// Proceedings of the 13th International Conference on Pattern Recognition. Vienna: ICPR, 1996: 359-363.
[4] Kim Y, Lee J, Park C, et al. MPEG-4 compatible stereoscopic sequence codec for stereo broadcasting [J]. IEEE Transactions on Consumer Electronics, 2005, 51(4): 1227-1236.
Summary of the Invention
The purpose of the present invention is to provide a fast iterative search method for stereoscopic video coding that solves the high coding complexity of the right-view image frames other than the first frame, thereby realizing low-complexity stereoscopic video coding.
The technical scheme adopted by the present invention to solve the above technical problem is as follows:
A fast iterative search method for stereoscopic video coding. Let the macroblock MB_r,t in the right-view image at time t be the current block, let i be the iteration index, and let δ be the model error of the stereo-motion constraint model. mv_r,t(MB_r,t) denotes the optimal motion vector of the current block and dv_r,t(MB_r,t) its optimal disparity vector; J_mv^(i) and J_mv^(i-1) denote the rate-distortion costs of the current block's corrected motion vector at iterations i and i−1, and J_dv^(i) and J_dv^(i-1) the rate-distortion costs of its corrected disparity vector at iterations i and i−1. The method includes the following steps:
1.1. Initialization: determine the motion vector search starting point and the disparity vector search starting point of the current block; refine both to obtain the corrected motion vector and disparity vector search starting points of the current block; save the rate-distortion costs of the corrected starting points.
1.2. Adjust the size of the refinement search window RSR according to the model error δ: RSR = RSR_MIN when δ is below the threshold T1, RSR = RSR_MAX when δ is above the threshold T2, and an adaptively sized window that grows with δ in between, where T1 and T2 are two thresholds (T1 < T2), RSR_MIN is the minimum refinement search window, and RSR_MAX is the maximum refinement search window.
1.3. Iterative search: determine the initial disparity vector prediction of the current block at iteration i and refine it to obtain the corrected disparity vector dv^(i); determine the initial motion vector prediction of the current block at iteration i and refine it to obtain the corrected motion vector mv^(i); save the rate-distortion costs of the corrected motion and disparity vectors of iteration i.
1.4. Stopping criterion: if the rate-distortion costs no longer decrease, i.e. J_mv^(i) ≥ J_mv^(i-1) and J_dv^(i) ≥ J_dv^(i-1), where J_mv^(i) and J_dv^(i) are the rate-distortion costs of the corrected motion and disparity vectors of the current block at iteration i, then take the corrected motion vector mv^(i-1) and corrected disparity vector dv^(i-1) of iteration i−1 as the optimal motion vector mv_r,t(MB_r,t) and the optimal disparity vector dv_r,t(MB_r,t) of the current block and end the iterative search; otherwise let i = i + 1, recompute and update the model error by δ = ‖ mv^(i) + dv^(i)(MCMB_r,t-1) − dv^(i) − mv^(i)(DCMB_l,t) ‖, and jump to step 1.2. Here mv^(i) and dv^(i) are the corrected motion and disparity vectors of the current block at iteration i, dv^(i)(MCMB_r,t-1) is the disparity vector of the motion-compensated block of the current block in the temporal reference frame at iteration i, and mv^(i)(DCMB_l,t) is the motion vector of the disparity-compensated block of the current block in the view-direction reference frame at iteration i.
The aforementioned step 1.1 includes:
2.1. Let i = 0 and δ = 0.
2.2. The motion vector search starting point and the disparity vector search starting point of the current block are obtained from the candidate vector sets {mv_a, mv_b, mv_c, mv_med, mv_l,t} and {dv_a, dv_b, dv_c, dv_med, dv_r,t-1},
where mv_a/dv_a, mv_b/dv_b and mv_c/dv_c are the motion or disparity vectors of the left block a, upper block b and upper-right block c adjacent to the current block; mv_med and dv_med are the median vectors of the current block's motion vector and disparity vector candidates; mv_l,t is the motion vector of the co-located block in the view-direction reference frame of the current block, and dv_r,t-1 is the disparity vector of the co-located block in the temporal reference frame of the current block.
2.3. Centered on the motion vector search starting point and on the disparity vector search starting point of the current block, define a refinement search window of RSR_MIN × RSR_MIN and perform vector refinement within it to obtain the corrected motion vector and disparity vector search starting points. Save the rate-distortion cost of the corrected motion vector search starting point, denoted J_mv^(0), and the rate-distortion cost of the corrected disparity vector search starting point, denoted J_dv^(0). Let i = i + 1.
In the aforementioned step 1.2, the threshold T1 is 5, the threshold T2 is 20, RSR_MIN is 2, and RSR_MAX is 96.
The aforementioned step 1.3 includes:
4.1. The initial disparity vector prediction of the current block at iteration i is computed as
dv_0^(i) = mv^(i-1) + dv^(i)(MCMB_r,t-1) − mv^(i-1)(DCMB_l,t),
where mv^(i-1) is the motion vector of the current block obtained at iteration i−1, and mv^(i-1)(DCMB_l,t) is the motion vector of the disparity-compensated block of the current block in the view-direction reference frame obtained at iteration i−1. The motion-compensated block of the current block in the temporal reference frame may cover several already-coded blocks with disparity vectors dv_k (k = 1, …, u, where u is the number of covered blocks); the dv_k yielding the smallest rate-distortion cost is taken as the disparity vector of the motion-compensated block at iteration i, denoted dv^(i)(MCMB_r,t-1).
Centered on dv_0^(i), define an RSR × RSR search window and perform vector refinement within it to obtain the corrected disparity vector dv^(i) of the current block at iteration i. Save its rate-distortion cost, denoted J_dv^(i).
4.2. The initial motion vector prediction of the current block at iteration i is computed as
mv_0^(i) = dv^(i) + mv^(i)(DCMB_l,t) − dv^(i)(MCMB_r,t-1),
where dv^(i) is the corrected disparity vector of the current block obtained in step 4.1, and dv^(i)(MCMB_r,t-1) is the disparity vector of the motion-compensated block in the temporal reference frame obtained in step 4.1. The disparity-compensated block of the current block in the view-direction reference frame may cover several already-coded blocks with motion vectors mv_k (k = 1, …, v, where v is the number of covered blocks); the mv_k yielding the smallest rate-distortion cost is taken as the motion vector of the disparity-compensated block at iteration i, denoted mv^(i)(DCMB_l,t).
Centered on mv_0^(i), define an RSR × RSR search window and perform vector refinement within it to obtain the corrected motion vector mv^(i) of the current block at iteration i. Save its rate-distortion cost, denoted J_mv^(i).
Compared with the prior art, the advantage of the present invention is as follows: the traditional stereoscopic video coding algorithm searches the motion vector and the disparity vector independently, using a large search window in the temporal reference frame and in the view-direction reference frame, without exploiting the vector relationship within a "stereo image pair". The present invention establishes a stereo-motion constraint model and devises an iterative search strategy whose refinement window adapts to the model error. The method greatly reduces the complexity of stereoscopic video coding and increases coding speed while maintaining coding quality.
Experimental results show that the method of the present invention saves 96.43% of the encoding time on average with essentially no loss of coding quality.
Description of the Drawings
Fig. 1 is a schematic diagram of the stereoscopic video coding structure;
Fig. 2 is a schematic diagram of the stereo-motion constraint model;
Fig. 3 is a flow chart of the method of the invention;
Fig. 4 is a schematic diagram of the prediction vectors of adjacent blocks;
Fig. 5 is a schematic diagram of computing the initial motion vector prediction;
Fig. 6 is a schematic diagram of computing the initial disparity vector prediction;
Fig. 7 shows the rate-distortion curves of different algorithms for the "Ballroom" sequence.
Detailed Description of the Embodiments
The present invention is further described in detail below with reference to the accompanying drawings and embodiments.
Fig. 2 is a schematic diagram of the stereo-motion constraint model. The four images of the left and right views at two adjacent moments are called a "stereo image pair". The right-view image F_r,t at time t is the current frame, and MB_r,t is the current block in F_r,t. F_l,t is the reference frame of the current block MB_r,t in the view direction, and F_r,t-1 is its reference frame in the temporal direction. F_l,t-1 is the temporal reference frame of DCMB_l,t and also the view-direction reference frame of MCMB_r,t-1. Here DCMB_l,t is the disparity-compensated block of the current block MB_r,t in F_l,t; MCMB_l,t-1 is the motion-compensated block of DCMB_l,t in F_l,t-1; MCMB_r,t-1 is the motion-compensated block of the current block MB_r,t in F_r,t-1; and DCMB_l,t-1 is the disparity-compensated block of MCMB_r,t-1 in F_l,t-1.
The relationship between the motion vectors and disparity vectors of a "stereo image pair" can be expressed as follows:
δ = ‖ mv_r,t(MB_r,t) + dv_r,t-1(MCMB_r,t-1) − dv_r,t(MB_r,t) − mv_l,t(DCMB_l,t) ‖
where ‖v‖ denotes the norm of v, δ is the model error of the stereo-motion constraint model, mv_r,t(MB_r,t) is the optimal motion vector of the current block MB_r,t, dv_r,t(MB_r,t) is its optimal disparity vector, dv_r,t-1(MCMB_r,t-1) is the disparity vector of MCMB_r,t-1, and mv_l,t(DCMB_l,t) is the motion vector of DCMB_l,t.
If and only if MCMB_l,t-1 and DCMB_l,t-1 are the true convergence points of the same object on the different 3D surfaces does δ in the above formula equal 0, that is,
mv_r,t(MB_r,t) + dv_r,t-1(MCMB_r,t-1) = dv_r,t(MB_r,t) + mv_l,t(DCMB_l,t)
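The constraint model and its δ = 0 identity can be expressed directly in code. A minimal sketch, with invented function names; each vector is a 2-tuple (horizontal, vertical), and the predictions are obtained by rearranging the identity above:

```python
import numpy as np

def model_error(mv_rt, dv_mcmb, dv_rt, mv_dcmb):
    """delta = || mv_r,t + dv(MCMB_r,t-1) - dv_r,t - mv(DCMB_l,t) ||."""
    v = (np.asarray(mv_rt) + np.asarray(dv_mcmb)
         - np.asarray(dv_rt) - np.asarray(mv_dcmb))
    return float(np.linalg.norm(v))

def predict_dv(mv_rt, dv_mcmb, mv_dcmb):
    """Initial disparity prediction obtained by setting delta = 0."""
    return tuple(np.asarray(mv_rt) + np.asarray(dv_mcmb) - np.asarray(mv_dcmb))

def predict_mv(dv_rt, mv_dcmb, dv_mcmb):
    """Initial motion prediction obtained by setting delta = 0."""
    return tuple(np.asarray(dv_rt) + np.asarray(mv_dcmb) - np.asarray(dv_mcmb))
```

`model_error` is the δ used later to size the refinement window; `predict_dv` and `predict_mv` are the closed-form initial values refined at each iteration.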
The present invention designs a fast prediction method for the motion and disparity vectors of the coding blocks of all right-view frames except the first; the first frame of the right view still uses the full search algorithm to guarantee search accuracy. Fig. 3 is the flow chart of the method, which consists of four steps: initialization, adjustment of the refinement search window size, iterative search, and the stopping criterion. Let the macroblock MB_r,t in the right-view image at time t be the current block, let i denote the iteration index and δ the model error of the stereo-motion constraint model, and let mv_r,t(MB_r,t) and dv_r,t(MB_r,t) denote the optimal motion and disparity vectors of the current block. The steps of the method are as follows:
Step 1, initialization:
1) Let i = 0 and δ = 0.
2) Determine the motion vector search starting point and the disparity vector search starting point of the current block from the candidate vector sets {mv_a, mv_b, mv_c, mv_med, mv_l,t} and {dv_a, dv_b, dv_c, dv_med, dv_r,t-1}, where mv_a/dv_a, mv_b/dv_b and mv_c/dv_c are the motion or disparity vectors of the left block a, upper block b and upper-right block c adjacent to the current block. mv_med and dv_med are the median vectors of the current block's motion vector and disparity vector candidates: the horizontal and vertical components of mv_med are the medians of the horizontal and vertical components of mv_a, mv_b and mv_c, and the components of dv_med are likewise the medians of the components of dv_a, dv_b and dv_c. mv_l,t is the motion vector of the co-located block in the view-direction reference frame of the current block, and dv_r,t-1 is the disparity vector of the co-located block in the temporal reference frame of the current block, as shown in Fig. 4. Since the neighboring blocks of the current block are already coded, each neighbor carries only a motion vector or only a disparity vector; for example, for the left neighbor a only mv_a or dv_a exists, so if a was coded by disparity estimation, mv_a does not exist. In that case we substitute (0, 0) for mv_a when computing the median vector mv_med. The rate-distortion costs of all candidate prediction vectors are compared, and the candidates minimizing the cost are selected as the motion vector and disparity vector search starting points of the current block.
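The starting-point selection above (component-wise median with a (0, 0) fallback for neighbors coded in the other domain, then the cheapest candidate by rate-distortion cost) can be sketched as follows; the helper names are invented, and the cost function is passed in as a callable:

```python
def median_vector(va, vb, vc):
    """Component-wise median of the three neighboring-block vectors.

    A neighbor coded in the other domain has no vector here; it is
    represented as None and contributes (0, 0), as described in the text.
    """
    va, vb, vc = [(0, 0) if v is None else v for v in (va, vb, vc)]
    return (sorted((va[0], vb[0], vc[0]))[1],   # median of horizontal components
            sorted((va[1], vb[1], vc[1]))[1])   # median of vertical components

def search_start(candidates, cost):
    """Pick the candidate vector with the smallest rate-distortion cost."""
    return min(candidates, key=cost)
```

For the motion vector the candidate list would hold {mv_a, mv_b, mv_c, mv_med, mv_l,t}, and for the disparity vector {dv_a, dv_b, dv_c, dv_med, dv_r,t-1}.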
3) Centered on the motion vector search starting point and on the disparity vector search starting point of the current block, define a refinement search window of RSR_MIN × RSR_MIN and perform vector refinement within it to obtain the corrected motion vector and disparity vector search starting points. Save the rate-distortion cost of the corrected motion vector search starting point, denoted J_mv^(0), and that of the corrected disparity vector search starting point, denoted J_dv^(0). Let i = i + 1.
Step 2, adjust the size of the refinement search window RSR according to the model error δ: RSR = RSR_MIN when δ is below the threshold T1, RSR = RSR_MAX when δ is above the threshold T2, and an adaptively sized window in between. Here RSR_MIN is the smallest refinement window (usually 2 × 2 pixels), RSR_MAX is the largest refinement window (usually the search window size of the full search algorithm, i.e. 96 × 96), and T1 and T2 are two thresholds (T1 < T2) that control the window size. When the model error δ is greater than the threshold T2 (usually 20), the current block is probably in a motion-occlusion region, and the large refinement window RSR_MAX is needed to guarantee search accuracy. When δ is smaller than the threshold T1 (usually 5), the motion/disparity vectors obtained in the current iteration are already very close to the optimal vectors of the current block, so the small refinement window RSR_MIN suffices. Otherwise, the refinement window size is set adaptively according to the magnitude of δ.
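A sketch of this window adaptation; the intermediate branch between T1 and T2 is not spelled out in the text, so linear growth with δ is assumed here purely for illustration:

```python
def refine_search_range(delta, t1=5, t2=20, rsr_min=2, rsr_max=96):
    """Adapt the refinement search window RSR to the model error delta.

    delta <= t1 : vectors already near-optimal -> smallest window RSR_MIN
    delta >= t2 : likely motion-occlusion area -> largest window RSR_MAX
    otherwise   : grow the window with delta (linear interpolation is an
                  assumption; the patent only says the size adapts to delta)
    """
    if delta <= t1:
        return rsr_min
    if delta >= t2:
        return rsr_max
    frac = (delta - t1) / (t2 - t1)
    return int(round(rsr_min + frac * (rsr_max - rsr_min)))
```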
Step 3, the iterative search process:
1) Compute the initial disparity vector prediction of the current block at iteration i as dv_0^(i) = mv^(i-1) + dv^(i)(MCMB_r,t-1) − mv^(i-1)(DCMB_l,t), where mv^(i-1) is the motion vector of the current block obtained at iteration i−1, and mv^(i-1)(DCMB_l,t) is the motion vector of the disparity-compensated block of the current block in the view-direction reference frame obtained at iteration i−1. As shown in Fig. 5, the motion-compensated block of the current block in the temporal reference frame covers u already-coded blocks with disparity vectors dv_k; the dv_k yielding the smallest rate-distortion cost is taken as the disparity vector of the motion-compensated block at iteration i, denoted dv^(i)(MCMB_r,t-1).
Centered on dv_0^(i), define an RSR × RSR search window and perform vector refinement within it to obtain the corrected disparity vector dv^(i) of the current block at iteration i. Save its rate-distortion cost, denoted J_dv^(i).
2) Compute the initial motion vector prediction of the current block at iteration i as mv_0^(i) = dv^(i) + mv^(i)(DCMB_l,t) − dv^(i)(MCMB_r,t-1), where dv^(i) is the corrected disparity vector of the current block obtained in step 1), and dv^(i)(MCMB_r,t-1) is the disparity vector of the motion-compensated block in the temporal reference frame obtained in step 1). As shown in Fig. 6, the disparity-compensated block of the current block in the view-direction reference frame covers v already-coded blocks with motion vectors mv_k; the mv_k yielding the smallest rate-distortion cost is taken as the motion vector of the disparity-compensated block at iteration i, denoted mv^(i)(DCMB_l,t).
Centered on mv_0^(i), define an RSR × RSR search window and perform vector refinement within it to obtain the corrected motion vector mv^(i) of the current block at iteration i. Save its rate-distortion cost, denoted J_mv^(i).
Step 4, the stopping criterion. Let J_mv^(i) and J_mv^(i-1) be the rate-distortion costs of the corrected motion vector of the current block at iterations i and i−1, and J_dv^(i) and J_dv^(i-1) the rate-distortion costs of its corrected disparity vector at iterations i and i−1. If J_mv^(i) ≥ J_mv^(i-1) and J_dv^(i) ≥ J_dv^(i-1), the corrected motion vector mv^(i-1) and corrected disparity vector dv^(i-1) of iteration i−1 are taken as the optimal motion vector mv_r,t(MB_r,t) and the optimal disparity vector dv_r,t(MB_r,t) of the current block, and the iterative search ends. Otherwise, let i = i + 1, recompute and update the model error by δ = ‖ mv^(i) + dv^(i)(MCMB_r,t-1) − dv^(i) − mv^(i)(DCMB_l,t) ‖, and jump to Step 2, where mv^(i) and dv^(i) are the corrected motion and disparity vectors of the current block at iteration i, dv^(i)(MCMB_r,t-1) is the disparity vector of the motion-compensated block of the current block in the temporal reference frame at iteration i, and mv^(i)(DCMB_l,t) is the motion vector of the disparity-compensated block of the current block in the view-direction reference frame at iteration i.
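Putting Steps 2-4 together, the outer loop with this stopping rule looks roughly as follows. The refinement pass `step` (window adjustment plus the two vector refinements of Step 3) and the two cost functions are placeholders supplied by the caller, so this is only a skeleton of the control flow, not the full method:

```python
def iterative_search(step, cost_mv, cost_dv, mv0, dv0, max_iter=16):
    """Iterate while either rate-distortion cost still decreases.

    step(mv, dv) -> (new_mv, new_dv) : one refinement pass (Steps 2-3)
    cost_mv, cost_dv                 : rate-distortion costs J_mv, J_dv
    On the first non-improving iteration, the previous iteration's
    vectors are returned as the final motion/disparity vectors (Step 4).
    """
    mv, dv = mv0, dv0
    j_mv, j_dv = cost_mv(mv), cost_dv(dv)
    for _ in range(max_iter):                 # guard against non-convergence
        new_mv, new_dv = step(mv, dv)
        nj_mv, nj_dv = cost_mv(new_mv), cost_dv(new_dv)
        if nj_mv >= j_mv and nj_dv >= j_dv:   # neither cost improved
            return mv, dv                     # keep iteration i-1 result
        mv, dv, j_mv, j_dv = new_mv, new_dv, nj_mv, nj_dv
    return mv, dv
```

The `max_iter` cap is an added safeguard, not part of the patented criterion; in practice the cost test alone terminates the loop within a few iterations.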
To examine the performance of the proposed method, it is compared with the full-search method. The experimental platform is JMVM 8.0; the motion-estimation and disparity-estimation search window size is 32×32 and the macroblock mode is Inter16×16. From each of the sequences "Ballroom", "Exit", "Vassar", "Race1", and "Rena", 100 frames are taken from viewpoint 0 and viewpoint 1 as the left view and right view, respectively. The left view is first encoded with the conventional H.264 method, and the right view is encoded with the fast iterative search method for stereo video coding of the present invention. Each sequence has a resolution of 640×480. All experiments are executed independently on a PC configured with an Intel(R) Core(TM)2 Extreme X9650 2.99 GHz CPU and 4 GB RAM.
Table 1 compares the coding results of the different algorithms, and Fig. 7 shows the rate-distortion curves of the different algorithms for the "Ballroom" sequence. Compared with the JMVM full search, the proposed algorithm changes the peak signal-to-noise ratio by between -0.33 dB and +0.01 dB and the average bit rate by between -10.93% and +0.07%, with essentially no visible loss in coding quality, while saving more than 96% of the encoding time.
Table 1 Coding results of the different algorithms compared with the JMVM full-search algorithm
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201110007342 CN102045571B (en) | 2011-01-13 | 2011-01-13 | Fast iterative search algorithm for stereo video coding |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201110007342 CN102045571B (en) | 2011-01-13 | 2011-01-13 | Fast iterative search algorithm for stereo video coding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102045571A CN102045571A (en) | 2011-05-04 |
CN102045571B true CN102045571B (en) | 2012-09-05 |
Family
ID=43911273
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 201110007342 Expired - Fee Related CN102045571B (en) | 2011-01-13 | 2011-01-13 | Fast iterative search algorithm for stereo video coding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102045571B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130163880A1 (en) * | 2011-12-23 | 2013-06-27 | Chao-Chung Cheng | Disparity search methods and apparatuses for multi-view videos |
CN102595164A (en) * | 2012-02-27 | 2012-07-18 | 中兴通讯股份有限公司 | Method, device and system for sending video image |
WO2013159326A1 (en) * | 2012-04-27 | 2013-10-31 | Mediatek Singapore Pte. Ltd. | Inter-view motion prediction in 3d video coding |
US10277844B2 (en) * | 2016-04-20 | 2019-04-30 | Intel Corporation | Processing images based on generated motion data |
CN108419082B (en) * | 2017-02-10 | 2020-09-11 | 北京金山云网络技术有限公司 | Motion estimation method and device |
CN117426095A (en) * | 2021-06-04 | 2024-01-19 | 抖音视界有限公司 | Method, apparatus and medium for video processing |
CN115908170B (en) * | 2022-11-04 | 2023-11-21 | 浙江华诺康科技有限公司 | Noise reduction method and device for binocular image, electronic device and storage medium |
CN115630191B (en) * | 2022-12-22 | 2023-03-28 | 成都纵横自动化技术股份有限公司 | Time-space data set retrieval method and device based on full-dynamic video and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101600108A (en) * | 2009-06-26 | 2009-12-09 | 北京工业大学 | A joint motion and disparity estimation method in multi-view video coding |
CN101895749A (en) * | 2010-06-29 | 2010-11-24 | 宁波大学 | Quick parallax estimation and motion estimation method |
CN101917619A (en) * | 2010-08-20 | 2010-12-15 | 浙江大学 | A fast motion estimation method for multi-view video coding |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090167843A1 (en) * | 2006-06-08 | 2009-07-02 | Izzat Hekmat Izzat | Two pass approach to three dimensional Reconstruction |
- 2011-01-13: CN 201110007342 patent CN102045571B/en — not_active, Expired - Fee Related
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101600108A (en) * | 2009-06-26 | 2009-12-09 | 北京工业大学 | A joint motion and disparity estimation method in multi-view video coding |
CN101895749A (en) * | 2010-06-29 | 2010-11-24 | 宁波大学 | Quick parallax estimation and motion estimation method |
CN101917619A (en) * | 2010-08-20 | 2010-12-15 | 浙江大学 | A fast motion estimation method for multi-view video coding |
Non-Patent Citations (1)
Title |
---|
Deng Zhipin, Jia Kebin, Chen Ruilin, Fu Changhong, Xiao Yunzhi. A synchronous joint disparity-motion prediction algorithm for stereoscopic video. Journal of Computer-Aided Design & Computer Graphics, 2010, Vol. 22, No. 10. *
Also Published As
Publication number | Publication date |
---|---|
CN102045571A (en) | 2011-05-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102045571B (en) | Fast iterative search algorithm for stereo video coding | |
CN101600108B (en) | Joint estimation method for movement and parallax error in multi-view video coding | |
CN110087087B (en) | VVC inter-frame coding unit prediction mode early decision and block division early termination method | |
CN107027029B (en) | High-performance video coding improvement method based on frame rate conversion | |
CN101860748B (en) | System and method for generating side information based on distributed video coding | |
CN103533359B (en) | One is bit rate control method H.264 | |
CN100463527C (en) | A method for disparity estimation of multi-viewpoint video images | |
CN103051894B (en) | A kind of based on fractal and H.264 binocular tri-dimensional video compression & decompression method | |
CN104469336B (en) | Coding method for multi-view depth video signals | |
US20230042575A1 (en) | Methods and systems for estimating motion in multimedia pictures | |
CN102752588A (en) | Video encoding and decoding method using space zoom prediction | |
CN101895749B (en) | Quick parallax estimation and motion estimation method | |
CN110557646B (en) | Intelligent inter-view coding method | |
CN101304529A (en) | Method and device for selecting macroblock mode | |
CN102291579A (en) | Rapid fractal compression and decompression method for multi-cast stereo video | |
CN103327327A (en) | Selection method of inter-frame predictive coding units for HEVC | |
CN107071421A (en) | A kind of method for video coding of combination video stabilization | |
TWI489876B (en) | A Multi - view Video Coding Method That Can Save Decoding Picture Memory Space | |
CN102316323B (en) | A Fast Fractal Compression and Decompression Method for Binocular Stereo Video | |
CN101867818B (en) | Selection method and device of macroblock mode | |
Yan et al. | CTU layer rate control algorithm in scene change video for free-viewpoint video | |
CN101557519B (en) | Multi-view video coding method | |
CN102263953B (en) | Quick fractal compression and decompression method for multicasting stereo video based on object | |
CN114827616B (en) | Compressed video quality enhancement method based on space-time information balance | |
CN113507607B (en) | Compressed video multi-frame quality enhancement method without motion compensation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20120905; Termination date: 20150113 |
EXPY | Termination of patent right or utility model |