CN102045571B - Fast iterative search algorithm for stereo video coding - Google Patents
- Publication number
- CN102045571B (application CN201110007342A)
- Authority
- CN
- China
- Prior art keywords
- current block
- vector
- iteration
- block
- disparity
- Prior art date: 2011-01-13
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Compression Or Coding Systems Of Tv Signals (AREA)
Description
Technical Field
The invention relates to the field of video coding, and in particular to a fast search algorithm for motion vectors and disparity vectors in stereoscopic video coding.
Background Art
Stereoscopic video carries the depth information of a scene and therefore represents natural scenes more realistically. It shows broad application prospects in 3D television, stereoscopic vision systems for mobile devices, and immersive videoconferencing.
Stereoscopic video contains left and right video channels. A typical IPPPP prediction structure is shown in Figure 1: the horizontal axis is the temporal direction and the vertical axis is the view direction. Let the left view be the reference view, i.e. the left view is coded first. The first frame of the left view is an I frame; it is coded without reference to any other frame, going directly through DCT transform, linear quantization, and run-length coding before entering the arithmetic coder. All other frames of the left view are P frames, and motion estimation is performed with reference to the previously coded frame of the left view. The right view is the predicted view. Its first frame is a P frame that may either perform disparity estimation with reference to the first frame of the left view or use intra prediction; the better of the two coding modes is selected, which guarantees coding efficiency. Each remaining P frame of the right view has two reference frames: motion estimation is performed against the temporal reference frame (the previously coded frame of the right view), and disparity estimation is performed against the view-direction reference frame (the coded frame of the left view at the same moment).
Traditional stereoscopic video compression uses a full search with a large search window to perform motion estimation and disparity estimation separately, eliminating the temporal-spatial redundancy within each view and the cross-view redundancy between the left and right views. The rate-distortion costs of the motion vector and the disparity vector are compared, and the vector with the smaller cost is chosen as the final prediction vector of the current block. The rate-distortion cost is computed as
RDCost(mv) = SAD(c, r) + λ × R(mv − p)
where mv is the motion/disparity vector of the current block, c is the current block, r is the prediction block, λ is the Lagrange multiplier, p is the predictor of the motion/disparity vector of the current block, and R(mv − p) is the number of bits needed to code the difference between the vector and its predictor. SAD(c, r) is the sum of absolute differences between the current block and the prediction block,
SAD(c, r) = Σ_{i=1..B1} Σ_{j=1..B2} | c[i, j] − r[i − mvx, j − mvy] |
where B1 and B2 are the numbers of horizontal and vertical pixels of the block, [i, j] are pixel coordinates, c[i, j] is a pixel value of the current block, r[i − mvx, j − mvy] is a pixel value of the prediction block, and (mvx, mvy) are the horizontal and vertical components of the motion/disparity vector.
The traditional full search algorithm achieves high rate-distortion performance at the price of an enormous amount of computation, which limits real-time applications of stereoscopic video.
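The rate-distortion decision above can be sketched in a few lines. This is an illustrative sketch, not the JMVM implementation: the function names are invented here, and the bit-count model used for R(mv − p) is a rough Exp-Golomb-style estimate assumed for illustration.

```python
import numpy as np

def rd_cost(c, r, mv, p, lam):
    """RDCost(mv) = SAD(c, r) + lambda * R(mv - p).

    c   : current block, a B1 x B2 array of pixel values
    r   : motion/disparity-compensated prediction block, same shape
    mv  : candidate motion/disparity vector (mvx, mvy)
    p   : predictor of the vector
    lam : Lagrange multiplier
    """
    # SAD: sum of absolute differences between current and prediction block.
    sad = np.abs(c.astype(np.int64) - r.astype(np.int64)).sum()

    # R(mv - p): bits to code the vector difference. A log2-based proxy for
    # the signed Exp-Golomb code length (an assumption, not the exact table).
    def bits(v):
        return sum(1 if d == 0 else 2 * int(np.floor(np.log2(abs(d)))) + 3
                   for d in v)

    dx, dy = mv[0] - p[0], mv[1] - p[1]
    return sad + lam * bits((dx, dy))
```

The full search evaluates `rd_cost` at every displacement inside the search window and keeps the minimizer, which is what makes it so expensive.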
At present, fast stereoscopic video coding algorithms fall roughly into two categories. The first is prediction-vector-based coding: a full search first computes the disparity or motion vectors of one domain (the disparity or motion domain), and then, exploiting the consistency of disparity vectors at adjacent moments, or of motion vectors at adjacent views, within a "stereo image pair", a fast algorithm predicts the other domain [1-2]. These algorithms achieve good coding performance, but because the accuracy of the predicted vectors in the second domain depends on the results of the first, the first domain usually relies on an exhaustive full search to guarantee accuracy, so the coding speed still needs improvement. The second category is joint motion and disparity estimation: based on the sequence correlation of stereoscopic video, the information of the motion domain and the disparity domain can be used mutually, and the motion/disparity vector of the current block is predicted directly from the motion and disparity vector relationship of adjacent images, minimizing coding complexity [3-4]. However, most existing work of this kind targets only the pixel domain or is based on the MPEG standard, and is therefore incompatible with the current mainstream block-based H.264/AVC video coding standard; moreover, a prediction vector obtained directly from the vector relationship of adjacent images easily falls into a local minimum, so coding quality cannot be guaranteed.
Based on the H.264/AVC standard, the present invention proposes a fast iterative search algorithm for stereoscopic video coding built on a stereo-motion constraint model, which greatly reduces coding complexity while maintaining a high compression rate.
References
[1] Ding L F, Chien S Y, Chen L G. Joint prediction algorithm and architecture for stereo video hybrid coding systems [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2006, 16(11): 1324-1337.
[2] Lai P, Ortega A. Predictive fast motion/disparity search for multiview video coding [C]// Proceedings of SPIE: Visual Communications and Image Processing. San Jose: SPIE, 2006, 6077: 607709.
[3] Paras I, Alvertos N, Tziritas G. Joint disparity and motion field estimation in stereoscopic image sequences [C]// Proceedings of the 13th International Conference on Pattern Recognition. Vienna: ICPR, 1996: 359-363.
[4] Kim Y, Lee J, Park C, et al. MPEG-4 compatible stereoscopic sequence codec for stereo broadcasting [J]. IEEE Transactions on Consumer Electronics, 2005, 51(4): 1227-1236.
Summary of the Invention
The purpose of the present invention is to provide a fast iterative search method for stereoscopic video coding that solves the high coding complexity of the right-view image frames other than the first frame, thereby realizing low-complexity stereoscopic video coding.
The technical scheme adopted by the present invention to solve the above technical problem is as follows:
A fast iterative search method for stereoscopic video coding. Let the macroblock MB_r,t in the right-view image at time t be the current block, let i be the iteration index, and let δ be the model error of the stereo-motion constraint model. mv_r,t(MB_r,t) denotes the optimal motion vector of the current block and dv_r,t(MB_r,t) its optimal disparity vector; J_mv^(i) and J_mv^(i-1) denote the rate-distortion costs of the current block's corrected motion vector at iterations i and i−1, and J_dv^(i) and J_dv^(i-1) the rate-distortion costs of its corrected disparity vector at iterations i and i−1. The method includes the following steps:
1.1. Initialization: determine the motion vector search starting point and the disparity vector search starting point of the current block; refine both to obtain the corrected motion vector and disparity vector search starting points of the current block; save the rate-distortion costs of the corrected starting points.
1.2. Adjust the size of the refinement search window RSR according to the model error δ: RSR = RSR_MIN when δ is below the threshold T1, RSR = RSR_MAX when δ is above the threshold T2, and an adaptively sized window that grows with δ in between, where T1 and T2 are two thresholds (T1 < T2), RSR_MIN is the minimum refinement search window, and RSR_MAX is the maximum refinement search window.
1.3. Iterative search: determine the initial disparity vector prediction of the current block at iteration i and refine it to obtain the corrected disparity vector dv^(i); determine the initial motion vector prediction of the current block at iteration i and refine it to obtain the corrected motion vector mv^(i); save the rate-distortion costs of the corrected motion and disparity vectors of iteration i.
1.4. Stopping criterion: if the rate-distortion costs no longer decrease, i.e. J_mv^(i) ≥ J_mv^(i-1) and J_dv^(i) ≥ J_dv^(i-1), where J_mv^(i) and J_dv^(i) are the rate-distortion costs of the corrected motion and disparity vectors of the current block at iteration i, then take the corrected motion vector mv^(i-1) and corrected disparity vector dv^(i-1) of iteration i−1 as the optimal motion vector mv_r,t(MB_r,t) and the optimal disparity vector dv_r,t(MB_r,t) of the current block and end the iterative search; otherwise let i = i + 1, recompute and update the model error by δ = ‖ mv^(i) + dv^(i)(MCMB_r,t-1) − dv^(i) − mv^(i)(DCMB_l,t) ‖, and jump to step 1.2. Here mv^(i) and dv^(i) are the corrected motion and disparity vectors of the current block at iteration i, dv^(i)(MCMB_r,t-1) is the disparity vector of the motion-compensated block of the current block in the temporal reference frame at iteration i, and mv^(i)(DCMB_l,t) is the motion vector of the disparity-compensated block of the current block in the view-direction reference frame at iteration i.
The aforementioned step 1.1 includes:
2.1. Let i = 0 and δ = 0.
2.2. The motion vector search starting point and the disparity vector search starting point of the current block are obtained from the candidate vector sets {mv_a, mv_b, mv_c, mv_med, mv_l,t} and {dv_a, dv_b, dv_c, dv_med, dv_r,t-1},
where mv_a/dv_a, mv_b/dv_b and mv_c/dv_c are the motion or disparity vectors of the left block a, upper block b and upper-right block c adjacent to the current block; mv_med and dv_med are the median vectors of the current block's motion vector and disparity vector candidates; mv_l,t is the motion vector of the co-located block in the view-direction reference frame of the current block, and dv_r,t-1 is the disparity vector of the co-located block in the temporal reference frame of the current block.
2.3. Centered on the motion vector search starting point and on the disparity vector search starting point of the current block, define a refinement search window of RSR_MIN × RSR_MIN and perform vector refinement within it to obtain the corrected motion vector and disparity vector search starting points. Save the rate-distortion cost of the corrected motion vector search starting point, denoted J_mv^(0), and the rate-distortion cost of the corrected disparity vector search starting point, denoted J_dv^(0). Let i = i + 1.
In the aforementioned step 1.2, the threshold T1 is 5, the threshold T2 is 20, RSR_MIN is 2, and RSR_MAX is 96.
The aforementioned step 1.3 includes:
4.1. The initial disparity vector prediction of the current block at iteration i is computed as
dv_0^(i) = mv^(i-1) + dv^(i)(MCMB_r,t-1) − mv^(i-1)(DCMB_l,t),
where mv^(i-1) is the motion vector of the current block obtained at iteration i−1, and mv^(i-1)(DCMB_l,t) is the motion vector of the disparity-compensated block of the current block in the view-direction reference frame obtained at iteration i−1. The motion-compensated block of the current block in the temporal reference frame may cover several already-coded blocks with disparity vectors dv_k (k = 1, …, u, where u is the number of covered blocks); the dv_k yielding the smallest rate-distortion cost is taken as the disparity vector of the motion-compensated block at iteration i, denoted dv^(i)(MCMB_r,t-1).
Centered on dv_0^(i), define an RSR × RSR search window and perform vector refinement within it to obtain the corrected disparity vector dv^(i) of the current block at iteration i. Save its rate-distortion cost, denoted J_dv^(i).
4.2. The initial motion vector prediction of the current block at iteration i is computed as
mv_0^(i) = dv^(i) + mv^(i)(DCMB_l,t) − dv^(i)(MCMB_r,t-1),
where dv^(i) is the corrected disparity vector of the current block obtained in step 4.1, and dv^(i)(MCMB_r,t-1) is the disparity vector of the motion-compensated block in the temporal reference frame obtained in step 4.1. The disparity-compensated block of the current block in the view-direction reference frame may cover several already-coded blocks with motion vectors mv_k (k = 1, …, v, where v is the number of covered blocks); the mv_k yielding the smallest rate-distortion cost is taken as the motion vector of the disparity-compensated block at iteration i, denoted mv^(i)(DCMB_l,t).
Centered on mv_0^(i), define an RSR × RSR search window and perform vector refinement within it to obtain the corrected motion vector mv^(i) of the current block at iteration i. Save its rate-distortion cost, denoted J_mv^(i).
Compared with the prior art, the advantage of the present invention is as follows: the traditional stereoscopic video coding algorithm searches the motion vector and the disparity vector independently, using a large search window in the temporal reference frame and in the view-direction reference frame, without exploiting the vector relationship within a "stereo image pair". The present invention establishes a stereo-motion constraint model and devises an iterative search strategy whose refinement window adapts to the model error. The method greatly reduces the complexity of stereoscopic video coding and increases coding speed while maintaining coding quality.
Experimental results show that the method of the present invention saves 96.43% of the encoding time on average with essentially no loss of coding quality.
Description of the Drawings
Fig. 1 is a schematic diagram of the stereoscopic video coding structure;
Fig. 2 is a schematic diagram of the stereo-motion constraint model;
Fig. 3 is a flow chart of the method of the invention;
Fig. 4 is a schematic diagram of the prediction vectors of adjacent blocks;
Fig. 5 is a schematic diagram of computing the initial motion vector prediction;
Fig. 6 is a schematic diagram of computing the initial disparity vector prediction;
Fig. 7 shows the rate-distortion curves of different algorithms for the "Ballroom" sequence.
Detailed Description of the Embodiments
The present invention is further described in detail below with reference to the accompanying drawings and embodiments.
Fig. 2 is a schematic diagram of the stereo-motion constraint model. The four images of the left and right views at two adjacent moments are called a "stereo image pair". The right-view image F_r,t at time t is the current frame, and MB_r,t is the current block in F_r,t. F_l,t is the reference frame of the current block MB_r,t in the view direction, and F_r,t-1 is its reference frame in the temporal direction. F_l,t-1 is the temporal reference frame of DCMB_l,t and also the view-direction reference frame of MCMB_r,t-1. Here DCMB_l,t is the disparity-compensated block of the current block MB_r,t in F_l,t; MCMB_l,t-1 is the motion-compensated block of DCMB_l,t in F_l,t-1; MCMB_r,t-1 is the motion-compensated block of the current block MB_r,t in F_r,t-1; and DCMB_l,t-1 is the disparity-compensated block of MCMB_r,t-1 in F_l,t-1.
The relationship between the motion vectors and disparity vectors of a "stereo image pair" can be expressed as follows:
δ = ‖ mv_r,t(MB_r,t) + dv_r,t-1(MCMB_r,t-1) − dv_r,t(MB_r,t) − mv_l,t(DCMB_l,t) ‖
where ‖v‖ denotes the norm of v, δ is the model error of the stereo-motion constraint model, mv_r,t(MB_r,t) is the optimal motion vector of the current block MB_r,t, dv_r,t(MB_r,t) is its optimal disparity vector, dv_r,t-1(MCMB_r,t-1) is the disparity vector of MCMB_r,t-1, and mv_l,t(DCMB_l,t) is the motion vector of DCMB_l,t.
If and only if MCMB_l,t-1 and DCMB_l,t-1 are the true convergence points of the same object on the different 3D surfaces does δ in the above formula equal 0, that is,
mv_r,t(MB_r,t) + dv_r,t-1(MCMB_r,t-1) = dv_r,t(MB_r,t) + mv_l,t(DCMB_l,t)
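The constraint model and its δ = 0 identity can be expressed directly in code. A minimal sketch, with invented function names; each vector is a 2-tuple (horizontal, vertical), and the predictions are obtained by rearranging the identity above:

```python
import numpy as np

def model_error(mv_rt, dv_mcmb, dv_rt, mv_dcmb):
    """delta = || mv_r,t + dv(MCMB_r,t-1) - dv_r,t - mv(DCMB_l,t) ||."""
    v = (np.asarray(mv_rt) + np.asarray(dv_mcmb)
         - np.asarray(dv_rt) - np.asarray(mv_dcmb))
    return float(np.linalg.norm(v))

def predict_dv(mv_rt, dv_mcmb, mv_dcmb):
    """Initial disparity prediction obtained by setting delta = 0."""
    return tuple(np.asarray(mv_rt) + np.asarray(dv_mcmb) - np.asarray(mv_dcmb))

def predict_mv(dv_rt, mv_dcmb, dv_mcmb):
    """Initial motion prediction obtained by setting delta = 0."""
    return tuple(np.asarray(dv_rt) + np.asarray(mv_dcmb) - np.asarray(dv_mcmb))
```

`model_error` is the δ used later to size the refinement window; `predict_dv` and `predict_mv` are the closed-form initial values refined at each iteration.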
The present invention designs a fast prediction method for the motion and disparity vectors of the coding blocks of all right-view frames except the first; the first frame of the right view still uses the full search algorithm to guarantee search accuracy. Fig. 3 is the flow chart of the method, which consists of four steps: initialization, adjustment of the refinement search window size, iterative search, and the stopping criterion. Let the macroblock MB_r,t in the right-view image at time t be the current block, let i denote the iteration index and δ the model error of the stereo-motion constraint model, and let mv_r,t(MB_r,t) and dv_r,t(MB_r,t) denote the optimal motion and disparity vectors of the current block. The steps of the method are as follows:
Step 1, initialization:
1) Let i = 0 and δ = 0.
2) Determine the motion vector search starting point and the disparity vector search starting point of the current block from the candidate vector sets {mv_a, mv_b, mv_c, mv_med, mv_l,t} and {dv_a, dv_b, dv_c, dv_med, dv_r,t-1}, where mv_a/dv_a, mv_b/dv_b and mv_c/dv_c are the motion or disparity vectors of the left block a, upper block b and upper-right block c adjacent to the current block. mv_med and dv_med are the median vectors of the current block's motion vector and disparity vector candidates: the horizontal and vertical components of mv_med are the medians of the horizontal and vertical components of mv_a, mv_b and mv_c, and the components of dv_med are likewise the medians of the components of dv_a, dv_b and dv_c. mv_l,t is the motion vector of the co-located block in the view-direction reference frame of the current block, and dv_r,t-1 is the disparity vector of the co-located block in the temporal reference frame of the current block, as shown in Fig. 4. Since the neighboring blocks of the current block are already coded, each neighbor carries only a motion vector or only a disparity vector; for example, for the left neighbor a only mv_a or dv_a exists, so if a was coded by disparity estimation, mv_a does not exist. In that case we substitute (0, 0) for mv_a when computing the median vector mv_med. The rate-distortion costs of all candidate prediction vectors are compared, and the candidates minimizing the cost are selected as the motion vector and disparity vector search starting points of the current block.
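The starting-point selection above (component-wise median with a (0, 0) fallback for neighbors coded in the other domain, then the cheapest candidate by rate-distortion cost) can be sketched as follows; the helper names are invented, and the cost function is passed in as a callable:

```python
def median_vector(va, vb, vc):
    """Component-wise median of the three neighboring-block vectors.

    A neighbor coded in the other domain has no vector here; it is
    represented as None and contributes (0, 0), as described in the text.
    """
    va, vb, vc = [(0, 0) if v is None else v for v in (va, vb, vc)]
    return (sorted((va[0], vb[0], vc[0]))[1],   # median of horizontal components
            sorted((va[1], vb[1], vc[1]))[1])   # median of vertical components

def search_start(candidates, cost):
    """Pick the candidate vector with the smallest rate-distortion cost."""
    return min(candidates, key=cost)
```

For the motion vector the candidate list would hold {mv_a, mv_b, mv_c, mv_med, mv_l,t}, and for the disparity vector {dv_a, dv_b, dv_c, dv_med, dv_r,t-1}.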
3) Centered on the motion vector search starting point and on the disparity vector search starting point of the current block, define a refinement search window of RSR_MIN × RSR_MIN and perform vector refinement within it to obtain the corrected motion vector and disparity vector search starting points. Save the rate-distortion cost of the corrected motion vector search starting point, denoted J_mv^(0), and that of the corrected disparity vector search starting point, denoted J_dv^(0). Let i = i + 1.
Step 2, adjust the size of the refinement search window RSR according to the model error δ: RSR = RSR_MIN when δ is below the threshold T1, RSR = RSR_MAX when δ is above the threshold T2, and an adaptively sized window in between. Here RSR_MIN is the smallest refinement window (usually 2 × 2 pixels), RSR_MAX is the largest refinement window (usually the search window size of the full search algorithm, i.e. 96 × 96), and T1 and T2 are two thresholds (T1 < T2) that control the window size. When the model error δ is greater than the threshold T2 (usually 20), the current block is probably in a motion-occlusion region, and the large refinement window RSR_MAX is needed to guarantee search accuracy. When δ is smaller than the threshold T1 (usually 5), the motion/disparity vectors obtained in the current iteration are already very close to the optimal vectors of the current block, so the small refinement window RSR_MIN suffices. Otherwise, the refinement window size is set adaptively according to the magnitude of δ.
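A sketch of this window adaptation; the intermediate branch between T1 and T2 is not spelled out in the text, so linear growth with δ is assumed here purely for illustration:

```python
def refine_search_range(delta, t1=5, t2=20, rsr_min=2, rsr_max=96):
    """Adapt the refinement search window RSR to the model error delta.

    delta <= t1 : vectors already near-optimal -> smallest window RSR_MIN
    delta >= t2 : likely motion-occlusion area -> largest window RSR_MAX
    otherwise   : grow the window with delta (linear interpolation is an
                  assumption; the patent only says the size adapts to delta)
    """
    if delta <= t1:
        return rsr_min
    if delta >= t2:
        return rsr_max
    frac = (delta - t1) / (t2 - t1)
    return int(round(rsr_min + frac * (rsr_max - rsr_min)))
```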
Step 3, the iterative search process:
1) Compute the initial disparity vector prediction of the current block at iteration i as dv_0^(i) = mv^(i-1) + dv^(i)(MCMB_r,t-1) − mv^(i-1)(DCMB_l,t), where mv^(i-1) is the motion vector of the current block obtained at iteration i−1, and mv^(i-1)(DCMB_l,t) is the motion vector of the disparity-compensated block of the current block in the view-direction reference frame obtained at iteration i−1. As shown in Fig. 5, the motion-compensated block of the current block in the temporal reference frame covers u already-coded blocks with disparity vectors dv_k; the dv_k yielding the smallest rate-distortion cost is taken as the disparity vector of the motion-compensated block at iteration i, denoted dv^(i)(MCMB_r,t-1).
Centered on dv_0^(i), define an RSR × RSR search window and perform vector refinement within it to obtain the corrected disparity vector dv^(i) of the current block at iteration i. Save its rate-distortion cost, denoted J_dv^(i).
2) Compute the initial motion vector prediction of the current block at iteration i as mv_0^(i) = dv^(i) + mv^(i)(DCMB_l,t) − dv^(i)(MCMB_r,t-1), where dv^(i) is the corrected disparity vector of the current block obtained in step 1), and dv^(i)(MCMB_r,t-1) is the disparity vector of the motion-compensated block in the temporal reference frame obtained in step 1). As shown in Fig. 6, the disparity-compensated block of the current block in the view-direction reference frame covers v already-coded blocks with motion vectors mv_k; the mv_k yielding the smallest rate-distortion cost is taken as the motion vector of the disparity-compensated block at iteration i, denoted mv^(i)(DCMB_l,t).
Centered on mv_0^(i), define an RSR × RSR search window and perform vector refinement within it to obtain the corrected motion vector mv^(i) of the current block at iteration i. Save its rate-distortion cost, denoted J_mv^(i).
Step 4, the stopping criterion. Let J_mv^(i) and J_mv^(i-1) be the rate-distortion costs of the corrected motion vector of the current block at iterations i and i−1, and J_dv^(i) and J_dv^(i-1) the rate-distortion costs of its corrected disparity vector at iterations i and i−1. If J_mv^(i) ≥ J_mv^(i-1) and J_dv^(i) ≥ J_dv^(i-1), the corrected motion vector mv^(i-1) and corrected disparity vector dv^(i-1) of iteration i−1 are taken as the optimal motion vector mv_r,t(MB_r,t) and the optimal disparity vector dv_r,t(MB_r,t) of the current block, and the iterative search ends. Otherwise, let i = i + 1, recompute and update the model error by δ = ‖ mv^(i) + dv^(i)(MCMB_r,t-1) − dv^(i) − mv^(i)(DCMB_l,t) ‖, and jump to Step 2, where mv^(i) and dv^(i) are the corrected motion and disparity vectors of the current block at iteration i, dv^(i)(MCMB_r,t-1) is the disparity vector of the motion-compensated block of the current block in the temporal reference frame at iteration i, and mv^(i)(DCMB_l,t) is the motion vector of the disparity-compensated block of the current block in the view-direction reference frame at iteration i.
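Putting Steps 2-4 together, the outer loop with this stopping rule looks roughly as follows. The refinement pass `step` (window adjustment plus the two vector refinements of Step 3) and the two cost functions are placeholders supplied by the caller, so this is only a skeleton of the control flow, not the full method:

```python
def iterative_search(step, cost_mv, cost_dv, mv0, dv0, max_iter=16):
    """Iterate while either rate-distortion cost still decreases.

    step(mv, dv) -> (new_mv, new_dv) : one refinement pass (Steps 2-3)
    cost_mv, cost_dv                 : rate-distortion costs J_mv, J_dv
    On the first non-improving iteration, the previous iteration's
    vectors are returned as the final motion/disparity vectors (Step 4).
    """
    mv, dv = mv0, dv0
    j_mv, j_dv = cost_mv(mv), cost_dv(dv)
    for _ in range(max_iter):                 # guard against non-convergence
        new_mv, new_dv = step(mv, dv)
        nj_mv, nj_dv = cost_mv(new_mv), cost_dv(new_dv)
        if nj_mv >= j_mv and nj_dv >= j_dv:   # neither cost improved
            return mv, dv                     # keep iteration i-1 result
        mv, dv, j_mv, j_dv = new_mv, new_dv, nj_mv, nj_dv
    return mv, dv
```

The `max_iter` cap is an added safeguard, not part of the patented criterion; in practice the cost test alone terminates the loop within a few iterations.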
To examine the performance of the proposed method, it is compared with the full-search method. The experimental platform is JMVM 8.0; the motion-estimation and disparity-estimation search window size is 32×32 and the macroblock mode is Inter16×16. From each of the sequences "Ballroom", "Exit", "Vassar", "Race1", and "Rena", 100 frames are taken from viewpoint 0 and viewpoint 1 as the left view and right view, respectively. The left view is first encoded with the conventional H.264 method, and the right view is encoded with the fast iterative search method for stereo video coding of the present invention. Each sequence has a resolution of 640×480. All experiments are executed independently on a PC configured with an Intel(R) Core(TM)2 Extreme X9650 2.99 GHz CPU and 4 GB RAM.
Table 1 compares the coding results of the different algorithms, and Fig. 7 shows the rate-distortion curves of the different algorithms for the "Ballroom" sequence. Compared with the JMVM full search, the proposed algorithm changes the peak signal-to-noise ratio by between -0.33 dB and +0.01 dB and the average bit rate by between -10.93% and +0.07%, with essentially no visible loss in coding quality, while saving more than 96% of the encoding time.
Table 1 Coding results of the different algorithms compared with the JMVM full-search algorithm
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201110007342 CN102045571B (en) | 2011-01-13 | 2011-01-13 | Fast iterative search algorithm for stereo video coding |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201110007342 CN102045571B (en) | 2011-01-13 | 2011-01-13 | Fast iterative search algorithm for stereo video coding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102045571A CN102045571A (en) | 2011-05-04 |
CN102045571B true CN102045571B (en) | 2012-09-05 |
Family
ID=43911273
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 201110007342 Expired - Fee Related CN102045571B (en) | 2011-01-13 | 2011-01-13 | Fast iterative search algorithm for stereo video coding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102045571B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130163880A1 (en) * | 2011-12-23 | 2013-06-27 | Chao-Chung Cheng | Disparity search methods and apparatuses for multi-view videos |
CN102595164A (en) * | 2012-02-27 | 2012-07-18 | 中兴通讯股份有限公司 | Method, device and system for sending video image |
WO2013159326A1 (en) * | 2012-04-27 | 2013-10-31 | Mediatek Singapore Pte. Ltd. | Inter-view motion prediction in 3d video coding |
US10277844B2 (en) * | 2016-04-20 | 2019-04-30 | Intel Corporation | Processing images based on generated motion data |
CN108419082B (en) * | 2017-02-10 | 2020-09-11 | 北京金山云网络技术有限公司 | Motion estimation method and device |
CN117426095A (en) * | 2021-06-04 | 2024-01-19 | 抖音视界有限公司 | Method, apparatus and medium for video processing |
CN115908170B (en) * | 2022-11-04 | 2023-11-21 | 浙江华诺康科技有限公司 | Noise reduction method and device for binocular image, electronic device and storage medium |
CN115630191B (en) * | 2022-12-22 | 2023-03-28 | 成都纵横自动化技术股份有限公司 | Time-space data set retrieval method and device based on full-dynamic video and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101600108A (en) * | 2009-06-26 | 2009-12-09 | 北京工业大学 | A joint motion and disparity estimation method in multi-view video coding |
CN101895749A (en) * | 2010-06-29 | 2010-11-24 | 宁波大学 | Quick parallax estimation and motion estimation method |
CN101917619A (en) * | 2010-08-20 | 2010-12-15 | 浙江大学 | A fast motion estimation method for multi-view video coding |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090167843A1 (en) * | 2006-06-08 | 2009-07-02 | Izzat Hekmat Izzat | Two pass approach to three dimensional Reconstruction |
- 2011-01-13: CN 201110007342 patent CN102045571B/en — not_active, Expired - Fee Related
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101600108A (en) * | 2009-06-26 | 2009-12-09 | 北京工业大学 | A joint motion and disparity estimation method in multi-view video coding |
CN101895749A (en) * | 2010-06-29 | 2010-11-24 | 宁波大学 | Quick parallax estimation and motion estimation method |
CN101917619A (en) * | 2010-08-20 | 2010-12-15 | 浙江大学 | A fast motion estimation method for multi-view video coding |
Non-Patent Citations (1)
Title |
---|
Deng Zhipin, Jia Kebin, Chen Ruilin, Fu Changhong, Xiao Yunzhi. A synchronous joint disparity-motion prediction algorithm for stereoscopic video. Journal of Computer-Aided Design & Computer Graphics, 2010, Vol. 22, No. 10. *
Also Published As
Publication number | Publication date |
---|---|
CN102045571A (en) | 2011-05-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102045571B (en) | Fast iterative search algorithm for stereo video coding | |
CN101600108B (en) | Joint estimation method for movement and parallax error in multi-view video coding | |
CN110087087B (en) | VVC inter-frame coding unit prediction mode early decision and block division early termination method | |
CN107027029B (en) | High-performance video coding improvement method based on frame rate conversion | |
CN101860748B (en) | System and method for generating side information based on distributed video coding | |
CN103533359B (en) | One is bit rate control method H.264 | |
CN100463527C (en) | A method for disparity estimation of multi-viewpoint video images | |
CN103051894B (en) | A kind of based on fractal and H.264 binocular tri-dimensional video compression & decompression method | |
CN104469336B (en) | Coding method for multi-view depth video signals | |
US20230042575A1 (en) | Methods and systems for estimating motion in multimedia pictures | |
CN102752588A (en) | Video encoding and decoding method using space zoom prediction | |
CN101895749B (en) | Quick parallax estimation and motion estimation method | |
CN110557646B (en) | Intelligent inter-view coding method | |
CN101304529A (en) | Method and device for selecting macroblock mode | |
CN102291579A (en) | Rapid fractal compression and decompression method for multi-cast stereo video | |
CN103327327A (en) | Selection method of inter-frame predictive coding units for HEVC | |
CN107071421A (en) | A kind of method for video coding of combination video stabilization | |
TWI489876B (en) | A Multi - view Video Coding Method That Can Save Decoding Picture Memory Space | |
CN102316323B (en) | A Fast Fractal Compression and Decompression Method for Binocular Stereo Video | |
CN101867818B (en) | Selection method and device of macroblock mode | |
Yan et al. | CTU layer rate control algorithm in scene change video for free-viewpoint video | |
CN101557519B (en) | Multi-view video coding method | |
CN102263953B (en) | Quick fractal compression and decompression method for multicasting stereo video based on object | |
CN114827616B (en) | Compressed video quality enhancement method based on space-time information balance | |
CN113507607B (en) | Compressed video multi-frame quality enhancement method without motion compensation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20120905; Termination date: 20150113 |
EXPY | Termination of patent right or utility model |