CN101090491B

CN101090491B - Enhanced Block-Based Motion Estimation Algorithm for Video Compression

Info

Publication number: CN101090491B
Application number: CN200610064777.XA
Authority: CN
Inventors: 区子廉; 黄海明
Original assignee: Hong Kong University of Science and Technology
Current assignee: Hong Kong University of Science and Technology
Priority date: 2006-06-16
Filing date: 2006-12-15
Publication date: 2016-05-18
Anticipated expiration: 2026-12-15
Also published as: CN101090491A

Abstract

A method, system and software are presented for obtaining, for blocks of a first image, similar blocks of a second image (a "reference image"). The blocks of the first image are processed sequentially: for each block, a plurality of candidate locations in the second image is derived, and a cost function is evaluated for each candidate location. Each candidate location in the second image is displaced by a respective motion vector from the block of the first image. In a first aspect of the invention, the cost function is a function of a predicted motion vector of a future block of the first image (i.e. a block of the first image that has not yet been processed). In a second aspect of the invention, the motion vector is given by position values that are not all integer multiples of a full-pixel, half-pixel, or quarter-pixel spacing.

Description

Enhanced Block-Based Motion Estimation Algorithm for Video Compression

Related application

This application claims priority to US Provisional Patent Application 60/691,181, filed June 17, 2005, which is hereby incorporated by reference in its entirety.

Technical field

The present invention relates generally to methods and systems for digital signal compression, coding and representation, and more particularly to methods and systems using multi-frame motion estimation (ME). The invention further relates to a computer program product, such as a recording medium, carrying program instructions readable by a computing device to cause the computing device to perform a method according to the invention.

Background art

Because of the huge size of the raw digital video data (or image sequences) used by modern multimedia applications, such data must be compressed before it can be transmitted and stored. There are many important video compression standards, including the ISO/IEC MPEG-1, MPEG-2 and MPEG-4 standards and the ITU-T H.261, H.263 and H.264 standards. The ISO/IEC MPEG-1/2/4 standards are used extensively by the entertainment industry to distribute movies and digital video broadcasts, including Video Compact Disc or VCD (MPEG-1), Digital Video Disc or Digital Versatile Disc or DVD (MPEG-2), recordable DVD (MPEG-2), Digital Video Broadcasting or DVB (MPEG-2), video on demand or VOD (MPEG-2), high-definition television or HDTV in the US (MPEG-2), and so on.

The MPEG-4 standard is more advanced than MPEG-2 and can achieve high-quality video at lower bit rates, which makes it well suited to video streaming over the Internet, digital wireless networks (e.g. 3G networks), the Multimedia Messaging Service (the MMS standard from 3GPP), and so on. MPEG-4 has been adopted by the next-generation high-definition DVD (HD-DVD) standard and the MMS standard. The ITU-T H.261/3/4 standards are designed for low-delay video telephony and video conferencing systems. The early H.261 standard was designed to operate at p × 64 kbit/s, with p = 1, 2, ..., 31. The later H.263 standard is very successful and is widely used in modern video conferencing systems and for video streaming over broadband and wireless networks, including the Multimedia Messaging Service (MMS) in 2.5G, 3G and other networks. The latest standard, H.264 (also called MPEG-4 Part 10 or MPEG-4 AVC), is the current state-of-the-art video compression standard. It is so powerful that MPEG decided to develop it jointly with ITU-T within the framework of the Joint Video Team (JVT). The new standard is called H.264 in ITU-T, and MPEG-4 Advanced Video Coding (MPEG-4 AVC) or MPEG-4 Part 10 in MPEG. H.264 is used in the HD-DVD standard and the Digital Video Broadcasting (DVB) standard, and possibly in the MMS standard. A related standard based on H.264, called the Audio Video Standard (AVS), is currently being developed in China: AVS 1.0 is designed for high-definition television (HDTV), and AVS-M for mobile applications. H.264 achieves better objective and subjective video quality than the MPEG-1/2/4 and H.261/3 standards.

The basic coding algorithm of H.264 [1] is similar to that of H.263 or MPEG-4, except that an integer 4x4 discrete cosine transform (DCT) is used instead of the traditional 8x8 DCT, and there are additional features, including intra prediction modes for I-frames, multiple block sizes and multiple reference frames for motion estimation/compensation, quarter-pixel accuracy for motion estimation, an in-loop deblocking filter, context-adaptive binary arithmetic coding, and so on.

Motion estimation is a core part of most video compression standards (such as MPEG-1/2/4 and H.261/3/4). It exploits temporal redundancy, so its performance directly affects the compression efficiency, subjective video quality and encoding speed of a video coding system.

In block-matching motion estimation (BMME), the most common measure of distortion between the current block and the reference block is the sum of absolute differences (SAD), which for an N×N block is defined as

$SAD(mvx, mvy) = \sum_{m=0,\, n=0}^{N-1} \left| F_t(x+m,\, y+n) - F_{t-1}(x+m+mvx,\, y+n+mvy) \right|$

where F_t is the current frame, F_{t-1} is the reference frame, and (mvx, mvy) is the current motion vector (MV). For a frame of width X and height Y with block size N×N, the total number of search points at which the SAD must be evaluated to find the optimal motion vector within a search range of ±W is:

$\left(\frac{X}{N}\right)\left(\frac{Y}{N}\right)(2W+1)^2,$

For X = 352, Y = 288, N = 16 and W = 32, this equals 1,673,100. Evaluating this many points consumes a great deal of computing power in a video encoder. Many fast algorithms [2]-[9] have been proposed to reduce the number of search points in ME, such as the three-step search (TSS) [11], the 2D logarithmic search [12], the new three-step search (NTSS) [3], MVFAST [7] and PMVFAST [2]. MVFAST and PMVFAST significantly outperform the first three algorithms because they perform center-biased ME using the median motion vector predictor as the search center, which reduces the number of bits needed for MV coding by smoothing the motion vector field.
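As an illustration of the SAD definition and the search-point count above, the following is a minimal Python sketch. The function names and the row-major frame layout (`frame[row][column]`) are assumptions made for this example only; they do not come from the patent.

```python
def sad(cur, ref, x, y, mvx, mvy, n):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (x, y) and the block of `ref` displaced by (mvx, mvy).
    Frames are lists of rows, so a pixel is addressed as frame[row][column]."""
    total = 0
    for dy in range(n):
        for dx in range(n):
            total += abs(cur[y + dy][x + dx] - ref[y + dy + mvy][x + dx + mvx])
    return total


def full_search_points(width, height, n, w):
    """Number of SAD evaluations for an exhaustive search over +/- w."""
    return (width // n) * (height // n) * (2 * w + 1) ** 2


print(full_search_points(352, 288, 16, 32))  # 1673100
```

The count is the product of the 22 × 18 = 396 blocks in a 352×288 frame and the 65² = 4225 candidate displacements per block.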

The PMVFAST algorithm (a significant improvement over MVFAST and other fast algorithms, and therefore adopted by the MPEG standard [10]) initially considers a set of MV predictors: the median, zero, left, top, top-right and previous co-located MV predictors. Figure 1 illustrates the positions of the current block, the left block, the top block, the top-right block, the top-right-right block, and the right block (which is a "future block", i.e. a block processed after the current block). PMVFAST computes the SAD cost of each predictor. In a later development, PMVFAST was modified to compute an RD (rate-distortion) cost [13] in place of the SAD cost, using the following cost function:

$J(m, \lambda_{motion}) = SAD(s, c(m)) + \lambda_{motion} R(m - p) \quad (1)$

where s is the original video signal, c is the reference video signal, m is the current MV, p is the median MV predictor of the current block, λ_motion is the Lagrange multiplier, and R(m-p) is the number of bits used to encode the motion information. The next step in PMVFAST is to select the MV predictor with the minimum cost and, depending on the value of that minimum cost, to perform either a large diamond search or a small diamond search.
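The cost function (1) can be sketched in Python as follows. The patent does not specify how R is computed, so the rate model here (the length of a signed Exp-Golomb code for each component of the MV difference) is an assumption chosen only to make the example concrete; the function names are likewise hypothetical.

```python
def se_golomb_bits(v):
    """Length in bits of the signed Exp-Golomb code of the integer v
    (an assumed rate model, not taken from the patent)."""
    k = 2 * abs(v) - (1 if v > 0 else 0)  # signed value -> unsigned code number
    return 2 * (k + 1).bit_length() - 1


def mv_rate(mv, pred):
    """R(m - p): bits needed to code both components of the MV difference."""
    return se_golomb_bits(mv[0] - pred[0]) + se_golomb_bits(mv[1] - pred[1])


def rd_cost(sad_value, mv, pred, lam):
    """J(m, lambda) = SAD + lambda * R(m - p), as in Equation (1)."""
    return sad_value + lam * mv_rate(mv, pred)


print(rd_cost(100, (0, 0), (0, 0), 4))   # 108
print(rd_cost(100, (1, -1), (0, 0), 4))  # 124
```

With this rate model, a candidate equal to the predictor costs only 2 bits, so for equal SAD the cost favors motion vectors close to the predictor.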

A separate but important issue in the definition of current video coding standards is the use of sub-pixel motion vectors, including half-pixel, quarter-pixel and possibly even 1/8-pixel motion vectors, which describe motion more accurately and can give a PSNR gain of about 1 dB over integer-pixel motion estimation. With half-pixel accuracy, motion vectors can take equally spaced position values such as 0.0, 0.5, 1.0, 1.5, 2.0, and so on. With quarter-pixel accuracy, motion vectors can take position values such as 0.00, 0.25, 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00, and so on. With 1/8-pixel accuracy, motion vectors can take position values such as 0.000, 0.125, 0.250, 0.375, 0.500, 0.625, 0.750, 0.875, 1.000, 1.125, 1.250, 1.375, 1.500, 1.625, 1.750, 1.875, 2.000, and so on.

It is well known that the motion vector distribution tends to be center-biased, meaning that motion vectors tend to lie very close to (0, 0). This is shown in Figure 6(a), which shows the distribution of motion vectors relative to the (0, 0) MV in the Foreman sequence, obtained using the full search (FS) algorithm. Furthermore, as shown in Figure 6(b), the motion vector distribution is also biased towards the median predictor (median MV), which is the median of the motion vectors of the left, top and top-right blocks shown in Figure 1. In addition, motion vectors are also biased towards the neighboring motion vectors in the current frame (leftMV, topMV, topRightMV), as shown in Figure 6(c), and towards the co-located motion vector in the previous frame (preMV), as shown in Figure 6(d). All of these can be regarded as predictors of the motion vector of the current block, and they can be used in PMVFAST.

Summary of the invention

It is an object of the present invention to provide new and useful techniques for motion estimation, suitable for use in methods and systems for the compression, coding and representation of digital signals.

In particular, the present invention seeks to provide new and useful efficient motion estimation techniques which may, for example, be applied in MPEG-1, MPEG-2, MPEG-4, H.261, H.263, H.264, AVS or other related video coding standards.

The first aspect of the invention is based on the realization that the motion estimation of the PMVFAST algorithm, although it has advantages over earlier techniques, is not optimal. In principle, for each frame of the video there exists a motion vector field {m_{i,j}, i = 0..M-1, j = 0..N-1} that minimizes the total RD cost of the whole frame:

$total\_RD\_Cost = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \left[ SAD(s_{i,j}, c(m_{i,j})) + \lambda_{i,j} R(m_{i,j} - p_{i,j}) \right] \quad (2)$

where (i, j) denotes the (i, j)-th block in a frame containing M×N blocks. For a fixed Q_p (the quantization parameter), λ_{i,j} = λ = constant, and

$p_{i,j} = median(m_{i,j-1}, m_{i-1,j}, m_{i-1,j+1}) \quad (3)$

However, minimizing the total RD cost of the whole frame simultaneously would require exponential computational complexity, which is impractical. PMVFAST and other known algorithms therefore consider the RD cost of only one block at a time, rather than that of all the blocks in a frame.

In particular, neither MVFAST nor PMVFAST takes into account that the motion vector derived for the current block changes the median MV predictor of the next block. This can affect the smoothness of the whole motion vector field.
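The dependence of the next block's predictor on the current block's MV can be illustrated with a short Python sketch of Equation (3). It assumes that the median is taken component by component, which the text does not state explicitly, and the neighbor values are made up for the example.

```python
def median3(a, b, c):
    """Median of three scalars."""
    return sorted((a, b, c))[1]


def median_mv(mv_a, mv_b, mv_c):
    """Component-wise median of three motion vectors (x, y).
    Taking the median per component is an assumption of this sketch."""
    return (median3(mv_a[0], mv_b[0], mv_c[0]),
            median3(mv_a[1], mv_b[1], mv_c[1]))


# Hypothetical MVs of the top-right and top-right-right blocks.
top_right, top_right_right = (4, 1), (0, 0)

for current in [(0, 0), (3, 1), (8, 2)]:
    # For the right block (i, j+1), the left neighbor is the current block,
    # so its median predictor changes with the MV chosen for the current block.
    p_right = median_mv(current, top_right, top_right_right)
    print(current, "->", p_right)
```

With these values, the three choices of current MV give the right block the predictors (0, 0), (3, 1) and (4, 1) respectively.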

In general terms, the first aspect of the invention proposes a new ME algorithm that improves on PMVFAST in both its cost definition and its choice of motion predictor candidates. In particular, for each current block of a first image (the current block may be 16×16, 16×8, 8×16, 8×8, 4×8, 8×4, 4×4 or another rectangular size, or even non-rectangular), a similar block of a second image (the reference image) is selected according to a cost function that includes two terms: (i) a term measuring the dissimilarity (e.g. SAD, SAE) between the current block and the similar block, and (ii) a term that is a function of at least a prediction of the motion vector of a future block of the first image.

In particular, the proposed algorithm makes it possible to improve the smoothness of the motion field by taking into account both the current median MV predictor and the estimated median MV predictor of future (i.e. not yet processed) coded blocks.

Many variations of the invention are possible. In particular, the blocks may be of any size and any shape.

There may be multiple second images (i.e. multiple references), and the search may include candidate locations in all of the second images.

Furthermore, the new cost function may be applied to a plurality of sub-blocks that together make up a larger region of the first image and that are encoded in a coding order defined by a coding number. These sub-blocks need not be of the same size or shape.

The invention has a further aspect, which may be used in combination with the first aspect or independently of it. In general terms, the second aspect of the invention proposes that, when a current block of a first image is encoded (the current block may be 16×16, 16×8, 8×16, 8×8, 4×8, 8×4, 4×4 or another rectangular size, or even non-rectangular), the motion vector used has position values (i.e. the respective components in the two axial directions) selected from a set of values different from those used in known techniques.

Consider one possible motion vector predictor: (0, 0). Whereas conventional integer-pixel techniques allow the motion vector to take position values such as -2.0, -1.0, 0, 1.0, 2.0, and so on, the second aspect of the invention proposes modifying the set of possible position values close to the predictor. In place of the position value 1.0, which is the closest to 0, another position value such as 0.85 may be used, so that the allowable position values include -2.0, -0.85, 0, 0.85, 2.0, and so on. The advantage is that, statistically, motion vectors tend to be close to 0. By choosing positions closer to 0, we therefore come closer to the true motion vector, which gives better motion compensation and can in turn lead to higher compression efficiency.

Thus, in one specific expression of the invention, the set of possible position values for at least one axial direction is chosen such that they cannot all be written as Lm, where m = ..., -2, -1, 0, 1, 2, ... and L is a constant (e.g. a spacing of 1 pixel, 1/2 pixel or 1/4 pixel); that is, the position values are non-uniform. In particular, the set of possible position values for at least one axial direction may be chosen such that they cannot all be written as m/n, where m = ..., -2, -1, 0, 1, 2, ... and n is 1 or a power of 2.

Note that the second aspect of the invention is not limited to selecting position values from a non-uniformly spaced set, nor to changing only the two position values closest to zero relative to the conventional set. For example, in another embodiment of the second aspect of the invention, the position value 2.0 may also be changed to 1.9, so that the allowable position values include -1.9, -0.85, 0, 0.85, 1.9, and so on.

Thus, in an alternative specific expression of the second aspect of the invention, the set of possible position values (for at least one of the directions) is chosen to include one or more position values that can be written as L·A_m·m/n, where m = ..., -2, -1, 0, 1, 2, ..., n is 1 or a power of 2, L is a constant (e.g. a spacing of 1 pixel, 1/2 pixel or 1/4 pixel), and A_m is a value less than 1 but at least 0.75 (optionally different for different values of m), more preferably at least 0.80, and most preferably at least 0.85.

We have found that the optimal value of A_m depends on the video.

An advantage of certain embodiments of the second aspect of the invention is that the motion vectors they produce can be encoded using the same codes as in conventional algorithms, except that the conventional code for each position value is interpreted as the corresponding position value used by the embodiment. For example, if the position values used by a particular embodiment are -1.9, -0.85, 0, 0.85, 1.9, and so on, then the conventional code for the position value 1.0 is interpreted as 0.85, the conventional code for the position value 2.0 as 1.9, and so on.
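The reinterpretation of conventional codes can be sketched as a simple lookup, using the form L·A_m·m/n given above. The function name and the dictionary of scale factors are hypothetical; the factors A_1 = 0.85 and A_2 = 0.95 are simply the ones implied by the example position values 0.85 and 1.9, and any m not listed is assumed to keep its conventional position.

```python
def position_value(m, a, L=1.0, n=1):
    """Map the conventional integer code m to the position L * A_m * m / n.
    `a` maps |m| to the scale factor A_m; codes without an entry keep the
    conventional, unscaled position (an assumption of this sketch)."""
    a_m = a.get(abs(m), 1.0)
    return L * a_m * m / n


# Scale factors implied by the example values 0.85 and 1.9 at integer spacing.
A = {1: 0.85, 2: 0.95}

print([round(position_value(m, A), 2) for m in range(-2, 3)])
# [-1.9, -0.85, 0.0, 0.85, 1.9]
```

Because only the interpretation of each code changes, the bitstream syntax for the motion vector is the same as in the conventional case.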

A method according to the second aspect of the invention may include the following steps: defining a search region, and defining within it a plurality of candidate locations, including a set of locations defined by the new position values of the second aspect of the invention. These position values are the respective displacements of the candidate locations from a key location (e.g. the (0, 0) motion vector position, a predicted motion vector, etc.). For each candidate motion vector, we compute a cost function that is a function of a similarity measure (e.g. SAD, SAE) between the current block in the first image and the block at the candidate motion vector in the second image. Optionally, the cost function may also be a function of the following motion vectors: the candidate motion vector, the current predicted motion vector and, optionally, as in the first aspect of the invention, one or more future predicted motion vectors. For example, the cost function may optionally be that of the first aspect of the invention.

Brief description of the drawings

Embodiments of the invention will now be described, by way of example only, with reference to the following drawings, in which:

Figure 1 shows the current block, the left block, the top block, the top-right block, the top-right-right block, and the right block;

Figure 3a shows the search pattern of the large diamond search used in the first embodiment of the invention;

Figure 3b shows the search pattern of the modified large diamond search used in the first embodiment of the invention;

Figure 4 shows the search pattern of the small diamond search used in the first embodiment of the invention;

Figures 5a and 5b compare the smoothness of the MV fields of PMVFAST and of the first embodiment of the invention;

Figure 6 shows the motion vector distribution in the Foreman sequence, obtained using the full search (FS) algorithm, relative to (a) the (0, 0) MV, (b) the median MV, (c) the neighboring MVs in the current frame and the co-located MV in the previous frame, and (d) the bottom-right MV in the previous frame (PreBottomRightMV);

Figure 7 is a flowchart of the first embodiment of the invention.

Detailed description

The first embodiment of the invention retains many features of the PMVFAST algorithm, but improves on PMVFAST (and other existing algorithms) by considering several neighboring blocks rather than just one block. Equations (2) and (3) show that the choice of the MV of the current block directly affects the RD cost of neighboring blocks, including the right block (the (i, j+1)-th block), the bottom-left block (the (i+1, j-1)-th block) and the bottom block (the (i+1, j)-th block). This is because the current MV affects the predicted MVs of these neighboring blocks, and hence the optimal motion vectors of those blocks. They are "future" blocks, because motion estimation has not yet been performed on them when the current block is processed. We cannot compute the optimal motion vectors of these future blocks at the same time as that of the current block, because doing so would require computing the optimal motion vectors of all the blocks in the whole frame simultaneously according to Equation (2), which would be very complex.

Instead, to assess the implication of the choice of the current MV of the current block for the right block, i.e. the (i, j+1)-th block, we can add a term to the RD cost function of the current block in Equation (1):

$R(|m_{i,j+1} - p_{i,j+1}|) \quad (4)$

where m_{i,j+1} is the optimal motion vector of the right block given the current MV of the current block, and p_{i,j+1} is the median MV predictor of the right block given the current MV of the current block, i.e.

$p_{i,j+1} = median(m_{i,j}, m_{i-1,j+1}, m_{i-1,j+2}) \quad (5)$

For the (i, j)-th block, let medianMV denote the median MV predictor given by Equation (3), and let FmedianMV denote the future median MV predictor given by Equation (5) (the median MV predictor of the right block). FmedianMV is thus a function of the MV candidate. The first embodiment, described here for the (i, j)-th block, is referred to as the "Enhanced Predictive Motion Vector Field Adaptive Search Technique" (E-PMVFAST).

The steps of the embodiment are as follows, and are shown in Figure 7.

For any candidate MV, the cost is defined as follows.

$cost(MV) = SAD + \lambda \left[ w \cdot R(MV - medianMV) + (1 - w) \cdot R(MV - FmedianMV) \right] \quad (6)$
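Equation (6) can be sketched in Python as follows. As before, the rate model (signed Exp-Golomb code lengths) and the component-wise median are assumptions of this sketch rather than details given in the text, and all function names are hypothetical. The default w = 0.8 is the value the text reports as effective in experiments.

```python
def median3(a, b, c):
    return sorted((a, b, c))[1]


def median_mv(mv_a, mv_b, mv_c):
    """Component-wise median of three MVs (assumed)."""
    return (median3(mv_a[0], mv_b[0], mv_c[0]),
            median3(mv_a[1], mv_b[1], mv_c[1]))


def rate(mv, pred):
    """Bits for the MV difference under an assumed signed Exp-Golomb model."""
    def bits(v):
        k = 2 * abs(v) - (1 if v > 0 else 0)
        return 2 * (k + 1).bit_length() - 1
    return bits(mv[0] - pred[0]) + bits(mv[1] - pred[1])


def e_pmvfast_cost(sad_value, mv, median_pred, top_right, top_right_right,
                   lam, w=0.8):
    """Equation (6). FmedianMV depends on the candidate itself, because the
    candidate becomes the left neighbor of the right block (Equation (5))."""
    f_median = median_mv(mv, top_right, top_right_right)
    return sad_value + lam * (w * rate(mv, median_pred)
                              + (1 - w) * rate(mv, f_median))


c = e_pmvfast_cost(100, (2, 0), (0, 0), (4, 1), (0, 0), lam=4)
print(round(c, 3))
```

In this example the candidate (2, 0) costs 6 bits against medianMV = (0, 0) but only 2 bits against FmedianMV = (2, 0), so the weighted rate term is 0.8 × 6 + 0.2 × 2 = 5.2 bits.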

1. Compute the cost of three motion vector predictors: (i) the median MV predictor ("medianMV"); (ii) the estimated motion vector of the right block ("futureMV"), defined as

$futureMV \equiv median(TopMV, TopRightMV, TopRightRightMV)$

and (iii) an MV predictor from a past block ("pastMV"), which is whichever of the previous co-located MV ("PreMV") and the previous bottom-right MV ("PreBottomRightMV") is farther from medianMV, i.e.

$pastMV \equiv \arg\max_{MV \in \{PreMV, PreBottomRightMV\}} \{ abs(MV - medianMV) \}$

Note that item (ii) may be supplemented or replaced by an estimated motion vector for another neighboring future block (such as the bottom-left, bottom and/or bottom-right block).

Note also that, in item (iii), the previous bottom-right MV may be supplemented or replaced by a previous MV predictor for another neighboring block.

Note that items (ii) and (iii) form independent aspects of the invention.

If any of the above MV predictors is unavailable (e.g. at a frame boundary), that predictor is skipped.

2. If the minimum cost of the motion vector predictors is less than a threshold T1, stop the search and go to step 7. Otherwise, select the motion vector with the minimum cost as currentMV and go to the next step. Note that the costs of the three motion vector predictors may be computed in a predetermined order (e.g. medianMV, then futureMV, then pastMV), and at any point, if the cost of any motion vector predictor is less than a certain threshold, the search may stop and go to step 7.

3. Perform one iteration of the directional small diamond search around currentMV. The concept of the directional small diamond search is explained below.

4. If the minimum cost is less than a threshold T2, stop the search and go to step 7. Otherwise, select the motion vector with the minimum cost as currentMV and go to the next step.

5. If currentMV = medianMV and the current minimum cost is less than a threshold T3, perform a small diamond search and go to step 7.

6. If the video is not interlaced, perform the large diamond search shown in Figure 3(a); otherwise, perform the modified large diamond search shown in Figure 3(b). In each of these steps, the cost function is evaluated at each marked point of the diamond.

7. Select the MV with the minimum cost.
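The control flow of steps 1 to 7 can be sketched as below. This is only an outline under stated assumptions: the three search routines are passed in as hypothetical callables (their point patterns are given in Figures 3 and 4, which are not reproduced here), at least one predictor is assumed to be available, and the thresholds are plain parameters because the text does not give values for T1, T2 and T3.

```python
def e_pmvfast_flow(predictors, median_pred, cost, directional_step,
                   small_diamond, large_diamond, t1, t2, t3):
    """Outline of steps 1 to 7. Each search callable takes (mv, cost) and
    returns the best (mv, cost) it finds. The returned pair is the MV with
    the minimum cost (step 7)."""
    # Step 1: cost of every available predictor (unavailable ones are None).
    scored = [(cost(mv), mv) for mv in predictors if mv is not None]
    best_cost, best_mv = min(scored)
    # Step 2: stop early if a predictor is already good enough.
    if best_cost < t1:
        return best_mv, best_cost
    # Step 3: one iteration of the directional small diamond search.
    best_mv, best_cost = directional_step(best_mv, best_cost)
    # Step 4: second early exit.
    if best_cost < t2:
        return best_mv, best_cost
    # Step 5: small diamond search when the best MV is the median predictor.
    if best_mv == median_pred and best_cost < t3:
        return small_diamond(best_mv, best_cost)
    # Step 6: otherwise the large (or modified large) diamond search.
    return large_diamond(best_mv, best_cost)


# Toy stand-ins: a cost with its minimum at (2, 0), and searches that
# find nothing better than their starting point.
toy_cost = lambda mv: 10 * (abs(mv[0] - 2) + abs(mv[1])) + 1
keep = lambda mv, c: (mv, c)

print(e_pmvfast_flow([(0, 0), None, (2, 0)], (0, 0), toy_cost,
                     keep, keep, keep, t1=5, t2=5, t3=5))
# ((2, 0), 1)
```

In the toy run the predictor (2, 0) has cost 1, which is below T1, so the search stops at step 2 without evaluating any diamond pattern.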

In our experiments, a value of w of about 0.8 was found to be effective.

The steps of the directional small diamond search are now explained. Suppose centerMV is the current search center, and MV1, MV2, MV3 and MV4 are the four surrounding search points, as shown in Figure 4. R(MVi-medianMV) is computed for each MVi. If R(MVi-medianMV) < R(centerMV-medianMV), the SAD and the cost of MVi are computed; otherwise, that MVi is ignored. The MV with the lowest cost is selected as currentMV. Note that the concept of the directional diamond search is believed to be new, and constitutes an independent aspect of the invention which need not be performed in combination with the concept of using futureMV.
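One iteration of the directional small diamond search might look like the sketch below. Figure 4 is not reproduced here, so the four surrounding points are assumed to be the unit offsets left, right, up and down of the center; the `rate` and `cost` callables and the toy functions in the usage example are hypothetical stand-ins.

```python
def directional_small_diamond_step(center, center_cost, median_pred, rate, cost):
    """Evaluate only those of the four neighbors whose rate relative to
    medianMV is strictly lower than that of the center; return the best
    (mv, cost) among the center and the evaluated neighbors."""
    r_center = rate(center, median_pred)
    best_mv, best_cost = center, center_cost
    for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # assumed Figure 4 pattern
        mv = (center[0] + dx, center[1] + dy)
        if rate(mv, median_pred) >= r_center:
            continue  # not cheaper to code than the center: SAD is never computed
        c = cost(mv)
        if c < best_cost:
            best_mv, best_cost = mv, c
    return best_mv, best_cost


# Toy stand-ins: rate grows with distance from the predictor, and the
# distortion has its minimum at (1, 0).
median_pred = (0, 0)
toy_rate = lambda mv, p: abs(mv[0] - p[0]) + abs(mv[1] - p[1])
toy_cost = lambda mv: 10 * (abs(mv[0] - 1) + abs(mv[1])) + toy_rate(mv, median_pred)

print(directional_small_diamond_step((3, 0), toy_cost((3, 0)), median_pred,
                                     toy_rate, toy_cost))
# ((2, 0), 12)
```

In the toy run only the neighbor (2, 0) passes the rate test, so a single cost evaluation is needed instead of four, which is where the saving over an ordinary small diamond search would come from.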

The steps of the large diamond search and the modified large diamond search are the same, except that the search is performed over the full sets of points shown in Figures 3(a) and 3(b) respectively.

我们现在考虑在本发明范围内的实施例的多个可能的变体。We now consider a number of possible variants of embodiments within the scope of the invention.

首先，注意，对于不同的块，在代价函数中的加权系数w可以是不同的。此外，可选择地，对于不同的MV候选所述w可以是不同的。特别地是，w的定义可以取决于诸如MV候选是否接近于medianMV和/或futureMV、或MV候选的X轴分量或Y轴分量是否与所述FmedianMV的X轴分量或Y轴分量相同之类的情况。First, note that the weighting coefficient w in the cost function can be different for different blocks. In addition, optionally, the w may be different for different MV candidates. In particular, the definition of w may depend on factors such as whether the MV candidate is close to medianMV and/or futureMV, or whether the X-axis component or Y-axis component of the MV candidate is the same as that of the FmedianMV Condition.

Furthermore, the cost function need not be limited to the form of equation (6). It may be any function that includes a distortion measurement term (e.g. SAD, sum of squared distortion (SSD), mean absolute distortion (MAD), MSD, etc.) and a term that accounts for the bits required to encode the motion vectors of the current block and of some neighboring blocks (e.g. the right block, the bottom block, the bottom-left block, etc.).
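A cost function of the general kind described here, a distortion term plus a weighted bit-cost term, might be sketched as below. The sketch does not reproduce equation (6). The split of the rate term between medianMV and futureMV through w, the Lagrangian-style multiplier `lam`, and the `rate_proxy` stand-in for R() are assumptions made for illustration only.

```python
from typing import Callable, Sequence, Tuple

MV = Tuple[int, int]
Block = Sequence[Sequence[int]]


def sad(cur: Block, ref: Block) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for rc, rr in zip(cur, ref) for a, b in zip(rc, rr))


def ssd(cur: Block, ref: Block) -> int:
    """Sum of squared differences between two equally sized blocks."""
    return sum((a - b) ** 2 for rc, rr in zip(cur, ref) for a, b in zip(rc, rr))


def rate_proxy(diff: MV) -> int:
    """Hypothetical stand-in for R(): bits grow with the MV difference."""
    return abs(diff[0]) + abs(diff[1])


def mv_cost(
    distortion: float,
    mv: MV,
    median_mv: MV,
    future_mv: MV,
    lam: float = 4.0,   # assumed multiplier, not taken from the source
    w: float = 0.8,     # the source reports w of about 0.8 as effective
    rate: Callable[[MV], int] = rate_proxy,
) -> float:
    """Distortion plus weighted bits for the current and a future block (assumed form)."""
    r_current = rate((mv[0] - median_mv[0], mv[1] - median_mv[1]))
    r_future = rate((mv[0] - future_mv[0], mv[1] - future_mv[1]))
    return distortion + lam * (w * r_current + (1.0 - w) * r_future)
```

Swapping `sad` for `ssd` changes only the distortion term; the rate term is unaffected.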

Furthermore, the definition of futureMV is not limited to the form given in step 1 above. Two possible alternative definitions of futureMV are:

futureMV≡median(leftMV, TopRightMV, TopRightRightMV)

futureMV≡median(medianMV, TopRightMV, TopRightRightMV)
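The two alternative definitions can be illustrated with a short sketch. It assumes that median() is taken component by component over the three vectors, as is conventional for motion vector prediction in H.264; the text here does not state this explicitly. The function names are hypothetical.

```python
from typing import Tuple

MV = Tuple[int, int]


def median3(a: int, b: int, c: int) -> int:
    """Median of three scalars."""
    return sorted((a, b, c))[1]


def median_of_mvs(mv1: MV, mv2: MV, mv3: MV) -> MV:
    """Component-wise median of three motion vectors (assumed convention)."""
    return (median3(mv1[0], mv2[0], mv3[0]), median3(mv1[1], mv2[1], mv3[1]))


def future_mv_alt1(left_mv: MV, top_right_mv: MV, top_right_right_mv: MV) -> MV:
    """futureMV = median(leftMV, TopRightMV, TopRightRightMV)."""
    return median_of_mvs(left_mv, top_right_mv, top_right_right_mv)


def future_mv_alt2(median_mv: MV, top_right_mv: MV, top_right_right_mv: MV) -> MV:
    """futureMV = median(medianMV, TopRightMV, TopRightRightMV)."""
    return median_of_mvs(median_mv, top_right_mv, top_right_right_mv)
```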

Furthermore, in step 1 as described above, pastMV is selected as the MV farthest from medianMV in a list of possible MVs from the previous frame (preMV and preBottomRightMV). However, the list of MVs to be considered may contain more than two possible MVs (e.g. preMV, preLeftMV, preRightMV, preTopMV, preTopLeftMV, preTopRightMV, preBottomMB, preBottomLeftMV, preBottomRightMV, etc.). In addition, MVs from more than one previously encoded frame may be included in the list (e.g. if the current frame is frame N, the list may contain MVs from frames N-1, N-2, N-3, ...). If the current frame is a B frame, the list of previously encoded frames may include future P frames.

Furthermore, in step 1, pastMV is selected as the possible MV farthest from a reference MV (medianMV in step 1). Other reference MVs are possible, including leftMV, TopMV, TopRightMV, or some combination of these. Other methods of selecting from the list of possible MVs are also possible.
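The selection of pastMV as the candidate farthest from a reference MV can be sketched as follows. The distance measure is not specified in this passage; the city-block distance used here, the tie-breaking rule (first candidate wins), and the function name are assumptions for illustration.

```python
from typing import Sequence, Tuple

MV = Tuple[int, int]


def select_past_mv(candidates: Sequence[MV], reference_mv: MV) -> MV:
    """Return the candidate MV farthest from reference_mv (city-block distance assumed)."""
    if not candidates:
        raise ValueError("at least one candidate MV is required")
    return max(
        candidates,
        key=lambda mv: abs(mv[0] - reference_mv[0]) + abs(mv[1] - reference_mv[1]),
    )
```

For example, `select_past_mv([pre_mv, pre_bottom_right_mv], median_mv)` corresponds to the two-candidate list of step 1, and longer lists or a different reference MV can be passed in unchanged.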

In step 2, the costs of the three motion vector predictors are obtained in some predefined order. Possible predefined orders include:

a) medianMV, followed by futureMV, followed by pastMV

b) medianMV, followed by pastMV, followed by futureMV

c) futureMV, followed by medianMV, followed by pastMV

d) futureMV, followed by pastMV, followed by medianMV

e) pastMV, followed by medianMV, followed by futureMV

f) pastMV, followed by futureMV, followed by medianMV

Furthermore, although one iteration of the directional small diamond search is performed in step 3 as described above, more than one iteration may be applied.

Simulation results

We now present simulation results for the embodiment E-PMVFAST. The embodiment was embedded in the H.264 reference software JM9.3 [13] and simulated using various QPs, video sequences, resolutions and search ranges. Tables 1(a-c) and 2(a-c) show some typical simulation results. The PSNR (peak signal-to-noise ratio) change and the BR (bit rate) change are the changes in PSNR and bit rate relative to full search (FS). The simulation results show that the bit rate and PSNR of the proposed E-PMVFAST tend to be similar to those of full search and PMVFAST, but that E-PMVFAST tends to be about 40% faster than PMVFAST over a wide range of video sequences and bit rates. An important feature of E-PMVFAST is that its motion vector field tends to be very smooth, so that the motion vectors can represent the movement of objects more accurately than those of other fast motion estimation algorithms.

In Figures 5(a) and 5(b), the images on the left show (as short lines) the motion vector field obtained by the PMVFAST algorithm, while the images on the right show the motion vectors obtained for the same images by the embodiment. The motion vector field of E-PMVFAST is significantly smoother than that of PMVFAST, especially in the circled regions. A smooth motion field is very useful for classifying the motion content of video in perceptual trans-coding, rate control, multiple block size motion estimation, multiple reference frame motion estimation, etc.

Table 1a - Simulation results for the Foreman CIF sequence

Table 1b - Simulation results for the Coastguard CIF sequence

Table 1c - Simulation results for the Hall CIF sequence

Table 2a - Simulation results for the Foreman QCIF sequence

Table 2b - Simulation results for the Akiyo QCIF sequence

Table 2c - Simulation results for the Coastguard QCIF sequence

We now turn to the second embodiment of the invention, which illustrates the second aspect of the invention.

As noted above, conventional full-integer-pixel precision allows motion vectors to take position values of -2.0, -1.0, 0, 1.0, 2.0 and so on in each direction. In the second embodiment of the invention, the possible position values close to the predictor are selected. For the position value closest to 0, we can use another position value (instead of 1.0), such as 0.85, so that the allowed position values include -2.0, -0.85, 0, 0.85, 2.0 and so on. The advantage is that, statistically, motion vectors tend to be close to 0. By choosing a position closer to 0, we are therefore closer to the true motion vector and can obtain better motion compensation, which can lead to higher compression efficiency. Other position values can be changed similarly. For example, the position value 2.0 could be changed to 1.9, so that the allowed position values would include -1.9, -0.85, 0, 0.85, 1.9 and so on. An advantage of the proposed change is that the same motion vector codes can be used, except that a coded motion vector position of 1.0 is to be interpreted as 0.85 and a coded position of 2.0 as 1.9.
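The reinterpretation of coded integer positions can be illustrated with a small lookup. The values 0.85 and 1.9 are the examples given in the text; the function name, the table name and the symmetric treatment of negative codes (which matches the listed values -1.9 and -0.85) are choices made for this sketch. The bitstream is unchanged and only the meaning of each code differs.

```python
from typing import Dict

# Example remapping from the text: coded 1.0 -> 0.85, coded 2.0 -> 1.9.
EXAMPLE_REMAP: Dict[int, float] = {1: 0.85, 2: 1.9}


def decode_integer_position(coded: int, remap: Dict[int, float] = EXAMPLE_REMAP) -> float:
    """Interpret a coded integer-pel position as a (possibly non-uniform) position value."""
    magnitude = abs(coded)
    # Codes without an entry in the table keep their conventional value.
    value = remap.get(magnitude, float(magnitude))
    return -value if coded < 0 else value
```

Codes beyond those in the table, such as 3, fall back to their conventional integer position.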

Half-pixel precision allows motion vectors to take position values such as 0.0, 0.5, 1.0, 1.5, 2.0 and so on. We propose modifying these position values, especially those close to the predictor. For the position value 0.5, which is very close to 0, we propose using a different value. For example, one possibility is to use 0.4 instead of 0.5, so that the position values would include 0.0, 0.4, 1.0, 1.5, 2.0. Other position values can be modified similarly. For example, the position value 1.0 could be changed to 0.9, so that the new set of position values would include 0.0, 0.4, 0.95, 1.5, 2.0 and so on. Again, this can help to improve compression efficiency. Other position values can likewise be modified to improve compression efficiency. However, changing such positions can lead to significantly higher computational efficiency at both the encoder and the decoder. Usually, most of the gain in compression efficiency comes from changing the position values to be closer to the predictor.

Quarter-pixel precision allows motion vectors to take position values such as 0.00, 0.25, 0.50, 0.75, 1.00 and so on. We can modify the position values, especially those close to the predictor. For example, we can modify them to 0.00, 0.20, 0.47, 0.73, 0.99 and so on.

Note that the proposed method allows us to choose an arbitrary number of position values between each pair of integer position values. For example, between position values 0 and 1, half-pixel precision uses 1 position value {0.5}, quarter-pixel precision uses 3 position values {0.25, 0.50, 0.75}, and 1/8-pixel precision uses 7 position values {0.125, 0.250, 0.375, 0.500, 0.625, 0.750, 0.875}. The proposed method allows us to choose any N position values between 0 and 1. For example, we can choose N=2 values such as 0.3 and 0.6.
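A position grid with an arbitrary set of N fractional values per integer interval can be generated as sketched below. Repeating the same fractional offsets in every interval and restricting the example to non-negative positions are simplifying assumptions; the text allows the values to be chosen freely. The function name is hypothetical.

```python
from typing import List, Sequence


def subpixel_positions(fractions: Sequence[float], max_integer: int) -> List[float]:
    """List position values from 0 to max_integer with the given offsets in each interval."""
    if any(not 0.0 < f < 1.0 for f in fractions):
        raise ValueError("fractional offsets must lie strictly between 0 and 1")
    offsets = [0.0] + sorted(fractions)
    # Assumption: the same N offsets are reused between every pair of integers.
    positions = [i + f for i in range(max_integer) for f in offsets]
    positions.append(float(max_integer))
    return positions
```

Calling `subpixel_positions([0.3, 0.6], 2)` gives the N=2 example from the text extended over two integer intervals, while `subpixel_positions([0.25, 0.5, 0.75], 2)` reproduces the conventional quarter-pixel grid.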

The proposed non-uniform sub-pixel motion estimation and compensation need not be applied to every region of every frame. Instead, some bits can be introduced into the header to indicate whether it is turned on or off for each region (e.g. slice) of the video frame. Apart from that, it can be applied directly to existing standards without any change of syntax, because the same motion vector codes can be used.

The proposed non-uniform sub-pixel motion estimation and compensation was simulated using the H.264 JM82 software, and the results are shown in the above tables, where QP stands for the quantization parameter. The simulation uses the position values (..., -1, -0.75, -0.5, -0.15, 0, 0.15, 0.5, 0.75, 1, ...) in the x and y directions. That is, compared with the standard method using quarter-pixel position values, only the position values at -0.25 and +0.25 are modified. Apart from the use of the new position values, the algorithm is identical to the known H.264 standard algorithm. As shown in the tables, the second embodiment significantly reduces the bit rate while achieving a similar PSNR. No modification of the H.264 syntax is required.
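The position grid used in this simulation can be written as a mapping from a conventional quarter-pixel code to a position value. The sketch below encodes only what the text states: the positions at -0.25 and +0.25 become -0.15 and +0.15, and every other quarter-pixel position is unchanged. The function name and the representation of the code as an integer count of quarter pixels are assumptions.

```python
def simulated_position(quarter_pel_code: int) -> float:
    """Map a motion vector component coded in quarter-pixel units to its position value."""
    if quarter_pel_code == 1:
        return 0.15   # replaces +0.25
    if quarter_pel_code == -1:
        return -0.15  # replaces -0.25
    return quarter_pel_code / 4.0  # all other positions keep the standard value
```

Because the code values are untouched, an unmodified entropy coder can be reused; only the interpolation positions differ.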

Although only a few embodiments of the invention have been described above, many variations are possible within the scope of the invention.

For example, the description of the invention given above is for fixed-size blocks in a P frame with one reference frame. However, the invention can be applied to blocks with multiple sub-block sizes, and the blocks need not be non-overlapping. There can be more than one reference frame, and a reference frame can be any block of the video sequence in the past or in the future relative to the current frame.

For video, a picture element (pixel) can have one or more components, such as a luminance component, red, green and blue (RGB) components, YUV components, YCrCb components, an infrared component, an X-ray component or other components. Each component of a picture element is a symbol that can be represented as a number, which can be a natural number, an integer, a real number or even a complex number. In the case of natural numbers, they can be 12-bit, 8-bit or of any other bit resolution. Although the pixels in a video are 2-dimensional samples on a rectangular sampling grid with a uniform sampling period, the sampling grid need not be rectangular and the sampling period need not be uniform.

Industrial Applicability

Each embodiment of the invention is suitable for fast, low-delay and low-cost software and hardware implementations of MPEG-1, MPEG-2, MPEG-4, H.261, H.263, H.264, AVS, or other related video coding standards or methods that may be modified to include it. Possible applications include digital video broadcasting (terrestrial, satellite, cable), digital cameras, digital camcorders, digital video recorders, set-top boxes, personal digital assistants (PDAs), multimedia-capable cellular phones (2.5G, 3G and above), video conferencing systems, video-on-demand systems, wireless LAN devices, Bluetooth applications, web servers, video streaming servers in low- or high-bandwidth applications, video transcoders (converting from one format to another), and other visual communication systems.

References

The disclosures of the following references are hereby incorporated in their entirety:

[1] Joint Video Team of ITU-T and ISO/IEC JTC 1, "Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC)", document JVT-G050r1, May 2003.

[2] A. M. Tourapis, O. C. Au and M. L. Liou, "Predictive Motion Vector Field Adaptive Search Technique (PMVFAST)", ISO/IEC JTC1/SC29/WG11 MPEG2000, Noordwijkerhout, NL, Mar. 2000.

[3] R. Li, B. Zeng and M. L. Liou, "A new three-step search algorithm for block motion estimation", IEEE Trans. on Circuits and Systems for Video Technology, vol. 4, no. 4, pp. 438-42, Aug. 1994.

[4] Z. L. He and M. L. Liou, "A high performance fast search algorithm for block matching motion estimation", IEEE Trans. on Circuits and Systems for Video Technology, vol. 7, no. 5, pp. 826-8, Oct. 1997.

[5] A. M. Tourapis, O. C. Au and M. L. Liou, "Fast Motion Estimation using Circular Zonal Search", Proc. of SPIE Sym. of Visual Comm. & Image Processing, vol. 2, pp. 1496-1504, Jan. 25-27, 1999.

[6] A. M. Tourapis, O. C. Au, M. L. Liou, G. Shen and I. Ahmad, "Optimizing the MPEG-4 Encoder - Advanced Diamond Zonal Search", in Proc. of 2000 IEEE Inter. Sym. on Circuits and Systems, Geneva, Switzerland, May 2000.

[7] K. K. Ma and P. I. Hosur, "Performance Report of Motion Vector Field Adaptive Search Technique (MVFAST)", in ISO/IEC JTC1/SC29/WG11 MPEG99/m81, Noordwijkerhout, NL, Mar. 2000.

[8] A. M. Tourapis, O. C. Au and M. L. Liou, "Fast Block-Matching Motion Estimation using Predictive Motion Vector Field Adaptive Search Technique (PMVFAST)", in ISO/IEC JTC1/SC29/WG11 MPEG2000/M5866, Noordwijkerhout, NL, Mar. 2000.

[9] Implementation Study Group, "Experimental conditions for evaluating encoder motion estimation algorithms", in ISO/IEC JTC1/SC29/WG11 MPEG99/n3141, Hawaii, USA, Dec. 1999.

[10] "MPEG-4 Optimization Model Version 1.0", in ISO/IEC JTC1/SC29/WG11 MPEG2000/n3324, Noordwijkerhout, NL, Mar. 2000.

[11] T. Koga, K. Iinuma, A. Hirano, Y. Iijima and T. Ishiguro, "Motion compensated interframe coding for video conferencing", Proc. Nat. Telecommun. Conf., New Orleans, LA, pp. G.5.3.1-G.5.3.5, Dec. 1981.

[12] J. R. Jain and A. K. Jain, "Displacement measurement and its application in interframe image coding", IEEE Trans. on Communications, vol. COM-29, pp. 1799-808, Dec. 1981.

[13] JVT reference software JM9.2 for JVT/H.264 FRExt.

Claims

1. A method of selecting a respective similar pixel block in a second image for each of a series of pixel blocks in a first image, the method comprising:

selecting between a plurality of candidate positions in the second image for a current block in the series of pixel blocks of the first image, the plurality of candidate positions comprising a set of candidate positions defined by position values in two axis directions, at least one of the position values being non-uniform for at least one of the two axis directions.

2. The method of claim 1, wherein each of the set of candidate positions is associated with a respective motion vector by a respective position value, the motion vector being different from the predicted motion vector.

3. The method of claim 1, wherein each of the set of candidate positions is associated with a respective motion vector by a respective position value, the motion vector being different from a (0,0) motion vector.

4. The method of claim 1, wherein the position value can be written as D·A_m·m/n, where m = ..., -2, -1, 0, 1, 2, ..., n is 1 or a power of 2, D is a constant, A_m is a value less than 1 for at least one value of m, and 0.75 ≤ A_m < 1 for all values of m.

5. The method of claim 4, wherein each of the set of candidate positions is associated with a respective motion vector by a respective position value, the motion vector being different from the predicted motion vector.

6. The method of claim 4, wherein each of the set of candidate positions is associated with a respective motion vector by a respective position value, the motion vector being different from the (0,0) motion vector.

7. The method of claim 4, wherein A_m has a value of at least 0.85 for all values of m.

8. A system for selecting, for each of a series of pixel blocks in a first image, a respective similar pixel block in a second image, the system comprising a processor configured to select, for a current block in the series of pixel blocks of the first image, between a plurality of candidate positions in the second image, the plurality of candidate positions comprising a set of candidate positions defined by position values in two axis directions, at least one of the position values being non-uniform for at least one of the two axis directions.