Background Art
Traditional video coding standards, such as H.261, H.263, H.26L, and H.264 formulated by the ITU, and MPEG-1, MPEG-2, and MPEG-4 established by the MPEG group of ISO, are all based on the hybrid coding framework. A hybrid coding framework combines prediction, transform, and entropy coding, and has the following main features:
1) prediction removes redundancy in the time domain;
2) transform removes redundancy in the spatial domain;
3) entropy coding removes statistical redundancy.
All of the above video coding standards define intra-coded frames (I frames) and inter-coded frames (P frames), and the two are coded differently. An I frame is coded as follows: the original image data undergoes a two-dimensional transform (a discrete cosine transform or an integer transform); the transform coefficients are then quantized in the transform domain; finally, entropy coding such as Huffman coding or arithmetic coding is applied. A P frame is coded as follows: motion estimation yields a motion vector; inter prediction based on motion compensation is then performed; the residual block produced by inter prediction undergoes a two-dimensional transform; the transform-domain coefficients are quantized; and finally entropy coding is applied.
Because video sequences are strongly correlated in the time domain, inter prediction is the key factor in improving coding gain, so motion estimation and motion compensation are very important parts of a video coding scheme.
Motion estimation consists of two parts: integer-pixel motion estimation and sub-pixel motion estimation. Integer-pixel motion estimation searches a $(2W_x+1)\times(2W_y+1)$ window in the reference frame, around the pixel that corresponds to the current pixel of the current frame, for the matching block with the minimum cost function, where $W_x$ and $W_y$ are the search width and height parameters. Sub-pixel motion estimation then searches the sub-pixel positions around the best integer pixel (the one given by the optimal integer-pixel motion vector) to obtain the best sub-pixel and hence the optimal sub-pixel motion vector. For example, the commonly used half-pixel search examines the 8 half-pixel positions around the best integer pixel. Motion compensation with sub-pixel accuracy can greatly improve coding efficiency: H.263, which uses half-pixel-accuracy motion compensation, improves the signal-to-noise ratio by more than about 1 dB at the same bit rate compared with H.261, which uses only integer-pixel accuracy. Higher sub-pixel accuracy, such as 1/4 or 1/8 pixel, yields higher coding gain, but the complexity of the corresponding filter design and coding also increases. The MPEG-4 standard adopts motion compensation with 1/4-pixel accuracy.
H.264, the video coding standard currently being formulated, has absorbed the results of many years of video coding research and surpasses earlier standards in both coding efficiency and functionality. Its basic framework is still the hybrid coding framework, and its motion estimation accuracy can reach 1/8 pixel. Fig. 1 is a schematic diagram of the sub-pixel positions and their motion search range. In the figure, capital letters (C, $H_i$, $V_i$, $D_i$) denote integer-pixel positions, Roman numerals (I, II, III, ...) denote half-pixel positions, lowercase letters (a, b, c, ...) denote 1/4-pixel positions, and Arabic numerals (1, 2, 3, ...) denote 1/8-pixel positions. During video coding, motion estimation for each macroblock basically proceeds in the following steps:
1. First perform an integer-pixel motion search to obtain the motion vector with integer-pixel accuracy, i.e. the best integer pixel C corresponding to the integer-pixel motion vector;
2. Among the 8 half-pixel positions I~VIII around the best integer pixel C, find the best half pixel V;
3. Among the 8 1/4-pixel positions a~h around the best half pixel V, find the best 1/4 pixel h;
4. Among the 8 1/8-pixel positions 1~8 around the best 1/4 pixel h, find the best 1/8 pixel 1.
Finding the best matching block in a motion search requires a matching criterion. The cost function generally used is the sum of absolute differences (SAD), defined as:
Formula (1)
Here the matching block is assumed to be of size N × N; f(i, j, t) is the luminance value of the pixel at coordinate (i, j) of the frame at time t; and (x, y) are the two components of the motion vector pointing from the position of the current block in the current frame to the point P in the reference frame.
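As a minimal sketch of the SAD cost described above: formula (1) itself is not reproduced in this text, so the function below assumes the usual form, summing the absolute luminance differences between the N × N current block and the reference block displaced by (x, y). The names `sad`, `cur`, and `ref`, and the row-major list-of-lists frame layout, are illustrative assumptions.

```python
def sad(cur, ref, top, left, n, mv):
    """Sum of absolute differences between the n x n block of `cur` at
    (top, left) and the block of `ref` displaced by mv = (x, y).

    Assumed conventions: frames are lists of rows, x is the horizontal
    displacement and y the vertical displacement, in integer pixels.
    """
    x, y = mv
    total = 0
    for j in range(n):        # rows of the block
        for i in range(n):    # columns of the block
            total += abs(cur[top + j][left + i] - ref[top + j + y][left + i + x])
    return total
```

A motion search would evaluate this function at every candidate (x, y) and keep the displacement with the smallest value.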
It follows that, to obtain a motion vector with 1/8-pixel accuracy, the sub-pixel search alone requires evaluating formula (1) at 24 points, plus 24 additional interpolation computations.
Within motion estimation as a whole, the integer-pixel motion vector search accounts for a very large share of the computation. For example, with a search range of 32, a full integer-pixel search evaluates formula (1) at 4225 points. Earlier work on fast motion estimation therefore concentrated on integer-pixel motion estimation and neglected sub-pixel motion estimation. As research on fast integer-pixel motion estimation has progressed, however, its computation has kept shrinking: current results show that the number of integer-pixel search points can be brought below 10 while fairly good coding efficiency is maintained at various bit rates. As a result, sub-pixel motion estimation takes up a growing share of the total motion estimation computation. Especially when motion vectors of higher pixel accuracy are used, the computation of sub-pixel motion estimation increasingly becomes the bottleneck that limits further reduction, so research on fast sub-pixel motion estimation methods is becoming ever more important.
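The point counts quoted above can be checked with a few lines of arithmetic. This is only a sketch; the function names are illustrative, and the sub-pixel count assumes the three-level search (1/2, 1/4, 1/8 pixel) with 8 neighbours per level described earlier.

```python
def full_search_points(w_x, w_y):
    """Number of positions in a (2*W_x + 1) x (2*W_y + 1) search window."""
    return (2 * w_x + 1) * (2 * w_y + 1)

def sub_pel_points(levels=3, neighbours=8):
    """Positions examined by an exhaustive sub-pixel search: 8 per precision level."""
    return levels * neighbours

print(full_search_points(32, 32))  # 4225
print(sub_pel_points())            # 24
```

Once a fast integer-pixel search needs fewer than 10 points, the 24 sub-pixel points dominate the total.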
Summary of the Invention
The object of the invention is to overcome the shortcomings of the prior art by proposing a fast sub-pixel motion search method based on prediction-direction correction/statistical pre-judgment. It comprises a fast motion estimation method based on prediction and direction correction, and a fast motion estimation method based on statistical pre-judgment. While preserving the rate-distortion performance of the prior art, the method greatly reduces the computational complexity of sub-pixel motion estimation in both software and hardware. The method is not limited to the H.264 international standard and can be extended to video coding under other international and non-international standards. The method is also extensible: it can be combined with many integer-pixel motion estimation algorithms, and it allows the balance between computational complexity and prediction accuracy to be adjusted.
The prediction-based direction-correction fast motion estimation method in the sub-pixel motion estimation method proposed by the invention comprises the following three steps:
1/2-pixel motion vector prediction (Prediction), direction correction (Directional Refinement), and the cut-off criterion (Half-Stop). They are described in turn below:
1. 1/2-pixel motion vector prediction (Prediction):
Two assumptions are made. The first is that the cost function around the best integer pixel is a smooth convex function. The second is that the cost functions of the four integer pixels around the best integer pixel are known (in Fig. 1, the cost functions at positions V1, V2, H1, and H2), denoted SAD(V1), SAD(V2), SAD(H1), and SAD(H2) respectively; the cost function of the center point is SAD(C). The second assumption rests on the fact that many current fast integer-pixel motion estimation methods are based on a diamond search pattern. The likely direction of the next-step half-pixel motion vector can therefore be predicted from the cost functions of the four adjacent integer pixels.
The specific implementation steps are as follows:
1) Among SAD(V1), SAD(V2), SAD(H1), and SAD(H2), select the minimum value $SAD_{min}$ to obtain the point with the smallest cost function (hereinafter the smallest point), and select the second-smallest value $SAD_{sub}$ to obtain the point with the second-smallest cost function (hereinafter the second-smallest point). The corresponding pixels are $P_{min}$ and $P_{sub}$ respectively:
Formula (2)
2) If $P_{min}$ and $P_{sub}$ lie on the same horizontal or vertical line, i.e. $P_{min}(x)=P_{sub}(x)$ or $P_{min}(y)=P_{sub}(y)$, select the two 1/2 pixels on that line, together with the integer pixel C, as the candidates for the next-step 1/2-pixel motion estimation (for example, in Fig. 1, if V1 and V2 are the smallest point and the second-smallest point respectively, then II and VII are selected as the half-pixel estimation points);
3) If $P_{min}$ and $P_{sub}$ do not lie on the same horizontal or vertical line, i.e. $P_{min}(x)\ne P_{sub}(x)$ and $P_{min}(y)\ne P_{sub}(y)$, then the three 1/2 pixels lying on and between the two line segments that join these two points to the center point C, together with the integer pixel C, form the candidates for the 1/2-pixel motion estimation (for example, in Fig. 1, if V1 and H1 are the smallest point and the second-smallest point respectively, then I, II, and IV are selected as the half-pixel estimation points);
4) Among the candidate points, the point with the smallest cost function is the 1/2-pixel smallest point, and its motion vector is the 1/2-pixel motion vector. The point with the second-smallest cost function is the 1/2-pixel second-smallest point.
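The candidate selection in steps 1) to 3) can be sketched as follows. Fig. 1 is not reproduced here, so the coordinate layout is an assumption: positions are offsets from C in half-pixel units with y growing downward, V1/V2 are the upper/lower integer neighbours, and H1/H2 are the left/right ones. The function and dictionary names are illustrative only.

```python
# Integer-pixel neighbours of C, as (x, y) offsets in half-pixel units (assumed layout).
NEIGHBOURS = {"V1": (0, -2), "V2": (0, 2), "H1": (-2, 0), "H2": (2, 0)}

def half_pel_candidates(sad):
    """Return (p_min, p_sub, candidates) given the SADs of V1, V2, H1, H2.

    `candidates` lists half-pixel offsets from C and always ends with C itself.
    """
    ranked = sorted(NEIGHBOURS, key=lambda name: sad[name])
    p_min, p_sub = ranked[0], ranked[1]
    (x1, y1), (x2, y2) = NEIGHBOURS[p_min], NEIGHBOURS[p_sub]
    if x1 == x2 or y1 == y2:
        # Same vertical or horizontal line through C: the two half pixels on it.
        cands = [(x1 // 2, y1 // 2), (x2 // 2, y2 // 2)]
    else:
        # Adjacent neighbours: the half pixel toward each, plus the diagonal between them.
        cands = [(x1 // 2, y1 // 2), (x2 // 2, y2 // 2),
                 ((x1 + x2) // 2, (y1 + y2) // 2)]
    return p_min, p_sub, cands + [(0, 0)]
```

Under these assumptions, only 2 or 3 half-pixel positions need a full cost evaluation instead of 8.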
2. Direction correction (Directional Refinement)
The direction-correction technique was published by the inventors at the IEEE ISCAS international conference in 2002 and is used for 1/4- and 1/8-pixel motion vectors. Combined with the 1/2-pixel motion vector prediction above and the cut-off criterion below, it forms a complete sub-pixel motion vector method.
Building on the motion estimation result of the previous precision level, the direction of the motion estimate is corrected further so as to ensure the accuracy of the high-precision motion vector. The specific implementation steps are shown in Fig. 2:
1) Let $P_{min}$ and $P_{sub}$ be the positions of the smallest point and the second-smallest point found by motion estimation at the previous precision level. Their relative position follows one of the two patterns shown in Fig. 2: (a) $P_{min}$ and $P_{sub}$ lie on the same horizontal or vertical line, i.e. $P_{min}(x)=P_{sub}(x)$ or $P_{min}(y)=P_{sub}(y)$; (b) $P_{min}$ and $P_{sub}$ do not lie on the same horizontal or vertical line, i.e. $P_{min}(x)\ne P_{sub}(x)$ and $P_{min}(y)\ne P_{sub}(y)$;
2) Under each pattern, take the three pixels of the current precision lying between $P_{min}$ and $P_{sub}$ as candidate points. Fig. 2 gives an example for each pattern: the three current-precision pixels marked with bold Arabic numerals in the figure are chosen as candidate pixels.
3) From the set formed by the selected candidate pixels and the best matching point of the previous-level search, select the point with the smallest cost function as the smallest point of this precision level; it gives the motion vector at this precision. The point with the second-smallest cost function is the second-smallest point of this precision level.
4) The correction of the motion vector direction at both 1/4-pixel and 1/8-pixel accuracy uses the direction-correction method of the three steps above.
3. Cut-off (Half-Stop) criterion
In general inter coding, the residual block after motion compensation undergoes a two-dimensional transform, the transform-domain coefficients are quantized, and entropy coding is applied last. When the residual is below a certain value, all of its transform coefficients become zero after quantization and need not be coded. Therefore, once the cost function found during motion estimation falls below a certain level, there is no need to keep searching for a smaller value, because doing so cannot improve coding efficiency further. The fast sub-pixel method proposed in the invention therefore adopts the following search cut-off criterion:
When the cost function of the estimated point satisfies SAD < T, the motion estimation process stops. T is a threshold; it can be a fixed value (obtained from experimental results and set as an empirical tolerance for the cost error), or it can be estimated from the integer transform formula and the quantization scheme in H.264.
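The interaction between level-by-level refinement and the Half-Stop criterion can be sketched as below. This is a simplified illustration, not the method's exact candidate selection: the hypothetical helper `ring` examines all 8 neighbours at each level instead of the three direction-corrected candidates, and positions are expressed in 1/8-pixel units. All names are assumptions.

```python
def search_with_half_stop(start, start_sad, levels, cost, threshold):
    """Refine `start` level by level, stopping once the cost is below `threshold`.

    levels: list of callables; each maps the current best point to the
            candidate points of the next precision level (hypothetical interface).
    cost:   callable returning the SAD of a point.
    """
    best, best_sad = start, start_sad
    for candidates_of in levels:
        if best_sad < threshold:
            break  # Half-Stop: a smaller cost would not improve coding efficiency
        for p in candidates_of(best):
            s = cost(p)
            if s < best_sad:
                best, best_sad = p, s
    return best, best_sad

def ring(step):
    """All 8 neighbours at distance `step` (illustration only, not Fig. 2)."""
    return lambda p: [(p[0] + dx * step, p[1] + dy * step)
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                      if (dx, dy) != (0, 0)]
```

With `levels = [ring(4), ring(2), ring(1)]`, the three entries stand for the 1/2-, 1/4-, and 1/8-pixel stages; if the integer-pixel cost is already below the threshold, no sub-pixel position is evaluated at all.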
The operating principle of the prediction-based direction-correction fast motion estimation method in the sub-pixel motion estimation method proposed by the invention is as follows:
The prediction and direction-correction method relies on the assumption that the cost function is reasonably smooth around the optimal motion vector. It predicts the direction of the motion vector at the next precision level from the cost function values at adjacent positions and uses the search cut-off criterion to avoid unnecessary computation. The computation of sub-pixel motion estimation is reduced to about 1/3 of the original while the original coding performance is maintained. This also reduces computation in hardware implementations, where the complexity of the interpolation required for sub-pixel motion estimation is reduced in about the same proportion.
The fast motion estimation method based on statistical pre-judgment in the sub-pixel motion estimation method proposed by the invention performs sub-pixel motion estimation from 1/2-pixel to 1/4- and 1/8-pixel accuracy with a single, consistent prediction mode. It can be summarized in three steps: one-dimensional matching estimation prediction, two-dimensional matching estimation, and the cut-off criterion. They are described in turn below.
1. One-dimensional matching estimation prediction:
This is divided into three steps:
(1) Calculate the VSum(P) value at each position for the one-dimensional matching estimation, i.e. obtain it by median filtering from the VSum(P) values of the one-dimensional matching estimation at the previous search precision level (for 1/2-pixel accuracy the previous level is integer-pixel accuracy; for 1/4- and 1/8-pixel accuracy it is 1/2- and 1/4-pixel accuracy respectively);
(2) According to the formula
Formula (4)
perform the one-dimensional matching estimation prediction for all search points;
(3) Using the triangle-inequality decision rule, select the set Π of points that need the two-dimensional matching estimation:
$\Pi = \{P_i \ \text{s.t.}\ VSAD(P_i) \le \alpha \cdot SAD(P_{min})\}$    Formula (5)
2. Two-dimensional matching estimation:
Perform the two-dimensional matching estimation within the set Π obtained from the one-dimensional matching estimation prediction, and select the best matching point $P_{min}$, which satisfies:
Formula (6)
3. Cut-off criterion:
In general inter coding, the residual block after motion compensation undergoes a two-dimensional transform, the transform-domain coefficients are quantized, and entropy coding is applied last. When the residual is below a certain value, all of its transform coefficients become zero after quantization and need not be coded. Therefore, once the cost function found during motion estimation falls below a certain level, there is no need to keep searching for a smaller value, because doing so cannot improve coding efficiency further. The fast sub-pixel method proposed herein therefore adopts the following search cut-off criterion:
When the cost function of the estimated point satisfies SAD < T, the motion estimation process stops. T is a threshold; it can be a fixed value, or it can be estimated from the integer transform formula and the quantization scheme in H.264.
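The three steps above (one-dimensional prediction, two-dimensional matching, cut-off) can be sketched for a single precision level as follows. The interface is hypothetical: `vsad` and `sad` are callables supplied by the caller, `sad_prev_min` is the minimum matching error from the previous level, and the return value of `None` for an empty set Π (meaning the caller keeps the previous best point) is an assumption of this sketch.

```python
def prejudge_and_match(candidates, vsad, sad, sad_prev_min, alpha=1.0, threshold=0):
    """One precision level of the statistical pre-judgment search.

    Returns (best_point, best_sad, n_2d), where n_2d is the number of
    two-dimensional SAD evaluations actually performed.
    """
    # Step 1: one-dimensional prediction, keeping points with VSAD <= alpha * SAD(P_min).
    survivors = [p for p in candidates if vsad(p) <= alpha * sad_prev_min]
    best_point, best_sad, n_2d = None, None, 0
    # Step 2: two-dimensional matching on the surviving points only.
    for p in survivors:
        s = sad(p)
        n_2d += 1
        if best_sad is None or s < best_sad:
            best_point, best_sad = p, s
        # Step 3: cut-off criterion, stop as soon as SAD < T.
        if best_sad < threshold:
            break
    return best_point, best_sad, n_2d
```

Raising `alpha` lets more points through to the two-dimensional stage (more computation, fewer missed matches); lowering it does the opposite.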
The operating principle of the fast motion estimation method based on statistical pre-judgment in the sub-pixel motion estimation method proposed by the invention is as follows:
Triangle-inequality decision rule (known art):
The matching error function commonly used in motion estimation is the sum of absolute differences, as follows:
Formula (7)
By first computing the sum of each column of the current block and of the reference prediction block, and then performing a one-dimensional matching error computation on these sums, one obtains:
Formula (8)
By the triangle inequality:
$VSAD(P) \le SAD(P)$    Formula (9)
Motion estimation selects, from the set Ω of all points to be checked, the point $P_{min}$ with the smallest matching error as the best matching point:
Formula (10)
For a given position $P_j$, if $VSAD(P_j) > SAD(P_{min})$ holds, then necessarily:
$SAD(P_j) > SAD(P_{min})$    Formula (11)
Thus the one-dimensional matching can rule out points that certainly cannot be the best match; the two-dimensional matching is then performed only on the remaining points that might be the best match, and the best matching point is selected from them.
Clearly, the fewer points that remain for two-dimensional matching after the one-dimensional matching, the better; this number depends on the statistical distribution of the specific data and on the choice of $SAD(P_{min})$.
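The bound VSAD(P) ≤ SAD(P) can be checked numerically. Formulas (7) and (8) are not reproduced in this text, so the sketch below assumes that VSum is the per-column sum of a block and that VSAD is the sum of absolute differences between the two column-sum vectors; the function names are illustrative.

```python
def vsum(block):
    """Column sums of a 2-D block given as a list of rows."""
    return [sum(col) for col in zip(*block)]

def sad2d(a, b):
    """Two-dimensional matching error: sum of absolute pixel differences."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def vsad(a, b):
    """One-dimensional matching error on the column sums."""
    return sum(abs(s - t) for s, t in zip(vsum(a), vsum(b)))

cur = [[10, 20], [30, 40]]
ref = [[12, 18], [27, 45]]
print(vsad(cur, ref), sad2d(cur, ref))  # 4 12
```

Here VSAD = 4 and SAD = 12. If the best matching error found so far were 3, the test VSAD > 3 would already rule this position out without computing its SAD.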
For sub-pixel motion estimation, the method proposed here has the following two technical features:
(1) In the one-dimensional motion estimation at each level, $SAD(P_{min})$ is replaced by the minimum matching error obtained by motion estimation at the previous level: half-pixel motion estimation uses the result of integer-pixel motion estimation, 1/4-pixel motion estimation uses the result of half-pixel motion estimation, and 1/8-pixel motion estimation uses the result of 1/4-pixel motion estimation.
In the one-dimensional motion estimation at each sub-pixel level, the positions satisfying
$VSAD(P) < \alpha \cdot SAD(P_{min})$    Formula (12)
are selected for two-dimensional motion estimation. The parameter α can be used to adjust the balance between computational complexity and prediction accuracy.
(2) The VSum(P) values in the one-dimensional motion estimation at each level are all obtained by interpolating the VSum(P) values used at the previous level, which saves a great deal of computation.
As formula (4) shows, the so-called one-dimensional matching converts the two-dimensional data block into a one-dimensional data block by summing each column of data, and then performs the same matching error computation.
The two-dimensional to one-dimensional conversion in formula (4) can be described as follows:
Formula (13)
Motion estimation at each precision level (1/2-, 1/4-, and 1/8-pixel accuracy) requires the VSum values of the one-dimensional matching data block. Two principles are used here to compute them quickly:
1. For the values in the one-dimensional matching data block at integer-pixel accuracy, the VSum(i, j, t) values of vertically adjacent positions overlap heavily, so a general fast algorithm can compute VSum(i, j, t) using the formula:
Formula (14)
When combined with certain specific fast integer-pixel motion estimation algorithms, this part of the computation can even be omitted.
2. For 1/2-, 1/4-, and 1/8-pixel accuracy, the VSum(P) values in the one-dimensional matching data block are obtained by median filtering from the VSum(P) values calculated at the previous level.
As shown in Fig. 14, suppose the points marked with circles are points estimated at the previous level, and C is the best matching point obtained by motion estimation at the previous level. The other points are the candidate points to be estimated at the current resolution level: triangles denote points in the horizontal or vertical direction, and diamonds denote points at diagonal positions. If VSum(P) denotes the value after the one-dimensional conversion at position P, the one-dimensionally converted values of the pixels currently being estimated can be obtained by interpolation from the values of the previous level:
VSum(1)=(VSum(C)+VSum(V1))>>1
VSum(2)=(VSum(C)+VSum(V2))>>1
VSum(5)=(VSum(C)+VSum(H1))>>1
VSum(6)=(VSum(C)+VSum(H2))>>1
VSum(3)=(VSum(C)+VSum(V1)+VSum(D1)+VSum(H1))>>2
VSum(4)=(VSum(C)+VSum(V1)+VSum(D2)+VSum(H2))>>2
VSum(7)=(VSum(C)+VSum(V2)+VSum(D3)+VSum(H1))>>2
VSum(8)=(VSum(C)+VSum(V2)+VSum(D4)+VSum(H2))>>2
Formula (15)
The interpolation filter used here is a median filter. Experimental results show that using the median filter for prediction in the one-dimensional motion estimation gives results close to those obtained with the filter defined in H.264, while complexity is significantly reduced.
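The two fast-computation principles above can be sketched together. Formula (14) is not reproduced in this text, so `slide_down` assumes the usual sliding-window recurrence (subtract the row leaving the block, add the row entering it). `interpolate_vsum` follows formula (15) as printed, which computes shift-based averages of two or four neighbouring VSum vectors, applied element by element. All function names and the dictionary interface are illustrative assumptions.

```python
def column_sums(frame, top, left, n):
    """VSum of the n x n block at (top, left): one sum per column."""
    return [sum(frame[top + r][left + c] for r in range(n)) for c in range(n)]

def slide_down(frame, vsum, top, left, n):
    """VSum of the block one row lower, reusing `vsum` of the block at `top`."""
    return [vsum[c] - frame[top][left + c] + frame[top + n][left + c]
            for c in range(n)]

def interpolate_vsum(v):
    """Apply formula (15) element-wise.

    v maps 'C', 'V1', 'V2', 'H1', 'H2', 'D1'..'D4' to equal-length integer lists
    (the VSum vectors of the previous precision level).
    Returns a dict mapping positions 1..8 to their interpolated VSum vectors.
    """
    def avg2(a, b):
        return [(x + y) >> 1 for x, y in zip(v[a], v[b])]

    def avg4(a, b, c, d):
        return [(w + x + y + z) >> 2 for w, x, y, z in zip(v[a], v[b], v[c], v[d])]

    return {
        1: avg2("C", "V1"), 2: avg2("C", "V2"),
        5: avg2("C", "H1"), 6: avg2("C", "H2"),
        3: avg4("C", "V1", "D1", "H1"), 4: avg4("C", "V1", "D2", "H2"),
        7: avg4("C", "V2", "D3", "H1"), 8: avg4("C", "V2", "D4", "H2"),
    }
```

Both helpers avoid touching pixel data again: the first needs only two rows per step, and the second works purely on VSum vectors already computed at the previous level.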
Features and effects of the invention:
The invention proposes a fast sub-pixel motion search method based on prediction-direction correction/statistical pre-judgment, comprising a fast motion estimation method based on prediction and direction correction and a fast motion estimation method based on statistical pre-judgment. While preserving the rate-distortion performance of the prior art, the method greatly reduces the computational complexity of sub-pixel motion estimation in both software and hardware. The method is not limited to the H.264 international standard and can be extended to video coding under other international and non-international standards. The method is also extensible: it can be combined with many integer-pixel motion estimation algorithms, and it allows the balance between computational complexity and prediction accuracy to be adjusted.
Embodiments
The specific embodiment of the prediction-based direction-correction fast motion estimation method in the sub-pixel motion estimation method proposed by the invention is described as follows:
1. 1/2-pixel motion vector prediction (Prediction):
The specific implementation steps are as follows (for the pixel positions, refer to Fig. 1):
1) Among SAD(V1), SAD(V2), SAD(H1), and SAD(H2), select the minimum value $SAD_{min}$ and the second-smallest value $SAD_{sub}$;
2) According to the positional relationship between the minimum and the second-smallest value, i.e. whether they lie on one horizontal/vertical line or on a diagonal, select the candidate pixels for the 1/2-pixel motion estimation;
3) Among the above candidate points, select the motion vector corresponding to the smallest cost function as the 1/2-pixel motion vector.
2. Direction correction (Directional Refinement)
1) From the smallest point and the second-smallest point found by the 1/2-pixel search, determine the candidate points for the 1/4-pixel motion estimation, and select the point with the smallest cost function as the best 1/4 pixel;
2) From the smallest point and the second-smallest point found by the 1/4-pixel search, determine the candidate points for the 1/8-pixel motion estimation, and select the point with the smallest cost function as the best 1/8 pixel;
3. Cut-off (Half-Stop) criterion
When the cost function of the estimated point satisfies SAD < T, the motion estimation process stops. T is a threshold; in the present experiment a fixed value of 500 is used for 16×16 macroblocks. For example, when the matching error function value detected at some point is 400, the search stops and that point is taken as the best matching point.
This embodiment is implemented on the H.264 test platform JM2.0. Four representative international standard sequences in CIF format and two in QCIF format are used as test sequences. The four CIF sequences are Foreman, characterized by camera shake; Stefan, characterized by vigorous motion; ContainShip; and Carphone, characterized by moderate motion. The QCIF sequences are Suzi, a head-and-shoulders sequence, and Salesman, characterized by object rotation. The parameters in this embodiment are set as follows:
1. Number of reference frames: 1
2. Slice mode: not used
3. Entropy coding mode: CABAC
4. Integer-pixel motion estimation search range: 32
5. Rate-distortion optimization: used
6. Hadamard transform: not used
7. Inter motion estimation block mode: 16 × 16 mode only
This embodiment shows that the amount of computation falls to about 17.4%~34.7% of that of the original method, and that a hardware implementation can reduce interpolation by the same proportion; interpolation, especially for high-precision pixels, is very costly. The method of the invention significantly improves computation speed and preserves the rate-distortion performance of the original encoder well while reducing computation.
The fast motion estimation method based on statistical pre-judgment in the sub-pixel motion estimation method proposed by the invention performs sub-pixel motion estimation from 1/2-pixel to 1/4- and 1/8-pixel accuracy with a single, consistent prediction mode.
The steps of the specific embodiment are as follows:
1. Use the one-dimensional matching estimation to predict which 1/2-pixel points need the two-dimensional matching estimation:
This is divided into three steps:
a) As shown in Fig. 3, C is the best matching point obtained by integer-pixel motion estimation. The two-dimensional to one-dimensional conversion of the data blocks at C and at the 8 adjacent integer-pixel positions around it can be computed according to formula (4); this costs about as much as computing one SAD value. The one-dimensional data blocks of the current sub-pixel search points are then obtained by the median filtering of formula (15);
b) Perform the one-dimensional matching estimation prediction for all search points according to formula (4);
c) Using the triangle-inequality decision rule, select the set Π of points that need the two-dimensional matching estimation:
$\Pi = \{P_i \ \text{s.t.}\ VSAD(P_i) \le \alpha \cdot SAD(P_{min})\}$
2. Two-dimensional matching estimation:
Perform the two-dimensional matching estimation within the set Π obtained from the one-dimensional matching estimation prediction, and select the best matching point $P_{min}$, which satisfies formula (6);
3. Perform 1/4-pixel motion estimation around the best 1/2-pixel search point $P_{min}$. The whole process is the same as for 1/2-pixel accuracy, except that the VSum values of the one-dimensional data blocks from the 1/2-pixel motion estimation are computed by formula (4), and the VSum values of the one-dimensional data blocks at the current 1/4-pixel search points are then obtained by median filtering;
4. Perform 1/8-pixel motion estimation around the best 1/4-pixel search point $P_{min}$. The whole process is the same as for 1/4-pixel accuracy, except that the VSum values of the one-dimensional data blocks from the 1/4-pixel motion estimation are computed by formula (4), and the VSum values of the one-dimensional data blocks at the current 1/8-pixel search points are then obtained by median filtering;
5. Cut-off criterion:
When the cost function of an estimated point satisfies SAD < T, the motion estimation process stops. T is a threshold; in the present experiment a fixed value of 500 is used for 16 × 16 macroblocks.
6. In an actual implementation, the value of α can be adjusted dynamically according to the search precision and the statistical characteristics of the image sequence itself.
The conditions of this embodiment are the same as those of the previous embodiment. This embodiment shows that the reduction in computation relative to the original algorithm can be adjusted between about 5% and 80%, with interpolation reduced by the same proportion; interpolation, especially for high-precision pixels, is very costly.
The method of the invention allows the balance between computational complexity and prediction accuracy to be adjusted.