CN100586185C

CN100586185C - A Mode Selection Method for H.264 Video Reduced Resolution Transcoding

Info

Publication number: CN100586185C
Application number: CN 200810103682
Authority: CN
Inventors: 戴琼海; 陈芝鑫; 栗强
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2008-04-10
Filing date: 2008-04-10
Publication date: 2010-01-27
Anticipated expiration: 2028-04-10
Also published as: CN101272496A

Abstract

The invention relates to a mode selection method for H.264 reduced-resolution transcoding and belongs to the technical field of computer multimedia. The method comprises: decoding an H.264 coded bitstream and recording the size of each block and the value of its corresponding motion vector; mapping the motion mode of each original block to the new block at a 2:1 ratio, and passing the original motion vector, scaled by 1/2, to the corresponding block in the new video frame; determining the motion mode and estimated motion vector of the new block from the motion modes and motion vectors of the original blocks, thereby determining the candidate modes; selecting the best mode with an H.264 rate-distortion optimization algorithm adapted for transcoding; completing the H.264 encoding of the block; and continuing to decode the next frame and transcode it in the same way. The invention makes full use of the information available in the input H.264 coded data and greatly reduces the computation required for mode selection in H.264-to-H.264 reduced-resolution transcoding.

Description

A mode selection method for H.264 reduced-resolution video transcoding

Technical field

The invention belongs to the technical field of computer multimedia, and in particular relates to video transcoding technology.

Background art

Digital video is video information recorded in digital form. Raw digital video is very large, which makes transmission and storage inconvenient, so in practice it usually has to be compressed by encoding.

Video coding uses motion estimation to compress data. Video data is a sequence of images captured continuously at a fixed time interval; because the capture rate is high, adjacent images are strongly correlated in content. Motion estimation exploits this correlation between images to remove redundant information and thereby compress the data. The general procedure is as follows. First, the current frame (the image about to be encoded) is divided into blocks of fixed size (16 × 16 pixels). Then, for each block, the reference frame (the reference image chosen for encoding) is searched and compared to find the "matching block" whose content is most similar, which yields the motion vector of the current block (the relative displacement between the current block and the matching block). Next, the two blocks are subtracted to obtain a residual, and the residual matrix is transformed with the DCT. Finally, the motion vectors and DCT coefficients are entropy coded to produce the compressed data.
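The block-matching search described above can be sketched as follows. This is a minimal illustration only: the sum of absolute differences (SAD) as the similarity measure, the exhaustive search window, and the names `sad` and `full_search` are assumptions made for the example and are not taken from the patent.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in
    `cur` and the block displaced by (dx, dy) in `ref`."""
    return sum(
        abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
        for r in range(n)
        for c in range(n)
    )


def full_search(cur, ref, bx, by, n=16, search=4):
    """Exhaustively test every displacement within +/- `search` pixels and
    return (motion_vector, cost) of the best-matching reference block."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x < 0 or y < 0 or x + n > width or y + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

Frames are plain lists of pixel rows here; a real encoder would go on to transform and entropy code the residual of the best match, which this sketch omits.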

However, terminal devices differ in processing and display capability, the network conditions under which video is transmitted vary widely, and organizations and companies have defined a series of different video coding standards for different purposes and applications. As a result, video data exists in many formats and with many parameter settings, which makes media resources difficult to distribute and share. Video transcoding technology was proposed to solve these problems and to convert between video data more effectively.

Video transcoding converts one compressed video stream into a different compressed video stream, changing parameters such as bit rate and resolution, or changing the syntax format altogether. Transcoding methods fall into two main classes: pixel-domain transcoding and transform-domain transcoding. Transform-domain transcoding converts the bitstream format by performing motion compensation on the DCT coefficients. Its advantages are low computation and fast transcoding; its drawback is that it places certain requirements on motion vectors, bit rate and so on, which restricts its practical use. Pixel-domain transcoding decodes the bitstream, returns to the pixel domain through the inverse DCT, and processes the pixels there. It is more flexible and easier to implement, but requires more computation.

H.264/AVC is the latest international video coding standard, jointly developed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). Its main goals are to improve compression performance and to provide a network-friendly video representation. By improving on earlier video coding standards in many respects, H.264 can meet different requirements at a wide range of bit rates: it can be used for low-bit-rate video transmission such as on mobile phones, for medium-bit-rate applications such as video conferencing, and for high-bit-rate applications such as digital television. At comparable perceptual quality, the H.264 bit rate is 1/3 to 1/2 of the MPEG-2 bit rate; compared with other existing video standards, H.264 can be expected to improve coding efficiency by more than 50%. Compared with earlier standards such as MPEG-2 and MPEG-4, H.264 adds variable-block-size motion compensation: it supports a more flexible block-size selection mechanism, with more block shapes and sizes than other video standards. Each P-type (unidirectionally predicted) or B-type (bidirectionally predicted) block has a specific partitioning into blocks of 8 × 8, 8 × 16, 16 × 8 or 16 × 16; if 8 × 8 is chosen, each block may be further split into 8 × 4, 4 × 8 or 4 × 4 blocks. This technique raises coding efficiency, but it also greatly increases coding complexity and computation.

Existing reduced-resolution transcoding methods decode the bitstream to the pixel domain and downsample each frame to the target resolution. Motion estimation is then performed for each block: the estimated motion vector of a block is the weighted average or the median of the motion vectors of all blocks in the corresponding region of the original image. After the estimated motion vector is refined, the reduced-resolution image is encoded. This approach suits reduced-resolution transcoding for standards such as MPEG-2 and MPEG-4. H.264, however, uses variable block sizes, so optimal mode selection is required during encoding and transcoding, and the traditional method is therefore not suitable for reduced-resolution transcoding within the H.264 standard.

Summary of the invention

The object of the invention is to overcome the shortcomings of the prior art by proposing a new mode selection method for H.264 reduced-resolution video transcoding. The method makes full use of the motion mode and motion vector information in the H.264 coded data and, taking into account the variable block sizes of H.264 video coding, effectively reduces the number of candidate modes for mode selection during transcoding, thereby improving transcoding efficiency.

The invention proposes a pixel-domain transcoding method that reduces H.264 spatial resolution by 1/2, characterized in that it comprises the following steps:

1) Decode the H.264 coded bitstream, and record the size of each original block and the value of its corresponding motion vector;

2) Map the motion mode of each original block to the block of the new video frame at a 2:1 ratio (for example, 16 × 16 mode maps to 8 × 8 mode, 16 × 8 mode maps to 8 × 4 mode, and so on), and pass the original motion vector, scaled by 1/2, to the corresponding block in the new video frame. For original blocks smaller than 8 × 8, the 8 × 8 region containing them is mapped as a whole to a 4 × 4 block, and the motion vector MV_{new} of the new block is calculated as:

MV_{new} = \frac{\sum_{i=1}^{n} (MV_{i}^{org} \times w_{i})}{2 \times \sum_{i=1}^{n} w_{i}}

where MV_{i}^{org} is the motion vector of the i-th block in the region of the original frame corresponding to this block, w_{i} is a weight coefficient determined by the area of the i-th block, and n is the total number of original blocks contained in the region;
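A minimal sketch of the 2:1 mapping and the area-weighted motion vector of step 2) is given below. The function names and the tuple layout are hypothetical, the weight w_i is taken to be exactly the block area, and no rounding to the motion vector precision of the encoder is applied; all of these are assumptions of the example.

```python
def map_block_size(width, height):
    """Map an original partition size to the half-resolution frame (2:1)."""
    if width < 8 or height < 8:
        # Partitions smaller than 8x8: the enclosing 8x8 region is mapped
        # as a whole to a single 4x4 block.
        return (4, 4)
    return (width // 2, height // 2)


def weighted_half_mv(blocks):
    """Motion vector of the new 4x4 block.

    `blocks` lists the original partitions inside one 8x8 region as
    ((mv_x, mv_y), width, height). The result is the area-weighted mean of
    the original vectors divided by 2, following the formula for MV_new.
    """
    total = sum(w * h for _, w, h in blocks)
    sum_x = sum(mv[0] * w * h for mv, w, h in blocks)
    sum_y = sum(mv[1] * w * h for mv, w, h in blocks)
    return (sum_x / (2 * total), sum_y / (2 * total))
```

For instance, an 8 × 8 region holding one 8 × 4 partition with vector (4, 0) and two 4 × 4 partitions with vectors (8, 4) and (0, -4) gives `weighted_half_mv([((4, 0), 8, 4), ((8, 4), 4, 4), ((0, -4), 4, 4)])`, which evaluates to (2.0, 0.0).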

3) If the four original blocks corresponding to the current 16 × 16 block are all in 16 × 16 mode, first calculate the distance d_{1} between the two motion vectors of each of the following pairs: upper-left and upper-right, lower-left and lower-right, upper-left and lower-left, and upper-right and lower-right:

d_{1} = \sqrt{(x_{1} - x_{2})^{2} + (y_{1} - y_{2})^{2}}

where x_{1}, x_{2} and y_{1}, y_{2} are the horizontal and vertical components of the two motion vectors;

If the maximum of the distances d_{1} is less than D_{16×16}, the new block uses 16 × 16 motion mode, and its motion vector is the middle vector of the four motion vectors;

If the maximum of the distances d_{1} is greater than D_{16×16} and the minimum is less than D_{8×8}, the blocks corresponding to the two motion vectors with the minimum distance use 16 × 8 or 8 × 16 mode, and the corresponding motion vector is the average of these two motion vectors; in addition, if the distance between the other two motion vectors is less than D_{8×8}, the blocks corresponding to those two motion vectors also use 16 × 8 or 8 × 16 mode, with the average of the two motion vectors as the corresponding motion vector; otherwise those two blocks use 8 × 8 motion mode;

If the maximum of the distances d_{1} is greater than D_{16×16} and the minimum is greater than D_{8×8}, the motion modes and vectors mapped in step 2) are used;

D_{16×16} and D_{8×8} are constants whose values can be chosen according to the video quality and computation required by the application: the larger D_{16×16} and D_{8×8} are, the lower the computation and the poorer the video quality; the smaller they are, the higher the computation and the better the video quality. For example, D_{16×16} and D_{8×8} may be set to 20 and 10 respectively;
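The decision rule of step 3) can be sketched as below. Several points are assumptions of this example rather than statements of the patent: the "middle vector" is taken as the component-wise median, every resulting vector is scaled by 1/2 as in step 2), distances exactly equal to a threshold fall through to the next rule, and the quadrant labels and function names are hypothetical. The sketch reports which original quadrants are merged instead of naming the partition as 16 × 8 or 8 × 16.

```python
import math
from statistics import median

QUADS = ("UL", "UR", "LL", "LR")
PAIRS = [("UL", "UR"), ("LL", "LR"), ("UL", "LL"), ("UR", "LR")]


def dist(a, b):
    """Euclidean distance between two motion vectors."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def half_mean(vectors):
    """Mean of the given vectors, scaled by 1/2 for the half-size frame."""
    n = len(vectors)
    return (sum(v[0] for v in vectors) / (2 * n),
            sum(v[1] for v in vectors) / (2 * n))


def decide_four(mv, d16=20.0, d8=10.0):
    """`mv` maps each quadrant to the vector of its original 16x16 block.
    Returns a list of (merged quadrants, new motion vector)."""
    d = {p: dist(mv[p[0]], mv[p[1]]) for p in PAIRS}
    if max(d.values()) < d16:
        # All four vectors agree closely: one partition for the whole block.
        mid = (median(mv[q][0] for q in QUADS),
               median(mv[q][1] for q in QUADS))
        return [(QUADS, (mid[0] / 2, mid[1] / 2))]
    if min(d.values()) < d8:
        # Merge the closest adjacent pair, then test the complementary pair.
        first = min(d, key=d.get)
        rest = tuple(q for q in QUADS if q not in first)
        parts = [(first, half_mean([mv[q] for q in first]))]
        if dist(mv[rest[0]], mv[rest[1]]) < d8:
            parts.append((rest, half_mean([mv[q] for q in rest])))
        else:
            parts.extend(((q,), half_mean([mv[q]])) for q in rest)
        return parts
    # No pair is close enough: keep the 8x8 modes mapped in step 2).
    return [((q,), half_mean([mv[q]])) for q in QUADS]
```

Returning the merged quadrants keeps the sketch independent of how a particular encoder names the two rectangular partitions.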

4) If three of the four original blocks are in 16 × 16 mode, calculate the distances d_{2} between the motion vectors of the three 16 × 16 blocks:

d_{2} = \sqrt{(x_{1} - x_{2})^{2} + (y_{1} - y_{2})^{2}}

If the maximum distance d_{2} is less than D_{16×16}, 16 × 16 motion mode is added to the candidate modes, and the corresponding motion vector is the middle vector of the three motion vectors;

If the maximum of the distances d_{2} is greater than D_{16×16}, but the distance d between the motion vectors of two adjacent blocks is less than D_{8×8}, the blocks corresponding to these two motion vectors use 16 × 8 or 8 × 16 mode with the average of the two motion vectors as the corresponding motion vector, the other two blocks keep the motion modes obtained by mapping, and this mode is added to the candidate modes;

The motion modes and vectors obtained by mapping in step 2) are added to the candidate modes. After the motion vectors are refined, the best mode is selected from the above candidate modes.
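One possible reading of step 4) is sketched below. The text does not say whether the maximum distance is taken over all three pairs or only over adjacent ones, nor what happens when more than one adjacent pair is close; this example assumes all three pairs for the maximum, adds one candidate per qualifying adjacent pair, takes the "middle vector" as the component-wise median, and scales vectors by 1/2. The names used are hypothetical.

```python
import math
from statistics import median

ADJACENT = [("UL", "UR"), ("LL", "LR"), ("UL", "LL"), ("UR", "LR")]


def dist(a, b):
    """Euclidean distance between two motion vectors."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def candidates_three(mv16, d16=20.0, d8=10.0):
    """`mv16` maps the quadrants of the three 16x16-mode blocks to their
    vectors. Returns a list of (label, motion vector or None)."""
    keys = list(mv16)
    all_d = [dist(mv16[a], mv16[b])
             for i, a in enumerate(keys) for b in keys[i + 1:]]
    cands = []
    if max(all_d) < d16:
        mid = (median(v[0] for v in mv16.values()),
               median(v[1] for v in mv16.values()))
        cands.append(("16x16", (mid[0] / 2, mid[1] / 2)))
    else:
        for a, b in ADJACENT:
            if a in mv16 and b in mv16 and dist(mv16[a], mv16[b]) < d8:
                avg = ((mv16[a][0] + mv16[b][0]) / 2,
                       (mv16[a][1] + mv16[b][1]) / 2)
                cands.append(("merge " + a + "+" + b,
                              (avg[0] / 2, avg[1] / 2)))
    # The modes mapped in step 2) are always kept as a candidate.
    cands.append(("mapped", None))
    return cands
```

The candidate list built this way is what the final rate-distortion comparison would choose from, after motion vector refinement.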

The best mode is selected with the rate-distortion optimization (RDO) technique based on the Lagrangian optimization algorithm used in H.264 encoding, where λ is calculated as follows:

\lambda = 0.85 \times 2^{(QP-12)/2}

QP is the H.264 quantization parameter, and the QP of the new block is not less than the QP of the corresponding original blocks (QP_{downsampling} ≥ QP_{original}).
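A minimal sketch of Lagrangian mode selection follows, assuming the usual cost J = D + λ·R with distortion D and rate R already measured for each candidate. The function names and the candidate tuple layout are hypothetical. The exponent divisor defaults to 2 as printed in the formula above; the H.264 reference encoder is, as far as we are aware, usually described with a divisor of 3, so the divisor is left as a parameter.

```python
def rd_lambda(qp, divisor=2.0):
    """Lagrange multiplier: 0.85 * 2 ** ((QP - 12) / divisor)."""
    return 0.85 * 2 ** ((qp - 12) / divisor)


def best_mode(candidates, qp, divisor=2.0):
    """Pick the candidate with the smallest cost J = D + lambda * R.

    `candidates` is a list of (name, distortion, rate_bits), obtained by
    encoding the block once in each candidate mode.
    """
    lam = rd_lambda(qp, divisor)
    return min(candidates, key=lambda c: c[1] + lam * c[2])
```

Because the candidate list from the earlier steps is short, this comparison runs over a handful of modes rather than over every H.264 partition type.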

Features and effects of the invention

The invention proposes a new pixel-domain transcoding method that reduces H.264 spatial resolution by 1/2. It makes full use of the motion mode and motion vector information in the H.264 coded data and, taking into account the variable block sizes of H.264 video coding, effectively reduces the number of candidate modes for mode selection during transcoding, thereby improving transcoding efficiency.

Embodiment

The mode selection method for H.264 reduced-resolution video transcoding proposed by the invention is described in detail below with reference to the drawings and an embodiment.

The method comprises steps 1) to 4) and the mode selection set out in the summary above. The formulas and decision rules used in those steps are as follows.

Motion vector of a new block mapped from original blocks smaller than 8 × 8 (step 2):

MV_{new} = \frac{\sum_{i=1}^{n} (MV_{i}^{org} \times w_{i})}{2 \times \sum_{i=1}^{n} w_{i}}

Distance between two motion vectors when all four original blocks are in 16 × 16 mode (step 3):

d_{1} = \sqrt{(x_{1} - x_{2})^{2} + (y_{1} - y_{2})^{2}}

If the maximum of the distances d_{1} is greater than D_{16×16} and the minimum is less than D_{8×8}, the blocks corresponding to the two motion vectors with the minimum distance use 16 × 8 or 8 × 16 mode, and the corresponding motion vector is the average of these two motion vectors; in addition, if the distance between the other two motion vectors is less than D_{8×8}, the blocks corresponding to those two motion vectors also use 16 × 8 or 8 × 16 mode, with the average of the two motion vectors as the corresponding motion vector; otherwise those two blocks use 8 × 8 motion mode;

D_{16×16} and D_{8×8} are constants whose values can be chosen according to the video quality and computation required by the application: the larger D_{16×16} and D_{8×8} are, the lower the computation and the poorer the video quality; the smaller they are, the higher the computation and the better the video quality. D_{16×16} and D_{8×8} can be set to 20 and 10;

Distance between motion vectors when three of the four original blocks are in 16 × 16 mode (step 4):

d_{2} = \sqrt{(x_{1} - x_{2})^{2} + (y_{1} - y_{2})^{2}}

If the maximum of the distances d_{2} is greater than D_{16×16}, but the distance d between the motion vectors of two adjacent blocks is less than D_{8×8}, the blocks corresponding to these two motion vectors use 16 × 8 (or 8 × 16) mode with the average of the two motion vectors as the corresponding motion vector, the other two blocks keep the motion modes obtained by mapping, and this mode is added to the candidate modes;

The motion modes and vectors obtained by mapping in step 2) are added to the candidate modes;

After the motion vectors are refined, the best mode is selected from the above candidate modes;

The best mode is selected with the rate-distortion optimization (RDO) technique based on the Lagrangian optimization algorithm used in H.264 encoding, where λ is calculated as follows:

\lambda = 0.85 \times 2^{(QP-12)/2}

The operation of the proposed mode selection method is described in detail below, taking one situation that may arise in step 2 as an example:

After a frame is obtained by decoding the H.264 bitstream, it is downsampled 2:1 to obtain a new frame, and one 16 × 16 block of the new frame is encoded as follows;

This 16 × 16 block corresponds to four 16 × 16 blocks in the original image, each encoded in 16 × 16 motion mode, with motion vectors MV_{1}, MV_{2}, MV_{3} and MV_{4} respectively. Calculate the distance d_{1} between MV_{1} and MV_{2}, between MV_{3} and MV_{4}, between MV_{1} and MV_{3}, and between MV_{2} and MV_{4};

The distance d_{max} between MV_{2} and MV_{4} is the largest, and d_{max} is greater than D_{16×16}, so the 16 × 16 block is not encoded in 16 × 16 motion mode;

The distance d_{min} between MV_{1} and MV_{2} is the smallest and is less than D_{8×8}, so the first 8 × 16 region of the new block uses 8 × 16 motion mode, and its motion vector MV_{1}' is estimated as 1/2 of the average of MV_{1} and MV_{2};

The distance between the motion vectors MV_{3} and MV_{4} of the remaining two blocks is greater than D_{8×8}, so the second 8 × 16 region of the new block does not use 8 × 16 motion mode but two 8 × 8 motion modes, whose motion vectors MV_{21}' and MV_{22}' are estimated as 1/2 of MV_{3} and MV_{4} respectively;

After this 16 × 16 block has been encoded with the above motion modes, encoding continues with the next block.
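The example above gives no numeric values, so the short script below traces it with hypothetical motion vectors chosen only so that the same ordering of distances holds. The thresholds 20 and 10 are the example values given for D_{16×16} and D_{8×8}; everything else is an assumption of the sketch.

```python
import math

D16, D8 = 20.0, 10.0  # example thresholds from the text

# Hypothetical vectors for the four original 16x16 blocks.
mv1, mv2, mv3, mv4 = (0, 0), (4, 0), (0, 16), (4, 30)


def dist(a, b):
    """Euclidean distance between two motion vectors."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


d = {
    "MV1-MV2": dist(mv1, mv2),
    "MV3-MV4": dist(mv3, mv4),
    "MV1-MV3": dist(mv1, mv3),
    "MV2-MV4": dist(mv2, mv4),
}
largest = max(d, key=d.get)    # "MV2-MV4"
smallest = min(d, key=d.get)   # "MV1-MV2"

use_16x16 = d[largest] < D16           # False: 16x16 mode is rejected
merge_first = d[smallest] < D8         # True: MV1 and MV2 are merged
merge_second = d["MV3-MV4"] < D8       # False: two 8x8 blocks instead

# New vectors: half of the pair average, or half of each original vector.
mv1_new = ((mv1[0] + mv2[0]) / 4, (mv1[1] + mv2[1]) / 4)   # (1.0, 0.0)
mv21_new = (mv3[0] / 2, mv3[1] / 2)                        # (0.0, 8.0)
mv22_new = (mv4[0] / 2, mv4[1] / 2)                        # (2.0, 15.0)
```

With these values the block is encoded as one merged region plus two 8 × 8 regions, matching the outcome described in the example.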

Claims

1. A mode selection method for H.264 reduced-resolution video transcoding, characterized in that it comprises the following steps:

1) Decoding the H.264 coded bitstream, and recording the size of each original block and the value of its corresponding motion vector;

2) Mapping the motion mode of each original block to the block of the new video frame at a 2:1 ratio, and passing the original motion vector, scaled by 1/2, to the corresponding block in the new video frame; for original blocks smaller than 8×8, mapping the 8×8 region containing them as a whole to a 4×4 block, the motion vector MV_{new} of the new block being calculated as:

MV_{new} = \frac{\sum_{i=1}^{n} (MV_{i}^{org} \times w_{i})}{2 \times \sum_{i=1}^{n} w_{i}}

where MV_{i}^{org} is the motion vector of the i-th block in the region of the original frame corresponding to the block in the new video frame, w_{i} is a weighting coefficient determined by the area of the i-th block, and n is the total number of original blocks contained in the region;

3) If the four original blocks corresponding to the current 16×16 block are all in 16×16 mode, first calculating the distance d_{1} between the two motion vectors of each of the pairs upper-left and upper-right, lower-left and lower-right, upper-left and lower-left, and upper-right and lower-right:

d_{1} = \sqrt{(x_{1} - x_{2})^{2} + (y_{1} - y_{2})^{2}}

where x_{1}, x_{2} and y_{1}, y_{2} are the horizontal and vertical components of the two motion vectors;

If the maximum of the distances d_{1} is less than D_{16×16}, the new block uses 16×16 motion mode, and the corresponding motion vector is the middle vector of the four motion vectors;

If the maximum of the distances d_{1} is greater than D_{16×16} and the minimum is less than D_{8×8}, the blocks corresponding to the two motion vectors whose distance d_{1} is the minimum use 16×8 or 8×16 mode, and the corresponding motion vector is the average of these two motion vectors; in addition, if the distance between the other two motion vectors is less than D_{8×8}, the blocks corresponding to those two motion vectors use 16×8 or 8×16 mode, and the corresponding motion vector is the average of those two motion vectors; otherwise, those two blocks use 8×8 motion mode;

If the maximum of the distances d_{1} is greater than D_{16×16} and the minimum is greater than D_{8×8}, the motion modes and vectors mapped in step 2) are used;

D_{16×16} and D_{8×8} are constants;

4) If three of the four original blocks are in 16×16 mode, calculating the distances d_{2} between the motion vectors of the three 16×16 blocks:

d_{2} = \sqrt{(x_{1} - x_{2})^{2} + (y_{1} - y_{2})^{2}}

If the maximum distance d_{2} is less than D_{16×16}, 16×16 motion mode is added to the candidate modes, and the corresponding motion vector is the middle vector of the three motion vectors;

If the maximum of the distances d_{2} is greater than D_{16×16}, but the distance d between the motion vectors of two adjacent blocks is less than D_{8×8}, the blocks corresponding to these two motion vectors use 16×8 or 8×16 mode, the corresponding motion vector is the average of these two motion vectors, the other two blocks keep the mapped motion modes unchanged, and this mode is added to the candidate modes;

5) Adding the motion modes and motion vectors mapped in step 2) to the candidate modes, refining the motion vectors, and selecting the best mode from the above candidate modes.

2. The method according to claim 1, characterized in that the selection of the best mode from the candidate modes in step 5) uses the rate-distortion optimization (RDO) technique based on the Lagrangian optimization algorithm in H.264 encoding, where λ is calculated as follows:

\lambda = 0.85 \times 2^{(QP-12)/2}

QP is the H.264 quantization parameter, and the QP of the new block is not less than the QP of the corresponding original blocks.