
HK1205840A1 - Encoding device for high-resolution moving images - Google Patents


Info

Publication number
HK1205840A1
HK1205840A1 (application HK15106259.6A)
Authority
HK
Hong Kong
Prior art keywords
block
macroblock
size
transform
blocks
Prior art date
Application number
HK15106259.6A
Other languages
Chinese (zh)
Other versions
HK1205840B (en)
Inventor
金守年
林晶娟
李英烈
文柱禧
全炳宇
金海光
徐正勛
金基五
洪性旭
Original Assignee
SK Telecom Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SK Telecom Co., Ltd.
Publication of HK1205840A1
Publication of HK1205840B


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122 Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
    • H04N19/129 Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/14 Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the region being a block, e.g. a macroblock
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a pixel
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/57 Motion estimation characterised by a search window with variable size or shape
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H04N19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/86 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video decoding apparatus includes a decoder to decode division information related to dividing a current block into subblocks, and reconstruct transform coefficients of each of the subblocks identified by the division information, and thereby generate transformed residual subblocks; an inverse transformer to inverse-transform each of the transformed residual subblocks, and thereby generate residual subblocks; a predictor to generate predicted subblocks by intra-predicting each of the subblocks; an adder to reconstruct the current block by adding the predicted subblocks to the residual subblocks corresponding thereto; and a first filter to perform deblocking-filtering on boundaries between the subblocks in a reconstructed picture including the reconstructed current block.

Description

Encoding/decoding method and apparatus for high resolution moving picture
The present application is a divisional of invention patent application No. 201080051482.3 (international application No. PCT/KR2010/006017, filing date: 9/3/2010, title: Encoding/decoding method and apparatus for high resolution moving pictures).
Technical Field
The present invention relates to a high resolution video encoding/decoding method and apparatus. More particularly, the present invention relates to a method and apparatus for improving encoding efficiency by performing encoding and decoding in units of various types of blocks and performing transformation, quantization, scanning, and filtering according to the type of block suitable for the corresponding encoding and decoding.
Background
The Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG) have developed a video compression technique improved over the existing MPEG-4 Part 2 and H.263 standards. The new standard is called H.264/AVC (Advanced Video Coding) and was released simultaneously as MPEG-4 Part 10 AVC and ITU-T Recommendation H.264. H.264/AVC (hereinafter referred to as "H.264") can reduce the number of bits of encoded data by performing the following steps: intra prediction or inter prediction is performed in units of macroblocks, each of which may have various types of subblocks, to generate a residual signal; the generated residual signal is transformed and quantized; and the transformed and quantized residual signal is then encoded.
A video encoding apparatus employing a typical macroblock-based encoding method divides the input video into macroblocks, performs prediction for each macroblock, in one of the subblock sizes the macroblock may have, according to an inter mode or an intra mode, to generate a residual block, applies an integer transform based on a 4 × 4 or 8 × 8 Discrete Cosine Transform (DCT) to the generated residual block to generate transform coefficients, and quantizes the transform coefficients according to a specified Quantization Parameter (QP). Blocking artifacts caused by the transform and quantization processes are reduced by loop filtering.
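The macroblock pipeline just described (predict, subtract, transform, quantize) can be sketched for a single 4 × 4 residual block as follows. The transform core is the well-known H.264-style 4 × 4 integer transform; the uniform quantizer and all function names are simplifying assumptions for illustration, not the patent's reference implementation:

```python
# 4x4 integer transform core used by H.264-style codecs.
C = [[1, 1, 1, 1],
     [2, 1, -1, -2],
     [1, -1, -1, 1],
     [1, -2, 2, -1]]

def matmul4(a, b):
    """4x4 integer matrix multiply."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose4(a):
    return [list(row) for row in zip(*a)]

def forward_transform(residual):
    """Y = C * X * C^T, the 4x4 integer-DCT core."""
    return matmul4(matmul4(C, residual), transpose4(C))

def quantize(coeffs, qstep):
    """Simplified uniform quantization (real codecs fold in scaling matrices)."""
    return [[int(round(c / qstep)) for c in row] for row in coeffs]

def encode_block(block, predicted, qstep=8):
    """Residual = source - prediction, then transform and quantize."""
    residual = [[block[i][j] - predicted[i][j] for j in range(4)]
                for i in range(4)]
    return quantize(forward_transform(residual), qstep)
```

For a flat residual the transform concentrates all energy in the DC coefficient, which is what makes the subsequent quantization and entropy coding effective.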
Since a typical video compression technique such as H.264 performs encoding in units of 16 × 16 macroblocks into which the video to be encoded is divided, and also fixes the unit of transform at a block size of 4 × 4 or 8 × 8, coding efficiency drops when the video has high correlation between pixels. That is, when there is high correlation between the pixels of a video, prediction could be performed efficiently in units of blocks larger than a 16 × 16 macroblock, or in units of macroblocks of various shapes, and transform block sizes other than 4 × 4 or 8 × 8 could likewise be used efficiently as the unit of transform. A typical video compression technique, however, cannot perform such adaptive encoding according to the characteristics of the video because the macroblock size and the transform block size are fixed, which reduces encoding efficiency.
Disclosure of Invention
Technical problem
Accordingly, the present invention has been made in an effort to solve the above-mentioned problems, and proposes to improve compression efficiency by performing encoding in units of various types of macroblocks suitable for high-resolution video, and by correspondingly performing various types of prediction, transform and quantization, scanning, filtering, and the like.
Technical solution
An aspect of the present invention provides a video encoding method, including: dividing an input video into a plurality of macroblocks having various shapes or sizes; encoding each of the plurality of macroblocks; and generating macroblock information indicating a shape or size of each of the plurality of macroblocks.
The step of encoding each of the plurality of macroblocks may comprise the steps of: dividing each of the plurality of macroblocks into a plurality of sub-blocks; performing predictive encoding for each of the plurality of sub-blocks; and generating prediction mode information indicating a prediction mode for each of the plurality of sub-blocks and macroblock division information indicating a size of each of the plurality of sub-blocks.
The step of encoding each of the plurality of macroblocks may comprise the steps of: predicting each of the plurality of sub-blocks to obtain a predicted sub-block; obtaining a residual block representing a difference between each of the plurality of sub-blocks and a predicted sub-block of each of the plurality of sub-blocks; determining a transform type based on at least one of a size of each of the plurality of macroblocks, the prediction mode, and a size of each of the plurality of subblocks; transforming the residual block according to the determined transform type; and quantizing the transformed residual block.
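The transform-type decision above might be sketched as follows. The intra rule (transform size equal to the predicted subblock size) follows the description; the inter fallback ordering and the set of supported sizes are assumptions for illustration:

```python
# Candidate transform sizes, largest first (assumed set for this sketch).
SUPPORTED_TRANSFORMS = [(16, 16), (8, 8), (8, 4), (4, 8), (4, 4)]

def choose_transform_size(pred_mode, sub_w, sub_h):
    """Pick a transform size from the prediction mode and subblock size."""
    if pred_mode == "intra":
        # Per the description, the intra transform size matches the
        # predicted subblock size.
        return (sub_w, sub_h)
    # Inter: take the largest supported transform that fits inside the
    # subblock (illustrative selection rule, not mandated by the text).
    for tw, th in SUPPORTED_TRANSFORMS:
        if tw <= sub_w and th <= sub_h:
            return (tw, th)
    return (4, 4)
```

A real encoder would typically also rate-distortion-test several candidates rather than picking greedily, but the mode-and-size dependency is the point here.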
The step of encoding each of the plurality of macroblocks may further comprise the steps of: encoding information indicating the determined transform type.
The information indicating the transform type may include the transform size or category.
Each of the plurality of macroblocks may have a rectangular shape in which a length of a horizontal side is different from a length of a vertical side.
The step of determining the transform type may comprise the steps of: determining the transform size to be the same as a size of the predictor block when the prediction mode is an intra prediction mode.
The transforming of the residual block may comprise the steps of: when the size of the residual block is smaller than the determined transform size, combining a plurality of residual blocks with each other to generate a combined residual block having a size equal to the transform size, and then transforming the combined residual block.
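As one hypothetical illustration of this combining step, several small residual blocks can be tiled into a single block matching the transform size before being transformed (helper name and row-major tiling order are assumptions):

```python
def combine_blocks(blocks, cols):
    """Tile equally sized residual blocks (row-major list) into one larger
    block whose size matches the chosen transform size."""
    bh, bw = len(blocks[0]), len(blocks[0][0])
    rows = len(blocks) // cols
    out = [[0] * (bw * cols) for _ in range(bh * rows)]
    for idx, blk in enumerate(blocks):
        r0, c0 = (idx // cols) * bh, (idx % cols) * bw
        for i in range(bh):
            for j in range(bw):
                out[r0 + i][c0 + j] = blk[i][j]
    return out
```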
The step of encoding each of the plurality of macroblocks may comprise the steps of: scanning coefficients of the transformed and quantized residual block with a scanning pattern selected according to the prediction mode.
The step of scanning the coefficients may comprise the steps of: scanning the coefficients of the transformed and quantized residual block with a scanning pattern corresponding to an intra prediction mode for predicting each of the plurality of sub blocks among a plurality of intra prediction modes when the prediction mode is an intra prediction mode.
The step of scanning the coefficients may comprise the steps of: scanning the coefficients of the transformed and quantized residual block with a scanning pattern selected according to the transform type for the transformed and quantized residual block when the prediction mode is an inter prediction mode.
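A sketch of such mode-dependent scan selection for 4 × 4 blocks is given below. The zig-zag order is the standard one; the specific intra-mode-to-scan mapping shown (vertical prediction to horizontal scan and vice versa) is a common heuristic assumed for illustration, not quoted from the text:

```python
# Standard 4x4 zig-zag scan order (coefficient coordinates).
ZIGZAG_4x4 = [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), (0, 3), (1, 2),
              (2, 1), (3, 0), (3, 1), (2, 2), (1, 3), (2, 3), (3, 2), (3, 3)]
HORIZONTAL_4x4 = [(i, j) for i in range(4) for j in range(4)]  # row by row
VERTICAL_4x4 = [(i, j) for j in range(4) for i in range(4)]    # column by column

def select_scan(pred_mode, intra_mode=None):
    """Pick a scan pattern from the prediction mode (heuristic mapping)."""
    if pred_mode == "intra":
        if intra_mode == "vertical":
            return HORIZONTAL_4x4  # residual energy tends toward top rows
        if intra_mode == "horizontal":
            return VERTICAL_4x4    # residual energy tends toward left columns
    return ZIGZAG_4x4

def scan_coefficients(block, pattern):
    """Serialize a 2-D coefficient block into a 1-D coefficient string."""
    return [block[i][j] for (i, j) in pattern]
```

The decoder side simply runs the same pattern in reverse to rebuild the two-dimensional block from the coefficient string.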
Another aspect of the present invention provides a video encoding apparatus, including: a macroblock type determiner for dividing an input video into a plurality of macroblocks having different shapes or sizes; and a macroblock encoder for encoding each of the plurality of macroblocks and encoding macroblock information indicating a shape or size of each of the plurality of macroblocks.
Yet another aspect of the present invention provides a video decoding method including the steps of: receiving video data obtained by dividing an input video into a plurality of macro blocks having different shapes or sizes and encoding each of the plurality of macro blocks; decoding macroblock information indicating a shape or size of each of the plurality of macroblocks; and decoding each of the plurality of macroblocks based on the macroblock information.
The step of decoding each of the plurality of macroblocks may comprise the steps of: decoding macroblock division information indicating a size of each of the plurality of subblocks and prediction mode information indicating a prediction mode for each of the plurality of subblocks when each of the plurality of macroblocks is divided into the plurality of subblocks; and obtaining a predicted subblock of each of the plurality of subblocks based on the macroblock division information and the prediction mode information.
The step of decoding each of the plurality of macroblocks may comprise the steps of: determining an inverse transform type based on at least one of a size of each of the plurality of macroblocks, the prediction mode, and a size of each of the plurality of sub-blocks; inverse quantizing and inverse transforming the residual block according to the determined inverse transform type to obtain an inverse-quantized and inverse-transformed residual block; and adding the inverse quantized and inverse transformed residual block to the predictor block to obtain a reconstructed block.
The step of determining the inverse transform type may comprise the steps of: determining the inverse transform type based on information indicating the inverse transform type.
The information indicating the inverse transform type may include the inverse transform size or kind.
Each of the plurality of macroblocks may have a rectangular shape in which a length of a horizontal side is different from a length of a vertical side.
The step of determining the inverse transform type may comprise the steps of: determining the inverse transform size to be the same as a size of each of the plurality of sub-blocks when the prediction mode is an intra prediction mode.
The step of decoding each of the plurality of macroblocks may further comprise the steps of: dividing the inverse-quantized and inverse-transformed residual block into a plurality of residual sub-blocks each having a size equal to the size of the predictor block, when the size of the predictor block is smaller than the determined inverse transform size; and adding each of the plurality of residual sub-blocks to the predictor block to obtain a reconstructed block.
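The dividing-and-adding step described above can be sketched as follows, with hypothetical helper names and 8-bit clamping assumed for illustration:

```python
def split_block(block, sub_h, sub_w):
    """Divide an inverse-quantized, inverse-transformed residual block into
    sub_h x sub_w residual subblocks, row-major."""
    h, w = len(block), len(block[0])
    return [[[block[r + i][c + j] for j in range(sub_w)] for i in range(sub_h)]
            for r in range(0, h, sub_h) for c in range(0, w, sub_w)]

def add_to_prediction(residual_sub, predicted_sub):
    """Reconstruct one subblock, clamping to the 8-bit pixel range."""
    return [[max(0, min(255, residual_sub[i][j] + predicted_sub[i][j]))
             for j in range(len(residual_sub[0]))]
            for i in range(len(residual_sub))]
```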
The step of decoding each of the plurality of macroblocks may further comprise the steps of: scanning a coefficient string of the transformed and quantized residual block to obtain a two-dimensional transformed and quantized residual block, wherein the scanning of the coefficient string is selected according to the prediction mode.
The step of scanning the coefficient string may comprise the steps of: scanning the coefficient string of the transformed and quantized residual block with a scan pattern corresponding to an intra prediction mode for predicting each of the plurality of sub blocks among a plurality of intra prediction modes when the prediction mode is an intra prediction mode.
The step of scanning the coefficient string may comprise the steps of: scanning the coefficient string of the transformed and quantized residual block with a scanning pattern selected according to the inverse transform type when the prediction mode is an inter prediction mode.
Yet another aspect of the present invention provides a video decoding apparatus including: a decoder for receiving video data obtained by dividing an input video into a plurality of macroblocks having different shapes or sizes and encoding each of the plurality of macroblocks, and decoding macroblock information indicating the shape or size of each of the plurality of macroblocks; and an inverse quantizer and inverse transformer for inverse quantizing and inverse transforming each of the plurality of macroblocks based on the macroblock information.
Advantageous effects
According to the present invention described above, not only can the coding efficiency be improved because the present invention enables coding with high correlation between temporally/spatially adjacent pixels appearing in video, but also the compression efficiency can be improved by reducing block distortion. Also, the number of times filtering is performed may be reduced, which makes it possible to reduce implementation complexity of the video encoding and decoding apparatus.
Drawings
Fig. 1 is a schematic block diagram illustrating the structure of a video encoding apparatus according to an aspect of the present invention;
fig. 2 to 4 are views illustrating an intra prediction mode according to a macroblock type for typical video encoding;
fig. 5 is a view illustrating an inter prediction mode according to a macroblock type for typical video encoding;
fig. 6 is a view illustrating a macroblock of size M × N according to an aspect of the present invention;
fig. 7 is a view illustrating various types of sub-macroblocks that a macroblock of size M × N may have according to an aspect of the present invention;
FIG. 8 is a view illustrating MF for 8 x 4 transform according to an aspect of the present invention;
fig. 9 is a view for explaining filtering with various transform and quantization types applied according to an aspect of the present invention;
fig. 10 is a view for explaining a process of performing deblocking (deblocking) filtering across block boundaries according to an aspect of the present invention;
fig. 11 is a view for explaining a process of performing a deringing filtering according to an aspect of the present invention;
FIG. 12 is a view illustrating a scan sequence according to transform and quantization types according to an aspect of the present invention;
fig. 13 to 18 are views for explaining a method of applying CAVLC according to transform and quantization types according to an aspect of the present invention;
fig. 19 is a flowchart for explaining a video encoding method according to an aspect of the present invention;
fig. 20 is a schematic block diagram of a video decoding apparatus according to an aspect of the present invention;
fig. 21 is a flowchart for explaining a video decoding method according to an aspect of the present invention;
fig. 22 is a view illustrating various sub-blocks into which a 64 × 64 macroblock is divided, according to an aspect of the present invention; and
fig. 23 is a view illustrating a scan pattern for coefficients of an intra 4 × 4 block according to an intra prediction mode according to an aspect of the present invention.
Detailed Description
Aspects of the present invention will be described in detail below with reference to the accompanying drawings. In the following description, the same elements are denoted by the same reference numerals although they are shown in different drawings. Further, in the following description of the present invention, a detailed description of known functions and configurations incorporated herein will be omitted when it may make the subject matter of the present invention unclear.
In addition, terms such as first, second, A, B, (a), and (b) may be used in describing the components of the present invention. These terms are used only to distinguish one component from another and do not imply the substance, order, or sequence of the components. When a component is described as being "connected", "coupled", or "linked" to another component, that component may be directly connected or linked to the other component, but may also be indirectly connected or linked to it via a third component.
Fig. 1 schematically illustrates the structure of a video encoding apparatus according to an aspect of the present invention.
The video encoding apparatus 100 according to an aspect of the present invention is an apparatus for encoding video, and may include a predictor 110, a subtractor 120, a transformer and quantizer 130, a scanner 140, an encoder 150, an inverse quantizer and inverse transformer 160, an adder 170, a filter 180, and a macroblock type determiner 190. The video encoding apparatus 100 may be a personal computer (PC), a notebook or laptop computer, a personal digital assistant (PDA), a portable multimedia player (PMP), a portable game console such as a PlayStation Portable (PSP), a mobile communication terminal, a smartphone, or the like, and represents various apparatuses equipped with, for example, a communication device such as a modem for communicating over wired/wireless networks, a memory for storing programs for encoding video and the related data, and a microprocessor for executing the programs to perform operations and control.
An input video to be encoded, such as a frame or slice, may be divided into unit blocks for encoding. In the present invention, each unit block used for encoding or decoding is referred to as a macroblock. According to an aspect of the invention, a macroblock may have a variable size of M × N. Here, each of M and N may have a value of 2^n (n being an integer of 1 or more), and specifically may be an integer of 16 or more. The macroblock according to the present invention differs from the conventional macroblock in that it may have a variable shape or size.
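Under this reading of the size constraint (each of M and N a power of two and at least 16), a validity check might look like:

```python
def is_valid_macroblock_size(m, n):
    """Check the M x N constraint described above: each dimension a power
    of two (2^n) and at least 16.  Sketch under that reading of the text."""
    def pow2(x):
        return x >= 1 and (x & (x - 1)) == 0
    return m >= 16 and n >= 16 and pow2(m) and pow2(n)
```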
In addition, according to the present invention, macroblocks having different shapes or sizes may be used for each input video to be encoded, such as a frame or a slice. Also, one input video (such as a frame or a slice) to be encoded or decoded may be divided into a plurality of macroblocks having different shapes or sizes, which are sequentially encoded or decoded. Macroblock information indicating the shape or size of each macroblock is encoded, and the encoded macroblock information is included in a picture header, a slice header, or a macroblock header. When the video decoding apparatus decodes the encoded data, it may determine the type of macroblock to be decoded using the macroblock information. The block type to use may be determined by selecting the type that yields optimal efficiency when the video is encoded using various block types, or by selecting a block type according to characteristics obtained by analyzing the frame. For example, if the frame has high spatial redundancy, a square macroblock larger than the conventional fixed 16 × 16 macroblock (such as a 32 × 32 or 64 × 64 macroblock) may be selected as the unit of encoding. Alternatively, if the frame has high horizontal correlation or high vertical correlation, a horizontally long or vertically long macroblock may be selected. To this end, the video encoding apparatus 100 may include a macroblock type determiner 190 for determining the macroblock type, dividing an input video to be encoded, such as a frame or a slice, into a plurality of macroblocks each having the determined shape or size, encoding macroblock information for the macroblock type, and including the encoded macroblock information in the encoded data.
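One hypothetical way to pick a macroblock shape from frame characteristics, as suggested above, is to compare horizontal and vertical pixel correlation. The correlation measure (mean absolute neighbour difference), the threshold ratio, and the candidate (width, height) sizes below are all assumptions for illustration:

```python
def mean_abs_diff_h(frame):
    """Mean absolute difference between horizontal neighbours (low value
    means high horizontal correlation)."""
    return (sum(abs(row[j] - row[j + 1])
                for row in frame for j in range(len(row) - 1))
            / (len(frame) * (len(frame[0]) - 1)))

def mean_abs_diff_v(frame):
    """Mean absolute difference between vertical neighbours."""
    return (sum(abs(frame[i][j] - frame[i + 1][j])
                for i in range(len(frame) - 1) for j in range(len(frame[0])))
            / ((len(frame) - 1) * len(frame[0])))

def choose_macroblock_shape(frame, ratio=1.5):
    """Return a (width, height) macroblock shape for the frame."""
    dh, dv = mean_abs_diff_h(frame), mean_abs_diff_v(frame)
    if dh * ratio < dv:   # much smoother horizontally -> wide blocks
        return (32, 16)
    if dv * ratio < dh:   # much smoother vertically -> tall blocks
        return (16, 32)
    return (32, 32)       # no strong direction -> large square blocks
```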
As shown in fig. 1, a macroblock encoder 100a, which is a component of the present invention, may include a predictor 110, a subtractor 120, a transformer and quantizer 130, a scanner 140, an encoder 150, an inverse quantizer and inverse transformer 160, an adder 170, and a filter 180. The macroblock encoder 100a encodes each macroblock, and encodes macroblock information indicating the shape or size of each macroblock.
The predictor 110 predicts the current block to be encoded using other previously decoded pixel values to generate a prediction block of the current block. That is, the predictor 110 predicts the current block using intra prediction, inter prediction, or the like to generate a prediction block having prediction pixel values as pixel values of respective pixels.
The current block to be predicted may be a macroblock or, if necessary, may be a subblock obtained by dividing the macroblock in order to optimize the predicted pixel values. That is, a macroblock that is a unit of encoding may also be divided into a plurality of prediction units each smaller than the macroblock. As a result, the prediction block may be generated in units of subblocks divided from the macroblock. Here, the macroblock may be an M × N block having a square or rectangular shape, and each subblock may be a P × Q block whose horizontal and vertical sizes are each a value of 2^n, within a range not exceeding the macroblock size.
The predictor 110 generates information indicating the type of the subblock, i.e., the subblock type, and provides the generated information to the encoder 150. The encoder 150 encodes information indicating the sub-block type and provides the encoded information to the video decoding apparatus. The subblock type includes a subblock prediction mode and a subblock size. The subblocks may be classified into intra subblocks and inter subblocks according to a subblock prediction mode. Intra sub-blocks may also be classified into intra 4 × 4 blocks, intra 8 × 8 blocks, intra 16 × 16 blocks, etc. according to sub-block sizes. In addition, intra sub-blocks may be classified into various intra sub-blocks, such as intra 32 × 32 blocks and intra 16 × 32 blocks. The inter sub-blocks may be classified into inter 4 × 4 blocks, inter 8 × 4 blocks, inter 4 × 8 blocks, inter 8 × 16 blocks, inter 16 × 8 blocks, inter 16 × 16 blocks, and the like. In addition, the inter sub-blocks may be classified into various inter sub-blocks, such as an inter 32 × 16 block, an inter 16 × 32 block, and an inter 32 × 32 block.
According to an aspect of the present invention, the video encoding apparatus may generate prediction mode information indicating a prediction mode for each sub-block and include the generated prediction mode information in a bitstream. The prediction modes may include an intra prediction mode, an inter prediction mode, a skip mode, and a direct mode.
Also, when a macroblock, which is a coding unit, is divided into a plurality of subblocks for prediction, the video encoding apparatus according to the present invention may generate macroblock division information indicating the shape or size of the subblock into which the macroblock is divided, and transmit the generated macroblock division information to the video decoding apparatus. The macroblock partition information may be implemented in various ways according to how the macroblock is divided. According to an aspect of the present invention, the macroblock division information may include a start position and a size of each subblock within the macroblock. In this case, flexible block division is possible, but the amount of data to be transmitted is increased. According to another aspect of the present invention, the macroblock partition information may be implemented by a flag having one or more bits indicating a partition type. The flags have respective values that define different block partitioning schemes. For example, if the flag has a value of "0", the flag indicates that the block is not divided. Also, if the flag has a value of "1", the flag indicates that the block is divided into four equally sized sub-blocks. When there are three or more partition types, each partition type may be indicated by a flag having two or more bits.
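The flag-based partition signaling described above can be sketched as follows. This is an illustrative mapping only: the text fixes the meanings of flag values 0 and 1, while any further partition types (and the helper name used here) are assumptions.

```python
def subblocks_for_partition_flag(flag, mb_w, mb_h):
    """Map a partition flag to (x, y, w, h) sub-block tuples within a macroblock.

    Flag semantics from the text: 0 = not divided, 1 = four equally sized
    sub-blocks. Partition types beyond these would need a flag of two or
    more bits, per the text.
    """
    if flag == 0:
        return [(0, 0, mb_w, mb_h)]
    if flag == 1:
        hw, hh = mb_w // 2, mb_h // 2
        # Four quadrants, in raster order.
        return [(x, y, hw, hh) for y in (0, hh) for x in (0, hw)]
    raise ValueError("flags beyond 0/1 need an extended (2+ bit) code")
```

For a 32 × 16 macroblock with flag 1, this yields four 16 × 8 sub-blocks.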
As described above, the prediction mode information and the macroblock partition information may be implemented by independent syntax elements, or may be expressed together by one syntax element. Since the conventional H.264 standard fixes the size of a macroblock, which is the unit of encoding and decoding, at 16 × 16, the prediction mode for a macroblock and the partition mode for predicting the macroblock can be indicated by only one syntax element "mb_type" generated for each macroblock. However, in the present invention, since the size or shape of the macroblock is variable, the video encoding apparatus may generate the prediction mode information, the macroblock partition information, and the macroblock information indicating the size or shape of the macroblock, and provide them to the video decoding apparatus.
The subtractor 120 generates a residual block by subtracting the prediction block from the current block. That is, the subtractor 120 calculates a difference between original pixel values of respective pixels of the current block and predicted pixel values of respective pixels of the predicted block to generate a residual block having a residual signal.
The transformer and quantizer 130 determines a transform and quantization type according to the shape and size of a current macroblock to be encoded, a block type of the current macroblock or subblock, and the like, and transforms and quantizes a residual block according to the determined transform and quantization type. More specifically, the transformer and quantizer 130 transforms a residual signal of the residual block into a frequency domain to generate a transformed residual block having transform coefficients, and quantizes the transformed residual block to generate a transformed and quantized residual block having quantized transform coefficients.
When the transformer and quantizer 130 transforms and quantizes the residual block, the transform is not completed until the quantization is completed, because part of the transform process is included in the quantization process. Here, a technique for transforming a video signal in the spatial domain into the frequency domain, such as an integer transform based on the Hadamard transform and the Discrete Cosine Transform (DCT) (hereinafter, simply referred to as "integer transform"), may be used as the transform method, and various quantization techniques, such as dead zone uniform threshold quantization (hereinafter, referred to as "DZUTQ") and quantization weighting matrices, may be used as the quantization method.
Also, various transform and quantization types having a block size of P × Q may be used within a range not exceeding the size of the current macroblock. Here, the transforms and quantizations of block size P × Q may correspond to the subblock sizes that a macroblock of the current size M × N may have, in addition to the typical transform and quantization of block size 4 × 4 and transform and quantization of block size 8 × 8.
Also, the transformer and quantizer 130 may transform and quantize the residual block based on a transform and quantization type determined according to a prediction mode for a current macroblock or subblock and/or a size of a subblock (i.e., a subblock that is a prediction unit). In this regard, when the current block is an intra block type, the transform and quantization type may be determined to be the same as a block size of the intra block type. Also, when the current block is an inter block type, one transform and quantization type may be determined from among a plurality of transform and quantization types using encoding costs. Here, the plurality of transform and quantization types may be not only transform and quantization types having the same size as the block size of the subblock, but also transform and quantization types having various block sizes. The process of transforming and quantizing the residual block by the transformer and quantizer 130 is described in detail below.
The scanner 140 scans the quantized transform coefficients of the transformed and quantized residual block output from the transformer and quantizer 130 to generate a quantized transform coefficient string. In this regard, the scanning method is determined in consideration of the transform technique, the quantization technique, and the characteristics of the block (macroblock or subblock), and the scanning sequence is determined such that the scanned quantized transform coefficient string has a minimum length. Also, the scanning method may be changed according to the intra block and the inter block. Details of the intra block and the inter block are described below. Although the scanner 140 is shown and described as being implemented separately in fig. 1, the scanner 140 may be omitted and its functionality may be incorporated into the encoder 150.
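As an illustration of coefficient scanning, the sketch below generalizes the conventional zig-zag order to an arbitrary rows × cols block. The zig-zag pattern itself is an assumption here; the text only requires that the scan order be chosen so that the quantized transform coefficient string has minimum length.

```python
def zigzag_scan(block):
    """Scan a 2-D block of quantized coefficients into a 1-D string
    by anti-diagonals, alternating direction (classic zig-zag order)."""
    rows, cols = len(block), len(block[0])
    out = []
    for s in range(rows + cols - 1):          # s = i + j indexes one anti-diagonal
        diag = [(i, s - i) for i in range(rows) if 0 <= s - i < cols]
        if s % 2 == 0:
            diag.reverse()                    # even diagonals run bottom-left to top-right
        out.extend(block[i][j] for i, j in diag)
    return out
```

For a 4 × 4 block this reproduces the familiar (0,0), (0,1), (1,0), (2,0), (1,1), (0,2), ... ordering, and the same routine handles rectangular P × Q transform blocks.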
The encoder 150 encodes the transformed and quantized residual block to generate encoded data. That is, the encoder 150 encodes a quantized transform coefficient string obtained by scanning quantized transform coefficients of a transformed and quantized residual block output from the transformer and quantizer 130 to generate encoded data, or encodes a quantized transform coefficient string obtained by scanning in the scanner 140 to generate encoded data.
Entropy encoding may be used as the encoding technique, but various other encoding techniques may also be used without limitation. Also, the encoded data output from the encoder 150 may include not only a bitstream resulting from encoding the quantized transform coefficient string but also various information required for decoding the encoded bitstream. Here, the various information required for decoding the encoded bitstream may be the above-described information, that is, macroblock information indicating the size or shape of a macroblock as a coding unit, prediction mode information, macroblock division information indicating the division scheme or size of subblocks (i.e., blocks as a prediction unit) in the case where the macroblock is divided into a plurality of subblocks for prediction, information on a motion vector in the case where the prediction mode is an inter prediction mode, information regarding the transform and quantization type, and the like, but may also be various other information.
The inverse quantizer and inverse transformer 160 inversely quantizes and inversely transforms the transformed and quantized residual block output from the transformer and quantizer 130 to reconstruct the residual block. The inverse quantization and inverse transform may be implemented by inversely performing the transform and quantization processes performed by the transformer and quantizer 130. That is, the inverse quantizer and inverse transformer 160 may perform inverse quantization and inverse transform using the transform and quantization information (e.g., information for transform and quantization types) transmitted from the transformer and quantizer 130 to inversely perform a process of transforming and quantizing the residual block by the transformer and quantizer 130.
The adder 170 adds the inverse-quantized and inverse-transformed residual block output from the inverse quantizer and inverse transformer 160 to the prediction block output from the predictor 110 to reconstruct the current block.
The filter 180 filters the current block reconstructed by the adder 170. In this regard, the filter 180 reduces blocking artifacts occurring at block boundaries of the video due to transforming and quantizing the video in units of blocks and ringing (ringing) noise occurring around edges of the video due to high frequency loss. Here, the deblocking filter and the deringing filter may be used to reduce blocking artifacts and ringing noise, respectively, and one of the following may be selectively employed: filtering with both a deblocking filter and a deringing filter, filtering with a deblocking filter or with a deringing filter, and filtering without any of a deblocking filter and a deringing filter. Also, one of the following ways may be selectively employed: the deblocking filtering is applied at the boundaries between sub-blocks and at the boundaries between macro-blocks and the deblocking filtering is applied only at the boundaries between macro-blocks.
In typical video encoding, the macroblock type used for video encoding is a square macroblock having 16 × 16 pixels, and a prediction block may be generated by performing at least one of intra prediction and inter prediction for each macroblock. Video coding in units of macroblocks is widely used because it enables efficient encoding that takes into account the regional characteristics of the video. Also, since various intra prediction or inter prediction methods are used to generate the prediction block, the encoding efficiency of the video is high.
Fig. 2 to 4 illustrate intra prediction modes according to macroblock types used for typical video encoding.
Fig. 2 illustrates nine intra prediction modes in the case where the macroblock type is an intra 4 × 4 macroblock, fig. 3 illustrates nine intra prediction modes in the case where the macroblock type is an intra 8 × 8 macroblock, and fig. 4 illustrates four intra prediction modes in the case where the macroblock type is an intra 16 × 16 macroblock.
When the macroblock type is an intra block type, the macroblock to be encoded is predicted using intra prediction. The intra block type is subdivided into an intra 4 × 4 macroblock, an intra 8 × 8 macroblock, an intra 16 × 16 macroblock, and the like. For each intra block type, the macroblock is predicted using neighboring pixels of previously encoded, decoded, and reconstructed neighboring blocks according to the prediction modes shown in fig. 2 through 4.
Fig. 5 illustrates inter prediction modes according to macroblock types used for typical video coding.
When the macroblock type is an inter block type, the macroblock to be encoded is predicted using inter prediction. In this case, as shown in fig. 5, the prediction block is generated by performing prediction at a block size of 16 × 16, 16 × 8, 8 × 16, or 8 × 8 for a macroblock using a previously encoded, decoded, and reconstructed frame. When prediction is performed at a block size of 8 × 8 for a macroblock, a prediction block is generated by performing prediction at a block size of 8 × 8, 8 × 4, 4 × 8, or 4 × 4 for each 8 × 8 block.
However, when a high-resolution video is encoded in units of macroblocks each having a block size of 16 × 16, as in typical video encoding, the encoding cannot efficiently exploit the high correlation between pixels that is a characteristic of high-resolution video. This is because, although the prediction accuracy of a prediction block generated in units of macroblocks each having an extended block size of M × N is similar to that of a prediction block generated in units of macroblocks each having the typical block size of 16 × 16, the number of macroblocks to be encoded is increased when the video is encoded in units of 16 × 16 macroblocks, thereby decreasing the encoding efficiency.
Also, DCT-based integer transforms of block size 4 × 4 or 8 × 8 are used in typical video coding. Such an integer transform has advantages in terms of coding efficiency and complexity because it does not perform operations on real numbers, which is a disadvantage of the DCT, but operates only on integers while maintaining the characteristics of the DCT as much as possible. Blocking artifacts and ringing noise due to the transform in units of blocks can be minimized using filtering.
However, it is more effective for the video encoding apparatus 100 to encode high-resolution video using various types of transforms and quantization in accordance with a block size P × Q, rather than using only transforms and quantization in accordance with typical 4 × 4 or 8 × 8 block sizes. This is because in the case of coding a wide area where pixels with high correlation are clustered together, using only transform and quantization in 4 × 4 or 8 × 8 block sizes results in significant blockiness and loss of high frequency components.
In contrast, if various types of transforms and quantizations of block size P × Q are available for performing the transform and quantization, blocking artifacts may be reduced, and ringing noise may also be reduced because the loss of high frequency components is smaller than when only transforms and quantizations of typical block sizes are used. Thereby, the number of filtering operations is also reduced, which makes it possible to reduce the complexity of the filtering operation, a major contributor to the implementation complexity of the video encoding apparatus 100 and of the video decoding apparatus described below. Also, since various types of transforms and quantizations of block size P × Q are used, the scanner 140 may scan the quantized transform coefficients obtained by the transformer and quantizer 130 in a manner suited to the transform and quantization block size, resulting in improved coding efficiency.
Accordingly, in an aspect of the present invention, in contrast to typical video encoding in which video is encoded in units of macroblocks each having a block size of 16 × 16, a residual block is generated by performing prediction in units of macroblocks each having an extension block size of M × N. In addition, instead of only a 4 × 4 or 8 × 8 block size being used as a transform and quantization block size, various transforms and quantization in accordance with a block size P × Q are used to perform the transform and quantization, and filtering and scanning appropriate for the transform and quantization block size are performed.
Fig. 6 illustrates a macroblock of size M × N according to an aspect of the present invention.
According to an aspect of the present invention, a video may be encoded in units of various types of macroblocks, including macroblocks having sizes of 64 × 64, 128 × 128, and 64 × 128 and macroblocks having sizes of 32 × 32 and 32 × 16 as shown in fig. 6. As shown in fig. 6, the macroblock having the size of M × N may have not only a square shape but also a rectangular shape.
Fig. 7 illustrates various types of sub-macroblocks that a macroblock of size M × N may have in accordance with an aspect of the present invention.
Fig. 7 shows an example of block sizes of sub-macroblocks that a macroblock having a block size of 32 × 16 may have. When a macroblock is predicted using these sub-macroblocks, a prediction block more similar to the original macroblock can be generated, and thus coding efficiency can be further improved. For this, M and N, which determine the size of the macroblock of size M × N, may each have a value of 2^n, and J and K, which determine the size of the sub-macroblock of size J × K, may each have a value of 2^n within ranges not exceeding M and N, respectively.
The transformer and quantizer 130 transforms and quantizes the residual block according to the transform and quantization type, transforming the residual signal of the residual block into transform coefficients and generating quantized transform coefficients by quantizing the transform coefficients. In this regard, the transformer and quantizer 130 may determine the transform and quantization block size for the transform and quantization in consideration of the shape or size of the current macroblock to be encoded. That is, the size of the block that is the unit of transform and quantization, i.e., the transform and quantization block size, is equal to or smaller than the size of the current macroblock. Also, when the current macroblock has a rectangular shape, a rectangular transform may be selected. Also, a block size available according to the size of the current macroblock may be selected as the transform and quantization block size. For example, when the current macroblock is a large block having a size of 64 × 64, a transform having a size larger than the 4 × 4 transform, the 4 × 8 transform, the 8 × 4 transform, and the like may be used. If the current macroblock is a conventional 16 × 16 macroblock, the typical 4 × 4 and 8 × 8 transforms may be used. According to another aspect of the present invention, the transformer and quantizer 130 may determine the transform and quantization block size in consideration of the prediction mode for the sub-macroblock or the size of the block that is the unit of prediction.
For example, when the block is an intra block type, the size of the block subjected to intra prediction may be determined as a transform and quantization block size. That is, the transform and quantization of the block size 4 × 4 may be used in the case of intra 4 × 4 prediction, the transform and quantization of the block size 8 × 8 may be used in the case of intra 8 × 8 prediction, the transform and quantization of the block size 16 × 16 may be used in the case of intra 16 × 16 prediction, and the transform and quantization of the block size 16 × 8 may be used in the case of intra 16 × 8 prediction. Therefore, when the unit of intra prediction is a block of size P × Q, transform and quantization of block size P × Q can be determined as a transform and quantization type.
As another example, when the block is an inter block type, a transform and quantization block size that minimizes an encoding cost may be determined from among a plurality of transform and quantization block sizes. That is, one transform and quantization block size may be selected from among transform and quantization block sizes 4 × 4, 8 × 8, 16 × 16, 32 × 16, 8 × 16, 16 × 8, and the like, and the residual block may be transformed and quantized using transform and quantization according to the selected block size.
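The cost-based selection for inter blocks can be sketched as below; the cost function is a stand-in for a real rate-distortion measure (distortion plus λ × rate), which the text does not specify.

```python
def pick_transform_size(residual, candidate_sizes, cost_fn):
    """Return the candidate transform/quantization block size (w, h) with
    the lowest encoding cost for the given residual block.

    cost_fn(residual, size) is a stand-in for a real rate-distortion
    cost; any callable with that signature works here.
    """
    return min(candidate_sizes, key=lambda size: cost_fn(residual, size))
```

For example, with a toy cost that penalizes deviation from an area of 64 samples, the candidates {4 × 4, 8 × 8, 16 × 16} would yield 8 × 8.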
Hereinafter, a transform and quantization process according to an aspect of the present invention will be described based on an assumption that transform and quantization having a block size of 8 × 4 are determined as a transform and quantization type.
The transform with a block size of 8 × 4 can be designed by combining the 4 × 4 integer transform and the 8 × 8 integer transform, and can be expressed by the following equation:

Y = A X B^T    (Formula 1)

In formula 1, X denotes a residual block of block size 8 × 4 generated from a prediction block of block size 8 × 4, A denotes the matrix for the 4 × 4 integer transform, B denotes the matrix for the 8 × 8 integer transform, the superscript T denotes the transpose, i.e., the matrix obtained by interchanging the rows and columns of the corresponding matrix, and Y denotes the transformed residual block that results from performing the 8 × 4 transform on the residual block of block size 8 × 4.
In the above formula, A and B^T may be represented as given by:
formula 2
In equation 2, x of matrix A is 1/2, y of matrix A is √(1/2)·cos(π/8), z of matrix A is √(1/2)·cos(3π/8), and a through g of matrix B are the corresponding 8-point DCT basis values. Here, in order to perform integer arithmetic while maintaining the orthogonality that is a characteristic of the DCT, each of the 4 × 4 integer transform and the 8 × 8 integer transform is decomposed and approximated as follows:
formula 3
In equation 3, x of matrix A is 1/2 as in equation 2, y of matrix A is approximated as √(2/5), and w (= z/y) of matrix A is approximated as 1/2. Likewise, a of matrix B is the same as in equation 2, b and c of matrix B are approximated so as to preserve orthogonality, and k (= d/b) of matrix B is approximately 5/6, l (= e/b) of matrix B is approximately 1/2, and m (= g/b) of matrix B is approximately 1/4. By this processing, equation 1 can be rewritten as follows:
Y = (C X D^T) ⊗ E    (Formula 4)
In formula 4, X and Y are the same as in formula 1, C represents the right-hand 4 × 4 matrix of matrix A in formula 3, and D^T represents the left-hand 8 × 8 matrix of matrix B in formula 3. The operator ⊗ represents element-by-element multiplication of the coefficients of the (C X D^T) result matrix with the coefficients of matrix E. Matrix E represents the 8 × 8 matrix derived in the process of decomposing and approximating equation 1 into equation 4, and the elements of matrix E are given by:
formula 5
In formula 5, x, y, a, b, and c of matrix E are the same as those in formula 3. As can be seen from equation 5, the matrices C, D^T, and E have non-integer coefficients. Thus, for integer operations, the corresponding matrices are scaled, as given by:
formula 6
Once the scaling process of equation 6 is completed, the 8 × 4 integer transform is designed. That is, for integer operation, the final integer transform with block size 8 × 4 is obtained by including the matrix E in the quantization process.
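The separable structure of formula 1 (Y = A X B^T, with the 8 × 4 block held as 4 rows × 8 columns) can be checked numerically. Because the integer matrices of equations 2 through 6 are not reproduced above, this sketch substitutes the exact orthonormal 4-point and 8-point DCT matrices for A and B; the integer design described in the text approximates these.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    def c(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    return [[c(k) * math.cos((2 * j + 1) * k * math.pi / (2 * n))
             for j in range(n)] for k in range(n)]

def matmul(p, q):
    """Plain matrix product of nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transform_8x4(x):
    """Y = A X B^T for a 4-row, 8-column residual block X (formula 1),
    with A the 4-point and B the 8-point orthonormal DCT matrix."""
    a = dct_matrix(4)
    bt = [list(col) for col in zip(*dct_matrix(8))]  # B^T
    return matmul(matmul(a, x), bt)
```

A constant block concentrates all its energy in the DC coefficient Y[0][0], and because both matrices are orthonormal the transform preserves the sum of squared values, mirroring the orthogonality the integer approximation is designed to keep.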
The basic quantization process can be represented by the following equation:
Zij = round(Yij / Qstep)    (Formula 7)
In formula 7, Yij represents an element of the matrix of the transformed residual block, and Qstep represents the quantization step size.
When the basic quantization operation performed as shown in equation 7 is applied to an actual quantization process for transform and quantization with a block size of 8 × 4, it can be expressed as follows:
|Zij| = (|Wij| · MF + f) >> qbits    (Formula 8)

sign(Zij) = sign(Wij)    (Formula 9)

In formula 8, Wij represents an element of the matrix obtained by transforming the respective residual signals of the residual block, MF represents a multiplication factor determined according to the quantization parameter, and f, which is a factor determining the rounding error and the dead zone size, is 2^qbits/3 when the current block is predicted through intra prediction and is fixed at 2^qbits/6 when the current block is predicted through inter prediction. Here, qbits = 16 + floor(QP/6) (where floor refers to the rounding-down operation), and qbits may be changed according to the maximum and minimum values of the transform coefficients after the transform.
In this regard, the matrix E of equation 6 is included in MF, and MF is given as follows:
MF = (PF × 2^qbits) / Qstep    (Formula 10)
In equation 10, PF refers to the matrix E, and since PF varies according to the transform type and the approximation, an MF appropriate for each transform must be obtained and used.
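Formulas 7 through 10 can be sketched together as a single integer quantization step. The function below follows formulas 8 and 9 literally; the MF value passed in would come from a table such as the one in fig. 8, and the specific numbers used below are illustrative only.

```python
def quantize_coeff(w, mf, qp, intra):
    """Quantize one transform coefficient per formulas 8 and 9:
    |Z| = (|W| * MF + f) >> qbits, sign(Z) = sign(W),
    with qbits = 16 + floor(QP / 6) and f = 2^qbits / 3 for intra
    prediction or 2^qbits / 6 for inter prediction."""
    qbits = 16 + qp // 6
    f = (1 << qbits) // 3 if intra else (1 << qbits) // 6  # dead-zone offset
    z = (abs(w) * mf + f) >> qbits
    return z if w >= 0 else -z
```

With MF = 2^qbits (i.e., an effective Qstep of 1 when PF = 1), a coefficient passes through unchanged; with MF = 2^qbits / 2 (effective Qstep of 2), a coefficient of 2 quantizes to 1, matching Z = round(W / Qstep) of formula 7.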
Fig. 8 illustrates an MF for an 8 x 4 transform in accordance with an aspect of the subject innovation.
Since the MF value changes according to the position of the coefficient in the matrix E of equation 6, the first row of fig. 8 represents the position of the coefficient in the matrix E of equation 6, and the first column of fig. 8 represents Qstep. Here, the MF values used are derived mathematically, but they may be modified because the integer transform is not the optimal transform for video coding.
By adaptively applying the above-described transform and quantization with a block size of 8 × 4, the transform and quantization may be performed according to various transform and quantization types including transform and quantization with a block size of P × Q, such as transform and quantization with a block size of 4 × 8 and transform and quantization with a block size of 16 × 8.
Also, according to an aspect of the present invention, the transformer and quantizer 130 may select a transform and quantization block size of 2^n × 2^n in consideration of the size or shape of the current macroblock, the prediction mode and the size of the current macroblock or subblock used for prediction, and the like, so as to maximize the transform and quantization efficiency for the current macroblock. Hereinafter, an aspect of the present invention in which a macroblock having a size of 64 × 64 is divided into subblocks having different sizes and then transformed and quantized will be described.
A 64 × 64 macroblock may be divided into 4 32 × 32 sub-blocks, 16 16 × 16 sub-blocks, 64 8 × 8 sub-blocks, 256 4 × 4 sub-blocks, or 1024 2 × 2 sub-blocks. Fig. 22 illustrates examples of dividing a 64 × 64 macroblock into 32 × 32 sub-blocks, 16 × 16 sub-blocks, and 8 × 8 sub-blocks, respectively. The transformer and quantizer 130 may divide the 64 × 64 macroblock into sub-blocks, perform transforms and quantizations with sizes corresponding to the respective sub-blocks, and then determine the transform and quantization block size that achieves the best coding efficiency. According to an aspect of the present invention, a transform type indicating the determined transform and quantization block size may be reported to the video decoding apparatus using two flags, transform_size_flag and transform_division_flag.
transform_size_flag indicates whether a transform having the original macroblock size is used. For example, when the value of transform_size_flag is 0, it indicates that transform and quantization of the original macroblock size of 64 × 64 are used. In contrast, when the value of transform_size_flag is 1, it indicates that transform and quantization having a subblock size smaller than the original macroblock size of 64 × 64 are performed. When the value of transform_size_flag is 1, transform_division_flag, which indicates the specific transform and quantization block size, is additionally encoded. For example, transform_division_flag indicates 32 × 32 transform and quantization when its value is 0, 16 × 16 transform and quantization when its value is 1, and 8 × 8 transform and quantization when its value is 2. The inverse transformer and inverse quantizer 2030 of the video decoding apparatus according to an aspect of the present invention may select the inverse transform and inverse quantization block size based on these two flags, i.e., transform_size_flag and transform_division_flag.
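On the decoder side, this two-flag signaling can be sketched as a simple mapping; the flag-value-to-size table follows the 64 × 64 example in the text and is otherwise an assumption.

```python
def transform_block_size(transform_size_flag, transform_division_flag=None,
                         macroblock_size=(64, 64)):
    """Decode the transform/quantization block size from the two flags:
    transform_size_flag == 0 means the original macroblock size is used;
    otherwise transform_division_flag selects 32x32 (0), 16x16 (1),
    or 8x8 (2), per the example values in the text."""
    if transform_size_flag == 0:
        return macroblock_size
    return {0: (32, 32), 1: (16, 16), 2: (8, 8)}[transform_division_flag]
```

Note that transform_division_flag is only consulted (and, per the text, only encoded) when transform_size_flag is 1.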
When the size of a residual block obtained by subtracting a prediction block of a current macroblock or subblock to be encoded from the current macroblock or subblock is greater than the transform and quantization block size determined as above, the transformer and quantizer 130 divides the residual block into subblocks having the same size as the transform and quantization block size and then transforms and quantizes the respective subblocks. In contrast, when the size of the residual block is smaller than the transform and quantization block size, the transformer and quantizer 130 combines a plurality of adjacent residual blocks with each other to generate a combined residual block having the same size as the transform and quantization block size, and then transforms and quantizes the combined residual block.
Fig. 9 is a view for explaining filtering in case of applying various transform and quantization types according to an aspect of the present invention.
Fig. 9 illustrates that, when a current block is a macroblock having a size of 32 × 32, a residual block is transformed and quantized according to various transform and quantization types, the transformed and quantized residual block is again inverse-transformed and inverse-quantized, and then the current block is reconstructed by adding the inverse-transformed and inverse-quantized residual block to a prediction block.
In fig. 9, a solid line indicates a boundary of a current macroblock or a sub-macroblock of the current macroblock. In the case where filtering is performed across the boundary existing within the region indicated by the solid-line circle, filtering is performed only across the block boundary indicated by the broken line. In this regard, filtering may be performed using deblocking filtering for reducing blocking artifacts, and in this case, a one-dimensional low-pass filter is used as a deblocking filter for deblocking filtering. Filtering is performed across vertical boundaries and then across horizontal boundaries.
In typical video coding, since the transform and quantization are performed using only 4 × 4 or 8 × 8 integer transforms, the block boundaries to be subjected to deblocking filtering and/or deringing filtering increase, which results in an increase in the number of times filtering is performed. However, when transform and quantization of block size P × Q are performed according to an aspect of the present invention, the number of times filtering is performed is reduced compared to typical video encoding, so the implementation complexity of the video encoding apparatus 100 and the video decoding apparatus may be reduced, and since fewer blocking artifacts are caused, encoding efficiency may be improved.
Also, when transform and quantization of block size P × Q are performed according to an aspect of the present invention, the number of pixels referred to for filtering may increase due to the increased transform and quantization block size, which leads to a more accurate filtering result, and thus blocking artifacts and ringing noise may be further reduced.
Fig. 10 is a view for explaining a process of performing deblocking filtering across a block boundary according to an aspect of the present invention.
In fig. 10, a, b, c, d, e, and f denote pixels before performing deblocking filtering across a block boundary, b ', c', d ', and e' denote pixels after performing deblocking filtering across a block boundary, and the vertical position of each pixel denotes the luminance of the corresponding pixel. The solid line between pixels c and d represents the block boundary. As can be seen from fig. 10, before deblocking filtering is performed across a block boundary, a large luminance difference between pixels (a sudden increase in luminance difference between pixels c and d) occurs at the block boundary, resulting in blocking artifacts.
To reduce these blocking artifacts, deblocking filtering is performed across block boundaries. That is, the luminance of pixels b, c, d, and e before filtering is corrected using the luminance of the neighboring pixels, and thus the pixels b', c', d', and e' can be generated.
The one-dimensional low-pass filter for deblocking filtering may include a strong filter and a weak filter. The strong filter may be implemented as shown in equation 11 and the weak filter may be implemented as shown in equation 12.
Equation 11
Equation 12
In equations 11 and 12, b, c, d, e, f, and d' indicate pixels shown in fig. 10, and α indicates a rounding constant.
As can be seen from equations 11 and 12, when filtering is performed using the strong filter, the pixel to be filtered is strongly influenced by its neighboring pixels, whereas the weak filter sets a larger weight on the pixel to be filtered itself, so that when filtering is performed using the weak filter the pixel is less influenced by its neighboring pixels. With this concept, filtering can be performed while changing the weights applied in the strong filter and the weak filter, and filtering can also be performed while selectively choosing the neighboring pixels and the pixels to be filtered. Therefore, according to an aspect of the present invention, the filtering result can be further improved due to the increase in the number of neighboring pixels referred to when performing transform and quantization of a block size P × Q and performing deblocking filtering across block boundaries.
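The strong/weak filtering described above can be sketched as a pair of one-dimensional low-pass filters. Since the bodies of Equations 11 and 12 are not reproduced in this text, the tap weights, shifts, and the value of the rounding constant α below are illustrative assumptions; only the qualitative behavior (the weak filter giving a larger weight to the pixel being filtered itself) follows the description.

```python
def deblock_1d(pixels, strong=True):
    """Apply a 1-D low-pass deblocking filter across a block boundary.

    `pixels` = [a, b, c, d, e, f] as in Fig. 10; the boundary lies
    between c and d.  Tap weights are illustrative assumptions, not
    the actual coefficients of Equations 11 and 12.
    """
    a, b, c, d, e, f = pixels
    alpha = 4  # rounding constant (assumed value)
    if strong:
        # strong filter: each filtered pixel draws heavily on neighbors
        b2 = (a + 2 * b + c + 2) >> 2
        c2 = (b + 2 * c + d + 2) >> 2
        d2 = (c + 2 * d + e + 2) >> 2
        e2 = (d + 2 * e + f + 2) >> 2
    else:
        # weak filter: a larger weight on the pixel itself limits the
        # influence of its neighbors
        b2 = (6 * b + a + c + alpha) >> 3
        c2 = (6 * c + b + d + alpha) >> 3
        d2 = (6 * d + c + e + alpha) >> 3
        e2 = (6 * e + d + f + alpha) >> 3
    return [a, b2, c2, d2, e2, f]
```

On a step edge such as [10, 10, 10, 50, 50, 50], the strong filter smooths the boundary pixels toward each other more than the weak filter does, which matches the intended behavior.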
Fig. 11 is a view for explaining a process of performing a deringing filtering according to an aspect of the present invention.
In order to perform the deringing filtering, edges in the video to be reconstructed must first be detected. For this purpose, an edge detection process such as a Sobel operation is performed. Fig. 11 shows a block with a detected edge. In fig. 11, pixels filled in black represent edges detected in the corresponding video, A represents a pixel to be filtered, and B and C represent the adjacent pixels when filtering is performed in the horizontal direction. To reduce ringing noise, the pixels of a block in which an edge has been detected are subjected to deringing filtering. The filter for the deringing filtering may be implemented as shown below, and filters the corresponding pixels of the block in which the edge has been detected first in the vertical direction and then in the horizontal direction.
Equation 13
In equation 13, A denotes a pixel to be filtered, B and C denote the adjacent pixels when filtering is performed in the horizontal direction or the vertical direction, and A' denotes the pixel resulting from the filtering. Also, β, γ, and δ each represent a weight that is applied differently depending on whether the pixel B, A, or C, respectively, is an edge, α represents a rounding constant, and λ represents the sum of β, γ, and δ.
The deringing filtering is performed only for pixels that are not edges, and the weights given to the pixels differ depending on whether pixel B or C is an edge, whether both pixels B and C are edges, or whether neither is an edge. For example, if pixel B is an edge, the weight given to pixel C is the largest, and if pixel C is an edge, the weight given to pixel B is the largest. When both pixels B and C are edges, the weight given to pixels B and C is 0. When neither pixel B nor C is an edge, the weight given to pixel A is the largest.
With this concept, it is possible to perform the deringing filtering while changing the weight applied to each pixel, and also to selectively use a plurality of neighboring pixels referred to. Therefore, according to an aspect of the present invention, the filtering result may be further improved due to an increase in the number of neighboring pixels referred to when performing transformation and quantization of a block size P × Q and performing de-ringing filtering for a block including an edge.
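The edge-dependent weighting described above can be sketched following the structure of Equation 13, A' = (βB + γA + δC + α)/λ with λ = β + γ + δ. The concrete weight values below are assumptions; the text fixes only their qualitative behavior (which pixel receives the largest weight in each edge configuration).

```python
def dering_pixel(B, A, C, b_edge, a_edge, c_edge, alpha=0):
    """Weighted 3-tap deringing filter per the structure of Equation 13.

    A is the pixel to be filtered; B and C are its neighbors in the
    filtering direction.  Weight values are assumptions chosen to
    match the qualitative rules of the description.
    """
    if a_edge:
        return A  # edge pixels are not filtered
    if b_edge and c_edge:
        beta, gamma, delta = 0, 1, 0   # both neighbors are edges: weight 0
    elif b_edge:
        beta, gamma, delta = 0, 1, 3   # B is an edge: C gets the largest weight
    elif c_edge:
        beta, gamma, delta = 3, 1, 0   # C is an edge: B gets the largest weight
    else:
        beta, gamma, delta = 1, 2, 1   # no edges: A itself gets the largest weight
    lam = beta + gamma + delta
    return (beta * B + gamma * A + delta * C + alpha) // lam
```

Applying this first along columns and then along rows of a block with detected edges mirrors the vertical-then-horizontal order stated above.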
A scan for converting quantized transform coefficients included in a transformed and quantized residual block into a one-dimensional transform coefficient string according to an aspect of the present invention is described below.
According to an aspect of the present invention, the scanning method may be selected according to whether the current block is an intra block or an inter block.
When the quantized transform coefficients of an intra block are scanned according to an aspect of the present invention, a scan pattern may be selected according to the prediction direction (i.e., the intra prediction mode) of the intra block. More specifically, the occurrence probability of the coefficients in a 2^n × 2^n block is acquired for each mode in which prediction is completed, and the scanning sequence is changed such that the frequency-domain coefficients with a high probability of occurrence are scanned first. Thus, a method of continuously updating the scan sequence is applied to both the video encoding apparatus and the video decoding apparatus. By this method, the scanning sequence is set so that scanning proceeds from the coefficient position having the highest frequency of occurrence or the largest coefficient value to the coefficient position having the lowest frequency of occurrence or the highest probability of being 0, and as a result, the efficiency of the entropy encoding performed by the encoder 150 after the scanning process is further improved. Here, since the respective modes in which prediction, transformation, and quantization are completed are used, the video decoding apparatus also knows the respective prediction modes, and the encoding and decoding processes remain consistent with each other: the coefficient positions have the same frequency of occurrence in encoding and decoding, and thus the scan sequence is the same in both processes.
Fig. 23 illustrates that the scanning for an intra 4 × 4 block may have various scanning sequences according to nine modes, based on probability calculation. Each of the nine modes shown in fig. 23 has a scan sequence numbered 1 to 16, and scanning is performed in order from number 1 to number 16. The scan sequence is not a fixed pattern but a sequence determined, for each mode in which transformation and quantization are completed, based on the size of the coefficients or the probability of the coefficient distribution. The scan sequence may be updated as the probabilities change while encoding proceeds.
Here, the block size may be any 2^n × 2^n covering all intra modes, and the scanning sequence is not limited to the nine modes for the intra 4 × 4 block shown in fig. 23. If a block of size 2^n × 2^n has M modes, the scanning method covers all scanning methods that may have different sequences for each of the M modes. Also, the scan sequence may be updated by continuously acquiring the probabilities for the modes for which prediction is completed and changing the scan sequence according to the acquired probabilities. Since the encoding and decoding processes are performed per mode in which prediction is completed, even without encoding additional information on the scan sequence, the decoding process can know the mode and the probabilities from which the scan sequence is derived, and can perform decoding with the same scan sequence.
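The probability-driven update of the scan sequence might be sketched as follows. The tie-breaking rule and the update cadence are assumptions; the text requires only that frequently non-zero coefficient positions be scanned first and that encoder and decoder derive the order from identical per-mode statistics, so no side information is needed.

```python
def update_scan_order(counts):
    """Derive a scan order from per-position occurrence counts of
    non-zero quantized coefficients for one intra prediction mode.
    Positions with the highest non-zero frequency are scanned first,
    so zeros cluster at the end of the coefficient string and entropy
    coding becomes more efficient.  Ties are broken by raster
    position (an assumption).
    """
    n = len(counts)
    return sorted(range(n), key=lambda i: (-counts[i], i))

# 4x4 block flattened in raster order: pretend the low-frequency
# positions were non-zero most often while encoding in one mode
counts = [9, 7, 3, 1,
          8, 5, 2, 0,
          4, 2, 1, 0,
          1, 0, 0, 0]
scan = update_scan_order(counts)
```

Because both the encoder and the decoder accumulate the same counts per prediction mode, calling `update_scan_order` on both sides after each block keeps the scan sequences synchronized.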
In the case of an inter block, since transformation and quantization are performed according to various transformation and quantization types, the scanner 140 or the encoder 150 scans quantized transform coefficients of a transformed and quantized residual block according to the transformation and quantization type to generate a quantized transform coefficient string. Fig. 12 illustrates a scan sequence according to transform and quantization types of inter blocks according to an aspect of the present invention.
In fig. 12, a sequence of scanning the quantized transform coefficients when the transformed and quantized residual block has a block size of 8 × 4 and a sequence of scanning the quantized transform coefficients when the transformed and quantized residual block has a block size of 4 × 8 are shown by way of example. The scanning sequence shown in fig. 12 may be adaptively applied to various transform and quantization types used according to an aspect of the present invention, and may also improve coding efficiency by scanning quantized transform coefficients according to an appropriate scanning sequence. In the case of DCT, transform coefficients tend to be crowded because transform coefficients after transform are generally grouped together in frequency parts having low energy, and integer transform also shows the same tendency because it is based on DCT. Therefore, it is effective to set the position of the DC coefficient as a start point of scanning, and scan the transform coefficients in descending order of their energies from the coefficient positioned closest to the DC coefficient.
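The principle of starting the scan at the DC coefficient and proceeding in descending energy order can be sketched for a non-square block as follows. The exact sequences of Fig. 12 are not reproduced in this text, so the anti-diagonal traversal and its tie-breaking below are assumptions.

```python
def zigzag_scan_order(height, width):
    """Build a scan sequence for a P x Q transformed block that starts
    at the DC coefficient (0, 0) and visits positions anti-diagonal
    by anti-diagonal, i.e. in roughly descending energy order for
    DCT-like transforms.  Direction alternates per diagonal, as in a
    zig-zag scan (the alternation direction is an assumption).
    """
    positions = [(r, c) for r in range(height) for c in range(width)]
    positions.sort(key=lambda rc: (rc[0] + rc[1],
                                   rc[1] if (rc[0] + rc[1]) % 2 else rc[0]))
    return positions
```

For an 8 × 4 or 4 × 8 block this yields a sequence that begins at DC and sweeps outward, which is the behavior the text attributes to the scan patterns of Fig. 12.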
The encoder 150 may encode the quantized transform coefficient string generated by scanning the quantized transform coefficients according to a scan sequence shown in fig. 12 or a similar scan sequence in various ways to generate a bitstream. In particular, the encoder 150 may scan and encode the transformed and quantized residual block by applying existing context-based adaptive variable length coding (hereinafter referred to as "CAVLC").
Fig. 13 to 18 are views for explaining a method of applying CAVLC according to transform and quantization types according to an aspect of the present invention.
In typical video coding, CAVLC is performed only for blocks of size 4 × 4. However, in an aspect of the present invention, CAVLC may also be performed for blocks having block sizes greater than 4 × 4.
Figs. 13 to 18 illustrate that each of the transformed and quantized residual blocks having sizes of 8 × 4, 4 × 8, 8 × 8, 16 × 8, 8 × 16, and 16 × 16, respectively, is decomposed into blocks of size 4 × 4 to apply CAVLC. In figs. 13 to 18, the numbers 1, 2, 3, and 4 written in the respective pixels indicate the positions of those pixels, and each block of size 4 × 4 is formed by collecting only the pixels having the same number.
As an example, in the transformed and quantized residual block having the size 8 × 4 shown in fig. 13, the numbers 1 and 2 are given every other column, and two transformed and quantized residual blocks each having the size 4 × 4 may be obtained by independently collecting four columns given the number 1 and four columns given the number 2, as shown in the drawing. The encoder 150 generates a bitstream by encoding two transformed and quantized residual blocks each having a size of 4 × 4 acquired as shown by using CAVLC.
As another example, among the transformed and quantized residual blocks having a size of 8 × 8 shown in fig. 15, four transformed and quantized residual blocks each having a size of 4 × 4 may be obtained by independently collecting a pixel given a number 1, a pixel given a number 2, a pixel given a number 3, and a pixel given a number 4. The encoder 150 generates a bitstream by encoding four transformed and quantized residual blocks each having a size of 4 × 4 acquired as shown by using CAVLC.
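The column-interleaved decomposition of Fig. 13 can be sketched as follows. Which column parity is labeled "1" and which "2" is an assumption read from the description; only the every-other-column grouping is stated in the text.

```python
import numpy as np

def split_8x4_for_cavlc(block):
    """Decompose an 8x4 (width 8, height 4) transformed and quantized
    residual block into two 4x4 blocks by collecting alternating
    columns, as in Fig. 13: even columns form one 4x4 block and odd
    columns the other.  Each 4x4 block can then be coded with the
    existing 4x4 CAVLC tables.
    """
    block = np.asarray(block)
    assert block.shape == (4, 8)
    return block[:, 0::2], block[:, 1::2]
```

The 8 × 8 case of Fig. 15 is analogous, interleaving in both dimensions to produce four 4 × 4 blocks.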
Fig. 19 is a flowchart for explaining a video encoding method according to an aspect of the present invention.
In a video encoding method according to an aspect of the present invention, the video encoding apparatus 100 predicts a current block to generate a prediction block (S1910), generates a residual block by subtracting the prediction block from the current block (S1920), determines a transform and quantization type according to a block type of the current block (S1930), and transforms and quantizes the residual block according to the determined transform and quantization type and encodes the transformed and quantized residual block (S1940). Here, the current block is a macroblock having a size of M × N, and M and N may be greater than 16. The block type of the current block used in the step of determining the transform and quantization type (S1930) includes the shape or size of the current block. Also, the prediction mode and the size of the block that is the unit of prediction, used in the step of generating the prediction block (S1910), may also be used to determine the transform and quantization type.
Also, in step S1910, the video encoding apparatus 100 may generate a prediction block by: the method includes dividing a current block into a plurality of sub-blocks, predicting the plurality of sub-blocks to generate predicted sub-blocks, and combining the predicted sub-blocks with each other. For this, the video encoding apparatus 100 may determine a block type for each frame of the video, and in this case, the current block may be a macroblock according to the determined block type. In determining the block type, the video encoding apparatus 100 may determine the block type using encoding overhead required to encode the frame according to a plurality of block types, but may also determine the block type according to characteristics of the frame. Such characteristics of the frame may include one or more of a horizontal correlation and a vertical correlation of the frame. Also, the video encoding apparatus 100 may encode information on the block type and additionally include the information in the encoded data.
Also, in step S1930, the video encoding apparatus 100 may determine, when the block type is an intra block type, transform and quantization having a block size equal to that of the block type as a transform and quantization type, and may determine, when the block type is an inter block type, one of a plurality of transforms and quantization having various block sizes as a transform and quantization type using encoding overhead. Here, the plurality of transforms and quantizes having various block sizes may include a transform and a quantization having the same block size as that of the subblock.
Also, in step S1940, the video encoding apparatus 100 may generate a quantized transform coefficient string by scanning the quantized transform coefficients of the transformed and quantized residual block in descending order of the energy of the quantized transform coefficients of the transformed and quantized residual block starting from the quantized transform coefficient located closest to the DC coefficient, and then encode the scanned quantized transform coefficient string.
In addition, the video encoding apparatus 100 may reconstruct a residual block by inverse-transforming and inverse-quantizing the transformed and quantized residual block, reconstruct a current block by adding the reconstructed residual block to a prediction block, and filter the reconstructed current block according to the transform and quantization type. In filtering the reconstructed current block, the video encoding apparatus 100 may perform deblocking filtering across a boundary of the reconstructed current block according to a transform and quantization type and perform deringing filtering according to the transform and quantization type. However, it is not necessary to perform both the deblocking filtering and the deringing filtering, but only one or both of the deblocking filtering and the deringing filtering may be performed.
Also, the video encoding apparatus 100 may encode information on the transform and quantization type determined in step S1930 and include the information in the encoded data. Here, in addition to the transform and quantization type, the information included in the encoded data may include the above-described information, that is, macroblock information indicating the size or shape of a macroblock that is the unit of encoding, information on the prediction mode for the macroblock or a subblock (in the case where the macroblock is divided into a plurality of subblocks for prediction), macroblock division information indicating the division scheme or size of the subblocks (i.e., the blocks that are the units of prediction) in the case where the macroblock is divided into a plurality of subblocks for prediction, information on a motion vector in the case where the prediction mode is an inter prediction mode, and the like.
As described above, when the video encoding apparatus 100 and the video encoding method according to an aspect of the present invention are used, prediction in units of macroblocks or subblocks of variable sizes, transformation and quantization having various block sizes, and scanning and filtering suited to the transform and quantization type can be performed, so high resolution video can be encoded more efficiently. The video encoded into encoded data by the video encoding apparatus 100 may be transmitted, in real time or non-real time, to a video decoding apparatus described below, where the encoded data is decoded, reconstructed, and reproduced as video, via a wired/wireless communication network, including the Internet, a short-range wireless communication network, a wireless LAN network, a WiBro (wireless broadband, also called WiMAX) network, and a mobile communication network, or via a communication interface such as a cable or USB (universal serial bus).
Fig. 20 schematically illustrates a video decoding apparatus according to an aspect of the present invention.
The video decoding apparatus 2000 according to an aspect of the present invention may include: a decoder 2010, an inverse scanner 2020, an inverse quantizer and inverse transformer 2030, a predictor 2040, an adder 2050, and a filter 2060. Here, the inverse scanner 2020 and the filter 2060 are not necessarily included in the video decoding apparatus 2000 and may be omitted according to the implementation design of the video decoding apparatus 2000. When the inverse scanner 2020 is omitted, its function may be incorporated into the decoder 2010.
The decoder 2010 receives video data obtained by dividing input video into a plurality of macroblocks having different shapes or sizes and encoding the respective macroblocks, and decodes macroblock information indicating the shapes or sizes of the respective macroblocks.
The decoder 2010 decodes the encoded data (i.e., the video data obtained by encoding the macroblocks in the video encoding apparatus and transmitted from the video encoding apparatus) to reconstruct the transformed and quantized residual block. That is, the decoder 2010 decodes the encoded data to reconstruct the quantized transform coefficient string. When the function of the scanner 140 is incorporated into the encoder 150 of the video encoding apparatus 100, the inverse scanner 2020 is likewise omitted from the video decoding apparatus 2000 and its function is incorporated into the decoder 2010. In that case, the decoder 2010 may reconstruct the transformed and quantized residual block by inverse scanning the reconstructed quantized transform coefficient string.
Also, the decoder 2010 may decode or extract information required for decoding and a transformed and quantized residual block by decoding the encoded data. The information necessary for decoding refers to information necessary for decoding a bitstream encoded in the encoded data, and may be, for example, macroblock information indicating the size or shape of a macroblock that is a unit of encoding, information about a prediction mode for the macroblock or subblock (in the case where the macroblock is divided into a plurality of subblocks for prediction), macroblock division information indicating the division scheme or size of a subblock (i.e., a block that is a unit of prediction) in the case where the macroblock is divided into a plurality of subblocks for prediction, information about a motion vector in the case where the prediction mode is an inter prediction mode, information about transform and quantization types, or the like, but may be various other information.
The decoder 2010 parses the incoming bitstream to identify the hierarchy of the encoded video and the specific algorithm to be used for decoding. More specifically, the decoder 2010 identifies the shape or size of each macroblock, which is a unit of encoding, by macroblock information. What type and/or size of transform and quantization is to be performed is determined by information about the type of transform and quantization. The size or shape of a prediction unit block, which is a unit of prediction, is determined by macroblock division information. It is determined by the prediction mode information what prediction mode is used to generate a prediction block of a current macroblock or subblock (in case of dividing the macroblock into subblocks for prediction).
The macroblock information parsed by the decoder 2010 may be transferred to the inverse quantizer and inverse transformer 2030 and the predictor 2040. Information on the transform and quantization type may be transferred to the inverse quantizer and inverse transformer 2030, and information required for prediction, such as prediction mode information, macroblock partition information, and motion vector information, may be transferred to the predictor 2040.
The inverse scanner 2020 inverse scans the quantized transform coefficient string reconstructed by the decoder 2010 and transmitted from the decoder 2010 to reconstruct the transformed and quantized residual block. As described above, when the function of the scanner 140 is incorporated into the encoder 150 of the video encoding apparatus 100, the inverse scanner 2020 may also be omitted from the video decoding apparatus 2000 and its function may be incorporated into the decoder 2010. Also, the decoder 2010 or the inverse scanner 2020 inverse scans the transformed and quantized residual block according to the transform and quantization type identified by the information on the transform and quantization type, which is reconstructed by decoding the encoded data in the decoder 2010. Here, in the case of an inter block, a method of inverse-scanning the transformed and quantized residual block according to the transform and quantization type by the inverse scanner 2020 is the same as or similar to a method of inversely performing a process of scanning the quantized transform coefficients of the transformed and quantized residual block by the scanner 140, as described with reference to fig. 1 and 12. In the case of an intra block, the coefficients are scanned according to the scan pattern for each intra prediction mode in the same manner as described above. The inverse scanning method is performed in the same sequence as the scanning sequence in the scanner 140, and thus a detailed description thereof is omitted.
The inverse quantizer and inverse transformer 2030 inverse quantizes and inverse transforms the reconstructed transformed and quantized residual block to reconstruct the residual block. In this regard, the inverse quantizer and inverse transformer 2030 inversely quantizes and inversely transforms the transformed and quantized residual block according to the transform and quantization type identified by the information on the transform and quantization type transmitted from the decoder 2010. Here, a method of inversely quantizing and inversely transforming the transformed and quantized residual block according to the transform and quantization type by the inverse quantizer and inverse transformer 2030 is the same as or similar to a method of inversely performing a process of performing transform and quantization according to the transform and quantization type by the transformer and quantizer 130 of the video encoding apparatus 100. And thus a detailed description thereof will be omitted.
When the size of the residual block generated through the inverse transform and inverse quantization processes is greater than that of the prediction block, the inverse quantizer and inverse transformer 2030 divides the residual block into subblocks each having the same size as the prediction block, and then outputs the divided subblocks to the adder 2050. In contrast, when the size of the residual block generated through the inverse transform and inverse quantization processes is smaller than that of the prediction block, the inverse quantizer and inverse transformer 2030 combines a plurality of adjacent residual blocks to generate a combined residual block having the same size as the prediction block, and then outputs the generated combined residual block to the adder 2050.

The predictor 2040 predicts the current block to generate a prediction block. Here, the predictor 2040 predicts the current block using the macroblock information transmitted from the decoder 2010 and the information necessary for prediction. That is, the predictor 2040 determines the size and shape of the current macroblock from the macroblock information, and predicts the current macroblock using the intra prediction mode or the motion vector identified by the information necessary for prediction to generate a prediction block. When the macroblock information indicates that the current macroblock has been divided into a plurality of subblocks, the predictor 2040 may divide the current macroblock into subblocks and predict the respective divided subblocks in the same or a similar manner as the predictor 110 of the video encoding apparatus 100 to generate predicted subblocks.
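The splitting path described above (a reconstructed residual block larger than the prediction block being divided into prediction-block-sized subblocks before being added to the prediction blocks) can be sketched as follows; the raster ordering of the subblocks is an assumption, and merging smaller residual blocks is the symmetric operation.

```python
import numpy as np

def match_residual_to_prediction(residual, pred_h, pred_w):
    """Split a reconstructed residual block into subblocks of the
    prediction-block size so that each subblock can be added to its
    corresponding prediction block by the adder.  Subblocks are
    returned in raster order (an assumed convention).
    """
    residual = np.asarray(residual)
    h, w = residual.shape
    assert h % pred_h == 0 and w % pred_w == 0
    return [residual[r:r + pred_h, c:c + pred_w]
            for r in range(0, h, pred_h)
            for c in range(0, w, pred_w)]
```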
The adder 2050 adds the residual block reconstructed by the inverse quantizer and the inverse transformer 2030 to the prediction block generated by the predictor 2040 to reconstruct the current block.
The filter 2060 filters the current block reconstructed by the adder 2050. The reconstructed and filtered current block is accumulated per picture and stored in a memory (not shown) as a reference picture for use when the predictor 2040 predicts the next block or the next picture. Here, when the filter 2060 filters the reconstructed current block, the filter 2060 performs filtering according to the transform and quantization type identified by the information on the transform and quantization type transmitted from the decoder 2010. In this regard, the filter 2060 may perform deblocking filtering across the boundary of the current block in different manners according to transform and quantization types, or perform de-ringing filtering of the current block in different manners according to transform and quantization types when an edge is detected in the current block, thereby reducing blocking artifacts occurring at the block boundary of the reconstructed video, or reducing ringing noise occurring around the edge in the block. A method of performing the deblocking filtering and the deringing filtering by the filter 2060 is the same as or similar to a method of performing the deblocking filtering and the deringing filtering by the filter 180 of the video encoding apparatus 100, and thus a detailed description thereof is omitted.
Fig. 21 is a flowchart for explaining a video decoding method according to an aspect of the present invention.
In a video decoding method according to an aspect of the present invention, the video decoding apparatus 2000 decodes encoded data to reconstruct a transformed and quantized residual block (S2110), inverse transforms and inverse quantizes the transformed and quantized residual block to reconstruct a residual block (S2120), predicts a current block to generate a prediction block (S2130), and adds the reconstructed residual block to the prediction block to reconstruct the current block (S2140).
The video decoding apparatus 2000 may additionally reconstruct information on transform and quantization types by decoding the encoded data. Thus, in step S2120, the video decoding apparatus 2000 inverse-transforms and inverse-quantizes the transformed and quantized residual block according to the transform and quantization type identified by the information on the transform and quantization type.
Also, in step S2140, the video decoding apparatus 2000 may filter the reconstructed current block according to the transform and quantization type. That is, the video decoding apparatus 2000 performs deblocking filtering across the boundary of the reconstructed current block according to the transform and quantization type, and performs deringing filtering of the reconstructed current block according to the transform and quantization type. Here, both of the deblocking filtering and the deringing filtering may be performed, but only one of them may also be selectively performed, or neither may be performed. The video decoding apparatus 2000 may perform deblocking filtering in different manners according to transform and quantization types, and may perform deringing filtering in different manners according to transform and quantization types. Also, the current block is a macroblock having a size of M × N, and M and N may be greater than 16.
Also, in generating the prediction block of step S2130, the video decoding apparatus 2000 may divide the current block into a plurality of sub-blocks, predict the plurality of sub-blocks to obtain predicted sub-blocks, and combine the predicted sub-blocks to generate the prediction block.
Also, the video decoding apparatus 2000 may additionally reconstruct macroblock information for each frame of the video, and in this case, the current block may be a macroblock having a size or shape identified by the reconstructed macroblock information.
As described above, according to an aspect of the present invention, since encoding of high resolution video is achieved by appropriately utilizing the high correlation between temporally/spatially adjacent pixels occurring in the high resolution video, through variable-sized macroblocks having a block size P × Q, the corresponding transformation and quantization, and scanning and filtering suited to the block size P × Q, encoding efficiency can be improved. Also, since block distortion caused by the transformation and quantization is reduced through the use of macroblocks having scalable block sizes, not only can coding efficiency be improved, but the number of times deblocking filtering is performed across block boundaries (in both encoding and decoding) can also be reduced, which makes it possible to reduce the implementation complexity of the video encoding apparatus 100 and the video decoding apparatus 2000.
In the above description, although all components of the embodiments of the present invention have been explained as assembled or operatively connected as a unit, the present invention is not intended to limit itself to these embodiments. Rather, the respective components may be selectively and operatively combined in any number within the scope of the present invention. Each component itself may also be implemented in hardware, while the respective components may be selectively combined in part or as a whole and implemented with a computer program having program modules for performing the functions of hardware equivalents. Codes or code segments for constituting such a program can be easily inferred by those skilled in the art. The computer program may be stored in a computer readable medium, which when operated, may implement aspects of the present invention. Candidates for the computer-readable medium include magnetic recording media, optical recording media, and carrier wave media.
In addition, terms such as "including" and "having" should by default be interpreted as inclusive or open-ended rather than exclusive or closed-ended, unless otherwise specified. All technical, scientific, and like terms are to be interpreted as consistent with the understanding of those skilled in the art unless otherwise indicated. Terms found in common dictionaries should be interpreted in the context of the relevant art, and not in an idealized or overly formal sense, unless the present invention expressly defines them as such.
Although exemplary aspects of the present invention have been described for illustrative purposes, those skilled in the art will appreciate that various modifications, additions, and substitutions are possible without departing from the essential characteristics of the invention. The exemplary aspects have therefore not been described for limiting purposes, and the scope of the present invention is not limited by the above aspects but by the claims and their equivalents.
Industrial Applicability
As described above, the present invention is very useful for application in the field of video compression for encoding and decoding high-resolution video. In particular, coding efficiency can be improved because the present invention exploits the high correlation between temporally/spatially adjacent pixels occurring in the video, and compression efficiency can be further improved by reducing block distortion. Also, the number of times filtering is performed may be reduced, which makes it possible to reduce the implementation complexity of the video encoding and decoding apparatus.
Cross Reference to Related Applications
The present application claims priority under 35 U.S.C. § 119(a) to Korean Patent Application No. 10-2009-0086305, filed in Korea on September 14, 2009, the entire content of which is incorporated herein by reference. In addition, this non-provisional application claims priority in countries other than the United States on the basis of the same Korean patent application, the entire contents of which are incorporated herein by reference.

Claims (28)

1. A video encoding method, the video encoding method comprising the steps of:
dividing an input video into a plurality of macroblocks having various shapes or sizes;
encoding a macroblock; and
generating macroblock information indicating a shape or size of each of the macroblocks.
2. The video coding method of claim 1, wherein the step of encoding the macroblock comprises the steps of:
dividing the macroblock into one or more sub-blocks; and
generating prediction mode information indicating whether the macroblock is intra-predicted or inter-predicted and macroblock partition information indicating a size of each of the sub-blocks.
3. The video coding method of claim 2, wherein, when the macroblock is intra predicted, the step of coding the macroblock comprises the steps of:
obtaining a residual block representing a difference between each of the sub-blocks and a predicted sub-block of each of the sub-blocks;
transforming the residual block and quantizing the transformed residual block; and
scanning transform coefficients of the transformed and quantized residual block.
4. The video coding method of claim 3, wherein the step of encoding the macroblock comprises the steps of:
predicting each of the sub-blocks to obtain the predicted sub-block;
obtaining the residual block representing a difference between each of the sub-blocks and the predictor block for each of the sub-blocks;
determining a transform type based on at least one of a size of each of the macroblocks, the prediction mode, and a size of each of the subblocks;
transforming the residual block according to the determined transform type; and
quantizing the transformed residual block.
5. The video coding method of claim 4, wherein the step of encoding each of the macroblocks further comprises the steps of:
encoding information indicating the determined transform type.
6. The video coding method of claim 5, wherein the information indicating the transform type includes a size or a type of the transform.
7. The video encoding method of claim 1, wherein each of the plurality of macroblocks has a rectangular shape in which a length of a horizontal side is different from a length of a vertical side.
8. The video coding method of claim 4, wherein the step of determining the transform type comprises the steps of:
determining a size of the transform to be the same as a size of the predictor block when the prediction mode is an intra prediction mode.
9. The video coding method of claim 4, wherein the step of transforming the residual block comprises the steps of:
when the size of the residual block is smaller than the determined size of the transform, a plurality of residual blocks are combined with each other to generate a combined residual block having a size equal to the size of the transform, and then the combined residual block is transformed.
10. The video encoding method of claim 3, wherein the transform coefficients of the transformed and quantized residual block are scanned with a scan pattern corresponding to an intra prediction mode used to predict each of the sub blocks among a plurality of intra prediction modes.
11. The video coding method of claim 10, wherein the transform coefficients of the transformed and quantized residual block are scanned with the scan pattern corresponding to the intra prediction mode when the transformed and quantized residual block has a block size of 4 x 4.
12. A video decoding method, comprising the steps of:
receiving video data obtained by dividing an input video into a plurality of macroblocks having various shapes or sizes and encoding each of the macroblocks;
decoding macroblock information indicating a shape or size of each of the macroblocks; and
decoding the macroblock based on the macroblock information.
13. The video decoding method of claim 12, wherein the step of decoding the macroblock comprises the steps of:
decoding macroblock partition information indicating a size of each of subblocks of the macroblock and prediction mode information indicating whether the macroblock is intra-predicted or inter-predicted; and
obtaining a predictor block for each of the subblocks based on the macroblock division information and the prediction mode information.
14. The video decoding method of claim 13, wherein, when the macroblock is intra predicted, the step of decoding the macroblock comprises the steps of:
dividing the macroblock into a plurality of sub-blocks;
scanning the transform coefficients to obtain transformed and quantized residual blocks;
inverse-quantizing and inverse-transforming the transformed and quantized residual block; and
adding the inverse quantized and inverse transformed residual block to the predictor block to obtain a reconstructed block.
15. The video decoding method of claim 14, wherein the step of decoding the macroblock comprises the steps of:
determining an inverse transform type based on at least one of a size of each of the macroblocks, the prediction mode, and a size of each of the subblocks;
inverse quantizing and inverse transforming the residual block according to the determined inverse transform type to obtain an inverse-quantized and inverse-transformed residual block; and
adding the inverse quantized and inverse transformed residual block to the predictor block to obtain the reconstructed block.
16. The video decoding method of claim 15, wherein the step of determining the inverse transform type comprises the steps of:
determining the inverse transform type based on information indicating the inverse transform type.
17. The video decoding method of claim 16, wherein the information indicating the inverse transform type includes a size or a type of the inverse transform.
18. The video decoding method of claim 12, wherein each of the macroblocks has a rectangular shape in which a length of a horizontal side is different from a length of a vertical side.
19. The video decoding method of claim 15, wherein the step of determining the inverse transform type comprises the steps of:
determining a size of the inverse transform to be the same as a size of each of the sub-blocks when the prediction mode is an intra prediction mode.
20. The video decoding method of claim 15, wherein the step of decoding each of the macroblocks further comprises the steps of:
dividing the inverse-quantized and inverse-transformed residual block into a plurality of residual sub-blocks each having a size equal to the size of the predictor block, when the size of the predictor block is smaller than the determined inverse transform size; and
adding each of the residual sub-blocks to the predictor block to obtain a reconstructed block.
21. The video decoding method of claim 14, wherein the transform coefficients of the transformed and quantized residual block are scanned with a scan pattern corresponding to an intra prediction mode for predicting each of the sub blocks among a plurality of intra prediction modes.
22. The video decoding method of claim 21, wherein the transform coefficients of the transformed and quantized residual block are scanned with the scan pattern corresponding to the intra prediction mode when the transformed and quantized residual block has a block size of 4 x 4.
23. A video encoding apparatus, the video encoding apparatus comprising:
a macroblock type determiner for dividing an input video into a plurality of macroblocks having various shapes or sizes; and
a macroblock encoder for encoding the macroblocks and for encoding macroblock information indicating a shape or size of each of the macroblocks.
24. The video coding device of claim 23, wherein the macroblock encoder is configured to:
dividing the macroblock into one or more sub-blocks; and
generating prediction mode information indicating whether the macroblock is intra-predicted or inter-predicted and macroblock partition information indicating a size of each of the sub-blocks.
25. The video coding device of claim 24, wherein the macroblock encoder is configured to, when the macroblock is intra predicted:
obtaining a residual block representing a difference between each of the sub-blocks and a predicted sub-block of each of the sub-blocks;
transforming the residual block and quantizing the transformed residual block; and
scanning the transform coefficients of the transformed and quantized residual block.
26. The video coding device of claim 25, wherein the macroblock encoder is configured to:
determining a transform type based on at least one of a size of each of the macroblocks, the prediction mode, and a size of each of the subblocks;
transforming the residual block according to the determined transform type; and
quantizing the transformed residual block.
27. The video encoding apparatus of claim 25, wherein the macroblock encoder scans the transform coefficients of the transformed and quantized residual block with a scan pattern corresponding to an intra prediction mode for predicting each of the sub blocks among a plurality of intra prediction modes.
28. The video encoding device of claim 27, wherein the macroblock encoder scans the transform coefficients of the transformed and quantized residual block with the scan pattern corresponding to the intra prediction mode when the transformed and quantized residual block has a block size of 4 x 4.
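The mode-dependent coefficient scanning recited in claims 10–11, 21–22, and 27–28 can be illustrated with a small sketch. The mapping of intra prediction modes to scan patterns below (a horizontal scan for vertically predicted blocks, a vertical scan for horizontally predicted blocks, zigzag otherwise) follows common practice in block-based codecs and is an assumption for illustration, not taken from the patent text; the mode names are likewise hypothetical:

```python
# Hedged sketch of selecting a coefficient scan pattern by intra prediction
# mode. Mode names ("vertical_pred", etc.) and the mode-to-pattern mapping
# are illustrative assumptions modelled on common practice.

def scan_order(n, pattern):
    """Return the (row, col) visiting order for an n x n coefficient block."""
    if pattern == "horizontal":        # row by row
        return [(r, c) for r in range(n) for c in range(n)]
    if pattern == "vertical":          # column by column
        return [(r, c) for c in range(n) for r in range(n)]
    # zigzag: walk anti-diagonals, alternating direction on each diagonal
    order = []
    for d in range(2 * n - 1):
        diag = [(r, d - r) for r in range(n) if 0 <= d - r < n]
        order += diag if d % 2 else diag[::-1]
    return order

def scan(coeffs, intra_mode):
    """Serialize a square coefficient block with a scan chosen by intra mode."""
    # Vertical prediction tends to leave residual energy along rows, so a
    # horizontal scan groups the nonzero coefficients early (and vice versa).
    pattern = {"vertical_pred": "horizontal",
               "horizontal_pred": "vertical"}.get(intra_mode, "zigzag")
    return [coeffs[r][c] for r, c in scan_order(len(coeffs), pattern)]
```

In the claims, the encoder would apply such a scan to each transformed and quantized 4 × 4 residual block and the decoder would invert it, both sides deriving the pattern from the already-signalled intra prediction mode so no extra bits are needed.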
HK15106259.6A 2009-09-14 2015-07-01 Encoding device for high-resolution moving images HK1205840B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2009-0086305 2009-09-14
KR1020090086305A KR101302660B1 (en) 2009-09-14 2009-09-14 High Definition Video Encoding/Decoding Method and Apparatus

Publications (2)

Publication Number Publication Date
HK1205840A1 true HK1205840A1 (en) 2015-12-24
HK1205840B HK1205840B (en) 2018-06-01

Also Published As

Publication number Publication date
KR20110028734A (en) 2011-03-22
US9584810B2 (en) 2017-02-28
US20160029023A1 (en) 2016-01-28
CN104539974B (en) 2017-10-31
US20150229924A1 (en) 2015-08-13
CN104506875A (en) 2015-04-08
CN104539957B (en) 2017-11-28
CN102598669A (en) 2012-07-18
CN104539974A (en) 2015-04-22
US9621895B2 (en) 2017-04-11
CN104506876A (en) 2015-04-08
WO2011031044A3 (en) 2011-06-16
KR101302660B1 (en) 2013-09-03
CN104506875B (en) 2017-10-31
CN104506876B (en) 2017-12-05
CN104539957A (en) 2015-04-22
US20120201300A1 (en) 2012-08-09
CN102598669B (en) 2015-02-04
US9154809B2 (en) 2015-10-06
WO2011031044A2 (en) 2011-03-17
