
WO2013069117A1 - Prediction image generation method, encoding method, and decoding method - Google Patents

Prediction image generation method, encoding method, and decoding method Download PDF

Info

Publication number
WO2013069117A1
WO2013069117A1 (application PCT/JP2011/075851, JP2011075851W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
pixel
reference image
unit
predicted image
Prior art date
Application number
PCT/JP2011/075851
Other languages
French (fr)
Japanese (ja)
Inventor
昭行 谷沢
中條 健
Original Assignee
株式会社東芝
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社東芝
Priority to PCT/JP2011/075851 priority Critical patent/WO2013069117A1/en
Priority to TW101101752A priority patent/TW201320750A/en
Publication of WO2013069117A1 publication Critical patent/WO2013069117A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques

Definitions

  • Embodiments of the present invention relate to a predicted image generation method, an encoding method, and a decoding method.
  • for example, H.264 discloses an inter-prediction coding scheme that eliminates temporal redundancy and realizes high coding efficiency by performing fractional-accuracy motion compensation prediction using an encoded image as a reference image.
  • in addition, an implicit weighted motion compensation prediction method has been adopted that encodes moving images including fade and dissolve effects more efficiently than the inter-prediction encoding methods of ISO/IEC MPEG (Moving Picture Experts Group)-1, 2, and 4.
  • in this method, motion compensation prediction with fractional accuracy is performed on an input moving image having a luminance component and two color difference components. Then, a weighting factor for each of the luminance and the two color differences of the reference image is implicitly derived from the temporal distance ratio between the reference images, and the prediction image is multiplied by the weighting factor.
  • that is, weighted motion compensation prediction is realized by multiplying a pixel after motion compensation prediction by a weighting factor implicitly derived from the temporal distance ratio of the reference images. However, since an offset term for correcting the deviation of the pixel value is not taken into account, the encoding efficiency is reduced.
  • the problem to be solved by the present invention is to provide a prediction image generation method, an encoding method, and a decoding method capable of improving the encoding efficiency.
  • the predicted image generation method of the embodiment includes a first derivation step, a second derivation step, a third derivation step, and a predicted image generation step.
  • in the first derivation step, a pixel average value of each of two or more reference images and a pixel error value indicating the pixel difference from that pixel average value are derived.
  • in the second derivation step, a pixel average value of the predicted image is derived using a temporal distance ratio between at least two reference images among the two or more reference images and the predicted image, and the pixel average values of the at least two reference images.
  • the pixel error value of the predicted image is derived using the temporal distance ratio and the pixel error value of the at least two reference images.
  • in the third derivation step, the weighting coefficient of the reference image is derived using the pixel error value of the reference image and the pixel error value of the predicted image, and the offset of the reference image is derived using the derived weighting coefficient, the pixel average value of the reference image, and the pixel average value of the predicted image.
  • in the predicted image generation step, the predicted image of one target block, obtained by dividing the input image into a plurality of blocks, is generated using the reference image, the weighting coefficient of the reference image, and the offset of the reference image.
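  • as a rough illustration of these steps, the following C sketch derives a weighting coefficient and offset from the statistics of two reference images; the structure, helper names, and the linear-interpolation and ratio formulas are illustrative assumptions, not the claimed implementation.

```c
/* Hypothetical per-image statistics (names are illustrative only). */
typedef struct {
    int poc;   /* display order of the image                  */
    int dc;    /* pixel average value                         */
    int ac;    /* pixel error value (mean error from the dc)  */
} ImageStats;

/* Derive the predicted-image statistics from two reference images using the
 * temporal distance ratio, then derive the weighting coefficient and offset
 * of one reference image (assumed formulas, fixed-point with log2_denom).   */
static void derive_wp_params(const ImageStats *ref1, const ImageStats *ref2,
                             int cur_poc, const ImageStats *ref,
                             int log2_denom, int *weight, int *offset)
{
    int tb = cur_poc - ref1->poc;              /* distance to current image   */
    int td = ref2->poc - ref1->poc;            /* distance between references */
    if (td == 0) td = 1;                       /* degenerate case, assumed    */

    /* predicted-image average and error, linearly interpolated in time */
    int dc_p = ref1->dc + ((ref2->dc - ref1->dc) * tb) / td;
    int ac_p = ref1->ac + ((ref2->ac - ref1->ac) * tb) / td;

    /* weighting coefficient = ratio of error values, offset = DC mismatch */
    *weight = (ac_p << log2_denom) / (ref->ac ? ref->ac : 1);
    *offset = dc_p - ((*weight * ref->dc) >> log2_denom);
}
```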
  • explanatory drawing of an example of bidirectional prediction in the first embodiment; block diagram showing an example of the multi-frame motion compensation unit of the first embodiment; explanatory drawing of an example of the fixed-point precision of the weighting coefficient in the first embodiment.
  • diagram showing an example of the reference image group feature amount of the first embodiment.
  • diagram showing an example of the WP parameter information of the first embodiment; flowchart illustrating an example of the reference image group feature amount derivation process according to the first embodiment.
  • flowchart illustrating an example of the predicted image feature amount derivation process according to the first embodiment.
  • block diagram showing a configuration example of the decoding device of the second embodiment.
  • the encoding device and decoding device of each of the following embodiments can be realized by hardware such as an LSI (Large-Scale Integration) chip, a DSP (Digital Signal Processor), or an FPGA (Field Programmable Gate Array).
  • the encoding device and the decoding device of each of the following embodiments can be realized by software by causing a computer to execute a program.
  • the term “image” can be appropriately replaced with a term such as “video”, “pixel”, “image signal”, “picture”, or “image data”.
  • FIG. 1 is a block diagram illustrating an example of the configuration of the encoding device 100 according to the first embodiment.
  • the encoding apparatus 100 divides each frame or each field constituting the input image into a plurality of pixel blocks, and performs prediction on the divided pixel blocks using the encoding parameters input from the encoding control unit 113 to generate a predicted image.
  • the encoding apparatus 100 then subtracts the predicted image from the input image divided into the plurality of pixel blocks to generate a prediction error, orthogonally transforms and quantizes the generated prediction error, and further performs entropy encoding to generate and output encoded data.
  • the encoding apparatus 100 performs predictive encoding by selectively applying a plurality of prediction modes in which at least one of the block size of the pixel block and the prediction image generation method is different.
  • the generation method of the prediction image is roughly classified into two types: intra prediction that performs prediction within the encoding target frame and inter prediction that performs motion compensation prediction using one or more reference frames that are temporally different.
  • intra prediction is also referred to as intra-picture prediction, intra-frame prediction, or the like
  • inter prediction is also referred to as inter-picture prediction, inter-frame prediction, motion compensation prediction, or the like.
  • FIG. 2 is an explanatory diagram showing an example of a predictive coding order of pixel blocks in the first embodiment.
  • as shown in FIG. 2, the encoding device 100 performs predictive encoding from the upper left to the lower right of the pixel blocks, so that in the encoding target frame f the already encoded pixel blocks p are located to the left of and above the encoding target pixel block c.
  • the encoding apparatus 100 performs predictive encoding in the order shown in FIG. 2, but the order of predictive encoding is not limited to this.
  • the pixel block indicates a unit for processing an image, and corresponds to, for example, an M ⁇ N size block (M and N are natural numbers), a coding tree block, a macro block, a sub block, or one pixel.
  • the pixel block is basically used in the meaning of the coding tree block, but may be used in other meanings.
  • a pixel block is used to mean a pixel block of the prediction unit.
  • a block may be called by a name such as a unit.
  • a coding block is called a coding unit.
  • FIG. 3A is a diagram showing an example of the block size of the coding tree block in the first embodiment.
  • the coding tree block is typically a 64 ⁇ 64 pixel block as shown in FIG. 3A.
  • the present invention is not limited to this, and it may be a 32 ⁇ 32 pixel block, a 16 ⁇ 16 pixel block, an 8 ⁇ 8 pixel block, a 4 ⁇ 4 pixel block, or the like.
  • the coding tree block need not be a square, and may be, for example, an M × N (M ≠ N) pixel block.
  • FIGS. 3B to 3D are diagrams illustrating specific examples of the coding tree block according to the first embodiment.
  • N represents the size of the reference coding tree block.
  • the size when divided is defined as N, and the size when not divided is defined as 2N.
  • FIG. 3C shows a coding tree block obtained by dividing the coding tree block of FIG. 3B into quadtrees.
  • the coding tree block has a quadtree structure as shown in FIG. 3C.
  • the four pixel blocks after the division are numbered in the Z-scan order as shown in FIG. 3C.
  • each of the numbered pixel blocks after division can be further divided into quadtrees.
  • in this way, a coding tree block can be divided recursively, and the depth of division is defined by Depth.
  • the Depth of the coding tree block shown in FIG. 3B is 0, and the Depth of the coding tree block shown in FIG. 3C is 1.
  • the coding tree block having the largest unit is called a large coding tree block, and the input image signal is encoded in this unit in the raster scan order.
  • the encoding target block or coding tree block of the input image may be referred to as a prediction target block or a prediction pixel block.
  • the encoding unit is not limited to a pixel block, and at least one of a frame, a field, a slice, a line, and a pixel can be used.
  • the encoding device 100 includes a subtraction unit 101, an orthogonal transform unit 102, a quantization unit 103, an inverse quantization unit 104, an inverse orthogonal transform unit 105, an addition unit 106, a predicted image generation unit 107, a reference image feature amount deriving unit 108, a predicted image feature amount deriving unit 109, a parameter deriving unit 110, a motion evaluation unit 111, and an encoding unit 112.
  • the encoding control unit 113 shown in FIG. 1 controls the encoding apparatus 100 and can be realized by, for example, a CPU (Central Processing Unit).
  • the subtraction unit 101 subtracts the corresponding prediction image from the input image divided into pixel blocks to obtain a prediction error.
  • the subtraction unit 101 outputs a prediction error and inputs it to the orthogonal transform unit 102.
  • the orthogonal transform unit 102 performs orthogonal transform such as discrete cosine transform (DCT) or discrete sine transform (DST) on the prediction error input from the subtraction unit 101 to obtain transform coefficients.
  • the orthogonal transform unit 102 outputs transform coefficients and inputs them to the quantization unit 103.
  • the quantization unit 103 performs a quantization process on the transform coefficient input from the orthogonal transform unit 102 to obtain a quantized transform coefficient. Specifically, the quantization unit 103 performs quantization according to quantization information such as a quantization parameter or a quantization matrix specified by the encoding control unit 113. More specifically, the quantization unit 103 divides the transform coefficient by the quantization step size derived from the quantization information to obtain a quantized transform coefficient.
  • the quantization parameter indicates the fineness of quantization.
  • the quantization matrix is used for weighting the fineness of quantization for each component of the transform coefficient.
  • the quantization unit 103 outputs the quantized transform coefficient and inputs it to the inverse quantization unit 104 and the encoding unit 112.
  • the inverse quantization unit 104 performs an inverse quantization process on the quantized transform coefficient input from the quantization unit 103 to obtain a restored transform coefficient. Specifically, the inverse quantization unit 104 performs inverse quantization according to the quantization information used in the quantization unit 103. More specifically, the inverse quantization unit 104 multiplies the quantization transform coefficient by the quantization step size derived from the quantization information to obtain a restored transform coefficient.
  • the quantization information used in the quantization unit 103 is loaded from an internal memory (not shown) of the encoding control unit 113 and used.
  • the inverse quantization unit 104 outputs the restored transform coefficient and inputs it to the inverse orthogonal transform unit 105.
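  • as a minimal sketch of the quantization and inverse quantization just described, the following C functions divide a transform coefficient by the quantization step size and multiply it back; the derivation of the step size from the quantization parameter and quantization matrix, and the rounding details, are omitted and the names are illustrative.

```c
/* Illustrative scalar quantization: the quantization unit divides each
 * transform coefficient by the quantization step size, and the inverse
 * quantization unit multiplies the quantized coefficient back by it.   */
static inline int quantize(int coeff, int q_step)       /* q_step > 0 assumed */
{
    return coeff / q_step;     /* quantized transform coefficient */
}

static inline int dequantize(int level, int q_step)
{
    return level * q_step;     /* restored transform coefficient  */
}
```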
  • the inverse orthogonal transform unit 105 performs an inverse orthogonal transform such as an inverse discrete cosine transform (IDCT) or an inverse discrete sine transform (IDST) on the restored transform coefficient input from the inverse quantization unit 104 to obtain a restored prediction error. Note that the inverse orthogonal transform performed by the inverse orthogonal transform unit 105 corresponds to the orthogonal transform performed by the orthogonal transform unit 102. The inverse orthogonal transform unit 105 outputs the restored prediction error and inputs it to the addition unit 106.
  • the addition unit 106 adds the restored prediction error input from the inverse orthogonal transform unit 105 and the corresponding predicted image to generate a locally decoded image.
  • the adding unit 106 outputs the local decoded image and inputs it to the predicted image generation unit 107.
  • the predicted image generation unit 107 stores the locally decoded image input from the addition unit 106 as a reference image in a memory (not shown in FIG. 1), outputs the reference image stored in the memory, and inputs it to the reference image feature amount deriving unit 108 and the motion evaluation unit 111.
  • the predicted image generation unit 107 performs weighted motion compensation prediction based on the WP parameter information input from the parameter derivation unit 110 and the motion information input from the motion evaluation unit 111, and generates a predicted image.
  • the predicted image generation unit 107 outputs a predicted image and inputs it to the subtracting unit 101 and the adding unit 106.
  • FIG. 4 is a block diagram illustrating an example of the configuration of the predicted image generation unit 107 of the first embodiment.
  • the predicted image generation unit 107 includes a multi-frame motion compensation unit 201, a memory 202, a unidirectional motion compensation unit 203, a prediction parameter control unit 204, a reference image selector 205, a frame memory 206, and a reference image control unit 207.
  • the frame memory 206 stores the locally decoded image input from the addition unit 106 as a reference image under the control of the reference image control unit 207.
  • the frame memory 206 has a plurality of memory sets FM0 to FMN (N ⁇ 1) for temporarily storing reference images.
  • the prediction parameter control unit 204 prepares a plurality of combinations of reference image numbers and prediction parameters as a table based on the motion information input from the motion evaluation unit 111.
  • the motion information indicates a motion vector indicating a shift amount of motion used in motion compensation prediction, a reference image number, information on a prediction mode such as unidirectional / bidirectional prediction, and the like.
  • the prediction parameter refers to information regarding a motion vector and a prediction mode.
  • the prediction parameter control unit 204 selects and outputs a combination of the reference image number of the reference image used for generating the prediction image and the prediction parameter based on the input image, and inputs the reference image number to the reference image selector 205.
  • the prediction parameter is input to the unidirectional motion compensation unit 203.
  • the reference image selector 205 is a switch for selecting which output end of the frame memories FM0 to FMN included in the frame memory 206 is connected, according to the reference image number input from the prediction parameter control unit 204. For example, if the reference image number is 0, the reference image selector 205 connects the output end of FM0 to the output end of the reference image selector 205, and if the reference image number is N, it connects the output end of FMN to the output end of the reference image selector 205.
  • the reference image selector 205 outputs the reference image stored in the frame memory to which its output end is connected among the frame memories FM0 to FMN included in the frame memory 206, and inputs it to the unidirectional motion compensation unit 203, the reference image feature amount deriving unit 108, and the motion evaluation unit 111.
  • the unidirectional motion compensation unit 203 performs a motion compensation prediction process according to the prediction parameter input from the prediction parameter control unit 204 and the reference image input from the reference image selector 205, and generates a unidirectional prediction image.
  • FIG. 5 is a diagram illustrating an example of a motion vector relationship of motion compensation prediction in the bidirectional prediction according to the first embodiment.
  • in the unidirectional motion compensation unit 203, motion compensation prediction interpolation processing is performed using a reference image, and a unidirectional prediction image is generated based on the amount of motion deviation of the generated interpolation image from the pixel block at the encoding target position in the input image.
  • the amount of deviation is a motion vector.
  • a prediction image is generated using a set of two types of reference images and motion vectors.
  • an interpolation process with 1/2 pixel accuracy, an interpolation process with 1/4 pixel accuracy, or the like is used, and the value of the interpolation image is generated by performing the filtering process on the reference image.
  • in H.264, for example, interpolation processing up to 1/4-pixel accuracy can be performed on the luminance signal, and the shift amount is therefore expressed as four times the integer pixel accuracy.
  • the unidirectional motion compensation unit 203 outputs a unidirectional prediction image and temporarily stores it in the memory 202.
  • the multi-frame motion compensation unit 201 performs weighted prediction using two types of unidirectional prediction images.
  • in bidirectional prediction, the unidirectional prediction image generated first is stored in the memory 202, and the unidirectional prediction image generated second is directly output to the multi-frame motion compensation unit 201.
  • hereinafter, the unidirectional prediction image generated first is referred to as the first predicted image, and the unidirectional prediction image generated second is referred to as the second predicted image.
  • two unidirectional motion compensation units 203 may be prepared and each may generate two unidirectional prediction images.
  • in this case, the unidirectional motion compensation unit 203 may directly output the first unidirectional prediction image as the first predicted image to the multi-frame motion compensation unit 201.
  • the multi-frame motion compensation unit 201 generates a predicted image by performing weighted prediction using the first predicted image input from the memory 202, the second predicted image input from the unidirectional motion compensation unit 203, and the WP parameter information input from the parameter deriving unit 110.
  • the multi-frame motion compensation unit 201 outputs a prediction image and inputs the prediction image to the subtraction unit 101 and the addition unit 106.
  • FIG. 6 is a block diagram illustrating an example of the configuration of the multi-frame motion compensation unit 201 according to the first embodiment.
  • the multi-frame motion compensation unit 201 includes a default motion compensation unit 301, a weighted motion compensation unit 302, a WP parameter control unit 303, and WP selectors 304 and 305.
  • the WP parameter control unit 303 outputs a WP application flag and weight information based on the WP parameter information input from the parameter derivation unit 110, inputs the WP application flag to the WP selectors 304 and 305, and weights the weight information. Input to the motion compensation unit 302.
  • the WP parameter information includes the fixed-point precision of the weighting coefficient, a first WP application flag, a first weighting coefficient, and a first offset corresponding to the first predicted image, and a second WP application flag, a second weighting coefficient, and a second offset corresponding to the second predicted image.
  • the WP application flag is a parameter that can be set for each corresponding reference image and signal component, and indicates whether to perform weighted motion compensation prediction.
  • the weight information includes information on the fixed point precision of the weight coefficient, the first weight coefficient, the first offset, the second weight coefficient, and the second offset.
  • the WP parameter control unit 303 outputs the WP parameter information by separating it into a first WP application flag, a second WP application flag, and weight information.
  • the first WP application flag is input to the WP selector 304
  • the second WP application flag is input to the WP selector 305
  • the weight information is input to the weighted motion compensation unit 302.
  • the WP selectors 304 and 305 switch the connection end of each predicted image based on the WP application flag input from the WP parameter control unit 303.
  • when each WP application flag is 0, the WP selectors 304 and 305 connect their output ends to the default motion compensation unit 301, and output the first predicted image and the second predicted image to the default motion compensation unit 301.
  • when each WP application flag is 1, the WP selectors 304 and 305 connect their output ends to the weighted motion compensation unit 302, and output the first predicted image and the second predicted image to the weighted motion compensation unit 302.
  • the default motion compensation unit 301 performs an average value process based on the two unidirectional prediction images (first prediction image and second prediction image) input from the WP selectors 304 and 305 to generate a prediction image. Specifically, when the first WP application flag and the second WP application flag are 0, the default motion compensation unit 301 performs average value processing based on Expression (1).
  • P [x, y] is a predicted image
  • PL0 [x, y] is a first predicted image
  • PL1 [x, y] is a second predicted image
  • offset2 and shift2 are parameters of the rounding process in the average value processing, and are determined by the internal calculation accuracy of the first predicted image and the second predicted image. If the bit accuracy of the predicted image is L and the bit accuracy of the first predicted image and the second predicted image is M (L ≤ M), shift2 is formulated by Equation (2) and offset2 is formulated by Equation (3).
  • offset2 = (1 << (shift2 - 1)) ... (3)
  • in the case of unidirectional prediction, the default motion compensation unit 301 calculates the final predicted image based on Expression (4) using only the first predicted image.
  • PLX [x, y] indicates a unidirectional prediction image (first prediction image), and X is an identifier indicating the list number of the reference list, and is either 0 or 1. For example, when the list number is 0, PL0 [x, y] is obtained, and when the list number is 1, PL1 [x, y] is obtained.
  • offset1 and shift1 are parameters of the rounding process, and are determined by the internal calculation accuracy of the first predicted image. Assuming that the bit accuracy of the predicted image is L and the bit accuracy of the first predicted image is M (L ⁇ M), shift1 is formulated by Equation (5), and offset1 is formulated by Equation (6).
  • offset1 = (1 << (shift1 - 1)) ... (6)
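  • a minimal C sketch of the default average-value processing of Expressions (1) to (6) is shown below; the concrete forms shift2 = M - L + 1 and shift1 = M - L are assumptions consistent with the description (they are the usual choice when the internal accuracy is M bits and the output accuracy is L bits), not quotations of Equations (2) and (5).

```c
/* Clip a value to the valid pixel range of the given bit depth. */
static int clip_pixel(int v, int bitdepth)
{
    int max = (1 << bitdepth) - 1;
    return v < 0 ? 0 : (v > max ? max : v);
}

/* Default bidirectional average, Expression (1), with assumed shift2/offset2. */
static int default_bipred(int pl0, int pl1, int L, int M)
{
    int shift2  = M - L + 1;                         /* assumed form of Eq. (2) */
    int offset2 = 1 << (shift2 - 1);                 /* Eq. (3)                 */
    return clip_pixel((pl0 + pl1 + offset2) >> shift2, L);
}

/* Default unidirectional prediction, Expression (4), with assumed shift1/offset1. */
static int default_unipred(int plx, int L, int M)
{
    int shift1  = M - L;                             /* assumed form of Eq. (5) */
    int offset1 = shift1 ? 1 << (shift1 - 1) : 0;    /* Eq. (6)                 */
    return clip_pixel((plx + offset1) >> shift1, L);
}
```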
  • the weighted motion compensation unit 302 is based on the two unidirectional prediction images (first prediction image and second prediction image) input from the WP selectors 304 and 305 and the weight information input from the WP parameter control unit 303. Performs weighted motion compensation. Specifically, when the first WP application flag and the second WP application flag are 1, the weighted motion compensation unit 302 performs the weighting process based on Expression (7).
  • w0C is a weighting coefficient corresponding to the first predicted image
  • w1C is a weighting coefficient corresponding to the second predicted image
  • o0C is an offset corresponding to the first predicted image
  • o1C is an offset corresponding to the second predicted image.
  • logWDC is a parameter indicating the fixed-point precision of each weighting coefficient.
  • when the calculation accuracy of the first predicted image, the second predicted image, and the predicted image differ, the weighted motion compensation unit 302 realizes the rounding process by controlling logWDC, which is the fixed-point precision, as in Expression (8).
  • the rounding process can be realized by replacing logWDC in Expression (7) with logWD'C in Expression (8).
  • for example, when the bit accuracy of the predicted image is 8 and the bit accuracy of the first predicted image and the second predicted image is 14, resetting logWDC makes it possible to realize a batch rounding process with the same calculation accuracy as shift2 in Expression (1).
  • in the case of unidirectional prediction, the weighted motion compensation unit 302 calculates the final predicted image based on Equation (9) using only the first predicted image.
  • PLX[x, y] indicates a unidirectional prediction image (the first predicted image)
  • wXC indicates a weighting coefficient corresponding to unidirectional prediction
  • X is an identifier indicating the list number of the reference list, and is either 0 or 1. For example, when the list number is 0, PL0[x, y] and w0C are used, and when the list number is 1, PL1[x, y] and w1C are used.
  • as in the case of bidirectional prediction, when the calculation accuracy of the predicted image differs from that of the first predicted image, the weighted motion compensation unit 302 realizes the rounding process by controlling logWDC, which is the fixed-point precision, as in Expression (8).
  • the rounding process can be realized by replacing logWDC with logWD'C in Expression (8), as described above. For example, when the bit accuracy of the predicted image is 8 and the bit accuracy of the first predicted image is 14, resetting logWDC realizes a batch rounding process with the same calculation accuracy as shift1 in Expression (4).
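  • the following C sketch illustrates the weighted motion compensation of Expressions (7) and (9); since the expressions themselves are not reproduced in this text, the H.264-style rounding used below is an assumption consistent with the description, and clip_pixel is the same clamp as in the previous sketch.

```c
static int clip_pixel(int v, int bitdepth)          /* clamp to [0, 2^bitdepth - 1] */
{
    int max = (1 << bitdepth) - 1;
    return v < 0 ? 0 : (v > max ? max : v);
}

/* Weighted bidirectional prediction in the spirit of Expression (7). */
static int weighted_bipred(int pl0, int pl1, int w0, int w1,
                           int o0, int o1, int log_wd, int out_bd)
{
    int p = ((pl0 * w0 + pl1 * w1 + (1 << log_wd)) >> (log_wd + 1))
            + ((o0 + o1 + 1) >> 1);
    return clip_pixel(p, out_bd);
}

/* Weighted unidirectional prediction in the spirit of Expression (9). */
static int weighted_unipred(int plx, int wx, int ox, int log_wd, int out_bd)
{
    int p = log_wd > 0
          ? ((plx * wx + (1 << (log_wd - 1))) >> log_wd) + ox
          : plx * wx + ox;
    return clip_pixel(p, out_bd);
}
```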
  • FIG. 7 is an explanatory diagram of an example of the fixed-point precision of the weighting factor in the first embodiment, and is a diagram illustrating an example of a change between a moving image having a pixel value change in the time direction and an average pixel value.
  • the encoding target frame is Frame (t)
  • the temporally previous frame is Frame (t ⁇ 1)
  • the temporally subsequent frame is Frame (t + 1).
  • the weighting coefficient means the degree of change in FIG. 7 and takes a value of 1.0 when there is no change in the pixel value, as is clear from Equation (7) and Equation (9).
  • the fixed-point precision is a parameter that controls the step size corresponding to the fractional part of the weighting coefficient; when there is no pixel value change, the weighting coefficient is 1 << logWDC.
  • FIG. 7 shows an example in which the average pixel value of the image decreases with time, and the weighting factor corresponds to the inclination of the decrease.
  • the fixed-point precision of the weighting coefficient is information indicating the precision of this slope. For example, in FIG. 7, when the weighting coefficient between Frame(t-1) and Frame(t+1) is 0.75 in decimal notation, it can be expressed as 3/4 with 1/4 precision.
  • adjustment can be made with an offset value indicating a correction value (deviation amount) corresponding to the intercept of the linear function.
  • for example, when the weighting coefficient between Frame(t-1) and Frame(t+1) is 0.60 in decimal notation and the fixed-point precision is 1 (1 << 1), the weighting coefficient cannot be represented exactly at that fixed-point precision, so a value such as 1 (corresponding to a weighting coefficient of 0.50) is set.
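  • the following small C program works through this representation: a real-valued weighting coefficient w is stored as the nearest integer to w × (1 << logWD), so 0.75 maps exactly to 3 when logWD = 2, while 0.60 rounds to 1 (that is, 0.50) when logWD = 1; the helper name is hypothetical.

```c
#include <stdio.h>

/* Nearest fixed-point representation of a real weighting coefficient. */
static int to_fixed_weight(double w, int log_wd)
{
    return (int)(w * (1 << log_wd) + 0.5);
}

int main(void)
{
    printf("%d\n", to_fixed_weight(0.75, 2));  /* 3  -> 3/4 with 1/4 precision */
    printf("%d\n", to_fixed_weight(0.60, 1));  /* 1  -> 1/2, 0.60 not exact    */
    return 0;
}
```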
  • in the case of unidirectional prediction, the parameters corresponding to the second predicted image (the second WP application flag, the second weighting coefficient, and the second offset) are not used, and may therefore be set to predetermined initial values.
  • the reference image feature amount deriving unit 108, the predicted image feature amount deriving unit 109, and the parameter deriving unit 110 implicitly derive the WP parameter information corresponding to the predicted image from the reference images input from the predicted image generation unit 107.
  • the reference image feature amount deriving unit 108 derives the reference image feature amount of each reference image input from the predicted image generation unit 107, outputs a reference image group feature amount that summarizes the derived reference image feature amounts, and inputs it to the predicted image feature amount deriving unit 109 and the parameter deriving unit 110.
  • FIG. 8 is a diagram illustrating an example of a reference image group feature amount according to the first embodiment.
  • the reference image group feature amount is a table of 2N + 2 reference image feature amounts, and each reference image feature amount includes the list number of the reference list of the reference image, the reference image number of the reference image, the pixel average value of the reference image, and the pixel error value indicating the pixel difference from the pixel average value of the reference image.
  • the list number is an identifier indicating the prediction direction, and takes a value of 0 for unidirectional prediction, and two values of 0 and 1 can be used for bidirectional prediction.
  • the reference image number is a value corresponding to 0 to N indicated in the frame memory 206.
  • list numbers and reference image numbers are managed in accordance with the DPB (decoded picture buffer) used in H.264 and the like; the encoding control unit 113 sets them in the predicted image generation unit 107 (for example, which reference image the reference image selector 205 outputs to the reference image feature amount deriving unit 108), and the reference image feature amount deriving unit 108 implicitly sets these values in the reference image feature amount according to the reference image that is output to it.
  • the average pixel value and the pixel error value are calculated by the reference image feature amount deriving unit 108.
  • the table of reference image group feature values shown in FIG. 8 is an example, and the configuration of reference images that can be used differs depending on the coding structure, so the table size also differs.
  • for example, in the case of P-slice, the reference image numbers of list number 1, which correspond to bidirectional prediction, cannot be used, so the table contains only the entries of list number 0.
  • the table size also varies depending on the number of reference images.
  • FIG. 9 is a block diagram illustrating an example of the configuration of the reference image feature quantity deriving unit 108 of the first embodiment.
  • the reference image feature quantity deriving unit 108 includes an average value calculating unit 401, an error value calculating unit 402, and an integrating unit 403.
  • the average value calculation unit 401 calculates the pixel average value of the reference image, outputs the calculated pixel average value, and the error value calculation unit 402 and the integration unit Input to 403.
  • the average value calculation unit 401 calculates the pixel average value of the reference image using, for example, Equation (10).
  • DCLX(t) represents the pixel average value of the reference image of list number X and reference image number t. Therefore, the pixel average value of the reference image of list number 0 and reference image number 1 is represented as DCL0(1), and the pixel average value of the reference image of list number 1 and reference image number 0 is represented as DCL1(0). n indicates the number of pixels of the reference image of list number X and reference image number t. Yx,y(t) represents the pixel value at the (x, y) coordinates of the reference image of list number X and reference image number t.
  • the error value calculation unit 402 calculates the pixel error value of the reference image using the pixel average value of the reference image input from the average value calculation unit 401, outputs the calculated pixel error value, and inputs it to the integration unit 403. The error value calculation unit 402 calculates the pixel error value with respect to the pixel average value of the reference image using, for example, Equation (11).
  • ACLX (t) represents a pixel error value that is an average value of differences (errors) between the pixel value of each pixel of the reference image of list number X and reference image number t and the pixel average value of the reference image. .
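  • a minimal C sketch of Expressions (10) and (11) is given below; the use of the mean absolute difference for the pixel error value is an assumption consistent with the description of ACLX(t) as the average of the errors from the pixel average value.

```c
#include <stdlib.h>   /* llabs */

/* Compute the pixel average value (dc) and pixel error value (ac) of one
 * reference image given as n pixels; n > 0 is assumed.                   */
static void ref_image_features(const unsigned char *pix, int n, int *dc, int *ac)
{
    long long sum = 0, err = 0;
    for (int i = 0; i < n; i++)
        sum += pix[i];
    *dc = (int)(sum / n);                                  /* Expression (10) */

    for (int i = 0; i < n; i++)
        err += llabs((long long)pix[i] - *dc);
    *ac = (int)(err / n);                                  /* Expression (11) */
}
```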
  • the integration unit 403 integrates the list number and reference image number of the reference image, the pixel average value of the reference image input from the average value calculation unit 401, and the pixel error value input from the error value calculation unit 402 into the reference image feature amount.
  • the integration unit 403 collects the integrated reference image feature values in a table as shown in FIG. 8 and outputs them as reference image group feature values, which are input to the predicted image feature value deriving unit 109 and the parameter deriving unit 110.
  • the predicted image feature amount deriving unit 109 derives a predicted image feature amount based on the reference image group feature amount input from the reference image feature amount deriving unit 108, outputs the derived predicted image feature amount, and the parameter deriving unit 110. To enter.
  • the predicted image feature amount includes a pixel average value of the predicted image, a pixel error value obtained by averaging errors from the pixel average value, and a predicted image feature amount derivation flag indicating whether the predicted image feature amount has been derived.
  • FIG. 10 is a block diagram illustrating an example of the configuration of the predicted image feature quantity deriving unit 109 according to the first embodiment.
  • the predicted image feature quantity deriving unit 109 includes a feature quantity control unit 411, a memory 412, and a predicted image feature quantity calculation unit 413.
  • the feature amount control unit 411 selects and outputs, from the reference image group feature amount input from the reference image feature amount deriving unit 108, two reference image feature amounts used for deriving the predicted image feature amount; it inputs (loads) one of them into the memory 412 and inputs the other to the predicted image feature amount calculation unit 413.
  • specifically, the feature amount control unit 411 derives, from the list number and reference image number of each reference image feature amount collected in the reference image group feature amount, the POC (Picture Order Count) indicating the display order (display time) of the reference image. The reference list and the reference image number are information for indirectly specifying the reference image, whereas the POC is information for directly specifying the reference image and corresponds to the absolute position of the reference image. Then, the feature amount control unit 411 selects two POCs from the derived POCs in order of increasing distance from the encoding target image (predicted image), thereby selecting from the reference image group feature amount the two reference image feature amounts used for deriving the predicted image feature amount.
  • the feature amount control unit 411 selects two POCs (POC1 and POC2) using Equation (12) when the encoded slice is P-slice.
  • num_of_active_ref_l0_minus1 is one of the syntax elements, and indicates a value obtained by subtracting 1 from the number of reference images used in the reference list of list number 0, that is, N.
  • RefPicOrderCnt is a function that, when given a list number of a reference image and a reference image number of the reference image, returns a POC of the reference image
  • ListL0 indicates list number 0
  • refIdx indicates a reference image number.
  • refPOC indicates the POC arrangement of the reference image in which the reference image feature amount is included in the reference image group feature amount
  • curPOC indicates the POC of the encoding target image.
  • SortRefPOC calculates the absolute difference value between the POC indicated by curPOC and each of the POCs stored in refPOC, the number of which is one more than the value given as the third argument (num_of_active_ref_l0_minus1 or num_of_active_ref_l0_minus1 - 1).
  • SortRefPOC is a function that returns the POC having the smallest absolute difference value among the POCs stored in refPOC, deletes the POC from refPOC, and rearranges the POCs stored in refPOC.
  • for example, when refPOC = {0, 1, 2, 3} and curPOC = 4, SortRefPOC(refPOC, curPOC, num_of_active_ref_l0_minus1) returns POC 3, whose absolute difference value of 1 is the smallest, as POC1, deletes 3 from refPOC, and rearranges refPOC to {0, 1, 2}.
  • next, SortRefPOC(refPOC, curPOC, num_of_active_ref_l0_minus1 - 1) returns POC 2, whose absolute difference value of 2 is now the smallest, as POC2, deletes 2 from refPOC, and rearranges refPOC to {0, 1}.
  • the feature quantity control unit 411 selects two POCs (POC1 and POC2) using Expression (13) when the encoded slice is B-slice.
  • POC1 = SortRefPOC(refPOCL0, curPOC, num_of_active_ref_l0_minus1), POC2 = SortRefPOC(refPOCL1, curPOC, num_of_active_ref_l1_minus1) ... (13)
  • refPOCL0 indicates the POC arrangement of the reference image of list number 0 in which the reference image feature quantity is included in the reference image group feature quantity.
  • num_of_active_ref_l1_minus1 is one of the syntax elements, and indicates a value obtained by subtracting 1 from the number of reference images used in the reference list of list number 1, that is, N.
  • refPOCL1 indicates the POC arrangement of the reference image of list number 1 in which the reference image feature amount is included in the reference image group feature amount.
  • ListL1 indicates list number 1.
  • when the two selected POCs have the same temporal distance from the encoding target image (for example, when the reference image closest in temporal distance to the encoding target image is selected from both reference lists), the feature amount control unit 411 reselects one POC (POC2) using Expression (14).
  • the feature quantity control unit 411 can select two POCs having different temporal distances from the encoding target image (POC2 having a temporal distance different from that of POC1) from the encoding target image by repeatedly performing Expression (14).
  • X of refPOCLX is an identifier indicating a list number. For example, after searching a reference list of list number 0, a reference list of list number 1 may be searched. For this reason, when POC (POC2) is reselected, the two POCs (reference images) that are closest in time to the encoded image may be selected from the same reference list.
  • M is a value indicating the number of repetitions. M is defined by the number of reference images set for each reference list, for example.
  • the feature quantity control unit 411 selects and outputs a reference image feature quantity corresponding to POC1 from the reference image group feature quantity, and inputs it to the memory 412. Further, when POC2 is selected, the feature amount control unit 411 selects and outputs a reference image feature amount corresponding to POC2 from the reference image group feature amount, and inputs it to the predicted image feature amount calculation unit 413.
  • the feature amount control unit 411 selects two POCs, but three or more POCs may be selected.
  • the feature amount control unit 411 selects three or more reference image feature amounts from the reference image group feature amount, and a predicted image feature amount is derived from the selected three or more reference image feature amounts. Note that in the case of P-slice, it is necessary that N ⁇ 2, and the feature amount control unit 411 may search the reference list with the list number 0 after executing Expression (12).
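  • a minimal C sketch of this SortRefPOC-style selection is given below: it repeatedly picks the POC closest in display order to the current image and removes it; the compaction used in place of the described rearrangement, and the handling of the B-slice reselection of Expression (14), are simplifications, and the helper names are hypothetical.

```c
#include <stdlib.h>   /* abs */

/* Return the POC in ref_poc[] closest to cur_poc and remove it from the array. */
static int pop_nearest_poc(int *ref_poc, int *count, int cur_poc)
{
    int best = 0;
    for (int i = 1; i < *count; i++)
        if (abs(ref_poc[i] - cur_poc) < abs(ref_poc[best] - cur_poc))
            best = i;
    int poc = ref_poc[best];
    ref_poc[best] = ref_poc[--(*count)];   /* delete by compaction */
    return poc;
}

/* Select the two POCs (POC1, POC2) closest to the encoding target image. */
static void select_two_pocs(int *ref_poc, int count, int cur_poc,
                            int *poc1, int *poc2)
{
    *poc1 = pop_nearest_poc(ref_poc, &count, cur_poc);
    *poc2 = pop_nearest_poc(ref_poc, &count, cur_poc);
    /* For B-slice, when POC2 has the same temporal distance as POC1, the
     * description reselects POC2 (Expression (14)) until the distances differ. */
}
```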
  • the memory 412 holds the reference image feature amount input from the feature amount control unit 411 (the reference image feature amount corresponding to POC1, hereinafter referred to as the first reference image feature amount). Then, when the reference image feature amount corresponding to POC2 (hereinafter referred to as the second reference image feature amount) is output from the feature amount control unit 411 to the predicted image feature amount calculation unit 413, the memory 412 outputs the first reference image feature amount and inputs it to the predicted image feature amount calculation unit 413.
  • the predicted image feature amount calculation unit 413 calculates the predicted image feature amount using the first reference image feature amount input from the memory 412 and the second reference image feature amount input from the feature amount control unit 411, outputs it, and inputs it to the parameter deriving unit 110.
  • the predicted image feature amount calculation unit 413 first calculates the temporal distance between the reference image of the first reference image feature amount and the reference image of the second reference image feature amount according to Equation (15).
  • DistScaleFactor = Clip3(-1024, 1023, (tb * tx + 32) >> 6) ... (15)
  • Clip3 (A, B, C) returns A if the value C falls below the minimum value A, returns B if the value C exceeds the maximum value B, and returns value C if none of the conditions apply.
  • tb is calculated by Expression (16)
  • tx is calculated by Expression (17).
  • Tb indicates the time distance between curPOC and POC1.
  • tx is an intermediate variable for performing division of tb / td with fixed-point precision, and indicates a division result of tb / td. Note that the fixed point has 8-bit precision, and when the value of DistScaleFactor is 128 (median value), it means that tb / td is 1.
  • td indicates a temporal distance between POC1 and POC2, and is calculated by Expression (18).
  • td = Clip3(-128, 127, POC2 - POC1) ... (18)
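  • the following C sketch computes the temporal distance ratio of Expressions (15) to (18); since Expressions (16) and (17) are not reproduced here, the clipping of tb and the reciprocal approximation used for tx are assumptions, chosen so that DistScaleFactor becomes 128 when tb equals td, matching the statement that 128 corresponds to tb/td = 1.

```c
#include <stdlib.h>   /* abs */

static int clip3(int lo, int hi, int v)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

static int dist_scale_factor(int cur_poc, int poc1, int poc2)
{
    int tb = clip3(-128, 127, cur_poc - poc1);       /* assumed form of Eq. (16) */
    int td = clip3(-128, 127, poc2 - poc1);          /* Expression (18)          */
    if (td == 0)                                     /* POC1 == POC2: treated as */
        return 0;                                    /* not derivable, Eq. (23)  */
    int tx = (8192 + abs(td) / 2) / td;              /* assumed form of Eq. (17) */
    return clip3(-1024, 1023, (tb * tx + 32) >> 6);  /* Expression (15)          */
}
```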
  • the predicted image feature value calculation unit 413 calculates the pixel average value of the predicted image according to Equation (19), and calculates the pixel error value of the predicted image according to Equation (20).
  • DCP indicates the pixel average value of the predicted image
  • DC1 and DC2 indicate the pixel average value of the first reference image feature value and the pixel average value of the second reference image feature value, respectively.
  • ACP indicates the pixel error value of the predicted image
  • AC1 and AC2 indicate the pixel error value of the first reference image feature value and the pixel error value of the second reference image feature value, respectively.
  • the values of Shft3 and Ofst3 are determined according to INTERNAL_PREC indicating the internal calculation accuracy of the predicted image, Shft3 is calculated by Equation (21), and Ofst3 is calculated by Equation (22). In the first embodiment, since the fixed point precision of DistScaleFactor is 8, when INTERNAL_PREC is 8, DCP and ACP are rounded to integer precision.
  • when any of Expressions (23) to (25) is satisfied, the predicted image feature amount calculation unit 413 cannot derive the predicted image feature amount, or the temporal distance is too far, so the pixel average value DCP of the predicted image and the pixel error value ACP of the predicted image are set to initial values.
  • POC1 == POC2 ... (23), (DistScaleFactor >> 2) < -64 ... (24), (DistScaleFactor >> 2) > 128 ... (25)
  • when any of the conditions of Expressions (23) to (25) is satisfied, the predicted image feature amount calculation unit 413 sets the predicted image feature amount derivation flag wp_avaiable_flag, which is an internal variable, to false; if DistScaleFactor satisfies none of the conditions of Expressions (23) to (25), wp_avaiable_flag is set to true.
  • when the predicted image feature amount derivation flag wp_avaiable_flag is set to false, initial values are set in DCP and ACP. For example, DefaultDC indicating 0 is set in DCP, and DefaultAC indicating 0 is set in ACP.
  • when the predicted image feature amount derivation flag is set to true, the values calculated by Expression (19) and Expression (20) are set in DCP and ACP, respectively.
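  • a C sketch of this derivation is given below; the availability conditions follow Expressions (23) to (25) as stated, while the linear interpolation used for DCP and ACP (treating DistScaleFactor as tb/td with 128 representing 1.0) is an assumption, because Expressions (19) to (22) are not reproduced in this text.

```c
typedef struct {
    int available;   /* wp_avaiable_flag                           */
    int dcp;         /* pixel average value of the predicted image */
    int acp;         /* pixel error value of the predicted image   */
} PredFeature;

static PredFeature derive_pred_feature(int poc1, int poc2, int dsf,
                                       int dc1, int dc2, int ac1, int ac2)
{
    PredFeature f = { 0, 0, 0 };              /* DefaultDC = DefaultAC = 0 */
    if (poc1 == poc2 ||                       /* Expression (23) */
        (dsf >> 2) < -64 ||                   /* Expression (24) */
        (dsf >> 2) > 128)                     /* Expression (25) */
        return f;                             /* flag remains false */

    f.available = 1;
    f.dcp = dc1 + ((dsf * (dc2 - dc1) + 64) >> 7);   /* assumed form of (19) */
    f.acp = ac1 + ((dsf * (ac2 - ac1) + 64) >> 7);   /* assumed form of (20) */
    return f;
}
```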
  • the parameter deriving unit 110 uses the reference image group feature amount input from the reference image feature amount deriving unit 108 and the predicted image feature amount input from the predicted image feature amount deriving unit 109 to use the WP parameter of the encoding target image. Deriving information.
  • FIG. 11A and FIG. 11B are diagrams illustrating an example of WP parameter information according to the first embodiment.
  • An example of WP parameter information at the time of P-slice is as shown in FIG. 11A
  • an example of WP parameter information at the time of B-slice is as shown in FIGS. 11A and 11B.
  • the list number and the reference image number are the same as the reference image group feature quantity, and the WP application flag, the weighting coefficient, and the offset are as described with reference to FIGS.
  • the WP application flag, weighting factor, and offset with list number 0 correspond to the first WP application flag, first weighting factor, and first offset, respectively
  • the WP application flag, weighting coefficient, and offset with list number 1 correspond to the second WP application flag, the second weighting coefficient, and the second offset, respectively. Since the WP parameter information is held for each reference list and each reference image, the information required for B-slice amounts to 2N + 2 entries when N + 1 reference images are used.
  • the parameter deriving unit 110 first checks the predicted image feature amount derivation flag wp_avaiable_flag included in the predicted image feature amount to confirm whether the WP parameter information can be derived. When wp_avaiable_flag is set to false, the WP parameter information cannot be derived, so the parameter deriving unit 110 sets the weighting coefficient and the offset corresponding to the list number X and the reference image number Y to initial values according to Expression (26) and Expression (27).
  • Weight[X][Y] is a value corresponding to w0C or w1C used in Equations (7) and (9).
  • Log2Denom is a value corresponding to logWDC, calculated by Expression (28) and used in Expression (7) and the like.
  • Default_Value may be set to 0 or 7, for example.
  • that is, when wp_avaiable_flag is set to false, the parameter deriving unit 110 repeatedly executes Expression (26) and Expression (27) for all combinations of list number X and reference image number Y (all reference images), thereby setting initial values in the WP parameter information.
  • the parameter deriving unit 110 sets the WP application flag (WP_flag [X] [Y]) corresponding to the list number X and the reference image number Y to false when wp_avaiable_flag is set to false.
  • when wp_avaiable_flag is set to true, the parameter deriving unit 110 derives the weighting coefficient and the offset corresponding to the list number X and the reference image number Y according to Expression (29) and Expression (30), respectively.
  • curDC and curAC indicate the pixel average value DCP and the pixel error value ACP of the predicted image, respectively.
  • DC [X] [Y] and AC [X] [Y] indicate the pixel average value DCLX (Y) and pixel error value ACLX (Y) of the reference image with the list number X and the reference image number Y, respectively.
  • LeftShft is a value for correcting a change in calculation accuracy for Shft3 used in equations (19) to (22), and is calculated by equation (31).
  • Ofst4 is a parameter used for rounding when dividing by AC [X] [Y]. For example, if rounding is performed during rounding with fixed-point precision, Ofst4 may be set to (AC [X] [Y] >> 1), and Ofst4 may be set to 0 when always rounding down.
  • RealLog2Denom is calculated by Expression (32), and RealOfst is calculated by Expression (33).
  • RealLog2Denom = Log2Denom + LeftShft ... (32)
  • RealOfst = (1 << (RealLog2Denom - 1)) ... (33)
  • RealLog2Denom may be set to a predetermined value such as 7, for example.
  • that is, when wp_avaiable_flag is set to true, the parameter deriving unit 110 repeatedly executes Expression (29) and Expression (30) for all combinations of list number X and reference image number Y (all reference images), thereby deriving the WP parameter information shown in FIGS. 11A and 11B.
  • the parameter deriving unit 110 sets the WP application flag (WP_flag [X] [Y]) corresponding to the list number X and the reference image number Y to true when wp_avaiable_flag is set to true.
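  • a C sketch of the per-reference derivation of Expressions (29) to (33) is given below; since the expressions themselves are not reproduced in this extraction, the ratio-of-error-values form of the weighting coefficient and the DC-mismatch form of the offset are assumptions consistent with the surrounding description (curDC/curAC are DCP/ACP, DC/AC are the reference image statistics), and LeftShft is taken as 0.

```c
/* Derive Weight[X][Y] and Offset[X][Y] for one reference image. */
static void derive_weight_offset(int cur_dc, int cur_ac,   /* DCP, ACP of predicted image */
                                 int ref_dc, int ref_ac,   /* DC[X][Y], AC[X][Y]          */
                                 int log2_denom,           /* fixed-point precision       */
                                 int *weight, int *offset)
{
    int ofst4     = ref_ac >> 1;                            /* rounding term, as described */
    int real_ofst = log2_denom ? 1 << (log2_denom - 1) : 0; /* Expression (33), LeftShft=0 */

    if (ref_ac == 0) {                           /* degenerate case, assumed handling */
        *weight = 1 << log2_denom;               /* weighting coefficient of 1.0      */
        *offset = cur_dc - ref_dc;
        return;
    }
    *weight = ((cur_ac << log2_denom) + ofst4) / ref_ac;                 /* assumed (29) */
    *offset = cur_dc - ((*weight * ref_dc + real_ofst) >> log2_denom);   /* assumed (30) */
}
```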
  • the motion evaluation unit 111 performs motion evaluation between a plurality of frames based on the input image and the reference image input from the predicted image generation unit 107, outputs motion information, and uses the motion information as the predicted image generation unit. 107 and the encoding unit 112.
  • the motion evaluation unit 111 calculates optimal motion information by a technique such as block matching, in which an error is calculated as the difference value between the prediction target pixel block of the input image and a plurality of reference image blocks corresponding to the same position, the position is shifted with fractional accuracy, and the block that minimizes the error is searched for.
  • in the case of bidirectional prediction, the motion evaluation unit 111 calculates motion information for bidirectional prediction by performing block matching including the default motion compensation prediction shown in Equation (1) and Equation (4), using the motion information derived by unidirectional prediction.
  • at this time, the motion evaluation unit 111 can also calculate motion information taking weighted prediction into consideration by performing block matching including the weighted motion compensation prediction shown in Equation (7) and Equation (9). In this case, the motion evaluation unit 111 may perform block matching using Equation (7) and Equation (9) according to the WP parameter information output from the parameter deriving unit 110.
  • the motion evaluation unit 111 is exemplified as one function of the encoding device 100.
  • the motion evaluation unit 111 is not an essential component of the encoding device 100; for example, it may be a device outside the encoding device 100. In this case, the motion information calculated by the motion evaluation unit 111 may be loaded into the encoding device 100.
  • the encoding unit 112 encodes the quantized transform coefficients input from the quantization unit 103, the motion information input from the motion evaluation unit 111, and various encoding parameters such as the quantization information specified by the encoding control unit 113, and generates encoded data.
  • the encoding process corresponds to, for example, Huffman coding or arithmetic coding; for example, H.264 uses context-based adaptive variable length coding (CAVLC) and context-based adaptive binary arithmetic coding (CABAC).
  • the encoding parameter is a parameter necessary for decoding, such as prediction information indicating a prediction method, information on quantization transform coefficients, information on quantization, and the like.
  • the encoding control unit 113 has an internal memory (not shown), the encoding parameter is held in the internal memory, and the encoding parameter of the adjacent already encoded pixel block is used when encoding the pixel block.
  • the prediction information of the pixel block can be derived from the prediction information of the encoded adjacent block.
  • the encoding unit 112 outputs the generated encoded data according to an appropriate output timing managed by the encoding control unit 113.
  • the output encoded data is, for example, multiplexed with various information by a multiplexing unit (not shown), temporarily stored in an output buffer (not shown) or the like, and then output to a storage system (storage medium) or a transmission system (communication line).
  • FIG. 12 is a flowchart illustrating an example of a flow of a reference image group feature quantity derivation process performed by the reference image feature quantity derivation unit 108 of the first embodiment.
  • when the slice type of the encoding target image is B-slice, two reference lists can be used, so the reference image feature amount deriving unit 108 sets PRED_TYPE to 1; when the slice type is P-slice, only one reference list can be used, so it sets PRED_TYPE to 0. Note that the reference image feature amount deriving unit 108 can determine whether the slice type of the encoding target image is B-slice or P-slice by referring to the variable slice_type.
  • the reference image feature quantity deriving unit 108 initializes the list number X to 0 (step S101), and initializes the reference image number Y to 0 (step S102).
  • next, the average value calculation unit 401 calculates the pixel average value DCLX[Y] according to Equation (10), and the error value calculation unit 402 calculates the pixel error value ACLX[Y] according to Equation (11) (step S103).
  • the integration unit 403 integrates the list number X, the reference image number Y, the pixel average value DCLX [Y], and the pixel error value ACLX [Y] into the reference image feature amount, and a reference image group feature amount table. Update. Then, the reference image feature quantity deriving unit 108 increments the reference image number Y (step S104).
  • next, the reference image feature amount deriving unit 108 determines whether or not the incremented reference image number Y is larger than num_ref_active_lx_minus1 (step S105); if it is not larger (No in step S105), the process returns to step S103.
  • if it is larger (Yes in step S105), the reference image feature amount deriving unit 108 determines that the calculation of the pixel average value and the pixel error value of all the reference images in the reference list of list number X is completed, and increments the list number X (step S106).
  • next, the reference image feature amount deriving unit 108 determines whether or not the incremented list number X is larger than PRED_TYPE (step S107); if it is not larger (No in step S107), the process returns to step S102.
  • if it is larger (Yes in step S107), the reference image feature amount deriving unit 108 determines that all the reference lists have been processed, outputs the reference image group feature amount, and ends the process.
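  • the loop structure of FIG. 12 can be sketched in C as below; compute_and_integrate is a hypothetical stand-in for steps S103 and S104 (computing DCLX[Y] and ACLX[Y] and adding them to the reference image group feature table), and PRED_TYPE is 1 for B-slice and 0 for P-slice as described.

```c
typedef void (*FeatureFn)(int list_x, int ref_y);   /* hypothetical hook for S103/S104 */

static void derive_ref_group_features(int pred_type,                   /* PRED_TYPE */
                                      const int num_ref_active_lx_minus1[2],
                                      FeatureFn compute_and_integrate)
{
    for (int x = 0; x <= pred_type; x++)                       /* steps S101, S106, S107 */
        for (int y = 0; y <= num_ref_active_lx_minus1[x]; y++) /* steps S102, S104, S105 */
            compute_and_integrate(x, y);                       /* step S103              */
}
```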
  • Note that reference images associated with a list number and a reference image number may be input collectively in a batch.
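As a rough illustration of the flow in FIG. 12, the C sketch below loops over every (list number X, reference image number Y) pair and fills one feature entry per reference image. The exact forms of Equations (10) and (11) are not reproduced in this excerpt, so the sketch assumes DC is the mean pixel value and AC is the mean absolute deviation from that mean; the type and function names are illustrative, and a single linear index is used instead of the (X, Y) addressing of the reference image group feature quantity table.

```c
/* Hypothetical per-reference entry mirroring the table of FIG. 8. */
typedef struct {
    int list;     /* list number X               */
    int ref_idx;  /* reference image number Y    */
    int dc;       /* pixel average value DCLX[Y] */
    int ac;       /* pixel error value  ACLX[Y]  */
} RefFeature;

/* Assumed forms of Equations (10) and (11): DC is the mean pixel value and
 * AC is the mean absolute difference from that mean, both integer-rounded. */
static void derive_ref_feature(const unsigned char *pix, int num_pixels,
                               int list, int ref_idx, RefFeature *out)
{
    long long sum = 0, err = 0;
    for (int i = 0; i < num_pixels; i++)
        sum += pix[i];
    int dc = (int)(sum / num_pixels);                 /* Equation (10), assumed */
    for (int i = 0; i < num_pixels; i++)
        err += (pix[i] >= dc) ? pix[i] - dc : dc - pix[i];
    out->list = list;
    out->ref_idx = ref_idx;
    out->dc = dc;
    out->ac = (int)(err / num_pixels);                /* Equation (11), assumed */
}

/* Loops of FIG. 12 (steps S101-S107): one entry per (list X, reference Y). */
static int derive_ref_group_features(const unsigned char *const *ref_pix,
                                     int num_pixels, int pred_type,
                                     const int *num_ref_active_lx_minus1,
                                     RefFeature *table)
{
    int n = 0;
    for (int x = 0; x <= pred_type; x++)                         /* reference list loop  */
        for (int y = 0; y <= num_ref_active_lx_minus1[x]; y++) { /* reference image loop */
            derive_ref_feature(ref_pix[n], num_pixels, x, y, &table[n]);
            n++;
        }
    return n;  /* number of reference image feature entries produced */
}
```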
  • FIG. 13 is a flowchart illustrating an example of a flow of a predicted image feature quantity derivation process performed by the predicted image feature quantity derivation unit 109 of the first embodiment.
  • the predicted image feature quantity deriving unit 109 sets PRED_TYPE to 1 when the slice type of the encoding target image is B-slice, and sets PRED_TYPE to 0 when the slice type is P-slice. Note that the predicted image feature quantity deriving unit 109 can determine whether the slice type of the encoding target image is B-slice or P-slice by referring to the variable slice_type managed by the encoding control unit 113.
  • the predicted image feature quantity deriving unit 109 initializes the list number X to 0 (step S201) and initializes the reference image number Y to 0 (step S202).
  • Next, the feature quantity control unit 411 derives the POC from the list number X and the reference image number Y using the RefPicOrderCnt function of Equation (12) or Equation (13) according to the slice type of the encoding target image, and stores the derived POC in the refPOC array. Then, the feature quantity control unit 411 increments the reference image number Y (step S203).
  • The feature quantity control unit 411 determines whether or not the incremented reference image number Y is larger than num_ref_active_lx_minus1 (step S204); if it is not larger (No in step S204), the process returns to step S203.
  • If it is larger (Yes in step S204), the feature quantity control unit 411 determines that the POC derivation of all the reference images in the reference list of the list number X has been completed, and increments the list number X (step S205).
  • The feature quantity control unit 411 then determines whether or not the incremented list number X is larger than PRED_TYPE (step S206); if it is not larger (No in step S206), the process returns to step S202.
  • If it is larger (Yes in step S206), the feature quantity control unit 411 determines that all the reference lists have been processed and, using the SortRefPOC function of Equation (12) or Equation (13) according to the slice type of the encoding target image, sets to POC1 the POC having the smallest absolute distance (absolute difference) from the POC of the encoding target image indicated by curPOC among the POCs stored in refPOC. Then, the feature quantity control unit 411 outputs the first reference image feature quantity, which is the reference image feature quantity of the reference image of POC1, to the memory 412, and deletes that POC from the refPOC array (step S207).
  • Next, the feature quantity control unit 411 again uses the SortRefPOC function of Equation (12) or Equation (13) according to the slice type of the encoding target image, sets to POC2 the POC having the smallest absolute distance (absolute difference) from the POC of the encoding target image indicated by curPOC among the POCs stored in refPOC, and deletes that POC from the refPOC array (step S208).
  • In step S209, the feature quantity control unit 411 determines whether POC2 needs to be reselected; if so (Yes in step S209), it executes Equation (14) to update POC2 (step S210) and returns to step S209.
  • Otherwise (No in step S209), the feature quantity control unit 411 outputs the second reference image feature quantity, which is the reference image feature quantity of the reference image of POC2, to the predicted image feature quantity calculation unit 413.
  • Next, the predicted image feature quantity deriving unit 109 derives the temporal distance ratio tb between POC1 and curPOC using Equation (16) (step S211).
  • the predicted image feature quantity deriving unit 109 derives the time distance ratio td between POC1 and POC2 using Expression (18) (step S212).
  • the predicted image feature quantity deriving unit 109 derives DistScaleFactor used for distance scaling from the time distance ratio tb and the time distance ratio td using Expression (15) (step S213).
  • the predicted image feature quantity deriving unit 109 determines whether or not the derived DistScaleFactor satisfies any of the formulas (23) to (25) that are conditions for the exception process (step S214).
  • If any of the conditions is satisfied (Yes in step S214), the predicted image feature quantity deriving unit 109 sets the predicted image feature quantity derivation flag wp_avaiable_flag to false (step S215) and, accordingly, sets the pixel average value DCP and the pixel error value ACP of the predicted image to their initial values (step S216).
  • If none of the conditions is satisfied (No in step S214), the predicted image feature quantity deriving unit 109 sets wp_avaiable_flag to true (step S217).
  • Then, the predicted image feature quantity deriving unit 109 derives the pixel average value DCP of the predicted image from DistScaleFactor, the pixel average value DC1 of the first reference image feature quantity, and the pixel average value DC2 of the second reference image feature quantity using Equation (19), and derives the pixel error value ACP of the predicted image from DistScaleFactor, the pixel error value AC1 of the first reference image feature quantity, and the pixel error value AC2 of the second reference image feature quantity using Equation (20).
  • the predicted image feature quantity deriving unit 109 outputs the pixel average value DCP, the pixel error value ACP, and the predicted image feature quantity derivation flag wp_avaiable_flag of the derived predicted image as the predicted image feature quantity (step S218).
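A compact C sketch of steps S203 through S218 is shown below for illustration. Equations (15) to (20) and the exception conditions of Equations (23) to (25) are not reproduced in this excerpt, so the fixed-point scale factor, the range check, and the linear inter/extrapolation of DC and AC are stated as assumptions; the struct and function names are illustrative only.

```c
#include <stdbool.h>

typedef struct {
    int dcp;                /* pixel average value DCP of the predicted image   */
    int acp;                /* pixel error value  ACP of the predicted image    */
    bool wp_avaiable_flag;  /* predicted image feature quantity derivation flag */
} PredFeature;

/* Illustrative derivation from the two nearest reference images.  The scale
 * factor, the range check, and the linear inter/extrapolation below are
 * assumptions standing in for Equations (15)-(25). */
static PredFeature derive_pred_feature(int cur_poc,
                                       int poc1, int dc1, int ac1,
                                       int poc2, int dc2, int ac2)
{
    PredFeature pf = { 0, 0, false };     /* DefaultDC = 0, DefaultAC = 0 */
    int tb = cur_poc - poc1;              /* temporal distance to POC1 (Eq. (16), assumed)   */
    int td = poc2 - poc1;                 /* temporal distance POC1-POC2 (Eq. (18), assumed) */
    if (td == 0)
        return pf;                        /* cannot scale: keep the initial values */
    int dist_scale = (256 * tb) / td;     /* DistScaleFactor (Eq. (15), assumed)   */
    if (dist_scale <= -1024 || dist_scale >= 1024)
        return pf;                        /* placeholder for the exception conditions (23)-(25) */
    /* Linear inter/extrapolation along the temporal axis (Eqs. (19), (20), assumed). */
    pf.dcp = dc1 + (dist_scale * (dc2 - dc1)) / 256;
    pf.acp = ac1 + (dist_scale * (ac2 - ac1)) / 256;
    pf.wp_avaiable_flag = true;
    return pf;
}
```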
  • FIG. 14 is a flowchart illustrating an example of a flow of a WP parameter information derivation process performed by the parameter derivation unit 110 of the first embodiment.
  • the parameter deriving unit 110 sets PRED_TYPE to 1 when the slice type of the encoding target image is B-slice, and sets PRED_TYPE to 0 when the slice type is P-slice.
  • the parameter deriving unit 110 can determine whether the slice type of the image to be encoded is B-slice or P-slice by referring to the variable slice_type managed by the encoding control unit 113.
  • First, the parameter deriving unit 110 checks whether or not the predicted image feature quantity derivation flag wp_avaiable_flag is set to false (step S301).
  • When wp_avaiable_flag is set to false (Yes in step S301), the parameter deriving unit 110 sets Log2Denom to 0 using Equation (28) (step S302).
  • Then, the parameter deriving unit 110 sets the weighting coefficients of all reference images to their initial values using Equation (26) (step S303), sets the offsets of all reference images to their initial values using Equation (27) (step S304), and ends the process.
  • On the other hand, when wp_avaiable_flag is set to true (No in step S301), the parameter deriving unit 110 sets Log2Denom to a predetermined value (for example, 7) (step S305).
  • the parameter deriving unit 110 initializes the list number X to 0 (step S306), and initializes the reference image number Y to 0 (step S307).
  • Next, the parameter deriving unit 110 derives the weighting coefficient Weight[X][Y] corresponding to the list number X and the reference image number Y using Equation (29), and derives the offset Offset[X][Y] using Equation (30) (step S308).
  • Next, the parameter deriving unit 110 increments the reference image number Y (step S309). Then, the parameter deriving unit 110 determines whether or not the incremented reference image number Y is larger than num_ref_active_lx_minus1 (step S310); if it is not larger (No in step S310), the process returns to step S308.
  • If it is larger (Yes in step S310), the parameter deriving unit 110 determines that the derivation of the weighting factors and offsets of all the reference images in the reference list of the list number X has been completed, and increments the list number X (step S311).
  • The parameter deriving unit 110 then determines whether or not the incremented list number X is larger than PRED_TYPE (step S312); if it is not larger (No in step S312), the process returns to step S307.
  • If it is larger (Yes in step S312), the parameter deriving unit 110 determines that all the reference lists have been processed, outputs the WP parameter information, and ends the process.
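The per-reference derivation of FIG. 14 can be condensed as in the following sketch. Equations (26) to (30) are not reproduced in this excerpt; the sketch assumes that the initial values are an identity weight and a zero offset, that the weighting factor is the ratio of the predicted-image pixel error value to the reference pixel error value expressed with Log2Denom bits of fixed-point precision, and that the offset compensates the remaining difference of the pixel average values, consistent with the integer form shown later in Equation (36). All names are illustrative.

```c
typedef struct {
    int log2_denom;  /* fixed-point precision of the weighting factor */
    int weight;      /* Weight[X][Y]                                  */
    int offset;      /* Offset[X][Y]                                  */
    int wp_flag;     /* WP application flag WP_flag[X][Y]             */
} WpParam;

/* Illustrative derivation of one (list X, reference Y) entry.  The initial
 * values and the weight/offset formulas stand in for Equations (26)-(30). */
static WpParam derive_wp_param(int wp_avaiable_flag, int log2_denom,
                               int ref_dc, int ref_ac, int dcp, int acp)
{
    WpParam p;
    if (!wp_avaiable_flag || ref_ac == 0) {
        p.log2_denom = 0;                    /* Eq. (28), assumed          */
        p.weight     = 1 << p.log2_denom;    /* identity weight (Eq. (26)) */
        p.offset     = 0;                    /* zero offset (Eq. (27))     */
        p.wp_flag    = 0;
        return p;
    }
    p.log2_denom = log2_denom;                              /* e.g. 7 (step S305) */
    p.weight = (acp << log2_denom) / ref_ac;                /* Eq. (29), assumed  */
    p.offset = dcp - ((p.weight * ref_dc) >> log2_denom);   /* Eq. (30), assumed  */
    p.wp_flag = 1;
    return p;
}
```

In an actual implementation the function would be called for every combination of list number X and reference image number Y, mirroring steps S306 to S312.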
  • FIG. 15 is a diagram illustrating an example of the syntax 500 used by the encoding device 100 according to the first embodiment.
  • a syntax 500 indicates a structure of encoded data generated by the encoding apparatus 100 by encoding an input image (moving image data).
  • A decoding apparatus described later decodes moving image data by referring to the same syntax structure as the syntax 500 and interpreting the syntax of the encoded data.
  • the syntax 500 includes three parts: a high level syntax 501, a slice level syntax 502, and a coding tree level syntax 503.
  • the high level syntax 501 includes syntax information of a layer higher than the slice.
  • a slice refers to a rectangular area or a continuous area included in a frame or a field.
  • the slice level syntax 502 includes information necessary for decoding each slice.
  • the coding tree level syntax 503 includes information necessary for decoding each coding tree (ie, each coding tree block). Each of these parts includes more detailed syntax.
  • the high level syntax 501 includes sequence and picture level syntaxes such as a sequence parameter set syntax 504, a picture parameter set syntax 505, and an adaptation parameter set syntax 506.
  • The slice level syntax 502 includes a slice header syntax 507, a pred weight table syntax 508, a slice data syntax 509, and the like.
  • The pred weight table syntax 508 is called from the slice header syntax 507.
  • the coding tree level syntax 503 includes a coding tree unit syntax 510, a transform unit syntax 511, a prediction unit syntax 512, and the like.
  • the coding tree unit syntax 510 may have a quadtree structure. Specifically, the coding tree unit syntax 510 can be recursively called as a syntax element of the coding tree unit syntax 510. That is, one coding tree block can be subdivided with a quadtree.
  • the coding tree unit syntax 510 includes a transform unit syntax 511.
  • the transform unit syntax 511 is called in each coding tree unit syntax 510 at the extreme end of the quadtree.
  • the transform unit syntax 511 describes information related to inverse orthogonal transformation and quantization.
  • FIG. 16 is a diagram illustrating an example of the picture parameter set syntax 505 according to the first embodiment.
  • weighted_unipred_idc is a syntax element indicating whether the weighted motion compensation prediction of the first embodiment for P-slices is valid or invalid.
  • When weighted_unipred_idc is 0, the weighted motion compensation prediction of the first embodiment in the P-slice is invalid. Accordingly, the WP application flag included in the WP parameter information is always set to 0, and the WP selectors 304 and 305 connect their output terminals to the default motion compensation unit 301.
  • When weighted_unipred_idc is 1, explicit weighted motion compensation prediction (not described in the first embodiment) in the P-slice is valid. Explicit weighted prediction is a mode in which the WP parameter information is explicitly encoded using the pred weight table syntax, and can be realized by the method described in H.264. When weighted_unipred_idc is 2, the implicit weighted motion compensation prediction of the first embodiment in the P-slice is valid.
  • Note that weighted_unipred_idc may be changed to weighted_pred_flag so that the implicit weighted motion compensation prediction of the first embodiment in the P-slice is always prohibited.
  • weighted_bipred_idc is, for example, a syntax element indicating whether the weighted motion compensation prediction of the first embodiment for B-slices is valid or invalid.
  • When weighted_bipred_idc is 0, the weighted motion compensation prediction of the first embodiment in the B-slice is invalid. Accordingly, the WP application flag included in the WP parameter information is always set to 0, and the WP selectors 304 and 305 connect their output terminals to the default motion compensation unit 301.
  • When weighted_bipred_idc is 1, explicit weighted motion compensation prediction (not described in the first embodiment) in the B-slice is valid.
  • Explicit weighted prediction is a mode in which the WP parameter information is explicitly encoded using the pred weight table syntax, and can be realized by the method described in H.264. When weighted_bipred_idc is 2, the implicit weighted motion compensation prediction of the first embodiment in the B-slice is valid.
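To summarize the semantics described above for both syntax elements, the small sketch below maps an idc value to the prediction mode that is enabled; the enum and function names are illustrative and are not part of any standard syntax.

```c
/* Illustrative mapping from the idc value to the enabled prediction mode. */
typedef enum {
    WP_OFF      = 0,  /* weighted prediction disabled (default motion compensation) */
    WP_EXPLICIT = 1,  /* WP parameters explicitly coded via the pred weight table   */
    WP_IMPLICIT = 2   /* WP parameters implicitly derived as in this embodiment     */
} WpMode;

static WpMode wp_mode_from_idc(int idc)
{
    switch (idc) {
    case 1:  return WP_EXPLICIT;
    case 2:  return WP_IMPLICIT;
    default: return WP_OFF;   /* idc == 0: the WP application flag stays 0 */
    }
}
```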
  • FIG. 17 is a diagram illustrating an example of the sequence parameter set syntax 504 according to the first embodiment.
  • profile_idc is an identifier indicating information on a profile of encoded data.
  • level_idc is an identifier indicating information on the level of encoded data.
  • seq_parameter_set_id is an identifier indicating which sequence parameter set syntax 504 is to be referred to.
  • max_num_ref_frames is a variable indicating the maximum number of reference images in a frame.
  • the implicit_weighted_unipred_enabled_flag is a syntax element indicating, for example, whether the P-slice implicit weighted motion compensation prediction is valid or invalid with respect to the encoded data.
  • the implicit_weighted_bipred_enabled_flag is a syntax element indicating, for example, whether the B-slice implicit weighted motion compensation prediction is valid or invalid with respect to the encoded data.
  • When implicit_weighted_unipred_enabled_flag or implicit_weighted_bipred_enabled_flag is 1, the validity or invalidity of the implicit weighted motion compensation prediction may be defined for each local region in the syntax of a lower layer (picture parameter set, slice header, coding tree block, transform unit, prediction unit, etc.).
  • Also, when implicit_weighted_unipred_enabled_flag or implicit_weighted_bipred_enabled_flag is 1, the image feature quantity of the encoded image may be derived and stored in the DPB together with the information of the reference image.
  • In this way, the encoded slice is stored in the DPB as a reference image together with its image feature quantity, so the process of calculating the image feature quantity does not have to be repeated every time the slice is referred to when another slice is encoded, and the processing amount can be reduced.
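The caching idea can be pictured as in the following sketch, in which the pixel average value and the pixel error value are computed once when the reconstructed slice is placed in the DPB and are then read back whenever the entry is used as a reference. The DpbEntry layout and the field and function names are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative DPB entry that carries the reference image feature quantity
 * alongside the reconstructed picture. */
typedef struct {
    const uint8_t *pixels;  /* reconstructed picture              */
    int poc;                /* display order of the picture       */
    bool feature_valid;     /* true once dc/ac have been computed */
    int dc;                 /* cached pixel average value         */
    int ac;                 /* cached pixel error value           */
} DpbEntry;

/* Compute the features once, at the moment the picture enters the DPB. */
static void store_reference(DpbEntry *e, const uint8_t *pix,
                            int num_pixels, int poc)
{
    long long sum = 0, err = 0;
    for (int i = 0; i < num_pixels; i++)
        sum += pix[i];
    int dc = (int)(sum / num_pixels);
    for (int i = 0; i < num_pixels; i++)
        err += (pix[i] >= dc) ? pix[i] - dc : dc - pix[i];
    e->pixels = pix;
    e->poc = poc;
    e->dc = dc;
    e->ac = (int)(err / num_pixels);
    e->feature_valid = true;   /* later slices read e->dc / e->ac directly */
}
```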
  • As described above, in the first embodiment, the reference image feature quantity deriving unit 108 derives the pixel average value and the pixel error value of each reference image as the reference image feature quantity, and the predicted image feature quantity deriving unit 109 derives the pixel average value and the pixel error value of the predicted image as the predicted image feature quantity.
  • The parameter deriving unit 110 then derives, as the WP parameter information, the weighting factor using the pixel error value of the reference image and the pixel error value of the predicted image, and derives the offset using the derived weighting factor, the pixel average value of the reference image, and the pixel average value of the predicted image. Therefore, according to the first embodiment, implicit weighted prediction can be performed in consideration of not only the weighting coefficient but also the offset, so the prediction error can be reduced by using the predicted image generated by this prediction, the code amount can be reduced, and the coding efficiency can be improved.
  • Further, in the first embodiment, linear interpolation prediction or linear extrapolation prediction is performed based on the pixel average values and pixel error values derived for two reference images and the pixel value change between the reference images, whereby the pixel average value and the pixel error value of the predicted image are derived, and the changes in the pixel values between the reference images and the predicted image are predicted from these values. Therefore, according to the first embodiment, a weighting factor and an offset can be effectively predicted for video having temporally continuous fade and dissolve effects, thereby reducing the prediction error, improving the encoding efficiency, and also improving the subjective image quality.
  • Furthermore, when performing weighted motion compensation prediction, the encoding device 100 of the first embodiment uses an implicit mode that does not explicitly encode the weight parameters between two reference images, so the code amount can be reduced and the coding efficiency can be improved.
  • syntax elements not defined in the first embodiment may be inserted between the rows of the syntax tables illustrated in FIGS. 16 to 17, or descriptions regarding other conditional branches may be included.
  • the syntax table may be divided into a plurality of tables, or a plurality of syntax tables may be integrated.
  • the term of each illustrated syntax element can be changed arbitrarily.
  • When the reference image feature quantity deriving unit 108 calculates the pixel average value and the pixel error value of the reference image according to Equations (10) and (11), the calculated pixel average value and pixel error value are fractional values, and an error in the fractional part occurs when they are rounded to integer precision.
  • Therefore, the reference image feature quantity deriving unit 108 may instead calculate the pixel average value and the pixel error value of the reference image using Equation (34) and Equation (35).
  • the parameter deriving unit 110 calculates the offset by the mathematical formula (36).
  • Offset[X][Y] = (((curDC << Log2Denom) - Weight[X][Y] * (DC[X][Y] << LeftShft)) / N + RealOfst) >> RealLog2Denom   ... (36)
  • In Equation (36), division by the image size N is added to Equation (30). For example, by defining the internal calculation accuracy for each image size, the division by N can be removed and absorbed into the RealLog2Denom term. In this way, the rounding error caused by the division can be reduced.
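One way the reconstructed Equation (36) could be evaluated in integer arithmetic is sketched below. Since the definitions of LeftShft, RealOfst, and RealLog2Denom are not reproduced in this excerpt, the relations used here (RealLog2Denom = Log2Denom + LeftShft and RealOfst as the usual rounding offset) are assumptions made only for illustration.

```c
/* Integer evaluation of the reconstructed Equation (36).  The relations for
 * RealLog2Denom and RealOfst below are illustrative assumptions. */
static int offset_eq36(long long cur_dc,      /* curDC        */
                       long long ref_dc,      /* DC[X][Y]     */
                       int weight,            /* Weight[X][Y] */
                       int log2_denom,        /* Log2Denom    */
                       int left_shft,         /* LeftShft     */
                       long long n)           /* image size N */
{
    int real_log2_denom = log2_denom + left_shft;                  /* assumption */
    long long real_ofst =
        real_log2_denom > 0 ? 1LL << (real_log2_denom - 1) : 0;    /* assumption */
    long long num = (cur_dc << log2_denom)
                  - (long long)weight * (ref_dc << left_shft);
    return (int)((num / n + real_ofst) >> real_log2_denom);
}
```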
  • When the reference image feature quantity deriving unit 108 (the average value calculation unit 401 and the error value calculation unit 402) calculates the pixel average value and the pixel error value of the reference image over the entire image using Equations (34) and (35), the calculation accuracy required increases with the resolution of the image. For example, assuming that the pixel range is a 10-bit signal, the worst-case accumulated value for an image having 4096 × 2160 pixels (about 2 to the 23rd power pixels) reaches 2 to the 33rd power (23 + 10), which exceeds the 32-bit arithmetic range.
  • Therefore, the reference image feature quantity deriving unit 108 may execute Equations (34) and (35) in predetermined processing units and quantize the results by predetermined values, so that the calculation accuracy for each processing unit is kept constant. An arbitrary unit such as a slice unit, a line unit, or a pixel block unit can be set as the processing unit.
  • For example, when the processing is performed in coding tree unit units, the reference image feature quantity deriving unit 108 executes Equation (34) and Equation (35) in the 64 × 64 pixel units (2 to the 12th power pixels) illustrated in FIG. 3A, and quantizes the pixel average values and the pixel error values using Equation (37) and Equation (38), respectively.
  • Shft5 is calculated by Equation (39), and Ofst5 is calculated by Equation (40).
  • the value of Shft5 is set to 4 in Expression (39).
  • the parameter deriving unit 110 calculates an offset using the formula (30) and calculates RealLog2Denom using the formula (41).
  • As described above, the division that occurs depending on the image size or block size can be realized by a right shift, and the processing can be performed with 32-bit arithmetic regardless of the resolution. This makes it possible to reduce the hardware scale.
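The block-wise accumulation described above can be sketched as follows: the per-64×64 sums are right-shifted by Shft5, with rounding offset Ofst5, before being added to the running totals so that the totals stay within 32-bit range regardless of resolution. Equations (37) to (40) are not reproduced in this excerpt; Shft5 = 4 and Ofst5 = 1 << (Shft5 - 1) are assumed here, and the function name is illustrative.

```c
#include <stdint.h>

enum { CTU = 64, SHFT5 = 4, OFST5 = 1 << (SHFT5 - 1) };  /* assumed values */

/* Accumulate the quantized per-CTU sums of Equations (34)/(35) (assumed forms). */
static void accumulate_ctu_features(const uint8_t *pix, int stride,
                                    int32_t *dc_acc, int32_t *ac_acc)
{
    int64_t sum = 0, err = 0;
    for (int y = 0; y < CTU; y++)
        for (int x = 0; x < CTU; x++)
            sum += pix[y * stride + x];
    int mean = (int)(sum / (CTU * CTU));
    for (int y = 0; y < CTU; y++)
        for (int x = 0; x < CTU; x++) {
            int d = pix[y * stride + x] - mean;
            err += (d >= 0) ? d : -d;
        }
    /* Quantize the block sums before accumulating (assumed Eqs. (37), (38)). */
    *dc_acc += (int32_t)((sum + OFST5) >> SHFT5);
    *ac_acc += (int32_t)((err + OFST5) >> SHFT5);
}
```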
  • Note that when the pixel average value and the pixel error value are calculated for each processing area, it is not necessary to calculate them using all the pixels of the image.
  • For example, the reference image feature quantity deriving unit 108 may calculate the pixel average value and the pixel error value by sub-sampling, such as skipping every other pixel block, or by simple sub-sampling within the pixel block in line units or slice units.
  • Further, the parameter deriving unit 110 can derive WP parameter information for each processing region. For example, when the reference image feature quantity deriving unit 108 calculates the pixel average value and the pixel error value in coding tree unit units, the reference image feature quantity is derived in the 64 × 64 pixel units shown in FIG. 3A. Since the predicted image feature quantity deriving unit 109 also derives the predicted image feature quantity in units of 64 × 64 pixels shown in FIG. 3A, the parameter deriving unit 110 can derive WP parameter information in units of coding tree units.
  • The parameter deriving unit 110 performs the weighted motion compensation prediction of Equation (7) or Equation (9) using the WP parameter information derived for each processing region, thereby realizing implicit weighted motion compensation prediction for each pixel block. For example, when the temporal pixel value change differs for each region of the video and the change is temporally continuous, the code amount can be reduced by deriving WP parameter information for each pixel block as described above and performing implicit weighted motion compensation prediction.
  • FIG. 18 is a block diagram illustrating an example of the configuration of the decoding device 800 according to the second embodiment.
  • the decoding apparatus 800 decodes encoded data stored in an input buffer (not shown) into a decoded image and outputs the decoded image to an output buffer (not shown) as an output image.
  • the encoded data is output from, for example, the encoding device 100 of FIG. 1 and the like, and is input to the decoding device 800 via a storage system, a transmission system, or a buffer (not shown).
  • The decoding device 800 includes a decoding unit 801, an inverse quantization unit 802, an inverse orthogonal transform unit 803, an addition unit 804, a predicted image generation unit 805, a reference image feature quantity deriving unit 806, a predicted image feature quantity deriving unit 807, and a parameter deriving unit 808.
  • The inverse quantization unit 802, the inverse orthogonal transform unit 803, the addition unit 804, the predicted image generation unit 805, the reference image feature quantity deriving unit 806, the predicted image feature quantity deriving unit 807, and the parameter deriving unit 808 are elements substantially the same as or similar to, respectively, the inverse quantization unit 104, the inverse orthogonal transform unit 105, the addition unit 106, the predicted image generation unit 107, the reference image feature quantity deriving unit 108, the predicted image feature quantity deriving unit 109, and the parameter deriving unit 110 of FIG. 1.
  • the decoding control unit 809 shown in FIG. 18 controls the decoding device 800 and can be realized by, for example, a CPU.
  • the decoding unit 801 performs decoding based on the syntax for each frame or field for decoding the encoded data.
  • the decoding unit 801 sequentially entropy-decodes the code string of each syntax, and reproduces the motion information including the prediction mode, the motion vector, the reference image number, and the like, and the encoding parameters of the encoding target block such as the quantization transform coefficient.
  • the encoding parameters include all parameters necessary for decoding such as information on transform coefficients and information on quantization.
  • The decoding unit 801 has a function of performing decoding processing, such as variable length decoding processing and arithmetic decoding processing, on the input encoded data.
  • For example, H.264 uses context-adaptive variable-length coding (CAVLC: Context-based Adaptive Variable Length Coding) and context-adaptive binary arithmetic coding (CABAC: Context-based Adaptive Binary Arithmetic Coding). These processes are also called decoding processes.
  • the decoding unit 801 outputs motion information, a quantized transform coefficient, and the like, inputs the quantized transform coefficient to the inverse quantization unit 802, and inputs the motion information to the predicted image generation unit 805.
  • the inverse quantization unit 802 performs an inverse quantization process on the quantized transform coefficient input from the decoding unit 801 to obtain a restored transform coefficient. Specifically, the inverse quantization unit 802 performs inverse quantization according to the quantization information used in the decoding unit 801. More specifically, the inverse quantization unit 802 multiplies the quantized transform coefficient by the quantization step size derived from the quantization information to obtain a restored transform coefficient. The inverse quantization unit 802 outputs the restored transform coefficient and inputs it to the inverse orthogonal transform unit 803.
  • the inverse orthogonal transform unit 803 performs inverse orthogonal transform corresponding to the orthogonal transform performed on the encoding side on the reconstructed transform coefficient input from the inverse quantization unit 802 to obtain a reconstructed prediction error.
  • the inverse orthogonal transform unit 803 outputs the restoration prediction error and inputs it to the addition unit 804.
  • the addition unit 804 adds the restored prediction error input from the inverse orthogonal transform unit 803 and the corresponding prediction image, and generates a decoded image.
  • the adding unit 804 outputs the decoded image and inputs the decoded image to the predicted image generation unit 805.
  • the adding unit 804 outputs the decoded image to the outside as an output image.
  • The output image is then temporarily stored in an external output buffer (not shown) or the like and, for example, output to a display device system or a video device system (not shown) in accordance with the output timing managed by the decoding control unit 809.
  • the predicted image generation unit 805 generates a predicted image using the motion information input from the decoding unit 801, the WP parameter information input from the parameter derivation unit 808, and the decoded image input from the addition unit 804.
  • The predicted image generation unit 805 includes a multi-frame motion compensation unit 201, a memory 202, a unidirectional motion compensation unit 203, a prediction parameter control unit 204, a reference image selector 205, a frame memory 206, and a reference image control unit 207.
  • the frame memory 206 stores the decoded image input from the adding unit 804 as a reference image under the control of the reference image control unit 207.
  • the frame memory 206 has a plurality of memory sets FM0 to FMN (N ⁇ 1) for temporarily storing reference images.
  • the prediction parameter control unit 204 prepares a plurality of combinations of reference image numbers and prediction parameters as a table based on the motion information input from the decoding unit 801.
  • the motion information indicates a motion vector indicating a shift amount of motion used in motion compensation prediction, a reference image number, information on a prediction mode such as unidirectional / bidirectional prediction, and the like.
  • the prediction parameter refers to information regarding a motion vector and a prediction mode. Then, the prediction parameter control unit 204 selects and outputs a combination of a reference image number and a prediction parameter used for generating a prediction image based on the motion information, inputs the reference image number to the reference image selector 205, and outputs the prediction parameter. Is input to the unidirectional motion compensation unit 203.
  • The reference image selector 205 is a switch for switching which output end of the frame memories FM0 to FMN included in the frame memory 206 is connected, according to the reference image number input from the prediction parameter control unit 204. For example, if the reference image number is 0, the reference image selector 205 connects the output end of FM0 to the output end of the reference image selector 205, and if the reference image number is N, it connects the output end of FMN to the output end of the reference image selector 205. The reference image selector 205 outputs the reference image stored in the frame memory to which its output terminal is connected among the frame memories FM0 to FMN included in the frame memory 206, and inputs it to the unidirectional motion compensation unit 203 and the reference image feature quantity deriving unit 806.
  • the unidirectional motion compensation unit 203 performs a motion compensation prediction process according to the prediction parameter input from the prediction parameter control unit 204 and the reference image input from the reference image selector 205, and generates a unidirectional prediction image. Since motion compensation prediction has already been described with reference to FIG. 5, description thereof will be omitted.
  • the unidirectional motion compensation unit 203 outputs a unidirectional prediction image and temporarily stores it in the memory 202.
  • The multi-frame motion compensation unit 201 performs weighted prediction using two types of unidirectional prediction images.
  • For this purpose, the first unidirectional prediction image is stored in the memory 202, and the second unidirectional prediction image is output directly to the multi-frame motion compensation unit 201.
  • Here, the unidirectional prediction image stored in the memory 202 is referred to as the first predicted image, and the unidirectional prediction image output directly to the multi-frame motion compensation unit 201 is referred to as the second predicted image.
  • Note that two unidirectional motion compensation units 203 may be prepared so that each generates one of the two unidirectional prediction images. In this case, the unidirectional motion compensation unit 203 may output the first unidirectional prediction image directly to the multi-frame motion compensation unit 201 as the first predicted image.
  • The multi-frame motion compensation unit 201 generates a predicted image by performing weighted prediction using the first predicted image input from the memory 202, the second predicted image input from the unidirectional motion compensation unit 203, and the WP parameter information input from the parameter deriving unit 808.
  • the multi-frame motion compensation unit 201 outputs a prediction image and inputs the prediction image to the addition unit 804.
  • the multi-frame motion compensation unit 201 includes a default motion compensation unit 301, a weighted motion compensation unit 302, a WP parameter control unit 303, and WP selectors 304 and 305.
  • the WP parameter control unit 303 outputs a WP application flag and weight information based on the WP parameter information input from the parameter derivation unit 808, inputs the WP application flag to the WP selectors 304 and 305, and weights the weight information. Input to the motion compensation unit 302.
  • The WP parameter information includes information on the fixed-point precision of the weighting factor, the first WP application flag, the first weighting factor, and the first offset corresponding to the first predicted image, and the second WP application flag, the second weighting factor, and the second offset corresponding to the second predicted image.
  • the WP application flag is a parameter that can be set for each corresponding reference image and signal component, and indicates whether to perform weighted motion compensation prediction.
  • the weight information includes information on the fixed point precision of the weight coefficient, the first weight coefficient, the first offset, the second weight coefficient, and the second offset.
  • the WP parameter information represents the same information as in the first embodiment.
  • The WP parameter control unit 303 separates the WP parameter information into a first WP application flag, a second WP application flag, and weight information, and outputs them: the first WP application flag is input to the WP selector 304, the second WP application flag is input to the WP selector 305, and the weight information is input to the weighted motion compensation unit 302.
  • the WP selectors 304 and 305 switch the connection end of each predicted image based on the WP application flag input from the WP parameter control unit 303.
  • When each WP application flag is 0, the WP selectors 304 and 305 connect their output terminals to the default motion compensation unit 301. Then, the WP selectors 304 and 305 output the first predicted image and the second predicted image and input them to the default motion compensation unit 301.
  • When each WP application flag is 1, the WP selectors 304 and 305 connect their output terminals to the weighted motion compensation unit 302, and output the first predicted image and the second predicted image to the weighted motion compensation unit 302.
  • the default motion compensation unit 301 performs an average value process based on the two unidirectional prediction images (first prediction image and second prediction image) input from the WP selectors 304 and 305 to generate a prediction image. Specifically, when the first WP application flag and the second WP application flag are 0, the default motion compensation unit 301 performs average value processing based on Expression (1).
  • In the case of unidirectional prediction, the default motion compensation unit 301 calculates the final predicted image based on Equation (4) using only the first predicted image.
  • the weighted motion compensation unit 302 is based on the two unidirectional prediction images (first prediction image and second prediction image) input from the WP selectors 304 and 305 and the weight information input from the WP parameter control unit 303. Performs weighted motion compensation. Specifically, when the first WP application flag and the second WP application flag are 1, the weighted motion compensation unit 302 performs the weighting process based on Expression (7).
  • Note that the weighted motion compensation unit 302 realizes the rounding process by controlling logWD C, which is the fixed-point precision, as in Equation (8) when the calculation accuracies of the first predicted image, the second predicted image, and the predicted image are different.
  • In the case of unidirectional prediction, the weighted motion compensation unit 302 calculates the final predicted image based on Equation (9) using only the first predicted image.
  • In this case as well, when the calculation accuracies of the first predicted image, the second predicted image, and the predicted image are different, the weighted motion compensation unit 302 realizes the rounding process by controlling logWD C, which is the fixed-point precision, as in Equation (8), in the same manner as in the case of bidirectional prediction.
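For reference, a minimal sketch of the two bi-prediction paths is given below. Equations (1), (7), and (8) are not reproduced in this excerpt, so the rounded average, the weighted sum with fixed-point precision logWD, and the offset handling follow the usual H.264-style forms and should be read as assumptions; 8-bit samples are used for simplicity and the function names are illustrative.

```c
#include <stdint.h>

static uint8_t clip_pixel(int v) { return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v)); }

/* Default motion compensation (both WP application flags are 0):
 * rounded average of the two unidirectional predictions (assumed Eq. (1)). */
static uint8_t default_bipred(int p0, int p1)
{
    return clip_pixel((p0 + p1 + 1) >> 1);
}

/* Weighted motion compensation (both WP application flags are 1):
 * weighted sum with fixed-point precision log_wd plus offsets (assumed Eq. (7));
 * the added rounding term corresponds to the control of Eq. (8). */
static uint8_t weighted_bipred(int p0, int p1, int w0, int w1,
                               int o0, int o1, int log_wd)
{
    int round = 1 << log_wd;
    int v = ((p0 * w0 + p1 * w1 + round) >> (log_wd + 1)) + ((o0 + o1 + 1) >> 1);
    return clip_pixel(v);
}
```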
  • The reference image feature quantity deriving unit 806, the predicted image feature quantity deriving unit 807, and the parameter deriving unit 808 implicitly derive the WP parameter information corresponding to the predicted image generated by the predicted image generation unit 805, from the reference images input from the predicted image generation unit 805.
  • The reference image feature quantity deriving unit 806 derives the reference image feature quantity of each reference image input from the predicted image generation unit 805, outputs a reference image group feature quantity that summarizes the derived reference image feature quantities, and inputs it to the predicted image feature quantity deriving unit 807 and the parameter deriving unit 808.
  • the reference image feature value deriving unit 806 includes an average value calculating unit 401, an error value calculating unit 402, and an integrating unit 403.
  • The average value calculation unit 401 calculates the pixel average value of the reference image, outputs the calculated pixel average value, and inputs it to the error value calculation unit 402 and the integration unit 403.
  • the average value calculation unit 401 calculates the pixel average value of the reference image using, for example, Equation (10).
  • The error value calculation unit 402 calculates the pixel error value with respect to the pixel average value of the reference image input from the average value calculation unit 401, outputs the calculated pixel error value, and inputs it to the integration unit 403. The error value calculation unit 402 calculates the pixel error value with respect to the pixel average value of the reference image using, for example, Equation (11).
  • The integration unit 403 integrates, into a reference image feature quantity, the list number and the reference image number of the reference image, the pixel average value of the reference image input from the average value calculation unit 401, and the pixel error value with respect to the pixel average value of the reference image input from the error value calculation unit 402.
  • The integration unit 403 collects the integrated reference image feature quantities into a table as illustrated in FIG. 8, outputs the table as the reference image group feature quantity, and inputs it to the predicted image feature quantity deriving unit 807 and the parameter deriving unit 808.
  • The predicted image feature quantity deriving unit 807 derives a predicted image feature quantity based on the reference image group feature quantity input from the reference image feature quantity deriving unit 806, outputs the derived predicted image feature quantity, and inputs it to the parameter deriving unit 808.
  • the predicted image feature amount includes a pixel average value of the predicted image, a pixel error value obtained by averaging errors from the pixel average value, and a predicted image feature amount derivation flag indicating whether the predicted image feature amount has been derived.
  • the predicted image feature value deriving unit 807 includes a feature value control unit 411, a memory 412, and a predicted image feature value calculating unit 413.
  • The feature quantity control unit 411 selects and outputs two reference image feature quantities used for deriving the predicted image feature quantity from the reference image group feature quantity input from the reference image feature quantity deriving unit 806, inputs (loads) one of them into the memory 412, and inputs the other to the predicted image feature quantity calculation unit 413.
  • Specifically, the feature quantity control unit 411 derives, from the list number and the reference image number of each reference image feature quantity collected in the reference image group feature quantity, the POC (Picture Order Count) indicating the display order of the reference image of that reference image feature quantity.
  • The reference list and the reference image number are information for indirectly specifying the reference image, whereas the POC is information for directly specifying the reference image and corresponds to the absolute position of the reference image.
  • the feature amount control unit 411 selects two POCs from the derived POC in the order of the shortest distance from the encoding target image (predicted image), thereby calculating the predicted image feature amount from the reference image group feature amount. Two reference image feature quantities used for derivation are selected.
  • the feature amount control unit 411 selects two POCs (POC1 and POC2) using Equation (12) when the encoded slice is P-slice.
  • the feature quantity control unit 411 selects two POCs (POC1 and POC2) using Expression (13) when the encoded slice is B-slice.
  • the feature amount control unit 411 reselects one POC (POC2) using Expression (14).
  • the feature quantity control unit 411 can select two POCs having different temporal distances from the encoding target image (POC2 having a temporal distance different from that of POC1) from the encoding target image by repeatedly performing Expression (14).
  • the feature quantity control unit 411 selects and outputs a reference image feature quantity corresponding to POC1 from the reference image group feature quantity, and inputs it to the memory 412. Further, when POC2 is selected, the feature amount control unit 411 selects and outputs a reference image feature amount corresponding to POC2 from the reference image group feature amount, and inputs it to the predicted image feature amount calculation unit 413.
  • Although the case where the feature quantity control unit 411 selects two POCs has been described here, three or more POCs may be selected.
  • In this case, the feature quantity control unit 411 selects three or more reference image feature quantities from the reference image group feature quantity, and the predicted image feature quantity is derived from the selected three or more reference image feature quantities. Note that, in the case of P-slice, it is necessary that N ≥ 2, and the feature quantity control unit 411 may search the reference list with the list number 0 after executing Equation (12).
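The selection of the two temporally nearest, mutually distinct POCs can be illustrated as follows. Since Equations (12) to (14) are not reproduced in this excerpt, this is only an equivalent sketch with illustrative names, not the SortRefPOC function itself.

```c
#include <stdlib.h>

/* Pick POC1 as the reference POC closest to the current image, then POC2 as
 * the closest remaining POC whose value differs from POC1 (illustrative
 * equivalent of the behavior described for Equations (12)-(14)). */
static void select_two_pocs(const int *ref_poc, int num_refs, int cur_poc,
                            int *poc1, int *poc2)
{
    int best = 0, second = -1;
    for (int i = 1; i < num_refs; i++)
        if (abs(ref_poc[i] - cur_poc) < abs(ref_poc[best] - cur_poc))
            best = i;
    for (int i = 0; i < num_refs; i++) {
        if (ref_poc[i] == ref_poc[best])
            continue;                        /* keep POC2 distinct from POC1 */
        if (second < 0 ||
            abs(ref_poc[i] - cur_poc) < abs(ref_poc[second] - cur_poc))
            second = i;
    }
    *poc1 = ref_poc[best];
    *poc2 = (second >= 0) ? ref_poc[second] : ref_poc[best];
}
```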
  • The memory 412 holds the reference image feature quantity input from the feature quantity control unit 411 (the reference image feature quantity corresponding to POC1, hereinafter referred to as the first reference image feature quantity). Then, when the reference image feature quantity corresponding to POC2 (hereinafter referred to as the second reference image feature quantity) is output from the feature quantity control unit 411 to the predicted image feature quantity calculation unit 413, the memory 412 outputs the held first reference image feature quantity and inputs it to the predicted image feature quantity calculation unit 413.
  • The predicted image feature quantity calculation unit 413 calculates a predicted image feature quantity using the first reference image feature quantity input from the memory 412 and the second reference image feature quantity input from the feature quantity control unit 411, outputs the calculated predicted image feature quantity, and inputs it to the parameter deriving unit 808.
  • the predicted image feature amount calculation unit 413 first calculates the temporal distance between the reference image of the first reference image feature amount and the reference image of the second reference image feature amount according to Equation (15).
  • the predicted image feature value calculation unit 413 calculates the pixel average value of the predicted image according to Equation (19), and calculates the pixel error value of the predicted image according to Equation (20).
  • When DistScaleFactor satisfies any of the conditions of Equations (23) to (25), the predicted image feature quantity calculation unit 413 determines that the predicted image feature quantity cannot be derived or that the temporal distance is too far, and sets the pixel average value DCP of the predicted image and the pixel error value ACP of the predicted image to their initial values.
  • In this case, the predicted image feature quantity calculation unit 413 sets the predicted image feature quantity derivation flag wp_avaiable_flag, which is an internal variable, to false; if DistScaleFactor does not satisfy any of the conditions of Equations (23) to (25), wp_avaiable_flag is set to true.
  • When the predicted image feature quantity derivation flag wp_avaiable_flag is set to false, initial values are set in DCP and ACP; for example, DefaultDC indicating 0 is set in DCP, and DefaultAC indicating 0 is set in ACP.
  • When the predicted image feature quantity derivation flag is set to true, the values calculated by Equation (19) and Equation (20) are set in DCP and ACP, respectively.
  • the parameter deriving unit 808 uses the reference image group feature value input from the reference image feature value deriving unit 806 and the predicted image feature value input from the predicted image feature value deriving unit 807 to use the WP parameter of the encoding target image. Deriving information.
  • The parameter deriving unit 808 first checks the predicted image feature quantity derivation flag wp_avaiable_flag included in the predicted image feature quantity to determine whether WP parameter information can be derived. When wp_avaiable_flag is set to false, the WP parameter information cannot be derived, so the parameter deriving unit 808 sets the weighting coefficient and the offset corresponding to the list number X and the reference image number Y to their initial values according to Equation (26) and Equation (27).
  • When wp_avaiable_flag is set to false, the parameter deriving unit 808 repeatedly executes Equation (26) and Equation (27) for all combinations of list numbers X and reference image numbers Y (all reference images), thereby setting the initial values in the WP parameter information.
  • the parameter deriving unit 808 sets the WP application flag (WP_flag [X] [Y]) corresponding to the list number X and the reference image number Y to false when wp_avaiable_flag is set to false.
  • On the other hand, when wp_avaiable_flag is set to true, the parameter deriving unit 808 derives the weighting coefficient and the offset corresponding to the list number X and the reference image number Y according to Equation (29) and Equation (30), respectively.
  • When wp_avaiable_flag is set to true, the parameter deriving unit 808 repeatedly executes Equation (29) and Equation (30) for all combinations of list numbers X and reference image numbers Y (all reference images), thereby deriving the WP parameter information shown in FIGS. 11A and 11B.
  • the parameter deriving unit 808 sets the WP application flag (WP_flag [X] [Y]) corresponding to the list number X and the reference image number Y to true when wp_avaiable_flag is set to true.
  • The decoding unit 801 uses the syntax 500 shown in FIG. 15.
  • a syntax 500 indicates a structure of encoded data to be decoded by the decoding unit 801.
  • the syntax 500 has already been described with reference to FIG.
  • the picture parameter set syntax 505 has already been described with reference to FIG. 16 except that encoding is decoding, and thus description thereof will be omitted.
  • the sequence parameter set syntax 504 has already been described with reference to FIG. 17 except that encoding is decoding, and thus description thereof will be omitted.
  • As described above, when performing weighted motion compensation prediction, the decoding device 800 of the second embodiment uses an implicit mode that does not explicitly encode the weight parameters between two reference images, so the code amount can be reduced and the coding efficiency can be improved.
  • encoding and decoding may be performed in order from the lower right to the upper left, or encoding and decoding may be performed so as to draw a spiral from the center of the screen toward the screen end.
  • encoding and decoding may be performed in order from the upper right to the lower left, or encoding and decoding may be performed so as to draw a spiral from the screen end toward the center of the screen.
  • Since the position of the adjacent pixel block that can be referred to changes depending on the encoding order, the position may be changed to an appropriately usable position.
  • The prediction target blocks do not have to have a uniform block shape.
  • the prediction target block size may be a 16 ⁇ 8 pixel block, an 8 ⁇ 16 pixel block, an 8 ⁇ 4 pixel block, a 4 ⁇ 8 pixel block, or the like.
  • the code amount for encoding or decoding the division information increases as the number of divisions increases. Therefore, it is desirable to select the block size in consideration of the balance between the code amount of the division information and the quality of the locally decoded image or the decoded image.
  • In the above embodiments, the color signal components have been described without distinguishing the prediction processes for the luminance signal and the color difference signal; however, the same or different prediction methods may be used for them.
  • If different prediction methods are used for the luminance signal and the color difference signal, the prediction method selected for the color difference signal can be encoded or decoded in the same manner as for the luminance signal.
  • Similarly, when the weighted motion compensation prediction process differs between the luminance signal and the color difference signal, the same or a different weighted motion compensation prediction process may be used. If a different weighted motion compensation prediction process is used for the color difference signal, the process selected for the color difference signal can be encoded or decoded in the same manner as for the luminance signal.
  • Syntax elements not specified in the present embodiment can be inserted between the rows of the tables shown in the syntax configuration, and descriptions regarding other conditional branches may also be included.
  • the syntax table can be divided and integrated into a plurality of tables. Moreover, it is not always necessary to use the same term, and it may be arbitrarily changed depending on the form to be used.
  • Modifications 1 to 4 of the first embodiment may be applied to the second embodiment.
  • As described above, each of the above embodiments solves the problem that an offset for additively correcting the pixel value cannot be used when performing implicit weighted motion compensation prediction, and realizes highly efficient implicit weighted motion compensation prediction processing. Therefore, according to each of the above embodiments, the encoding efficiency is improved, and the subjective image quality is also improved.
  • For example, the program that realizes the processing of each of the above embodiments can be stored in a computer-readable storage medium such as a magnetic disk, an optical disc (CD-ROM, CD-R, DVD, etc.), a magneto-optical disc (MO, etc.), or a semiconductor memory. The storage format may be any form.
  • the program for realizing the processing of each of the above embodiments may be stored on a computer (server) connected to a network such as the Internet and downloaded to the computer (client) via the network.
  • DESCRIPTION OF SYMBOLS
  100 Encoding device
  101 Subtraction unit
  102 Orthogonal transformation unit
  103 Quantization unit
  104 Inverse quantization unit
  105 Inverse orthogonal transformation unit
  106 Addition unit
  107 Predicted image generation unit
  108 Reference image feature quantity deriving unit
  109 Predicted image feature quantity deriving unit
  110 Parameter deriving unit
  111 Motion evaluation unit
  112 Encoding unit
  113 Encoding control unit
  201 Multi-frame motion compensation unit
  202 Memory
  203 Unidirectional motion compensation unit
  204 Prediction parameter control unit
  205 Reference image selector
  206 Frame memory
  207 Reference image control unit
  301 Default motion compensation unit
  302 Weighted motion compensation unit
  303 WP parameter control unit
  304, 305 WP selector
  401 Average value calculation unit
  402 Error value calculation unit
  403 Integration unit
  411 Feature quantity control unit
  412 Memory
  413 Predicted image feature quantity calculation unit
  800 Decoding device
  801 Decoding unit
  802 Inverse quantization unit
  803 Inverse orthogonal transform unit
  804 Addition unit
  805 Predicted image generation unit
  806 Reference image feature quantity deriving unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The prediction image generation method of an embodiment comprises first to third derivation steps and a prediction image generation step. In the first derivation step, a pixel average value and a pixel error value of each of two or more reference images are derived. In the second derivation step, a pixel average value of a prediction image is derived using the time-distance ratio between at least two reference images and the prediction image and the pixel average values of the at least two reference images, and the pixel error value of the prediction image is derived using the time-distance ratio and the pixel error values of the at least two reference images. In the third derivation step, the weight coefficients of the reference images are derived using the pixel error values of the reference images and the pixel error value of the prediction image, and the offsets of the reference images are derived using the derived weight coefficients, the pixel average values of the reference images, and the pixel average value of the prediction image. In the prediction image generation step, using a reference image of one target block obtained by dividing an input image into a plurality of blocks, the weight coefficients, and the offsets, the prediction image of the target block is generated.

Description

予測画像生成方法、符号化方法及び復号方法Predictive image generation method, encoding method, and decoding method
 本発明の実施形態は、予測画像生成方法、符号化方法及び復号方法に関する。 Embodiments of the present invention relate to a predicted image generation method, an encoding method, and a decoding method.
 近年、符号化効率を大幅に向上させた画像符号化方法が、ITU-T(International Telecommunication Union Telecommunication Standardization Sector)とISO(International Organization for Standardization)/IEC(International Electrotechnical Commission)との共同で、ITU-T REC. H.264及びISO/IEC 14496-10(以下、「H.264」という)として勧告されている。 In recent years, image coding methods that have greatly improved coding efficiency have been jointly developed by ITU-T (International Telecommunication Union Telecommunication Standardization Sector) and ISO (International Organization for Standardization) / IEC (International Electrotechnical Commission). T REC. H. H.264 and ISO / IEC 14496-10 (hereinafter referred to as “H.264”).
 H.264には、符号化済みの画像を参照画像に用いて分数精度の動き補償予測を行うことにより、時間方向の冗長性を削除し、高い符号化効率を実現するインター予測符号化方式が開示されている。 H. H.264 discloses an inter-prediction coding scheme that eliminates temporal redundancy and realizes high coding efficiency by performing fractional accuracy motion compensated prediction using a coded image as a reference image. ing.
 また、ISO/IEC MPEG(Moving Picture Experts Group)-1,2,4におけるインター予測符号化方式よりも、フェードやディゾルブ効果を含む動画像を高効率に符号化する暗黙的動き補償予測方式も提案されている。この方式では、時間方向における画素値変化を予測する枠組みとして、輝度と2つの色差とを有する入力動画像に対して分数精度の動き補償予測を行う。そして、参照画像と輝度及び2つの色差毎の重み係数とを、参照画像間の時間的距離比によって暗黙的に導出し、予測画像に重み係数を乗じる。 Also proposed is an implicit motion compensation prediction method that encodes moving images including fade and dissolve effects more efficiently than the inter prediction encoding methods in ISO / IEC MPEG (Moving Picture Experts Group) -1, 2, 4 Has been. In this method, as a framework for predicting a change in pixel value in the time direction, motion compensation prediction with fractional accuracy is performed on an input moving image having luminance and two color differences. Then, the reference image, the luminance and the weighting factor for each of the two color differences are implicitly derived by the temporal distance ratio between the reference images, and the prediction image is multiplied by the weighting factor.
特開2004-179687号公報JP 2004-179687 A
 しかしながら、上述したような従来技術では、参照画像の時間的距離比によって暗黙的に導出した重み係数を動き補償予測後の画素に乗じることで重み付き動き補償予測を実現していたが、画像平均値のズレ分を補正するオフセット項が考慮されないため、符号化効率の低下を招いていた。本発明が解決しようとする課題は、符号化効率を向上できる予測画像生成方法、符号化方法及び復号方法を提供することである。 However, in the related art as described above, weighted motion compensation prediction is realized by multiplying a pixel after motion compensation prediction by a weighting factor implicitly derived by the temporal distance ratio of the reference image. Since the offset term for correcting the deviation of the value is not taken into account, the encoding efficiency is reduced. The problem to be solved by the present invention is to provide a prediction image generation method, an encoding method, and a decoding method capable of improving the encoding efficiency.
 実施形態の予測画像生成方法は、第1導出ステップと、第2導出ステップと、第3導出ステップと、予測画像生成ステップと、を含む。第1導出ステップでは、2以上の参照画像それぞれの画素平均値及び当該画素平均値との画素の差分を示す画素誤差値を導出する。第2導出ステップでは、前記2以上の参照画像のうちの少なくとも2つの参照画像と予測画像との時間距離比及び前記少なくとも2つの参照画像の前記画素平均値を用いて前記予測画像の画素平均値を導出するとともに、前記時間距離比及び前記少なくとも2つの参照画像の前記画素誤差値を用いて前記予測画像の画素誤差値を導出する。第3導出ステップでは、前記参照画像の前記画素誤差値と前記予測画像の前記画素誤差値とを用いて前記参照画像の重み係数を導出するとともに、導出した前記重み係数と前記参照画像の前記画素平均値と前記予測画像の前記画素平均値とを用いて前記参照画像のオフセットを導出する。予測画像生成ステップでは、前記参照画像のうち入力画像を複数のブロックに分割した1つの対象ブロックの参照画像、当該参照画像の前記重み係数、及び当該参照画像の前記オフセットを用いて、前記対象ブロックの前記予測画像を生成する。 The predicted image generation method of the embodiment includes a first derivation step, a second derivation step, a third derivation step, and a predicted image generation step. In the first derivation step, a pixel error value indicating a pixel average value of each of two or more reference images and a pixel difference from the pixel average value is derived. In the second derivation step, a pixel average value of the predicted image using a temporal distance ratio between at least two reference images and the predicted image of the two or more reference images and the pixel average value of the at least two reference images. And the pixel error value of the predicted image is derived using the temporal distance ratio and the pixel error value of the at least two reference images. In the third derivation step, the weight coefficient of the reference image is derived using the pixel error value of the reference image and the pixel error value of the predicted image, and the derived weight coefficient and the pixel of the reference image The offset of the reference image is derived using the average value and the pixel average value of the predicted image. In the predicted image generation step, using the reference image of one target block obtained by dividing the input image of the reference image into a plurality of blocks, the weighting coefficient of the reference image, and the offset of the reference image, the target block The predicted image is generated.
Brief description of the drawings:
A block diagram showing an example of the encoding device of the first embodiment.
An explanatory diagram showing an example of the predictive coding order of pixel blocks in the first embodiment.
A diagram showing an example of the block size of a coding tree block in the first embodiment.
Diagrams showing specific examples of the coding tree block of the first embodiment (three figures).
A block diagram showing an example of the predicted image generation unit of the first embodiment.
A diagram showing an example of the relationship of motion vectors in motion-compensated prediction for bidirectional prediction in the first embodiment.
A block diagram showing an example of the multi-frame motion compensation unit of the first embodiment.
An explanatory diagram of an example of the fixed-point precision of the weighting factor in the first embodiment.
A diagram showing an example of the reference image group feature amount of the first embodiment.
A block diagram showing a configuration example of the reference image feature amount derivation unit of the first embodiment.
A block diagram showing a configuration example of the predicted image feature amount derivation unit of the first embodiment.
Diagrams showing examples of the WP parameter information of the first embodiment (two figures).
A flowchart showing an example of the reference image group feature amount derivation processing of the first embodiment.
A flowchart showing an example of the predicted image feature amount derivation processing of the first embodiment.
A flowchart showing an example of the WP parameter information derivation processing of the first embodiment.
A diagram showing an example of the syntax of the first embodiment.
A diagram showing an example of the picture parameter set syntax of the first embodiment.
A diagram showing an example of the sequence parameter set syntax of the first embodiment.
A block diagram showing a configuration example of the decoding device of the second embodiment.
Hereinafter, embodiments will be described in detail with reference to the accompanying drawings. The encoding device and the decoding device of each of the following embodiments can be implemented in hardware such as an LSI (Large-Scale Integration) chip, a DSP (Digital Signal Processor), or an FPGA (Field Programmable Gate Array). The encoding device and the decoding device of each of the following embodiments can also be realized in software by causing a computer to execute a program. In the following description, the term "image" can be read, as appropriate, as "video", "pixel", "image signal", "picture", or "image data".
(First Embodiment)
In the first embodiment, an encoding device that encodes a moving image is described.
FIG. 1 is a block diagram showing an example of the configuration of the encoding device 100 of the first embodiment.
The encoding device 100 divides each frame or field constituting the input image into a plurality of pixel blocks and performs predictive coding on the divided pixel blocks using the encoding parameters input from the encoding control unit 113, thereby generating a predicted image. The encoding device 100 then subtracts the predicted image from the input image divided into the plurality of pixel blocks to generate a prediction error, applies an orthogonal transform and quantization to the generated prediction error, further performs entropy coding, and generates and outputs encoded data.
The encoding device 100 performs predictive coding by selectively applying a plurality of prediction modes that differ in at least one of the pixel-block size and the predicted image generation method. The predicted image generation methods are broadly classified into two types: intra prediction, which performs prediction within the frame being encoded, and inter prediction, which performs motion-compensated prediction using one or more temporally different reference frames. Intra prediction is also called intra-picture prediction or intra-frame prediction, and inter prediction is also called inter-picture prediction, inter-frame prediction, or motion-compensated prediction.
FIG. 2 is an explanatory diagram showing an example of the predictive coding order of pixel blocks in the first embodiment. In the example shown in FIG. 2, the encoding device 100 performs predictive coding from the upper left toward the lower right of the pixel blocks, so in the frame f being encoded the already-encoded pixel blocks p lie to the left of and above the pixel block c being encoded. In the following, for simplicity, the encoding device 100 is assumed to perform predictive coding in the order shown in FIG. 2, but the predictive coding order is not limited to this.
A pixel block denotes a unit for processing an image and corresponds to, for example, an M×N block (M and N are natural numbers), a coding tree block, a macroblock, a sub-block, or a single pixel. In the following description, a pixel block is basically used in the sense of a coding tree block, but it may be used in other senses as well. For example, in the description of a prediction unit, a pixel block is used in the sense of the pixel block of the prediction unit. A block may also be referred to by a name such as a unit; for example, a coding block may be called a coding unit.
FIG. 3A is a diagram showing an example of the block size of a coding tree block in the first embodiment. A coding tree block is typically a 64×64 pixel block as shown in FIG. 3A. However, it is not limited to this, and it may be a 32×32 pixel block, a 16×16 pixel block, an 8×8 pixel block, a 4×4 pixel block, or the like. The coding tree block need not be square; for example, it may be a pixel block of size M×N (M≠N).
FIGS. 3B to 3D are diagrams showing specific examples of the coding tree block of the first embodiment. FIG. 3B shows a coding tree block whose block size is 64×64 (N=32). N represents the size of the reference coding tree block; the size when divided is defined as N, and the size when not divided is defined as 2N. FIG. 3C shows a coding tree block obtained by quadtree partitioning of the coding tree block in FIG. 3B. As shown in FIG. 3C, the coding tree block has a quadtree structure. When the coding tree block is divided, the four pixel blocks after division are numbered in Z-scan order, as shown in FIG. 3C.
A coding tree block can be further quadtree-partitioned within one quadtree index, which allows coding tree blocks to be divided hierarchically. In this case, the depth of the division is defined by Depth. FIG. 3D shows one of the coding tree blocks obtained by quadtree partitioning of the coding tree block in FIG. 3B; its block size is 32×32 (N=16). The Depth of the coding tree block shown in FIG. 3B is 0, and the Depth of the coding tree block shown in FIG. 3D is 1. The coding tree block with the largest unit is called a large coding tree block, and the input image signal is encoded in raster-scan order in units of large coding tree blocks.
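As a small illustration of the hierarchical division described above, the following sketch (an assumption for illustration only; the helper name is not taken from this description) computes the block size at a given Depth, starting from a 64×64 large coding tree block as in FIG. 3B (Depth = 0) and FIG. 3D (Depth = 1).

#include <cstdio>

// Each quadtree split halves the width and height of the coding tree block.
int ctbSizeAtDepth(int largeCtbSize, int depth) {
    return largeCtbSize >> depth;
}

int main() {
    const int largeCtb = 64;                   // large coding tree block: 64x64
    for (int depth = 0; depth <= 4; ++depth) { // 64, 32, 16, 8, 4
        int s = ctbSizeAtDepth(largeCtb, depth);
        std::printf("Depth %d -> %dx%d\n", depth, s, s);
    }
    return 0;
}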
In the following description, the block to be encoded or the coding tree block of the input image may also be referred to as a prediction target block or a prediction pixel block. The coding unit is not limited to a pixel block; at least one of a frame, a field, a slice, a line, and a pixel can also be used.
As shown in FIG. 1, the encoding device 100 includes a subtraction unit 101, an orthogonal transform unit 102, a quantization unit 103, an inverse quantization unit 104, an inverse orthogonal transform unit 105, an addition unit 106, a predicted image generation unit 107, a reference image feature amount derivation unit 108, a predicted image feature amount derivation unit 109, a parameter derivation unit 110, a motion evaluation unit 111, and an encoding unit 112. The encoding control unit 113 shown in FIG. 1 controls the encoding device 100 and can be realized by, for example, a CPU (Central Processing Unit).
The subtraction unit 101 subtracts the corresponding predicted image from the input image divided into pixel blocks to obtain a prediction error. The subtraction unit 101 outputs the prediction error and inputs it to the orthogonal transform unit 102.
The orthogonal transform unit 102 applies an orthogonal transform such as a discrete cosine transform (DCT) or a discrete sine transform (DST) to the prediction error input from the subtraction unit 101 to obtain transform coefficients. The orthogonal transform unit 102 outputs the transform coefficients and inputs them to the quantization unit 103.
The quantization unit 103 quantizes the transform coefficients input from the orthogonal transform unit 102 to obtain quantized transform coefficients. Specifically, the quantization unit 103 performs quantization in accordance with quantization information such as a quantization parameter and a quantization matrix specified by the encoding control unit 113. More specifically, the quantization unit 103 divides the transform coefficients by the quantization step size derived from the quantization information to obtain the quantized transform coefficients. The quantization parameter indicates the granularity of quantization, and the quantization matrix is used to weight that granularity for each component of the transform coefficients. The quantization unit 103 outputs the quantized transform coefficients and inputs them to the inverse quantization unit 104 and the encoding unit 112.
The inverse quantization unit 104 inverse-quantizes the quantized transform coefficients input from the quantization unit 103 to obtain restored transform coefficients. Specifically, the inverse quantization unit 104 performs inverse quantization in accordance with the quantization information used by the quantization unit 103. More specifically, the inverse quantization unit 104 multiplies the quantized transform coefficients by the quantization step size derived from the quantization information to obtain the restored transform coefficients. The quantization information used by the quantization unit 103 is loaded from an internal memory (not shown) of the encoding control unit 113. The inverse quantization unit 104 outputs the restored transform coefficients and inputs them to the inverse orthogonal transform unit 105.
The inverse orthogonal transform unit 105 applies an inverse orthogonal transform such as an inverse discrete cosine transform (IDCT) or an inverse discrete sine transform (IDST) to the restored transform coefficients input from the inverse quantization unit 104 to obtain a restored prediction error. The inverse orthogonal transform performed by the inverse orthogonal transform unit 105 corresponds to the orthogonal transform performed by the orthogonal transform unit 102. The inverse orthogonal transform unit 105 outputs the restored prediction error and inputs it to the addition unit 106.
The addition unit 106 adds the restored prediction error input from the inverse orthogonal transform unit 105 and the corresponding predicted image to generate a locally decoded image. The addition unit 106 outputs the locally decoded image and inputs it to the predicted image generation unit 107.
The predicted image generation unit 107 stores the locally decoded image input from the addition unit 106 in a memory (not shown in FIG. 1) as a reference image, and outputs the reference image stored in the memory to the reference image feature amount derivation unit 108 and the motion evaluation unit 111. The predicted image generation unit 107 also performs weighted motion-compensated prediction based on the WP parameter information input from the parameter derivation unit 110 and the motion information input from the motion evaluation unit 111, and generates a predicted image. The predicted image generation unit 107 outputs the predicted image and inputs it to the subtraction unit 101 and the addition unit 106.
FIG. 4 is a block diagram showing an example of the configuration of the predicted image generation unit 107 of the first embodiment. As shown in FIG. 4, the predicted image generation unit 107 includes a multi-frame motion compensation unit 201, a memory 202, a unidirectional motion compensation unit 203, a prediction parameter control unit 204, a reference image selector 205, a frame memory 206, and a reference image control unit 207.
The frame memory 206 stores the locally decoded image input from the addition unit 106 as a reference image under the control of the reference image control unit 207. The frame memory 206 has a plurality of memory sets FM0 to FMN (N≥1) for temporarily holding reference images.
The prediction parameter control unit 204 prepares, as a table, a plurality of combinations of a reference image number and prediction parameters based on the motion information input from the motion evaluation unit 111. Here, the motion information refers to a motion vector indicating the amount of motion displacement used in motion-compensated prediction, the reference image number, and information on the prediction mode such as unidirectional/bidirectional prediction. The prediction parameters refer to information on the motion vector and the prediction mode. Based on the input image, the prediction parameter control unit 204 selects and outputs the combination of the reference image number of the reference image used for generating the predicted image and the prediction parameters, inputs the reference image number to the reference image selector 205, and inputs the prediction parameters to the unidirectional motion compensation unit 203.
The reference image selector 205 is a switch that selects which output terminal of the frame memories FM0 to FMN in the frame memory 206 to connect, according to the reference image number input from the prediction parameter control unit 204. For example, if the reference image number is 0, the reference image selector 205 connects the output terminal of FM0 to its own output terminal, and if the reference image number is N, it connects the output terminal of FMN to its own output terminal. The reference image selector 205 outputs the reference image stored in the frame memory whose output terminal is connected, among the frame memories FM0 to FMN, and inputs it to the unidirectional motion compensation unit 203, the reference image feature amount derivation unit 108, and the motion evaluation unit 111.
The unidirectional motion compensation unit 203 performs motion-compensated prediction processing in accordance with the prediction parameters input from the prediction parameter control unit 204 and the reference image input from the reference image selector 205, and generates a unidirectional predicted image.
FIG. 5 is a diagram showing an example of the relationship of motion vectors in motion-compensated prediction for bidirectional prediction in the first embodiment. In motion-compensated prediction, interpolation processing is performed using a reference image, and a unidirectional predicted image is generated based on the amount of motion displacement, from the pixel block at the position being encoded, between the generated interpolated image and the input image. Here, the displacement amount is a motion vector. As shown in FIG. 5, in a bidirectional prediction slice (B-slice), a predicted image is generated using two kinds of reference images and a set of motion vectors. As the interpolation processing, half-pel interpolation, quarter-pel interpolation, or the like is used, and the values of the interpolated image are generated by filtering the reference image. For example, in H.264, which can interpolate the luminance signal up to quarter-pel accuracy, the displacement amount is expressed at four times integer-pel accuracy.
The unidirectional motion compensation unit 203 outputs the unidirectional predicted image and temporarily stores it in the memory 202. When the motion information (prediction parameters) indicates bidirectional prediction, the multi-frame motion compensation unit 201 performs weighted prediction using two unidirectional predicted images; therefore, the unidirectional motion compensation unit 203 stores the unidirectional predicted image corresponding to the first one in the memory 202 and outputs the unidirectional predicted image corresponding to the second one directly to the multi-frame motion compensation unit 201. Here, the unidirectional predicted image corresponding to the first one is referred to as the first predicted image, and the unidirectional predicted image corresponding to the second one is referred to as the second predicted image.
Two unidirectional motion compensation units 203 may be provided so that the two unidirectional predicted images are generated by the respective units. In this case, when the motion information (prediction parameters) indicates unidirectional prediction, the unidirectional motion compensation unit 203 may output the first unidirectional predicted image directly to the multi-frame motion compensation unit 201 as the first predicted image.
The multi-frame motion compensation unit 201 performs weighted prediction using the first predicted image input from the memory 202, the second predicted image input from the unidirectional motion compensation unit 203, and the WP parameter information input from the motion evaluation unit 111, and generates a predicted image. The multi-frame motion compensation unit 201 outputs the predicted image and inputs it to the subtraction unit 101 and the addition unit 106.
FIG. 6 is a block diagram showing an example of the configuration of the multi-frame motion compensation unit 201 of the first embodiment. As shown in FIG. 6, the multi-frame motion compensation unit 201 includes a default motion compensation unit 301, a weighted motion compensation unit 302, a WP parameter control unit 303, and WP selectors 304 and 305.
The WP parameter control unit 303 outputs a WP application flag and weight information based on the WP parameter information input from the parameter derivation unit 110, inputs the WP application flag to the WP selectors 304 and 305, and inputs the weight information to the weighted motion compensation unit 302.
Here, the WP parameter information includes the fixed-point precision of the weighting factors; the first WP application flag, first weighting factor, and first offset corresponding to the first predicted image; and the second WP application flag, second weighting factor, and second offset corresponding to the second predicted image. The WP application flag is a parameter that can be set for each corresponding reference image and signal component, and indicates whether weighted motion-compensated prediction is performed. The weight information includes the fixed-point precision of the weighting factors, the first weighting factor, the first offset, the second weighting factor, and the second offset.
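As a minimal sketch of how the WP parameter information listed above could be organized in memory (the struct and field names are assumptions for illustration and are not taken from this description), one possible layout is:

struct WpEntry {
    bool wpApplyFlag;   // WP application flag: whether weighted motion-compensated prediction is used
    int  weight;        // weighting factor (fixed-point)
    int  offset;        // offset
};

struct WpParamInfo {
    int     logWdC;     // fixed-point precision of the weighting factors
    WpEntry first;      // first WP application flag, first weighting factor, first offset
    WpEntry second;     // second WP application flag, second weighting factor, second offset
};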
Specifically, when the WP parameter information is input from the parameter derivation unit 110, the WP parameter control unit 303 separates the WP parameter information into the first WP application flag, the second WP application flag, and the weight information and outputs them, inputting the first WP application flag to the WP selector 304, the second WP application flag to the WP selector 305, and the weight information to the weighted motion compensation unit 302.
The WP selectors 304 and 305 switch the connection destinations of their respective predicted images based on the WP application flags input from the WP parameter control unit 303. When the respective WP application flag is 0, the WP selectors 304 and 305 connect their output terminals to the default motion compensation unit 301 and output the first predicted image and the second predicted image to the default motion compensation unit 301. When the respective WP application flag is 1, the WP selectors 304 and 305 connect their output terminals to the weighted motion compensation unit 302 and output the first predicted image and the second predicted image to the weighted motion compensation unit 302.
The default motion compensation unit 301 performs average-value processing based on the two unidirectional predicted images (the first predicted image and the second predicted image) input from the WP selectors 304 and 305, and generates a predicted image. Specifically, when the first WP application flag and the second WP application flag are 0, the default motion compensation unit 301 performs average-value processing based on Equation (1).
P[x,y] = Clip1((PL0[x,y] + PL1[x,y] + offset2) >> (shift2))   …(1)
Here, P[x,y] is the predicted image, PL0[x,y] is the first predicted image, and PL1[x,y] is the second predicted image. offset2 and shift2 are parameters for the rounding in the average-value processing and are determined by the internal arithmetic precision of the first and second predicted images. With the bit precision of the predicted image being L and the bit precision of the first and second predicted images being M (L≤M), shift2 is given by Equation (2) and offset2 by Equation (3).
shift2 = (M - L + 1)   …(2)
offset2 = (1 << (shift2 - 1))   …(3)
For example, when the bit precision of the predicted image is 8 and the bit precision of the first and second predicted images is 14, shift2 = 7 from Equation (2) and offset2 = (1 << 6) from Equation (3).
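A minimal sketch of the average-value processing of Equations (1) to (3), using the example values above (output bit precision L = 8, internal bit precision M = 14), might look as follows; the function names and the clip1 helper are illustrative assumptions, not part of this description.

#include <algorithm>

// Clip1: clip to the valid range of the 8-bit predicted image.
static int clip1(int v) { return std::min(std::max(v, 0), 255); }

// Equation (1): default bidirectional prediction by rounding average.
int defaultBiPrediction(int pl0, int pl1) {   // pl0, pl1: 14-bit intermediate predictions
    const int L = 8, M = 14;
    const int shift2  = M - L + 1;            // Equation (2): 7
    const int offset2 = 1 << (shift2 - 1);    // Equation (3): 1 << 6
    return clip1((pl0 + pl1 + offset2) >> shift2);
}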
When the prediction mode indicated by the motion information (prediction parameters) is unidirectional prediction, the default motion compensation unit 301 calculates the final predicted image based on Equation (4) using only the first predicted image.
P[x,y] = Clip1((PLX[x,y] + offset1) >> (shift1))   …(4)
Here, PLX[x,y] denotes the unidirectional predicted image (the first predicted image), and X is an identifier indicating the list number of the reference list, taking the value 0 or 1. For example, PL0[x,y] is used when the list number is 0, and PL1[x,y] when the list number is 1. offset1 and shift1 are rounding parameters and are determined by the internal arithmetic precision of the first predicted image. With the bit precision of the predicted image being L and the bit precision of the first predicted image being M (L≤M), shift1 is given by Equation (5) and offset1 by Equation (6).
shift1 = (M - L)   …(5)
offset1 = (1 << (shift1 - 1))   …(6)
For example, when the bit precision of the predicted image is 8 and the bit precision of the first predicted image is 14, shift1 = 6 from Equation (5) and offset1 = (1 << 5) from Equation (6).
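The unidirectional case of Equations (4) to (6) differs only in the rounding parameters; a corresponding sketch, reusing the illustrative clip1 helper from the previous fragment, is:

// Equation (4): default unidirectional prediction, with L = 8 and M = 14.
int defaultUniPrediction(int plx) {           // plx: 14-bit intermediate prediction PLX[x,y]
    const int L = 8, M = 14;
    const int shift1  = M - L;                // Equation (5): 6
    const int offset1 = 1 << (shift1 - 1);    // Equation (6): 1 << 5
    return clip1((plx + offset1) >> shift1);
}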
The weighted motion compensation unit 302 performs weighted motion compensation based on the two unidirectional predicted images (the first predicted image and the second predicted image) input from the WP selectors 304 and 305 and the weight information input from the WP parameter control unit 303. Specifically, when the first WP application flag and the second WP application flag are 1, the weighted motion compensation unit 302 performs the weighting process based on Equation (7).
P[x,y] = Clip1(((PL0[x,y]*w0C + PL1[x,y]*w1C + (1 << logWDC)) >> (logWDC + 1)) + ((o0C + o1C + 1) >> 1))   …(7)
Here, w0C is the weighting factor corresponding to the first predicted image, w1C is the weighting factor corresponding to the second predicted image, o0C is the offset corresponding to the first predicted image, and o1C is the offset corresponding to the second predicted image. Hereinafter, these are referred to as the first weighting factor, the second weighting factor, the first offset, and the second offset, respectively. logWDC is a parameter indicating the fixed-point precision of the weighting factors. The variable C denotes the signal component; for example, for a YUV signal, the luminance signal is denoted C=Y, the Cr chrominance signal C=Cr, and the Cb chrominance component C=Cb.
When the arithmetic precision of the first and second predicted images differs from that of the predicted image, the weighted motion compensation unit 302 realizes the rounding by controlling logWDC, the fixed-point precision, as in Equation (8).
logWD'C = logWDC + offset1   …(8)
The rounding can be realized by replacing logWDC in Equation (7) with logWD'C of Equation (8). For example, when the bit precision of the predicted image is 8 and the bit precision of the first and second predicted images is 14, resetting logWDC makes it possible to perform a batch rounding process at the same arithmetic precision as shift2 in Equation (1).
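A minimal sketch of the weighted combination of Equation (7) follows (again reusing the illustrative clip1 helper; for simplicity it assumes the intermediate predictions already have the output bit precision, so the precision adjustment of Equation (8) is not applied here).

// Equation (7): weighted bidirectional prediction with per-list weights and offsets.
int weightedBiPrediction(int pl0, int pl1,
                         int w0, int w1, int o0, int o1, int logWD) {
    const int num = pl0 * w0 + pl1 * w1 + (1 << logWD);
    return clip1((num >> (logWD + 1)) + ((o0 + o1 + 1) >> 1));
}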
When the prediction mode indicated by the motion information (prediction parameters) is unidirectional prediction, the weighted motion compensation unit 302 calculates the final predicted image based on Equation (9) using only the first predicted image.
P[x,y] = Clip1((PLX[x,y]*wXC + (1 << (logWDC - 1))) >> (logWDC))   …(9)
Here, PLX[x,y] denotes the unidirectional predicted image (the first predicted image), wXC denotes the weighting factor corresponding to unidirectional prediction, and X is an identifier indicating the list number of the reference list, taking the value 0 or 1. For example, PL0[x,y] and w0C are used when the list number is 0, and PL1[x,y] and w1C when the list number is 1.
When the arithmetic precision of the first and second predicted images differs from that of the predicted image, the weighted motion compensation unit 302 realizes the rounding by controlling logWDC, the fixed-point precision, as in Equation (8), in the same way as in bidirectional prediction.
The rounding can be realized by replacing logWDC in Equation (9) with logWD'C of Equation (8). For example, when the bit precision of the predicted image is 8 and the bit precision of the first predicted image is 14, resetting logWDC makes it possible to perform a batch rounding process at the same arithmetic precision as shift1 in Equation (4).
FIG. 7 is an explanatory diagram of an example of the fixed-point precision of the weighting factor in the first embodiment, and shows an example of a moving image whose pixel values change in the temporal direction and of the change of the average pixel value. In the example shown in FIG. 7, the frame being encoded is Frame(t), the temporally preceding frame is Frame(t-1), and the temporally following frame is Frame(t+1). As shown in FIG. 7, in a fade image changing from white to black, the brightness (gradation value) of the image decreases with time. The weighting factor expresses the degree of this change in FIG. 7 and, as is clear from Equations (7) and (9), takes the value 1.0 when there is no pixel value change. The fixed-point precision is a parameter that controls the step width corresponding to the fractional part of the weighting factor, and the weighting factor when there is no pixel value change is 1 << logWDC.
Here, the relationship between a moving image with temporal pixel value change and the weighting factor, the fixed-point precision of the weighting factor, and the offset is described with reference to FIG. 7. As described above, in a fade image changing from white to black as shown in FIG. 7, the pixel value (gradation value) of the image decreases with time. FIG. 7 shows an example in which the average pixel value of the image decreases with time, and the weighting factor corresponds to the slope of this decrease.
The fixed-point precision of the weighting factor is information indicating the precision of this slope. For example, in FIG. 7, when the weighting factor between Frame(t-1) and Frame(t+1) is 0.75 in fractional terms, 3/4 can be represented exactly if the precision is 1/4.
When the slope does not coincide with a value representable at the predetermined fixed-point precision, it can be adjusted with an offset value indicating a correction amount (deviation) corresponding to the intercept of the linear function. For example, in FIG. 7, when the weighting factor between Frame(t-1) and Frame(t+1) is 0.60 in fractional terms and the fixed-point precision is 1 (1 << 1), the weighting factor is set to, for example, 1 (corresponding to a fractional value of 0.50), because 0.60 cannot be represented at that fixed-point precision. In this case, the fractional value of the weighting factor deviates by 0.10 from the optimal value of 0.60, so the corresponding correction amount can be computed from the maximum pixel value and set as the offset value. When the maximum pixel value is 255, this means that a value such as 25 (255 × 0.1) may be set as the offset value.
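As a small worked sketch of this adjustment (an illustration under the assumptions stated above: true slope 0.60, fixed-point precision 1, maximum pixel value 255), the weighting factor and offset can be derived roughly as follows.

#include <cmath>
#include <cstdio>

int main() {
    const double slope  = 0.60;   // desired pixel-value change (weighting factor in fractional terms)
    const int    logWD  = 1;      // fixed-point precision: step width 1/(1 << 1) = 0.5
    const int    maxPix = 255;    // maximum pixel value

    // Nearest representable weighting factor at this precision: 1, i.e. 0.50.
    const int    weight  = static_cast<int>(std::lround(slope * (1 << logWD)));
    const double applied = static_cast<double>(weight) / (1 << logWD);
    // The residual 0.10 is converted into an offset against the maximum pixel value.
    const int    offset  = static_cast<int>((slope - applied) * maxPix);  // about 25

    std::printf("weight=%d (%.2f), offset=%d\n", weight, applied, offset);
    return 0;
}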
In the case of unidirectional prediction, the various parameters corresponding to the second predicted image (the second WP application flag, second weighting factor, and second offset information) are not used and may therefore be set to predetermined initial values.
Returning to FIG. 1, the reference image feature amount derivation unit 108, the predicted image feature amount derivation unit 109, and the parameter derivation unit 110 implicitly derive, from the reference images input from the predicted image generation unit 107, the WP parameter information corresponding to the predicted image generated by the predicted image generation unit 107.
The reference image feature amount derivation unit 108 derives a reference image feature amount for each reference image input from the predicted image generation unit 107, outputs a reference image group feature amount that collects the derived reference image feature amounts, and inputs it to the predicted image feature amount derivation unit 109 and the parameter derivation unit 110.
FIG. 8 is a diagram showing an example of the reference image group feature amount of the first embodiment. In the example shown in FIG. 8, the reference image group feature amount is a table collecting 2N+2 reference image feature amounts, and each reference image feature amount includes the list number of the reference list of the reference image, the reference image number of the reference image, the pixel average value of the reference image, and a pixel error value indicating the pixel-wise difference from the pixel average value of the reference image.
The list number is an identifier indicating the prediction direction; it takes the value 0 for unidirectional prediction and takes the two values 0 and 1 for bidirectional prediction, since two kinds of prediction can be used. The reference image number is a value corresponding to 0 to N indicated in the frame memory 206. The list numbers and reference image numbers are managed according to the DPB used in H.264 and the like, and are set implicitly in the reference image feature amount derivation unit 108 by the encoding control unit 113 according to the settings inside the predicted image generation unit 107 (for example, which reference image the reference image selector 205 outputs to the reference image feature amount derivation unit 108). The pixel average value and the pixel error value are calculated by the reference image feature amount derivation unit 108.
The table of reference image group feature amounts shown in FIG. 8 is an example; because the configuration of usable reference images differs depending on the coding structure, the table size also differs. For example, with a P-slice coding structure, the reference image numbers of list number 1, which correspond to bidirectional prediction, cannot be used, so the table contains only list number 0. The table size also varies with the number of reference images.
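A minimal sketch of one row of the table in FIG. 8 (the type and field names are assumptions for illustration, not taken from this description) could be:

#include <vector>

// One reference image feature amount (one row of the table in FIG. 8).
struct RefImageFeature {
    int    listNumber;   // reference list number (0 or 1)
    int    refIdx;       // reference image number (0..N)
    double pixelMean;    // pixel average value of the reference image
    double pixelError;   // pixel error value: average difference from that average
};

// Reference image group feature amount: the collection of rows (up to 2N+2).
using RefImageGroupFeature = std::vector<RefImageFeature>;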
FIG. 9 is a block diagram showing an example of the configuration of the reference image feature amount derivation unit 108 of the first embodiment. As shown in FIG. 9, the reference image feature amount derivation unit 108 includes an average value calculation unit 401, an error value calculation unit 402, and an integration unit 403.
Each time a reference image is input from the predicted image generation unit 107, the average value calculation unit 401 calculates the pixel average value of the reference image, outputs the calculated pixel average value, and inputs it to the error value calculation unit 402 and the integration unit 403. The average value calculation unit 401 calculates the pixel average value of the reference image using, for example, Equation (10).
DCLX(t) = (1/n) * Σ(x,y) Yx,y(t)   …(10)
Here, DCLX(t) denotes the pixel average value of the reference image with list number X and reference image number t. Thus, the pixel average value of the reference image with list number 0 and reference image number 1 is written DCL0(1), and the pixel average value of the reference image with list number 1 and reference image number 0 is written DCL1(0). n denotes the number of pixels of the reference image with list number X and reference image number t, and Yx,y(t) denotes the pixel value at coordinates (x,y) of the reference image with list number X and reference image number t.
Each time a reference image is input from the predicted image generation unit 107, the error value calculation unit 402 calculates the pixel error value with respect to the pixel average value of the reference image using the pixel average value input from the average value calculation unit 401, outputs the calculated pixel error value, and inputs it to the integration unit 403. The error value calculation unit 402 calculates the pixel error value with respect to the pixel average value of the reference image using, for example, Equation (11).
ACLX(t) = (1/n) * Σ(x,y) |Yx,y(t) - DCLX(t)|   …(11)
Here, ACLX(t) denotes the pixel error value, that is, the average of the differences (errors) between the pixel value of each pixel of the reference image with list number X and reference image number t and the pixel average value of that reference image.
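A minimal sketch of Equations (10) and (11) follows (an illustrative assumption; in particular, the error is computed as the average of the absolute differences from the pixel average, which is how the description of the pixel error value is read here).

#include <cmath>
#include <cstdint>
#include <vector>

// Pixel average value (Equation (10)) and pixel error value (Equation (11)) of
// one reference image, given its pixel values in raster order (assumed non-empty).
void refImageFeatureValues(const std::vector<uint8_t>& pixels, double& dc, double& ac) {
    const double n = static_cast<double>(pixels.size());
    double sum = 0.0;
    for (uint8_t p : pixels) sum += p;
    dc = sum / n;                              // Equation (10): DCLX(t)

    double err = 0.0;
    for (uint8_t p : pixels) err += std::fabs(p - dc);
    ac = err / n;                              // Equation (11): ACLX(t)
}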
Each time a reference image is input from the predicted image generation unit 107, the integration unit 403 integrates, into a reference image feature amount, the list number and reference image number of the reference image, the pixel average value of the reference image input from the average value calculation unit 401, and the pixel error value with respect to the pixel average value of the reference image input from the error value calculation unit 402. The integration unit 403 then collects the integrated reference image feature amounts into a table as shown in FIG. 8, outputs them as the reference image group feature amount, and inputs it to the predicted image feature amount derivation unit 109 and the parameter derivation unit 110.
The predicted image feature amount derivation unit 109 derives a predicted image feature amount based on the reference image group feature amount input from the reference image feature amount derivation unit 108, outputs the derived predicted image feature amount, and inputs it to the parameter derivation unit 110. The predicted image feature amount includes the pixel average value of the predicted image, a pixel error value obtained by averaging the errors from the pixel average value, and a predicted image feature amount derivation flag indicating whether the predicted image feature amount has been derived.
FIG. 10 is a block diagram showing an example of the configuration of the predicted image feature amount derivation unit 109 of the first embodiment. As shown in FIG. 10, the predicted image feature amount derivation unit 109 includes a feature amount control unit 411, a memory 412, and a predicted image feature amount calculation unit 413.
The feature amount control unit 411 selects and outputs, from the reference image group feature amount input from the reference image feature amount derivation unit 108, two reference image feature amounts used for deriving the predicted image feature amount; one is input (loaded) into the memory 412, and the other is input to the predicted image feature amount calculation unit 413.
Specifically, the feature amount control unit 411 derives, from the list number and reference image number of each reference image feature amount collected in the reference image group feature amount, the POC (Picture Order Count) indicating the display order (image display time) of the reference image of that reference image feature amount. The reference list and reference image number are information that specifies a reference image indirectly, whereas the POC is information that specifies a reference image directly and corresponds to the absolute position of the reference image. The feature amount control unit 411 then selects, from the derived POCs, the two POCs closest in temporal distance to the image to be encoded (the predicted image), thereby selecting from the reference image group feature amount the two reference image feature amounts used for deriving the predicted image feature amount.
When the slice being encoded is a P-slice, the feature amount control unit 411 selects two POCs (POC1 and POC2) using Equation (12).
for (refIdx = 0; refIdx <= num_of_active_ref_l0_minus1; refIdx++) {
  refPOC[refIdx] = RefPicOrderCnt(ListL0, refIdx)
}
POC1 = SortRefPOC(refPOC, curPOC, num_of_active_ref_l0_minus1)
POC2 = SortRefPOC(refPOC, curPOC, num_of_active_ref_l0_minus1 - 1)   …(12)
Here, num_of_active_ref_l0_minus1 is one of the syntax elements and indicates the number of reference images used in the reference list of list number 0 minus 1, that is, N. RefPicOrderCnt is a function that, given the list number of a reference image and the reference image number of that reference image, returns the POC of the reference image; ListL0 indicates list number 0, and refIdx indicates the reference image number. refPOC is an array of the POCs of the reference images whose reference image feature amounts are contained in the reference image group feature amount, and curPOC is the POC of the image being encoded. SortRefPOC computes the absolute difference between the POC indicated by curPOC and each of the POCs stored in refPOC, up to the number given by num_of_active_ref_l0_minus1 or num_of_active_ref_l0_minus1 - 1 plus one. SortRefPOC then returns the POC stored in refPOC whose absolute difference is smallest, deletes that POC from refPOC, and rearranges the POCs remaining in refPOC.
For example, when refPOC holds {0, 1, 2, 3}, curPOC is 4, and num_of_active_ref_l0_minus1 is 3, SortRefPOC(refPOC, curPOC, num_of_active_ref_l0_minus1) returns POC number 3, whose absolute difference of 1 is the smallest, as POC1, deletes 3 from refPOC, and rearranges refPOC to {0, 1, 2}. Next, SortRefPOC(refPOC, curPOC, num_of_active_ref_l0_minus1 - 1) returns POC number 2, whose absolute difference of 2 is the smallest, as POC2, deletes 2 from refPOC, and rearranges refPOC to {0, 1}.
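A minimal C++ sketch of the SortRefPOC behavior described above (an illustrative assumption; the function here is written only from this description) is shown below. Running it with refPOC = {0, 1, 2, 3} and curPOC = 4 reproduces the POC1 = 3, POC2 = 2 example.

#include <cstdlib>
#include <vector>

// Returns the POC among the first (count + 1) entries of refPOC that is closest
// to curPOC, removes it from refPOC, and leaves the remaining entries in order.
int SortRefPOC(std::vector<int>& refPOC, int curPOC, int count) {
    int best = 0;
    for (int i = 1; i <= count && i < static_cast<int>(refPOC.size()); ++i) {
        if (std::abs(refPOC[i] - curPOC) < std::abs(refPOC[best] - curPOC)) best = i;
    }
    const int poc = refPOC[best];
    refPOC.erase(refPOC.begin() + best);   // delete the chosen POC and repack
    return poc;
}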
Since the correlation of the image feature amounts tends to become higher as the temporal distance between the image being encoded and a reference image becomes shorter, selecting, for a P-slice, the two POCs (reference images) in order of increasing temporal distance from the image being encoded improves the prediction accuracy of the predicted image feature amount.
When the slice being encoded is a B-slice, the feature amount control unit 411 selects two POCs (POC1 and POC2) using Equation (13).
for (refIdx = 0; refIdx <= num_of_active_ref_l0_minus1; refIdx++) {
  refPOCL0[refIdx] = RefPicOrderCnt(ListL0, refIdx)
}
for (refIdx = 0; refIdx <= num_of_active_ref_l1_minus1; refIdx++) {
  refPOCL1[refIdx] = RefPicOrderCnt(ListL1, refIdx)
}
POC1 = SortRefPOC(refPOCL0, curPOC, num_of_active_ref_l0_minus1)
POC2 = SortRefPOC(refPOCL1, curPOC, num_of_active_ref_l1_minus1)   …(13)
Here, refPOCL0 is an array of the POCs of the reference images of list number 0 whose reference image feature amounts are contained in the reference image group feature amount. num_of_active_ref_l1_minus1 is one of the syntax elements and indicates the number of reference images used in the reference list of list number 1 minus 1, that is, N. refPOCL1 is an array of the POCs of the reference images of list number 1 whose reference image feature amounts are contained in the reference image group feature amount, and ListL1 indicates list number 1.
Since the correlation of the image feature amounts tends to become higher as the temporal distance between the image being encoded and a reference image becomes shorter, selecting, for a B-slice, the POC (reference image) closest in temporal distance to the image being encoded from each of the two reference lists improves the prediction accuracy of the predicted image feature amount.
When the two selected POCs (POC1 and POC2) are the same, the feature amount control unit 411 reselects one of them (POC2) using Equation (14). By repeatedly applying Equation (14), the feature amount control unit 411 can select two POCs whose temporal distances from the image being encoded differ (a POC2 whose temporal distance from the image being encoded differs from that of POC1).
if (POC1 == POC2){
 POC2 = SortRefPOC(refPOCLX, curPOC, num_of_active_ref_l1_minus1-(M))
}   …(14)
 Here, X in refPOCLX is an identifier indicating a list number; for example, the list-number-1 reference list may be searched after the list-number-0 reference list has been searched. For this reason, when POC2 is reselected, the two POCs (reference images) closest in temporal distance to the encoding target image may end up being selected from the same reference list. M is a value indicating the number of repetitions and is defined, for example, by the number of reference images set for each reference list.
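The selection and reselection flow for a B-slice, corresponding to Equations (13) and (14), might be organized as in the C sketch below; the function name, the fixed array sizes, and the simplification of re-searching only the list-number-1 candidates (whereas Equation (14) may also fall back to the other list via refPOCLX) are assumptions of this sketch. RefPicOrderCnt, ListL0, ListL1, and SortRefPOC are used as defined above.

/* Sketch of B-slice POC1/POC2 selection (Eq. (13)) with reselection when the
   two POCs coincide (Eq. (14), simplified to the list-number-1 candidates). */
static void selectPocsBslice(int curPOC,
                             int num_of_active_ref_l0_minus1,
                             int num_of_active_ref_l1_minus1,
                             int *POC1, int *POC2)
{
    int refPOCL0[16], refPOCL1[16];   /* sizes are illustrative */
    for (int refIdx = 0; refIdx <= num_of_active_ref_l0_minus1; refIdx++)
        refPOCL0[refIdx] = RefPicOrderCnt(ListL0, refIdx);
    for (int refIdx = 0; refIdx <= num_of_active_ref_l1_minus1; refIdx++)
        refPOCL1[refIdx] = RefPicOrderCnt(ListL1, refIdx);
    *POC1 = SortRefPOC(refPOCL0, curPOC, num_of_active_ref_l0_minus1);
    *POC2 = SortRefPOC(refPOCL1, curPOC, num_of_active_ref_l1_minus1);
    for (int M = 1; *POC1 == *POC2 && num_of_active_ref_l1_minus1 - M >= 0; M++)
        *POC2 = SortRefPOC(refPOCL1, curPOC, num_of_active_ref_l1_minus1 - M);
}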
 Then, having selected POC1, the feature quantity control unit 411 selects the reference image feature quantity corresponding to POC1 from the reference image group feature quantity, outputs it, and inputs it to the memory 412. Having selected POC2, the feature quantity control unit 411 selects the reference image feature quantity corresponding to POC2 from the reference image group feature quantity, outputs it, and inputs it to the predicted image feature quantity calculation unit 413.
 In the first embodiment, an example in which the feature quantity control unit 411 selects two POCs has been described, but three or more POCs may be selected. In this case, the feature quantity control unit 411 selects three or more reference image feature quantities from the reference image group feature quantity, and the predicted image feature quantity is derived from the selected three or more reference image feature quantities. Note that, in the case of a P-slice, N ≥ 2 is required, and the feature quantity control unit 411 may search the list-number-0 reference list after executing Equation (12).
 The memory 412 holds the reference image feature quantity input from the feature quantity control unit 411 (the reference image feature quantity corresponding to POC1, hereinafter referred to as the first reference image feature quantity). At the timing when the reference image feature quantity corresponding to POC2 (hereinafter referred to as the second reference image feature quantity) is input from the feature quantity control unit 411 to the predicted image feature quantity calculation unit 413, the memory 412 outputs the first reference image feature quantity and inputs it to the predicted image feature quantity calculation unit 413.
 The predicted image feature quantity calculation unit 413 calculates the predicted image feature quantity using the first reference image feature quantity input from the memory 412 and the second reference image feature quantity input from the feature quantity control unit 411, outputs it, and inputs it to the parameter derivation unit 110.
 The predicted image feature quantity calculation unit 413 first calculates, according to Equation (15), the temporal distance between the reference image of the first reference image feature quantity and the reference image of the second reference image feature quantity.
 DistScaleFactor = Clip3(-1024, 1023, (tb*tx+32)>>6)   …(15)
 Here, Clip3(A, B, C) is a clip function that returns A if the value C falls below the minimum value A, returns B if the value C exceeds the maximum value B, and otherwise returns the value C. tb is calculated by Equation (16), and tx is calculated by Equation (17).
 tb = Clip3(-128, 127, curPOC-POC1)   …(16)
 tx = (16384+abs(td/2))/td   …(17)
 tb indicates the temporal distance between curPOC and POC1. tx is an intermediate variable used to carry out the division tb/td with fixed-point precision. Note that the fixed-point precision is 8 bits; when the value of DistScaleFactor is 128 (the central value), the two reference images contribute equally in Equations (19) and (20). td indicates the temporal distance between POC1 and POC2 and is calculated by Equation (18).
 td = Clip3(-128, 127, POC2-POC1)   …(18)
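The computation of Equations (15) to (18) can be collected into a single helper, sketched below in C; Clip3 follows the definition given above, and the function name deriveDistScaleFactor is an assumption of this sketch. The case POC1 == POC2 (td = 0) must be excluded beforehand, as required by Equation (23).

#include <stdlib.h>

static int Clip3(int a, int b, int c) { return c < a ? a : (c > b ? b : c); }

/* Equations (15)-(18): DistScaleFactor approximates tb/td in 8-bit fixed point. */
static int deriveDistScaleFactor(int curPOC, int POC1, int POC2)
{
    int tb = Clip3(-128, 127, curPOC - POC1);          /* Eq. (16) */
    int td = Clip3(-128, 127, POC2 - POC1);            /* Eq. (18), td != 0 assumed */
    int tx = (16384 + abs(td / 2)) / td;               /* Eq. (17) */
    return Clip3(-1024, 1023, (tb * tx + 32) >> 6);    /* Eq. (15) */
}

For example, with curPOC = 2, POC1 = 1, and POC2 = 3, tb = 1, td = 2, tx = 8192, and DistScaleFactor = 128, so the encoding target image lies midway between the two reference images and Equations (19) and (20) average the two feature quantities equally.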
 Next, the predicted image feature quantity calculation unit 413 calculates the pixel average value of the predicted image according to Equation (19) and the pixel error value of the predicted image according to Equation (20).
 DCP = (DistScaleFactor*DC2+(256-DistScaleFactor)*DC1+Ofst3)>>(Shft3);   …(19)
 ACP = (DistScaleFactor*AC2+(256-DistScaleFactor)*AC1+Ofst3)>>(Shft3);   …(20)
 Here, DCP indicates the pixel average value of the predicted image, and DC1 and DC2 indicate the pixel average values of the first and second reference image feature quantities, respectively. ACP indicates the pixel error value of the predicted image, and AC1 and AC2 indicate the pixel error values of the first and second reference image feature quantities, respectively. The values of Shft3 and Ofst3 are determined according to INTERNAL_PREC, which indicates the internal calculation precision of the predicted image; Shft3 is calculated by Equation (21), and Ofst3 is calculated by Equation (22). In the first embodiment, the fixed-point precision of DistScaleFactor is 8, so when INTERNAL_PREC is 8, DCP and ACP are rounded to integer precision.
 Shft3 = INTERNAL_PREC   …(21)
 Ofst3 = (1<<(Shft3-1))   …(22)
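A minimal C sketch of the interpolation in Equations (19) to (22) is given below; the function name and the packaging of the results into output parameters are assumptions of this sketch, and INTERNAL_PREC is taken to be 8 as in the first embodiment.

/* Equations (19)-(22): predicted-image feature values as a DistScaleFactor-
   weighted mix of the two reference-image feature values. */
#define INTERNAL_PREC 8
static void predictFeature(int DistScaleFactor,
                           int DC1, int AC1, int DC2, int AC2,
                           int *DCP, int *ACP)
{
    int Shft3 = INTERNAL_PREC;        /* Eq. (21) */
    int Ofst3 = 1 << (Shft3 - 1);     /* Eq. (22) */
    *DCP = (DistScaleFactor * DC2 + (256 - DistScaleFactor) * DC1 + Ofst3) >> Shft3;  /* Eq. (19) */
    *ACP = (DistScaleFactor * AC2 + (256 - DistScaleFactor) * AC1 + Ofst3) >> Shft3;  /* Eq. (20) */
}

With DistScaleFactor = 128 this reduces to the rounded average of the two reference values, and with DistScaleFactor = 256 it returns DC2 and AC2 unchanged.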
 However, when DistScaleFactor calculated by Equation (15) satisfies any of the conditions of Equations (23) to (25), the predicted image feature quantity cannot be derived or the temporal distance is too large, so the predicted image feature quantity calculation unit 413 sets the pixel average value DCP and the pixel error value ACP of the predicted image to initial values.
 POC1 = POC2   …(23)
 (DistScaleFactor>>2) < -64   …(24)
 (DistScaleFactor>>2) > 128   …(25)
 For example, when DistScaleFactor satisfies any of the conditions of Equations (23) to (25), the predicted image feature quantity calculation unit 413 sets the internal variable wp_avaiable_flag (the predicted image feature quantity derivation flag) to false; when DistScaleFactor satisfies none of the conditions of Equations (23) to (25), it sets wp_avaiable_flag to true. When the predicted image feature quantity derivation flag is set to false, initial values are set in DCP and ACP; for example, DefaultDC indicating 0 is set in DCP, and DefaultAC indicating 0 is set in ACP. When the predicted image feature quantity derivation flag is set to true, the values calculated by Equations (19) and (20) are set in DCP and ACP, respectively.
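The exception handling of Equations (23) to (25) and the associated setting of wp_avaiable_flag, DefaultDC, and DefaultAC can be sketched as follows; the use of <stdbool.h> and the function name are assumptions of this sketch.

#include <stdbool.h>

#define DefaultDC 0
#define DefaultAC 0

/* Equations (23)-(25): decide whether the predicted image feature quantity can
   be derived; otherwise fall back to the initial values. */
static bool checkWpAvailable(int POC1, int POC2, int DistScaleFactor,
                             int *DCP, int *ACP)
{
    if (POC1 == POC2 ||                        /* Eq. (23) */
        (DistScaleFactor >> 2) < -64 ||        /* Eq. (24) */
        (DistScaleFactor >> 2) > 128) {        /* Eq. (25) */
        *DCP = DefaultDC;
        *ACP = DefaultAC;
        return false;                          /* wp_avaiable_flag = false */
    }
    return true;                               /* wp_avaiable_flag = true */
}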
 The parameter derivation unit 110 derives the WP parameter information of the encoding target image using the reference image group feature quantity input from the reference image feature quantity derivation unit 108 and the predicted image feature quantity input from the predicted image feature quantity derivation unit 109.
 FIG. 11A and FIG. 11B are diagrams illustrating an example of the WP parameter information of the first embodiment. An example of the WP parameter information for a P-slice is shown in FIG. 11A, and an example of the WP parameter information for a B-slice is shown in FIG. 11A and FIG. 11B. The list number and the reference image number are the same as in the reference image group feature quantity, and the WP application flag, the weighting factor, and the offset are as described with reference to FIG. 6 and FIG. 7. The WP application flag, weighting factor, and offset of list number 0 correspond to the first WP application flag, the first weighting factor, and the first offset, respectively, and the WP application flag, weighting factor, and offset of list number 1 correspond to the second WP application flag, the second weighting factor, and the second offset, respectively. Since the WP parameter information is held for each reference list and each reference image, the information required for a B-slice amounts to 2N + 2 entries when there are N + 1 reference images.
 The parameter derivation unit 110 first checks the predicted image feature quantity derivation flag wp_avaiable_flag included in the predicted image feature quantity to confirm whether the WP parameter information can be derived. When wp_avaiable_flag is set to false, the WP parameter information cannot be derived, so the parameter derivation unit 110 sets the weighting factor and the offset corresponding to list number X and reference image number Y to initial values according to Equations (26) and (27), respectively.
 Weight[X][Y] = (1<<Log2Denom)   …(26)
 Offset[X][Y] = 0   …(27)
 Weight[X][Y] is a value corresponding to w0C and w1C used in Equations (7) and (9). Log2Denom is a value corresponding to logWD_C used in Equation (7) and the like, and is calculated by Equation (28).
 Log2Denom = Default_Value   …(28)
 Here, Default_Value may be set to, for example, 0 or 7.
 When wp_avaiable_flag is set to false, the parameter derivation unit 110 sets initial values in the WP parameter information by repeatedly executing Equations (26) and (27) for all combinations of list number X and reference image number Y (that is, for all reference images).
 In addition, when wp_avaiable_flag is set to false, the parameter derivation unit 110 sets the WP application flag (WP_flag[X][Y]) corresponding to list number X and reference image number Y to false.
 On the other hand, when wp_avaiable_flag is set to true, the WP parameter information can be derived, so the parameter derivation unit 110 derives the weighting factor and the offset corresponding to list number X and reference image number Y according to Equations (29) and (30), respectively.
 Weight[X][Y] = (((curAC<<Log2Denom)+Ofst4)/(AC[X][Y]<<LeftShft))   …(29)
 Offset[X][Y] = (((curDC<<Log2Denom)-(Weight[X][Y]*(DC[X][Y]<<LeftShft))+RealOfst)>>RealLog2Denom)   …(30)
 Here, curDC and curAC indicate the pixel average value DCP and the pixel error value ACP of the predicted image, respectively. DC[X][Y] and AC[X][Y] indicate the pixel average value DCLX(Y) and the pixel error value ACLX(Y) of the reference image with list number X and reference image number Y, respectively. LeftShft is a value that compensates for the change in calculation precision introduced by Shft3 in Equations (19) to (22), and is calculated by Equation (31).
 LeftShft = (8-INTERNAL_PREC)   …(31)
 Ofst4 is a parameter used for rounding when dividing by AC[X][Y]. For example, to round to the nearest value in the fixed-point rounding process, Ofst4 may be set to (AC[X][Y]>>1); to always round down, Ofst4 may be set to 0. RealLog2Denom is calculated by Equation (32), and RealOfst is calculated by Equation (33).
 RealLog2Denom = Log2Denom+LeftShft   …(32)
 RealOfst = (1<<(RealLog2Denom-1))   …(33)
 Here, RealLog2Denom may be set to a predetermined value such as 7, for example.
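Equations (29) to (33) can be combined into a single derivation routine, sketched below in C; the function name and the choice Ofst4 = (refAC >> 1) (round to nearest) are assumptions of this sketch.

/* Equations (29)-(33): the weighting factor is the ratio of the pixel error
   values, and the offset compensates the difference of the pixel average
   values after weighting. refAC != 0 is assumed. */
#define INTERNAL_PREC 8
static void deriveWeightOffset(int curDC, int curAC, int refDC, int refAC,
                               int Log2Denom, int *Weight, int *Offset)
{
    int LeftShft      = 8 - INTERNAL_PREC;         /* Eq. (31) */
    int RealLog2Denom = Log2Denom + LeftShft;      /* Eq. (32) */
    int RealOfst      = 1 << (RealLog2Denom - 1);  /* Eq. (33) */
    int Ofst4         = refAC >> 1;                /* round to nearest */
    *Weight = ((curAC << Log2Denom) + Ofst4) / (refAC << LeftShft);   /* Eq. (29) */
    *Offset = ((curDC << Log2Denom) - *Weight * (refDC << LeftShft)
               + RealOfst) >> RealLog2Denom;                          /* Eq. (30) */
}

For example, with Log2Denom = 7, INTERNAL_PREC = 8 (so LeftShft = 0), curAC = 2*refAC, curDC = 100, and refDC = 40, the routine yields Weight = 256 (a gain of 2.0 in units of 1/128) and Offset = 20.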
 When wp_avaiable_flag is set to true, the parameter derivation unit 110 derives the WP parameter information shown in FIG. 11A and FIG. 11B by repeatedly executing Equations (29) and (30) for all combinations of list number X and reference image number Y (that is, for all reference images).
 In addition, when wp_avaiable_flag is set to true, the parameter derivation unit 110 sets the WP application flag (WP_flag[X][Y]) corresponding to list number X and reference image number Y to true.
 Returning to FIG. 1, the motion evaluation unit 111 performs motion evaluation between a plurality of frames based on the input image and the reference image input from the predicted image generation unit 107, outputs motion information, and inputs the motion information to the predicted image generation unit 107 and the encoding unit 112.
 The motion evaluation unit 111 calculates optimal motion information by, for example, computing difference values starting from a plurality of reference images at the position corresponding to the prediction target pixel block of the input image to obtain an error, shifting this position with fractional precision, and searching for the block with the minimum error using a technique such as block matching. In the case of bidirectional prediction, the motion evaluation unit 111 calculates motion information for bidirectional prediction by performing block matching that includes the default motion compensated prediction shown in Equations (1) and (4), using the motion information derived by unidirectional prediction.
 At this time, the motion evaluation unit 111 can also calculate motion information that takes weighted prediction into account by performing block matching that includes the weighted motion compensated prediction shown in Equations (7) and (9). In this case, the motion evaluation unit 111 may perform block matching using Equations (7) and (9) according to the WP parameter information output from the parameter derivation unit 110.
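As an illustration of such weighted block matching, the sketch below computes the sum of absolute differences between an input block and a reference block to which a weighting factor and offset in the style of Equation (7) have been applied; the block layout, the strides, and the function name are assumptions of this sketch, and the actual matching criterion and search strategy are left to the implementation.

#include <stdlib.h>

/* SAD between an input block and a weighted reference block: each reference
   pixel is scaled by Weight/(1 << Log2Denom) and shifted by Offset before the
   difference is taken (cf. Eq. (7)). */
static int weightedSAD(const unsigned char *org, int orgStride,
                       const unsigned char *ref, int refStride,
                       int width, int height,
                       int Weight, int Offset, int Log2Denom)
{
    int sad = 0;
    int round = (Log2Denom > 0) ? (1 << (Log2Denom - 1)) : 0;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int pred = ((ref[y * refStride + x] * Weight + round) >> Log2Denom) + Offset;
            sad += abs(org[y * orgStride + x] - pred);
        }
    }
    return sad;
}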
 In the first embodiment, the motion evaluation unit 111 is exemplified as one function of the encoding device 100, but the motion evaluation unit 111 is not an essential component of the encoding device 100; for example, the motion evaluation unit 111 may be a device outside the encoding device 100. In that case, the motion information calculated by the motion evaluation unit 111 may be loaded into the encoding device 100.
 The encoding unit 112 performs encoding processing on various encoding parameters, such as the quantized transform coefficients input from the quantization unit 103, the motion information input from the motion evaluation unit 111, and the quantization information specified by the encoding control unit 113, and generates encoded data. The encoding processing corresponds to, for example, Huffman coding or arithmetic coding; for example, H.264 uses context-based adaptive variable length coding (CAVLC) and context-based adaptive binary arithmetic coding (CABAC).
 The encoding parameters are parameters required for decoding, such as prediction information indicating the prediction method, information on the quantized transform coefficients, and information on quantization. For example, the encoding control unit 113 may have an internal memory (not shown) in which the encoding parameters are held, and the encoding parameters of an adjacent, already encoded pixel block may be used when encoding a pixel block. For example, in H.264 intra prediction, the prediction information of a pixel block can be derived from the prediction information of encoded adjacent blocks.
 The encoding unit 112 outputs the generated encoded data according to an appropriate output timing managed by the encoding control unit 113. The output encoded data is, for example, multiplexed with various kinds of information by a multiplexing unit (not shown), temporarily stored in an output buffer (not shown), and then output to, for example, a storage system (storage medium) or a transmission system (communication line), neither of which is shown.
 FIG. 12 is a flowchart illustrating an example of the flow of the reference image group feature quantity derivation processing performed by the reference image feature quantity derivation unit 108 of the first embodiment.
 First, the reference image feature quantity derivation unit 108 sets PRED_TYPE to 1 when the slice type of the encoding target image is B-slice, because two reference lists can be used, and sets PRED_TYPE to 0 when the slice type is P-slice, because one reference list can be used. Since the slice type of the encoding target image is managed by the encoding control unit 113 with the variable slice_type, the reference image feature quantity derivation unit 108 can determine whether the slice type of the encoding target image is B-slice or P-slice by referring to the variable slice_type.
 Subsequently, the reference image feature quantity derivation unit 108 initializes the list number X to 0 (step S101) and initializes the reference image number Y to 0 (step S102).
 Subsequently, when the reference image with list number X and reference image number Y is input from the predicted image generation unit 107, the average value calculation unit 401 calculates the pixel average value DCLX[Y] according to Equation (10), and the pixel error value ACLX[Y] is calculated according to Equation (11) (step S103).
 Subsequently, the integration unit 403 integrates the list number X, the reference image number Y, the pixel average value DCLX[Y], and the pixel error value ACLX[Y] into a reference image feature quantity, and updates the table of the reference image group feature quantity. The reference image feature quantity derivation unit 108 then increments the reference image number Y (step S104).
 Subsequently, the reference image feature quantity derivation unit 108 determines whether the incremented reference image number Y is larger than num_ref_active_lx_minus1 (step S105); if not (No in step S105), the process returns to step S103.
 On the other hand, if it is larger (Yes in step S105), the reference image feature quantity derivation unit 108 determines that the calculation of the pixel average values and pixel error values of all the reference images in the reference list of list number X has been completed, and increments the list number X (step S106).
 Subsequently, the reference image feature quantity derivation unit 108 determines whether the incremented list number X is larger than PRED_TYPE (step S107); if not (No in step S107), the process returns to step S102.
 On the other hand, if it is larger (Yes in step S107), the reference image feature quantity derivation unit 108 determines that all the reference lists have been processed, outputs the reference image group feature quantity, and ends the processing.
 Note that the flowchart of FIG. 12 shows an example in which the corresponding reference images are sequentially input from the predicted image generation unit 107, but reference images in which the list numbers and the reference image numbers are associated may be input all at once.
 FIG. 13 is a flowchart illustrating an example of the flow of the predicted image feature quantity derivation processing performed by the predicted image feature quantity derivation unit 109 of the first embodiment.
 First, the predicted image feature quantity derivation unit 109 sets PRED_TYPE to 1 when the slice type of the encoding target image is B-slice, and sets PRED_TYPE to 0 when the slice type is P-slice. The predicted image feature quantity derivation unit 109 can determine whether the slice type of the encoding target image is B-slice or P-slice by referring to the variable slice_type managed by the encoding control unit 113.
 Subsequently, the predicted image feature quantity derivation unit 109 initializes the list number X to 0 (step S201) and initializes the reference image number Y to 0 (step S202).
 Subsequently, when the reference image group feature quantity is input from the reference image feature quantity derivation unit 108, the feature quantity control unit 411 derives the POC from the list number X and the reference image number Y using the RefPicOrderCnt function of Equation (12) or Equation (13) according to the slice type of the encoding target image, and stores the derived POC in the refPOC array. The feature quantity control unit 411 then increments the reference image number Y (step S203).
 Subsequently, the feature quantity control unit 411 determines whether the incremented reference image number Y is larger than num_ref_active_lx_minus1 (step S204); if not (No in step S204), the process returns to step S203.
 On the other hand, if it is larger (Yes in step S204), the feature quantity control unit 411 determines that the derivation of the POCs of all the reference images in the reference list of list number X has been completed, and increments the list number X (step S205).
 Subsequently, the feature quantity control unit 411 determines whether the incremented list number X is larger than PRED_TYPE (step S206); if not (No in step S206), the process returns to step S202.
 On the other hand, if it is larger (Yes in step S206), the feature quantity control unit 411 determines that all the reference lists have been processed and, using the SortRefPOC function of Equation (12) or Equation (13) according to the slice type of the encoding target image, sets as POC1 the POC in refPOC having the smallest absolute distance (absolute difference value) from the POC of the encoding target image indicated by curPOC. The feature quantity control unit 411 then outputs the first reference image feature quantity, which is the reference image feature quantity of the reference image of POC1, to the memory 412 and deletes that POC from the refPOC array (step S207).
 Subsequently, the feature quantity control unit 411 again uses the SortRefPOC function of Equation (12) or Equation (13) according to the slice type of the encoding target image to set as POC2 the POC in refPOC having the smallest absolute distance (absolute difference value) from the POC of the encoding target image indicated by curPOC, and deletes that POC from the refPOC array (step S208).
 Subsequently, when POC1 and POC2 are the same (Yes in step S209), the feature quantity control unit 411 executes Equation (14) to update POC2 (step S210) and returns to step S209.
 On the other hand, when POC1 and POC2 are not the same (No in step S209), the feature quantity control unit 411 outputs the second reference image feature quantity, which is the reference image feature quantity of the reference image of POC2, to the predicted image feature quantity calculation unit 413.
 Subsequently, when the first reference image feature quantity is input from the memory 412 and the second reference image feature quantity is input from the feature quantity control unit 411, the predicted image feature quantity derivation unit 109 derives the temporal distance tb between POC1 and curPOC using Equation (16) (step S211).
 Subsequently, the predicted image feature quantity derivation unit 109 derives the temporal distance td between POC1 and POC2 using Equation (18) (step S212).
 Subsequently, the predicted image feature quantity derivation unit 109 derives DistScaleFactor, which is used for scaling by the distance ratio, from the temporal distances tb and td using Equation (15) (step S213).
 Subsequently, the predicted image feature quantity derivation unit 109 determines whether the derived DistScaleFactor satisfies any of the conditions of Equations (23) to (25), which are the conditions for exception processing (step S214).
 When one of the exception processing conditions is met (Yes in step S214), the predicted image feature quantity derivation unit 109 sets the predicted image feature quantity derivation flag wp_avaiable_flag to false (step S215). As a result, the pixel average value DCP and the pixel error value ACP of the predicted image are set to the initial values (step S216).
 On the other hand, when none of the exception processing conditions is met (No in step S214), the predicted image feature quantity derivation unit 109 sets wp_avaiable_flag to true (step S217).
 The predicted image feature quantity derivation unit 109 then derives the pixel average value DCP of the predicted image from DistScaleFactor, the pixel average value DC1 of the first reference image feature quantity, and the pixel average value DC2 of the second reference image feature quantity using Equation (19), and derives the pixel error value ACP of the predicted image from DistScaleFactor, the pixel error value AC1 of the first reference image feature quantity, and the pixel error value AC2 of the second reference image feature quantity using Equation (20). The predicted image feature quantity derivation unit 109 then outputs the derived pixel average value DCP and pixel error value ACP of the predicted image and the predicted image feature quantity derivation flag wp_avaiable_flag as the predicted image feature quantity (step S218).
 FIG. 14 is a flowchart illustrating an example of the flow of the WP parameter information derivation processing performed by the parameter derivation unit 110 of the first embodiment.
 First, the parameter derivation unit 110 sets PRED_TYPE to 1 when the slice type of the encoding target image is B-slice, and sets PRED_TYPE to 0 when the slice type is P-slice. The parameter derivation unit 110 can determine whether the slice type of the encoding target image is B-slice or P-slice by referring to the variable slice_type managed by the encoding control unit 113.
 Subsequently, when the reference image group feature quantity is input from the reference image feature quantity derivation unit 108 and the predicted image feature quantity is input from the predicted image feature quantity derivation unit 109, the parameter derivation unit 110 checks whether the predicted image feature quantity derivation flag wp_avaiable_flag is set to false (step S301).
 When wp_avaiable_flag is set to false (Yes in step S301), the parameter derivation unit 110 sets Log2Denom to 0 using Equation (28) (step S302).
 Subsequently, the parameter derivation unit 110 sets the weighting factors of all the reference images to the initial value using Equation (26) (step S303), and sets the offsets of all the reference images to the initial value using Equation (27) (step S304). The processing then ends.
 On the other hand, when wp_avaiable_flag is set to true (No in step S301), the parameter derivation unit 110 sets Log2Denom to a predetermined value (for example, 7) (step S305).
 Subsequently, the parameter derivation unit 110 initializes the list number X to 0 (step S306) and initializes the reference image number Y to 0 (step S307).
 Subsequently, the parameter derivation unit 110 derives the weighting factor Weight[X][Y] corresponding to list number X and reference image number Y using Equation (29), and derives the offset Offset[X][Y] using Equation (30) (step S308).
 Subsequently, the parameter derivation unit 110 increments the reference image number Y (step S309). The parameter derivation unit 110 then determines whether the incremented reference image number Y is larger than num_ref_active_lx_minus1 (step S310); if not (No in step S310), the process returns to step S308.
 On the other hand, if it is larger (Yes in step S310), the parameter derivation unit 110 determines that the derivation of the weighting factors and offsets of all the reference images in the reference list of list number X has been completed, and increments the list number X (step S311).
 Subsequently, the parameter derivation unit 110 determines whether the incremented list number X is larger than PRED_TYPE (step S312); if not (No in step S312), the process returns to step S307.
 On the other hand, if it is larger (Yes in step S312), the parameter derivation unit 110 determines that all the reference lists have been processed, outputs the WP parameter information, and ends the processing.
 FIG. 15 is a diagram illustrating an example of the syntax 500 used by the encoding device 100 of the first embodiment. The syntax 500 shows the structure of the encoded data generated by the encoding device 100 by encoding an input image (moving image data). When decoding the encoded data, a decoding device described later performs syntax interpretation of the moving image by referring to the same syntax structure as the syntax 500.
 The syntax 500 includes three parts: a high-level syntax 501, a slice-level syntax 502, and a coding-tree-level syntax 503. The high-level syntax 501 includes syntax information of layers higher than the slice. A slice refers to a rectangular region or a continuous region included in a frame or a field. The slice-level syntax 502 includes information necessary for decoding each slice. The coding-tree-level syntax 503 includes information necessary for decoding each coding tree (that is, each coding tree block). Each of these parts includes further detailed syntax.
 The high-level syntax 501 includes sequence-level and picture-level syntaxes such as a sequence parameter set syntax 504, a picture parameter set syntax 505, and an adaptation parameter set syntax 506.
 The slice-level syntax 502 includes a slice header syntax 507, a pred weight table syntax 508, a slice data syntax 509, and the like. The pred weight table syntax 508 is called from the slice header syntax 507.
 The coding-tree-level syntax 503 includes a coding tree unit syntax 510, a transform unit syntax 511, a prediction unit syntax 512, and the like. The coding tree unit syntax 510 can have a quadtree structure; specifically, the coding tree unit syntax 510 can be recursively called as a syntax element of the coding tree unit syntax 510, so that one coding tree block can be subdivided by a quadtree. The coding tree unit syntax 510 also includes the transform unit syntax 511, which is called in each coding tree unit syntax 510 at the leaves of the quadtree. The transform unit syntax 511 describes information related to inverse orthogonal transform, quantization, and the like.
 FIG. 16 is a diagram illustrating an example of the picture parameter set syntax 505 of the first embodiment. weighted_unipred_idc is, for example, a syntax element indicating whether the weighted compensated prediction of the first embodiment for P-slices is enabled or disabled. When weighted_unipred_idc is 0, the weighted motion compensated prediction of the first embodiment within a P-slice is disabled; accordingly, the WP application flag included in the WP parameter information is always set to 0, and the WP selectors 304 and 305 connect their respective output ends to the default motion compensation unit 301. When weighted_unipred_idc is 1, explicit weighted motion compensated prediction (not described in the first embodiment) within a P-slice is enabled. Explicit weighted prediction is one of the modes in which the WP parameter information is explicitly encoded using the pred weight table syntax, and can be realized, for example, by the method described in H.264. When weighted_unipred_idc is 2, the implicit weighted motion compensated prediction of the first embodiment within a P-slice is enabled.
 As another example, weighted_unipred_idc may be changed to weighted_pred_flag so that the implicit weighted motion compensated prediction of the first embodiment in a P-slice is always prohibited.
 weighted_bipred_idc is, for example, a syntax element indicating whether the weighted compensated prediction of the first embodiment for B-slices is enabled or disabled. When weighted_bipred_idc is 0, the weighted motion compensated prediction of the first embodiment within a B-slice is disabled; accordingly, the WP application flag included in the WP parameter information is always set to 0, and the WP selectors 304 and 305 connect their respective output ends to the default motion compensation unit 301. When weighted_bipred_idc is 1, explicit weighted motion compensated prediction (not described in the first embodiment) within a B-slice is enabled. Explicit weighted prediction is one of the modes in which the WP parameter information is explicitly encoded using the pred weight table syntax, and can be realized, for example, by the method described in H.264. When weighted_bipred_idc is 2, the implicit weighted motion compensated prediction of the first embodiment within a B-slice is enabled.
 FIG. 17 is a diagram illustrating an example of the sequence parameter set syntax 504 of the first embodiment. profile_idc is an identifier indicating information on the profile of the encoded data. level_idc is an identifier indicating information on the level of the encoded data. seq_parameter_set_id is an identifier indicating which sequence parameter set syntax 504 is referred to. max_num_ref_frames is a variable indicating the maximum number of reference images in a frame. implicit_weighted_unipred_enabled_flag is, for example, a syntax element indicating whether implicit weighted motion compensated prediction for P-slices is enabled or disabled for the encoded data. implicit_weighted_bipred_enabled_flag is, for example, a syntax element indicating whether implicit weighted motion compensated prediction for B-slices is enabled or disabled for the encoded data.
 When implicit_weighted_unipred_enabled_flag or implicit_weighted_bipred_enabled_flag is 0, implicit weighted motion compensated prediction for P-slices or B-slices within the encoded data is disabled; accordingly, the WP application flag included in the WP parameter information is always set to 0, and the WP selectors 304 and 305 connect their respective output ends to the default motion compensation unit 301. On the other hand, when implicit_weighted_unipred_enabled_flag or implicit_weighted_bipred_enabled_flag is 1, implicit weighted motion compensated prediction for P-slices or B-slices within the encoded data is enabled.
 As another example, when implicit_weighted_unipred_enabled_flag or implicit_weighted_bipred_enabled_flag is 1, whether implicit weighted motion compensated prediction is enabled or disabled may be specified for each local region within a slice in the syntax of lower layers (such as the picture parameter set, slice header, coding tree block, transform unit, and prediction unit).
 As yet another example, when implicit_weighted_unipred_enabled_flag or implicit_weighted_bipred_enabled_flag is 1, the image feature quantity of the encoded image may be derived after the decoding processing of the encoded slice is completed and held in the DPB together with the information of the reference image. In this case, the processing of calculating the image feature quantity multiple times, each time the encoded slice held in the DPB as a reference image is referred to when another slice is encoded, can be omitted, so the processing amount can be reduced.
 As described above, in the first embodiment, the reference image feature quantity derivation unit 108 derives the pixel average value and the pixel error value of each reference image as the reference image feature quantity, the predicted image feature quantity derivation unit 109 derives the pixel average value and the pixel error value of the predicted image as the predicted image feature quantity, and the parameter derivation unit 110 derives, as the WP parameter information, a weighting factor using the pixel error values of the reference image and of the predicted image, and an offset using the pixel average values of the reference image and of the predicted image (Equations (29) and (30)). Therefore, according to the first embodiment, implicit weighted prediction can be performed taking into account not only the weighting factor but also the offset, and by using a predicted image generated by this prediction, the prediction error can be reduced, so the code amount can be reduced and the coding efficiency can be improved.
 Specifically, in the first embodiment, the pixel value change between reference images is linearly interpolated or linearly extrapolated based on the pixel average values and pixel error values derived for the two reference images to obtain the pixel average value and the pixel error value of the predicted image, and the pixel value change between the reference images and the predicted image is predicted from these values. Therefore, according to the first embodiment, the weighting factor and the offset can be predicted effectively for video having temporally continuous fade or dissolve effects, so the prediction error can be reduced, the coding efficiency can be improved, and consequently the subjective image quality can be improved.
 As described above, when performing weighted motion compensated prediction in the implicit mode in which the weight parameters are not explicitly encoded, the encoding device 100 of the first embodiment derives the image feature quantity of the predicted image from the image feature quantities derived for two reference images and the temporal distance ratio, and derives the weighting factor and the offset as the weight parameters relating the reference images and the predicted image, thereby reducing the code amount and improving the coding efficiency.
 Note that syntax elements not defined in the first embodiment may be inserted between the rows of the syntax tables illustrated in FIGS. 16 and 17, and descriptions regarding other conditional branches may be included. The syntax tables may be divided into a plurality of tables, or a plurality of syntax tables may be integrated. In addition, the terms used for the illustrated syntax elements may be changed arbitrarily.
(Modification 1 of the first embodiment)
 In Modification 1 of the first embodiment, another method for calculating the pixel average value and the pixel error value of a reference image will be described.
 When the reference image feature quantity derivation unit 108 (the average value calculation unit 401 and the error value calculation unit 402) calculates the pixel average value and the pixel error value of a reference image according to Equations (10) and (11), the calculated pixel average value and pixel error value are fractional values, and an error in the fractional part occurs when they are rounded to integer precision.
 To avoid this error, the rounding error of the division can be reduced by performing the divisions collectively when the WP parameters (weighting factor and offset) are derived. In this case, the reference image feature quantity derivation unit 108 calculates the pixel average value and the pixel error value of the reference image according to Equations (34) and (35).
[Equation (34) is given as image JPOXMLDOC01-appb-M000003]
[Equation (35) is given as image JPOXMLDOC01-appb-M000004]
 In this case, the parameter derivation unit 110 calculates the offset according to Equation (36).
 Offset[X][Y] = ((((curDC<<Log2Denom)-(Weight[X][Y]*(DC[X][Y]<<LeftShft)))/N+RealOfst)>>RealLog2Denom)   …(36)
 In Equation (36), division by the image size N has been added to Equation (30). Note that, for example, by defining the internal calculation precision for each image size, the division by N can be removed and incorporated into the RealLog2Denom term.
 According to Modification 1 of the first embodiment, the rounding error of the division can be reduced.
(Modification 2 of the first embodiment)
 In Modification 2 of the first embodiment as well, another method for calculating the pixel average value and the pixel error value of a reference image will be described.
 In Modification 1 of the first embodiment, when the reference image feature quantity derivation unit 108 (the average value calculation unit 401 and the error value calculation unit 402) calculates the pixel average value and the pixel error value of a reference image over the entire image according to Equations (34) and (35), the calculation precision increases depending on the resolution of the image. For example, for video having 4096x2160 pixels (approximately 2 to the 23rd power), the worst-case accumulated value in Equation (34) is 2 to the 33rd power (23 + 10) when the pixel range is assumed to be a 10-bit signal, which exceeds 32-bit arithmetic.
 Therefore, the reference image feature quantity derivation unit 108 performs Equations (34) and (35) in predetermined processing units and quantizes the results by a predetermined value, so that the calculation precision per processing unit can be kept constant regardless of the resolution. An arbitrary unit such as a slice, a line, or a pixel block can be set as the processing unit.
 For example, when the processing is performed in units of coding tree units, the reference image feature quantity derivation unit 108 performs Equations (34) and (35) in units of 64x64 pixels (2 to the 12th power) shown in FIG. 3A, and quantizes the calculated pixel average value and pixel error value according to Equations (37) and (38), respectively.
 DCLX' = (DCLX+Ofst5)>>Shft5   …(37)
 ACLX' = (ACLX+Ofst5)>>Shft5   …(38)
 Here, Shft5 is calculated by Equation (39) and Ofst5 is calculated by Equation (40). In this example, the value of Shft5 is set to 4 in Equation (39) so that the size-N term of Equation (36) can be absorbed into the shift operation.
 Shft5=4   …(39)
 Ofst5=(1<<(Shft5-1))   …(40)
 In this case, the parameter deriving unit 110 calculates the offset according to Equation (30) and calculates RealLog2Denom according to Equation (41).
 RealLog2Denom = Log2Denom+LeftShft+Shft5   …(41)
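 As an illustration, the quantization step of Equations (37) to (40) can be sketched as follows. DCLX and ACLX are assumed to be the per-coding-tree-unit sums produced by Equations (34) and (35), whose exact form is not reproduced here; the C types and the function name are assumptions.

#include <stdint.h>

#define SHFT5 4                      /* Equation (39) */
#define OFST5 (1 << (SHFT5 - 1))     /* Equation (40) */

/* Per-coding-tree-unit quantization of Equations (37) and (38): right shift
   with rounding keeps every 64x64 accumulation within a fixed precision, so
   the derivation stays inside 32-bit arithmetic regardless of resolution.  */
static inline int32_t quantize_ctu_sum(int32_t sum)
{
    return (sum + OFST5) >> SHFT5;
}

/* DCLX' = quantize_ctu_sum(DCLX);   Equation (37)
   ACLX' = quantize_ctu_sum(ACLX);   Equation (38) */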
 According to Modification 2 of the first embodiment, the division that arises depending on the image size or block size can be realized by a right shift, and the processing can be performed with 32-bit arithmetic regardless of the resolution, so that the hardware scale can be reduced.
(Modification 3 of the first embodiment)
 In Modification 3 of the first embodiment, yet another method for calculating the pixel average value and the pixel error value of the reference image is described.
 For a simple fade or dissolve effect in which the statistical properties of the pixel average value and the pixel error value within the picture change little, there is no need to calculate the pixel average value and the pixel error value for every processing region, nor to use all pixels of the image for the calculation.
 Therefore, the reference image feature quantity deriving unit 108 can reduce the amount of processing required to calculate the pixel average value and the pixel error value by determining in advance the region used for the calculation, for example by skipping every other pixel block, subsampling pixel blocks in a lattice pattern, simple subsampling, or processing per line or per slice.
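 As a sketch of one possible subsampling, the code below assumes that the feature values are a pixel mean and a mean absolute difference, in the spirit of Equations (10) and (11) whose exact form is not reproduced here; the checkerboard pattern of pixel blocks, the block size, and all names and types are illustrative assumptions.

#include <stdint.h>
#include <stdlib.h>

/* Compute the pixel average and pixel error of a reference picture from a
   checkerboard subsample of pixel blocks only (Modification 3). */
void subsampled_features(const uint8_t *pic, int width, int height,
                         int block, int *dc_out, int *ac_out)
{
    int64_t sum = 0, err = 0, count = 0;

    /* First pass: accumulate only every other block. */
    for (int by = 0; by < height / block; by++)
        for (int bx = (by & 1); bx < width / block; bx += 2)
            for (int y = 0; y < block; y++)
                for (int x = 0; x < block; x++) {
                    sum += pic[(by * block + y) * width + bx * block + x];
                    count++;
                }
    if (count == 0) { *dc_out = 0; *ac_out = 0; return; }
    int dc = (int)(sum / count);

    /* Second pass over the same subsampled blocks for the error value. */
    for (int by = 0; by < height / block; by++)
        for (int bx = (by & 1); bx < width / block; bx += 2)
            for (int y = 0; y < block; y++)
                for (int x = 0; x < block; x++)
                    err += abs(pic[(by * block + y) * width + bx * block + x] - dc);

    *dc_out = dc;
    *ac_out = (int)(err / count);
}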
(Modification 4 of the first embodiment)
 In Modification 4 of the first embodiment, yet another method for calculating the pixel average value and the pixel error value of the reference image is described.
 When the reference image feature quantity deriving unit 108 calculates the pixel average value and the pixel error value for each processing region, the parameter deriving unit 110 can derive WP parameter information for each such processing region. For example, when the reference image feature quantity deriving unit 108 calculates the pixel average value and the pixel error value per coding tree unit, it derives the reference image feature quantity in units of 64×64 pixels shown in FIG. 3A; since the predicted image feature quantity deriving unit 109 also derives the predicted image feature quantity in the same 64×64 pixel units shown in FIG. 3A, the parameter deriving unit 110 can derive WP parameter information per coding tree unit.
 The parameter deriving unit 110 then uses the derived per-region WP parameter information to perform the weighted motion compensated prediction of Equation (7) or Equation (9), which realizes implicit weighted motion compensated prediction on a per-pixel-block basis. For example, when the temporal change of pixel values differs from region to region of the video and the change is temporally continuous, deriving WP parameter information for each pixel block as described above and performing weighted motion compensated prediction implicitly can reduce the code amount, as in the sketch below.
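 The following sketch only illustrates applying a per-pixel-block weight and offset to a motion compensated block; Equation (9) itself is not reproduced in this part of the text, so the H.264-style unidirectional expression used here is an assumption, as are all names and types.

#include <stdint.h>

/* Apply one block's implicit weight and offset to its motion compensated
   prediction (Modification 4). The weighted form is an assumed H.264-style
   expression standing in for Equation (9). */
void weighted_block(const int16_t *mc_pred, uint8_t *out, int n,
                    int weight, int offset, int logWD)
{
    int rnd = (logWD >= 1) ? (1 << (logWD - 1)) : 0;
    for (int i = 0; i < n; i++) {
        int v = ((mc_pred[i] * weight + rnd) >> logWD) + offset;
        out[i] = (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));   /* clip to 8-bit range */
    }
}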
(Second Embodiment)
 In the second embodiment, a decoding device that decodes encoded data generated by the encoding device of the first embodiment is described.
 FIG. 18 is a block diagram illustrating an example of the configuration of a decoding device 800 according to the second embodiment.
 The decoding device 800 decodes encoded data stored in an input buffer (not shown) or the like into a decoded image, and outputs the decoded image as an output image to an output buffer (not shown). The encoded data is output from, for example, the encoding device 100 of FIG. 1 and is input to the decoding device 800 via a storage system, a transmission system, a buffer, or the like (not shown).
 As illustrated in FIG. 18, the decoding device 800 includes a decoding unit 801, an inverse quantization unit 802, an inverse orthogonal transform unit 803, an addition unit 804, a predicted image generation unit 805, a reference image feature quantity deriving unit 806, a predicted image feature quantity deriving unit 807, and a parameter deriving unit 808. The inverse quantization unit 802, the inverse orthogonal transform unit 803, the addition unit 804, the predicted image generation unit 805, the reference image feature quantity deriving unit 806, the predicted image feature quantity deriving unit 807, and the parameter deriving unit 808 are substantially the same as or similar to the inverse quantization unit 104, the inverse orthogonal transform unit 105, the addition unit 106, the predicted image generation unit 107, the reference image feature quantity deriving unit 108, the predicted image feature quantity deriving unit 109, and the parameter deriving unit 110 of FIG. 1, respectively. The decoding control unit 809 shown in FIG. 18 controls the decoding device 800 and can be realized by, for example, a CPU.
 The decoding unit 801 parses the encoded data for each frame or field on the basis of the syntax. The decoding unit 801 sequentially entropy-decodes the code string of each syntax element and reproduces the motion information, including the prediction mode, the motion vector, and the reference image number, as well as the coding parameters of the target block such as the quantized transform coefficients. Besides the above, the coding parameters include all parameters necessary for decoding, such as information on the transform coefficients and information on quantization.
 Specifically, the decoding unit 801 has a function of performing decoding processing such as variable-length decoding or arithmetic decoding on the input encoded data. For example, H.264 uses context-based adaptive variable length coding (CAVLC) and context-based adaptive binary arithmetic coding (CABAC). These processes are also called parsing processes.
 The decoding unit 801 outputs the motion information, the quantized transform coefficients, and the like, inputs the quantized transform coefficients to the inverse quantization unit 802, and inputs the motion information to the predicted image generation unit 805.
 The inverse quantization unit 802 performs inverse quantization on the quantized transform coefficients input from the decoding unit 801 to obtain restored transform coefficients. Specifically, the inverse quantization unit 802 performs inverse quantization in accordance with the quantization information used in the decoding unit 801. More specifically, the inverse quantization unit 802 multiplies the quantized transform coefficients by the quantization step size derived from the quantization information to obtain the restored transform coefficients. The inverse quantization unit 802 outputs the restored transform coefficients and inputs them to the inverse orthogonal transform unit 803.
 The inverse orthogonal transform unit 803 performs, on the restored transform coefficients input from the inverse quantization unit 802, an inverse orthogonal transform corresponding to the orthogonal transform performed on the encoding side, and obtains a restored prediction error. The inverse orthogonal transform unit 803 outputs the restored prediction error and inputs it to the addition unit 804.
 The addition unit 804 adds the restored prediction error input from the inverse orthogonal transform unit 803 and the corresponding predicted image to generate a decoded image. The addition unit 804 outputs the decoded image and inputs it to the predicted image generation unit 805. The addition unit 804 also outputs the decoded image to the outside as an output image. The output image is thereafter temporarily stored in an external output buffer (not shown) or the like and is output, for example in accordance with the output timing managed by the decoding control unit 809, to a display device system such as a display or monitor (not shown) or to a video device system.
 The predicted image generation unit 805 generates a predicted image using the motion information input from the decoding unit 801, the WP parameter information input from the parameter deriving unit 808, and the decoded image input from the addition unit 804.
 Here, the details of the predicted image generation unit 805 are described with reference to FIG. 4. Like the predicted image generation unit 107, the predicted image generation unit 805 includes a multi-frame motion compensation unit 201, a memory 202, a unidirectional motion compensation unit 203, a prediction parameter control unit 204, a reference image selector 205, a frame memory 206, and a reference image control unit 207.
 The frame memory 206 stores the decoded image input from the addition unit 804 as a reference image under the control of the reference image control unit 207. The frame memory 206 has a plurality of memory sets FM0 to FMN (N ≥ 1) for temporarily holding reference images.
 The prediction parameter control unit 204 prepares, as a table, a plurality of combinations of a reference image number and prediction parameters on the basis of the motion information input from the decoding unit 801. Here, the motion information refers to the motion vector indicating the amount of motion displacement used in motion compensated prediction, the reference image number, information on the prediction mode such as unidirectional/bidirectional prediction, and the like. The prediction parameters refer to information on the motion vector and the prediction mode. The prediction parameter control unit 204 then selects and outputs, on the basis of the motion information, the combination of the reference image number and the prediction parameters used for generating the predicted image, inputs the reference image number to the reference image selector 205, and inputs the prediction parameters to the unidirectional motion compensation unit 203.
 The reference image selector 205 is a switch that switches, in accordance with the reference image number input from the prediction parameter control unit 204, which output terminal of the frame memories FM0 to FMN of the frame memory 206 is connected. For example, when the reference image number is 0, the reference image selector 205 connects the output terminal of FM0 to its own output terminal, and when the reference image number is N, it connects the output terminal of FMN to its own output terminal. The reference image selector 205 outputs the reference image stored in the frame memory whose output terminal is connected, among the frame memories FM0 to FMN of the frame memory 206, and inputs it to the unidirectional motion compensation unit 203 and to the reference image feature quantity deriving unit 806.
 The unidirectional motion compensation unit 203 performs motion compensated prediction in accordance with the prediction parameters input from the prediction parameter control unit 204 and the reference image input from the reference image selector 205, and generates a unidirectional predicted image. Since motion compensated prediction has already been described with reference to FIG. 5, its description is omitted.
 The unidirectional motion compensation unit 203 outputs the unidirectional predicted image and temporarily stores it in the memory 202. When the motion information (prediction parameters) indicates bidirectional prediction, the multi-frame motion compensation unit 201 performs weighted prediction using two unidirectional predicted images; the unidirectional motion compensation unit 203 therefore stores the unidirectional predicted image corresponding to the first in the memory 202 and directly outputs the unidirectional predicted image corresponding to the second to the multi-frame motion compensation unit 201. Here, the unidirectional predicted image corresponding to the first is referred to as the first predicted image, and the unidirectional predicted image corresponding to the second as the second predicted image.
 Note that two unidirectional motion compensation units 203 may be provided so that each generates one of the two unidirectional predicted images. In this case, when the motion information (prediction parameters) indicates unidirectional prediction, the unidirectional motion compensation unit 203 may directly output the first unidirectional predicted image as the first predicted image to the multi-frame motion compensation unit 201.
 The multi-frame motion compensation unit 201 performs weighted prediction using the first predicted image input from the memory 202, the second predicted image input from the unidirectional motion compensation unit 203, and the WP parameter information input from the parameter deriving unit 808, and generates a predicted image. The multi-frame motion compensation unit 201 outputs the predicted image and inputs it to the addition unit 804.
 Here, the details of the multi-frame motion compensation unit 201 are described with reference to FIG. 6. As in the predicted image generation unit 107, the multi-frame motion compensation unit 201 includes a default motion compensation unit 301, a weighted motion compensation unit 302, a WP parameter control unit 303, and WP selectors 304 and 305.
 The WP parameter control unit 303 outputs a WP application flag and weight information on the basis of the WP parameter information input from the parameter deriving unit 808, inputs the WP application flag to the WP selectors 304 and 305, and inputs the weight information to the weighted motion compensation unit 302.
 Here, the WP parameter information includes the fixed-point precision of the weighting factors, a first WP application flag, a first weighting factor, and a first offset corresponding to the first predicted image, and a second WP application flag, a second weighting factor, and a second offset corresponding to the second predicted image. The WP application flag is a parameter that can be set for each corresponding reference image and signal component, and indicates whether weighted motion compensated prediction is performed. The weight information includes the fixed-point precision of the weighting factors, the first weighting factor, the first offset, the second weighting factor, and the second offset. The WP parameter information represents the same information as in the first embodiment.
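 For illustration, the WP parameter information described above might be laid out as in the following sketch; the struct and field names and the integer types are assumptions, and only the set of items follows the text.

/* Illustrative layout of the WP parameter information (not part of the
   embodiment): fixed-point precision plus per-prediction application flag,
   weighting factor, and offset. */
typedef struct {
    int log2_denom;      /* fixed-point precision of the weighting factors */
    int wp_flag[2];      /* first / second WP application flag             */
    int weight[2];       /* first / second weighting factor                */
    int offset[2];       /* first / second offset                          */
} WPParamInfo;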
 Specifically, when the WP parameter information is input from the parameter deriving unit 808, the WP parameter control unit 303 separates it into the first WP application flag, the second WP application flag, and the weight information and outputs them, inputting the first WP application flag to the WP selector 304, the second WP application flag to the WP selector 305, and the weight information to the weighted motion compensation unit 302.
 The WP selectors 304 and 305 switch the connection destination of the respective predicted images on the basis of the WP application flags input from the WP parameter control unit 303. When the respective WP application flag is 0, each of the WP selectors 304 and 305 connects its output terminal to the default motion compensation unit 301 and outputs the first predicted image and the second predicted image to the default motion compensation unit 301. On the other hand, when the respective WP application flag is 1, each of the WP selectors 304 and 305 connects its output terminal to the weighted motion compensation unit 302 and outputs the first predicted image and the second predicted image to the weighted motion compensation unit 302.
 The default motion compensation unit 301 performs averaging on the basis of the two unidirectional predicted images (the first predicted image and the second predicted image) input from the WP selectors 304 and 305, and generates a predicted image. Specifically, when the first WP application flag and the second WP application flag are 0, the default motion compensation unit 301 performs the averaging according to Equation (1).
 When the prediction mode indicated by the motion information (prediction parameters) is unidirectional prediction, the default motion compensation unit 301 calculates the final predicted image according to Equation (4) using only the first predicted image.
 The weighted motion compensation unit 302 performs weighted motion compensation on the basis of the two unidirectional predicted images (the first predicted image and the second predicted image) input from the WP selectors 304 and 305 and the weight information input from the WP parameter control unit 303. Specifically, when the first WP application flag and the second WP application flag are 1, the weighted motion compensation unit 302 performs the weighting process according to Equation (7).
 When the calculation precision of the first predicted image and the second predicted image differs from that of the predicted image, the weighted motion compensation unit 302 realizes the rounding process by controlling logWD_C, the fixed-point precision, as in Equation (8).
 When the prediction mode indicated by the motion information (prediction parameters) is unidirectional prediction, the weighted motion compensation unit 302 calculates the final predicted image according to Equation (9) using only the first predicted image.
 In this case as well, when the calculation precision of the first predicted image and the second predicted image differs from that of the predicted image, the weighted motion compensation unit 302 realizes the rounding process by controlling logWD_C, the fixed-point precision, as in Equation (8), in the same way as in bidirectional prediction.
 Since the fixed-point precision of the weighting factors has already been described with reference to FIG. 7, its description is omitted. In the case of unidirectional prediction, the various parameters corresponding to the second predicted image (the second WP application flag, the second weighting factor, and the second offset) are not used and may be set to predetermined initial values.
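 Equations (7) and (8) themselves are not reproduced in this part of the text; the bidirectional sketch below uses the familiar H.264-style form as an assumption, only to show where the two weights, the two offsets, and the fixed-point precision logWD_C enter.

/* Hedged sketch of the bidirectional weighted combination performed by the
   weighted motion compensation unit 302. The expression is an assumed
   H.264-style form; when the intermediate precision of the two unidirectional
   predictions differs from the output precision, logWD is adjusted
   accordingly (cf. Equation (8)). */
int weighted_bi_sample(int p0, int p1, int w0, int w1,
                       int o0, int o1, int logWD)
{
    int v = ((p0 * w0 + p1 * w1 + (1 << logWD)) >> (logWD + 1))
            + ((o0 + o1 + 1) >> 1);
    return v < 0 ? 0 : (v > 255 ? 255 : v);   /* clip to the 8-bit pixel range */
}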
 The reference image feature quantity deriving unit 806, the predicted image feature quantity deriving unit 807, and the parameter deriving unit 808 implicitly derive, from the reference images input from the predicted image generation unit 805, the WP parameter information corresponding to the predicted image generated by the predicted image generation unit 805.
 The reference image feature quantity deriving unit 806 derives a reference image feature quantity for each reference image input from the predicted image generation unit 805, outputs a reference image group feature quantity that collects the derived reference image feature quantities, and inputs it to the predicted image feature quantity deriving unit 807 and the parameter deriving unit 808.
 Since the reference image group feature quantity has already been described with reference to FIG. 8, its description is omitted. Here, the details of the reference image feature quantity deriving unit 806 are described with reference to FIG. 9. Like the reference image feature quantity deriving unit 108, the reference image feature quantity deriving unit 806 includes an average value calculating unit 401, an error value calculating unit 402, and an integrating unit 403.
 Each time a reference image is input from the predicted image generation unit 805, the average value calculating unit 401 calculates the pixel average value of the reference image, outputs the calculated pixel average value, and inputs it to the error value calculating unit 402 and the integrating unit 403. The average value calculating unit 401 calculates the pixel average value of the reference image using, for example, Equation (10).
 Each time a reference image is input from the predicted image generation unit 805, the error value calculating unit 402 uses the pixel average value of the reference image input from the average value calculating unit 401 to calculate the pixel error value with respect to that pixel average value, outputs the calculated pixel error value, and inputs it to the integrating unit 403. The error value calculating unit 402 calculates the pixel error value of the reference image using, for example, Equation (11).
 Each time a reference image is input from the predicted image generation unit 805, the integrating unit 403 integrates the list number and reference image number of the reference image, the pixel average value of the reference image input from the average value calculating unit 401, and the pixel error value input from the error value calculating unit 402 into a reference image feature quantity. The integrating unit 403 then collects the integrated reference image feature quantities into a table as shown in FIG. 8, outputs them as the reference image group feature quantity, and inputs it to the predicted image feature quantity deriving unit 807 and the parameter deriving unit 808.
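 For illustration, one row of the table of FIG. 8 might be represented as in the sketch below; the struct and field names are assumptions, and only the set of items (list number, reference image number, pixel average value, pixel error value) follows the text.

/* Illustrative layout of one entry of the reference image group feature
   quantity table assembled by the integrating unit 403 (cf. FIG. 8). */
typedef struct {
    int list;      /* list number                                */
    int ref_idx;   /* reference image number                     */
    int dc;        /* pixel average value of the reference image */
    int ac;        /* pixel error value of the reference image   */
} RefFeature;      /* the POC is derived later from list/ref_idx */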
 The predicted image feature quantity deriving unit 807 derives a predicted image feature quantity on the basis of the reference image group feature quantity input from the reference image feature quantity deriving unit 806, outputs the derived predicted image feature quantity, and inputs it to the parameter deriving unit 808. The predicted image feature quantity includes the pixel average value of the predicted image, a pixel error value obtained by averaging the errors from that pixel average value, and a predicted image feature quantity derivation flag indicating whether the predicted image feature quantity has been derived.
 Here, the details of the predicted image feature quantity deriving unit 807 are described with reference to FIG. 10. Like the predicted image feature quantity deriving unit 109, the predicted image feature quantity deriving unit 807 includes a feature quantity control unit 411, a memory 412, and a predicted image feature quantity calculating unit 413.
 The feature quantity control unit 411 selects and outputs, from the reference image group feature quantity input from the reference image feature quantity deriving unit 806, the two reference image feature quantities used for deriving the predicted image feature quantity, inputting (loading) one into the memory 412 and inputting the other to the predicted image feature quantity calculating unit 413.
 Specifically, the feature quantity control unit 411 derives, from the list number and reference image number of each reference image feature quantity collected in the reference image group feature quantity, the POC (Picture Order Count) indicating the display order of the reference image of that feature quantity. The reference list and the reference image number are information that indirectly designates a reference image, whereas the POC is information that directly designates a reference image and corresponds to the absolute position of the reference image. The feature quantity control unit 411 then selects, from the derived POCs, the two POCs closest in temporal distance to the encoding target image (predicted image), thereby selecting from the reference image group feature quantity the two reference image feature quantities used for deriving the predicted image feature quantity.
 When the slice to be coded is a P-slice, the feature quantity control unit 411 selects the two POCs (POC1 and POC2) using Equation (12).
 When the slice to be coded is a B-slice, the feature quantity control unit 411 selects the two POCs (POC1 and POC2) using Equation (13).
 When the two selected POCs (POC1 and POC2) are identical, the feature quantity control unit 411 reselects one of them (POC2) using Equation (14). By repeating Equation (14), the feature quantity control unit 411 can select two POCs whose temporal distances from the encoding target image differ (a POC2 whose temporal distance from the encoding target image differs from that of POC1).
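 Equations (12) to (14) are not reproduced in this part of the text; the following sketch only follows the prose behaviour (pick the two POCs closest in time to the current image and re-select when both coincide), and all names and types are assumptions.

#include <stdlib.h>

/* Hedged sketch of the POC selection performed by the feature quantity
   control unit 411. */
void select_two_pocs(const int *ref_poc, int num_refs, int cur_poc,
                     int *poc1, int *poc2)
{
    int best = 0, second = 0;
    for (int i = 1; i < num_refs; i++) {
        int d = abs(ref_poc[i] - cur_poc);
        if (d < abs(ref_poc[best] - cur_poc)) {
            second = best;
            best = i;
        } else if (second == best || d < abs(ref_poc[second] - cur_poc)) {
            second = i;
        }
    }
    *poc1 = ref_poc[best];
    *poc2 = ref_poc[second];

    /* Re-selection when the two POCs coincide (cf. Equation (14)): look for
       a reference whose temporal distance from the current image differs. */
    for (int i = 0; i < num_refs && *poc1 == *poc2; i++)
        if (ref_poc[i] != *poc1)
            *poc2 = ref_poc[i];
}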
 When POC1 is selected, the feature quantity control unit 411 selects the reference image feature quantity corresponding to POC1 from the reference image group feature quantity, outputs it, and inputs it to the memory 412. When POC2 is selected, the feature quantity control unit 411 selects the reference image feature quantity corresponding to POC2 from the reference image group feature quantity, outputs it, and inputs it to the predicted image feature quantity calculating unit 413.
 In the second embodiment, an example in which the feature quantity control unit 411 selects two POCs has been described, but three or more POCs may be selected. In that case, the feature quantity control unit 411 selects three or more reference image feature quantities from the reference image group feature quantity, and the predicted image feature quantity is derived from the selected three or more reference image feature quantities. Note that, in the case of a P-slice, N ≥ 2 is required, and the feature quantity control unit 411 may search the reference list with list number 0 after executing Equation (12).
 The memory 412 holds the reference image feature quantity input from the feature quantity control unit 411 (the reference image feature quantity corresponding to POC1, hereinafter referred to as the first reference image feature quantity). At the timing when the reference image feature quantity corresponding to POC2 (hereinafter referred to as the second reference image feature quantity) is input from the feature quantity control unit 411 to the predicted image feature quantity calculating unit 413, the memory 412 outputs the first reference image feature quantity and inputs it to the predicted image feature quantity calculating unit 413.
 The predicted image feature quantity calculating unit 413 calculates the predicted image feature quantity using the first reference image feature quantity input from the memory 412 and the second reference image feature quantity input from the feature quantity control unit 411, outputs it, and inputs it to the parameter deriving unit 808.
 The predicted image feature quantity calculating unit 413 first calculates, according to Equation (15), the temporal distance between the reference image of the first reference image feature quantity and the reference image of the second reference image feature quantity.
 The predicted image feature quantity calculating unit 413 then calculates the pixel average value of the predicted image according to Equation (19) and the pixel error value of the predicted image according to Equation (20).
 However, when the DistScaleFactor calculated by Equation (15) satisfies any of the conditions of Equations (23) to (25), the predicted image feature quantity cannot be derived or the temporal distance is too large, and the predicted image feature quantity calculating unit 413 therefore sets the pixel average value DCP of the predicted image and the pixel error value ACP of the predicted image to initial values.
 For example, when the DistScaleFactor satisfies any of the conditions of Equations (23) to (25), the predicted image feature quantity calculating unit 413 sets the predicted image feature quantity derivation flag wp_avaiable_flag, an internal variable, to false; when the DistScaleFactor satisfies none of the conditions of Equations (23) to (25), it sets wp_avaiable_flag to true. When the predicted image feature quantity derivation flag is set to false, initial values are set in DCP and ACP; for example, DefaultDC indicating 0 is set in DCP and DefaultAC indicating 0 is set in ACP. When the predicted image feature quantity derivation flag is set to true, the values calculated by Equations (19) and (20) are set in DCP and ACP, respectively.
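 The following is a control-flow sketch of this step only. Equations (15), (19), (20) and the validity conditions (23) to (25) are not reproduced in this part of the text, so they appear here as assumed placeholder functions; only the flag handling and the default values follow the prose, and all names and types are assumptions.

/* Control-flow sketch of the predicted image feature quantity calculation
   in the calculating unit 413. */
extern int calc_dist_scale_factor(int poc1, int poc2, int cur_poc); /* Eq.(15), assumed placeholder      */
extern int dist_scale_invalid(int dist_scale);                      /* conditions (23)-(25), assumed     */
extern int interp_dc(int dc1, int dc2, int dist_scale);             /* Eq.(19), assumed placeholder      */
extern int interp_ac(int ac1, int ac2, int dist_scale);             /* Eq.(20), assumed placeholder      */

enum { DefaultDC = 0, DefaultAC = 0 };

typedef struct { int dcp, acp, wp_avaiable_flag; } PredFeature;

PredFeature calc_pred_feature(int dc1, int ac1, int poc1,
                              int dc2, int ac2, int poc2, int cur_poc)
{
    PredFeature f;
    int dist_scale = calc_dist_scale_factor(poc1, poc2, cur_poc);

    if (dist_scale_invalid(dist_scale)) {
        f.wp_avaiable_flag = 0;          /* derivation impossible or references too far apart */
        f.dcp = DefaultDC;
        f.acp = DefaultAC;
    } else {
        f.wp_avaiable_flag = 1;
        f.dcp = interp_dc(dc1, dc2, dist_scale);   /* pixel average value of the predicted image */
        f.acp = interp_ac(ac1, ac2, dist_scale);   /* pixel error value of the predicted image   */
    }
    return f;
}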
 The parameter deriving unit 808 derives the WP parameter information of the encoding target image using the reference image group feature quantity input from the reference image feature quantity deriving unit 806 and the predicted image feature quantity input from the predicted image feature quantity deriving unit 807.
 Since the WP parameter information has already been described with reference to FIGS. 11A and 11B, its description is omitted.
 The parameter deriving unit 808 first checks the predicted image feature quantity derivation flag wp_avaiable_flag included in the predicted image feature quantity to determine whether the WP parameter information can be derived. When wp_avaiable_flag is set to false, the WP parameter information cannot be derived, so the parameter deriving unit 808 sets the weighting factor and the offset corresponding to list number X and reference image number Y to initial values according to Equation (26) and Equation (27), respectively.
 When wp_avaiable_flag is set to false, the parameter deriving unit 808 sets the initial values in the WP parameter information by repeatedly executing Equations (26) and (27) for all combinations of list number X and reference image number Y (all reference images).
 When wp_avaiable_flag is set to false, the parameter deriving unit 808 also sets the WP application flag (WP_flag[X][Y]) corresponding to list number X and reference image number Y to false.
 On the other hand, when wp_avaiable_flag is set to true, the WP parameter information can be derived, so the parameter deriving unit 808 derives the weighting factor and the offset corresponding to list number X and reference image number Y according to Equation (29) and Equation (30), respectively.
 When wp_avaiable_flag is set to true, the parameter deriving unit 808 derives the WP parameter information shown in FIGS. 11A and 11B by repeatedly executing Equations (29) and (30) for all combinations of list number X and reference image number Y (all reference images).
 When wp_avaiable_flag is set to true, the parameter deriving unit 808 also sets the WP application flag (WP_flag[X][Y]) corresponding to list number X and reference image number Y to true.
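 The sketch below only illustrates the loop structure and flag handling just described. Equations (26), (27), (29) and (30) are not reproduced in this part of the text, so the derive_* placeholders and the default values are assumptions; the array sizes, names, and types are likewise illustrative.

/* Control-flow sketch of the implicit WP parameter derivation in the
   parameter deriving unit 808. */
extern int derive_weight(int ref_ac, int pred_ac, int log2_denom);             /* cf. Eq.(29), assumed */
extern int derive_wp_offset(int weight, int ref_dc, int pred_dc, int log2_denom); /* cf. Eq.(30), assumed */

void derive_wp_param_info(int wp_avaiable_flag, int num_lists, int num_refs,
                          const int ref_dc[2][16], const int ref_ac[2][16],
                          int pred_dc, int pred_ac, int log2_denom,
                          int Weight[2][16], int Offset[2][16], int WP_flag[2][16])
{
    for (int X = 0; X < num_lists; X++) {
        for (int Y = 0; Y < num_refs; Y++) {
            if (!wp_avaiable_flag) {
                Weight[X][Y]  = 1 << log2_denom;  /* assumed default (cf. Eq.(26)) */
                Offset[X][Y]  = 0;                /* assumed default (cf. Eq.(27)) */
                WP_flag[X][Y] = 0;                /* weighted prediction disabled  */
            } else {
                Weight[X][Y]  = derive_weight(ref_ac[X][Y], pred_ac, log2_denom);
                Offset[X][Y]  = derive_wp_offset(Weight[X][Y], ref_dc[X][Y], pred_dc, log2_denom);
                WP_flag[X][Y] = 1;
            }
        }
    }
}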
 The decoding unit 801 uses the syntax 500 shown in FIG. 15. The syntax 500 indicates the structure of the encoded data to be decoded by the decoding unit 801. Since the syntax 500 has already been described with reference to FIG. 15, its description is omitted. The picture parameter set syntax 505 has also already been described with reference to FIG. 16, except that decoding is performed instead of encoding, and its description is therefore omitted. Likewise, the sequence parameter set syntax 504 has already been described with reference to FIG. 17, except that decoding is performed instead of encoding, and its description is omitted.
 As described above, when performing weighted motion compensated prediction in the implicit mode, in which the weight parameters are not explicitly encoded, the decoding device 800 of the second embodiment derives the image feature quantity of the predicted image from the image feature quantities derived for two reference images and their temporal distance ratio, and derives a weighting factor and an offset as the weight parameters of the reference images and the predicted image. The code amount can thereby be reduced and the coding efficiency improved.
(Modification)
 In the first and second embodiments described above, an example has been described in which a frame is divided into rectangular blocks such as 16×16 pixel blocks and encoding/decoding is performed in order from the upper-left block of the picture toward the lower right (see FIG. 2A). However, the encoding order and the decoding order are not limited to this example. For example, encoding and decoding may be performed in order from the lower right toward the upper left, or so as to draw a spiral from the center of the picture toward the picture edge. Furthermore, encoding and decoding may be performed in order from the upper right toward the lower left, or so as to draw a spiral from the picture edge toward the center of the picture. In this case, since the position of the adjacent pixel blocks that can be referred to changes with the encoding order, the position may be changed to an available position as appropriate.
 In the first and second embodiments described above, prediction target block sizes such as 4×4, 8×8, and 16×16 pixel blocks have been described as examples, but the prediction target blocks need not have a uniform block shape. For example, the prediction target block size may be a 16×8 pixel block, an 8×16 pixel block, an 8×4 pixel block, a 4×8 pixel block, or the like. It is also not necessary to unify all block sizes within one coding tree block, and a plurality of different block sizes may be mixed. When a plurality of different block sizes are mixed within one coding tree block, the code amount for encoding or decoding the division information increases as the number of divisions increases. It is therefore desirable to select the block size in consideration of the balance between the code amount of the division information and the quality of the locally decoded image or the decoded image.
 In the first and second embodiments described above, for simplicity, a comprehensive description has been given for the color signal components without distinguishing between the prediction process for the luminance signal and that for the color difference signal. However, when the prediction process differs between the luminance signal and the color difference signal, the same prediction method or different prediction methods may be used. If different prediction methods are used for the luminance signal and the color difference signal, the prediction method selected for the color difference signal can be encoded or decoded in the same manner as for the luminance signal.
 In the first and second embodiments described above, for simplicity, a comprehensive description has been given for the color signal components without distinguishing between the weighted motion compensated prediction process for the luminance signal and that for the color difference signal. However, when the weighted motion compensated prediction process differs between the luminance signal and the color difference signal, the same process or different processes may be used. If a different weighted motion compensated prediction process is used for the color difference signal, the process selected for the color difference signal can be encoded or decoded in the same manner as for the luminance signal.
 In the first and second embodiments described above, syntax elements not specified in these embodiments may be inserted between the rows of the tables shown in the syntax configurations, and descriptions of other conditional branches may be included. Alternatively, a syntax table may be divided into, or integrated with, a plurality of tables. It is also not always necessary to use the same terms; they may be changed arbitrarily depending on the form of use.
 Modifications 1 to 4 of the first embodiment may also be applied to the second embodiment.
 As described above, each of the above embodiments solves the problem that an offset for additively correcting pixel values cannot be used when performing implicit weighted motion compensated prediction, and realizes highly efficient implicit weighted motion compensated prediction processing. Therefore, according to each of the above embodiments, the coding efficiency is improved and, in turn, the subjective image quality is also improved.
 While several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, substitutions, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and are included in the invention described in the claims and their equivalents.
 For example, a program that realizes the processing of each of the above embodiments can be provided by being stored in a computer-readable storage medium. The storage medium may have any storage format as long as it is a computer-readable storage medium capable of storing the program, such as a magnetic disk, an optical disc (CD-ROM, CD-R, DVD, etc.), a magneto-optical disc (MO, etc.), or a semiconductor memory.
 The program that realizes the processing of each of the above embodiments may also be stored on a computer (server) connected to a network such as the Internet and downloaded to a computer (client) via the network.
 100 Encoding device
 101 Subtraction unit
 102 Orthogonal transform unit
 103 Quantization unit
 104 Inverse quantization unit
 105 Inverse orthogonal transform unit
 106 Addition unit
 107 Predicted image generation unit
 108 Reference image feature quantity deriving unit
 109 Predicted image feature quantity deriving unit
 110 Parameter deriving unit
 111 Motion evaluation unit
 112 Encoding unit
 113 Encoding control unit
 201 Multi-frame motion compensation unit
 202 Memory
 203 Unidirectional motion compensation unit
 204 Prediction parameter control unit
 205 Reference image selector
 206 Frame memory
 207 Reference image control unit
 301 Default motion compensation unit
 302 Weighted motion compensation unit
 303 WP parameter control unit
 304, 305 WP selector
 401 Average value calculation unit
 402 Error value calculation unit
 403 Integration unit
 411 Feature quantity control unit
 412 Memory
 413 Predicted image feature quantity calculation unit
 800 Decoding device
 801 Decoding unit
 802 Inverse quantization unit
 803 Inverse orthogonal transform unit
 804 Addition unit
 805 Predicted image generation unit
 806 Reference image feature quantity deriving unit
 807 Predicted image feature quantity deriving unit
 808 Parameter deriving unit
 809 Decoding control unit

Claims (12)

  1.  A predicted image generation method comprising:
      a first derivation step of deriving, for each of two or more reference images, a pixel average value and a pixel error value indicating a difference of pixels from the pixel average value;
      a second derivation step of deriving a pixel average value of a predicted image using a temporal distance ratio between the predicted image and at least two reference images among the two or more reference images and the pixel average values of the at least two reference images, and deriving a pixel error value of the predicted image using the temporal distance ratio and the pixel error values of the at least two reference images;
      a third derivation step of deriving a weighting factor of a reference image using the pixel error value of the reference image and the pixel error value of the predicted image, and deriving an offset of the reference image using the derived weighting factor, the pixel average value of the reference image, and the pixel average value of the predicted image; and
      a predicted image generation step of generating the predicted image of a target block, which is one of a plurality of blocks into which an input image is divided, using the reference image of the target block among the reference images, the weighting factor of the reference image, and the offset of the reference image.
  2.  The predicted image generation method according to claim 1, wherein,
      in the third derivation step, the weighting factor of each of the reference images is derived according to a ratio between the pixel error value of the reference image and the pixel error value of the predicted image, and the offset of each of the reference images is derived according to the derived weighting factor, the pixel average value of the reference image, and the pixel average value of the predicted image, and
      in the predicted image generation step, the predicted image of the target block is generated by multiplying a value obtained by motion compensated prediction of the reference image of the target block according to a motion vector by the weighting factor of the reference image and adding the offset of the reference image.
  3.  The predicted image generation method according to claim 1, wherein, in the second derivation step, reference images having mutually different image display times are selected from the two or more reference images as the at least two reference images.
  4.  The predicted image generation method according to claim 3, wherein, in the second derivation step, the at least two reference images are selected from the two or more reference images in ascending order of temporal distance from the predicted image.
  5.  The predicted image generation method according to claim 1, wherein, in the second derivation step, the pixel average value and the pixel error value of the predicted image are derived by performing linear prediction by extrapolation or interpolation.
  6.  The predicted image generation method according to claim 1, wherein,
      in the first derivation step, the pixel average value and the pixel error value of each of the two or more reference images are derived with integer precision, and
      in the third derivation step, rounding to a predetermined fixed-point precision is performed when the weighting factor and the offset of the reference image are derived.
  7.  The predicted image generation method according to claim 1, wherein, in the first derivation step, the pixel average value and the pixel error value of each of the two or more reference images are derived in units of slices, lines, or pixel blocks and are quantized with a predetermined fixed-point precision.
  8.  The predicted image generation method according to claim 1, wherein, in the first derivation step, the pixel average value and the pixel error value of each of the two or more reference images are derived in units of subsampled slices, lines, or pixel blocks.
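For claims 7 and 8, the statistics can be gathered on a coarser grid. The sketch below subsamples the reference image with an assumed stride of 4 before computing the pixel average and pixel error values; the stride and the whole-picture granularity are illustrative choices.

```python
import numpy as np

def subsampled_statistics(ref_image, step=4):
    """Derive the pixel average value and the pixel error value from a
    subsampled grid of the reference image (sketch)."""
    samples = np.asarray(ref_image, dtype=np.float64)[::step, ::step]
    mean = float(samples.mean())
    error = float(np.abs(samples - mean).mean())
    return mean, error
```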
  9.  The predicted image generation method according to claim 1, wherein, in the first derivation step, the pixel average value and the pixel error value of each of the two or more reference images are derived after the input image is encoded; and in the third derivation step, the pixel average value and the pixel error value of the reference image are referred to in accordance with the management method of the reference image.
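One way to picture claim 9 is a small cache that stores each reconstructed picture's statistics and later serves them through the reference picture management; keying by picture order count is an illustrative assumption, not part of the claim.

```python
class ReferenceStatsCache:
    """Sketch: keep the statistics of each reconstructed picture so that they
    can be looked up later when that picture is used as a reference."""

    def __init__(self):
        self._stats = {}

    def store(self, poc, pixel_average, pixel_error):
        # Called once a picture has been encoded and reconstructed.
        self._stats[poc] = (pixel_average, pixel_error)

    def lookup(self, poc):
        # Called in the third derivation step when the picture is referenced.
        return self._stats[poc]
```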
  10.  The predicted image generation method according to claim 1, wherein, in the first derivation step, the pixel average value and the pixel error value of each of the two or more reference images are derived in units of slices, lines, or pixel blocks; in the second derivation step, the pixel average value and the pixel error value of the predicted image are derived in the same units; in the third derivation step, the weighting factor and the offset of the reference image are derived in the same units; and in the predicted image generation step, the predicted image of the target block is generated by multiplying a value obtained by motion-compensated prediction of the reference image of the target block according to a motion vector by the weighting factor of that reference image in the corresponding unit and adding the offset of that reference image in the corresponding unit.
  11.  An encoding method comprising:
    a first derivation step of deriving, for each of two or more reference images, a pixel average value and a pixel error value indicating a pixel difference from the pixel average value;
    a second derivation step of deriving a pixel average value of a predicted image using a temporal distance ratio between the predicted image and at least two reference images among the two or more reference images and the pixel average values of the at least two reference images, and deriving a pixel error value of the predicted image using the temporal distance ratio and the pixel error values of the at least two reference images;
    a third derivation step of deriving a weighting factor of the reference image using the pixel error value of the reference image and the pixel error value of the predicted image, and deriving an offset of the reference image using the derived weighting factor, the pixel average value of the reference image, and the pixel average value of the predicted image;
    a predicted image generation step of generating the predicted image of a target block, which is one of a plurality of blocks into which an input image is divided, using the reference image for the target block among the reference images, the weighting factor of the reference image, and the offset of the reference image; and
    an encoding step of encoding a value based on the input image and the predicted image.
  12.  A decoding method comprising:
    a first derivation step of deriving, for each of two or more reference images, a pixel average value and a pixel error value indicating a pixel difference from the pixel average value;
    a second derivation step of deriving a pixel average value of a predicted image using a temporal distance ratio between the predicted image and at least two reference images among the two or more reference images and the pixel average values of the at least two reference images, and deriving a pixel error value of the predicted image using the temporal distance ratio and the pixel error values of the at least two reference images;
    a third derivation step of deriving a weighting factor of the reference image using the pixel error value of the reference image and the pixel error value of the predicted image, and deriving an offset of the reference image using the derived weighting factor, the pixel average value of the reference image, and the pixel average value of the predicted image;
    a predicted image generation step of generating the predicted image of a target block, which is one of a plurality of blocks into which an input image is divided, using the reference image for the target block among the reference images, the weighting factor of the reference image, and the offset of the reference image; and
    a decoding step of generating a decoded image based on the predicted image.
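Because both the encoder and the decoder can derive the weighting factor and offset from reference-image statistics alone, the encoding step of claim 11 only has to encode a value based on the input image and the predicted image, and the decoding step of claim 12 reverses it. A minimal sketch, assuming the encoded value is a plain residual and omitting the transform, quantization, and entropy-coding stages:

```python
import numpy as np

def encode_block(input_block, pred_block):
    """Encoding step (sketch): a value based on the input image and the
    predicted image, shown here as an untransformed residual."""
    return np.asarray(input_block, dtype=np.int32) - np.asarray(pred_block, dtype=np.int32)

def decode_block(residual, pred_block, max_val=255):
    """Decoding step (sketch): generate the decoded image from the predicted
    image and the decoded residual, clipped to the pixel range."""
    rec = np.asarray(pred_block, dtype=np.int32) + residual
    return np.clip(rec, 0, max_val).astype(np.uint8)
```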
PCT/JP2011/075851 2011-11-09 2011-11-09 Prediction image generation method, encoding method, and decoding method WO2013069117A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2011/075851 WO2013069117A1 (en) 2011-11-09 2011-11-09 Prediction image generation method, encoding method, and decoding method
TW101101752A TW201320750A (en) 2011-11-09 2012-01-17 Prediction image generation method, encoding method, and decoding method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2011/075851 WO2013069117A1 (en) 2011-11-09 2011-11-09 Prediction image generation method, encoding method, and decoding method

Publications (1)

Publication Number Publication Date
WO2013069117A1 true WO2013069117A1 (en) 2013-05-16

Family

ID=48288709

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2011/075851 WO2013069117A1 (en) 2011-11-09 2011-11-09 Prediction image generation method, encoding method, and decoding method

Country Status (2)

Country Link
TW (1) TW201320750A (en)
WO (1) WO2013069117A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2022065225A (en) * 2019-03-08 2022-04-27 Sharp Corp Lic unit, image decoding device, and image coding device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004179687A (en) * 2002-11-22 2004-06-24 Toshiba Corp Motion picture coding/decoding method and apparatus thereof
JP2007081518A (en) * 2005-09-12 2007-03-29 Victor Co Of Japan Ltd Moving image coding apparatus and moving image coding method
WO2009005071A1 (en) * 2007-07-02 2009-01-08 Nippon Telegraph And Telephone Corporation Moving picture scalable encoding and decoding method, their devices, their programs, and recording media storing the programs
WO2010035731A1 (en) * 2008-09-24 2010-04-01 Sony Corp Image processing apparatus and image processing method

Also Published As

Publication number Publication date
TW201320750A (en) 2013-05-16

Similar Documents

Publication Publication Date Title
US11871007B2 (en) Encoding device, decoding device, encoding method, and decoding method
JP5925801B2 (en) Encoding method, decoding method, encoding device, and decoding device
WO2013057783A1 (en) Encoding method and decoding method
JP6105034B2 (en) Decoding method and decoding apparatus
JP2018007278A (en) Decoding method and decoder
WO2013069117A1 (en) Prediction image generation method, encoding method, and decoding method
JP6419934B2 (en) Electronic device, encoding method and program
JP2017085633A (en) Decoding method and decoding apparatus
JP6262381B2 (en) Electronic device, decoding method and program
JP6235742B2 (en) Electronic device, decoding method and program
JP5916906B2 (en) Encoding method, decoding method, encoding device, and decoding device
JP6088036B2 (en) Decoding method, decoding apparatus, and program
JP6744507B2 (en) Encoding method and decoding method
JP5702011B2 (en) Encoding method, encoding apparatus, and program
JP6682594B2 (en) Encoding method and decoding method
JP2018007279A (en) Decoding method and decoder
JP6262380B2 (en) Electronic device, decoding method and program
JP6132950B2 (en) Encoding method, decoding method, encoding device, and decoding device
JP6235745B2 (en) Electronic device, decoding method and program
JP5869160B2 (en) Decoding method, decoding apparatus, and program
JP2020129848A (en) Data structure of encoded data, storage device, transmission device, and encoding method
JP2020074609A (en) Storage device, transmission device, reception device, and encoded data
JP2019009792A (en) Encoding method, decoding method and encoded data
JP2014197852A (en) Decoding method and decoder
JPWO2013057783A1 (en) Encoding method and decoding method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11875549

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11875549

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP