CN101663896A - Method and apparatus for encoding video data, method and apparatus for decoding encoded video data and encoded video signal

Info

Publication number: CN101663896A
Application number: CN 200880013169
Authority: CN
Inventors: 高永英; 武宇文; 英格·多塞
Original assignee: Thomson Licensing SAS
Current assignee: Thomson Licensing SAS
Priority date: 2007-04-23
Filing date: 2008-04-09
Publication date: 2010-03-03

Abstract

For two or more versions of a video with different spatial, temporal or SNR resolution, scalability can be achieved by generating a base layer (BL) and an enhancement layer (EL). When a version of a video is available that has a higher color bit depth than can be displayed, a common solution is tone mapping. A more efficient compression method is proposed for the case where the two or more versions with different color bit depth use different color encoding. The present invention is based on joint inter-layer prediction among the available color channels. Thus, color bit depth scalability can also be used where the two or more versions with different color bit depth use different color encoding. In this case the inter-layer prediction is a joint prediction based on all color components. Prediction may also include color space conversion and gamma correction.

Description

Method and device for encoding video data, method and device for decoding encoded video data and encoded video signal

Technical field

The present invention relates to digital video coding. More specifically, it relates to a method and a device for encoding video data, a method and a device for decoding encoded video data, and a corresponding encoded video signal.

Background

In recent years, digital images and video with a bit depth above 8 bits have become increasingly desirable in many application fields, such as medical image processing, digital cinema workflows in production and post-production, and home-theater-related applications. State-of-the-art image and video coding techniques are likewise moving towards high-bit-depth coding. The JVT standardized high-bit-depth coding in the H.264 Fidelity Range Extensions (FRExt), which support bit depths of up to 14 bits and chroma sampling up to 4:4:4. Motion JPEG 2000 (Part 3), in turn, supports up to 32 bits per component.

Color bit depth scalability is potentially very useful, considering that conventional 8-bit and high-bit digital imaging systems will coexist in the market for a long time to come. There are several ways to handle the coexistence of 8-bit video and high-bit video. A first solution is to provide only a high-bit coded bitstream and to let a tone mapping method derive an 8-bit representation for standard 8-bit display devices. A second solution is to provide a simulcast bitstream that contains an 8-bit coded bitstream. Which bitstream is decoded is then up to the decoder. This means, for example, that a more capable decoder supporting the AVC High 10 profile can decode and output 10-bit video, while a conventional decoder can output only 8-bit video. Typically, the first solution is not compatible with H.264/AVC 8-bit decoders. The second solution is compatible with all current standards but requires more overhead. A scalable solution, however, can be a good trade-off between bit reduction and backward standard compatibility. SVC (also known as the scalable extension of H.264/AVC) is considering support for bit depth scalability.

Methods for color bit depth scalability have not yet been studied much. Unlike spatial scalability, which can be achieved using spatial upsampling between different resolutions, the challenge here is that it may be difficult to encode the additional information that leads from the reconstructed low-bit picture to the original high-bit picture. For example, for 8-bit to 10-bit scalability, the additional information may itself be as large as 10 bits because of the quantization error introduced when the 8-bit picture is encoded. Inter-layer bit depth prediction is also unlike FGS, which uses bit-plane scanning in the transform domain.

Furthermore, different options for color encoding are known, using different types of color spaces, chromaticity coordinates and gamma correction (e.g., RGB, YCrCb, HSV, XYZ). Various conversion algorithms exist.

When a version of a video is available that has a higher color bit depth than can be displayed, the usual solution is tone mapping, in which the high dynamic range is reduced to the lower color bit depth while contrast is preserved. When two or more versions of a video with different spatial, temporal or SNR resolution are available, scalability can be achieved by generating a base layer (BL) and an enhancement layer (EL) that is to be combined with the BL.

However, an inherent problem of the tone mapping approach is that more data is transmitted than necessary. A more efficient compression method is needed for the case where the two or more versions with different color bit depth use different color encoding.

Summary of the invention

The invention is based on the recognition that, in bit depth scalable video coding, it is generally advantageous to perform joint inter-layer prediction among the available color channels. Thus, according to the invention, color bit depth scalability can also be used where the two or more versions with different color bit depth use different color encoding. In this case, the inter-layer prediction is a joint prediction based on all color components. The prediction may also include color space conversion and gamma correction.

According to one aspect of the invention, a method is provided for encoding video data comprising base layer data and enhancement layer data, wherein the base layer and enhancement layer data comprise a plurality of color channels (e.g., Y, Cr, Cb or R, G, B) and have different bit depths. The method comprises the steps of: encoding the base layer data; predicting the enhancement layer data from the base layer data, separately for the color channels; and encoding the enhancement layer data, separately for the color channels, based on said predicted enhancement layer data, wherein, in at least one mode, each enhancement layer color channel is jointly predicted from all available base layer color channels. For at least one enhancement layer color channel, the method further comprises the steps of: generating residual data, the residual data being the difference between the original enhancement layer color channel data and the predicted color channel data; encoding the original enhancement layer color channel data; encoding the residual data; selecting, for the at least one enhancement layer color channel, the encoded original enhancement layer color channel data, the residual data or the encoded residual data, wherein the selection is independent of the selection for the other enhancement layer color channels; and providing the selected enhancement layer color channel data as enhancement layer output data, together with an indication of the coding mode selected for said enhancement layer color channel.

According to another aspect of the invention, a method for decoding encoded video data having BL and EL data comprises the steps of: extracting BL data and EL data from the encoded video data, wherein the BL data and the EL data each comprise separate data for a plurality of color channels; extracting, at least for a first color channel of the enhancement layer, an indication of a coding mode; decoding the base layer data of the plurality of color channels; predicting EL data based on the decoded base layer data, wherein, in at least one mode, each EL color channel is jointly predicted from all available BL color channels; decoding the EL data of the plurality of color channels, wherein a residual is obtained and, at least for said first color channel, said indication is used to decode according to the indicated coding mode; and reconstructing the EL data of the plurality of color channels based on the predicted EL data and said residual.

According to yet another aspect of the invention, an apparatus is provided for encoding video data comprising base layer data and enhancement layer data, wherein the base layer data and enhancement layer data comprise a plurality of color channels, and the base layer and the enhancement layer have different bit depths. The apparatus comprises: means for encoding the base layer; means for predicting the enhancement layer from the base layer, separately for the color channels; and means for encoding the enhancement layer, separately for the color channels (e.g., R, G, B), based on said predicted enhancement layer, wherein, in at least one mode, each enhancement layer color channel is jointly predicted from all available base layer color channels. For at least one enhancement layer color channel, the apparatus further comprises: means for generating a residual, the residual being the difference between the original enhancement layer color channel image and the predicted color channel image; means for encoding the original enhancement layer color channel image; means for encoding the residual; means for selecting, for the at least one enhancement layer color channel, the encoded original enhancement layer color channel image, the residual or the encoded residual, wherein the selection is independent of the selection for the other enhancement layer color channels; and means for providing the selected enhancement layer color channel data as enhancement layer output data and for providing an indication of the coding mode selected for said enhancement layer color channel.

According to another aspect of the invention, an apparatus for decoding encoded video data having base layer data and enhancement layer data comprises: means for extracting base layer data and enhancement layer data from the encoded video data, wherein the base layer data and the enhancement layer data each comprise separate data for a plurality of color channels; means for extracting, at least for a first color channel of the enhancement layer, an indication of a coding mode; means for decoding the base layer data of the plurality of color channels; means for predicting enhancement layer data based on the decoded base layer data, wherein, in at least one mode, each enhancement layer color channel is jointly predicted from all available base layer color channels; means for decoding the enhancement layer data of the plurality of color channels, wherein a residual is obtained and, at least for said first color channel, said indication is used to decode according to the indicated coding mode; and means for reconstructing the enhancement layer data of the plurality of color channels based on the predicted enhancement layer data and said residual.

According to another aspect, an encoded video signal comprises base layer data and enhancement layer data, wherein the base layer data comprise a plurality of color channels in a first color encoding and the enhancement layer data comprise a plurality of color channels in a different, second color encoding, and the base layer data and enhancement layer data have different color bit depths. The signal further comprises a coding mode indication that indicates, at least for a first enhancement layer color channel, whether the channel comprises encoded residual data or encoded macroblock data.

A particular advantage of the proposed coding solution is that it complies with the H.264/AVC standard and is compatible with the kinds of scalability supported in the H.264/AVC scalable extension (SVC).

At least one implementation proposes an H.264/AVC-compatible color bit depth scalable coding solution in which a low-bit (typically 8-bit) sequence and a high-bit (e.g., 10-, 12- or 14-bit) sequence are encoded as the base layer and the enhancement layer, respectively. In one embodiment of the disclosed solution, inter-layer prediction between the low-bit BL and the high-bit EL is performed at the macroblock (MB) level in order to exploit the redundancy between the low-bit and high-bit representations of the same video. Furthermore, the inter-layer color bit depth prediction for each color channel (e.g., Y, Cb or Cr) is not independent. Instead, it is performed jointly: through joint inter-layer color bit depth prediction, all (typically three) color channels of the reconstructed co-located base layer MB are used to determine the predicted version of each channel of the enhancement layer MB.

Advantageous embodiments of the invention are disclosed in the appended claims, the following description and the drawings.

Brief description of the drawings

Exemplary embodiments of the invention are described with reference to the accompanying drawings, in which:

Fig. 1 shows a framework for color bit depth scalable coding;

Fig. 2 shows joint inter-layer prediction in intra coding;

Fig. 3 shows joint inter-layer prediction in inter coding; and

Fig. 4 shows adaptive inter-layer color bit depth prediction in inter coding.

Detailed description

Without loss of generality, assume that there are two layers of color bit depth scalability: one layer is an 8-bit video sequence and the other is a 10-bit video sequence. For at least one implementation, the proposed framework for color bit depth scalable coding is shown in Fig. 1.

The scalable encoder Enc generates a bit depth scalable bitstream SBS in which the BL and EL coded pictures are multiplexed. The scalable decoder Dec can generate either 8-bit video, by decoding only the BL bitstream, or 10-bit video, by decoding the whole scalable bitstream SBS. Multiple versions of the same visual content with different bit depths are thus delivered to different clients, and device adaptation is achieved through the proposed color bit depth scalable coding.

It should be emphasized that the two input sequences, the 8-bit and the 10-bit video sequence, may differ in more than bit depth. Inter-layer prediction may therefore involve, for example:

1) adjustment for different gamma corrections and different chromaticity coordinates, e.g., conversion from the RGB color space (Rec. BT.601) to the RGB color space (Rec. BT.709), or from the RGB color space (Rec. BT.601) to a device-specified RGB color space;

2) color space conversion (including adjustment for different gamma corrections), e.g., conversion from the XYZ color space to the sRGB color space, from the YCbCr color space (Rec. BT.709) to the RGB color space (Rec. BT.709), or from the YCbCr color space (Rec. BT.601) to the YCbCr color space (Rec. BT.709);

3) chroma format conversion, e.g., YCbCr 4:2:0 to YCbCr 4:2:2, or YCbCr 4:2:0 to YCbCr 4:4:4;

4) color correction; and

5) a combination of the above.

Cases 1), 2) and 3) may involve non-linear transformations, while in case 4) the relationship between the two sequences under consideration may be as complex as a look-up table (LUT). In addition, case 2) may involve processing across different color channels. For example, the conversion from the YCbCr color space (Rec. BT.709) to the RGB color space (Rec. BT.709) is modeled mathematically as a matrix operation, so that for each pixel the value of R (or G, or B) is computed as a linear combination of the values of Y, Cb and Cr. At least one implementation proposes joint inter-layer prediction that includes processing across different color channels; this joint inter-layer prediction can be performed at picture level or at MB level.
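The matrix form of the YCbCr-to-RGB conversion can be sketched as follows. This is an illustrative sketch only, not the predictor defined by the patent: it assumes full-range samples, the commonly quoted BT.709 coefficients (Kr = 0.2126, Kb = 0.0722), and a plain linear scaling from 8 to 10 bits. The function name `predict_el_rgb` and the array layout are hypothetical.

```python
import numpy as np

# Full-range BT.709 YCbCr -> RGB matrix (assumed coefficients).
# Rows produce R, G, B; columns weight Y, centred Cb, centred Cr.
BT709_YCBCR_TO_RGB = np.array([
    [1.0,  0.0,     1.5748],
    [1.0, -0.1873, -0.4681],
    [1.0,  1.8556,  0.0],
])

def predict_el_rgb(bl_ycbcr, bl_bits=8, el_bits=10):
    """Predict EL RGB samples jointly from all three BL channels.

    bl_ycbcr: integer array of shape (..., 3) holding Y, Cb, Cr
    (hypothetical layout chosen for this sketch).
    """
    ycbcr = bl_ycbcr.astype(np.float64)
    ycbcr[..., 1:] -= 2 ** (bl_bits - 1)   # centre chroma around zero
    rgb = ycbcr @ BT709_YCBCR_TO_RGB.T     # each output mixes Y, Cb and Cr
    rgb *= 2 ** (el_bits - bl_bits)        # assumed linear bit depth scaling
    return np.clip(np.rint(rgb), 0, 2 ** el_bits - 1).astype(np.int32)

# A neutral grey BL pixel maps to a neutral grey EL pixel.
grey = predict_el_rgb(np.array([[128, 128, 128]]))
```

Because every row of the matrix has more than one non-zero entry, no EL channel can be predicted from a single BL channel alone, which is what makes the prediction joint.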

In the following, encoding and decoding methods that implement joint inter-layer color bit depth prediction are presented. This section provides details of various implementations; such implementations may also be discussed in other sections. At least one implementation provides a technical solution for AVC-compatible joint inter-layer prediction that achieves color bit depth scalability. Figs. 2 and 3 show the corresponding diagrams of a color bit depth scalable encoder for intra and inter coding with MB-level inter-layer color bit depth prediction. Without loss of generality, it is assumed that the inter-layer color bit depth prediction includes a conversion from the YCbCr color space (Rec. BT.709) to the RGB color space (Rec. BT.709). The decoding process is the inverse of the encoding process, in both intra and inter coding.

With regard to Figs. 2 and 3, it should be noted that the three rate-distortion optimization (RDO) blocks RDO_r, RDO_g, RDO_b are independent of each other. That is, for each color channel it can be decided individually whether to intra/inter code the enhancement layer directly, without prediction, or otherwise to perform the prediction, generate a residual, and either intra/inter code this residual directly or apply transform (T), quantization (Q) and entropy coding to it before the rate-distortion optimization decision. During RDO, the best trade-off between data rate and distortion is determined and the corresponding signal is selected. In the case of inter prediction, as shown in Fig. 3, the motion vectors 305r, 305g, 305b from the base layer MB can be used in the enhancement layer.
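The independence of the three RDO decisions can be sketched as below. The text only states that the best trade-off between data rate and distortion is selected; the Lagrangian cost J = D + lambda * R used here is a common choice and is an assumption, as are the mode names and the numbers in the example.

```python
def select_mode(candidates, lam):
    """Return the (mode, cost) pair minimising J = D + lam * R.

    candidates: dict mapping a mode name to (distortion, rate_bits).
    """
    costs = {mode: d + lam * r for mode, (d, r) in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]

def select_modes_per_channel(per_channel, lam):
    # Each channel is decided on its own candidates only, so the
    # choice for R never constrains the choice for G or B.
    return {ch: select_mode(c, lam)[0] for ch, c in per_channel.items()}

# Hypothetical (distortion, rate) measurements for one macroblock.
example = {
    "R": {"el_direct": (10.0, 900), "residual": (12.0, 300), "coded_residual": (14.0, 120)},
    "G": {"el_direct": (8.0, 200), "residual": (30.0, 400), "coded_residual": (25.0, 150)},
    "B": {"el_direct": (10.0, 900), "residual": (11.0, 100), "coded_residual": (20.0, 90)},
}
chosen = select_modes_per_channel(example, lam=0.05)
# chosen == {"R": "coded_residual", "G": "el_direct", "B": "residual"}
```

In this made-up example the three channels end up in three different modes, which is exactly the freedom that separate RDO blocks provide.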

An indication of the selected coding type may be included in the syntax (e.g., in the MB type field).

Fig. 4 shows the use of an additional skip mode in each EL branch, so that the RDO has four inputs: a new mode, called skip mode, is introduced to skip the EL residual signal. If the RDO selects skip mode, the EL contains no bits for the current MB. At the decoder, only the BL MB is decoded, and inter-layer color bit depth prediction is performed to obtain the reconstructed EL MB. Intra-layer prediction works in principle in the same way.
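A minimal sketch of how a decoder might treat the skip mode for one channel of a macroblock, under the assumption that the inter-layer prediction has already been computed. The function name `reconstruct_el_mb` and the convention of passing `None` for a skipped residual are hypothetical.

```python
import numpy as np

def reconstruct_el_mb(predicted_mb, residual_mb=None, el_bits=10):
    """Rebuild one EL macroblock channel from its inter-layer prediction.

    residual_mb is None when skip mode was signalled, i.e. the EL
    carries no bits for this macroblock.
    """
    pred = np.asarray(predicted_mb, dtype=np.int32)
    if residual_mb is None:
        # Skip mode: the prediction itself is the reconstruction.
        return pred.copy()
    recon = pred + np.asarray(residual_mb, dtype=np.int32)
    # Keep samples inside the EL sample range.
    return np.clip(recon, 0, 2 ** el_bits - 1)
```

With skip mode the reconstruction quality depends entirely on the predictor, so it is most attractive for macroblocks where the joint prediction is already close to the original.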

The following list gives a brief summary of various implementations. It is not exhaustive; it merely describes a few of the many possible implementations.

With reference to Figs. 2 and 3, a method for encoding video data comprising base layer data and enhancement layer data, wherein the base layer and enhancement layer data comprise a plurality of color channels (e.g., Y, Cr, Cb or R, G, B) and have different bit depths, comprises the following steps:

encoding 201y, 201cr, 201cb the base layer data; predicting the enhancement layer data from the base layer data, separately for the color channels; and encoding the enhancement layer data, separately for the color channels (e.g., R, G, B), based on said predicted enhancement layer data,

wherein, in at least one mode, each enhancement layer color channel is jointly predicted 200 from all available base layer color channels, and the method further comprises, for at least one (or some, or all) of the enhancement layer color channels, the following steps:

generating residual data R_res, B_res, G_res, the residual data being the difference between the original enhancement layer color channel data R_EL, G_EL, B_EL and the predicted color channel data;

encoding 202r, 202g, 202b the original enhancement layer color channel data;

encoding 203r, 203g, 203b, 204r, 204g, 204b the residual data;

selecting RDO_r, RDO_g, RDO_b, for the at least one enhancement layer color channel, the encoded original enhancement layer color channel data, the residual data or the encoded residual data, wherein the selection is independent of the selection for the other enhancement layer color channels; and

providing the selected enhancement layer color channel data as enhancement layer output data, together with an indication of the coding mode selected for said enhancement layer color channel.

In one embodiment, the base layer and the enhancement layer use different color encodings (e.g., Y, CR, CB and R, G, B), and the inter-layer prediction 200 further includes color space conversion for intra and inter coding.

In one embodiment, the color space conversion includes a conversion from the YCbCr color space (Rec. BT.709) to the RGB color space (Rec. BT.709).

In one embodiment, encoding the residual includes entropy coding 204r, 204g, 204b.

In one embodiment, an additional coding mode for the enhancement layer color channel data is a skip mode 405 at macroblock level; in skip mode, the enhancement layer data contain no bits for the respective macroblock.

In one embodiment, in the step of selecting RDO_r, RDO_g, RDO_b, the selection is based on minimizing data rate and distortion.

In one embodiment, the prediction 200 across different color channels is performed at picture level.

In one embodiment, the prediction across different color channels is performed at macroblock level.

In one embodiment, the method further comprises entropy coding EC_{Y,BL}, EC_{Cb,BL}, EC_{Cr,BL}, EC_{Y,EL}, EC_{Cb,EL}, EC_{Cr,EL}, performed separately for each base layer and enhancement layer color channel.

According to another aspect of the invention, a method for decoding encoded video data having BL data and EL data comprises the following steps:

extracting base layer data and enhancement layer data from the encoded video data, wherein the base layer data and the enhancement layer data each comprise separate data for a plurality of color channels; extracting, at least for a first color channel of the enhancement layer, an indication of a coding mode; decoding the base layer data of the plurality of color channels; predicting enhancement layer data based on the decoded base layer data, wherein, in at least one mode, each enhancement layer color channel is jointly predicted from all available base layer color channels; decoding the enhancement layer data of the plurality of color channels, wherein a residual is obtained and, at least for said first color channel, said indication is used to decode according to the indicated coding mode; and reconstructing the enhancement layer data of the plurality of color channels based on the predicted enhancement layer data and said residual.
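The role of the per-channel mode indication in these decoding steps can be sketched as follows. This is a simplified illustration that assumes entropy decoding, inverse transform and inter-layer prediction have already produced the arrays passed in; the mode names `"el_direct"` and `"residual"` and both function names are hypothetical and cover only two of the possible modes.

```python
import numpy as np

def decode_el_channel(mode, payload, predicted, el_bits=10):
    """Reconstruct one EL color channel according to its signalled mode."""
    payload = np.asarray(payload, dtype=np.int32)
    if mode == "el_direct":
        # The channel was coded without inter-layer prediction:
        # the decoded payload already is the channel.
        return payload
    if mode == "residual":
        # The payload is a residual relative to the joint prediction.
        recon = np.asarray(predicted, dtype=np.int32) + payload
        return np.clip(recon, 0, 2 ** el_bits - 1)
    raise ValueError(f"unknown coding mode: {mode!r}")

def decode_el(modes, payloads, predictions):
    # The mode is looked up per channel, mirroring the independent
    # per-channel selection made at the encoder.
    return {ch: decode_el_channel(modes[ch], payloads[ch], predictions[ch])
            for ch in modes}
```

Keeping the mode lookup per channel means a single macroblock can mix directly coded and residual-coded channels without any extra logic in the decoder.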

The following embodiments relate to the method for decoding. In one embodiment, the base layer and the enhancement layer use different color encodings (e.g., Y, CR, CB or R, G, B), and the prediction step further includes color space conversion for intra and inter coding.

In one embodiment, the color space conversion includes a YCbCr color space to RGB color space conversion.

In one embodiment, decoding the residual includes entropy decoding.

In one embodiment, an additional decoding mode for the enhancement layer color channels is employed, namely a skip mode at macroblock level, wherein in skip mode the enhancement layer data contain no bits for the respective macroblock.

In one embodiment, the prediction across different color channels is performed at picture level.

In one embodiment, the prediction across different color channels is performed at macroblock level.

In one embodiment, the method further comprises entropy decoding performed separately for each base layer and enhancement layer color channel.

According to another aspect, an apparatus for encoding video data comprising base layer data and enhancement layer data, wherein the base layer data and enhancement layer data comprise a plurality of color channels (e.g., Y, CR, CB or R, G, B), and the base layer and the enhancement layer have different bit depths, comprises:

means 201y, 201cr, 201cb for encoding the base layer;

means 200 for predicting the enhancement layer from the base layer, separately for the color channels; and

means for encoding the enhancement layer, separately for the color channels R, G, B, based on said predicted enhancement layer, wherein, in at least one mode, each enhancement layer color channel R, G, B is jointly predicted 200 from all available base layer color channels, and the apparatus further comprises, for at least one enhancement layer color channel:

means for generating a residual R_res, B_res, G_res, the residual being the difference between the original enhancement layer color channel R_EL, G_EL, B_EL and the predicted color channel image; means 202r, 202g, 202b for encoding the original enhancement layer color channel image; means 203r, 203g, 203b, 204r, 204g, 204b for encoding the residual; means RDO_r, RDO_g, RDO_b for selecting, for the at least one enhancement layer color channel, the encoded original enhancement layer color channel image, the residual or the encoded residual, wherein the selection is independent of the selection for the other enhancement layer color channels; and means for providing the selected enhancement layer color channel data as enhancement layer output data and for providing an indication of the coding mode selected for said enhancement layer color channel.

以下实施例涉及用于对视频数据进行编码的设备。The following embodiments relate to devices for encoding video data.

在一个实施例中，基本层和增强层使用不同的颜色编码Y、CR、CB，R、G、B，并且用于执行帧间预测的装置200还包括：用于针对帧内和帧间编码执行颜色空间转换的装置。In one embodiment, the base layer and the enhancement layer use different color codes Y, CR, CB, R, G, B, and the apparatus 200 for performing inter-frame prediction further includes: for intra-frame and inter-frame coding A device that performs color space conversion.

在一个实施例中，颜色空间转换包括YCbCr颜色空间(Rec.BT.709)至RGB颜色空间(Rec.BT.709)转换。In one embodiment, the color space conversion includes YCbCr color space (Rec. BT.709) to RGB color space (Rec. BT.709) conversion.

In one embodiment, the means for encoding the residual comprises means 204r, 204g, 204b for performing entropy encoding.

In one embodiment, the apparatus further comprises means 405 for performing, for the enhancement layer color channels, a skip mode at macroblock level as an additional coding mode, wherein, in skip mode, the enhancement layer contains no bits for the corresponding macroblock.

According to another aspect of the invention, an apparatus for decoding encoded video data having base layer data and enhancement layer data comprises:

means for extracting base layer data and enhancement layer data from the encoded video data, wherein the base layer data and the enhancement layer data comprise separate data for a plurality of color channels; means for extracting, at least for a first color channel of the enhancement layer, an indication of a coding mode; means for decoding the base layer data of the plurality of color channels; means for predicting enhancement layer data based on the decoded base layer data, wherein, in at least one mode, each enhancement layer color channel is jointly predicted from all available base layer color channels; means for decoding the enhancement layer data of the plurality of color channels, wherein a residual is obtained and, at least for said first color channel, decoding is performed according to the indicated coding mode using said indication; and means for reconstructing the enhancement layer data of the plurality of color channels based on the predicted enhancement layer data and said residual.
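The reconstruction step at the decoder can be sketched as follows for a single color channel. The mode labels (`"residual"`, `"macroblock"`, `"skip"`), the clipping to the enhancement layer bit depth, and the choice to output the prediction unchanged in skip mode are assumptions made for this example; the description only states that a mode indication is read, that a residual is obtained, and that reconstruction uses the prediction and the residual.

```python
from typing import List

Samples = List[int]


def reconstruct_channel(mode: str, payload: Samples, predicted: Samples,
                        el_bits: int = 10) -> Samples:
    """Rebuild one enhancement layer color channel of one block."""
    max_val = (1 << el_bits) - 1

    def clip(v: int) -> int:
        return min(max(v, 0), max_val)

    if mode == "residual":
        # Decoded residual is added to the inter-layer prediction.
        return [clip(p + r) for p, r in zip(predicted, payload)]
    if mode == "macroblock":
        # Channel was coded directly; the prediction is not used.
        return [clip(v) for v in payload]
    if mode == "skip":
        # No bits were sent; assumed here to mean "use the prediction as is".
        return [clip(p) for p in predicted]
    raise ValueError(f"unknown coding mode: {mode}")
```

Calling the function once per channel with that channel's own mode mirrors the per-channel mode indication carried in the encoded signal.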

The following embodiments relate to the apparatus for decoding encoded video data.

In one embodiment, the base layer and the enhancement layer use different color encodings (the Y, Cr, Cb color space or the R, G, B color space, respectively), and the means for predicting further comprises means for performing color space conversion in the case of intra and inter coding.

In one embodiment, the means for performing color space conversion comprises means for performing a conversion from the YCbCr color space to the RGB color space.

In one embodiment, the means for decoding the residual comprises means for entropy decoding.

In one embodiment, the apparatus further comprises means for performing, for at least one enhancement layer color channel, skip mode decoding at macroblock level as an additional decoding mode, wherein, in skip mode, the enhancement layer data contains no bits for the corresponding macroblock.

In one embodiment, the means for performing prediction across different color channels operates at picture level.

In one embodiment, the means for performing prediction across different color channels operates at macroblock level.

In one embodiment, the apparatus further comprises means for entropy decoding separately for each base layer and enhancement layer color channel.

According to yet another aspect, an encoded video signal comprises base layer data and enhancement layer data, wherein the base layer data comprises a plurality of color channels of a first color encoding, e.g. Y, Cr, Cb, and the enhancement layer data comprises a plurality of color channels R, G, B of a different, second color encoding, wherein the base layer data and the enhancement layer data have different color bit depths, and wherein the signal further comprises a coding mode indication that indicates, at least for a first enhancement layer color channel, whether that channel comprises encoded residual data or encoded macroblock data.

According to one aspect, joint inter-layer prediction is performed by predicting each color channel of an enhancement layer macroblock (MB) from all (typically three) color channels of the reconstructed co-located base layer MB.
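A minimal sketch of such a joint prediction is given below. It assumes, purely for illustration, that each enhancement layer channel is an affine combination of the three base layer channels followed by a scale-up from an 8-bit base layer to a 10-bit enhancement layer; the 3x3 `weights`, the `offsets`, the bit depths and the function names are hypothetical and are not taken from the description.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def predict_el_pixel(bl_pixel: Pixel,
                     weights: Sequence[Sequence[float]],
                     offsets: Sequence[float],
                     bl_bits: int = 8, el_bits: int = 10) -> Pixel:
    """Predict one enhancement layer pixel from ALL base layer channels."""
    scale = 1 << (el_bits - bl_bits)       # assumed bit depth up-scaling
    max_val = (1 << el_bits) - 1
    out = []
    for row, off in zip(weights, offsets):
        # Each output channel mixes all three reconstructed input channels.
        v = sum(w * c for w, c in zip(row, bl_pixel)) + off
        out.append(min(max(int(round(v * scale)), 0), max_val))
    return (out[0], out[1], out[2])


def predict_el_macroblock(bl_mb: List[Pixel],
                          weights: Sequence[Sequence[float]],
                          offsets: Sequence[float],
                          bl_bits: int = 8, el_bits: int = 10) -> List[Pixel]:
    """Apply the joint prediction to every pixel of the co-located base layer MB."""
    return [predict_el_pixel(px, weights, offsets, bl_bits, el_bits)
            for px in bl_mb]
```

With an identity matrix for `weights` the sketch degenerates to a per-channel prediction; any non-zero off-diagonal entry makes an enhancement layer channel depend on more than one base layer channel, which is what distinguishes the joint mode.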

This disclosure describes various implementations. However, features and aspects of the described implementations may also be adapted for other implementations. For example, signaling may be implemented using a variety of techniques including, but not limited to, SPS syntax, other high-level syntax, non-high-level syntax, out-of-band information, and implicit signaling. In addition, various coding techniques may be used. Accordingly, although the implementations described herein may be described in a particular context, such descriptions should not be taken as limiting the features and concepts to such implementations or contexts.

The implementations described herein may be implemented, for example, as a method or process, an apparatus, or a software program. Even if discussed only in the context of a single form of implementation (e.g., discussed only as a method), the implementation or features discussed may also be implemented in other forms (e.g., an apparatus or a program). An apparatus may be implemented, for example, in appropriate hardware, software, and firmware. A method may be implemented, for example, in an apparatus such as a computer or other processing device. In addition, a method may be implemented by instructions executed by a processing device or other apparatus, and such instructions may be stored on a computer-readable medium such as a CD, another computer-readable storage device, or an integrated circuit.

As will be apparent to those skilled in the art, implementations may also produce a signal formatted to carry information that may be, for example, stored or transmitted. The information may include, for example, instructions for performing a method, or data produced by one of the described implementations. For example, a signal may be formatted to carry as data the values of a particular syntax (or even the syntax instructions themselves, if the syntax is being transmitted). Furthermore, many implementations may be implemented in an encoder, a decoder, or both.

Further implementations are contemplated by this disclosure. For example, additional implementations may be created by combining, deleting, modifying, or supplementing various features of the disclosed implementations.

It will be understood that the present invention has been described purely by way of example, and that modifications of detail can be made without departing from the scope of the invention.

Each feature disclosed in the description and (where appropriate) the claims and drawings may be provided independently or in any appropriate combination. Features may, where appropriate, be implemented in hardware, software, or a combination of the two. Connections may, where applicable, be implemented as wireless or wired connections, not necessarily direct or dedicated connections. Reference signs appearing in the claims are by way of illustration only and shall have no limiting effect on the scope of the claims.

Claims

1. A method for encoding video data comprising base layer (BL) data and enhancement layer (EL) data, wherein the base layer and enhancement layer data comprise a plurality of color channels (Y, Cr, Cb, R, G, B) and the base layer and enhancement layer data have different bit depths, the method comprising the following steps:

- encoding (201y, 201cr, 201cb) the base layer data;

- predicting (200) enhancement layer data from the base layer data, separately for said color channels; and

- encoding the enhancement layer data, separately for said color channels (R, G, B), based on said predicted enhancement layer data;

wherein, in at least one mode, each enhancement layer color channel is jointly predicted (200) from all available base layer color channels, and the method further comprises the following additional steps for at least one enhancement layer color channel:

- generating residual data (R_res, B_res, G_res), which is the difference between the original enhancement layer color channel (R_EL, G_EL, B_EL) and the predicted color channel data;

- encoding (202r, 202g, 202b) the original enhancement layer color channel data;

- encoding (203r, 203g, 203b, 204r, 204g, 204b) the residual data;

- selecting (RDO_r, RDO_g, RDO_b), for at least one enhancement layer color channel, the encoded original enhancement layer color channel data, the residual data or the encoded residual data, wherein the selection is independent of the selection made for the other enhancement layer color channels; and

- providing the selected enhancement layer color channel data as enhancement layer output data and providing an indication of the selected coding mode for said enhancement layer color channel.

2. The method according to claim 1, wherein the base layer and the enhancement layer use different color encodings (Y, Cr, Cb, R, G, B), and the inter-layer prediction (200) further comprises color space conversion for intra and inter coding.

3. The method according to claim 2, wherein the color space conversion comprises: conversion from YCbCr color space (Rec. BT.709) to RGB color space (Rec. BT.709).

4. The method of one of the preceding claims, wherein encoding the residual comprises entropy encoding (204r, 204g, 204b).

5. The method according to one of the preceding claims, wherein the additional coding modes for enhancement layer color channel data include a skip mode (405) at macroblock level, and wherein, in skip mode, the enhancement layer data contains no bits for the corresponding macroblock.

6. The method according to one of the preceding claims, wherein, in said selecting step (RDO_r, RDO_g, RDO_b), the selection is based on minimizing data rate and distortion.

7. The method according to one of the preceding claims, wherein the prediction (200) across different color channels is performed at picture level.

8. The method according to one of the preceding claims, wherein prediction across different color channels is performed at macroblock level.

9. The method according to one of the preceding claims, further comprising entropy coding (EC_{Y,BL}, EC_{Cb,BL}, EC_{Cr,BL}, EC_{Y,EL}, EC_{Cb,EL}, EC_{Cr,EL}).

10. A method for decoding encoded video data having BL data and EL data comprising the steps of:

- extracting base layer data and enhancement layer data from the encoded video data, wherein both the base layer data and the enhancement layer data comprise separate data for a plurality of color channels;

- extracting, at least for the first color channel of the enhancement layer, an indication indicating a coding mode;

- decoding the base layer data of the plurality of color channels;

- predicting enhancement layer data based on the decoded base layer data, wherein, in at least one mode, each enhancement layer color channel is jointly predicted from all available base layer color channels;

- decoding the enhancement layer data of the plurality of color channels, wherein a residual is obtained and, at least for said first color channel, decoding is performed according to the indicated coding mode using said indication; and

- reconstructing the enhancement layer data of the plurality of color channels based on the predicted enhancement layer data and said residual.

11. An apparatus for encoding video data comprising a base layer (BL) and an enhancement layer (EL), wherein the base layer and enhancement layer data comprise a plurality of color channels (Y, Cr, Cb, R, G, B) and the base layer and the enhancement layer have different bit depths, the apparatus comprising:

- means (201y, 201cr, 201cb) for encoding the base layer;

- means (200) for predicting the enhancement layer from the base layer, separately for the color channels; and

- means for encoding the enhancement layer, separately for each color channel (R, G, B), based on said predicted enhancement layer;

wherein, in at least one mode, each enhancement layer color channel (R, G, B) is jointly predicted (200) from all available base layer color channels, and the apparatus further comprises, for at least one enhancement layer color channel:

- means for generating residuals (R_res, B_res, G_res) that are the difference between the original enhancement layer color channels (R_EL, G_EL, B_EL) and the predicted color channel images;

- means (202r, 202g, 202b) for encoding the original enhancement layer color channel images;

- means (203r, 203g, 203b, 204r, 204g, 204b) for encoding the residuals;

- means (RDO_r, RDO_g, RDO_b) for selecting, for at least one enhancement layer color channel, the encoded original enhancement layer color channel image, the residual or the encoded residual, wherein said selection is independent of the selection made for the other enhancement layer color channels; and

- means for providing the selected enhancement layer color channel data as enhancement layer output data and for providing an indication of the selected coding mode for said enhancement layer color channel.

12. The apparatus according to the preceding claim, wherein the base layer and the enhancement layer use different color encodings (Y, Cr, Cb, R, G, B), and the means (200) for performing inter-layer prediction further comprises means for performing color space conversion for intra and inter coding.

13. An apparatus for decoding encoded video data having base layer and enhancement layer data, comprising:

- means for extracting base layer data and enhancement layer data from the encoded video data, wherein both the base layer data and the enhancement layer data comprise separate data for a plurality of color channels;

- means for extracting, at least for a first color channel of the enhancement layer, an indication of a coding mode;

- means for decoding the base layer data of the plurality of color channels;

- means for predicting enhancement layer data based on the decoded base layer data, wherein, in at least one mode, each enhancement layer color channel is jointly predicted from all available base layer color channels;

- means for decoding the enhancement layer data of the plurality of color channels, wherein a residual is obtained and, at least for said first color channel, decoding is performed according to the indicated coding mode using said indication; and

- means for reconstructing the enhancement layer data of the plurality of color channels based on the predicted enhancement layer data and said residual.

14. The apparatus according to the preceding claim, wherein the base layer and the enhancement layer use different color encodings (Y, Cr, Cb, R, G, B), and the means for predicting further comprises means for performing color space conversion for intra and inter coding.

15. An encoded video signal comprising base layer (BL) data and enhancement layer (EL) data, wherein the base layer data comprises a plurality of color channels (Y, Cr, Cb) of a first color encoding and the enhancement layer data comprises a plurality of color channels (R, G, B) of a different, second color encoding, wherein the base layer data and the enhancement layer data have different color bit depths, and wherein the signal further comprises a coding mode indication that indicates, at least for a first enhancement layer color channel, whether that channel comprises encoded residual data or encoded macroblock data.