
AU2016203467A1 - Method, apparatus and system for determining a luma value - Google Patents


Info

Publication number
AU2016203467A1
Authority
AU
Australia
Prior art keywords
luma
offset
determining
points
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2016203467A
Inventor
Volodymyr KOLESNIKOV
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Priority to AU2016203467A
Publication of AU2016203467A1
Legal status: Abandoned

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

METHOD, APPARATUS AND SYSTEM FOR DETERMINING A LUMA VALUE

A method of determining a luma value from 4:4:4 RGB video data for encoding chroma downsampled 4:2:0 YCbCr video data into a bitstream. An initial luma point and a target luminance value are determined from received RGB values for a pixel in the video data. A plurality of offset luma points, each of the offset luma points being offset from the initial luma point by a predetermined amount, are determined. Corresponding luminance values are determined for each of the determined offset luma points. An approximation curve is determined from the determined offset luma points and the corresponding luminance values. The luma value is determined from the target luminance value using the determined approximation curve.

(Fig. 1: an encoding device comprising source material, a colour space conversion module, a chroma downsampler, a video encoder and storage, connected via a communication channel 150 to a display device 160 comprising a video decoder 162, a chroma upsampler 164, a colour space conversion module 165 and a panel device.)

Description

METHOD, APPARATUS AND SYSTEM FOR DETERMINING A LUMA VALUE
TECHNICAL FIELD
The present invention relates generally to digital video signal processing and, in particular, to a method, apparatus and system for determining a luma value from 4:4:4 RGB video data. The present invention also relates to a computer program product including a computer readable medium having recorded thereon a computer program for determining a luma value from 4:4:4 RGB video data.
BACKGROUND
Development of standards for conveying high dynamic range (HDR) and wide colour gamut (WCG) video data and development of displays capable of displaying HDR video data is underway. Standards bodies such as the International Organisation for Standardisation / International Electrotechnical Commission Joint Technical Committee 1 / Subcommittee 29 / Working Group 11 (ISO/IEC JTC1/SC29/WG11), also known as the Moving Picture Experts Group (MPEG), the International Telecommunication Union - Radiocommunication Sector (ITU-R), the International Telecommunication Union - Telecommunication Standardization Sector (ITU-T), and the Society of Motion Picture and Television Engineers (SMPTE) are investigating the development of standards for representation and coding of HDR video data. HDR video data covers a wide range of luminance intensities, far beyond that used in traditional standard dynamic range (SDR) services. For example, the Perceptual Quantizer (PQ) Electro-Optical Transfer Function (EOTF), standardised as SMPTE ST.2084, is defined to support a peak luminance of up to 10,000 candela/metre² (nits), whereas traditional television services are defined with a 100 nit peak brightness (although more modern sets increase the peak brightness beyond this). The minimum supported luminance is zero nits, but for the purposes of calculating the dynamic range the lowest non-zero luminance is used (i.e. 4×10⁻⁵ nits for PQ quantised to 10 bits). The physical intensity of a light source is measured in candela/metre² and is also referred to as ‘luminance’ or ‘linear light’. When luminance is encoded using PQ (or another transfer function) the encoded space is referred to as ‘luma’. Luma is intended to be more perceptually uniform (i.e. a given change in the luma value results in the same perceived change in brightness regardless of the starting point). Traditional power functions, such as the ‘gamma’ of SDR television, are only somewhat perceptually uniform.
Transfer functions such as PQ are designed according to models of human visual perception to be more perceptually uniform. In any case, the relationship between luma and luminance is highly non-linear.
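The PQ relationship between luma and luminance can be sketched directly from the ST.2084 equations. The constants below are the exact rational definitions from the standard; the function names are illustrative:

```python
# SMPTE ST 2084 (PQ) constants, as rational expressions from the standard.
M1 = 2610 / 16384          # ~0.1593017578125
M2 = 2523 / 4096 * 128     # ~78.84375
C1 = 3424 / 4096           # ~0.8359375
C2 = 2413 / 4096 * 32      # ~18.8515625
C3 = 2392 / 4096 * 32      # ~18.6875

def pq_eotf(n: float) -> float:
    """Map a normalised luma value n in [0, 1] to luminance in nits."""
    p = n ** (1 / M2)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)

def pq_inverse_eotf(y: float) -> float:
    """Map luminance y in nits (0..10000) to a normalised luma value in [0, 1]."""
    p = (y / 10000.0) ** M1
    return ((C1 + C2 * p) / (1 + C3 * p)) ** M2
```

Evaluating these shows the strong nonlinearity: 100 nits (SDR peak) already maps to a luma value of roughly 0.51, i.e. about half of the code space covers the bottom 1% of the luminance range.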
Video data generally includes three colour components, where each frame comprises three planes of samples and each plane corresponds to one colour component. The relationship between the sampling rates of the planes is known as a ‘chroma format’. When each plane is sampled at the same rate, the video data is said to be in a ‘4:4:4’ chroma format. In the 4:4:4 chroma format, each triplet of collocated samples forms a ‘pixel’, having a colour and luminance resulting from the values of the triplet of collocated samples. When referring to a sample to which a gamma correction or a transfer function has already been applied, the colour components are referred to as ‘chroma’ and the luminance component is referred to as ‘luma’, to reflect the fact that the components’ values are not ‘true’ colour and luminance. The prime symbol (’) is sometimes used after the variable name to indicate a luma value (e.g. Y’). When the second and third of the three planes are sampled at half the rate horizontally and vertically compared to the first plane, the video data is said to be in a ‘4:2:0’ chroma format. As the use of the 4:2:0 chroma format results in fewer samples being processed compared to 4:4:4, the result is lower complexity in the video codec. Each pixel then has one luma sample, and groups of four pixels share a pair of chroma samples. Moreover, in such a case, typically the ‘YCbCr’ colour space is used, with the luma (Y) channel stored in the first plane, where the sampling rate is highest, and the chroma channels (Cb and Cr) stored in the second and third planes respectively, where the lower sampling rate for chroma information results in a lower data rate with little subjective impact for viewers of the decoded video data.
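The sample-count saving of 4:2:0 over 4:4:4 follows directly from the plane subsampling described above; a minimal sketch (the function name is illustrative):

```python
def samples_per_frame(width: int, height: int, chroma_format: str) -> int:
    """Total samples across the luma plane and the two chroma planes."""
    luma = width * height
    if chroma_format == "4:4:4":
        chroma = luma          # each chroma plane sampled at the full rate
    elif chroma_format == "4:2:0":
        chroma = luma // 4     # half rate horizontally and vertically
    else:
        raise ValueError(chroma_format)
    return luma + 2 * chroma

# A 4:2:0 frame carries exactly half the samples of the same frame in 4:4:4.
```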
When displaying the video data, a conversion back to 4:4:4 is required to map the video data onto modern display technology, such as an LCD panel. As such, a pair of chroma samples (i.e. Cb and Cr samples) is combined with four luma (Y) samples. Any residual luminance information present in the Cb and Cr samples is known to interfere with the luminance information present in each Y sample, resulting in shifts in the 4:4:4 output from the 4:2:0 to 4:4:4 conversion process. In earlier ‘standard dynamic range’ (SDR) systems, which use a transfer function that is a power function for encoding of luma and chroma samples (i.e. a ‘gamma function’), the nonlinearity of the transfer function was less than is the case for the Perceptual Quantizer (PQ) Electro-Optical Transfer Function (EOTF).
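The root cause of these luminance shifts is that averaging (as a downsampling filter effectively does) commutes with a linear transfer function but not with a nonlinear one. A toy illustration, using a gamma power function as a stand-in (with PQ the discrepancy is larger):

```python
# Averaging nonlinear-encoded samples, then linearising, does not reproduce
# the average linear light. Gamma 2.4 is an illustrative stand-in here.
GAMMA = 2.4

def encode(linear: float) -> float:   # linear light [0, 1] -> nonlinear code
    return linear ** (1 / GAMMA)

def decode(code: float) -> float:     # nonlinear code -> linear light
    return code ** GAMMA

dark, bright = 0.01, 1.0
true_mean = (dark + bright) / 2                        # average linear light
shifted = decode((encode(dark) + encode(bright)) / 2)  # noticeably darker
```

For these two samples the filtered result is roughly half the true mean luminance, which is exactly the kind of brightness shift the luma adjustment techniques in this description aim to compensate.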
SUMMARY
It is an object of the present invention to substantially overcome, or at least ameliorate, one or more disadvantages of existing arrangements.
According to one aspect of the present disclosure there is provided a method of determining a luma value from 4:4:4 RGB video data for use in encoding chroma downsampled 4:2:0 YCbCr video data into a bitstream, the method comprising: determining an initial luma point and a target luminance value from received RGB values for a pixel in the video data; determining a plurality of offset luma points, each of the offset luma points being offset from the initial luma point by a predetermined amount; determining corresponding luminance values for each of the determined offset luma points; determining an approximation line from the determined offset luma points and the corresponding luminance values; and determining the luma value from the target luminance value using the determined approximation line.
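The claimed sequence of determinations can be sketched as follows. This is a hypothetical illustration, not the patented implementation: `luma_to_luminance` stands for the composition of colour conversion and the EOTF that maps a candidate luma value to its reproduced luminance, and the choice of two symmetric offset points with a straight-line fit is one simple instance of the claim:

```python
def determine_luma(initial_luma: float, target_luminance: float,
                   luma_to_luminance, offset: float = 0.01) -> float:
    # Step 1: offset luma points either side of the initial luma point.
    lo, hi = initial_luma - offset, initial_luma + offset
    # Step 2: corresponding luminance value for each offset luma point.
    y_lo, y_hi = luma_to_luminance(lo), luma_to_luminance(hi)
    # Step 3: approximation line through the (luma, luminance) points.
    slope = (y_hi - y_lo) / (hi - lo)
    # Step 4: solve the line for the luma giving the target luminance.
    return lo + (target_luminance - y_lo) / slope
```

Because the line locally approximates the nonlinear luma-to-luminance mapping, the returned luma lands close to the value whose reproduced luminance matches the target, without an iterative search per pixel.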
According to another aspect of the present disclosure, there is provided a system for determining a luma value from 4:4:4 RGB video data for use in encoding chroma downsampled 4:2:0 YCbCr video data into a bitstream, the system comprising: a memory for storing data and a computer program; a processor coupled to the memory for executing the computer program, the computer program comprising instructions for: determining an initial luma point and a target luminance value from received RGB values for a pixel in the video data; determining a plurality of offset luma points, each of the offset luma points being offset from the initial luma point by a predetermined amount; determining corresponding luminance values for each of the determined offset luma points; determining an approximation curve from the determined offset luma points and the corresponding luminance values; and determining the luma value from the target luminance value using the determined approximation curve.
According to another aspect of the present disclosure, there is provided an encoder for determining a luma value from 4:4:4 RGB video data for use in encoding chroma downsampled 4:2:0 YCbCr video data into a bitstream, the encoder comprising: a module for determining an initial luma point and a target luminance value from received RGB values for a pixel in the video data; a module for determining a plurality of offset luma points, each of the offset luma points being offset from the initial luma point by a predetermined amount; a module for determining corresponding luminance values for each of the determined offset luma points; a module for determining an approximation curve from the determined offset luma points and the corresponding luminance values; and a module for determining the luma value from the target luminance value using the determined approximation curve.
According to another aspect of the present disclosure, there is provided an apparatus for determining a luma value from 4:4:4 RGB video data for use in encoding chroma downsampled 4:2:0 YCbCr video data into a bitstream, the apparatus comprising: means for determining an initial luma point and a target luminance value from received RGB values for a pixel in the video data; means for determining a plurality of offset luma points, each of the offset luma points being offset from the initial luma point by a predetermined amount; means for determining corresponding luminance values for each of the determined offset luma points; means for determining an approximation curve from the determined offset luma points and the corresponding luminance values; and means for determining the luma value from the target luminance value using the determined approximation curve.
According to another aspect of the present disclosure, there is provided a computer readable medium having a computer program stored thereon for determining a luma value from 4:4:4 RGB video data for use in encoding chroma downsampled 4:2:0 YCbCr video data into a bitstream, the program comprising: code for determining an initial luma point and a target luminance value from received RGB values for a pixel in the video data; code for determining a plurality of offset luma points, each of the offset luma points being offset from the initial luma point by a predetermined amount; code for determining corresponding luminance values for each of the determined offset luma points; code for determining an approximation curve from the determined offset luma points and the corresponding luminance values; and code for determining the luma value from the target luminance value using the determined approximation curve.
Other aspects are also disclosed.
BRIEF DESCRIPTION OF THE DRAWINGS
At least one embodiment of the present invention will now be described with reference to the following drawings and appendices, in which:
Fig. 1 is a schematic block diagram showing a video capture and reproduction system that includes a video encoder and a video decoder;
Figs. 2A and 2B form a schematic block diagram of a general purpose computer system upon which one or both of the video encoding and decoding system of Fig. 1 may be practiced;
Fig. 3A shows an example sampling approach for video data using the 4:2:0 chroma format using non-co-sited chroma samples;
Fig. 3B shows an example sampling approach for video data using the 4:2:0 chroma format using co-sited chroma samples;
Fig. 4 is a graph showing the perceptual-quantiser (PQ) electro-optical transfer function (EOTF);
Fig. 5 is a schematic block diagram showing a colour conversion module and a chroma downsampler configured for converting video data from a linear light RGB representation to a PQ-encoded YCbCr representation using ‘non-constant luminance’ and downsampling from the 4:4:4 chroma format to the 4:2:0 chroma format;
Fig. 6 shows a luma sample adjuster for adjusting luma samples to reduce subsampling artefacts;
Fig. 7 is a graph showing an example luminance-to-luma mapping for a luma adjustment module;
Fig. 8 is a schematic flow diagram showing a method of determining an adjusted luma value; and
Fig. 9 is a schematic flow diagram showing a method of determining the adjusted luma value.
DETAILED DESCRIPTION INCLUDING BEST MODE
Where reference is made in any one or more of the accompanying drawings to steps and/or features, which have the same reference numerals, those steps and/or features have for the purposes of this description the same function(s) or operation(s), unless the contrary intention appears.
As discussed above, although the possibility of brightness and colour distortion resulting from chroma upconversion is present in SDR systems, the less nonlinear transfer function reduces the extent of such artefacts compared to the case where the Perceptual Quantizer (PQ) Electro-Optical Transfer Function (EOTF) is used. Methods to alleviate such artefacts operate at the ‘pixel rate’ of the video processing system, and as such, relatively low complexity, or at least fixed complexity, methods are required. In modern video processing systems the pixel rate is very high: for 4K at 60 frames per second, 3840×2160×60 ≈ 498×10⁶ pixels per second need to be processed. As such, for real time processing, a need exists for implementations that are feasible for hardware implementation.
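The cited pixel rate translates into a tight per-pixel time budget, which is why fixed-complexity methods matter:

```python
# Real-time budget implied by the 4K60 pixel rate cited above.
pixels_per_second = 3840 * 2160 * 60        # 497,664,000 (~498 x 10^6)
ns_per_pixel = 1e9 / pixels_per_second      # ~2 ns of processing per pixel
```

At roughly two nanoseconds per pixel, an unbounded iterative search per pixel is impractical, motivating the fixed number of offset-point evaluations used by the described method.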
Fig. 1 is a schematic block diagram showing functional modules of a video encoding and decoding system 100. The system 100 includes an encoding device 110, such as a digital video camera, a display device 160, and a communication channel 150 interconnecting the two. Generally, the encoding device 110 operates at a separate location (and time) to the display device 160. As such, the system 100 generally includes separate devices operating at different times and locations. Additional instances of the display device 160 (also considered part of the video encoding and decoding system 100) are considered to be present for each recipient of the encoded video data, e.g. customers of a video streaming service or viewers of a free to air broadcast service.
The encoding device 110 encodes source material 112. The source material 112 may be obtained from a complementary metal oxide semiconductor (CMOS) imaging sensor of a video camera with a capability to receive a wider range of luminance levels than traditional SDR imaging sensors. Additionally, the source material 112 may also be obtained using other technologies, such as charged coupled device (CCD) technology, or generated from computer graphics software, or some combination of these sources. Also, the source material 112 may simply represent previously captured and stored video data.
The source material 112 includes a sequence of frames 122. Collectively, the frames 122 form uncompressed video data 130. In the context of preparing video bitstreams for distribution, the source material 112 is generally present in the 4:4:4 chroma format and requires downconversion to the 4:2:0 chroma format prior to encoding. For example, if the source material 112 is obtained from an imaging sensor, a ‘debayering’ process is applied that results in 4:4:4 video data. Moreover, the video data is sampled in RGB. The video data 130 includes codewords for the frames 122, such that three planes of codewords are present for each frame. The source material 112 is generally sampled as tristimulus values in the RGB domain, representing linear light levels. Conversion of linear light RGB to a more perceptually uniform space is achieved by the application of a nonlinear transfer function and results in an R’G’B’ representation comprising R’G’B’ values. The transfer function may be an opto-electrical transfer function (OETF), in which case the R’G’B’ values represent physical light levels of the original scene. In arrangements where the transfer function is an opto-electrical transfer function (OETF), the video processing system 100 may be termed a ‘scene-referred’ system. Alternatively, the transfer function may be the inverse of an electro-optical transfer function (EOTF), in which case the R’G’B’ values represent physical light levels to be displayed. In arrangements where the transfer function is the inverse of an electro-optical transfer function (EOTF), the video processing system 100 may be termed a ‘display-referred’ system.
As seen in Fig. 1, a colour space conversion module 114 is used to convert the R’G’B’ representation to a colour space that decorrelates the luma from each of R’, G’ and B’, such as Y’CbCr. The colour space conversion module 114 applies a matrix transform and is thus a linear operation. Application of the colour space conversion on R’G’B’ (i.e. luma), rather than RGB (i.e. luminance), results in some distortions due to the non-linear nature of the applicable transfer function, but is an accepted practice in SDR television and video systems known as ‘non-constant luminance’ (NCL). In the context of HDR, in particular when using a transfer function such as PQ, the distortions are more significant, and thus processing is performed to compensate for the distortions. The Y’CbCr representation is then quantised to a specified bit depth, resulting in discrete ‘codewords’. Codewords in the Y’ channel encode, approximately, the luminance levels present in the source material 112 according to the transfer function as luma values represented using a particular bit depth. As the luminance information is not fully decorrelated from the Cb and Cr channels, some luminance magnitude is still present in the Cb and Cr channels.
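The NCL matrix applied by the colour space conversion module 114 can be sketched as below. The BT.2020 luma weights are used as one concrete choice; the exact coefficients depend on the colour primaries in use, and the function name is illustrative:

```python
# Non-constant-luminance (NCL) Y'CbCr from R'G'B', BT.2020 coefficients.
KR, KG, KB = 0.2627, 0.6780, 0.0593   # BT.2020 luma weights

def rgb_prime_to_ycbcr(r: float, g: float, b: float):
    y = KR * r + KG * g + KB * b       # luma: weighted sum of R'G'B'
    cb = (b - y) / (2 * (1 - KB))      # scaled colour-difference signals
    cr = (r - y) / (2 * (1 - KR))
    return y, cb, cr
```

Because the weighted sum is taken over the nonlinear R’G’B’ values rather than linear RGB, the resulting Y’ only approximates true luminance, which is the source of the residual luminance in Cb and Cr discussed above.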
One method of visualising the luminance magnitude still present in the Cb and Cr channels is to note that, in YCbCr, the range of ‘valid’ codewords for Cb and Cr is dependent on the magnitude of Y’. As Y’ approaches zero, the range of Cb and Cr reduces. As such, the afforded colour fidelity is reduced at low luminances. The same phenomenon is observed as Y’ increases to the maximum supported value. The source material 112, expressed as RGB values, has three distinct components whose weighted sum forms the luminance. Then, each display primary of the display device 160 (i.e. R, G and B) is necessarily less bright than when all display primaries of the display device 160 are producing maximum allowed output. The range of distinct codewords is implied by the bit depth in use (and thus, implicitly, the quantisation of the codewords is dependent on the bit depth in use). Generally, the video processing system 100 operates at a particular bit depth, such as ten (10) bits. Operation at this particular bit depth implies the availability of 1024 discrete codewords. Further restriction upon the range of available samples may also be present. For example, if the uncompressed video data 130 is to be transported within the encoding device 110 using the ‘serial digital interface’ (SDI) protocol, the codeword range is restricted to 4-1019 inclusive, giving 1016 discrete codeword values. Alternatively, TV broadcast systems may limit the codeword range to 64-940 for 10-bit video data.
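The 64-940 broadcast restriction follows the conventional ‘narrow range’ scaling, which can be sketched as follows (the function name is illustrative):

```python
# 'Narrow range' luma quantisation: (219 * E' + 16), scaled up from the
# 8-bit definition by shifting for the target bit depth, then clipped.
def quantise_luma(y_prime: float, bit_depth: int = 10) -> int:
    scale = 1 << (bit_depth - 8)
    code = round((219 * y_prime + 16) * scale)
    return max(0, min((1 << bit_depth) - 1, code))
```

For 10-bit video this maps normalised luma 0.0 to codeword 64 and 1.0 to codeword 940, matching the range quoted above.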
As also seen in Fig. 1, a chroma downsampler 116 converts the source material 112 from the 4:4:4 chroma format to produce uncompressed video data 130 in the 4:2:0 chroma format. The chroma downsampler 116 operates in real-time, performing a function referred to as ‘luma sample adjustment’, described further below with reference to Figs. 6 to 9.
The video encoder 118 encodes each frame as a sequence of square regions, known as ‘coding tree units’, producing an encoded bitstream 132. The video encoder 118 conforms to a video coding standard such as high efficiency video coding (HEVC), although other standards such as H.264/AVC, VC-1 or MPEG-2 may also be used. The encoded bitstream 132 can be stored (e.g. in a non-transitory storage device or similar arrangement 140), prior to transmission over communication channel 150.
The encoded bitstream 132 is conveyed (e.g. transmitted or passed) to the display device 160. Examples of the display device 160 include an LCD television, a monitor or a projector. The display device 160 includes a video decoder 162 that decodes the encoded bitstream 132 to produce decoded codewords 170. The decoded codewords 170 correspond approximately to the codewords of the uncompressed video data 130. The decoded codewords 170 may not be exactly equal to the codewords of the uncompressed video data 130 due to lossy compression techniques applied in the video encoder 118. The decoded codewords 170 are passed to a chroma upsampler module 164 to produce decoded 4:4:4 video data 172. The chroma upsampler module 164 applies a particular set of filters to perform the upsampling from the 4:2:0 chroma format to the 4:4:4 chroma format, as described further with reference to Figs. 3A and 3B. The decoded 4:4:4 video data 172 is then converted from YCbCr to RGB in a colour space conversion module 165, to produce RGB video data 174. The RGB video data 174 is passed as input to the panel device 166 for visual reproduction of the video data. For example, the reproduction may modulate the amount of backlight illumination passing through an LCD panel. The panel device 166 is generally an LCD panel with an LED backlight. The LED backlight may include an array of LEDs to enable a degree of spatially localised control of the maximum achievable luminance. The panel device 166 may alternatively use ‘organic LEDs’ (OLEDs).
The relationship between a given codeword of the decoded codewords 170 and the corresponding light output emitted from the corresponding pixel in the panel device 166 is nominally the inverse of the transfer function. For a display-referred system, the inverse of the transfer function is the EOTF. For a scene-referred system, the inverse of the transfer function is the inverse OETF. For systems using ‘relative luminance’, the light output is not controlled only by the codeword and the inverse of the transfer function. The light output may be further modified by user control of the contrast or brightness settings of the display device 160.
In one arrangement of the video processing system 100, the EOTF in use is the PQ-EOTF (i.e., SMPTE ST.2084) as will be described further below with reference to Fig. 4. Another example of a transfer function configured for carrying HDR video data is the Hybrid Log Gamma (HLG) Opto-Electrical Transfer Function (OETF), standardised as ARIB STD B-67. The HLG-OETF is nominally defined to support a peak luminance of 1,200 nits. However, as the HLG-OETF is a relative luminance transfer function, a viewer may adjust the contrast and brightness settings of a display to display brighter luminances than the nominal peak luminance.
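The HLG OETF mentioned above has a closed form defined in ARIB STD-B67: a square-root segment for dark scene light and a logarithmic segment above the knee. A sketch using the standard's constants (the function name is illustrative):

```python
import math

# HLG OETF per ARIB STD-B67. E is normalised scene light in [0, 1].
A = 0.17883277
B = 1 - 4 * A                  # 0.28466892
C = 0.5 - A * math.log(4 * A)  # ~0.55991073

def hlg_oetf(e: float) -> float:
    if e <= 1 / 12:
        return math.sqrt(3 * e)            # square-root (SDR-compatible) segment
    return A * math.log(12 * e - B) + C    # logarithmic segment
```

The constants are chosen so the two segments join continuously at E = 1/12 (both give 0.5), and the curve reaches approximately 1.0 at E = 1.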
Notwithstanding the example devices mentioned above, each of the encoding device 110 and display device 160 may be configured within a general purpose computing system, typically through a combination of hardware and software components. Fig. 2A illustrates such a computer system 200, which includes: a computer module 201; input devices such as a keyboard 202, a mouse pointer device 203, a scanner 226, a digital video camera 227, which may be configured as the HDR imaging sensor 112, and a microphone 280, which may be integrated with the camera; and output devices including a printer 215, a display device 214, which may be configured as the display device 160, and loudspeakers 217. An external Modulator-Demodulator (Modem) transceiver device 216 may be used by the computer module 201 for communicating to and from a communications network 220 via a connection 221. The communications network 220, which may represent the communication channel 150, may be a wide-area network (WAN), such as the Internet, a cellular telecommunications network, or a private WAN. Where the connection 221 is a telephone line, the modem 216 may be a traditional “dialup” modem. Alternatively, where the connection 221 is a high capacity (e.g., cable) connection, the modem 216 may be a broadband modem. A wireless modem may also be used for wireless connection to the communications network 220. The transceiver device 216 may additionally be provided in the encoding device 110 and the display device 160 and the communication channel 150 may be embodied in the connection 221.
Further, whilst the communication channel 150 of Fig. 1 may typically be implemented by a wired or wireless communications network, the bitstream 132 may alternatively be conveyed between the encoding device 110 and the display device 160 by way of being recorded to a non-transitory memory storage medium, such as a CD or DVD. In this fashion, the network 150 is merely representative of one path via which the bitstream 132 is conveyed between the encoding device 110 and the display device 160, with the storage media being another such path.
The computer module 201 typically includes at least one processor unit 205, and a memory unit 206. For example, the memory unit 206 may have semiconductor random access memory (RAM) and semiconductor read only memory (ROM). The computer module 201 also includes a number of input/output (I/O) interfaces including: an audio-video interface 207 that couples to the video display 214, loudspeakers 217 and microphone 280; an I/O interface 213 that couples to the keyboard 202, mouse 203, scanner 226, camera 227 and optionally a joystick or other human interface device (not illustrated); and an interface 208 for the external modem 216 and printer 215. The signal from the audio-video interface 207 to the computer monitor 214 is generally the output of a computer graphics card. In some implementations, the modem 216 may be incorporated within the computer module 201, for example within the interface 208. The computer module 201 also has a local network interface 211, which permits coupling of the computer system 200 via a connection 223 to a local-area communications network 222, known as a Local Area Network (LAN). As illustrated in Fig. 2A, the local communications network 222 may also couple to the wide network 220 via a connection 224, which would typically include a so-called “firewall” device or device of similar functionality. The local network interface 211 may comprise an Ethernet™ circuit card, a Bluetooth™ wireless arrangement or an IEEE 802.11 wireless arrangement; however, numerous other types of interfaces may be practiced for the interface 211. The local network interface 211 may also provide the functionality of the communication channel 150, which may also be embodied in the local communications network 222.
The I/O interfaces 208 and 213 may afford either or both of serial and parallel connectivity, the former typically being implemented according to the Universal Serial Bus (USB) standards and having corresponding USB connectors (not illustrated). Storage devices 209 are provided and typically include a hard disk drive (HDD) 210. Other storage devices such as a floppy disk drive and a magnetic tape drive (not illustrated) may also be used. An optical disk drive 212 is typically provided to act as a non-volatile source of data. Portable memory devices, such as optical disks (e.g. CD-ROM, DVD, Blu-ray Disc™), USB-RAM, portable external hard drives, and floppy disks, for example, may be used as appropriate sources of data to the computer system 200. Typically, any of the HDD 210, optical drive 212, networks 220 and 222 may also be configured to operate as the HDR imaging sensor 112, or as a destination for decoded video data to be stored for reproduction via the display 214. The encoding device 110 and the display device 160 of the system 100 may be embodied in the computer system 200.
The components 205 to 213 of the computer module 201 typically communicate via an interconnected bus 204 and in a manner that results in a conventional mode of operation of the computer system 200 known to those in the relevant art. For example, the processor 205 is coupled to the system bus 204 using a connection 218. Likewise, the memory 206 and optical disk drive 212 are coupled to the system bus 204 by connections 219. Examples of computers on which the described arrangements can be practised include IBM-PCs and compatibles, Sun SPARCstations, Apple Mac™ or like computer systems.
Where appropriate or desired, the encoding device 110 and the display device 160, as well as methods described below, may be implemented using the computer system 200, wherein the video encoder 118, the video decoder 162 and methods to be described may be implemented as one or more software application programs 233 executable within the computer system 200. In particular, the encoding device 110, the display device 160 and the steps of the described methods are effected by instructions 231 (see Fig. 2B) in the software 233 that are carried out within the computer system 200. The software instructions 231 may be formed as one or more code modules, each for performing one or more particular tasks. The software may also be divided into two separate parts, in which a first part and the corresponding code modules perform the described methods and a second part and the corresponding code modules manage a user interface between the first part and the user.
The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer system 200 from the computer readable medium, and then executed by the computer system 200. A computer readable medium having such software or computer program recorded on the computer readable medium is a computer program product. The use of the computer program product in the computer system 200 preferably effects an advantageous apparatus for implementing the encoding device 110, the display device 160 and the described methods.
The software 233 is typically stored in the HDD 210 or the memory 206. The software is loaded into the computer system 200 from a computer readable medium, and executed by the computer system 200. Thus, for example, the software 233 may be stored on an optically readable disk storage medium (e.g., CD-ROM) 225 that is read by the optical disk drive 212.
In some instances, the application programs 233 may be supplied to the user encoded on one or more CD-ROMs 225 and read via the corresponding drive 212, or alternatively may be read by the user from the networks 220 or 222. Still further, the software can also be loaded into the computer system 200 from other computer readable media. Computer readable storage media refers to any non-transitory tangible storage medium that provides recorded instructions and/or data to the computer system 200 for execution and/or processing. Examples of such storage media include floppy disks, magnetic tape, CD-ROM, DVD, Blu-ray Disc™, a hard disk drive, a ROM or integrated circuit, USB memory, a magneto-optical disk, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the computer module 201. Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of the software, application programs, instructions and/or video data or encoded video data to the computer module 201 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like.
The second part of the application programs 233 and the corresponding code modules mentioned above may be executed to implement one or more graphical user interfaces (GUIs) to be rendered or otherwise represented upon the display 214. Through manipulation of typically the keyboard 202 and the mouse 203, a user of the computer system 200 and the application may manipulate the interface in a functionally adaptable manner to provide controlling commands and/or input to the applications associated with the GUI(s). Other forms of functionally adaptable user interfaces may also be implemented, such as an audio interface utilizing speech prompts output via the loudspeakers 217 and user voice commands input via the microphone 280.
Fig. 2B is a detailed schematic block diagram of the processor 205 and a “memory” 234. The memory 234 represents a logical aggregation of all the memory modules (including the HDD 209 and semiconductor memory 206) that can be accessed by the computer module 201 in Fig. 2A.
When the computer module 201 is initially powered up, a power-on self-test (POST) program 250 executes. The POST program 250 is typically stored in a ROM 249 of the semiconductor memory 206 of Fig. 2A. A hardware device such as the ROM 249 storing software is sometimes referred to as firmware. The POST program 250 examines hardware within the computer module 201 to ensure proper functioning and typically checks the processor 205, the memory 234 (209, 206), and a basic input-output systems software (BIOS) module 251, also typically stored in the ROM 249, for correct operation. Once the POST program 250 has run successfully, the BIOS 251 activates the hard disk drive 210 of Fig. 2A. Activation of the hard disk drive 210 causes a bootstrap loader program 252 that is resident on the hard disk drive 210 to execute via the processor 205. This loads an operating system 253 into the RAM memory 206, upon which the operating system 253 commences operation. The operating system 253 is a system level application, executable by the processor 205, to fulfil various high level functions, including processor management, memory management, device management, storage management, software application interface, and generic user interface.
The operating system 253 manages the memory 234 (209, 206) to ensure that each process or application running on the computer module 201 has sufficient memory in which to execute without colliding with memory allocated to another process.
Furthermore, the different types of memory available in the computer system 200 of Fig. 2A must be used properly so that each process can run effectively. Accordingly, the aggregated memory 234 is not intended to illustrate how particular segments of memory are allocated (unless otherwise stated), but rather to provide a general view of the memory accessible by the computer system 200 and how such is used.
As shown in Fig. 2B, the processor 205 includes a number of functional modules including a control unit 239, an arithmetic logic unit (ALU) 240, and a local or internal memory 248, sometimes called a cache memory. The cache memory 248 typically includes a number of storage registers 244-246 in a register section. One or more internal busses 241 functionally interconnect these functional modules. The processor 205 typically also has one or more interfaces 242 for communicating with external devices via the system bus 204, using a connection 218. The memory 234 is coupled to the bus 204 using a connection 219.
The application program 233 includes a sequence of instructions 231 that may include conditional branch and loop instructions. The program 233 may also include data 232 which is used in execution of the program 233. The instructions 231 and the data 232 are stored in memory locations 228, 229, 230 and 235, 236, 237, respectively. Depending upon the relative size of the instructions 231 and the memory locations 228-230, a particular instruction may be stored in a single memory location as depicted by the instruction shown in the memory location 230. Alternately, an instruction may be segmented into a number of parts each of which is stored in a separate memory location, as depicted by the instruction segments shown in the memory locations 228 and 229.
In general, the processor 205 is given a set of instructions which are executed therein. The processor 205 waits for a subsequent input, to which the processor 205 reacts by executing another set of instructions. Each input may be provided from one or more of a number of sources, including data generated by one or more of the input devices 202, 203, data received from an external source across one of the networks 220, 222, data retrieved from one of the storage devices 206, 209 or data retrieved from a storage medium 225 inserted into the corresponding reader 212, all depicted in Fig. 2A. The execution of a set of the instructions may in some cases result in output of data. Execution may also involve storing data or variables to the memory 234.
The encoding device 110, the display device 160 and the described methods may use input variables 254, which are stored in the memory 234 in corresponding memory locations 255, 256, 257. The encoding device 110, the display device 160 and the described methods produce output variables 261, which are stored in the memory 234 in corresponding memory locations 262, 263, 264. Intermediate variables 258 may be stored in memory locations 259, 260, 266 and 267.
Referring to the processor 205 of Fig. 2B, the registers 244, 245, 246, the arithmetic logic unit (ALU) 240, and the control unit 239 work together to perform sequences of micro-operations needed to perform “fetch, decode, and execute” cycles for every instruction in the instruction set making up the program 233. Each fetch, decode, and execute cycle comprises: (a) a fetch operation, which fetches or reads an instruction 231 from a memory location 228, 229, 230; (b) a decode operation in which the control unit 239 determines which instruction has been fetched; and (c) an execute operation in which the control unit 239 and/or the ALU 240 execute the instruction.
Thereafter, a further fetch, decode, and execute cycle for the next instruction may be executed. Similarly, a store cycle may be performed by which the control unit 239 stores or writes a value to a memory location 232.
Fig. 3A shows an example sampling approach for video data using the 4:2:0 chroma format with non-co-sited chroma samples. Fig. 3A shows a frame 300 of uncompressed video data 130 which includes three colour planes. The first colour plane has luma samples, such as a luma sample 304, which are each shown with an ‘X’ in Fig. 3A. Each ‘X’ corresponds to one luma sample. Note that the frame 300 generally encodes video data in the YCbCr colour space and, as such, the Y, Cb and Cr values are not actually the result of a sampling process. Instead, the Y, Cb and Cr values can also be referred to as ‘codewords’, being the result of applying a colour transform, such as described with reference to Fig. 5. A chroma sample pair 306 includes two chroma samples, a Cb and a Cr sample. The Cb and Cr samples are only present at half the frequency, horizontally and vertically, compared to the luma samples. Moreover, in the example of Fig. 3A, the chroma samples are not collocated with particular luma samples. Other configurations of the 4:2:0 chroma format are also possible where the chroma samples are located differently, relative to the luma samples. Configurations where the chroma samples overlap with luma samples (i.e. every second luma sample horizontally and vertically) are referred to as ‘co-sited’ sampling.
As seen in Fig. 3A, luma samples 302 (contained in the illustrated box) are the four luma samples associated with the chroma samples 306. Such a pattern repeats throughout the frame 300. The chroma samples 306 are ‘shared’ in the sense that when upconverting to the 4:4:4 chroma format, the same chroma samples are used with each of the four luma samples (e.g. when using nearest-neighbour interpolation to produce 4:4:4 chroma samples). Different upsampling filters are also possible, resulting in a different relationship between the 4:2:0 chroma samples and the 4:4:4 chroma samples. One example is bilinear filtering, in which case additional chroma samples are synthesised from the 4:2:0 chroma samples to produce chroma samples present at the same sampling rate as the luma samples. The interpretation of the magnitudes of a given pair of chroma samples is dependent upon the magnitude of the associated luma sample. Having multiple luma samples associated with a pair of chroma samples presents problems for reconstructing, at the decoder, 4:4:4 video data that corresponds to the 4:4:4 video data prior to the chroma downsampling performed in the encoding device 110.
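The ‘shared’ relationship described above can be sketched for the nearest-neighbour case. The function name and the use of NumPy below are choices of this illustration, not part of the described system:

```python
import numpy as np

def upsample_420_nearest(cb, cr):
    """Nearest-neighbour 4:2:0 -> 4:4:4 chroma upsampling: each chroma
    sample is shared by the 2x2 block of associated luma positions."""
    # np.repeat duplicates every chroma row, then every chroma column.
    cb444 = np.repeat(np.repeat(cb, 2, axis=0), 2, axis=1)
    cr444 = np.repeat(np.repeat(cr, 2, axis=0), 2, axis=1)
    return cb444, cr444

# A 2x2 plane of Cb samples expands to a 4x4 plane; every value appears
# in a 2x2 block, mirroring the sharing between four luma samples.
cb = np.array([[1, 2], [3, 4]])
cb444, _ = upsample_420_nearest(cb, cb)
```

Replacing `np.repeat` with an interpolating filter (e.g. bilinear) would give the alternative upsampling relationships mentioned above.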
Fig. 3B shows an example sampling approach for video data using the 4:2:0 chroma format with co-sited chroma samples. In the example of Fig. 3B, a frame 320 includes luma samples (‘X’) and pairs of chroma samples (‘O’) using the 4:2:0 chroma format. In contrast to the frame 300, the chroma samples of Fig. 3B are ‘co-sited’, meaning that each chroma value is sampled at a location that is collocated with a luma sample (every second luma sample horizontally and vertically). In Fig. 3A, the chroma samples are not co-sited, and do not overlap any luma samples. The distinction between co-sited and non-co-sited chroma samples can be understood by visualising the underlying video as a continuous function of luminance and chrominance, with a sampling process applied to produce the discrete samples of video data. Then, the sampling process requires consideration of the location of each sample present in each colour component. In general, an encoding device (e.g., 110) using an imaging sensor receives data already sampled by the imaging sensor according to the physical configuration of cells in the imaging sensor. Then, a ‘debayering’ process converts samples from this format generally into the 4:4:4 chroma format, so that the samples of 4:4:4 video data are effectively synthesised out of the samples from the imaging sensor.
Other locations for the chroma samples are also possible. The encoded bitstream 132 includes a packet known as the ‘video usability information’ (VUI). The VUI includes information instructing the display device 160 how to interpret the decoded samples from the video decoder 162. The VUI may include a syntax element ‘chroma_sample_loc_type_top_field’ that indicates the chroma sampling location. Although generally used when the video data is in ‘interlaced’ format, the above syntax element may also be used when the video data is in progressive format. When chroma_sample_loc_type_top_field is not present, the chroma samples are assumed to be non-co-sited (as shown in Fig. 3A). If the video data is in interlaced format, then the chroma sample location for alternate ‘bottom’ fields is specified by the chroma_sample_loc_type_bottom_field syntax element.
The encoded bitstream 132 may also contain a packet known as a Chroma resampling filter hint Supplementary Enhancement Information (SEI) message. The Chroma resampling filter hint SEI message enables signalling of the chroma filters to resample video data from one chroma format to another chroma format. Predefined filters can also be signalled, such as those defined in Rec. ITU-T T.800 | ISO/IEC 15444-1.
Fig. 4 is a graph 400 showing the perceptual quantiser (PQ) electro-optical transfer function (EOTF) 442, with 10-bit quantisation. The PQ-EOTF 442 is configured to closely fit a curve resulting from iterative addition of multiples of just noticeable differences (JNDs) derived from a model of human visual perception known as the “Barten model”. The PQ-EOTF 442 differs from the Barten model in that the lowest codeword corresponds to a luminance of 0 nits (a point not depicted in Fig. 4). The graph 400 shows the codeword values along the X axis, with quantisation to ten (10) bits, and absolute luminance on the Y axis over the range supported by the PQ-EOTF 442. The range of available codewords intended for use is restricted to sixty-four (64) to nine hundred and forty (940), known as ‘video range’. Such a range of available codewords accords with common practice for video systems operating at a bit depth of ten (10) bits. However, other transfer functions may permit excursions outside the sixty-four (64) to nine hundred and forty (940) range in some cases. The codeword range from sixty-four (64) to nine hundred and forty (940) corresponds to luminances from zero (0) nits (not shown on the graph of Fig. 4) to 10^4 nits. Adjacent codewords correspond to steps above the JND threshold for a fully-adapted human eye.
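The PQ-EOTF plotted in Fig. 4 is standardised as SMPTE ST 2084. A sketch of the reference curve follows; mapping a 10-bit video-range codeword to the normalised signal via (k − 64) / 876 is an assumption of this illustration:

```python
def pq_eotf(e):
    """SMPTE ST 2084 (PQ) EOTF: normalised signal in [0, 1] -> luminance in nits."""
    m1, m2 = 2610 / 16384, 2523 / 4096 * 128
    c1, c2, c3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32
    p = e ** (1 / m2)
    return 10000.0 * (max(p - c1, 0.0) / (c2 - c3 * p)) ** (1 / m1)

def codeword_to_nits(k):
    """10-bit video-range codeword (64..940) -> luminance.
    The narrow-range normalisation used here is an assumption."""
    return pq_eotf((k - 64) / (940 - 64))

# Codeword 64 maps to 0 nits; codeword 940 maps to 10^4 nits,
# matching the luminance range described for Fig. 4.
```

The steepness of the curve at high codewords illustrates why small luma errors can translate into large luminance errors in bright regions.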
Fig. 5 is a schematic block diagram showing the colour space conversion module 114 and the chroma downsampler 116. In Fig. 5 the colour space conversion module 114 converts video data from a linear light RGB representation to a PQ-EOTF encoded YCbCr representation using ‘non-constant luminance’. Fig. 5 also shows the chroma downsampler 116 downsampling the converted video data from the 4:4:4 sample format to the 4:2:0 sample format. In the example of Fig. 5, the colour space conversion module 114 accepts an RGB linear light input sample 514 of the source material 112. Each component of the sample 514 is mapped, using transfer function modules 502, 504 and 506, to an R’, G’ and B’ sample. The transfer function modules 502, 504 and 506 may implement an OETF such as ITU-R BT.709 or an inverse EOTF such as an inverse PQ-EOTF. An RGB to YCbCr module 508 converts the R’G’B’ sample to a Y’CbCr sample comprising an unadjusted luma value 516 and a pair of chroma values 518 by performing the following Equation (1):
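Which coefficients populate Equation (1) depends on the colour space in use. A minimal sketch, assuming the ITU-R BT.2020 non-constant-luminance coefficients (an assumption of this illustration, not taken from Equation (1) itself):

```python
def rgb_to_ycbcr(rp, gp, bp):
    """R'G'B' -> Y'CbCr using non-constant luminance.
    The BT.2020 luma weights below are an assumption of this sketch."""
    yp = 0.2627 * rp + 0.6780 * gp + 0.0593 * bp   # unadjusted luma Y'
    cb = (bp - yp) / 1.8814                        # Cb = (B' - Y') / (2 * (1 - Wb))
    cr = (rp - yp) / 1.4746                        # Cr = (R' - Y') / (2 * (1 - Wr))
    return yp, cb, cr

# Achromatic input: Y' equals the common component value, Cb = Cr = 0.
yp, cb, cr = rgb_to_ycbcr(0.5, 0.5, 0.5)
```

The residual luminance information carried by Cb and Cr in this non-constant-luminance form is the root cause of the subsampling artefacts discussed below.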
A downsample Y’CbCr module 510 of the chroma downsampler 116 performs subsampling of the chroma values 518 resulting in subsampled chroma values 520.
Chroma subsampling performed by the downsample Y’CbCr module 510 introduces luminance shifts that become perceptually noticeable after upsampling and colour space conversion performed by the chroma upsampler module 164 and the colour space conversion module 165. A luma adjustment module 512 performs luma adjustment to compensate for the luminance shifts introduced by the subsampling performed by the downsample Y’CbCr module 510. The luma adjustment module 512 receives a target Ylinear luminance value 526 provided by an RGB to Y conversion module 524, the unadjusted luma value 516, and subsampled chroma values 520. The luma adjustment module 512 produces an adjusted luma value 522. The adjusted luma value 522 and the subsampled chroma samples 520 comprise samples 130 to be encoded by the video encoder module 118.
One aspect of the chroma subsampling is that, by performing the filtering in the perceptual domain (i.e. R’G’B’) rather than the linear domain (i.e. RGB), the highly nonlinear nature of the transfer function results in shifts in intensity from the filtering operation. In the case of R’G’B’ being subsampled (e.g. G’ is assigned to the primary colour component and R’ and B’ are subsampled), multiple samples in B’ and R’, at the higher sampling rate of the 4:4:4 chroma format, are filtered to produce samples at the lower sampling rate of the 4:2:0 chroma format. Then, interference occurs between neighbouring pixels due to brightness information present in the B’ and R’ samples at the higher sampling rate being combined into B’ and R’ samples at the lower sampling rate. From Equation (1), the severity of the interference can be seen from the relative contribution of R’ and B’ to Y’.
Even though Y’CbCr is considered to ‘decorrelate’ luma (Y’) from chroma (Cb and Cr), the decorrelation is not complete and some luminance information is still present in the Cb and Cr values. For example, when applying Equation (1) to produce Y’CbCr values, the volume of valid R’G’B’ values results in a different volume of valid Y’CbCr values, such that the range of valid Cb and Cr values converges at the minimum and maximum valid Y’ values. At the middle Y’ value, the range of permissible Cb and Cr values reaches a maximum. Thus, Y’CbCr has not fully decorrelated colour information from the luminance information.
When chroma subsampling is applied independently to the Cb and Cr samples, interference in terms of luminance occurs across neighbouring pixels, which may have quite different brightness. This interference can even result in ‘out of gamut’ colours, i.e. in 4:2:0 Y’CbCr values that, when converted back to 4:4:4 and then to R’G’B’, have values outside the interval of [0, 1].
The RGB to YCbCr conversion may be performed by the colour space conversion module 114 with linear light RGB as input rather than R’G’B’. However, in many systems it is impractical to do so, as either the linear light data is not available, or dealing with linear light data, which requires a floating point representation, is not feasible for reasons of complexity and/or real time operation. The application of the RGB to YCbCr colour conversion using R’G’B’ instead of RGB video data is known as a ‘non-constant luminance’ method. Also, the colour space conversion module 114 operates in the 4:4:4 chroma format, with downconversion to 4:2:0 as a subsequent step (prior to encoding).
Fig. 6 shows a luma sample adjuster 600 performing iterative adjustment of luma samples to reduce subsampling artefacts. In the example of Fig. 6, the luma sample adjuster 600 performs an iterative method to produce each Y sample of the frame 300. RGB samples of the source material 112 are provided to an RGB to XYZ colour transform module 612 in linear light format. The colour space ‘XYZ’ refers to the CIE 1931 XYZ colour space. If only R’G’B’ values are available, the transfer function is ‘undone’ to restore linear light values. In the XYZ domain, only the linear luminance Ylinear is used. The value Ylinear corresponds to the ‘true’ luminance of the source material 112 and is the luminance that the display device 160 will desirably output for the considered pixel. Then, an initial Y value and Cb and Cr values are calculated from the RGB of the source material, and the Cb and Cr values are provided to a YCbCr to RGB conversion module 602. Initially, an estimate Y value is also provided to the YCbCr to RGB conversion module 602. The estimate Y may be the initial Y value. The YCbCr to RGB conversion module 602 converts the Y value and the Cb and Cr values back to R’G’B’, with the resulting R’G’B’ values then converted to linear light R, G, B using inverse transfer function modules 604, 606 and 608. An RGB to XYZ conversion module 610 then converts the linear light RGB values to linear light XYZ values, of which only the value Ycandidate is further considered. The modules 602-610 have determined the actual luminance in linear light that would be emitted from the display device 160 if the estimate Y value were used. The determined actual luminance is expected to differ from the intended light level Ylinear due to the Cb and Cr samples being associated with multiple Y samples (e.g. shared between sets of four Y samples when a ‘nearest neighbour’ approach is used). As seen in Fig. 6, a compare module 614 compares Ylinear with Ycandidate and calculates a new estimate Y value, e.g. using a bisection search. The new estimate Y value is processed again by the modules 602-610 to produce a new Ycandidate.
The steps described above with reference to Fig. 6 are applied iteratively until the value Ycandidate is sufficiently close to Ylinear that no further iterations are deemed necessary. At that point, the value Ycandidate can be converted to Y’ by application of the transfer function (e.g. PQ or BT.709). The resulting value Y’ can then be passed to a video encoder, along with the associated samples Cb and Cr. The steps described with reference to Fig. 6 operate iteratively for each luma sample in the frame 300, make use of extensive colour space conversions and transfer function applications, and operate upon linear (i.e. floating point) data. As such, although practical for non-realtime systems (e.g. offline simulations), the approach of Fig. 6 is overly complex for a real time implementation. Real time operation implies operating at the pixel rate of a video processing system. The presence of an iterative step also presents implementation difficulties, as pixel-rate operations are generally performed in hardware and iterative methods are excessively costly to implement because the worst case must be provisioned for (i.e. the maximum number of iterations occurring for each pixel).
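The iterative loop of Fig. 6 can be sketched as a bisection search over the estimate Y value. The `to_luminance` callback below stands in for the chain of modules 602-610, and its assumed monotonicity in the luma argument, together with the toy power-law mapping, are simplifications of this illustration:

```python
def adjust_luma_bisection(y_target, cb, cr, to_luminance, lo=0.0, hi=1.0, iters=10):
    """Bisection search for the estimate Y whose reconstructed linear
    luminance best matches the target Ylinear. to_luminance(y, cb, cr)
    models the Y'CbCr -> R'G'B' -> linear RGB -> XYZ chain and is
    assumed monotonically increasing in y."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if to_luminance(mid, cb, cr) < y_target:
            lo = mid    # candidate too dark: search the upper half
        else:
            hi = mid    # candidate too bright: search the lower half
    return (lo + hi) / 2.0

# Toy stand-in for the conversion chain: a simple power law.
chain = lambda y, cb, cr: y ** 2.4
y = adjust_luma_bisection(0.5, 0.0, 0.0, chain)  # converges near 0.5 ** (1 / 2.4)
```

The fixed iteration count makes explicit why the worst case must be provisioned for: every pixel pays for `iters` full trips through the conversion chain.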
Although the luma sample adjuster 600 is shown operating on linear light RGB input, the luma sample adjuster 600 may also operate on Y’CbCr (e.g. resulting from the colour space conversion module 114). In such a case, the non-constant luminance representation of Y’CbCr has introduced some distortions that cannot be undone, hence deriving a Ylinear that exactly matches the Ylinear from application of the RGB to XYZ colour space transform on the source material 112 is not possible. However, a close match is still achieved. Then, a modified luma sample adjuster is possible that operates as a function of three inputs (Y, Cb and Cr) and produces a revised Y value.
Although the result of the luma sample adjuster 600 can be encapsulated into a look-up table for all possible input values, the size of such a look-up table would be prohibitive for real-time implementation.
The adjusted luma sample 522 may be derived from the target luminance value 526 and the subsampled chroma values 520. The luma adjustment process performed by the luma adjustment module 512 may be viewed as a function of three variables, Ylinear, Cb and Cr, producing an adjusted luma value Y’. Taking into account that the subsampled chroma values 520 are known, the luma sample adjustment process performed by the luma adjustment module 512 can be viewed as a one argument luminance-to-luma mapping, that maps the Ylinear luminance value 526 to the adjusted luma value 522. In practice, a precise closed form representation of such a luminance-to-luma mapping is either unavailable or is not feasible for a hardware implementation due to excessive complexity.
Fig. 7 shows a graph 700 of luminance versus luma value, and will be used to describe operation of the luma adjustment module 512. Axis 701 is the luminance axis and axis 714 is the luma axis. A curve 705 is an example luminance-to-luma mapping. An offset point 710 on the luma axis 714 is derived from the unadjusted luma value 516 by adding a negative offset to the unadjusted luma value 516. The offset point 710 may also be referred to as an “offset luma point”. A point 704 on the luminance axis 701 is derived from the offset luma point 710. The point 704 is derived by applying a Y’CbCr to RGB and then RGB to XYZ colour space conversion to a sample formed by the luma value of the point 710 and chroma values 520. The value of a Y component resulting from the Y’CbCr to RGB and RGB to XYZ colour space conversion performed on the sample is used as the value of the luminance point 704. A point 707 has a luma axis 714 coordinate equal to the point 710 and a luminance axis 701 coordinate equal to the point 704. The point 707 serves as a mapping between the points 704 and 710 and therefore belongs to the luminance-to-luma mapping 705.
An offset point 713 on the luma axis 714 is derived from the unadjusted luma value 516 by adding a predetermined positive offset value to the unadjusted luma value 516. The point 713 may also be referred to as an “offset luma point” as with the point 710. A point 702 on the luminance axis 701 is derived from the offset luma point 713 by applying a Y’CbCr to RGB and then RGB to XYZ colour space conversion to a sample formed by the luma value of the point 713 and chroma values 520. The value of a Y component resulting from the Y’CbCr to RGB and RGB to XYZ colour space conversion performed on the offset luma point 713 is used as the value of the luminance point 702. A point 708 has a luma axis 714 coordinate equal to the point 713 and a luminance axis 701 coordinate equal to the point 702. The point 708 serves as a mapping between the points 702 and 713 and therefore belongs to the luminance-to-luma mapping 705.
Line 709 as seen in Fig. 7 is an approximation line used to approximate the luminance-to-luma mapping 705 around the unadjusted luma value 516. The approximation line 709 is determined from the points 707 and 708.
The adjusted luma value 522 is derived from the target luminance value 526 by applying the approximation line.
Deriving the approximation line using multiple approximation points such as the points 707 and 708 is advantageous because, generally, increasing the number of approximation points increases the accuracy of the luminance-to-luma mapping. With more approximation points there is a better chance that a ‘true’ adjusted luma value will be determined by the luminance-to-luma mapping, as the ‘true’ adjusted luma value should be in close proximity to one of the approximation points. A maximum error in the range between the approximation points 707 and 708 is bounded, whereas the error to the right of the rightmost approximation point 708 and to the left of the leftmost approximation point 707 is generally unbounded and may be indefinitely large. Therefore, having at least two approximation points is advantageous compared to using a single approximation point because of the bounded error region between the approximation points 707 and 708, as the presence of a range with bounded error generally improves approximation precision.
Fig. 8 is a schematic flow diagram showing a method 800 of determining the adjusted luma value 522 for a current YCbCr pixel value. The method 800 will be executed each time that a luminance adjustment is determined for a pixel. As a result, the method 800 will be executed for each pixel in an image. The method 800 may be implemented as one or more software code modules of the software application programs 233 resident in the hard disk drive 210 and being controlled in its execution by the processor 205. The method 800 will be described with reference to the example of Fig. 5, where the luma adjustment module 512 performs the steps of the method 800 to compensate for the luminance shifts introduced by the subsampling performed by the downsample Y’CbCr module 510. The luma adjustment module 512 receives the target Ylinear luminance value 526 determined by the RGB to Y conversion module 524. The luma adjustment module 512 also receives the unadjusted luma value 516 and subsampled chroma values 520. As described above, the adjusted luma value 522 may be used for encoding chroma downsampled 4:2:0 YCbCr video data into a bitstream.
The method 800 begins at a first offset luma point determining step 801. At first offset luma point determining step 801, the luma adjustment module 512, under control of the processor 205, determines the offset luma point 710 by adding a predetermined negative offset value to the unadjusted luma value 516.
Then, at a Y’CbCr to RGB conversion step 802, the luma adjustment module 512, under control of the processor 205, converts a sample comprising the offset luma point 710 and the chroma values 520 to the RGB colour space.
The method 800 then proceeds to an RGB to Y conversion step 803, where the luma adjustment module 512, under control of the processor 205, applies the following Equation (2) to the RGB sample to determine the luminance value 704:
where Y is the resulting luminance value; and R, G and B are corresponding components of the RGB sample. Any other suitable equation may be used at step 803 to determine the luminance value 704. Then at a second offset luma point determining step 804, the luma adjustment module 512, under control of the processor 205, determines the offset luma point 713 by adding a predetermined positive offset value to the unadjusted luma value 516. The predetermined offset values for the offset luma points 710 and 713 may be the same or may be different. The predetermined offset may be set to 0.1 percent of a total length of a range of valid luma values. Methods of determining different offset values will be described below.
Following step 804, the method 800 proceeds to a Y’CbCr to RGB conversion step 805, where the luma adjustment module 512, under control of the processor 205, converts a sample comprising the offset luma point 713 and the chroma values 520 to the RGB colour space. Then at an RGB to Y conversion step 806, the luma adjustment module 512, under control of the processor 205, applies the Equation (2) to the RGB sample to derive the luminance value 702.
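Steps 803 and 806 both apply Equation (2), a weighted sum of the linear R, G and B components. The particular weights below assume BT.2020 primaries and are an assumption of this sketch, not taken from Equation (2) itself:

```python
def linear_luminance(r, g, b):
    """Equation (2) form: luminance as a weighted sum of linear R, G, B.
    The BT.2020-derived weights used here are an assumption."""
    return 0.2627 * r + 0.6780 * g + 0.0593 * b

white = linear_luminance(1.0, 1.0, 1.0)   # the weights sum to 1.0
green = linear_luminance(0.0, 1.0, 0.0)   # green dominates luminance
```

Because the weights sum to one, equal-energy white maps to unit luminance, which is a quick sanity check for any coefficient set substituted here.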
The method 800 continues to an approximation line determining step 808, where the luma adjustment module 512, under control of the processor 205, determines the slope and intercept point of the approximation line 709, by performing the following Equations (3) and (4):

A = (P2 - P1) / (L2 - L1) (3)

B = P1 - A * L1 (4)

where A and B are the slope and intercept point of the approximation line 709 respectively; L1 and L2 are the luminance values at points 704 and 702 respectively; and P1 and P2 are the luma values at points 710 and 713 respectively. The approximation line 709 is determined from the offset luma points 710 and 713 determined at steps 801 and 804, respectively, and the luminance values corresponding to the points 702 and 704.
As described in more detail below, any other suitable equations may be used at step 808 to determine the approximation line 709 from the points 707 and 708.
The method 800 concludes at an adjusted luma value determining step 810, where the luma adjustment module 512, under control of the processor 205, determines the adjusted luma value 522, by applying the following Equation (5) to the target luminance value 526:

P = A * Y + B (5)

where P is the adjusted luma value 522; Y is the target luminance value 526; and A and B are the values derived using the Equations (3) and (4) respectively. As described, the adjusted luma value 522 is determined from the target luminance value using the determined approximation line 709.
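Steps 801 to 810 can be sketched end to end as follows. The `luminance_of` callback stands in for the Y’CbCr to RGB and RGB to Y conversions of steps 802-803 and 805-806, and the toy power-law mapping used in the example is an assumption of this illustration:

```python
def adjust_luma_two_point(y_unadjusted, cb, cr, y_target, luminance_of, offset=0.001):
    """Sample the luminance-to-luma mapping at two offset luma points,
    fit a line, and solve the line for the target luminance.
    luminance_of(y, cb, cr) models the Y'CbCr -> RGB -> Y chain."""
    p1 = y_unadjusted - offset       # step 801: negative-offset luma point
    p2 = y_unadjusted + offset       # step 804: positive-offset luma point
    l1 = luminance_of(p1, cb, cr)    # steps 802-803: luminance at p1
    l2 = luminance_of(p2, cb, cr)    # steps 805-806: luminance at p2
    a = (p2 - p1) / (l2 - l1)        # step 808: slope, luminance -> luma
    b = p1 - a * l1                  # step 808: intercept
    return a * y_target + b          # step 810: apply the line

# Toy mapping for illustration: luma y reconstructs to luminance y ** 2.4.
chain = lambda y, cb, cr: y ** 2.4
p = adjust_luma_two_point(0.70, 0.0, 0.0, 0.5, chain)
```

Unlike the bisection approach of Fig. 6, this sketch costs exactly two evaluations of the conversion chain per pixel, which is the property that makes the method amenable to hardware pipelining.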
In one arrangement of the method 800, at most one of the offsets applied at the steps 801 and 804 is zero (0).
In one arrangement of the method 800, more than two offset luma points are derived from the unadjusted luma value 516. In the arrangement where more than two offset luma points are derived from the unadjusted luma value 516, a polynomial equation corresponding to the number of derived offset points is applied at the step 810 to determine the adjusted luma value 522. For example, in the case of three (3) offset points, the corresponding approximation curve determined at the step 808 is a parabola and a second-order equation is applied at step 810 instead of the Equation (5), to derive the adjusted luma value 522.
In another arrangement of the method 800, one of the offsets applied at the steps 801 and 804 may be zero (0). The result of a zero offset is that one of either the offset luma point 710 or offset luma point 713 is located at the unadjusted luma value 516.
In one arrangement of the method 800, one or more of the offset luma points determined at steps 801 and 804 may be determined adaptively. The adaptive offsets of the luma point may be determined by taking into account one or more of the adjusted luma values 522 determined for previous pixel values in an image.
The offset luma point 710 may have an adaptive offset determined at the step 801, using Equation (6):
where V710 is the value of the offset luma point 710; and Pleft is the adjusted luma value 522 determined for a most recent invocation of the method 800 for a previous pixel in the image, where the determined adjusted luma value 522 is less than the unadjusted luma value 516.
The offset luma point 713 may also have an adaptive offset determined in a similar manner at the step 804 using Equation (7):
where V713 is the value of the offset luma point 713; and Pright is the adjusted luma value 522 determined at the most recent invocation of the method 800 for a previous pixel in the image, where the determined adjusted luma value 522 was greater than the unadjusted luma value 516.
Alternatively, the offset luma point 710 may also have an adaptive offset determined using Equation (8):
where V is the value of the offset luma point 710; P522 is the adjusted luma value 522 determined at the most recent invocation of the method 800 for a previous pixel of the image, where the adjusted luma value 522 determined at the step 810 was less than the unadjusted luma value 516; and Vprev is the value of the offset luma point 710 determined at the most recent invocation of the method 800 for a previous pixel in the image, where the adjusted luma value 522 determined at the step 810 was less than the unadjusted luma value 516.
The offset luma point 713 may also have an adaptive offset determined using Equation (9):
where V is the value of the offset luma point 713; P522 is the adjusted luma value 522 determined at the most recent invocation of the method 800 for a previous pixel of the image, where the adjusted luma value 522 determined at the step 810 was greater than the unadjusted luma value 516; and Vprev is the value of the offset luma point 713 determined at the most recent invocation of the method 800 for a previous pixel of the image, where the adjusted luma value 522 determined at the step 810 was greater than the unadjusted luma value 516.
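Equations (6) to (9) are not reproduced in the text above, so the following is only one plausible reading of the adaptive-offset idea, with all names and the minimum offset being assumptions: the adjusted luma values of previous pixels that landed below and above their unadjusted luma are reused as the new left and right offset points, clamped so that each point stays on its own side of the current unadjusted luma.

```python
def adaptive_offset_points(unadjusted, prev_left, prev_right, min_offset=1):
    """Hypothetical adaptive placement of the two offset luma points.

    `prev_left` / `prev_right` are the most recent adjusted luma values
    that fell below / above their respective unadjusted luma values.
    """
    # Left point: the previous below-result, but never closer to the
    # unadjusted luma than the minimum offset (and never on its right).
    lo = min(prev_left, unadjusted - min_offset)
    # Right point: symmetric clamp on the other side.
    hi = max(prev_right, unadjusted + min_offset)
    return lo, hi
```

This keeps the approximation line anchored near where recent adjustments actually landed, at the cost of carrying per-pixel history.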
The approximation line 709 provides a best approximation of the luminance-to-luma mapping at the offset luma points and in some limited neighbourhood of the offset luma points. Approximation error in the region between the offset luma points 710 and 713 is bounded and has been found to be acceptably low. However, approximation error outside the region between the offset luma points 710 and 713, and outside a limited neighbourhood of the offset luma points, may be prohibitively large. To address the approximation error outside the offset luma points 710 and 713, a method 900, described below, may be used by the luma adjustment module 512.
The arrangements of the method 800 have fixed complexity and are therefore better suited to hardware implementation than iterative methods, such as the method used by the iterative luma adjustment module 600.
Fig. 9 is a schematic flow diagram showing a method 900 of determining the adjusted luma value 522. The method 900 may be implemented as one or more software code modules of the software application programs 233 resident in the hard disk drive 210 and being controlled in its execution by the processor 205. The method 900 will be described with reference to the example of Fig. 5.
The method 900 begins at an adjust luma value determination step 901, where the method 800 is invoked under execution of the processor 205. At the step 901, the luma adjustment module 512, under control of the processor 205, determines an adjusted luma value as returned by the method 800.
At a measure distance step 903, the luma adjustment module 512, under control of the processor 205, determines a distance on the luma axis 714 between the determined adjusted luma value 522 and an offset luma point (i.e., either offset luma point 710 or 713), closest to the determined adjusted luma value 522.
At a distance test step 904, the luma adjustment module 512, under control of the processor 205, determines if the distance determined at the step 903 is below a predetermined threshold value. If the distance determined at the step 903 is below the predetermined threshold value, then the method 900 returns the adjusted luma value, determined at the step 901, as the adjusted luma value 522. If the determined distance is greater than or equal to the predetermined threshold value, then control of the method 900 is passed to an iteration test step 905.
At the iteration test step 905, the luma adjustment module 512, under control of the processor 205, determines if the method 900 will continue based on the number of iterations that the method 900 has completed. The number of iterations the method 900 may perform is limited to a predetermined number to limit the complexity of a hardware implementation. If the number of iterations of the method 900 has not exceeded a predetermined iteration threshold, then control of the method 900 returns to the step 901, passing the adjusted luma value returned by the previous invocation of the method 800 as the unadjusted luma value. If the iteration threshold has been exceeded, then the method 900 terminates, returning the adjusted luma value determined at the most recent execution of the step 901 as the adjusted luma value 522.
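The bounded refinement of the method 900 can be sketched as follows. The inlined two-point line fit standing in for the method 800, the `luminance_of` helper, and the offset, threshold and iteration-cap values are all illustrative assumptions, not values taken from the document; the sketch also treats the distance as zero when the result lies between the offset points, as in the arrangement described below.

```python
def refine_luma(target_y, initial_luma, luminance_of,
                offset=8, threshold=2.0, max_iters=3):
    """Bounded-iteration refinement in the spirit of the method 900."""
    luma = initial_luma
    for _ in range(max_iters):
        # Step 901: one fixed-complexity adjustment (the method 800),
        # inlined here as a two-point line fit and inversion.
        p_lo, p_hi = luma - offset, luma + offset
        y_lo, y_hi = luminance_of(p_lo), luminance_of(p_hi)
        a = (y_hi - y_lo) / (p_hi - p_lo)
        b = y_lo - a * p_lo
        new_luma = (target_y - b) / a
        # Step 903: distance to the nearest offset point; zero inside
        # the well-approximated region between the two points.
        if p_lo <= new_luma <= p_hi:
            dist = 0.0
        else:
            dist = min(abs(new_luma - p_lo), abs(new_luma - p_hi))
        # Step 904: accept if close enough; otherwise re-centre and
        # iterate, subject to the iteration cap (step 905).
        if dist < threshold:
            return new_luma
        luma = new_luma
    return luma
```

Because the iteration count is capped, the worst-case complexity per pixel remains fixed, preserving the hardware-friendly property of the method 800.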
In one arrangement of the method 900, the distance determined at the measure distance step 903 is zero (0) if the adjusted luma value determined at the step 901 is in the region between the offset luma points 710 and 713.
Table 1 below shows results of experiments conducted by the inventor over a well-known video sequence test set. As seen in Table 1, below, the inventor observed that the fixed-complexity approach of the method 800 provides performance very close to the approach of the iterative luma adjustment module 600, having average and maximum differences of only 0.07 dB and 0.22 dB, respectively. At the same time, the run-time complexity reduction is over 45% compared to the iterative approach.
Table 1
Although the arrangements above are described using the Perceptual Quantizer Electro-Optical Transfer Function (PQ-EOTF) as an example, the arrangements are in fact independent of the Transfer Function (TF) used by the colour space conversion module 114. Other Electro-Optical Transfer Functions (EOTFs) or Opto-Electrical Transfer Functions (OETFs), such as, for example, the Hybrid Log-Gamma Opto-Electrical Transfer Function (HLG OETF), may be used.
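For reference, the PQ EOTF mentioned above is standardised in SMPTE ST 2084 and ITU-R BT.2100. A minimal sketch of that transfer function, mapping a non-linear code value in [0, 1] to absolute luminance, is given below; the function name is an assumption, but the constants are the standard ST 2084 values.

```python
def pq_eotf(e, peak=10000.0):
    """SMPTE ST 2084 (PQ) EOTF: non-linear code value e in [0, 1]
    to linear luminance in cd/m^2 (nits)."""
    m1 = 2610.0 / 16384.0            # 0.1593017578125
    m2 = 2523.0 / 4096.0 * 128.0     # 78.84375
    c1 = 3424.0 / 4096.0             # 0.8359375
    c2 = 2413.0 / 4096.0 * 32.0      # 18.8515625
    c3 = 2392.0 / 4096.0 * 32.0      # 18.6875
    ep = e ** (1.0 / m2)
    return peak * (max(ep - c1, 0.0) / (c2 - c3 * ep)) ** (1.0 / m1)
```

The steep, highly non-linear shape of this curve is exactly why naive chroma downsampling produces the luminance deviations the arrangements above compensate for.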
Arrangements disclosed herein provide for a video system that encodes and decodes video content that has been subsampled (e.g. to the 4:2:0 chroma format), with compensation for deviations in the luminance of each pixel that would otherwise be present in a conventional chroma downsampler. For HDR applications using highly nonlinear transfer functions, such as PQ-EOTF, the deviations are more significant than in traditional SDR applications. Moreover, the methods described herein operate with fixed complexity per pixel and with complexity commensurate with hardware implementation (e.g. for real-time systems).
Notwithstanding the above description of operation of the system with the source material 112 in the 4:4:4 chroma format and the video encoder 118 configured to encode video data in the 4:2:0 chroma format, source material 112 in other chroma formats, such as 4:2:2, may also be used. Moreover, source material 112 that is in an interlaced format may be used, with the chroma upsampling filter alternating between the top and bottom fields.
INDUSTRIAL APPLICABILITY
The arrangements described are applicable to the computer and data processing industries and particularly to digital signal processing for the encoding and decoding of signals such as video signals.
The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive. For example, one or more features of the arrangements described above may be combined.
In the context of this specification, the word “comprising” means “including principally but not necessarily solely” or “having” or “including”, and not “consisting only of”. Variations of the word “comprising”, such as “comprise” and “comprises”, have correspondingly varied meanings.

Claims (10)

CLAIMS:
1. A method of determining a luma value from 4:4:4 RGB video data for use in encoding chroma downsampled 4:2:0 YCbCr video data into a bitstream, the method comprising:
determining an initial luma point and a target luminance value from received RGB values for a pixel in the video data;
determining a plurality of offset luma points, each of the offset luma points being offset from the initial luma point by a predetermined amount;
determining corresponding luminance values for each of the determined offset luma points;
determining an approximation line from the determined offset luma points and the corresponding luminance values; and
determining the luma value from the target luminance value using the determined approximation line.
2. The method according to claim 1, wherein the approximation line is an approximation of a luminance-to-luma mapping for a combination of chroma downsampled 4:2:0 YCbCr values of a pixel.
3. The method according to claim 1, wherein at least one of the plurality of offset luma points is located at the initial luma point.
4. The method according to claim 1, wherein the determined plurality of offset luma points and the corresponding luminance values define points located on a luminance-to-luma mapping for a combination of chroma downsampled 4:2:0 YCbCr values of a pixel.
5. The method according to claim 1, wherein the predetermined amount for the offset luma points is equal for all of the plurality of offset luma points.
6. The method according to claim 1, wherein the plurality of points are two points.
7. A system for determining a luma value from 4:4:4 RGB video data for use in encoding chroma downsampled 4:2:0 YCbCr video data into a bitstream, the system comprising:
a memory for storing data and a computer program;
a processor coupled to the memory for executing the computer program, the computer program comprising instructions for:
determining an initial luma point and a target luminance value from received RGB values for a pixel in the video data;
determining a plurality of offset luma points, each of the offset luma points being offset from the initial luma point by a predetermined amount;
determining corresponding luminance values for each of the determined offset luma points;
determining an approximation curve from the determined offset luma points and the corresponding luminance values; and
determining the luma value from the target luminance value using the determined approximation curve.
8. An encoder for determining a luma value from 4:4:4 RGB video data for use in encoding chroma downsampled 4:2:0 YCbCr video data into a bitstream, the encoder comprising:
a module for determining an initial luma point and a target luminance value from received RGB values for a pixel in the video data;
a module for determining a plurality of offset luma points, each of the offset luma points being offset from the initial luma point by a predetermined amount;
a module for determining corresponding luminance values for each of the determined offset luma points;
a module for determining an approximation curve from the determined offset luma points and the corresponding luminance values; and
a module for determining the luma value from the target luminance value using the determined approximation curve.
9. An apparatus for determining a luma value from 4:4:4 RGB video data for use in encoding chroma downsampled 4:2:0 YCbCr video data into a bitstream, the apparatus comprising:
means for determining an initial luma point and a target luminance value from received RGB values for a pixel in the video data;
means for determining a plurality of offset luma points, each of the offset luma points being offset from the initial luma point by a predetermined amount;
means for determining corresponding luminance values for each of the determined offset luma points;
means for determining an approximation curve from the determined offset luma points and the corresponding luminance values; and
means for determining the luma value from the target luminance value using the determined approximation curve.
10. A computer readable medium having a computer program stored thereon for determining a luma value from 4:4:4 RGB video data for use in encoding chroma downsampled 4:2:0 YCbCr video data into a bitstream, the program comprising:
code for determining an initial luma point and a target luminance value from received RGB values for a pixel in the video data;
code for determining a plurality of offset luma points, each of the offset luma points being offset from the initial luma point by a predetermined amount;
code for determining corresponding luminance values for each of the determined offset luma points;
code for determining an approximation curve from the determined offset luma points and the corresponding luminance values; and
code for determining the luma value from the target luminance value using the determined approximation curve.
AU2016203467A 2016-05-26 2016-05-26 Method, apparatus and system for determining a luma value Abandoned AU2016203467A1 (en)


Publications (1)

Publication Number Publication Date
AU2016203467A1 (en) 2017-12-14




Legal Events

Date Code Title Description
MK4 Application lapsed section 142(2)(d) - no continuation fee paid for the application