MXPA97008213A - Synchronization of a stereoscopic video sequence - Google Patents
Synchronization of a stereoscopic video sequence - Info
- Publication number
- MXPA97008213A MXPA97008213A MXPA/A/1997/008213A MX9708213A MXPA97008213A MX PA97008213 A MXPA97008213 A MX PA97008213A MX 9708213 A MX9708213 A MX 9708213A MX PA97008213 A MXPA97008213 A MX PA97008213A
- Authority
- MX
- Mexico
- Prior art keywords
- images
- image
- hei
- lower layer
- intensified
- Prior art date
Links
- 230000005540 biological transmission Effects 0.000 claims abstract description 43
- 238000000034 method Methods 0.000 claims abstract description 39
- 230000002457 bidirectional effect Effects 0.000 claims description 10
- 238000003860 storage Methods 0.000 description 14
- 230000033001 locomotion Effects 0.000 description 8
- 238000007906 compression Methods 0.000 description 7
- 230000006835 compression Effects 0.000 description 7
- 230000006837 decompression Effects 0.000 description 6
- 238000012545 processing Methods 0.000 description 6
- 230000003466 anti-cipated effect Effects 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 230000002123 temporal effect Effects 0.000 description 5
- 230000008901 benefit Effects 0.000 description 4
- 238000010586 diagram Methods 0.000 description 4
- 238000009826 distribution Methods 0.000 description 4
- 239000013598 vector Substances 0.000 description 4
- 230000006870 function Effects 0.000 description 3
- 230000006978 adaptation Effects 0.000 description 2
- 238000004364 calculation method Methods 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 230000001360 synchronised effect Effects 0.000 description 2
- 238000004458 analytical method Methods 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 238000012937 correction Methods 0.000 description 1
- 238000013144 data compression Methods 0.000 description 1
- 230000003111 delayed effect Effects 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 230000005055 memory storage Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000011002 quantification Methods 0.000 description 1
- 238000013139 quantization Methods 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 238000012163 sequencing technique Methods 0.000 description 1
- 230000008054 signal transmission Effects 0.000 description 1
- 230000005236 sound signal Effects 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
Abstract
The present invention relates to a method for ordering a sequence of video images in a lower layer (L) and in an intensified layer (E) of a stereoscopic video signal for transmission to a decoder, where the intensified layer includes disparity-predicted images that are predicted using the corresponding lower layer images. In one case the lower layer includes only intra-coded images (I images), including consecutive images ILi, ILi+1 and ILi+2, the corresponding intensified layer images are represented by HEi, HEi+1 and HEi+2, respectively, and H designates a generic image type. The method comprises the step of reordering the video images such that the disparity-predicted intensified layer images are transmitted after the respective corresponding lower layer images, for example in the order: ILi, ILi+1, HEi, ILi+2, HEi+1, and so on.
Description
SYNCHRONIZATION OF A STEREOSCOPIC VIDEO SEQUENCE
FIELD OF THE INVENTION The present invention relates to an apparatus and method for synchronizing the decoding and display of a stereoscopic video sequence. In particular, a system is presented for determining the presentation time stamp and the decoding time stamp of an intensified layer, as well as a corresponding optimal bitstream transmission order that minimizes the required size of the decoder's input buffer.
BACKGROUND OF THE INVENTION Digital technology has revolutionized the supply of video and audio services to consumers since it can provide signals of a much higher quality than those provided with analog techniques and provides additional features that were previously not available. Digital systems are particularly advantageous for signals that are broadcast via a cable or satellite television network to cable television affiliates and / or directly to domestic television receivers via
satellite. In such systems, a subscriber receives the digital data stream through a receiver/decoder unit that decompresses and decodes the data in order to reconstruct the original video and audio signals. The digital receiver includes a microcomputer and memory storage elements used in this process. The need to provide low-cost receivers while still providing high-quality video and audio requires that the amount of data to be processed be limited. In addition, the available bandwidth for digital signal transmission may be limited by physical constraints, existing communication protocols and government regulation. Accordingly, various intra-frame data compression schemes have been developed that take advantage of the spatial correlation between adjacent pixels in a given video image (e.g., a frame). In addition, inter-frame compression schemes take advantage of the temporal correlation between corresponding regions of successive frames using motion compensation data and block-matching motion estimation algorithms. In this case, a motion vector is determined for each
P1523 / 97MX block in a current or current figure of an image identifying a block in a previous image that closely resembles the particular current block. The entire current image can then be reconstructed in a decoder that sends data representing the difference between the corresponding pairs of blocks, together with the motion vectors that are required to identify the corresponding pairs. Block matching movement estimation algorithms are particularly effective when combined with spatial compression techniques based on blocks such as for example the discrete cosine transform (DCT). Additionally, there is a growing interest in the proposed stereoscopic video transmission formats, such as the MPEG-2 Multi-View Profile System (MPEG) of the Motion Picture Expert Group (MPEG), described in the ISO document. / IEC JTC1 / SC29 / WG11 N1088, entitled "Amendments to Project Proposal No. 3 to 13818-2 (Multi-view Profile)", November 1995, incorporated herein by reference. The stereoscopic video provides slightly displaced views of the same image to produce a combined image with greater depth of field, creating or generating in this way a three-dimensional effect (3-D). In this system, double cameras can be located
P1523 / 97MX approximately two inches apart to record an event on two separate video signals. The separation of the cameras approximates the distance of the right and left human eyes. further, with some stereoscopic video recorder cameras, the two lenses are incorporated in a head of the camera recorder and, therefore, move in synchrony, for example, when moving through an image. The two video signals can be transmitted and recombined in a receiver to produce an image with a depth of field corresponding to normal human vision. Other special effects can also be provided. The MVP MPEG system includes two layers or video layers that are transmitted in a multiplexed signal. First, a layer or base layer (eg, lower) represents the left view of a three-dimensional object. Second, an intensified layer (for example, auxiliary or superior) represents the right view of the object. Since the right and left views are of the same object and are only slightly displaced with respect to each other, there will usually be a greater degree of correlation between the video images of the base and intensified layers. This relationship can be used to compress the data of the intensified layer with respect to those of the base layer, reducing in this way
P1523 / 97MX the amount of data that needs to be transmitted in the intensified layer to maintain a given image quality. The image quality generally corresponds to the level of quantification of the video data. The MVP MPEG system includes three types of video images; specifically, the intra-coded image (image I), the predictive coded image (P image) and, the bi-directional predictive coded image (image B). Additionally, while the base layer supports either video sequences with frame or field structure, the intensified layer supports only frame structure. An I picture completely describes a single video image without reference to any other image. For improved cancellation or cancellation of the error, motion vectors may be included in an I image. An error in an I image has the potential to have a greater impact on the displayed or displayed video, because both images, the P image and B image in the base layer are predicted from I images. Also, the images in the Intensified layer can be predicted from images in the base layer in a cross-section prediction process known as disparity prediction. Predicting one frame to another within a layer is known as a temporal prediction. In the base layer, the P images are described
P1S23 / 97MX based on previous I or P images. The reference is of an I or P image before a future P image and is known as an anticipated prediction. The B images are predicted from the closest previous I or P image and the closest last I or P image. In the intensified layer, an image P can be predicted from (a) the most recently decoded image of the intensified layer, (b) the image of the most recent base layer, in order of display, or (c) the image of the next lower layer, in order of display. The case of (b) is normally used when the image of the most recent base layer, in order of display, is an I. In addition, an image B of the intensified layer can be predicted by using (d) the image of the intensified layer most recent decoded for anticipated prediction and, the most recent lower layer image, in order of display, for retrograde prediction, (e) the most recent decoded intensified layer image for anticipated prediction and the image of the next lower layer, in order of display, for retrograde prediction or, (f) the image of the most recent lower layer, in order of display, for the anticipated prediction and, the image of the next lower layer, in order of display, for retrograde prediction, when the image of the lower layer
P1523 / 97MX more recent, in order of display, is an I image, only that I image will be used for predictive coding (for example, there will be no anticipated prediction). Note that only the prediction modes of (a), (b) and (d) are covered or included within the MPEG MVP system. The MVP system is a subset of the MPEG temporal scalability coding, which encompasses or includes each of the modes of (a) - (f). In an optional configuration, the intensified layer only has P or B images, but not images I. The reference to a future image (that is, one that has not yet been displayed) is called retrograde prediction. Note that no retrograde prediction occurs within the intensified layer. In accordance with the above, the images of the intensified layer are transmitted in order of display. These are situations where retrograde prediction is very useful in increasing the compression rate. For example, in a scene where a door is opened, the current image can predict what is behind the door based on a future image in which the door is already open. The B images provide or produce the greatest compression but also incorporate the greatest error. To eliminate the propagation of the error, B images can never be predicted from other B images of the layer
P1523 / 97MX base. P images produce less error and less compression. I images produce minimal compression but can provide random access. Thus, in the base layer for decoding P images, the previous or previous image I or image P must be available. Similarly, to decode B images, the previous P or I images and future P or I images must be available. Consequently, the video images are encoded and transmitted with order dependence, in this way, all the images used for the prediction are encoded before the images predicted from them. When the encoded signal is received in a decoder, the video images are decoded and reordered for display or display. In accordance with the above, temporary storage elements are required to buffer the data before it is displayed. However, the need for a relatively large decoder input buffer increases the manufacturing cost of the decoder. This is undesirable since the decoders are items sold at a massive level that must be produced at the lowest possible cost. Additionally, there is a need to synchronize the decoding and presentation of the video sequences of the intensified layer and the layer
P1523 / 97MX base. The synchronization of the decoding and the presentation process for stereoscopic video is a particularly important aspect of the MVP. Since it is inherent in stereoscopic video that two views are closely coupled with each other, the loss of presentation or display synchronization could cause many problems for the viewer, such as, for example, eye fatigue, headaches, etc. In addition, the problems encountered in dealing with this issue for compressed digital bitstreams are different from those with uncompressed bitstreams or analog signals, such as those that conform to NTSC or PAL standards. For example, with the NTSC or PAL signals, the images are transmitted in a synchronized manner, so that a clock signal can be derived directly from the synchronization of the image. In this case, the synchronization of two views can be easily obtained using the synchrony of the image. However, in a digital compressed stereoscopic bitstream, the amount of data for each image in each layer is variable and depends on the speed of the bits, the types of image coding and the complexity of the scene. In this way, the decoding and presentation time may not be derived
P1523 / 97MX directly from the start of the image data. That is, unlike analog video transmissions, there is no natural concept of synchronic pulses in a digitally compressed bit stream. In accordance with the above, it would be advantageous to provide a system for synchronizing the decoding and presentation of a stereoscopic video sequence. The system must also be compatible with decoders that decode images either sequentially (for example, one image at a time) or in parallel (for example two images at a time). In addition, the system must provide an optimal image transmission order that minimizes the required size of the decoder input buffer. The present invention provides a system having the above advantages and other additional advantages.
SUMMARY OF THE INVENTION According to the present invention, there is presented a method and an apparatus for sequencing the transmission of video images of the lower and intensified layers of a stereoscopic video sequence. In particular, the images are transmitted in an order such that the number of images that must be stored temporarily before presentation is minimized.
In addition, the decoding time stamp (DTS) and the presentation time stamp (PTS) for each image can be determined to provide synchronization between the images of the lower layer and the intensified layer in the decoder, where decoding occurs either sequentially or in parallel. In particular, a method is presented for ordering the transmission of a sequence of video images in a lower layer and in an intensified layer of a stereoscopic video signal, wherein the intensified layer includes disparity-predicted images that are predicted using the corresponding images of the lower layer. The method includes the step of ordering the video images such that the disparity-predicted intensified layer images are transmitted after the respective corresponding lower layer images. In a first embodiment, the lower layer includes only intra-coded images (I images), including consecutive images ILi, ILi+1, ILi+2, ILi+3, ILi+4, etc., and the corresponding intensified layer images are represented by HEi, HEi+1, HEi+2, HEi+3, HEi+4, etc. In this case the video images are transmitted in the order: ILi, ILi+1, HEi, ILi+2, HEi+1, ILi+3, HEi+2, ILi+4, HEi+3, etc. (for example, sequence 1). Alternatively, in a second embodiment, the video images are transmitted in the order: ILi, HEi, ILi+1, HEi+1, ILi+2, HEi+2, ILi+3, HEi+3, etc. (for example, sequence 2). In a third embodiment, the lower layer includes only intra-coded images (I images) and predictive-coded images (P images), including consecutive images ILi, PLi+1, PLi+2, PLi+3, PLi+4, etc., and the corresponding intensified layer images are represented by HEi, HEi+1, HEi+2, HEi+3, HEi+4, etc., respectively. Here, the video images are transmitted in the order: ILi, PLi+1, HEi, PLi+2, HEi+1, PLi+3, HEi+2, PLi+4, HEi+3, etc. (for example, sequence 3). Alternatively, in a fourth embodiment, the video images are transmitted in the order: ILi, HEi, PLi+1, HEi+1, PLi+2, HEi+2, PLi+3, HEi+3, etc. (for example, sequence 4). In a fifth embodiment, the lower layer includes intra-coded images (I images), predictive-coded images (P images) and non-consecutive bi-directionally predictive-coded images (B images), including consecutive images ILi, BLi+1, PLi+2, BLi+3, PLi+4, BLi+5, PLi+6, etc., respectively, and the corresponding intensified layer images are represented by HEi, HEi+1, HEi+2, HEi+3, HEi+4, HEi+5, HEi+6, etc., respectively. The video images are transmitted in the order: ILi, PLi+2, BLi+1, HEi, HEi+1, PLi+4, BLi+3, HEi+2, HEi+3, etc. (for example, sequence 5). Alternatively, in a sixth embodiment, the video images are transmitted in the order: ILi, HEi, PLi+2, BLi+1, HEi+1, HEi+2, PLi+4, BLi+3, HEi+3, HEi+4, etc. (for example, sequence 6). Alternatively, in a seventh embodiment, the video images are transmitted in the order: ILi, PLi+2, HEi, BLi+1, HEi+1, PLi+4, HEi+2, BLi+3, HEi+3, etc. (for example, sequence 7). In an eighth embodiment, the lower layer includes intra-coded images (I images), predictive-coded images (P images) and bi-directionally predictive-coded images (B images) with two consecutive B images, including consecutive images ILi, BLi+1, BLi+2, PLi+3, BLi+4, BLi+5, PLi+6, etc., respectively, and the corresponding intensified layer images are represented by HEi, HEi+1, HEi+2, HEi+3, HEi+4, HEi+5 and HEi+6, etc., respectively. The video images are transmitted in the order: ILi, PLi+3, BLi+1, HEi, BLi+2, HEi+1, HEi+2, PLi+6, BLi+4, HEi+3, BLi+5, HEi+4, HEi+5, etc. (for example, sequence 8). Alternatively, in a ninth embodiment, the video images are transmitted in the order: ILi, HEi, PLi+3, BLi+1, HEi+1, BLi+2, HEi+2, HEi+3, PLi+6, BLi+4, HEi+4, BLi+5, HEi+5, HEi+6, etc. (for example, sequence 9). Alternatively, in a tenth embodiment, the video images are transmitted in the order: ILi, PLi+3, HEi, BLi+1, HEi+1, BLi+2, HEi+2, PLi+6, HEi+3, BLi+4, HEi+4, BLi+5, HEi+5, etc. (for example, sequence 10). The corresponding apparatus is also presented. Additionally, a receiver is presented for processing a sequence of video images of a stereoscopic signal that includes a lower layer and an intensified layer. The receiver includes a memory, a decompression/prediction processor and a memory manager operatively associated with the memory and with the processor. The memory manager determines the storage of selected lower layer images such that they are processed by the decompression/prediction processor before the corresponding disparity-predicted intensified layer images. In addition, the decoding can be carried out either sequentially or in parallel.
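The common property of the orderings above is that each disparity-predicted intensified layer image follows the lower layer image from which it is predicted. As a rough illustration only, not part of the claimed method, the sketch below interleaves two display-order picture lists in the style of sequences 2 and 4 (a lower layer without B images); the function name and list representation are assumptions made for this example.

```python
def interleave_no_b(lower, enhanced):
    """Interleave display-order picture lists so that each intensified-layer
    picture is sent right after its corresponding lower-layer picture, as in
    sequences 2 and 4 (lower layer without B pictures).

    lower, enhanced: lists of picture labels aligned index-for-index in
    display order, e.g. ["IL2", "IL3", ...] and ["BE2", "PE3", ...].
    """
    order = []
    for low_pic, enh_pic in zip(lower, enhanced):
        order.append(low_pic)   # reference picture first ...
        order.append(enh_pic)   # ... then the disparity-predicted picture
    return order

# Example (labels follow Figure 2 of the description):
# interleave_no_b(["IL2", "IL3", "IL4"], ["BE2", "PE3", "BE4"])
# -> ["IL2", "BE2", "IL3", "PE3", "IL4", "BE4"]
```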
BRIEF DESCRIPTION OF THE DRAWINGS Figure 1 is a block diagram of an encoder / decoder structure for stereoscopic video.
P1523 / 97MX Figure 2 is an illustration of a sequence of intensified layer images and a first sequence of images of the base layer for use with the system of the present invention. Figure 3 is an illustration of a sequence of intensified layer images and a second sequence of images of the base layer for use with the system of the present invention. Figure 4 is an illustration of a sequence of intensified layer images and a third image sequence of the base layer for use with the system of the present invention. Figure 5 is an illustration of a sequence of intensified layer images and a fourth sequence of images of the base layer for use with the system of the present invention. Figure 6 is a block diagram of an intensified layer decoder structure for stereoscopic video.
DETAILED DESCRIPTION OF THE INVENTION A method is described as an apparatus for synchronizing the decoding and presentation of a sequence of stereoscopic video images. Figure 1 is a block diagram of a
P1523 / 97MX encoder / decoder structure for stereoscopic video. The MPEG MVP standard and similar systems include the encoding of two video layers, which include a lower layer and an enhanced layer. For this application, the lower layer is assigned the left view while the enhanced layer is assigned the right view. In the encoder / decoder (e.g., codec) structure of Figure 1, the video sequences of the lower layer and the intensified layer are received by a temporal remultiplexer (remux) 105. Using time division multiplexing ( TDMX), the video of the intensified layer is supplied to an intensification encoder 110, while the video of the base layer is supplied to a lower encoder 115. Note that the video data of the lower layer can be supplied to the encoder of 110 intensification for the prediction of disparity. The intensified and coded base layers are then supplied to a multiplexer 120 of the system for transmission to a decoder, generally shown at 122, as a transport stream. The transmission path is normally a satellite link to a headend of the cable system or directly via satellite to the consumer's home. In the decoder 122, the transport stream is
P1523 / 97MX demultiplexed in a demultiplexer 125 of the system. The coded intensified layer data is supplied to an intensifier decoder 130, while the encoded lower layer data is supplied to a lower decoder 135. Note that the decoding is preferably performed concurrently with the lower and intensified layers in a parallel processing configuration. Alternatively, the intensifier decoder 130 and the lower decoder 135 can share common processing hardware, in which case, the decoding can be performed sequentially, one image at a time. The decoded lower layer data is output from the lower decoder 135 as a separate data stream and is also supplied to a temporary remultiplexer 140. In the temporal remultiplexer 140, the decoded base layer data and the decoded intensified layer data combine to provide an intensified layer output signal as shown. The output signals of the intensified and lower layer are then supplied to a display device for observation. In addition, the coded bitstreams for the two layers the lower and the intensified must be multiplexed in multiplexer 120 of the system of such
P1523 / 97MX so that the decoder 122 can decode any frame or field depending only on the frame or on the fields that have already been decoded. However, this problem is complicated by the fact that the prediction modes for images P and B are different in the lower and intensified layers. In addition, the images of the intensified layer are always transmitted in the order of presentation (for example, display), while this is often not the case for the lower layer. Therefore, there is a frequent need to store and reorder the video images in the decoder, so that decoding and display can occur in the proper order. Additionally, difficulties arise in the synchronization of the decoding and the presentation of the lower and intensified layer data. As mentioned, the video bitstreams of the lower layer and the intensified layer are transmitted as two elementary video streams. For the transport stream, two packet identifiers are specified
(PID) of the transport stream packages in a program map section of the transport stream for the two layers. In addition, the time information is carried in the adaptation field of the packages selected for the lower layer (for example, in the
P1523 / 97MX PCR_PID field) to serve as a reference for time comparisons in the decoder. Specifically, samples of a clock at 27 MHz are transmitted in the program_clock_reference (PCR) field. More precisely, the samples are transmitted in the field program_clock_reference_base and program__clock_reference_extension, described in the MPEG-2 system document ITU-T Rec. H. 262 ISO / IEC 13818-1, of April 27, 1995, incorporated herein by reference. reference. Additional details of the MPEG-2 standard can be found in document ISO / IEC JTC1 / SC29 / WGII N0702, entitled "Information Technology -Generic Coding of Moving Pictures and Associated Audio, Recommendation H.262", of March 25, 1994, incorporated herein by reference. The PCR indicates the expected time at the end of the reading of a field of the bitstream in the decoder. In the part of the local clock that works in the decoder it is compared with the PCR value of the bitstream at the time when the PCR value is obtained to determine whether the decoding of the video data, audio data and other data It is synchronized. In addition, the decoder's sample clocks are fixed to the system clock derived from the PCR values. The PCR values are calculated using
the equations described in ITU-T Rec. H.262, ISO/IEC 13818-1 and shown below: PCR(i) = PCR_base(i) x 300 + PCR_ext(i), where: PCR_base(i) = ((system_clock_frequency x t(i)) DIV 300) % 2^33, and PCR_ext(i) = ((system_clock_frequency x t(i)) DIV 1) % 300; where the "%" symbol indicates a modulo operation and DIV indicates integer division. Similarly, for the program stream of the stereoscopic video signal, the time information is carried in the packet header as a sample of the 27 MHz clock in the system_clock_reference (SCR) field. The SCR values are calculated using the equations described in ITU-T Rec. H.262, ISO/IEC 13818-1 and set out below: SCR(i) = SCR_base(i) x 300 + SCR_ext(i), where: SCR_base(i) = ((system_clock_frequency x t(i)) DIV 300) % 2^33, and SCR_ext(i) = ((system_clock_frequency x t(i)) DIV 1) % 300. The identification of the video packets in the two layers, the lower and the intensified, is specified in a program stream map as two stream identifiers. For both the transport stream and the program stream, the synchronization of the decoding and presentation process for stereoscopic video is supplied in packets of the packetized elementary stream (PES). In particular, a presentation time stamp (PTS) and/or a decoding time stamp (DTS) are provided in the optional fields of the PES headers. PES packets are constructed for each elementary video stream before transport or program packetization. A new PES packet is provided in the PES stream whenever it is necessary to send a PTS and/or a DTS to the decoder. Therefore, a key factor for synchronization is correctly calculating the PTS and the DTS. The PTS and the DTS are determined by the encoder based on hypothetical decoder models, namely the transport stream system target decoder (T-STD) or the program stream system target decoder (P-STD), both described in ITU-T Rec. H.262, ISO/IEC 13818-1. Both PTS and DTS values are specified in units of the period of the system clock frequency divided by 300, which yields 90 kHz units. In particular, as described in ITU-T Rec. H.262, ISO/IEC 13818-1: PTS(k) = ((system_clock_frequency x tpn(k)) DIV 300) % 2^33, where tpn(k) is the presentation time of presentation unit Pn(k). Similarly, DTS(j) = ((system_clock_frequency x tdn(j)) DIV 300) % 2^33, where tdn(j) is the decoding time of access unit An(j). The DTS of the video thus indicates the time at which the image needs to be decoded by the STD. The PTS of the video indicates the time at which the decoded image is to be presented to the viewer (for example, displayed on a television). In addition, the times indicated by the PTS and the DTS are evaluated with respect to the current PCR or SCR value. A video bitstream is decoded instantaneously in the theoretical STD model. However, if B images are present in the lower layer of the stereoscopic bitstream, the bitstream will not arrive at the decoder in presentation (e.g., display) order. In this case, some I and/or P images must be stored temporarily in a reordering buffer in the STD after being decoded, until the appropriate presentation time. With the intensified layer, however, all images arrive in presentation order and, consequently, the PTS and DTS values must be identical except for a fixed offset. In order to synchronize the lower and intensified layer sequences, the corresponding images in the lower and intensified layers must have the same PTS. Any existing method of calculating the DTS for the MPEG-2 main profile can be used for the calculation of the DTS in the lower layer, for example, DTSL, where "L" denotes the lower layer. The subsequent PTS and DTS values are referred to the corresponding DTSL. In particular, DTSLi and PTSLi denote, respectively, the DTS and PTS for the i-th image of the lower layer. Also, DTSEi and PTSEi denote, respectively, the DTS and PTS for the i-th image of the intensified layer. Then, the time interval F between the presentation of successive images can be defined as: F = 90 x 10^3 / (frame rate). For example, in accordance with the NTSC standard, with a frame rate of 29.97 frames/second, F = 3,003. F is the nominal frame period in 90 kHz clock cycles and corresponds to an actual elapsed time of 3,003 cycles / 90 kHz = 0.03336 seconds. According to the PAL standard, with a frame rate of 25 frames/second, F = 3,600.
In addition, the synchronization of the lower and intensified layer sequences depends closely on the transmission and display order of the video sequences. Generally, the MPEG-2 standard for non-stereoscopic video signals does not specify any particular distribution that the I images, P images and B images must take within a sequence in the base layer, but allows different distributions in order to provide different degrees of compression and random accessibility. In one possible distribution, each image of the base layer is an I image. In other possible distributions, I and P images, or I, P and B images, are provided, where the B images are supplied non-consecutively, or I, P and B images are provided where two consecutive B images may be supplied. Generally, three or more consecutive B images are not used because of degraded image quality. In the intensified layer, B and P images are supplied and I images may optionally be supplied. Figure 2 is an illustration of a sequence of intensified layer images and a first sequence of base layer images for use with the system of the present invention. Here, the lower layer includes only I images. The sequence of intensified layer images is shown generally at 200, while the lower layer sequence is shown generally at
250. Sequences 200 and 250 are shown in display order. Each image is labeled to indicate the image type (for example, I, B or P), the layer designation (for example, "E" for the intensified layer and "L" for the lower layer) and the sequential position of the image, where the subscript "0" indicates the zeroth image of the sequence, the subscript "1" indicates the first image of the sequence, and so on. The intensified layer 200 includes images IE0 (202), BE1 (204), BE2 (206), PE3 (208), BE4 (210), BE5 (212), PE6 (214), BE7 (216), BE8 (218), PE9 (220), BE10 (222), BE11 (224) and IE12 (226). However, this particular intensified layer sequence is shown in illustrative form only. In any of the intensified layer sequences discussed herein, including those of Figures 2-5, the particular intensified layer image type is not limited, since the intensified layer is transmitted in display order. Thus, any of the intensified layer images can be considered to be of a generic image type (for example, HEi), where "H" denotes the image type. The lower layer 250 of this example includes only I images, namely IL0 (252), IL1 (254), IL2 (256), IL3 (258), IL4 (260), IL5 (262), IL6 (264), IL7 (266), IL8 (268),
IL9 (270), IL10 (272), IL11 (274) and IL12 (276). Additionally, the start of each group of pictures (GOP) in each of the sequences is indicated. A GOP comprises one or more consecutive images that can be decoded without reference to images of another GOP. Generally, the GOPs of the lower and intensified layers are not aligned and have different lengths. For example, the start of a first GOP of the intensified layer 200 is shown at image IE0 (202), while the start of a second GOP is at image IE12 (226). Similarly, the start of a first GOP in the lower layer 250 is shown at image IL2 (256), while the start of a second GOP is at image IL8 (268). In addition, the arrows shown in Figure 2 indicate the allowed prediction modes, such that the image at the head of an arrow can be predicted using the image connected to the tail of the arrow. For example, image BE1 (204) is predicted from image IL1 (254). Recall that I images are not predictively coded but are self-contained. With the display order of images of Figure 2, an advantageous transmission sequence according to the present invention, starting at IL2, is: IL2, BE1, IL3, BE2, IL4, PE3, IL5, BE4, IL6, BE5, IL7, PE6, IL8, BE7, IL9, BE8, IL10, PE9, IL11, BE10, IL12, BE11, etc. (sequence 1). With this ordering of images, no predictively coded image arriving at the decoder has to be reordered before decoding. In this way, the storage and processing requirements in the decoder can be reduced, thereby reducing the cost of the decoder. Another suitable image transmission sequence is: IL2, BE2, IL3, PE3, IL4, BE4, IL5, BE5, IL6, PE6, IL7, BE7, IL8, BE8, IL9, PE9, IL10, BE10, IL11, BE11, IL12, IE12, etc. (sequence 2). With these image transmission sequences, all images arrive at the decoder in presentation order. In addition, it is possible to determine the appropriate PTS and DTS for each image. First, suppose that the DTS of the i-th lower layer image, DTSLi, is known. As a specific example, with the first image transmission sequence of Figure 2, i.e., sequence 1, decoding and presentation occur as described in Table 1 below. Serial decoding is assumed. In Table 1, the first column indicates the time, using DTSL2 as the start time, in increments of 0.5F; the second column indicates the decoding times of the lower layer images; the third column indicates the
decoding times of the intensified layer images; and the fourth column indicates the presentation times of the lower and intensified layer images. Here, storage is required for only two decoded images. For example, IL2 and IL3 are decoded and stored before BE2 is received. When it is received, BE2 can then be decoded immediately and output for presentation substantially concurrently with IL2. Furthermore, for the i-th image in either the lower or the intensified sequence, the DTS and the PTS can be determined from DTSLi as follows: PTSLi = DTSLi + 1.5F; DTSEi = DTSLi + 1.5F; and PTSEi = PTSLi. For example, the PTS for PE3 (208) of Figure 2 is equal to the sum of 1.5F and the DTS of IL3. In this way, the decoding and presentation of PE3 follow the decoding of IL3 by 1.5 image intervals (i.e., 1.5F).
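The relations just given for sequence 1 can be illustrated with a short, non-normative sketch; the helper below and its 90 kHz tick values are assumptions made for this example.

```python
def sequence1_timestamps(dts_lower, F=3003):
    """Given DTSLi values for the lower-layer pictures (90 kHz units),
    return (PTSLi, DTSEi, PTSEi) per picture for transmission sequence 1:
    PTSLi = DTSLi + 1.5F, DTSEi = DTSLi + 1.5F, PTSEi = PTSLi."""
    out = []
    for dts_l in dts_lower:
        pts_l = dts_l + 1.5 * F
        dts_e = dts_l + 1.5 * F
        pts_e = pts_l          # lower and intensified views share the PTS
        out.append((pts_l, dts_e, pts_e))
    return out

# Example: three lower-layer pictures decoded F apart (NTSC, F = 3003).
# sequence1_timestamps([0, 3003, 6006])
```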
With the second image transmission sequence of Figure 2 (sequence 2), decoding and presentation occur as described in Table 2 below. Here, storage is required for only one decoded image. For example, IL2 is
decoded and stored before BE2 is received. When it is received, BE2 can then be decoded immediately and output for presentation concurrently with IL2. For the i-th image in either the lower or the intensified sequence, the DTS and the PTS can be determined from DTSLi as follows for the transmission sequence of Table 2: PTSLi = DTSLi + 0.5F; DTSEi = DTSLi + 0.5F; and PTSEi = PTSLi. Figure 3 is an illustration of a sequence of intensified layer images and a second sequence of base layer images for use with the system of the present invention. Here, the lower layer includes both I and P images. Like-numbered elements correspond to the elements of Figure 2. The intensified layer 200 is the same as previously discussed. The lower layer, shown generally at 300, includes the image sequence PL0 (302), PL1 (304), IL2 (306), PL3 (308), PL4 (310), PL5 (312), PL6 (314), PL7 (316), IL8 (318), PL9 (320), PL10 (322), PL11 (324) and PL12 (326). The GOPs start at IL2 (306) and IL8 (318). Here, the prediction scheme is somewhat more complex. Recall that, in the base layer, a P image is predictively coded using the nearest previous I or P image. In the intensified layer, a B image can be predictively coded using up to three different possible modes; however, when the corresponding lower layer image is an I image, only that I image is used. Also, in the intensified layer, a P image is predictively coded using the most recently decoded intensified layer image, the most recent lower layer image in display order, or the next lower layer image in display order. Again, when the corresponding lower layer image is an I image, only that I image is used. Note that in some cases the prediction modes shown include optional paths. Thus, in the lower layer sequence 300, for example, PL4 is coded using PL3, and PL5 is coded using PL4. In the intensified layer 200, PE3 can be coded using BE2 or PL3. A suitable image transmission sequence according to the present invention, starting at IL2, is: IL2, BE1, PL3, BE2, PL4, PE3, PL5, BE4, PL6, BE5, PL7, PE6, PL8, BE7, PL9, BE8, PL10, PE9, PL11, BE10, PL12, BE11, etc. (sequence 3). For this sequence, decoding and presentation occur as described in Table 3 below.
Here, storage is required for only two decoded images. For example, IL2 and PL3 are decoded and stored before BE2 is received. When it is received, BE2 can then be decoded immediately and output for presentation concurrently with IL2. For the i-th image in either the lower or the intensified sequence, the DTS and the PTS can be determined from DTSLi as follows for the transmission sequence of Table 3: PTSLi = DTSLi + 1.5F; DTSEi = DTSLi + 1.5F; and PTSEi = PTSLi. Alternatively, another transmission sequence suitable for the example of Figure 3 is: IL2, BE2, PL3, PE3, PL4, BE4, PL5, BE5, PL6, PE6, PL7, BE7, PL8, BE8, PL9, PE9, PL10, BE10, PL11, BE11, PL12, IE12, etc. (sequence 4). Decoding and presentation occur as described in Table 4 below.
Here, storage is required for only one decoded image. For example, IL2 is decoded and stored before BE2 is received, at which time BE2 can be decoded and output directly for presentation concurrently with IL2. For the i-th image in either the lower or the intensified sequence, the DTS and the PTS can be determined from DTSLi as follows for the transmission sequence of Table 4: PTSLi = DTSLi + 0.5F; DTSEi = DTSLi + 0.5F; and PTSEi = PTSLi. Figure 4 is an illustration of a sequence of intensified layer images and a third sequence of base layer images for use with the system of the present invention. Here, the lower layer includes I, P and B images, where the B images are non-consecutive. Like-numbered elements correspond to the elements of Figures 2 and 3. The intensified layer 200 is the same as previously discussed. The lower layer, shown generally at 400, includes the image sequence PL0 (402), BL1
(404), IL2 (406), BL3 (408), PL4 (410), BL5 (412), PL6 (414),
BL7 (416), IL8 (418), BL9 (420), PL10 (422), BL11 (424) and
PL12 (426). The GOPs begin in IL2 (406) and IL8 (418). Here, the prediction scheme is as follows. Remember that, in the base layer, a B image has
predictive coding using the nearest previous I or P image and the nearest subsequent I or P image. Thus, in the lower layer sequence 400, for example, BL3 is coded using IL2 and PL4. A suitable image transmission sequence in accordance with the present invention, starting at IL2, is: IL2, PL4, BL3, BE2, PE3, PL6, BL5, BE4, BE5, IL8, BL7, PE6, BE7, PL10, BL9, BE8, PE9, PL12, BL11, BE10, BE11, etc. (sequence 5). Alternatively, another suitable transmission sequence is: IL2, BE2, PL4, BL3, PE3, BE4, PL6, BL5, BE5, PE6, IL8, BL7, BE7, BE8, PL10, BL9, PE9, BE10, PL12, BL11, BE11, IE12, etc. (sequence 6). An additional suitable transmission sequence is: IL2, PL4, BE2, BL3, PE3, PL6, BE4, BL5, BE5, IL8, PE6, BL7, BE7, PL10, BE8, BL9, PE9, PL12, BE10, BL11, BE11, etc. (sequence 7). For the i-th image in either the lower or the intensified sequence, the DTS and PTS can be determined from DTSLi as follows. For each image, the presentation of the image is delayed by an integer multiple of F after the decoding of the image. For example, with the first of the above transmission sequences, i.e., sequence 5, decoding and presentation occur as described in Table 5 below.
Here, storage is required for only three decoded images. For example, IL2, PL4 and BL3 are decoded and stored before BE2 is received, at which time BE2 can then be decoded and output directly for presentation concurrently with IL2. For the i-th image in either the lower or the intensified sequence, the DTS and PTS can be determined from DTSLi as follows for the transmission sequence of Table 5: PTSLi = DTSLi + (mod2(i+1) + 1)1.5F, for all i; DTSEi = DTSLi + 1.5F, for i = 2; DTSEi = DTSLi + (1 + 2mod2(i+1))F, for i > 2; and PTSEi = PTSLi, for all i; where mod2(i) is the base-2 modulus of the integer i, such that mod2(i) = 0 when i is even and mod2(i) = 1 when i is odd.
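These piecewise relations can be written out directly in code; the sketch below is only an illustration of the sequence-5 rules as stated above, and the function name is an assumption made for this example.

```python
def mod2(i):
    # base-2 modulus as defined in the text: 0 for even i, 1 for odd i
    return i % 2

def sequence5_timing(i, dts_l, F=3003):
    """Return (PTSLi, DTSEi, PTSEi) for picture index i of sequence 5,
    given the lower-layer decoding time stamp DTSLi (90 kHz units)."""
    pts_l = dts_l + (mod2(i + 1) + 1) * 1.5 * F
    if i == 2:
        dts_e = dts_l + 1.5 * F
    else:  # i > 2
        dts_e = dts_l + (1 + 2 * mod2(i + 1)) * F
    pts_e = pts_l  # the intensified-layer PTS equals the lower-layer PTS
    return pts_l, dts_e, pts_e
```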
With sequence 6, decoding and presentation occur as described in Table 6 below. Here, storage is required for only two decoded images. For example, PL4 and BL3 are decoded and stored before PE3 is received, at which time PE3 is decoded and output directly for presentation concurrently with IL2. For the i-th image in either the lower or the intensified sequence, the DTS and the PTS can be determined from DTSLi as follows for the transmission sequence of Table 6: PTSLi = DTSLi + F, for i = 2; PTSLi = DTSLi + (3mod2(i+1) + 1)0.5F, for i > 2; DTSEi = DTSLi + 0.5F, for i = 2; DTSEi = DTSLi + (1 + 2mod2(i+1))0.5F, for i > 2; and PTSEi = PTSLi, for all i. With sequence 7, decoding and presentation occur as described in Table 7 below.
P1523 / 97MX Here, storage is required only for two decoded images. For example, LL2 and PL4 are decoded and stored before BE2 is received, at which time BE2 is decoded and output directly for presentation concurrently with IL2. For the i-th image in either the lower or intensified sequences, the DTS and the PTS can be determined from the DTSLÍ as follows for the transmission sequence of Table 7: PTSLÍ = DTSLJÍ + F, for i = 2; PTSLi = DTSLi + (4mod2 (i + 1) +1) 0.5F, for i > 2; DTSEi = DTSLi + F, for i = 2; DTSE? = DTSLi + (4mod2 (i + 1) +1) 0.5F, for i > 2; and PTSE? = PTSLi, for all i. Figure 5 is an illustration of a sequence of images of the intensified layer and a fourth sequence of images of the lower layer for use with the system of the present invention. Here, the lower layer includes images I, P and B with two consecutive B images. Elements with similar numbering correspond to the elements of Figures 2-4. The intensified layer 200 is the same as previously discussed. The lower layer, generally shown at 500, includes the image sequence BL0 (502), BL1 (504), IL2 (506), BL3 (508), BL4 (510), PL5 (512), BL6 (514), BL7 (516), IL8 (518), BL9 (520),
BL10 (522), PL11 (524) and BL12 (526). The GOPs start at IL2 (506) and IL8 (518). A suitable image transmission sequence according to the present invention, starting at IL2, is: IL2, PL5, BL3, BE2, BL4, PE3, BE4, IL8, BL6, BE5, BL7, PE6, BE7, PL11, BL9, BE8, BL10, PE9, BE10, etc. (sequence 8).
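As a rough, purely illustrative check of the buffer figures quoted for orderings such as sequence 8, the sketch below holds each decoded lower layer image until the intensified layer image with the same index has arrived and reports the peak number held; the helper and its label format are assumptions made for this example.

```python
def peak_reference_buffer(tx_order):
    """Rough check of decoder-side storage: a lower-layer picture (label of
    the form 'IL<i>', 'PL<i>' or 'BL<i>') is held until the intensified-layer
    picture with the same index ('..E<i>') arrives; report the peak number
    held at once."""
    held, peak = set(), 0
    for label in tx_order:
        layer, idx = label[1], label[2:]     # e.g. "PL5" -> ("L", "5")
        if layer == "L":
            held.add(idx)
        else:                                # intensified-layer picture
            held.discard(idx)
        peak = max(peak, len(held))
    return peak

# Sequence 8 from above:
seq8 = ["IL2", "PL5", "BL3", "BE2", "BL4", "PE3", "BE4", "IL8", "BL6",
        "BE5", "BL7", "PE6", "BE7", "PL11", "BL9", "BE8", "BL10", "PE9",
        "BE10"]
# peak_reference_buffer(seq8) -> 3, matching the three stored images noted below.
```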
With this transmission sequence, decoding and presentation occur as described in Table 8 below. Here, storage is required for only three decoded images. For example, IL2, PL5 and BL3 are decoded and stored before BE2 is received, at which time BE2 is decoded and output directly for presentation concurrently with IL2. For the i-th image in either the lower or the intensified sequence, the DTS and the PTS can be determined from DTSLi as follows for the transmission sequence of Table 8: PTSLi = DTSLi + 1.5F, for i = 2; PTSLi = DTSLi + (5mod2(mod3(i+1)) + 3)0.5F, for i > 2; DTSEi = DTSLi + 1.5F, for i = 2; DTSEi = DTSLi + (3mod2(mod3(i)) + 5mod2(mod3(i-1)))0.5F, for i > 2; and PTSEi = PTSLi, for all i; where mod3(i) is the base-3 modulus of the integer i, such that mod3(i) = 0 when i = 3n, mod3(i) = 1 when i = 1 + 3n, and mod3(i) = 2 when i = 2 + 3n, for n = 0, 1, 2, 3, etc. Alternatively, another suitable transmission sequence is: IL2, BE2, PL5, BL3, PE3, BL4, BE4, BE5, IL8, BL6, PE6, BL7, BE7, BE8, PL11, BL9, PE9, BL10, BE10, BE11, etc. (sequence 9). With this transmission sequence, decoding and presentation occur as described in Table 9 below.
Here, storage is required for only three decoded images. For example, IL2 and BE2 are decoded and stored before PL5 is received, at which time BE2 and IL2 are output for concurrent presentation. For the i-th image in either the lower or the intensified sequence, the DTS and the PTS can be determined from DTSLi as follows for the transmission sequence of Table 9: PTSLi = DTSLi + F, for i = 2; PTSLi = DTSLi + (5mod2(mod3(i+1)) + 1)0.5F, for i > 2; DTSEi = DTSLi + 0.5F, for i = 2; DTSEi = DTSLi + (5mod2(mod3(i-1)) + 1)0.5F, for i > 2; and PTSEi = PTSLi, for all i. An additional suitable transmission sequence is: IL2, PL5, BE2, BL3, PE3, BL4, BE4, IL8, BE5, BL6, PE6, BL7, BE7, PL11, BE8, BL9, PE9, BL10, BE10, etc. (sequence 10). With this transmission sequence, decoding and presentation occur as described in Table 10 below.
Here, storage is required for only two decoded images. For example, IL2 and PL5 are decoded and stored before BE2 is received, at which time BE2 is decoded and output directly for presentation concurrently with IL2. For the i-th image in either the lower or the intensified sequence, the DTS and PTS can be determined from DTSLi as follows for the transmission sequence of Table 10: PTSLi = DTSLi + 1.5F, for i = 2; PTSLi = DTSLi + (6mod2(mod3(i-1)) + 1)0.5F, for i > 2; DTSEi = DTSLi + F, for i = 2; DTSEi = DTSLi + (6mod2(mod3(i-1)) + 1)0.5F, for i > 2; and PTSEi = PTSLi, for all i. Note that in each of the previous cases with sequences 1-10, serial decoding was assumed. When parallel decoding is used, the relationship between the PTS and the DTS can be characterized more generally. Specifically, when the lower layer has no B images, but only I and/or P images, all images in both layers arrive at the decoder in presentation order. Thus, for the i-th image in either the lower or the intensified sequence, the PTS and DTS can be determined from: PTSLi = DTSLi + F; PTSEi = DTSLi + F; and DTSEi = DTSLi. This relationship is illustrated in an example shown in Table 11 below. The difference between DTSLi and DTSL(i-1) is F.
For example, referring to sequence 1 discussed in relation to Figure 2 above, decoding and presentation will occur as illustrated in Table 12 below.
Here, storage is required for only one decoded image. For example, IL2 is decoded and stored before BE2 is received; when BE2 is received, it is decoded and output directly for presentation substantially concurrently with IL2. When the lower layer has non-consecutive B images, the DTS and PTS can be determined from DTSLi as follows. If the i-th image of the lower layer is an I image with a "closed GOP" indicator, or a P image followed by such an I image, then PTSLi = DTSLi + 2F. If the i-th image of the lower layer is a P image or an "open GOP" I image, and the (i+1)th image is not an I image with a "closed GOP" indicator, then PTSLi = DTSLi + 3F. If the i-th image of the lower layer is a B image, then PTSLi = DTSLi + F. For the intensified layer, DTSEi = DTSLi + 2F and PTSEi = DTSLi + 2F. Note that in the MPEG-2 video protocol, a group of pictures header is included at the beginning of a GOP and carries a one-bit indicator: closed_gop = 0 indicates an open GOP, while closed_gop = 1 indicates a closed GOP. An open-GOP I image is treated as a P image in terms of decoding order.
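The rules just stated can be captured in a short sketch; this is only an illustration of the stated relations, and the picture-type encoding used by the helper is an assumption made for this example.

```python
def lower_layer_pts(pic_types, dts, i, F=3003):
    """PTSLi for a lower layer with non-consecutive B images, per the rules
    above. pic_types[i] is one of "I_closed", "I_open", "P", "B" (display
    order); dts[i] is DTSLi in 90 kHz units."""
    t = pic_types[i]
    nxt = pic_types[i + 1] if i + 1 < len(pic_types) else None
    if t == "I_closed" or (t == "P" and nxt == "I_closed"):
        return dts[i] + 2 * F
    if t in ("P", "I_open") and nxt != "I_closed":
        return dts[i] + 3 * F
    if t == "B":
        return dts[i] + F
    return dts[i] + 2 * F  # fallback for boundary cases the text leaves open

def intensified_layer_times(dts_l_i, F=3003):
    """DTSEi and PTSEi for the corresponding intensified-layer image."""
    return dts_l_i + 2 * F, dts_l_i + 2 * F
```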
Decoding and presentation with non-consecutive B images in the lower layer is illustrated in an example in Table 13 below.
TABLE 13
Image No. | Image type, lower layer | DTSL | PTSL | Image type, intensified layer | DTSE | PTSE
0 | I (closed GOP) | DTSL0 | DTSL0 + 2F | I, P | DTSL0 + 2F | DTSL0 + 2F
1 | P | DTSL1 | DTSL1 + 2F | I, P | DTSL1 + 2F | DTSL1 + 2F
2 | B | DTSL2 | DTSL2 + 2F | I, P, B | DTSL2 + 2F | DTSL2 + 2F
3 | P | DTSL3 | DTSL3 + 2F | I, P, B | DTSL3 + 2F | DTSL3 + 2F
4 | B | DTSL4 | DTSL4 + F | I, P, B | DTSL4 + F | DTSL4 + F
5 | I (closed GOP) | DTSL5 | DTSL5 + 3F | I, P, B | DTSL5 + 3F | DTSL5 + 3F
6 | B | DTSL6 | DTSL6 + F | I, P, B | DTSL6 + F | DTSL6 + F
7 | I (closed GOP) | DTSL7 | DTSL7 + 2F | I, P, B | DTSL7 + 2F | DTSL7 + 2F
In a specific example, the sequence of the lower layer, in display order, is IL0, BL1, PL2, BL3, PL4, BL5, IL6, IL7, etc. The sequence of the intensified layer, in display and transmission order, is PE0, BE1, BE2, BE3, BE4, BE5, PE6, PE7, etc. A possible transmission order according to the present invention is IL0, PL2, BL1, PE0, PL4, BE1, BL3, BE2, IL6, BE3, BL5, BE4, IL7, BE5, etc. The DTS and the PTS can be determined as shown in Table 14.
When the lower layer has two consecutive B images, the DTS and the PTS are calculated by the following rules. If the i-th image of the lower layer is an I image with a closed GOP indicator, or a P image followed by such an I image, then PTSLi = DTSLi + 2F. If the i-th image of the lower layer is a P image or an open-GOP I image, and the (i+1)th image is not an I image with a closed GOP indicator, then PTSLi = DTSLi + 4F. If the i-th image of the lower layer is a B image, then PTSLi = DTSLi + F. For the intensified layer, DTSEi = DTSLi + 2F and PTSEi = DTSLi + 2F.
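For comparison with the non-consecutive case above, the same kind of sketch applies here with the 3F term replaced by 4F; again this is only an illustration of the stated rules, using the same assumed picture-type encoding.

```python
def lower_layer_pts_two_b(pic_types, dts, i, F=3003):
    """PTSLi for a lower layer containing two consecutive B images.
    Identical in form to the non-consecutive-B rule, except that the
    open-GOP I / ordinary P case uses 4F instead of 3F."""
    t = pic_types[i]
    nxt = pic_types[i + 1] if i + 1 < len(pic_types) else None
    if t == "I_closed" or (t == "P" and nxt == "I_closed"):
        return dts[i] + 2 * F
    if t in ("P", "I_open") and nxt != "I_closed":
        return dts[i] + 4 * F
    if t == "B":
        return dts[i] + F
    return dts[i] + 2 * F  # fallback, assumption for cases the text leaves open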
Decoding and presentation with two consecutive B images in the lower layer is illustrated in an example in Table 15 below.
TABLE 15
Image No. | Image type, lower layer | DTSL | PTSL | Image type, intensified layer | DTSE | PTSE
0 | I (closed GOP) | DTSL0 | DTSL0 + 2F | I, P | DTSL0 + 2F | DTSL0 + 2F
1 | P | DTSL1 | DTSL1 + 4F | I, P, B | DTSL1 + 2F | DTSL1 + 2F
2 | B | DTSL2 | DTSL2 + F | I, P, B | DTSL2 + 2F | DTSL2 + 2F
3 | B | DTSL3 | DTSL3 + F | I, P, B | DTSL3 + 2F | DTSL3 + 2F
4 | I (open GOP) | DTSL4 | DTSL4 + 4F | I, P, B | DTSL4 + 2F | DTSL4 + 2F
5 | B | DTSL5 | DTSL5 + F | I, P, B | DTSL5 + 2F | DTSL5 + 2F
6 | B | DTSL6 | DTSL6 + F | I, P, B | DTSL6 + 2F | DTSL6 + 2F
7 | P | DTSL7 | DTSL7 + 2F | I, P, B | DTSL7 + 2F | DTSL7 + 2F
8 | I (closed GOP) | DTSL8 | DTSL8 + 2F | I, P, B | DTSL8 + 2F | DTSL8 + 2F
In a specific example, the sequence of the lower layer, in display order, is IL0, BL1, BL2, PL3, BL4, BL5, IL6, IL7, etc. The sequence of the intensified layer, in display and transmission order, is PE0, BE1, BE2, BE3, BE4, BE5, PE6, PE7, etc. A possible transmission order according to the present invention is IL0, PL3, BL1, PE0, BL2, BE1, IL6, BE2, BL4, BE3, BL5, BE4, IL7, BE5, etc. The DTS and the PTS can be determined as shown in Table 16.
The above rules, which apply to frame-mode video, can be generalized to the corresponding film-mode cases. Figure 6 is a block diagram of an intensified layer decoder structure for stereoscopic video. The decoder, shown generally at 130, includes an input terminal 605 for receiving the compressed intensified layer data, and a transport stream parser 610 for parsing the data. The parsed data is supplied to a memory manager 630, which may comprise a central processing unit. The memory manager 630 communicates with a memory 620, which may comprise a dynamic random access memory (DRAM), for example. The memory manager 630 also communicates with a decompression/prediction processor 640, and receives decoded lower layer data via terminal 650 that can be stored temporarily in the memory 620 for subsequent use by the processor 640 in decoding disparity-predicted intensified layer images. The decompression/prediction processor 640 provides a variety of processing functions, such as error detection and correction, motion vector decoding, inverse quantization, inverse discrete cosine transformation, Huffman decoding and prediction calculations, for example. After being processed by the decompression/prediction function 640, the decoded intensified layer data is output by the memory manager 630. Alternatively, the decoded data can be output directly from the decompression/prediction function 640 via means not shown.
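A minimal sketch of the buffering decision made by such a memory manager is given below; it is an illustration only, and the class shape, method names and picture representation are assumptions rather than part of the described apparatus.

```python
class MemoryManager:
    """Holds decoded lower-layer pictures until the disparity-predicted
    intensified-layer picture that references them has been decoded."""

    def __init__(self):
        self.stored_lower = {}   # picture index -> decoded lower-layer picture

    def on_lower_decoded(self, index, picture):
        # Keep the lower-layer picture; it may be needed as a disparity
        # reference for the intensified-layer picture with the same index.
        self.stored_lower[index] = picture

    def reference_for(self, index):
        # Fetch the lower-layer reference when the intensified-layer
        # picture arrives for decoding.
        return self.stored_lower.get(index)

    def on_enhanced_decoded(self, index):
        # Once the corresponding intensified-layer picture is decoded,
        # the stored lower-layer reference can be released.
        self.stored_lower.pop(index, None)
```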
Decoded P1523 / 97MX can be issued directly from the decompression / prediction function 640 via a medium not shown. For the lower layer an analogous structure can be used. In addition, the decoders of the intensified and lower layer can share common hardware. For example, memory 620 and processor 640 can be shared. However, this may not be possible where parallel decoding is used. A common clock signal (not shown) is provided in such a way that the decoding can be coordinated in accordance with the transmission sequences presented here. In particular, it will be necessary to temporarily store lower layer images which will be used for the prediction of images of the intensified layer with predicted disparity or of other images of the lower layer, before the reception of the predicted image data. In accordance with the present invention, the number of images that must be stored before decoding is minimized, thus allowing a small memory size. As can be seen, the present invention provides an advantageous image transmission scheme for a stereoscopic video image sequence. In particular, the images are transmitted in an order of such
P1523 / 97MX so that the number of images that must be stored temporarily before presentation is minimized. In addition, the example transmission sequences presented here are compatible with both the MVP MPEG-2 protocol as well as with the proposed MPEG-4 protocol. In addition, a dated decoding (DTS) and a dated presentation (PTS) for each picture can be determined to provide synchronization between the lower layer and intensified layer images in the decoder. The DTS and the PTS are adjusted or set according to whether the decoding is sequential or parallel and, depending on whether the lower layer has no B images, non-consecutive B images, or two consecutive B images. Although the invention has been described with respect to several specific embodiments, those skilled in the art will appreciate that numerous adaptations and modifications may be made therein without departing from the spirit and scope of the invention as set forth in the claims. For example, those skilled in the art will appreciate that the scheme set forth herein may be adapted to other lower and enhanced layer sequences in addition to those specifically illustrated herein.
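The buffering behavior just described can be illustrated with a short sketch. This is not the patent's implementation: the class, method and parameter names are assumptions, and temporal prediction within the intensified layer is omitted for brevity. It shows only the key point, namely that decoded lower layer reference images are held (for example in the memory 620) until the corresponding disparity-predicted intensified layer image has been decoded, so only a few images need to be buffered at any time.

```python
# Illustrative sketch only; class and method names are assumptions, and temporal
# prediction within the intensified layer is omitted for brevity.
class EnhancedLayerDecoder:
    def __init__(self):
        self.lower_refs = {}   # display index -> decoded lower layer picture (held in memory)

    def store_lower(self, idx, picture):
        """Called when the lower layer decoder outputs picture idx."""
        self.lower_refs[idx] = picture

    def decode_enhanced(self, idx, compressed):
        """Decode the disparity-predicted intensified layer picture idx.

        The transmission orders above guarantee that the lower layer counterpart
        has already been received, so only a few pictures are buffered at a time."""
        ref = self.lower_refs.pop(idx)   # fetch and release the lower layer reference
        return self._reconstruct(compressed, ref)

    def _reconstruct(self, compressed, ref):
        # Placeholder for inverse quantization, inverse DCT and disparity compensation.
        return ("decoded", compressed, ref)
```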
Claims (24)
NOVELTY OF THE INVENTION

Having described the present invention, it is considered a novelty and, therefore, the content of the following CLAIMS is claimed as property:

1. A method for ordering a sequence of video images in a lower layer and in an intensified layer of a stereoscopic video signal for transmission to a decoder, the intensified layer including disparity-predicted images that are predicted using the corresponding lower layer images, comprising the step of: ordering the video images in such a way that the disparity-predicted intensified layer images are transmitted after the respective corresponding lower layer images.

2. The method according to claim 1, wherein the lower layer includes only intra-coded images (I images), including consecutive images ILi, ILi+1 and ILi+2, and the corresponding intensified layer images are represented by HEi, HEi+1 and HEi+2, respectively, comprising the additional step of: ordering the video images in such a way that they are transmitted in the order: ILi, ILi+1, HEi, ILi+2, HEi+1, HEi+2.

3. The method according to claim 1, wherein the lower layer includes only intra-coded images (I images), including consecutive images ILi and ILi+1, and the corresponding intensified layer images are represented by HEi and HEi+1, respectively, comprising the additional step of: ordering the video images in such a way that they are transmitted in the order: ILi, HEi, ILi+1, HEi+1.

4. The method according to claim 1, wherein the lower layer includes only intra-coded images (I images) and predictive-coded images (P images), including consecutive images ILi, PLi+1 and PLi+2, and the corresponding intensified layer images are represented by HEi, HEi+1 and HEi+2, respectively, comprising the additional step of: ordering the video images in such a way that they are transmitted in the order: ILi, PLi+1, HEi, PLi+2.

5. The method according to claim 1, wherein the lower layer includes only intra-coded images (I images) and predictive-coded images (P images), including consecutive images ILi and PLi+1, and the corresponding intensified layer images are represented by HEi and HEi+1, respectively, comprising the additional step of: ordering the video images in such a way that they are transmitted in the order: ILi, HEi, PLi+1, HEi+1.

6. The method according to claim 1, wherein the lower layer includes intra-coded images (I images), predictive-coded images (P images) and non-consecutive bidirectionally predictive-coded images (B images), including consecutive images ILi, BLi+1 and PLi+2, and the corresponding intensified layer images are represented by HEi, HEi+1 and HEi+2, respectively, comprising the additional step of: ordering the video images in such a way that they are transmitted in the order: ILi, PLi+2, BLi+1, HEi, HEi+1.

7. The method according to claim 1, wherein the lower layer includes intra-coded images (I images), predictive-coded images (P images) and non-consecutive bidirectionally predictive-coded images (B images), including consecutive images ILi, BLi+1 and PLi+2, and the corresponding intensified layer images are represented by HEi, HEi+1 and HEi+2, respectively, comprising the additional step of: ordering the video images in such a way that they are transmitted in the order: ILi, HEi, PLi+2, BLi+1, HEi+1, HEi+2.

8. The method according to claim 1, wherein the lower layer includes intra-coded images (I images), predictive-coded images (P images) and non-consecutive bidirectionally predictive-coded images (B images), including consecutive images ILi, BLi+1 and PLi+2, and the corresponding intensified layer images are represented by HEi, HEi+1 and HEi+2, respectively, comprising the additional step of: ordering the video images in such a way that they are transmitted in the order: ILi, PLi+2, HEi, BLi+1, HEi+1.

9. The method according to claim 1, wherein the lower layer includes intra-coded images (I images), predictive-coded images (P images) and consecutive bidirectionally predictive-coded images (B images), including consecutive images ILi, BLi+1, BLi+2 and PLi+3, and the corresponding intensified layer images are represented by HEi, HEi+1, HEi+2 and HEi+3, respectively, comprising the additional step of: ordering the video images in such a way that they are transmitted in the order: ILi, PLi+3, BLi+1, HEi, BLi+2, HEi+1, HEi+2.

10. The method according to claim 1, wherein the lower layer includes only intra-coded images (I images), predictive-coded images (P images) and consecutive bidirectionally predictive-coded images (B images), including consecutive images ILi, BLi+1, BLi+2 and PLi+3, and the corresponding intensified layer images are represented by HEi, HEi+1, HEi+2 and HEi+3, respectively, comprising the additional step of: ordering the video images in such a way that they are transmitted in the order: ILi, HEi, PLi+3, BLi+1, HEi+1, BLi+2, HEi+2.

11. The method according to claim 1, wherein the lower layer includes only intra-coded images (I images), predictive-coded images (P images) and consecutive bidirectionally predictive-coded images (B images), including consecutive images ILi, BLi+1, BLi+2 and PLi+3, and the corresponding intensified layer images are represented by HEi, HEi+1, HEi+2 and HEi+3, respectively, comprising the additional step of: ordering the video images in such a way that they are transmitted in the order: ILi, PLi+3, HEi, BLi+1, HEi+1, BLi+2, HEi+2.

12. A method for parallel decoding of a sequence of video images in a lower layer and in an intensified layer of a stereoscopic video signal, wherein the lower layer includes at least one of intra-coded images (I images) and predictive-coded images (P images) but no bidirectionally predictive-coded images (B images), comprising the step of: providing the images with a decoding time stamp (DTS) and a presentation time stamp (PTS) to indicate, respectively, the decoding time and the presentation time of each of the images; where: the DTS of the i-th lower layer image is DTSLi; the PTS of the i-th lower layer image is PTSLi; the DTS of the i-th intensified layer image is DTSHi; the PTS of the i-th intensified layer image is PTSHi; F is a time interval between the presentation of successive images; and PTSLi = DTSHi = PTSHi = DTSLi + F.

13. A method for parallel decoding of a sequence of video images in a lower layer and in an intensified layer of a stereoscopic video signal, wherein the lower layer includes non-consecutive bidirectionally predictive-coded images (B images), comprising the step of: providing the images with a decoding time stamp (DTS) and a presentation time stamp (PTS) to indicate, respectively, the decoding time and the presentation time of each of the images; where: the DTS of the i-th lower layer image is DTSLi; the PTS of the i-th lower layer image is PTSLi; the DTS of the i-th intensified layer image is DTSHi; the PTS of the i-th intensified layer image is PTSHi; F is a time interval between the presentation of successive images; and PTSLi = DTSLi + 2F when the i-th lower layer image is an intra-coded image (I image) with a closed GOP indicator.

14. The method according to claim 13, wherein: PTSLi = DTSLi + 2F when the i-th lower layer image is a predictive-coded image (P image) and the (i+1)-th lower layer image is an I image with a closed GOP indicator.

15. The method according to claim 13 or 14, wherein: PTSLi = DTSLi + 3F when the i-th lower layer image is a P image and the (i+1)-th lower layer image is not an I image with a closed GOP indicator.

16. The method according to one of claims 13 to 15, wherein: PTSLi = DTSLi + 3F when the i-th lower layer image is an I image with an open GOP indicator and the (i+1)-th lower layer image is not an I image with a closed GOP indicator.

17. The method according to one of claims 13 to 16, wherein: PTSLi = DTSLi + F when the i-th lower layer image is a B image.

18. The method according to claim 13, wherein: DTSHi = PTSHi = PTSLi = DTSLi + 2F.

19. A method for parallel decoding of a sequence of video images in a lower layer and in an intensified layer of a stereoscopic video signal, wherein the lower layer includes at least one group of two consecutive bidirectionally predictive-coded images (B images), comprising the step of: providing the images with a decoding time stamp (DTS) and a presentation time stamp (PTS) to indicate, respectively, the decoding time and the presentation time of each of the images; where: the DTS of the i-th lower layer image is DTSLi; the PTS of the i-th lower layer image is PTSLi; the DTS of the i-th intensified layer image is DTSHi; the PTS of the i-th intensified layer image is PTSHi; F is a time interval between the presentation of successive images; and PTSLi = DTSLi + F when the i-th lower layer image is an intra-coded image (I image) with a closed GOP indicator.

20. The method according to claim 19, wherein: PTSLi = DTSLi + 2F when the i-th lower layer image is a predictive-coded image (P image) and the (i+1)-th lower layer image is an I image with a closed GOP indicator.

21. The method according to claim 19 or 20, wherein: PTSLi = DTSLi + 4F when the i-th lower layer image is a P image and the (i+1)-th lower layer image is not an I image with a closed GOP indicator.

22. The method according to one of claims 19 to 21, wherein: PTSLi = DTSLi + 4F when the i-th lower layer image is an I image with an open GOP indicator and the (i+1)-th lower layer image is not an I image with a closed GOP indicator.

23. The method according to one of claims 19 to 22, wherein: PTSLi = DTSLi + F when the i-th lower layer image is a B image.

24. The method according to claim 19, wherein: DTSHi = PTSHi = PTSLi = DTSLi + 2F.
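For illustration, the lower layer PTS rules recited in claims 13 to 17 (the non-consecutive B image, parallel decoding case) can be collected into one small function. This is a sketch only, assuming the claim wording above; the function and parameter names are not part of the claims, and combinations the claims do not address are rejected rather than guessed.

```python
# Sketch of the lower layer PTS rules of claims 13-17 (non-consecutive B images,
# parallel decoding); names are illustrative and not taken from the claims.
def lower_layer_pts(dts_li, picture_type, open_gop, next_is_closed_gop_i, F):
    """dts_li: DTSLi of the i-th lower layer image.
    picture_type: "I", "P" or "B".
    open_gop: True when an I image carries an open GOP indicator.
    next_is_closed_gop_i: True when the (i+1)-th image is a closed-GOP I image.
    F: interval between the presentation of successive images."""
    if picture_type == "B":
        return dts_li + F                                             # claim 17
    if picture_type == "I" and not open_gop:
        return dts_li + 2 * F                                         # claim 13
    if picture_type == "P":
        return dts_li + (2 * F if next_is_closed_gop_i else 3 * F)   # claims 14, 15
    if picture_type == "I" and open_gop and not next_is_closed_gop_i:
        return dts_li + 3 * F                                         # claim 16
    raise ValueError("combination not addressed by claims 13-17")
```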
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08736383 | 1996-10-24 | ||
US08/736,383 US5886736A (en) | 1996-10-24 | 1996-10-24 | Synchronization of a stereoscopic video sequence |
Publications (2)
Publication Number | Publication Date |
---|---|
MX9708213A (en) | 1998-08-30
MXPA97008213A (en) | 1998-11-12
Similar Documents
Publication | Title |
---|---|
CA2218607C (en) | Synchronization of a stereoscopic video sequence | |
US5612735A (en) | Digital 3D/stereoscopic video compression technique utilizing two disparity estimates | |
US5619256A (en) | Digital 3D/stereoscopic video compression technique utilizing disparity and motion compensated predictions | |
EP0817493B1 (en) | Rate control for stereoscopic digital video encoding | |
EP2538674A1 (en) | Apparatus for universal coding for multi-view video | |
EP0862837B1 (en) | Method and apparatus for statistical -multiplexing programs using decoder buffer fullness | |
US6480628B2 (en) | Method for computational graceful degradation in an audiovisual compression system | |
US6055274A (en) | Method and apparatus for compressing multi-view video | |
JP4417421B2 (en) | Binocular / multi-view 3D moving image processing system and method | |
US5877812A (en) | Method and apparatus for increasing channel utilization for digital video transmission | |
KR100563754B1 (en) | Method and system for multiplexing image signal, Method and system for demultiplexing image signal, and transmission medium | |
US20030138045A1 (en) | Video decoder with scalable architecture | |
WO1997019561A9 (en) | Method and apparatus for multiplexing video programs | |
JP2009505604A (en) | Method and apparatus for encoding multi-view video | |
Haskell et al. | Mpeg video compression basics | |
JP2009004939A (en) | Multi-viewpoint image decoding method, multi-viewpoint image decoding device, and multi-viewpoint image decoding program | |
MXPA97008213A (en) | Synchronization of a stereoscop video sequence | |
WO2007007923A1 (en) | Apparatus for encoding and decoding multi-view image | |
HK1010775B (en) | Synchronization of a stereoscopic video sequence | |
JP4148200B2 (en) | MPEG image data recording apparatus and MPEG image data recording method | |
HK1008285A (en) | Rate control for stereoscopic digital video encoding |