
CN107959844A - 360-degree video capture and playback - Google Patents

360-degree video capture and playback

Info

Publication number
CN107959844A
Authority
CN
China
Prior art keywords: degree, video, videos, projection, view
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710952982.8A
Other languages
Chinese (zh)
Other versions
CN107959844B (en)
Inventor
周敏华 (Minhua Zhou)
陈学敏 (Xuemin Chen)
布赖恩·A·亨 (Brian A. Heng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Avago Technologies International Sales Pte Ltd
Original Assignee
Avago Technologies Fiber IP Singapore Pte Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 15/599,447 (US11019257B2)
Application filed by Avago Technologies Fiber IP Singapore Pte Ltd
Publication of CN107959844A
Application granted
Publication of CN107959844B
Legal status: Active
Anticipated expiration


Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 - Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/194 - Transmission of image signals
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 - Geometric image transformations in the plane of the image
    • G06T 3/04 - Context-preserving transformations, e.g. by using an importance map
    • G06T 3/047 - Fisheye or wide-angle transformations
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 - Geometric image transformations in the plane of the image
    • G06T 3/12 - Panospheric to cylindrical image transformations
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 - Geometric image transformations in the plane of the image
    • G06T 3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4038 - Image mosaicing, e.g. composing plane images from plane sub-images
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 - Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 - Processing image signals
    • H04N 13/139 - Format conversion, e.g. of frame-rate or size
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 - Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 - Processing image signals
    • H04N 13/161 - Encoding, multiplexing or demultiplexing different image signal components
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 - Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/189 - Recording image signals; Reproducing recorded image signals
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/46 - Embedding additional information in the video signal during the compression process
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 - Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 - Control of cameras or camera modules
    • H04N 23/698 - Control of cameras or camera modules for achieving an enlarged field of view, e.g. panoramic image capture
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 - Television systems
    • H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2200/00 - Indexing scheme for image data processing or generation, in general
    • G06T 2200/32 - Indexing scheme for image data processing or generation, in general involving image mosaicing
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10016 - Video; Image sequence
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 - Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 - Processing image signals
    • H04N 13/111 - Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 - Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/20 - Image signal generators
    • H04N 13/204 - Image signal generators using stereoscopic image cameras
    • H04N 13/243 - Image signal generators using stereoscopic image cameras using three or more 2D image sensors
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/30 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N 19/39 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability involving multiple description coding [MDC], i.e. with separate layers being structured as independently decodable descriptions of input picture data

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)

Abstract

In a system for 360-degree video capture and playback, 360-degree video may be captured, stitched, encoded, decoded, rendered, and played back. In one or more implementations, a stitching device may be configured to stitch the 360-degree video using an intermediate coordinate system between the input picture coordinate system and the capture coordinate system. In one or more implementations, the stitching device may be configured to use a projection format decision to stitch the 360-degree video into at least two different projection formats, and an encoding device may be configured to encode the stitched 360-degree video with signaling that indicates the at least two different projection formats. In one or more implementations, the stitching device may be configured to stitch the 360-degree video into multiple views, and a rendering device may be configured to render the decoded bitstream using one or more suggested views.

Description

360-degree video capture and playback
Cross-reference to related applications
This application claims the benefit under 35 U.S.C. § 119 of the following U.S. provisional patent applications: U.S. Provisional Patent Application No. 62/339,040, entitled "360 DEGREE VIDEO CAPTURE AND PLAYBACK," filed on May 19, 2016; U.S. Provisional Patent Application No. 62/408,652, entitled "VIDEO CODING WITH ADAPTIVE PROJECTION FORMAT," filed on October 14, 2016; and U.S. Provisional Patent Application No. 62/418,066, entitled "SUGGESTED VIEWS WITHIN 360 DEGREE VIDEO," filed on November 4, 2016, the disclosures of which are incorporated herein by reference in their entirety for all purposes.
Technical field
The present disclosure relates to video capture and playback, and more particularly to 360-degree video capture, processing, and playback.
Background
360-degree video, also referred to as immersive video and/or spherical video, is a video recording of a real-world panorama in which the views in every direction are recorded at the same time using an omnidirectional camera or a collection of cameras. During playback, the viewer can control the field-of-view (FOV) angles and the viewing direction (a form of virtual reality).
Summary
In one aspect, the present disclosure relates to a system that includes: a video capture device configured to capture 360-degree video; a stitching device configured to stitch the captured 360-degree video using an intermediate coordinate system between an input picture coordinate system and a 360-degree video capture coordinate system; and an encoding device configured to encode the stitched 360-degree video into a 360-degree video bitstream and to prepare the 360-degree video bitstream for transmission and storage for playback.
In another aspect, the present disclosure relates to a system that includes: a video capture device configured to capture 360-degree video; a stitching device configured to use a projection format decision to stitch the captured 360-degree video into at least two different projection formats; and an encoding device configured to encode the stitched 360-degree video into a 360-degree video bitstream that includes signaling indicating the at least two different projection formats, and to prepare the 360-degree video bitstream for transmission for playback.
In another aspect, the present disclosure relates to a system that includes: a video capture device configured to capture 360-degree video; a stitching device configured to stitch the captured 360-degree video into multiple views; an encoding device configured to encode the stitched 360-degree video into 360-degree video bitstreams; a decoding device configured to decode the 360-degree video bitstreams associated with the multiple views; and a rendering device configured to render the decoded bitstreams using one or more suggested views among the multiple views.
Brief description of the drawings
Certain features of the subject technology are set forth in the appended claims. However, for purposes of explanation, one or more implementations of the subject technology are set forth in the accompanying drawings.
FIG. 1 illustrates an example network environment in which 360-degree video capture and playback may be implemented, in accordance with one or more implementations.
FIG. 2 conceptually illustrates an example of the equirectangular projection format.
FIG. 3 conceptually illustrates an example of an equirectangular projection of a map of the Earth.
FIG. 4 conceptually illustrates an example of 360-degree video and an equirectangular projection layout.
FIG. 5 conceptually illustrates an example of a 360-degree image in an equirectangular projection layout.
FIG. 6 conceptually illustrates an example definition of the six faces of a cube.
FIG. 7 conceptually illustrates an example of the cube projection format.
FIG. 8 conceptually illustrates an example of a 360-degree image in a cube projection layout.
FIG. 9 conceptually illustrates an example of the normalized projection plane size determined by the field-of-view angles.
FIG. 10 conceptually illustrates an example of viewing direction angles.
FIG. 11 illustrates a schematic diagram of the coordinate mapping between the output rendering picture and the input 360-degree video picture.
FIG. 12 conceptually illustrates an example of mapping a point in the normalized rendering coordinate system to the normalized projection coordinate system using the equirectangular projection format.
FIG. 13 conceptually illustrates an example of mapping a point in the normalized rendering coordinate system to the normalized projection coordinate system using the cube projection format.
FIG. 14 conceptually illustrates an example of a two-dimensional layout 1400 of the samples of the input 360-degree video picture that are projected for rendering the 360-degree video.
FIG. 15 conceptually illustrates an example of the global rotation angles between the capture coordinate system and the 360-degree video projection coordinate system.
FIG. 16 conceptually illustrates an example of an alternative 360-degree view projection in the equirectangular format.
FIG. 17 illustrates a schematic diagram of an example of the coordinate mapping process modified with the 360-degree video projection coordinate system.
FIG. 18 illustrates a schematic diagram of an example of a 360-degree video capture and playback system using a six-view layout format.
FIG. 19 illustrates a schematic diagram of an example of a 360-degree video capture and playback system using a two-view layout format.
FIG. 20 illustrates a schematic diagram of an example of a 360-degree video capture and playback system using a two-view layout format in which one of the view sequences is used for rendering.
FIG. 21 conceptually illustrates examples of multiple cube projection format layouts.
FIG. 22 conceptually illustrates an example of unrestricted motion compensation.
FIG. 23 conceptually illustrates examples of multiple 360-degree video projection format layouts.
FIG. 24 conceptually illustrates an example of extended unrestricted motion compensation.
FIG. 25 illustrates a schematic diagram of another example of a 360-degree video capture and playback system.
FIG. 26 conceptually illustrates an example of the cube projection format.
FIG. 27 conceptually illustrates an example of the fisheye projection format.
FIG. 28 conceptually illustrates an example of the icosahedron projection format.
FIG. 29 illustrates a schematic diagram of an example of 360-degree video in multiple projection formats.
FIG. 30 illustrates a schematic diagram of an example of a 360-degree video capture and playback system with adaptive projection format.
FIG. 31 illustrates a schematic diagram of another example of a 360-degree video capture and playback system with adaptive projection format.
FIG. 32 illustrates a schematic diagram of an example of projection format determination.
FIG. 33 conceptually illustrates an example of a projection format change that excludes inter prediction across the projection format change boundary.
FIG. 34 conceptually illustrates an example of a projection format change with inter prediction across the projection format change boundary.
FIG. 35 illustrates an example network environment 3500 in which suggested views within 360-degree video may be implemented, in accordance with one or more implementations.
FIG. 36 conceptually illustrates an example of equirectangular and cube projections.
FIG. 37 conceptually illustrates an example of rendering 360-degree video.
FIG. 38 illustrates a schematic diagram of extraction and rendering with suggested views.
FIG. 39 conceptually illustrates an electronic system with which one or more implementations of the subject technology may be implemented.
The accompanying appendix, which is included to provide further understanding of the subject technology and is incorporated in and constitutes a part of this specification, illustrates aspects of the subject technology and together with the description serves to explain the principles of the subject technology.
Detailed description
The detailed description set forth below is intended as a description of various configurations of the subject technology and is not intended to represent the only configurations in which the subject technology may be practiced. The accompanying drawings are incorporated herein and constitute a part of the detailed description. The detailed description includes specific details for the purpose of providing a thorough understanding of the subject technology. However, it will be clear and apparent to those skilled in the art that the subject technology is not limited to the specific details set forth herein and may be practiced using one or more implementations. In one or more instances, well-known structures and components are shown in block diagram form in order to avoid obscuring the concepts of the subject technology.
In a system for 360-degree video capture and playback, 360-degree video may be captured, stitched, encoded, stored or transmitted, decoded, rendered, and played back. In one or more implementations, a stitching device may be configured to stitch the 360-degree video using an intermediate coordinate system between the input picture coordinate system and the capture coordinate system. In one or more implementations, the stitching device may be configured to use a projection format decision to stitch the 360-degree video into at least two different projection formats, and an encoding device may be configured to encode the stitched 360-degree video with signaling that indicates the at least two different projection formats. In one or more implementations, the stitching device may be configured to stitch the 360-degree video into multiple views, and a rendering device may be configured to render the decoded bitstream using one or more suggested views.
In the subject system, system messages indicating a default (recommended) viewing direction, FOV angles, and/or rendering picture size may be signaled together with the 360-degree video content. A 360-degree video playback device (not shown) may keep the rendering picture size unchanged but purposely reduce the active rendering area in order to lower the rendering complexity and the memory bandwidth requirements. The 360-degree video playback device may store the 360-degree video rendering settings (e.g., FOV angles, viewing direction angles, rendering picture size, etc.) just before playback ends or before switching to another program channel, so that the stored rendering settings can be used when playback of the same channel resumes. The 360-degree video playback device may provide a preview mode in which the viewing angle changes automatically every N frames to help the viewer select a desirable viewing direction. The 360-degree video capture and playback device may compute the projection map on the fly (e.g., block by block) to save memory bandwidth; in that case the projection map may not be loaded from off-chip memory. In the subject system, different view fidelity information may be assigned to different views.
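As a concrete illustration of the rendering settings described above, the following C sketch shows one way a playback device might persist and restore per-channel rendering settings. The structure fields, the MAX_CHANNELS constant, and the function names are illustrative assumptions and are not part of the disclosed signaling.

#define MAX_CHANNELS 64   /* hypothetical number of program channels */

/* Rendering settings that a playback device could save and restore. */
typedef struct {
    double fovH, fovV;        /* horizontal/vertical FOV angles (radians) */
    double yaw, pitch, roll;  /* viewing direction angles (radians)       */
    int    renderWidth;       /* rendering picture width (pixels)         */
    int    renderHeight;      /* rendering picture height (pixels)        */
} RenderSettings;

static RenderSettings savedSettings[MAX_CHANNELS];

/* Called just before playback ends or the channel is switched. */
void save_render_settings(int channel, const RenderSettings *s) {
    savedSettings[channel] = *s;
}

/* Called when playback of the same channel resumes. */
void restore_render_settings(int channel, RenderSettings *s) {
    *s = savedSettings[channel];
}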
FIG. 1 illustrates an example network environment 100 in which 360-degree video capture and playback may be implemented, in accordance with one or more implementations. Not all of the depicted components may be used, however, and one or more implementations may include additional components not shown in the figure. Variations in the arrangement and type of the components may be made without departing from the spirit or scope of the claims as set forth herein. Additional components, different components, or fewer components may be provided.
The example network environment 100 includes a 360-degree video capture device 102, a 360-degree video stitching device 104, a video encoding device 106, a transmission link or storage medium, a video decoding device 108, and a 360-degree video rendering device 110. In one or more implementations, one or more of the devices 102, 104, 106, 108, 110 may be combined into the same physical device. For example, the 360-degree video capture device 102, the 360-degree video stitching device 104, and the video encoding device 106 may be combined into a single device, and the video decoding device 108 and the 360-degree video rendering device 110 may be combined into a single device. In some aspects, the network environment 100 may include a storage device 114 that stores the encoded 360-degree video (e.g., on a DVD, a Blu-ray disc, a digital video recording in the cloud or at a gateway/set-top box, etc.) for later playback on a display device (e.g., 112).
The network environment 100 may further include a 360-degree video projection format conversion device (not shown), which may perform 360-degree video projection format conversion before video encoding by the video encoding device 106 and/or after video decoding by the video decoding device 108. The network environment 100 may also include a 360-degree video projection format conversion device (not shown) inserted between the video decoding device 108 and the 360-degree video rendering device 110. In one or more implementations, the video encoding device 106 may be communicatively coupled to the video decoding device 108 via a transmission link (e.g., over a network).
In the subject system, the 360-degree video stitching device 104 may utilize an additional coordinate system that provides more freedom on the capture side when the captured 360-degree video is projected onto the 2D input picture coordinate system for storage or transmission. The 360-degree video stitching device 104 may also support multiple projection formats for 360-degree video storage, compression, transmission, decoding, rendering, and so on. For example, the video stitching device 104 may remove the overlapped areas captured by the camera rig and output, for example, six view sequences, each covering a 90°×90° viewport. The 360-degree video projection format conversion device (not shown) may convert an input 360-degree video projection format (e.g., the cube projection format) into an output 360-degree video projection format (e.g., the equirectangular format).
The video encoding device 106 may minimize the spatial discontinuities (i.e., the number of face boundaries) in the composed picture in order to perform better spatial prediction and thereby optimize the compression efficiency of video compression. For cube projection, for example, a preferred layout should have a minimized number of face boundaries, e.g., four, within the composed 360-degree video picture. The video encoding device 106 may implement unrestricted motion compensation (UMC) to optimize compression efficiency.
In the subject system, the 360-degree video rendering device 110 may derive the chroma projection map from the luma projection map. The 360-degree video rendering device 110 may also select the rendering picture size to maximize the displayed video quality. The 360-degree video rendering device 110 may also jointly select the horizontal FOV angle α and the vertical FOV angle β to minimize rendering distortion. The 360-degree video rendering device 110 may also control the FOV angles to achieve real-time 360-degree video rendering subject to the available memory bandwidth budget.
In FIG. 1, the 360-degree video is captured by a camera rig and stitched together into the equirectangular format. The video can then be compressed into any suitable video compression format (e.g., MPEG/ITU-T AVC/H.264, HEVC/H.265, VP9, etc.) and transmitted via a transmission link (e.g., cable, satellite, terrestrial, internet streaming, etc.). On the receiver side, the video is decoded (e.g., 108) and stored in the equirectangular format, then rendered according to the viewing direction angles and the field-of-view (FOV) angles (e.g., 110) and displayed (e.g., 112). In the subject system, the end user can control the FOV angles and the viewing direction angles to watch the video at a desirable viewing angle.
Coordinate systems
There are multiple coordinate systems applicable to the subject technology, including but not limited to:
● (x, y, z): the 3D 360-degree video capture (camera) coordinate system.
● (x′, y′, z′): the 3D 360-degree video viewing coordinate system.
● (xp, yp): the 2D normalized projection coordinate system, where xp ∈ [0.0 : 1.0] and yp ∈ [0.0 : 1.0].
● (Xp, Yp): the 2D input picture coordinate system, where Xp ∈ [0 : inputPicWidth−1] and Yp ∈ [0 : inputPicHeight−1], and inputPicWidth × inputPicHeight is the input picture size of a color component (e.g., Y, U, or V).
● (xc, yc): the 2D normalized rendering coordinate system, where xc ∈ [0.0 : 1.0] and yc ∈ [0.0 : 1.0].
● (Xc, Yc): the 2D output rendering picture coordinate system, where Xc ∈ [0 : renderingPicWidth−1] and Yc ∈ [0 : renderingPicHeight−1], and renderingPicWidth × renderingPicHeight is the output rendering picture size of a color component (e.g., Y, U, or V).
● (xr, yr, zr): the 3D 360-degree video projection coordinate system.
FIG. 2 conceptually illustrates an example of the equirectangular projection format 200. The equirectangular (equidistant cylindrical) projection format 200 represents a standard way of texture-mapping a sphere in computer graphics. It is also known as the geographic projection, plate carrée, or carte parallelogrammatique. As shown in FIG. 2, to project a sphere surface point p(x, y, z) (e.g., 202) to a sample p′(xp, yp) in the normalized projection coordinate system (e.g., 204), the longitude ω and latitude φ of p(x, y, z) are computed according to Equation 1, where ω ∈ [−π : π] and φ ∈ [−π/2 : π/2]; π is the ratio of a circle's circumference to its diameter, approximately 3.1415926.
The equirectangular projection format 200 can then be defined as in Equation 2, which maps (ω, φ) to the coordinates (xp, yp) in the normalized projection coordinate system, where xp ∈ [0.0 : 1.0] and yp ∈ [0.0 : 1.0].
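Because the closed forms of Equations 1 and 2 are not reproduced in this text, the following C sketch shows a standard equirectangular formulation consistent with the ranges given above. The axis convention (y as the vertical axis, longitude measured from the z-axis) is an assumption and may differ from the original equations.

#include <math.h>

/* Map a sphere surface point p(x, y, z) to normalized equirectangular
   coordinates (xp, yp) in [0,1]x[0,1].  Assumes y is the vertical axis. */
void sphere_to_equirect(double x, double y, double z, double *xp, double *yp) {
    const double PI = 3.14159265358979323846;
    double r     = sqrt(x * x + y * y + z * z);
    double omega = atan2(x, z);      /* longitude in [-pi, pi]     (Equation 1, assumed) */
    double phi   = asin(y / r);      /* latitude  in [-pi/2, pi/2] (Equation 1, assumed) */
    *xp = (omega + PI) / (2.0 * PI);         /* Equation 2, assumed form */
    *yp = (PI / 2.0 - phi) / PI;
}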
FIG. 3 conceptually illustrates an example of an equirectangular projection layout 300 of a map of the Earth. In the equirectangular projection layout 300, the picture has a 1:1 mapping only along the equator and is stretched elsewhere. The largest mapping distortion occurs at the north and south poles of the sphere (e.g., 302), where a single point is mapped to an entire line of samples in the equirectangular projection picture (e.g., 304), which results in a significant amount of redundant data in the composed 360-degree video when the equirectangular projection layout 300 is used.
FIG. 4 conceptually illustrates an example of 360-degree video and an equirectangular projection layout 400. Where single-layer video transmission is used to leverage the existing infrastructure of video codecs, the 360-degree video segments captured by multiple cameras at different angles (e.g., 402) are typically stitched and composed into a single video sequence stored in an equirectangular projection layout. As shown in FIG. 4, in the equirectangular projection layout 400, the left, front, and right video segments of the 360-degree video are projected into the middle of the picture, the back video segment is split in half and placed at the left and right sides of the picture, and the top and bottom video segments are placed at the top and bottom of the picture, respectively (e.g., 404). All of the video segments are stretched, with the top and bottom video segments stretched the most. FIG. 5 conceptually illustrates an example of a 360-degree image in an equirectangular projection layout.
Cube projection
FIG. 6 conceptually illustrates an example definition of the six-face cube 600. Another common projection format for storing 360-degree views is to project the video segments onto the faces of a cube. As illustrated in FIG. 6, the six faces of the cube are named front, back, left, right, top, and bottom.
FIG. 7 conceptually illustrates an example of the cube projection format 700. In FIG. 7, the cube projection format 700 maps a sphere surface point p(x, y, z) to one of the six cube faces (e.g., 702), where the cube face id and the coordinates (xp, yp) in the normalized cube projection coordinate system are computed (e.g., 704).
FIG. 8 conceptually illustrates an example of a 360-degree image in a cube projection layout 800. The projection rules for the cube projection are described in Table 1, which provides pseudo-code for mapping a sphere surface point p(x, y, z) to a cube face.
Table 1: Pseudo-code for the cube projection mapping
if (z > 0 && (−z ≤ y ≤ z) && (−z ≤ x ≤ z))
else if (z < 0 && (z ≤ y ≤ −z) && (z ≤ x ≤ −z))
else if (x > 0 && (−x ≤ y ≤ x) && (−x ≤ z ≤ x))
else if (x < 0 && (x ≤ y ≤ −x) && (x ≤ z ≤ −x))
else if (y > 0 && (−y ≤ x ≤ y) && (−y ≤ z ≤ y))
else if (y < 0 && (y ≤ x ≤ −y) && (y ≤ z ≤ −y))
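The per-face assignments and in-face coordinate formulas of Table 1 are not reproduced in this text. The following C sketch shows one common cube-mapping convention that matches the condition structure above; the face labels and the (xp, yp) formulas are assumptions and may differ from the original table. The input is assumed to be a nonzero point on the sphere.

typedef enum { FACE_FRONT, FACE_BACK, FACE_RIGHT, FACE_LEFT, FACE_TOP, FACE_BOTTOM } CubeFace;

/* Map a sphere surface point p(x, y, z) to an assumed cube face id and
   normalized in-face coordinates (xp, yp) in [0,1]x[0,1]. */
CubeFace sphere_to_cube(double x, double y, double z, double *xp, double *yp) {
    if (z > 0 && -z <= y && y <= z && -z <= x && x <= z) {        /* front  */
        *xp = 0.5 * (x / z + 1.0);  *yp = 0.5 * (1.0 - y / z);    return FACE_FRONT;
    } else if (z < 0 && z <= y && y <= -z && z <= x && x <= -z) { /* back   */
        *xp = 0.5 * (x / z + 1.0);  *yp = 0.5 * (1.0 + y / z);    return FACE_BACK;
    } else if (x > 0 && -x <= y && y <= x && -x <= z && z <= x) { /* right  */
        *xp = 0.5 * (1.0 - z / x);  *yp = 0.5 * (1.0 - y / x);    return FACE_RIGHT;
    } else if (x < 0 && x <= y && y <= -x && x <= z && z <= -x) { /* left   */
        *xp = 0.5 * (1.0 - z / x);  *yp = 0.5 * (1.0 + y / x);    return FACE_LEFT;
    } else if (y > 0 && -y <= x && x <= y && -y <= z && z <= y) { /* top    */
        *xp = 0.5 * (x / y + 1.0);  *yp = 0.5 * (z / y + 1.0);    return FACE_TOP;
    } else {                                                      /* bottom */
        *xp = 0.5 * (1.0 - x / y);  *yp = 0.5 * (z / y + 1.0);    return FACE_BOTTOM;
    }
}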
Field of view (FOV) and viewing direction angles
To display 360-degree video, a portion of each 360-degree video picture needs to be projected and rendered. The field-of-view (FOV) angles define how much of the 360-degree video picture is displayed, and the viewing direction angles define which portion of the 360-degree video picture is displayed.
To display the 360-degree video, imagine that the video is mapped onto the surface of a single sphere and that a viewer sitting at the center point of the sphere watches a rectangular screen whose four corners are located on the sphere surface. Here, (x′, y′, z′) is referred to as the 360-degree video viewing coordinate system, and (xc, yc) is referred to as the normalized rendering coordinate system.
FIG. 9 conceptually illustrates an example of the normalized projection plane size 900 determined by the field-of-view angles. As shown in FIG. 9, in the viewing coordinate system (x′, y′, z′), the center of the projection plane (i.e., the rectangular screen) is located on the z′ axis and the plane is parallel to the x′y′ plane. Accordingly, the projection plane size w×h and its distance d to the center of the sphere can be computed as in Equation 3, where α ∈ (0 : π] is the horizontal FOV angle and β ∈ (0 : π] is the vertical FOV angle.
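The closed form of Equation 3 is not reproduced above. The following C sketch shows one consistent derivation, under the assumption of a unit sphere with the four screen corners lying on its surface; the exact form of Equation 3 may differ.

#include <math.h>

/* Normalized projection plane size (w x h) and its distance d to the sphere
   center, for horizontal/vertical FOV angles alpha and beta (radians).
   Assumes a unit sphere with the screen corners on its surface, so that
   d*d + (w/2)*(w/2) + (h/2)*(h/2) = 1. */
void projection_plane(double alpha, double beta, double *w, double *h, double *d) {
    double ta = tan(alpha / 2.0);
    double tb = tan(beta  / 2.0);
    *d = 1.0 / sqrt(1.0 + ta * ta + tb * tb);
    *w = 2.0 * (*d) * ta;
    *h = 2.0 * (*d) * tb;
}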
FIG. 10 conceptually illustrates an example of the viewing direction angles 1000. The viewing direction is defined by the rotation angles of the 3D viewing coordinate system (x′, y′, z′) relative to the 3D capture coordinate system (x, y, z). As shown in FIG. 10, the viewing direction is specified by a clockwise rotation angle θ about the y-axis (e.g., 1002, yaw), a counterclockwise rotation angle γ about the x-axis (e.g., 1004, pitch), and a counterclockwise rotation angle ε about the z-axis (e.g., 1006, roll).
The coordinate mapping between the (x, y, z) and (x′, y′, z′) coordinate systems is defined by a 3×3 rotation matrix composed from the viewing direction angles (Equation 4).
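Equation 4 is not reproduced in this text. The following C sketch builds a 3x3 rotation matrix from the viewing direction angles under the assumption that the yaw (θ, about y), pitch (γ, about x), and roll (ε, about z) rotations are applied in that order; the actual rotation order and signs of Equation 4 may differ.

#include <math.h>

/* Compose a 3x3 viewing-direction rotation matrix R such that
   (x, y, z)^T = R * (x', y', z')^T.  Rotation order R = Ry(theta) *
   Rx(gamma) * Rz(epsilon) is an assumption. */
void view_rotation(double theta, double gamma, double epsilon, double R[3][3]) {
    double cy = cos(theta),   sy = sin(theta);
    double cx = cos(gamma),   sx = sin(gamma);
    double cz = cos(epsilon), sz = sin(epsilon);

    R[0][0] =  cy * cz + sy * sx * sz;  R[0][1] = -cy * sz + sy * sx * cz;  R[0][2] =  sy * cx;
    R[1][0] =  cx * sz;                 R[1][1] =  cx * cz;                 R[1][2] = -sx;
    R[2][0] = -sy * cz + cy * sx * sz;  R[2][1] =  sy * sz + cy * sx * cz;  R[2][2] =  cy * cx;
}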
FIG. 11 illustrates a schematic diagram of the coordinate mapping 1100 between the output rendering picture and the input picture. Using the FOV and viewing direction angles defined above, the coordinate mapping between the output rendering picture coordinate system (Xc, Yc) (i.e., the rendering picture for display) and the input picture coordinate system (Xp, Yp) (i.e., the input 360-degree video picture) can be established. As shown in FIG. 11, given a sample position (Xc, Yc) in the rendering picture, the coordinates of the corresponding sample position (Xp, Yp) in the input picture can be derived by the following steps (a code sketch of these steps follows the list):
● Compute the normalized projection plane size and its distance to the sphere center based on the FOV angles (α, β) (i.e., Equation 3); compute the coordinate transform matrix between the viewing and capture coordinate systems based on the viewing direction angles (ε, θ, γ) (i.e., Equation 4).
● Normalize (Xc, Yc) based on the rendering picture size and the normalized projection plane size.
● Map the coordinates (xc, yc) in the normalized rendering coordinate system to the 3D viewing coordinate system (x′, y′, z′).
● Transform the coordinates into the 3D capture coordinate system (x, y, z).
● Derive the coordinates (xp, yp) in the normalized projection coordinate system.
● Convert the derived coordinates into an integer position in the input picture based on the input picture size and the projection layout format.
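A minimal C sketch of the step list above is given below for an equirectangular input, using the helper routines sketched earlier (projection_plane(), view_rotation(), sphere_to_equirect()). The normalization conventions are assumptions, since the underlying equations are not reproduced in this text.

/* Map a rendering-picture sample (Xc, Yc) to an integer input-picture
   position (Xp, Yp).  alpha/beta are FOV angles; theta/gamma/epsilon are
   the viewing direction angles (all in radians). */
void render_to_input(int Xc, int Yc,
                     int renderingPicWidth, int renderingPicHeight,
                     int inputPicWidth, int inputPicHeight,
                     double alpha, double beta,
                     double theta, double gamma, double epsilon,
                     int *Xp, int *Yp) {
    double w, h, d, R[3][3];
    projection_plane(alpha, beta, &w, &h, &d);   /* step 1: Equation 3 */
    view_rotation(theta, gamma, epsilon, R);     /* step 1: Equation 4 */

    /* step 2: normalize (Xc, Yc) */
    double xc = (Xc + 0.5) / renderingPicWidth;
    double yc = (Yc + 0.5) / renderingPicHeight;

    /* step 3: point on the screen in the viewing coordinate system */
    double xv = (xc - 0.5) * w, yv = (0.5 - yc) * h, zv = d;

    /* step 4: transform into the capture coordinate system */
    double x = R[0][0] * xv + R[0][1] * yv + R[0][2] * zv;
    double y = R[1][0] * xv + R[1][1] * yv + R[1][2] * zv;
    double z = R[2][0] * xv + R[2][1] * yv + R[2][2] * zv;

    /* steps 5-6: normalized projection coordinates, then integer position */
    double xp, yp;
    sphere_to_equirect(x, y, z, &xp, &yp);
    *Xp = (int)(xp * inputPicWidth);   if (*Xp > inputPicWidth  - 1) *Xp = inputPicWidth  - 1;
    *Yp = (int)(yp * inputPicHeight);  if (*Yp > inputPicHeight - 1) *Yp = inputPicHeight - 1;
}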
FIG. 12 conceptually illustrates an example of mapping a point in the normalized rendering coordinate system (e.g., p(xc, yc)) to the normalized projection coordinate system (e.g., p′(xp, yp)) using the equirectangular projection format 1200.
In one or more implementations, projection from an equirectangular input format is performed. For example, if the input picture is in the equirectangular projection format, the following steps apply for mapping a sample position (Xc, Yc) in the rendering picture to a sample position (Xp, Yp) in the input picture:
● Compute the normalized display projection plane size and its distance to the sphere center based on the FOV angles.
● Map (Xc, Yc) into the normalized rendering coordinate system.
● Compute the coordinates of p(xc, yc) in the (x′, y′, z′) viewing coordinate system.
● Transform the coordinates (x′, y′, z′) into the (x, y, z) capture coordinate system based on the viewing direction angles.
● Project p(x, y, z) onto the normalized projection coordinate system p′(xp, yp).
● Map p′(xp, yp) onto the input picture (equirectangular) coordinate system (Xp, Yp),
where:
● α, β are the FOV angles, and ε, θ, γ are the viewing direction angles;
● renderingPicWidth × renderingPicHeight is the rendering picture size;
● inputPicWidth × inputPicHeight is the input picture size (in the equirectangular projection format).
FIG. 13 conceptually illustrates an example of mapping a point in the normalized rendering coordinate system (e.g., p(xc, yc)) to the normalized projection coordinate system (e.g., p′(xp, yp)) using the cube projection format 1300.
In one or more implementations, projection from a cube projection input format is performed. For example, if the input picture is in the cube projection format, the following similar steps apply for mapping a sample position (Xc, Yc) in the rendering picture to a sample position (Xp, Yp) in the input picture (a code sketch of the final mapping step follows this list):
● Compute the normalized display projection plane size and its distance to the sphere center based on the FOV angles.
● Map (Xc, Yc) into the normalized rendering coordinate system.
● Compute the coordinates of p(xc, yc) in the (x′, y′, z′) viewing coordinate system.
● Transform the coordinates (x′, y′, z′) into the (x, y, z) capture coordinate system based on the viewing direction angles.
● Project p(x, y, z) onto the normalized cube coordinate system p′(xp, yp) based on the pseudo-code defined in Table 1.
● Map p′(xp, yp) onto the input cube coordinate system (Xp, Yp), assuming that all of the cube faces have equal resolution,
where:
● α, β are the FOV angles, and ε, θ, γ are the viewing direction angles;
● renderingPicWidth × renderingPicHeight is the rendering picture size;
● inputPicWidth × inputPicHeight is the input picture size (in the cube projection format);
● {(Xoffset[faceID], Yoffset[faceID]) | faceID = front, back, left, right, top, bottom} are the coordinate offsets of the cube faces in the input cube projection coordinate system.
For the cube projection layout described in FIG. 13, the face ID indexes the coordinate offset array in the following order: front, back, left, right, top, then bottom.
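A minimal C sketch of the last step, converting a face id and normalized in-face coordinates to an input-picture position using per-face offsets, is shown below. It assumes all cube faces have the same resolution faceSize x faceSize; the offset values themselves depend on the chosen layout and are not specified here.

/* Per-face coordinate offsets in the input cube projection coordinate system,
   indexed in the assumed order front, back, left, right, top, bottom. */
typedef struct { int Xoffset; int Yoffset; } FaceOffset;

void cube_to_input(int faceID, double xp, double yp, int faceSize,
                   const FaceOffset offsets[6], int *Xp, int *Yp) {
    int xf = (int)(xp * faceSize);  if (xf > faceSize - 1) xf = faceSize - 1;
    int yf = (int)(yp * faceSize);  if (yf > faceSize - 1) yf = faceSize - 1;
    *Xp = offsets[faceID].Xoffset + xf;   /* position inside the composed picture */
    *Yp = offsets[faceID].Yoffset + yf;
}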
Rendering samples for display
In 360-degree video projection and display, multiple samples in the input 360-degree video picture (e.g., in the equirectangular or cube projection format) may be projected to the same integer position (Xc, Yc) in the rendering picture. For smooth rendering, not only the integer pixel positions but also the sub-pixel positions in the rendering picture are projected to find the corresponding samples in the input picture.
FIG. 14 conceptually illustrates an example of a two-dimensional layout 1400 of the samples of the input 360-degree video picture that are projected for rendering the 360-degree video. As shown in FIG. 14, if the projection accuracy is 1/n sub-pixel in the horizontal direction and 1/m sub-pixel in the vertical direction, the sample value of the rendering picture at position (Xc, Yc) can be computed by averaging the input picture samples to which the n×m sub-pixel positions associated with (Xc, Yc) are projected,
where:
● (Xp, Yp) = mapping_func(Xc, Yc) is the coordinate mapping function from the rendering picture to the input 360-degree video picture defined in the sections above (e.g., for the equirectangular or cube projection format);
● inputImg[Xp, Yp] is the sample value at position (Xp, Yp) in the input picture;
● renderingImg[Xc, Yc] is the sample value at position (Xc, Yc) in the output rendering picture.
Instead of computing the coordinate mapping between the output rendering coordinate system (Xc, Yc) and the input picture coordinate system (Xp, Yp) in real time, the coordinate mapping can also be pre-computed and stored as a projection map for the entire rendering picture. Because the viewing direction and FOV angles may not change from picture to picture, the pre-computed projection map can be shared for rendering multiple pictures.
Let projectMap[n*Xc + j, m*Yc + i] be the pre-computed projection map, where Xc = 0, 1, ..., renderingPicWidth−1, Yc = 0, 1, ..., renderingPicHeight−1, j = 0, 1, ..., n−1, and i = 0, 1, ..., m−1. For each sub-pixel position (Xc + j/n, Yc + i/m) in the rendering picture, the corresponding entry of the projection map stores the pre-computed coordinate value (Xp, Yp) in the input picture coordinate system. The rendering can then be written as an average of the input samples addressed by the projection map.
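A C sketch of projection-map-based rendering is shown below; it forms the output sample as the plain average of the n x m projected sub-pixel samples, which is one reasonable reading of the averaging formula that is not reproduced above.

/* Render one output sample at (Xc, Yc) by averaging the n*m input samples
   addressed by the pre-computed projection map. */
typedef struct { int Xp; int Yp; } MapEntry;

unsigned char render_sample(int Xc, int Yc, int n, int m,
                            const MapEntry *projectMap, int renderingPicWidth,
                            const unsigned char *inputImg, int inputPicWidth) {
    int sum = 0;
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            /* entry for sub-pixel position (Xc + j/n, Yc + i/m) */
            const MapEntry *e =
                &projectMap[(m * Yc + i) * (n * renderingPicWidth) + (n * Xc + j)];
            sum += inputImg[e->Yp * inputPicWidth + e->Xp];
        }
    }
    return (unsigned char)((sum + (n * m) / 2) / (n * m));   /* rounded average */
}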
A picture may have multiple color components, such as YUV, YCbCr, or RGB. The rendering process described above can be applied to each color component independently.
FIG. 15 conceptually illustrates an example of the global rotation angles 1500 between the capture coordinate system and the 360-degree video projection coordinate system. Instead of mapping the 360-degree video directly from the 3D 360-degree video capture coordinate system (x, y, z) to the input picture coordinate system (Xp, Yp) during 360-degree video capture and stitching, an additional coordinate system, named the 3D 360-degree video projection coordinate system (xr, yr, zr), is introduced. The relationship between (xr, yr, zr) and (x, y, z) is specified by the counterclockwise global rotation angles about the z, y, and x axes, as shown in FIG. 15. The coordinate transform between the two systems is defined in Equation 8.
In one or more implementations, Equation 8 can be rewritten as Equation 9.
FIG. 16 conceptually illustrates an example of an alternative 360-degree view projection 1600 in the equirectangular projection format. When the captured 360-degree video is projected onto the 2D input picture coordinate system for storage or transmission, the additional coordinate system (xr, yr, zr) provides more freedom on the 360-degree video capture side. Using the equirectangular projection format as an example, it may sometimes be desirable to project the front and back views toward the south and north poles, as shown in FIG. 16, rather than the top and bottom views as shown in FIG. 4. Because the data around the south and north poles of the sphere is better preserved in the equirectangular projection, allowing alternative layouts (such as the layout shown in FIG. 16) can potentially provide benefits (e.g., better compression efficiency) in some circumstances.
Because the 360-degree video data can be projected to the input picture coordinate system from the 3D coordinate system (xr, yr, zr) (rather than from (x, y, z)), an additional step of transforming the coordinates (x, y, z) into (xr, yr, zr) based on Equation 9 is needed in FIG. 2 and FIG. 7 before the 360-degree video data is projected to the input picture coordinate system. Accordingly, (xr, yr, zr) replaces (x, y, z) in Equation 1, Equation 5, and Table 1.
FIG. 17 illustrates a schematic diagram of an example of the coordinate mapping process modified with the 360-degree video projection coordinate system. The coordinate mapping from the output rendering picture coordinate system (Xc, Yc) to the input picture coordinate system (Xp, Yp) can therefore be modified as illustrated in FIG. 17. The computation of the coordinate transform matrix is changed to take into account both the viewing direction angles (ε, θ, γ) and the global rotation angles. By cascading Equation 9 and Equation 4, the coordinates can be transformed directly from the viewing coordinate system (x′, y′, z′) into the 3D 360-degree video projection coordinate system (xr, yr, zr).
In one or more implementations, the resulting 3×3 coordinate transform matrix can be pre-computed.
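A C sketch of pre-computing the cascaded 3x3 transform is shown below; the two input matrices are assumed to have already been built from Equations 9 and 4, whose entries are not reproduced in this text.

/* Pre-compute M = Rglobal * Rview so that coordinates can be taken directly
   from the viewing coordinate system (x', y', z') to the 3D 360-degree video
   projection coordinate system (xr, yr, zr). */
void cascade_transforms(const double Rglobal[3][3], const double Rview[3][3],
                        double M[3][3]) {
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 3; c++) {
            M[r][c] = Rglobal[r][0] * Rview[0][c]
                    + Rglobal[r][1] * Rview[1][c]
                    + Rglobal[r][2] * Rview[2][c];
        }
    }
}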
In FIG. 17, the global rotation angles may be signaled in the system by any suitable means, for example, in an SEI (supplemental enhancement information) message carried in the video elementary bitstream, such as the SEI messages defined for AVC or HEVC. In addition, a default view may also be specified and carried.
FIG. 18 illustrates a schematic diagram of an example of a 360-degree video capture and playback system 1800 using a six-view layout format. Instead of composing the 360-degree video into a single view sequence (such as the equirectangular format used in FIG. 1), the 360-degree video capture and playback system may support multiple view layout formats for 360-degree video storage, compression, transmission, decoding, rendering, and so on. As shown in FIG. 18, the 360-degree video stitching block (e.g., 104) may simply remove the overlapped areas captured by the camera rig and output six view sequences (i.e., top, bottom, front, back, left, and right), each covering, for example, a 90°×90° viewport. The sequences may or may not have equal resolution, but are compressed (e.g., 1802) and decompressed (e.g., 1804) separately. The 360-degree video rendering engine (e.g., 110) may take the multiple sequences as input and render them for display with the help of the rendering controls.
FIG. 19 illustrates a schematic diagram of an example of a 360-degree video capture and playback system using a two-view layout format. In FIG. 19, the 360-degree video stitching block (e.g., 104) may generate, for example, two view sequences, one covering the top and bottom 360°×30° portions of the data and the other covering the front, back, left, and right 360°×120° portion of the data. The two sequences may have different resolutions and different projection layout formats. For example, the front+back+left+right view may use the equirectangular projection format, while the top+bottom view may use another projection format. In this particular example, two encoders (e.g., 1902) and two decoders (e.g., 1904) are used. The 360-degree video rendering (e.g., 110) may take the two sequences as input and render them for display.
FIG. 20 illustrates a schematic diagram of an example of a 360-degree video capture and playback system 2000 using a two-view layout format in which one of the view sequences is used for rendering. The multi-view layout format may also provide a scalable rendering feature in 360-degree video rendering. As shown in FIG. 20, the 360-degree video rendering may choose not to render the top or bottom 30° portion of the video (e.g., 2002, 2004), either because of the limited capability of the rendering engine (110) or because the bitstream packets of the top+bottom view were lost during transmission.
Even if the 360-degree video stitching block (e.g., 104) generates multiple view sequences, a single video encoder and decoder may still be used in the 360-degree video capture and playback system 2000. For example, if the output of the 360-degree video stitching block (e.g., 104) is six 720p@30 view sequences (i.e., 720p sequences at 30 frames/second), the output can be composed into a single 720p@180 sequence (i.e., a 720p sequence at 180 frames/second) and compressed/decompressed with a single codec. Alternatively, the six independent sequences, for example, may be compressed/decompressed by a single video encoder and decoder instance by time-sharing the available processing resources of the encoder and decoder instance, without being composed into a single composite sequence.
FIG. 21 conceptually illustrates examples of multiple cube projection format layouts: (a) the original cube projection layout; (b) an undesirable cube projection layout; and (c) a preferred cube projection layout. Conventional video compression technology exploits the spatial correlation within a picture. For cube projection, if, for example, the projected top, bottom, front, back, left, and right cube faces are composed into a single view sequence, different cube face layouts can lead to different compression efficiency, even for identical 360-degree video data. In one or more implementations, the original cube layout (a) does not create discontinuous face boundaries in the composed picture, but it has a larger picture size than the other two and carries dummy data in the gray areas. In one or more implementations, layout (b) is one of the undesirable layouts because it has the largest number of discontinuous face boundaries (i.e., seven) in the composed picture. In one or more implementations, layout (c) is one of the preferred cube projection layouts because it has the minimum number of discontinuous face boundaries, i.e., four. The face boundaries in layout (c) of FIG. 21 are the horizontal face boundaries between left and top and between right and back, and the vertical face boundaries between top and bottom and between back and bottom. In some aspects, test results show that layout (c) outperforms both layouts (a) and (b) by approximately 3% on average.
Therefore, for cube projection or other multi-face projection formats, the layout should in general minimize the spatial discontinuities (i.e., the number of discontinuous face boundaries) in the composed picture in order to achieve better spatial prediction and therefore optimize the compression efficiency in video compression. For cube projection, for example, a preferred layout has four discontinuous face boundaries in the composed 360-degree video picture. For the cube projection format, the spatial discontinuities can be further minimized by introducing face rotations (i.e., 90, 180, or 270 degrees) in the layout. In one or more aspects, the layout information of the input 360-degree video to be compressed and rendered is signaled in high-level system messages.
FIG. 22 conceptually illustrates an example of unrestricted motion compensation 2200. Unrestricted motion compensation (UMC) is a technique commonly used in video compression standards to optimize compression efficiency. As shown in FIG. 22, in UMC, the reference block of a prediction unit is allowed to go beyond the picture boundaries. For a reference pixel ref[Xp, Yp] outside the reference picture, the nearest picture boundary pixel is used. Here, the coordinates (Xp, Yp) of the reference pixel are determined by the position and the motion vector of the current prediction unit (PU), and may go beyond the picture boundaries.
Let {refPic[Xcp, Ycp], Xcp = 0, 1, ..., inputPicWidth−1; Ycp = 0, 1, ..., inputPicHeight−1} be the reference picture sample matrix. UMC (i.e., the reference block loading) is then defined as:
refBlk[Xp, Yp] = refPic[clip3(0, inputPicWidth−1, Xp), clip3(0, inputPicHeight−1, Yp)]   (Equation 10)
where the clipping function clip3(0, a, x) is defined as:
int clip3(0, a, x) { if (x < 0) return 0; else if (x > a) return a; else return x; }   (Equation 11)
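A C sketch of the UMC reference-block load of Equations 10 and 11 is shown below; the block size and the row-major picture layout are illustrative assumptions.

static int clip3(int minVal, int maxVal, int x) {     /* Equation 11 */
    if (x < minVal) return minVal;
    if (x > maxVal) return maxVal;
    return x;
}

/* Load a blkW x blkH reference block whose top-left reference position is
   (X0, Y0); positions outside the picture are clamped to the nearest
   boundary pixel (Equation 10). */
void umc_load_block(const unsigned char *refPic,
                    int inputPicWidth, int inputPicHeight,
                    int X0, int Y0, int blkW, int blkH,
                    unsigned char *refBlk) {
    for (int y = 0; y < blkH; y++) {
        for (int x = 0; x < blkW; x++) {
            int Xp = clip3(0, inputPicWidth  - 1, X0 + x);
            int Yp = clip3(0, inputPicHeight - 1, Y0 + y);
            refBlk[y * blkW + x] = refPic[Yp * inputPicWidth + Xp];
        }
    }
}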
FIG. 23 conceptually illustrates examples of multiple 360-degree video projection format layouts 2300. In some aspects, the 360-degree video projection format layouts 2300 include (but are not limited to): (a) the equirectangular projection layout; and (b) the original cube projection layout. In 360-degree video, the left and right picture boundaries may belong to the same camera view. Similarly, the top and bottom picture boundary lines may be close to each other in physical space even though they are far apart in the 360-degree video picture layout. FIG. 23 depicts two examples. In layout example (a), both the left and right picture boundaries belong to the back view. In layout example (b), although the left and right picture boundaries belong to different views (i.e., left and back), the two picture boundary columns are actually physically adjacent during video capture. Therefore, to optimize the compression efficiency of 360-degree video, it is reasonable to allow the reference block to be loaded in a "wrap-around" manner when a reference pixel lies outside the picture boundary, rather than filling it with the nearest picture boundary pixel as defined in current UMC.
In one or more implementations, the following high-level syntax may be used to enable extended UMC.
Table 2: Extended UMC syntax
It should be noted that the picture size difference (ΔW, ΔH) needs to be signaled because the decoded picture size inputPicWidth × inputPicHeight usually must be a multiple of 8 or 16 in both directions, while the captured picture size may not be, which can make the captured picture and the decoded picture different in size. The reference block wrap-around is along the boundaries of the captured picture rather than the boundaries of the decoded picture.
In one or more implementations, the high-level syntax of Table 2 may be signaled in the sequence header, the picture header, or both, depending on the implementation.
Using the semantics defined above, the extended UMC can be defined as follows:
refBlk[Xp, Yp] = refPic[Xcp, Ycp]   (Equation 12)
where (Xcp, Ycp) are computed from (Xp, Yp) using the clip3() and wraparound3() functions, with clip3() defined in Equation 11 and wraparound3(0, a, x) defined as:
int wraparound3(0, a, x) { while (x < 0) x += a; while (x > a) x -= a; return x; }   (Equation 13)
In current video compression standards, a motion vector may go beyond the picture boundaries by a large margin; hence the "while" loops in Equation 13. To avoid the "while" loops, a next-generation video compression standard may restrict how far a reference pixel (for motion compensation) can go beyond the picture boundaries (e.g., up to 64, 128, or 256 pixels, depending on the maximum coding block size defined in the standard). Once this constraint is imposed, wraparound3(0, a, x) can be simplified to:
int wraparound3(0, a, x) { if (x < 0) x += a; if (x > a) x -= a; return x; }   (Equation 14)
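A C sketch of the extended UMC load with horizontal wrap-around is shown below. Since Table 2 and the (Xcp, Ycp) derivation are not reproduced in this text, the choice of wrapping only in the horizontal direction, clamping vertically, and wrapping around the captured picture width (inputPicWidth minus the signaled width difference) are assumptions; the wrap helper is adapted from Equation 14 to map into [0, width-1].

static int wrap_horizontal(int width, int x) {   /* adapted from Equation 14 */
    if (x < 0)      x += width;
    if (x >= width) x -= width;
    return x;
}

/* Extended UMC: wrap the reference block around the captured picture width
   in the horizontal direction and clamp in the vertical direction
   (Equation 12). */
void extended_umc_load_block(const unsigned char *refPic,
                             int inputPicWidth, int inputPicHeight, int deltaW,
                             int X0, int Y0, int blkW, int blkH,
                             unsigned char *refBlk) {
    int capturedPicWidth = inputPicWidth - deltaW;   /* assumed meaning of deltaW */
    for (int y = 0; y < blkH; y++) {
        for (int x = 0; x < blkW; x++) {
            int Xcp = wrap_horizontal(capturedPicWidth, X0 + x);
            int Ycp = Y0 + y;
            if (Ycp < 0) Ycp = 0;
            if (Ycp > inputPicHeight - 1) Ycp = inputPicHeight - 1;
            refBlk[y * blkW + x] = refPic[Ycp * inputPicWidth + Xcp];
        }
    }
}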
FIG. 24 conceptually illustrates an example of extended unrestricted motion compensation 2400. In FIG. 24, reference block wrap-around is enabled in the horizontal direction. Instead of padding the portion of the reference block outside the left picture boundary with the picture boundary pixels, the "wrap-around" part along the right boundary of the (captured) picture is loaded.
Adaptive projection format
Figure 25 illustrates the example net for being implemented within 360 degree of video captures and playback according to one or more embodiments Network environment 2500.However, it is possible to discribed all components will not be used, and one or more embodiments can be included and do not opened up in figure The additional assemblies shown.Can be in the case of without departing substantially from the spirit or scope of claims such as set forth herein, to component Arrangement and type make change.Additional assemblies, different components or less component can be provided.
Example Network Environment looogl 2500 includes 360 degree of video capture devices 2502,360 degree of video-splicing devices 2504, videos Code device 2506, video decoder 2508 and 360 degree of video exhibition devices 2510.In one or more embodiments, dress Putting 2502,2504,2506,2508, one or more of 2510 can be incorporated into same physical unit.For example, 360 degree 2502,306 degree of video-splicing devices 2504 of video capture device and video coding apparatus 2506 can be incorporated into single assembly, And video decoder 2508 and 360 degree of video exhibition devices 2510 can be incorporated into single assembly.
Network environment 2500 can further include 360 degree of video projection format determination devices (not showing), it can be in video Projection lattice are performed before splicing apparatus 2504 carries out video-splicing and/or after video coding apparatus 2506 carries out Video coding Formula selects.Network environment 2500 can also include 360 degree of video playback apparatus (not showing), it plays back 360 degree of presented videos Content.In one or more embodiments, video coding apparatus 2506 can be via transmitting link (such as passing through network) communicatedly It is coupled to video decoder 2508.
In the present system, on the 360-degree video capture/compression side, the 360-degree video stitching device 2504 may use a 360-degree video projection format determination device (not shown) to determine which projection format (e.g., ERP (equirectangular projection), CMP (cube map projection), ISP (icosahedron projection), etc.) best fits the current video segment (i.e., group of pictures) or current picture, in order to achieve the best possible compression efficiency. The decision may be made based on encoding statistics provided by the video encoding device 2506 (e.g., bit-rate distribution, inter/intra modes across the segment or picture, video quality measurements, etc.) and/or raw data statistics on the original 360-degree camera data obtained from the 360-degree video capture device 2502 (e.g., spatial activity distribution of the raw data, etc.). Once a projection format has been selected for the current segment or picture, the 360-degree video stitching device 2504 stitches the video into the selected projection format, and the stitched 360-degree video is passed to the video encoding device 2506 for compression.
In the present system, the selected projection format and the associated projection format parameters (e.g., projection format ID, number of faces in the projection layout, face sizes, face coordinate offsets, face rotation angles, etc.) may be signaled from the video encoding device 2506 in the compressed 360-degree video bitstream in any suitable manner, such as in a Supplemental Enhancement Information (SEI) message, in a sequence header, in a picture header, and so on. The 360-degree video stitching device 2504 may stitch the 360-degree video into the different projection formats selected by the 360-degree video projection format determination device, rather than stitching the 360-degree video into a single, fixed projection format (e.g., ERP).
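The patent does not define a concrete syntax for this signaling, so the following structure is only an illustrative sketch of the kind of payload that could carry the listed parameters; all field names and widths are assumptions.

    // Illustrative only: a possible payload for the projection format parameters
    // described above. Not an SEI syntax defined by any standard or by the patent.
    struct ProjectionFormatInfo {
        unsigned projectionFormatId;   // e.g., 0 = ERP, 1 = CMP, 2 = ISP, 3 = fisheye
        unsigned numFaces;             // number of faces in the projection layout
        unsigned faceWidth;            // face size in luma samples
        unsigned faceHeight;
        struct FaceParams {
            int offsetX;               // face coordinate offset in the composed picture
            int offsetY;
            int rotationAngle;         // face rotation angle in degrees (e.g., 0/90/180/270)
        } faces[20];                   // sized for ISP, the largest layout mentioned (20 triangles)
    };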
In the present system, the 360-degree video stitching device 2504 may use an additional coordinate system, which projects the captured 360-degree video onto the 2D input picture coordinate system, to provide more freedom on the 360-degree video capture side when storing or transmitting. The 360-degree video stitching device 2504 may also support multiple projection formats for 360-degree video storage, compression, transmission, decoding, rendering, and so on. For example, the video stitching device 2504 may remove the overlap regions captured by the camera rig and output, for example, six view sequences, each covering a 90°×90° viewport.
The video encoding device 2506 may minimize the spatial discontinuities in the composed picture (i.e., the number of discontinuous face boundaries) for better spatial prediction, thereby optimizing the compression efficiency of the video compression. For cube map projection, for example, a preferred layout should have a minimal number of discontinuous face boundaries in the composed 360-degree video picture, e.g., 4. The video encoding device 2506 may implement unrestricted motion compensation (UMC) to optimize compression efficiency.
On the 360-degree video playback side, the 360-degree video playback device (not shown) may receive the compressed 360-degree video bitstream and decompress the 360-degree video bitstream. The 360-degree video playback device may render the 360-degree video in the different projection formats signaled in the 360-degree bitstream, rather than rendering video in a single, fixed projection format (e.g., ERP). In this respect, 360-degree video rendering is controlled not only by the view direction and FOV angles, but also by the projection format information decoded from the compressed 360-degree video bitstream.
In the present system, a default (recommended) view direction (i.e., view direction angles), FOV angles, and/or a system message carrying the rendered picture size may be signaled together with the 360-degree video content. The 360-degree video playback device may keep the rendered picture size unchanged but purposely reduce the active rendering area, in order to reduce rendering complexity and memory bandwidth requirements. The 360-degree video playback device may store the 360-degree video rendering settings (e.g., FOV angles, view direction angles, rendered picture size, etc.) just before playback ends or before switching to another program channel, so that the stored rendering settings can be used when playback of the same channel resumes. The 360-degree video playback device may provide a preview mode in which the viewing angle changes automatically every N frames to help the viewer select a desirable view direction. The 360-degree video capture and playback devices may compute the rendering map in real time (e.g., block by block) to save memory bandwidth. In this example, the rendering map may not be loaded from off-chip memory. In the present system, different view fidelity information can be assigned to different views.
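One way to realize the "store settings per channel and restore them on resume" behavior described above is sketched below; the class and field names are assumptions for illustration, and the default values are arbitrary.

    #include <map>

    struct RenderSettings {
        float yawDeg, pitchDeg, rollDeg;   // view direction angles
        float fovHorizDeg, fovVertDeg;     // field-of-view angles
        int   renderWidth, renderHeight;   // rendered picture size
    };

    class PlaybackState {
    public:
        // Called just before playback ends or the channel is switched.
        void saveOnExit(int channelId, const RenderSettings& s) { saved_[channelId] = s; }

        // Called when playback of a channel resumes.
        RenderSettings restoreOrDefault(int channelId) const {
            auto it = saved_.find(channelId);
            return it != saved_.end() ? it->second
                                      : RenderSettings{0, 0, 0, 90, 90, 1920, 1080};
        }
    private:
        std::map<int, RenderSettings> saved_;
    };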
In Figure 25, video is captured by a camera rig and stitched together into the equirectangular format. The video can then be compressed into any suitable video compression format (e.g., MPEG/ITU-T AVC/H.264, HEVC/H.265, VP9, etc.) and transmitted via a transmission link (e.g., cable, satellite, terrestrial, Internet streaming, etc.). On the receiving side, the video is decoded and stored in the equirectangular format; it is then rendered according to the view direction angles and field-of-view (FOV) angles, and displayed. In this system, the end user can control the FOV and view direction angles to watch the desired portion of the 360-degree video from the desired viewing angle.
Referring back to Figure 2, to make use of single-layer video codecs for video delivery over existing infrastructure, the 360-degree video segments captured by multiple cameras at different angles are usually stitched and composed into a single video sequence stored in a certain projection format, such as the widely used equirectangular projection format.
Besides the equirectangular projection format (ERP), there are many other projection formats that can represent a 360-degree video frame on a 2D rectangular image, such as cube map projection (CMP), icosahedron projection (ISP), and fisheye projection (FISHEYE).
Figure 26 conceptually illustrates an example of the cube map projection format. As shown in Figure 26, in CMP the sphere surface (e.g., 2602) is projected onto six cube faces (i.e., top, front, right, back, left, and bottom) (e.g., 2604), each face covering a 90×90-degree field of view, and the six cube faces are composed into a single image. Figure 27 conceptually illustrates an example of the fisheye projection format. In fisheye projection, the sphere surface (e.g., 2702) is projected onto two circles (e.g., 2704, 2706); each circle covers a 180×180-degree field of view. Figure 28 conceptually illustrates an example of the icosahedron projection format. In ISP, the sphere is mapped onto 20 triangles that are composed into a single image. Figure 29 illustrates a schematic diagram of examples of 360-degree video 2900 in a variety of projection formats. For example, layout (a) of Figure 29 depicts the ERP projection format (e.g., 2902), layout (b) of Figure 29 depicts the CMP projection format (e.g., 2904), layout (c) of Figure 29 depicts the ISP projection format (e.g., 2906), and layout (d) of Figure 29 depicts the FISHEYE projection format (e.g., 2908).
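For the cube map projection just described, the core operation is mapping a direction on the sphere to one of the six faces and a position within that face. The sketch below shows one common way to do this; the face numbering and sign conventions are assumptions, since the patent does not fix a particular layout.

    #include <cmath>

    enum CubeFace { FACE_RIGHT, FACE_LEFT, FACE_TOP, FACE_BOTTOM, FACE_FRONT, FACE_BACK };

    // Map a unit direction (x, y, z) on the sphere to a cube face and normalized
    // (u, v) coordinates in [-1, 1] on that face.
    void sphereToCubeFace(double x, double y, double z,
                          CubeFace* face, double* u, double* v) {
        double ax = std::fabs(x), ay = std::fabs(y), az = std::fabs(z);
        if (ax >= ay && ax >= az) {          // |x| dominates: right or left face
            *face = (x > 0) ? FACE_RIGHT : FACE_LEFT;
            *u = (x > 0 ? -z : z) / ax;
            *v = -y / ax;
        } else if (ay >= ax && ay >= az) {   // |y| dominates: top or bottom face
            *face = (y > 0) ? FACE_TOP : FACE_BOTTOM;
            *u = x / ay;
            *v = (y > 0 ? z : -z) / ay;
        } else {                             // |z| dominates: front or back face
            *face = (z > 0) ? FACE_FRONT : FACE_BACK;
            *u = (z > 0 ? x : -x) / az;
            *v = -y / az;
        }
        // Scale (u, v) to face pixel coordinates as needed when composing the image.
    }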
For the same 360-degree video content, different projection formats can lead to different compression efficiencies after compression with video compression standards such as MPEG/ITU AVC/H.264 or MPEG/ITU HEVC/H.265. Table 3 provides the BD-rate differences between ERP and CMP for a set of 4K 360-degree video test sequences. The PSNR and BD-rate differences are computed in the CMP domain, where a negative number means better compression efficiency with ERP and a positive number means better compression efficiency with CMP. Table 3 illustrates experimental results on the compression efficiency of ERP relative to CMP obtained with, for example, the MPEG/ITU HEVC/H.265 reference software HM16.9.
Table 3: Compression efficiency of ERP relative to CMP
As shown in Table 3, although ERP wins in overall compression efficiency, there are other cases (e.g., GT_Shriff-left, timelapse_basejump) in which using CMP gives better results. It may therefore be desirable to change the projection format from time to time to best fit the characteristics of the 360-degree video content, so as to achieve the best possible compression efficiency.
Figure 30 illustrates a schematic diagram of an example of a 360-degree video capture and playback system 3000 with adaptive projection format. To maximize the compression efficiency of 360-degree video, an adaptive method may be implemented that compresses the 360-degree video into mixed projection formats. As shown in Figure 30, a projection format decision block (e.g., 3002) is used on the 360-degree video capture/compression side to determine which projection format (e.g., ERP, CMP, ISP, etc.) best fits the current video segment (i.e., group of pictures) or current picture, so as to achieve the best possible compression efficiency. The decision may be made based on encoding statistics provided by the video encoder (e.g., 2506) (e.g., bit-rate distribution, inter/intra modes across the segment or picture, video quality measurements, etc.) and/or raw data statistics on the original 360-degree camera data (e.g., spatial activity distribution of the raw data, etc.). Once the projection format of the current segment or picture has been selected, the 360-degree video stitching block (e.g., 2504) stitches the video into the selected projection format, and the stitched 360-degree video is passed to the video encoder (e.g., 2506) for compression.
The selected projection format and the associated projection format parameters (e.g., projection format ID, number of faces in the projection layout, face sizes, face coordinate offsets, face rotation angles, etc.) are signaled in the compressed bitstream in any suitable manner, such as in an SEI (Supplemental Enhancement Information) message, in a sequence header, in a picture header, and so on. Unlike the system illustrated in Figure 25, the 360-degree video stitching block (e.g., 2504) may stitch the 360-degree video into the different projection formats determined by the projection format decision block (e.g., 3002), rather than stitching the video into a single, fixed projection format (e.g., ERP).
On the 360-degree video playback side, a receiver (e.g., 2508) receives the compressed 360-degree video bitstream and decompresses the video stream. Unlike the system illustrated in Figure 25, the 360-degree video rendering block (e.g., 2510) can render 360-degree video in the different projection formats signaled in the bitstream, rather than rendering video in a single, fixed projection format (e.g., ERP). That is, 360-degree video rendering is controlled not only by the view direction and FOV angles, but also by the projection format information decoded from the bitstream.
Figure 31 illustrates a schematic diagram of another example of a 360-degree video capture and playback system with projection formats that adapt across multiple channels, in accordance with one or more embodiments. In Figure 31, the 360-degree video capture and playback system 3100 may, for example, decode and render four channels of 360-degree video in parallel; the 360-degree video inputs may come from different sources (e.g., 3102-1, 3102-2, 3102-3, 3102-4) that are in different projection formats (e.g., 3104-1, 3104-2, 3104-3, 3104-4) and compression formats (e.g., 3106-1, 3106-2, 3106-3, 3106-4), and at different fidelities (e.g., picture resolution, frame rate, bit rate, etc.).
In one or more embodiments, channel 0 may be live video in an adaptive projection format and compressed with HEVC. In one or more embodiments, channel 1 may be live video in a fixed ERP format and compressed with MPEG-2. In one or more embodiments, channel 2 may be 360-degree video content in an adaptive projection format and compressed with VP9, but previously stored on a server for streaming. In one or more embodiments, channel 3 may be 360-degree video content in a fixed ERP format and compressed with H.264, but previously stored on a server for streaming.
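A table such as the following could describe the four example channels above; it is purely illustrative, and the enum values and field names are not taken from the patent.

    enum Codec      { CODEC_HEVC, CODEC_MPEG2, CODEC_VP9, CODEC_AVC };
    enum Projection { PROJ_ADAPTIVE, PROJ_ERP_FIXED };
    enum SourceType { SOURCE_LIVE, SOURCE_STORED_STREAMING };

    struct ChannelConfig {
        int        channelId;
        Codec      codec;
        Projection projection;
        SourceType source;
    };

    // One entry per channel in the Figure 31 example.
    static const ChannelConfig kChannels[] = {
        {0, CODEC_HEVC,  PROJ_ADAPTIVE,  SOURCE_LIVE},
        {1, CODEC_MPEG2, PROJ_ERP_FIXED, SOURCE_LIVE},
        {2, CODEC_VP9,   PROJ_ADAPTIVE,  SOURCE_STORED_STREAMING},
        {3, CODEC_AVC,   PROJ_ERP_FIXED, SOURCE_STORED_STREAMING},
    };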
For video decoding, the decoders (e.g., 3108-1, 3108-2, 3108-3, 3108-4) can decode video in different compression formats, and a decoder may be implemented in hardware (HW), with a programmable processor (SW), or in a HW/SW combination.
For 360-degree video rendering, the rendering engines (e.g., 3110-1, 3110-2, 3110-3, 3110-4) can perform viewport rendering from input 360-degree video in different projection formats (e.g., ERP, CMP, etc.), based on the view direction angles, FOV angles, input 360-degree video picture size, output viewport picture size, global rotation angles, and so on.
A rendering engine may be implemented in hardware (HW), with a programmable processor (e.g., a GPU), or in a HW/SW combination. In certain aspects, the output of the same video decoder can be fed into multiple rendering engines so that multiple viewports are rendered from the same 360-degree video input for display.
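As a simplified illustration of the viewport rendering performed by these engines, the sketch below renders a viewport from an ERP input with nearest-neighbor sampling, using only yaw and pitch (roll and the global rotation are omitted for brevity). It is a condensed stand-in for the multi-step coordinate transform the patent describes, not the patent's own pipeline; other projection formats would replace the direction-to-picture mapping step.

    #include <cmath>

    struct ViewportRequest {
        float yawDeg, pitchDeg;          // view direction angles (roll omitted)
        float fovHorizDeg, fovVertDeg;   // field-of-view angles
        int   outWidth, outHeight;       // output viewport picture size
    };

    void renderViewportFromErp(const unsigned char* erp, int inWidth, int inHeight,
                               const ViewportRequest& req, unsigned char* out) {
        const double kPi = 3.141592653589793;
        const double d2r = kPi / 180.0;
        double halfW = std::tan(0.5 * req.fovHorizDeg * d2r);
        double halfH = std::tan(0.5 * req.fovVertDeg * d2r);
        double yaw = req.yawDeg * d2r, pitch = req.pitchDeg * d2r;
        for (int j = 0; j < req.outHeight; ++j) {
            for (int i = 0; i < req.outWidth; ++i) {
                // Point on the normalized projection plane for this output pixel.
                double px = (2.0 * (i + 0.5) / req.outWidth - 1.0) * halfW;
                double py = (1.0 - 2.0 * (j + 0.5) / req.outHeight) * halfH;
                // Rotate the viewing-space direction (z forward) by pitch, then yaw.
                double x0 = px, y0 = py, z0 = 1.0;
                double y1 = y0 * std::cos(pitch) - z0 * std::sin(pitch);
                double z1 = y0 * std::sin(pitch) + z0 * std::cos(pitch);
                double x2 = x0 * std::cos(yaw) + z1 * std::sin(yaw);
                double z2 = -x0 * std::sin(yaw) + z1 * std::cos(yaw);
                // Direction to ERP picture coordinates (longitude/latitude).
                double lon = std::atan2(x2, z2);
                double lat = std::atan2(y1, std::sqrt(x2 * x2 + z2 * z2));
                int u = static_cast<int>((lon / (2.0 * kPi) + 0.5) * inWidth) % inWidth;
                int v = static_cast<int>((0.5 - lat / kPi) * inHeight);
                if (v < 0) v = 0; if (v >= inHeight) v = inHeight - 1;
                out[j * req.outWidth + i] = erp[v * inWidth + u];   // luma only
            }
        }
    }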
Figure 32 illustrates a schematic diagram of an example of a projection format decision 3200. Although 360-degree video playback (decompression plus rendering) is relatively fixed, there are a variety of ways to produce the compressed 360-degree video stream, and the compressed stream achieves maximum compression efficiency by adaptively changing the projection format from time to time. Figure 32 provides an example of how the projection format decision (e.g., 3002) can be produced. In this example, the 360-degree video content is provided in multiple projection formats (e.g., ERP, CMP, and ISP) (e.g., 3202); the same content in the different formats is compressed with the same type of video encoder (e.g., MPEG/ITU-T HEVC/H.265) (e.g., 3204), and the compressed bitstreams are stored (e.g., 3206). After the segments or pictures have been encoded in the different projection formats, the rate-distortion cost of those segments or pictures can be measured for each projection format (e.g., the PSNR at the same bit rate, the bit rate at the same quality, or a combined metric). After the projection format decision (e.g., 3002) is made based on the rate-distortion cost of the current segment or picture, the corresponding bitstream of the selected format for that segment or picture is selected (e.g., 3208) and spliced into the mixed-projection-format bitstream (e.g., 3210). It should be noted that, in such a system, the projection format can change from video segment (group of pictures) to video segment or from picture to picture.
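The per-segment decision of Figure 32 can be summarized as a rate-distortion comparison over the candidate encodes; the sketch below shows that comparison with assumed data structures and a simple Lagrangian cost, and it presumes at least one candidate is supplied.

    #include <cstddef>
    #include <vector>

    struct CandidateEncode {
        int    projectionFormatId;                 // ERP, CMP, ISP, ...
        double bitrateKbps;                        // measured bit rate of this segment
        double distortion;                         // e.g., distortion measured in a common domain
        std::vector<unsigned char> bitstream;      // compressed segment
    };

    // Lagrangian cost J = D + lambda * R; lambda is a tuning parameter.
    static double rdCost(const CandidateEncode& c, double lambda) {
        return c.distortion + lambda * c.bitrateKbps;
    }

    // Pick the cheapest candidate; its bitstream is spliced into the
    // mixed-projection-format output stream.
    const CandidateEncode& selectProjectionFormat(
            const std::vector<CandidateEncode>& candidates, double lambda) {
        std::size_t best = 0;
        for (std::size_t i = 1; i < candidates.size(); ++i)
            if (rdCost(candidates[i], lambda) < rdCost(candidates[best], lambda))
                best = i;
        return candidates[best];
    }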
Figure 33 conceptually illustrates an example of a projection format transition 3300 that excludes inter prediction across the projection format transition boundary. To support mixed-projection-format 360-degree video with existing video compression standards (e.g., MPEG/ITU AVC/H.264, MPEG/ITU HEVC/H.265, and Google VP9), a restriction must be imposed on how frequently the projection format can change. As shown in Figure 33, a projection format transition 3300 (in this example, from ERP to CMP or from CMP to ERP) can occur only at a random access point (RAP), so that inter prediction does not need to cross the projection format transition boundary. A RAP may be led by an instantaneous decoding refresh (IDR) picture or another type of picture that provides random access capability. In Figure 33, the projection format can only change from video segment to video segment, not from picture to picture, unless a segment is itself a single picture.
Figure 34 conceptually illustrates an example of a projection format transition 3400 with inter prediction across the projection format transition boundary. If inter prediction may cross the projection format transition boundary, then the projection format can also change from picture to picture. For the same 360-degree video content, different projection formats may result, after 360-degree video stitching, not only in different picture content but also in different picture resolutions. As shown in Figure 34, projection format conversion of the reference pictures in the DPB (decoded picture buffer) (e.g., 3400) can be used to support inter prediction across a projection format boundary. The projection format conversion can convert the 360-degree video reference pictures in the DPB from one format (e.g., ERP) into the projection format (e.g., CMP) and picture resolution of the current picture. The conversion can be implemented on a per-picture basis, in which the reference pictures in the projection format of the current picture are converted and stored in advance, or it can be carried out block by block in real time based on the size and position of a block in the current picture and the motion data of the current prediction block. In Figure 34, the projection format can change from picture to picture, but reference picture projection format conversion is required, which is a tool that a future video compression standard would need to support.
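A picture-based version of the DPB conversion described for Figure 34 might look like the sketch below; the helper that actually resamples between projection formats is stubbed out, and all names are assumptions.

    #include <cstddef>
    #include <vector>

    struct PictureDesc {
        int projectionFormatId;
        int width, height;
    };

    struct ReferencePicture {
        PictureDesc desc;
        std::vector<unsigned char> samples;
    };

    // Stub: a real implementation would resample src (e.g., ERP) into the target
    // projection format and resolution here.
    ReferencePicture convertProjection(const ReferencePicture& src, const PictureDesc& target) {
        ReferencePicture out;
        out.desc = target;
        out.samples.assign(static_cast<std::size_t>(target.width) * target.height, 0);
        (void)src;
        return out;
    }

    // Before decoding a picture, convert any DPB reference whose projection format
    // or resolution differs from the current picture (picture-based conversion).
    void prepareDpbForCurrentPicture(std::vector<ReferencePicture>& dpb,
                                     const PictureDesc& current) {
        for (ReferencePicture& ref : dpb) {
            if (ref.desc.projectionFormatId != current.projectionFormatId ||
                ref.desc.width != current.width || ref.desc.height != current.height) {
                ref = convertProjection(ref, current);
            }
        }
    }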
Suggested views
Several services (e.g., YouTube, Facebook, etc.) have recently begun to offer 360° video sequences. These services allow the user to look around in all directions during video playback. The user can rotate the scene to watch whatever is of interest at a given time.
Several formats exist for 360° video, but each format involves projecting a 3D surface (sphere, cube, octahedron, icosahedron, etc.) onto a 2D plane. The 2D projection is then encoded/decoded like any normal video sequence. At the decoder, a portion of that 360° view is rendered and displayed depending on the user's viewing angle at that moment.
The end result gives the user the freedom to look everywhere around them, which greatly increases the sense of immersion, so that they experience the scene as if they were in it. Combined with special audio effects (rotating surround sound that matches the video), the effect can be quite compelling.
Figure 35 illustrates an example network environment 3500 in which suggested views for 360-degree video may be implemented, in accordance with one or more embodiments. Not all of the depicted components may be used, however, and one or more embodiments may include additional components not shown in the figure. Changes may be made in the arrangement and type of the components without departing from the spirit or scope of the claims as set forth herein. Additional components, different components, or fewer components may be provided.
The example network environment 3500 includes a 360-degree video capture device 3502, a 360-degree video stitching device 3504, a video encoding device 3506, a video decoding device 3508, and a 360-degree video rendering device 3510. In one or more embodiments, one or more of the devices 3502, 3504, 3506, 3508, 3510 may be combined into the same physical device. For example, the 360-degree video capture device 3502, the 360-degree video stitching device 3504, and the video encoding device 3506 may be combined into a single device, and the video decoding device 3508 and the 360-degree video rendering device 3510 may be combined into a single device. In some embodiments, the video decoding device 3508 may include an audio decoding device (not shown), or, in other embodiments, the video decoding device 3508 may be communicatively coupled to a separate audio decoding device, for processing an incoming or stored compressed 360-degree video bitstream.
On the 360-degree video playback side, the network environment 3500 may further include a demultiplexer device (not shown) that can demultiplex the incoming compressed 360-degree video bitstream and provide the demultiplexed bitstreams respectively to the video decoding device 3508, the audio decoding device, and a viewing angle extraction device (not shown). In certain aspects, the demultiplexer device may be configured to decompress the 360-degree video bitstream. The network environment 3500 may further include a 360-degree video projection format conversion device (not shown), which may perform 360-degree video projection format conversion before video encoding by the video encoding device 3506 and/or after video decoding by the video decoding device 3508. The network environment 3500 may also include a 360-degree video playback device (not shown) that plays back the rendered 360-degree video content. In one or more embodiments, the video encoding device 3506 may be communicatively coupled to the video decoding device 3508 via a transmission link (e.g., over a network).
The 360-degree video playback device may store the 360-degree video rendering settings (e.g., FOV angles, view direction angles, rendered picture size, etc.) just before playback ends or before switching to another program channel, so that the stored rendering settings can be used when playback of the same channel resumes. The 360-degree video playback device may provide a preview mode in which the viewing angle changes automatically every N frames to help the viewer select a desirable view direction. The 360-degree video capture and playback devices may compute the rendering map in real time (e.g., block by block) to save memory bandwidth. In this example, the rendering map may not be loaded from off-chip memory. In the present system, different view fidelity information can be assigned to different views.
In the present system, a content provider supplies a "suggested view" for a given 360-degree video. This suggested view may be a specific set of viewing angles for each frame in the 360-degree video, provided to the user as a recommended experience. At any given point in time when the user does not have a particular interest in controlling the view, the user may simply watch (or play back) the suggested view and experience the view recommended by the content provider.
In the present system, to efficiently store/record a specific view watched by the user during a specific viewing of a decompressed 360-degree video bitstream, the "pitch", "yaw", and "roll" angles (the viewing-angle data) of each frame can be saved in a storage device. Combined with the full 360-degree view data originally recorded, the previously saved view can be re-created.
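Saving the per-frame angles could be as simple as the sketch below, which appends one fixed-size record per displayed frame; the record layout and file handling are assumptions for illustration.

    #include <cstdint>
    #include <cstdio>
    #include <vector>

    struct FramePose {
        uint32_t frameNumber;
        float    pitchDeg;   // "pitch"
        float    yawDeg;     // "yaw"
        float    rollDeg;    // "roll"
    };

    // One pose per displayed frame; this file plus the original 360-degree video
    // bitstream is enough to re-create the same view later.
    void savePoses(const std::vector<FramePose>& poses, const char* path) {
        if (FILE* f = std::fopen(path, "wb")) {
            std::fwrite(poses.data(), sizeof(FramePose), poses.size(), f);
            std::fclose(f);
        }
    }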
Depending on how the viewing angles are stored, a corresponding viewing angle extraction process can be initialized. In one or more embodiments in which the viewing angles are stored in the compressed 360-degree video bitstream, a viewing angle extraction process may be used. For example, the video decoding device 3508 and/or the audio decoding device can extract the viewing angles from the compressed 360-degree video bitstream (e.g., from SEI messages in the HEVC bitstream). In this respect, the viewing angles extracted by the video decoding device 3508 can then be provided to the video rendering device 3510. If the viewing angles are stored in a separate data stream (e.g., an MPEG-2 TS PID), the demultiplexer device can extract this information and send it to the video rendering device 3510 as the suggested viewing angles. In some examples, the demultiplexer feeds a separate viewing angle extraction device (not shown) to extract the suggested viewing angles. In this respect, the system should have the ability to switch at any time between the suggested view and a view selected manually by the user.
Switching between the suggested view and a manual view may include prompting the user to provide a selection (e.g., pressing a button) for entering/exiting this mode. In one or more embodiments, the system can perform the switch automatically. For example, if the user moves the view manually (with a mouse, remote control, gesture, handle, etc.), the view is updated to follow the user's request. If the user stops making manual adjustments for a set amount of time, the view drifts back to the suggested view.
In one or more embodiments, multiple suggested views may be provided where appropriate, and/or more than one suggested view may be presented at a time. For example, for a football game, one view may track the quarterback and another view may track a wide receiver. Using the football example above, the user can watch a split screen with 4 views at once. Alternatively, different views may be used to track specific cars during a NASCAR racing event. The user can make selections among these suggested views to customize the experience, without having to fully control the view all the time.
If a suggested view of the whole scene is unavailable or inappropriate, a suggestion (or recommendation) can be provided to try to ensure that the viewer will not miss important action. A hint view (or preview) can be provided when a new scene starts. The view can then shift toward the hinted angle so as to center on the main action of the view. In one or more embodiments, if something less direct (or less intrusive) is desired, graphic arrows on the screen can be used to point out that the user may be facing the wrong way and missing something of interest.
Both of the most common projection types are the equirectangular projection and the cube map projection. These projections map the video from a sphere (equirectangular) or a cube (cube map) onto a flat 2D plane. Examples are shown in Figure 36, which illustrates an example of the equirectangular projection (e.g., 3602) and the cube map projection (e.g., 3604).
Figure 37 conceptually illustrates an example of 360-degree video rendering 3700. At present, 360-degree video content from most streaming services (YouTube, Facebook, etc.) is watched on a computer or smartphone. However, it is expected that in the near future 360-degree video will be played on standard cable/satellite networks. Sporting events, travel programs, extreme sports, and many other types of programs can be shown in 360° video to increase interest and engagement.
Although 360-degree video applications can be a fun way to immerse oneself in a scene, for longer sequences the requirement to always control the view manually in order to track the main object of interest can often become tedious. For example, looking around once in a while during a sporting event may be interesting, but for most of the game the user just wants to watch the center of the action.
For this purpose, the content provider can provide a "suggested view". This suggested view should be a specific set of viewing angles for each frame, provided to give the user a recommended experience. At any given point in time when the user is not particularly interested in controlling the view, the user can simply watch the suggested view and experience the view recommended by the content provider.
The user's viewing angle at any given time can be conceptualized as being recorded as three different angles in 3D space. These different angles are the so-called Euler angles. In flight dynamics, these three angles are referred to as "pitch", "yaw", and "roll". Referring back to Figure 10, the "pitch", "yaw", and "roll" angles of each frame can be encoded as the components of the suggested view for the viewer. The decoder (e.g., 3508) can extract and use these suggested viewing angles at any time when the user is not interested in controlling the view.
This viewing-angle data can be stored in any number of ways. For example, the angles can be inserted into the video stream as picture user data (an AVC/HEVC Supplemental Enhancement Information message, i.e., an SEI message), or they can be carried as a separate data stream in the video sequence (a distinct MPEG-2 TS PID or MP4 data stream).
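As an illustration of the first option, the sketch below packs one frame's suggested-view angles into a small fixed-size payload that could travel as picture user data; the byte layout is an assumption and is not a payload defined by AVC/HEVC or MPEG-2 TS.

    #include <cstddef>
    #include <cstdint>
    #include <cstring>

    struct SuggestedViewPayload {
        int32_t pitchMilliDeg;   // pitch in thousandths of a degree
        int32_t yawMilliDeg;     // yaw
        int32_t rollMilliDeg;    // roll
    };

    // Pack one frame's suggested-view angles into 'out'; returns the payload size.
    std::size_t packSuggestedView(float pitchDeg, float yawDeg, float rollDeg, uint8_t* out) {
        SuggestedViewPayload p;
        p.pitchMilliDeg = static_cast<int32_t>(pitchDeg * 1000.0f);
        p.yawMilliDeg   = static_cast<int32_t>(yawDeg   * 1000.0f);
        p.rollMilliDeg  = static_cast<int32_t>(rollDeg  * 1000.0f);
        std::memcpy(out, &p, sizeof(p));   // a real payload would fix the byte order
        return sizeof(p);
    }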
Depending on how the viewing angles are stored, a corresponding viewing angle extraction mechanism may be needed. For example, if the angles are stored in the video bitstream as SEI messages, the video decoder can extract the viewing angles. If the angles are stored in a separate MPEG-2 TS PID, the demultiplexer can extract this information and send it to the rendering process. The 360-degree video rendering system (e.g., 3500) will have the ability to switch at any time between the suggested view and a viewing angle selected manually by the user.
Figure 38 illustrates a schematic diagram with suggested viewing angle extraction and rendering. Switching between the suggested view and the manual view can be as simple as pressing a button to enter/exit this mode. In other aspects, the switch can be performed automatically. For example, if the user moves the view manually (with a mouse, remote control, gesture, handle, etc.), the view is updated to follow the user's request. If the user stops making manual adjustments for a set amount of time, the view drifts back to the suggested view.
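The automatic switch with an inactivity timeout could be structured as in the sketch below; for brevity it snaps back to the suggested view rather than drifting, and the class and method names are assumptions.

    struct ViewAngles { float pitchDeg, yawDeg, rollDeg; };

    class ViewSelector {
    public:
        explicit ViewSelector(double inactivityTimeoutSec)
            : timeoutSec_(inactivityTimeoutSec), lastManualTimeSec_(-1.0) {}

        // Called whenever the user moves the view manually.
        void onManualInput(const ViewAngles& manual, double nowSec) {
            manualView_ = manual;
            lastManualTimeSec_ = nowSec;
        }

        // Called once per frame; returns the angles to render with.
        ViewAngles currentView(const ViewAngles& suggested, double nowSec) const {
            bool manualActive = lastManualTimeSec_ >= 0.0 &&
                                (nowSec - lastManualTimeSec_) < timeoutSec_;
            return manualActive ? manualView_ : suggested;
        }

    private:
        double     timeoutSec_;
        double     lastManualTimeSec_;
        ViewAngles manualView_{0.0f, 0.0f, 0.0f};
    };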
Multiple suggested views can be provided where appropriate. For example, for a football game, one view may track the quarterback and another view may track a wide receiver. In other aspects, different views may be used to track a specific car during a NASCAR racing event. The user can make selections among these suggested views to customize the experience, without having to fully control the view all the time.
More than one suggested view can be presented at a time. Using the football example above, the user can watch a split screen with 4 views at once: one on the quarterback, one on a wide receiver, while the user controls the other two manually, and so on.
If a suggested view of the whole scene is unavailable or inappropriate, a hint can be provided to try to ensure that the viewer will not miss important action. A hint view can be provided when a new scene starts. The view can then shift toward the hinted angle so as to center on the main action of the view. In other aspects, if something less direct is desired, something such as a graphic arrow on the screen can be used to point out that the user may be facing the wrong way and missing something interesting.
Most 360-degree video applications do not allow the user to adjust the "roll" angle. The camera is normally held in a vertical orientation. The view can rotate up/down and left/right, but does not tilt sideways, and so on. In this respect, a system that suggests only two viewing angles will be sufficient for most uses.
It should be noted that not all "360 video" streams actually cover the full 360°×180° field of view. Some sequences only allow viewing in the forward direction (180°×180°). In some examples, some may have tighter limits on how far up or down you can look. All of these cases are covered by the same concepts discussed herein.
In one or more embodiments, a default (recommended) view direction (i.e., view direction angles), FOV angles, and a system message carrying the rendered picture size can be signaled together with the 360-degree video content.
In one or more embodiments, the 360-degree video playback system supports a scan mode in which the viewing angle can change automatically every N frames to help the viewer select a desirable view direction. For example, in the automatic scan mode, the vertical viewing angle γ and the viewing angle ε about the z-axis are first fixed to 0 degrees, and the horizontal viewing angle changes once every N frames; after the viewer has selected a horizontal viewing angle θ, the horizontal viewing angle is fixed to the selected angle and the viewing angle ε about the z-axis remains fixed to 0 degrees, while the vertical viewing angle starts to change once every N frames until the viewer has selected a vertical viewing angle γ; after both the horizontal viewing angle θ and the vertical viewing angle γ have been selected, those two angles are fixed to the selected values, and the viewing angle ε about the z-axis starts to change once every N frames until the viewer has selected the viewing angle ε. In some embodiments, the scan mode can scan the different viewing angles in parallel, or, in other embodiments, the scan mode can scan the different viewing angles sequentially. In certain aspects, the viewing angles may be limited (or constrained) by a user profile or by the type of user (e.g., child, adult). In this example, 360-degree video content governed by parental control settings can be limited to, for example, a subset of viewing angles indicated by the settings. In certain aspects, the 360-degree video bitstream includes metadata indicating the frame trajectory of the scan mode.
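The sequential variant of the scan mode described above can be captured by a small state machine such as the following sketch; the stage names, step size, and lock-in signal are assumptions for illustration.

    struct ScanState {
        enum Stage { SCAN_HORIZONTAL, SCAN_VERTICAL, SCAN_Z_AXIS, DONE } stage = SCAN_HORIZONTAL;
        float thetaDeg   = 0.0f;   // horizontal viewing angle
        float gammaDeg   = 0.0f;   // vertical viewing angle
        float epsilonDeg = 0.0f;   // viewing angle about the z-axis
    };

    // Called once per frame; 'userLockedCurrentAngle' is true on the frame where
    // the viewer confirms the angle currently being scanned.
    void updateScan(ScanState& s, unsigned frameIndex, unsigned N,
                    float stepDeg, bool userLockedCurrentAngle) {
        if (s.stage == ScanState::DONE) return;
        if (userLockedCurrentAngle) {
            s.stage = static_cast<ScanState::Stage>(s.stage + 1);   // move to the next angle
            return;
        }
        if (frameIndex % N != 0) return;                            // change once every N frames
        switch (s.stage) {
            case ScanState::SCAN_HORIZONTAL: s.thetaDeg   += stepDeg; break;
            case ScanState::SCAN_VERTICAL:   s.gammaDeg   += stepDeg; break;
            case ScanState::SCAN_Z_AXIS:     s.epsilonDeg += stepDeg; break;
            default: break;
        }
    }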
In one or more embodiments, a multi-view layout format can also preserve the viewpoints of interest by indicating the importance of the views in the sequence. Views can be assigned different view fidelities (resolution, frame rate, bit rate, FOV angle size, etc.). A system message is used to signal the view fidelity information.
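A system message carrying the view fidelity information mentioned above could resemble the following record; all fields and widths are assumptions for the sketch.

    #include <cstdint>

    struct ViewFidelityInfo {
        uint8_t  viewId;
        uint8_t  importance;       // relative importance of the view in the sequence
        uint16_t width, height;    // resolution assigned to this view
        uint16_t frameRate;        // frames per second
        uint32_t bitrateKbps;      // bit rate assigned to this view
        uint16_t fovHorizDeg;      // FOV angle size
        uint16_t fovVertDeg;
    };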
Figure 39 conceptually illustrates an electronic system 3900 with which one or more embodiments of the present technology may be implemented. The electronic system 3900 may be, for example, a network device, a media converter, a desktop computer, a laptop computer, a tablet computer, a server, a switch, a router, a base station, a receiver, a phone, or generally any electronic device that transmits signals over a network. The electronic system 3900 includes various types of computer-readable media and interfaces for various other types of computer-readable media. In one or more embodiments, the electronic system 3900 may be, or may include, one or more of the devices 102, 104, 106, 108, 110, the 360-degree video projection format conversion device, and/or the 360-degree video playback device. The electronic system 3900 includes a bus 3908, one or more processing units 3912, a system memory 3904, a read-only memory (ROM) 3910, a permanent storage device 3902, an input device interface 3914, an output device interface 3906, and a network interface 3916, or subsets and variations thereof.
The bus 3908 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 3900. In one or more embodiments, the bus 3908 communicatively connects the one or more processing units 3912 with the ROM 3910, the system memory 3904, and the permanent storage device 3902. From these various memory units, the one or more processing units 3912 retrieve instructions to execute and data to process in order to perform the processes of the present disclosure. In various embodiments, the one or more processing units 3912 can be a single processor or a multi-core processor.
The ROM 3910 stores static data and instructions that are needed by the one or more processing units 3912 and other modules of the electronic system. The permanent storage device 3902, on the other hand, is a read-and-write memory device. The permanent storage device 3902 is a non-volatile memory unit that stores instructions and data even when the electronic system 3900 is powered off. One or more embodiments of the present disclosure use a mass storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 3902.
Other embodiments use a removable storage device (such as a floppy disk or a flash drive, and its corresponding disk drive) as the permanent storage device 3902. Like the permanent storage device 3902, the system memory 3904 is a read-and-write memory device. However, unlike the permanent storage device 3902, the system memory 3904 is a volatile read-and-write memory, such as random access memory. The system memory 3904 stores any of the instructions and data that the one or more processing units 3912 need at runtime. In one or more embodiments, the processes of the present disclosure are stored in the system memory 3904, the permanent storage device 3902, and/or the ROM 3910. From these various memory units, the one or more processing units 3912 retrieve instructions to execute and data to process in order to perform the processes of one or more embodiments.
The bus 3908 also connects to the input device interface 3914 and the output device interface 3906. The input device interface 3914 enables a user to communicate information and select commands to the electronic system. Input devices used with the input device interface 3914 include, for example, alphanumeric keyboards and pointing devices (also called "cursor control devices"). The output device interface 3906 enables, for example, the display of images generated by the electronic system 3900. Output devices used with the output device interface 3906 include, for example, printers and display devices, such as a liquid crystal display (LCD), a light emitting diode (LED) display, an organic light emitting diode (OLED) display, a flexible display, a flat panel display, a solid state display, a projector, or any other device for outputting information. One or more embodiments may include devices that function as both input and output devices, such as a touchscreen. In these embodiments, the feedback provided to the user can be any form of sensory feedback, such as visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input.
Finally, as shown in Figure 39, the bus 3908 also couples the electronic system 3900 to one or more networks (not shown) through one or more network interfaces 3916. In this manner, the computer can be a part of one or more networks of computers (such as a local area network ("LAN"), a wide area network ("WAN"), or an intranet), or a network of networks (such as the Internet). Any or all components of the electronic system 3900 can be used in conjunction with the present disclosure.
Implementations within the scope of the present disclosure can be partially or entirely realized using a tangible computer-readable storage medium (or multiple tangible computer-readable storage media of one or more types) encoding one or more instructions. The tangible computer-readable storage medium may also be non-transitory in nature.
The computer-readable storage medium can be any storage medium that can be read, written, or otherwise accessed by a general purpose or special purpose computing device, including any processing electronics and/or processing circuitry capable of executing instructions. For example, and without limitation, the computer-readable medium can include any volatile semiconductor memory, such as RAM, DRAM, SRAM, T-RAM, Z-RAM, and TTRAM. The computer-readable medium can also include any non-volatile semiconductor memory, such as ROM, PROM, EPROM, EEPROM, NVRAM, flash memory, nvSRAM, FeRAM, FeTRAM, MRAM, PRAM, CBRAM, SONOS, RRAM, NRAM, racetrack memory, FJG, and Millipede memory.
Further, the computer-readable storage medium can include any non-semiconductor memory, such as optical disk storage, magnetic disk storage, magnetic tape, other magnetic storage devices, or any other medium capable of storing one or more instructions. In some implementations, the computer-readable storage medium can be directly coupled to a computing device, while in other implementations the computer-readable storage medium can be indirectly coupled to a computing device, e.g., via one or more wired connections, one or more wireless connections, or any combination thereof.
Instructions can be directly executable or can be used to develop executable instructions. For example, instructions can be realized as executable or non-executable machine code, or as instructions in a high-level language that can be compiled to produce executable or non-executable machine code. Further, instructions also can be realized as, or can include, data. Computer-executable instructions also can be organized in any format, including routines, subroutines, programs, data structures, objects, modules, applications, applets, functions, and so on. As recognized by those of skill in the art, details including, but not limited to, the number, structure, sequence, and organization of instructions can vary significantly without varying the underlying logic, function, processing, and output.
While the above discussion primarily refers to microprocessors or multi-core processors that execute software, one or more embodiments are performed by one or more integrated circuits, such as application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs). In one or more embodiments, such integrated circuits execute instructions that are stored on the circuit itself.
Those of skill in the art would appreciate that the various illustrative blocks, modules, elements, components, methods, and algorithms described herein may be implemented as electronic hardware, computer software, or combinations of both. To illustrate this interchangeability of hardware and software, various illustrative blocks, modules, elements, components, methods, and algorithms have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application. Various components and blocks may be arranged differently (e.g., arranged in a different order, or partitioned in a different way), all without departing from the scope of the subject technology.
It is understood that any specific order or hierarchy of blocks in the processes disclosed is an illustration of example approaches. Based upon design preferences, it is understood that the specific order or hierarchy of blocks in the processes may be rearranged, or that not all illustrated blocks need be performed. Any of the blocks may be performed simultaneously. In one or more embodiments, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
As used in this specification and any claims of this application, the terms "base station", "receiver", "computer", "server", "processor", and "memory" all refer to electronic or other technological devices. These terms exclude people or groups of people. For the purposes of the specification, the terms "display" or "displaying" mean displaying on an electronic device.
As used herein, the phrase "at least one of" preceding a series of items, with the term "and" or "or" to separate any of the items, modifies the list as a whole, rather than each member of the list (i.e., each item). The phrase "at least one of" does not require selection of at least one of each item listed; rather, the phrase allows a meaning that includes at least one of any one of the items, and/or at least one of any combination of the items, and/or at least one of each of the items. By way of example, the phrases "at least one of A, B, and C" or "at least one of A, B, or C" each refer to only A, only B, or only C; any combination of A, B, and C; and/or at least one of each of A, B, and C.
The predicates "configured to", "operable to", and "programmed to" do not imply any particular tangible or intangible modification of a subject, but, rather, are intended to be used interchangeably. In one or more embodiments, a processor configured to monitor and control an operation or a component may also mean the processor being programmed to monitor and control the operation, or the processor being operable to monitor and control the operation. Likewise, a processor configured to execute code can be construed as a processor programmed to execute code or operable to execute code.
Phrases such as an aspect, the aspect, another aspect, some aspects, one or more aspects, an implementation, the implementation, another implementation, some implementations, one or more implementations, an embodiment, the embodiment, another embodiment, some embodiments, one or more embodiments, a configuration, the configuration, another configuration, some configurations, one or more configurations, the subject technology, the disclosure, the present disclosure, and other variations thereof and alike are for convenience and do not imply that a disclosure relating to such phrase(s) is essential to the subject technology or that such disclosure applies to all configurations of the subject technology. A disclosure relating to such phrase(s) may apply to all configurations, or to one or more configurations. A disclosure relating to such phrase(s) may provide one or more examples. A phrase such as an aspect or some aspects may refer to one or more aspects and vice versa, and this applies similarly to the other foregoing phrases.
The word "exemplary" is used herein to mean "serving as an example, instance, or illustration". Any embodiment described herein as "exemplary" or as an "example" is not necessarily to be construed as preferred or advantageous over other embodiments. Furthermore, to the extent that the term "include", "have", or the like is used in the description or the claims, such term is intended to be inclusive in a manner similar to the term "comprise" as "comprise" is interpreted when employed as a transitional word in a claim.
All structural and functional equivalents to the elements of the various aspects described throughout this disclosure that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the claims. Moreover, nothing disclosed herein is intended to be dedicated to the public, regardless of whether such disclosure is explicitly recited in the claims. No claim element is to be construed under the provisions of 35 U.S.C. § 112, sixth paragraph, unless the element is expressly recited using the phrase "means for" or, in the case of a method claim, the element is recited using the phrase "step for".
The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein, but are to be accorded the full scope consistent with the claim language, wherein reference to an element in the singular is not intended to mean "one and only one" unless specifically so stated, but rather "one or more". Unless specifically stated otherwise, the term "some" refers to one or more. Masculine pronouns (e.g., his) include the feminine and neuter gender (e.g., her and its) and vice versa. Headings and subheadings, if any, are used for convenience only and do not limit the invention.

Claims (20)

1. A system, comprising:
a video capture device configured to capture 360-degree video;
a stitching device configured to:
stitch the captured 360-degree video using an intermediate coordinate system between an input picture coordinate system and a 360-degree video capture coordinate system; and
an encoding device configured to:
encode the stitched 360-degree video into a 360-degree video bitstream; and
prepare the 360-degree video bitstream for transmission and storage for playback.
2. The system according to claim 1, wherein the stitching device is configured to:
calculate a normalized projection plane size using field-of-view angles;
calculate normalized rendering coordinates in a rendering coordinate system from an output rendering picture coordinate system using the normalized projection plane size;
map the coordinates to a viewing coordinate system using the normalized projection plane size;
transform the coordinates from the viewing coordinate system to a capture coordinate system using a coordinate transform matrix;
transform the coordinates from the capture coordinate system to the intermediate coordinate system using the coordinate transform matrix;
transform the coordinates from the intermediate coordinate system to a normalized projection coordinate system; and
map the coordinates from the normalized projection coordinate system to the input picture coordinate system.
3. The system according to claim 2, wherein the coordinate transform matrix is precomputed using view direction angles and a global rotation angle.
4. The system according to claim 3, wherein the global rotation angle is signaled in a message included in the 360-degree video bitstream.
5. The system according to claim 1, wherein the encoding device is further configured to:
encode the stitched 360-degree video into each of a plurality of view sequences, the plurality of view sequences corresponding to different view regions of the 360-degree video.
6. The system according to claim 5, wherein at least two of the plurality of view sequences are encoded in different projection layout formats.
7. The system according to claim 5, the system further comprising:
a rendering device configured to receive the plurality of view sequences as input and to render each of the plurality of view sequences using a rendering control input.
8. The system according to claim 7, wherein the rendering device is further configured to select at least one of the plurality of view sequences for rendering and to exclude at least one of the plurality of view sequences from the rendering.
9. The system according to claim 1, wherein the encoding device is further configured to include unrestricted motion compensation signaling in the 360-degree video bitstream to indicate one or more pixels in a view that exceed a picture boundary of the view.
10. The system according to claim 9, wherein the unrestricted motion compensation signaling is located in a sequence header or a picture header of the 360-degree video bitstream.
11. The system according to claim 1, wherein a relationship between the intermediate coordinate system and the input picture coordinate system is defined by counterclockwise rotation angles about one or more axes.
12. The system according to claim 1, wherein the stitching device is further configured to:
convert an input projection format of the 360-degree video into an output projection format that differs from the input projection format.
13. A system, comprising:
a video capture device configured to capture 360-degree video;
a stitching device configured to stitch the captured 360-degree video into at least two different projection formats using a projection format determination; and
an encoding device configured to:
encode the stitched 360-degree video into a 360-degree video bitstream, the 360-degree video bitstream including signaling that indicates the at least two different projection formats; and
prepare the 360-degree video bitstream for transmission for playback.
14. The system according to claim 13, wherein the projection format determination is based on raw data statistics associated with the 360-degree video and encoding statistics from the encoding device.
15. The system according to claim 13, the system further comprising:
a decoding device configured to receive the 360-degree video bitstream as input and to decode projection format information from the signaling in the 360-degree video bitstream, the projection format information indicating the at least two different projection formats.
16. The system according to claim 15, the system further comprising:
a rendering device configured to receive the 360-degree video bitstream as input from the decoding device and to render view sequences from the 360-degree video bitstream having the at least two different projection formats.
17. A system, comprising:
a video capture device configured to capture 360-degree video;
a stitching device configured to stitch the captured 360-degree video with a plurality of viewing angles;
an encoding device configured to encode the stitched 360-degree video into a 360-degree video bitstream;
a decoding device configured to decode the 360-degree video bitstream associated with the plurality of viewing angles; and
a rendering device configured to render the decoded bitstream using one or more suggested viewing angles of the plurality of viewing angles.
18. The system according to claim 17, wherein the decoding device is further configured to:
extract the one or more suggested viewing angles from the 360-degree video bitstream.
19. The system according to claim 17, wherein the rendering device is further configured to:
receive one or more user-selected viewing angles as input; and
select between the one or more suggested viewing angles and the one or more user-selected viewing angles to render the decoded bitstream.
20. The system according to claim 19, wherein the rendering device is further configured to:
render one or more view sequences in the decoded bitstream with the one or more user-selected viewing angles,
wherein, after a predetermined period of user inactivity, the one or more view sequences are switched from the one or more user-selected viewing angles back to the one or more suggested viewing angles.
CN201710952982.8A 2016-10-14 2017-10-13 360degree video capture and playback Active CN107959844B (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US201662408652P 2016-10-14 2016-10-14
US62/408,652 2016-10-14
US201662418066P 2016-11-04 2016-11-04
US62/418,066 2016-11-04
US15/599,447 2017-05-18
US15/599,447 US11019257B2 (en) 2016-05-19 2017-05-18 360 degree video capture and playback

Publications (2)

Publication Number Publication Date
CN107959844A true CN107959844A (en) 2018-04-24
CN107959844B CN107959844B (en) 2021-09-17

Family

ID=61765451

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710952982.8A Active CN107959844B (en) 2016-10-14 2017-10-13 360degree video capture and playback

Country Status (2)

Country Link
CN (1) CN107959844B (en)
DE (1) DE102017009145A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109829851A (en) * 2019-01-17 2019-05-31 厦门大学 A kind of Panorama Mosaic method and storage equipment based on spherical surface alignment estimation
CN111355966A (en) * 2020-03-05 2020-06-30 上海乐杉信息技术有限公司 Surrounding free visual angle live broadcast method and system
WO2021032105A1 (en) * 2019-08-20 2021-02-25 中兴通讯股份有限公司 Code stream processing method and device, first terminal, second terminal and storage medium
CN113206992A (en) * 2021-04-20 2021-08-03 聚好看科技股份有限公司 Method for converting projection format of panoramic video and display equipment
CN113243112A (en) * 2018-12-21 2021-08-10 皇家Kpn公司 Streaming volumetric and non-volumetric video
WO2024055925A1 (en) * 2022-09-13 2024-03-21 影石创新科技股份有限公司 Image transmission method and apparatus, image display method and apparatus, and computer device
CN118945282A (en) * 2024-10-12 2024-11-12 圆周率科技(常州)有限公司 Video processing method, device, computer equipment and storage medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3952304A4 (en) * 2019-03-29 2022-05-04 Sony Group Corporation Image processing device, image processing method, and program
US11259055B2 (en) 2020-07-10 2022-02-22 Tencent America LLC Extended maximum coding unit size
DE102021119951A1 (en) 2021-08-02 2023-02-02 Dr. Ing. H.C. F. Porsche Aktiengesellschaft Method, system and computer program product for detecting the surroundings of a motor vehicle

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1922544A (en) * 2004-02-19 2007-02-28 创新科技有限公司 Method and apparatus for providing a combined image
US20110069146A1 (en) * 2009-09-18 2011-03-24 Hon Hai Precision Industry Co., Ltd. System and method for processing images
CN102508565A (en) * 2011-11-17 2012-06-20 Tcl集团股份有限公司 Remote control cursor positioning method and device, remote control and cursor positioning system
US20130089301A1 (en) * 2011-10-06 2013-04-11 Chi-cheng Ju Method and apparatus for processing video frames image with image registration information involved therein
CN103873758A (en) * 2012-12-17 2014-06-18 北京三星通信技术研究有限公司 Method, device and equipment for generating panorama in real time
CN105872353A (en) * 2015-12-15 2016-08-17 乐视网信息技术(北京)股份有限公司 System and method for implementing playback of panoramic video on mobile device
CN106023070A (en) * 2016-06-14 2016-10-12 北京岚锋创视网络科技有限公司 Real-time panoramic splicing method and device

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1922544A (en) * 2004-02-19 2007-02-28 创新科技有限公司 Method and apparatus for providing a combined image
US20110069146A1 (en) * 2009-09-18 2011-03-24 Hon Hai Precision Industry Co., Ltd. System and method for processing images
US20130089301A1 (en) * 2011-10-06 2013-04-11 Chi-cheng Ju Method and apparatus for processing video frames image with image registration information involved therein
CN103096008A (en) * 2011-10-06 2013-05-08 联发科技股份有限公司 Video frame processing method, video stream playing method and video frame recording device
CN102508565A (en) * 2011-11-17 2012-06-20 Tcl集团股份有限公司 Remote control cursor positioning method and device, remote control and cursor positioning system
CN103873758A (en) * 2012-12-17 2014-06-18 北京三星通信技术研究有限公司 Method, device and equipment for generating panorama in real time
CN105872353A (en) * 2015-12-15 2016-08-17 乐视网信息技术(北京)股份有限公司 System and method for implementing playback of panoramic video on mobile device
CN106023070A (en) * 2016-06-14 2016-10-12 北京岚锋创视网络科技有限公司 Real-time panoramic splicing method and device

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113243112A (en) * 2018-12-21 2021-08-10 皇家Kpn公司 Streaming volumetric and non-volumetric video
CN113243112B (en) * 2018-12-21 2024-06-07 皇家Kpn公司 Streaming volumetric and non-volumetric video
CN109829851A (en) * 2019-01-17 2019-05-31 厦门大学 A kind of Panorama Mosaic method and storage equipment based on spherical surface alignment estimation
WO2021032105A1 (en) * 2019-08-20 2021-02-25 中兴通讯股份有限公司 Code stream processing method and device, first terminal, second terminal and storage medium
CN111355966A (en) * 2020-03-05 2020-06-30 上海乐杉信息技术有限公司 Surrounding free visual angle live broadcast method and system
CN113206992A (en) * 2021-04-20 2021-08-03 聚好看科技股份有限公司 Method for converting projection format of panoramic video and display equipment
WO2024055925A1 (en) * 2022-09-13 2024-03-21 影石创新科技股份有限公司 Image transmission method and apparatus, image display method and apparatus, and computer device
CN118945282A (en) * 2024-10-12 2024-11-12 圆周率科技(常州)有限公司 Video processing method, device, computer equipment and storage medium
CN118945282B (en) * 2024-10-12 2025-02-14 圆周率科技(常州)有限公司 Video processing method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
DE102017009145A1 (en) 2018-04-19
CN107959844B (en) 2021-09-17

Similar Documents

Publication Publication Date Title
CN107959844A (en) 360 degree of video captures and playback
US10848668B2 (en) 360 degree video recording and playback with object tracking
US11019257B2 (en) 360 degree video capture and playback
CN108024094B (en) 360degree video recording and playback with object tracking
US11651752B2 (en) Method and apparatus for signaling user interactions on overlay and grouping overlays to background for omnidirectional content
CN110459246B (en) System and method for playback of panoramic video content
US8683067B2 (en) Video perspective navigation system and method
US20200336727A1 (en) Method for transmitting 360-degree video, method for receiving 360-degree video, apparatus for transmitting 360-degree video, and apparatus for receiving 360-degree video
CN110073662A (en) The suggestion viewport of panoramic video indicates
CN110870303A (en) Method and apparatus for rendering VR media beyond all-around media
KR20190095430A (en) 360 video processing method and apparatus therefor
WO2020185804A1 (en) Media content presentation
US20210176446A1 (en) Method and device for transmitting and receiving metadata about plurality of viewpoints
US10904508B2 (en) 360 degree video with combined projection format
CN108271044A (en) A kind of processing method and processing device of information
US10602239B2 (en) Method and apparatus for track composition
JP7177034B2 (en) Method, apparatus and stream for formatting immersive video for legacy and immersive rendering devices
CN115883882A (en) Image processing method, device, system, network equipment, terminal and storage medium
Podborski et al. Virtual reality and DASH
US11297378B2 (en) Image arrangement determination apparatus, display controlling apparatus, image arrangement determination method, display controlling method, and program
Podborski et al. 360-degree video streaming with MPEG-DASH
US20220329886A1 (en) Methods and devices for handling media data streams
Priyadharshini et al. 360 user-generated videos: Current research and future trends
Deshpande et al. Omnidirectional MediA Format (OMAF): toolbox for virtual reality services
US20230215080A1 (en) A method and apparatus for encoding and decoding volumetric video

Legal Events

Date Code Title Description
PB01 Publication
TA01 Transfer of patent application right

Effective date of registration: 20181022

Address after: Singapore, Singapore

Applicant after: Avago Technologies International Sales Pte. Limited

Address before: Singapore, Singapore

Applicant before: Avago Technologies Fiber IP Singapore Pte. Ltd.

SE01 Entry into force of request for substantive examination
GR01 Patent grant