CN117321987A - Immersive viewing experience - Google Patents
- Publication number
- CN117321987A (application number CN202280030471A)
- Authority
- CN
- China
- Prior art keywords
- user
- image
- specific display
- shows
- display image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/204—Image signal generators using stereoscopic image cameras
- H04N13/243—Image signal generators using stereoscopic image cameras using three or more 2D image sensors
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/122—Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/21—Server components or server architectures
- H04N21/218—Source of audio or video content, e.g. local disk arrays
- H04N21/21805—Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/422—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
- H04N21/42201—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] biosensors, e.g. heat sensor for presence detection, EEG sensors or any limb activity sensors worn by the user
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
- H04N21/4728—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for selecting a Region Of Interest [ROI], e.g. for requesting a higher resolution version of a selected region
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/60—Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
- H04N21/65—Transmission of management data between client and server
- H04N21/658—Transmission by the client directed to the server
- H04N21/6587—Control parameters, e.g. trick play commands, viewpoint selection
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/816—Monomedia components thereof involving special video data, e.g 3D video
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/30—Image reproducers
- H04N13/332—Displays for viewing with the aid of special glasses or head-mounted displays [HMD]
- H04N13/344—Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Databases & Information Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Health & Medical Sciences (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Neurosurgery (AREA)
- Human Computer Interaction (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
- Stereoscopic And Panoramic Photography (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
Description
Technical field
Aspects of the present disclosure relate generally to the distribution of works.
Cross-reference to related applications
This application is a PCT application of U.S. patent application 17/237,152, filed on April 22, 2021, which is a continuation-in-part of U.S. patent application 17/225,610, filed on April 7, 2021, which in turn is a continuation-in-part of U.S. patent application 17/187,828, filed on February 28, 2021.
Introduction
Movies are a form of entertainment.
Summary of the invention
All examples, aspects, and features mentioned herein may be combined in any technically conceivable way. This patent teaches a method, software, and apparatus for an immersive viewing experience.
In general, this patent improves upon the techniques taught in U.S. patent application 17/225,610, filed on April 7, 2021, the entire contents of which are incorporated herein by reference. Some of the devices described in U.S. patent application 17/225,610 are capable of generating extremely large datasets. This patent improves the display of such extremely large datasets.
This patent discloses a system, method, apparatus, and software for an improved immersive viewing experience. First, the user's viewing parameters are uploaded to the cloud, where the cloud stores the imagery (which, in the preferred embodiment, is an extremely large dataset). Viewing parameters may include any movement, gesture, body position, eye gaze angle, eye convergence/vergence, or input (e.g., via a graphical user interface). Thus, the user's viewing parameters are characterized in near real time (e.g., by various devices, such as eye-facing cameras and gesture-recording cameras) and sent to the cloud. Second, a set of user-specific imagery is optimized from the stored imagery, where the user-specific imagery is based at least on the viewing parameters. In the preferred embodiment, the field of view of the user-specific imagery is smaller than that of the stored imagery. In the preferred embodiment, locations where the user is looking have high resolution, and locations where the user is not looking have low resolution. For example, if the user is looking at an object on the left, the left side of the user-specific imagery will be high resolution. In some embodiments, the user-specific imagery is streamed in near real time.
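As a concrete illustration of this gaze-dependent, user-specific imagery, the following minimal sketch (not from the patent; the function name, the H x W x 3 NumPy frame representation, and the block-averaging scheme are assumptions for illustration) composes a frame that is full resolution in a window around a reported gaze point and reduced resolution everywhere else:

```python
import numpy as np

def foveated_frame(frame, gaze_xy, fovea_px=256, downscale=4):
    """Compose a user-specific frame: full resolution in a window around
    the gaze point, block-averaged low resolution everywhere else."""
    h, w = frame.shape[:2]
    hc, wc = h - h % downscale, w - w % downscale   # crop to a multiple of the block size
    blocks = frame[:hc, :wc].reshape(
        hc // downscale, downscale, wc // downscale, downscale, -1).mean(axis=(1, 3))
    out = np.repeat(np.repeat(blocks, downscale, axis=0), downscale, axis=1)
    out = out.astype(frame.dtype)
    gx, gy = int(gaze_xy[0]), int(gaze_xy[1])
    x0, x1 = max(0, gx - fovea_px), min(wc, gx + fovea_px)
    y0, y1 = max(0, gy - fovea_px), min(hc, gy + fovea_px)
    out[y0:y1, x0:x1] = frame[:hc, :wc][y0:y1, x0:x1]   # paste the full-resolution fovea
    return out
```

In a real pipeline the low-resolution background would be encoded and streamed at a lower bitrate rather than upsampled on the server; the sketch only shows the spatial partitioning that the viewing parameters drive.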
In some embodiments, the user-specific imagery includes a first portion having a first spatial resolution and a second portion having a second spatial resolution, wherein the first spatial resolution is higher than the second spatial resolution. Some embodiments include those wherein the viewing parameters include a viewing location, and wherein the viewing location corresponds to the first portion.
Some embodiments include those wherein the user-specific imagery includes a first portion having a first zoom setting and a second portion having a second zoom setting, wherein the first zoom setting is higher than the second zoom setting. Some embodiments include those wherein the first portion is determined by the viewing parameters, wherein the viewing parameters include at least one of the group consisting of: the position of the user's body; the orientation of the user's body; the gestures of the user's hands; the facial expression of the user; the position of the user's head; and the orientation of the user's head. Some embodiments include those wherein the first portion is determined via a graphical user interface (e.g., a mouse or controller).
Some embodiments include those wherein the imagery includes a first field of view (FOV) and the user-specific imagery includes a second field of view, wherein the first FOV is larger than the second FOV.
Some embodiments include those wherein the imagery includes stereoscopic imagery, wherein the stereoscopic imagery is obtained by a stereoscopic camera or a stereoscopic camera cluster.
Some embodiments include those wherein the imagery includes stitched imagery, wherein the stitched imagery is generated from at least two cameras.
Some embodiments include those wherein the imagery includes composite imagery, wherein the composite imagery is generated by: capturing a first image of a scene with a first set of camera settings, wherein the first set of camera settings places a first object in focus and a second object out of focus; and capturing a second image of the scene with a second set of camera settings, wherein the second set of camera settings places the second object in focus and the first object out of focus. Some embodiments include those wherein the first image is presented to the user when the user looks at the first object, and the second image is presented to the user when the user looks at the second object. Some embodiments include combining at least the first object from the first image and the second object from the second image into the composite image.
Some embodiments include performing image stabilization. Some embodiments include those wherein the viewing parameters include convergence. Some embodiments include those wherein the user-specific imagery is 3D imagery (three-dimensional imagery), wherein the 3D imagery is presented on an HDU, a pair of anaglyph glasses, or a pair of polarized glasses.
Some embodiments include those wherein the user-specific imagery is presented to the user on a display, wherein the user has a field of view of at least 0.5π steradians.
Some embodiments include those wherein the user-specific imagery is presented on a display. In some embodiments, the display is a screen (e.g., a TV; a reflective screen coupled with a projector system; or an extended reality head display unit, including an augmented reality display, a virtual reality display, or a mixed reality display).
Brief description of the drawings
Figure 1 shows a retrospective display of stereoscopic imagery.
Figure 2 shows a method of determining which stereo pair to display to the user at a given point in time.
Figure 3 shows the display of a video recording on an HDU.
Figure 4 shows a pre-recorded stereoscopic viewing performed by user 1.
Figure 5 shows long-range stereoscopic imaging of a distant object using stereoscopic camera clusters.
Figure 6 shows the post-acquisition ability to adjust the image based on user eye tracking, by generating stereoscopic composite images, to obtain the best possible picture.
Figure 7A shows a moving image and the application of image stabilization processing.
Figure 7B shows a moving image displayed in an HDU.
Figure 7C shows applying image stabilization to an image using stereoscopic imagery.
Figure 8A shows left and right images with a first camera setting.
Figure 8B shows left and right images with a second camera setting.
Figure 9A shows a top-down view of all of the data of a scene collected at one point in time.
Figure 9B shows a wide-angle 2D image frame of a video recording being displayed.
Figure 9C shows a top-down view of user A with a viewing angle of -70° and a 55° FOV.
Figure 9D shows what user A would see given user A's viewing angle of -70° and 55° FOV.
Figure 9E shows a top-down view of user B with a viewing angle of +50° and an 85° FOV.
Figure 9F shows what user B would see given user B's viewing angle of +50° and 85° FOV.
Figure 10A shows the field of view captured by the left camera at a first point in time.
Figure 10B shows the field of view captured by the right camera at the first point in time.
Figure 10C shows a first user's personalized field of view (FOV) at a given point in time.
Figure 10D shows a second user's personalized field of view (FOV) at a given point in time.
Figure 10E shows a third user's personalized field of view (FOV) at a given point in time.
Figure 10F shows a fourth user's personalized field of view (FOV) at a given point in time.
Figure 11A shows a top-down view of the first user's left-eye view.
Figure 11B shows a top-down view of the first user's left-eye view, in which the convergence point is near the left and right eyes.
Figure 11C shows the left-eye view without convergence at time point 1.
Figure 11D shows the left-eye view with convergence at time point 2.
Figure 12 shows the reconstruction of various stereoscopic images from previously acquired wide-angle stereoscopic imagery.
Figure 13A shows a top-down view of a home theater.
Figure 13B shows a side view of the home theater shown in Figure 13A.
Figure 14A shows a top-down view of a home theater.
Figure 14B shows a side view of the home theater shown in Figure 14A.
Figure 15A shows an approximately spherical TV, with the user looking straight ahead at time point #1.
Figure 15B shows the portion of the TV and the field of view observed by the user at time point #1.
Figure 15C shows an approximately spherical TV, with the user looking straight ahead at time point #2.
Figure 15D shows the portion of the TV and the field of view observed by the user at time point #2.
Figure 15E shows an approximately spherical TV, with the user looking straight ahead at time point #3.
Figure 15F shows the portion of the TV and the field of view observed by the user at time point #3.
Figure 16A shows an unzoomed image.
Figure 16B shows digital magnification of a portion of the image.
Figure 17A shows an unzoomed image.
Figure 17B shows optical magnification of a portion of the image.
Figure 18A shows a single-resolution image.
Figure 18B shows a multi-resolution image.
Figure 19A shows a large field of view, in which a first user is looking at a first portion of the image and a second user is looking at a second portion of the image.
Figure 19B shows that only the first portion and the second portion of the image in Figure 19A are high resolution, while the remainder of the image is lower resolution.
Figure 20A shows a low-resolution image.
Figure 20B shows a high-resolution image.
Figure 20C shows a composite image.
Figure 21 shows a method and process for performing near-real-time streaming of customized imagery.
Figure 22A shows the use of resection in conjunction with stereoscopic cameras, where a first camera position is unknown.
Figure 22B shows the use of resection in conjunction with stereoscopic cameras, where an object position is unknown.
Figure 23A shows a top-down view of a person looking forward toward the center of a home theater screen.
Figure 23B shows a top-down view of a person looking toward the right side of a home theater screen.
Figure 24 shows a method, system, and apparatus for optimizing stereoscopic camera settings during image acquisition on the move.
Detailed description
The flowcharts do not depict the syntax of any particular programming language. Rather, the flowcharts illustrate the functional information one of ordinary skill in the art would need to fabricate circuits or to generate computer software to perform the processing required in accordance with the present invention. Note that many routine program elements, such as loops, the initialization of variables, and the use of temporary variables, are not shown. It will be appreciated by those of ordinary skill in the art that, unless otherwise indicated herein, the particular sequence of steps described is illustrative only and can be varied without departing from the spirit of the invention. Thus, unless otherwise stated, the steps described below are unordered, meaning that, when possible, the steps can be performed in any convenient or desirable order.
Figure 1 shows a retrospective display of stereoscopic imagery. 100 shows step A: determine the location where the viewer is looking at time point n (e.g., coordinates (αn, βn, rn)). Note #1: this location may be a near, intermediate, or far convergence point. Note #2: a series of stereoscopic images has been acquired and recorded; step A follows the acquisition process and is performed at some subsequent time while the user is viewing. 101 shows step B: determine the FOVn corresponding to that location (e.g., the (αn, βn, rn) coordinates at time point n; note: the user can select the FOV). 102 shows step C: select the camera(s) corresponding to the left-eye FOV, with the option of performing additional image processing (e.g., using composite imagery, using convergence zones), to generate the personalized left-eye image at time point n (PLEIn). 103 shows step D: select the camera(s) corresponding to the right-eye FOV, with the option of performing additional image processing (e.g., using composite imagery, using convergence zones), to generate the personalized right-eye image at time point n (PREIn). 104 shows step E: display PLEIn on the left-eye display of the HDU. 105 shows step F: display PREIn on the right-eye display of the HDU. 106 shows step G: increment the time step to n+1 and go to step A above.
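A compact sketch of this step A-G loop is given below. It is a hypothetical illustration, not the patent's implementation: the frame/gaze representations, the `crop_fov` centering logic, the assumed 215° recorded FOV, and the `hdu.show()` interface are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Gaze:
    alpha: float  # horizontal viewing angle, degrees (from step A)
    beta: float   # vertical viewing angle, degrees (unused in this 1D sketch)
    r: float      # convergence distance

def crop_fov(frame, gaze, fov_deg, recorded_fov_deg=215.0):
    """Crop the recorded wide-angle frame to the viewer's FOV, centered
    on the gaze direction (uniform pixels-per-degree assumed)."""
    h, w = frame.shape[:2]
    px_per_deg = w / recorded_fov_deg
    cx = w / 2 + gaze.alpha * px_per_deg      # assumed sign convention
    half = fov_deg * px_per_deg / 2
    x0, x1 = int(max(0, cx - half)), int(min(w, cx + half))
    return frame[:, x0:x1]

def playback_step(left_frames, right_frames, n, gaze, fov_deg, hdu):
    plei_n = crop_fov(left_frames[n], gaze, fov_deg)    # step C
    prei_n = crop_fov(right_frames[n], gaze, fov_deg)   # step D
    hdu.show(left=plei_n, right=prei_n)                 # steps E and F
    return n + 1                                        # step G
```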
Figure 2 shows a method of determining which stereo pair to display to the user at a given point in time. 200 shows a text box for analyzing user parameters to determine which stereoscopic image to display to that user. First, the viewing direction of the user's head is used. For example, if the user's head faces forward, a first stereo pair can be used; if the user's head faces left, a second stereo pair can be used. Second, the viewing angle of the user's gaze is used. For example, if the user is looking in the direction of a distant object (e.g., a distant mountain), the distant (e.g., zone 3) stereoscopic image pair is selected for that point in time. Third, the user's convergence is used. For example, if the viewing direction of a near object (e.g., leaves on a tree) is extremely similar to the viewing direction of a distant object (e.g., a distant mountain), a combination of convergence and viewing angle is used. Fourth, the accommodation of the user's eyes is used. For example, the user's pupil size is monitored, and changes in size are used to indicate where the user is looking (near/far).
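This selection logic could be prototyped as in the following sketch. The thresholds, the zone numbering (1 = near, 2 = intermediate, 3 = far), and the use of pupil constriction as a near-accommodation cue are assumptions for illustration, not values from the patent:

```python
def choose_stereo_pair(head_yaw_deg, gaze_yaw_deg, vergence_deg, pupil_mm):
    """Pick a depth zone and an overall viewing direction from the four
    cues of Figure 2: head direction, gaze angle, convergence, and
    accommodation (pupil size, which constricts during near viewing)."""
    if vergence_deg > 4.0 or pupil_mm < 3.0:
        depth_zone = 1          # strong convergence / near accommodation
    elif vergence_deg > 1.0:
        depth_zone = 2          # intermediate distance
    else:
        depth_zone = 3          # nearly parallel gaze: far
    view_yaw_deg = head_yaw_deg + gaze_yaw_deg
    return depth_zone, view_yaw_deg
```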
Figure 3 shows the display of a video recording on an HDU. 300 shows establishing a coordinate system. For example, the camera is used as the origin and the pointing direction of the camera as an axis. This is discussed in more detail in U.S. patent application 17/225,610, the entire contents of which are incorporated herein by reference. 301 shows performing a wide-angle recording of a scene (e.g., recording data with a larger FOV than the FOV displayed to the user). 302 shows performing the analysis of the user discussed in Figure 2 to determine where in the scene the user is looking. 303 shows optimizing the display based on the analysis in 302. In some embodiments, characteristics of a physical object (e.g., position, size, shape, orientation, color, brightness, texture, or classification by an AI algorithm) determine characteristics of a virtual object (e.g., position, size, shape, orientation, color, brightness, texture). For example, a user is using a mixed reality display in a room of a house where some areas of the room are bright (e.g., windows during the day) and some areas are dark (e.g., a dark blue wall). In some embodiments, the placement of virtual objects is based on the positions of objects in the room. For example, if the background is a dark blue wall, a virtual object can be colored white so that it stands out. For example, if the background is a white wall, a virtual object can be colored blue so that it stands out. For example, a virtual object can be placed (or repositioned) so that its background allows the virtual object to be displayed clearly, thereby optimizing the user's viewing experience.
Figure 4 shows a pre-recorded stereoscopic viewing performed by user 1. 400 shows user 1 performing a stereoscopic recording using a stereoscopic camera system (e.g., a smartphone). This is discussed in more detail in U.S. patent application 17/225,610, the entire contents of which are incorporated herein by reference. 401 shows storing the stereoscopic recording on a storage device. 402 shows a user (e.g., user 1 or one or more other users) retrieving the stored stereoscopic recording. Note that the stereoscopic recording can be transmitted to the other user(s), and the other user(s) then receive the stored stereoscopic recording. 403 shows a user (e.g., user 1 or one or more other users) viewing the stored stereoscopic recording on a stereoscopic display unit (e.g., an augmented reality, mixed reality, or virtual reality display).
Figure 5 shows long-range stereoscopic imaging of a distant object using stereoscopic camera clusters. 500 shows placing two camera clusters at least 50 feet apart. 501 shows selecting a target at least 1 mile away. 502 shows precisely aiming each camera cluster so that the centerlines of focus intersect at the target. 503 shows acquiring stereoscopic imagery of the target. 504 shows viewing and/or analyzing the acquired stereoscopic imagery. Some embodiments use cameras with telephoto lenses rather than camera clusters. In addition, some embodiments use a stereo separation of less than or equal to 50 feet to optimize viewing at distances of less than 1 mile.
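For intuition (this is a worked example, not patent text), the toe-in angle needed so that the two cluster centerlines intersect at the target follows from simple trigonometry; with the 50-foot baseline and 1-mile (5,280-foot) target of Figure 5:

```python
import math

baseline_ft = 50.0    # separation between the two camera clusters
target_ft = 5280.0    # 1 mile to the target

# Each cluster toes in by atan((baseline / 2) / range) from straight ahead.
toe_in_rad = math.atan((baseline_ft / 2) / target_ft)
print(math.degrees(toe_in_rad))   # ~0.27 degrees per cluster
```

The tiny angle illustrates why precise aiming (502) matters at these ranges.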
Figure 6 shows the post-acquisition ability to adjust the image based on user eye tracking, by generating stereoscopic composite images, to obtain the best possible picture. Several objects in the stereoscopic image displayed at this point in time may be of interest to the person viewing the scene. Therefore, at each time point, a stereoscopic composite image is generated to match the input of at least one user. For example, if the user is looking at (with eye tracking determining the viewing location) the mountain 600 or the cloud 601 at a first time point, the stereoscopic composite image pair delivered to the HDU is generated such that the distant objects, the mountain 600 and the cloud 601, are in focus, while near objects, including the deer 603 and the flower 602, are out of focus. If the user is looking at the deer 603, the stereoscopic composite image presented for that frame is optimized for intermediate distances. Finally, if the user is looking at the nearby flower 602, the stereoscopic composite image is optimized for close range (e.g., convergence is applied, and distant items such as the deer 603, the mountain 600, and the cloud 601 are blurred). Various user inputs can be used to indicate to the software suite how to optimize the stereoscopic composite image. A gesture such as squinting can be used to optimize the stereoscopic composite image for more distant objects. A gesture such as leaning forward can be used to zoom in on distant objects. A GUI can also be used to improve the immersive viewing experience.
Figure 7A shows a moving image and the application of image stabilization processing. 700A shows a left-eye image of an object in which the edges of the object have motion blur. 701A shows a left-eye image of the object with image stabilization processing applied.
Figure 7B shows a moving image displayed in an HDU. 702 shows the HDU. 700A shows a left-eye image of an object in which the edges of the object have motion blur. 700B shows a right-eye image of the object in which the edges of the object have motion blur. 701A shows the left-eye display, which is aligned with the user's left eye. 701B shows the right-eye display, which is aligned with the user's right eye.
Figure 7C shows applying image stabilization to an image using stereoscopic imagery. A key task in image processing is image stabilization using stereoscopic imagery. 700A shows the left-eye image of the object with image stabilization processing applied. 700B shows the right-eye image of the object with image stabilization processing applied. 701A shows the left-eye display, which is aligned with the user's left eye. 701B shows the right-eye display, which is aligned with the user's right eye. 702 shows the HDU.
Figure 8A shows left and right images with a first camera setting. Note that the text on the display is in focus, while the distant object, the knob on the cabinet, is out of focus.
Figure 8B shows left and right images with a second camera setting. Note that the text on the display is out of focus, while the distant object, the knob on the cabinet, is in focus. One innovation is the use of at least two cameras. A first image is obtained from a first camera. A second image is obtained from a second camera. The first camera and the second camera are at the same viewing perspective. In addition, the images are of the same scene (e.g., a still scene, or the same point in time of a scene with motion/change). A composite image is generated, in which a first portion of the composite image is obtained from the first image and a second portion of the composite image is obtained from the second image. Note that in some embodiments, an object in the first image can be segmented, and the same object in the second image can also be segmented. The first image of the object and the second image of the object can be compared to see which one has better quality. The image with the better image quality can be added to the composite image. In some embodiments, however, some portions can be deliberately selected to be out of focus.
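One way to build such a composite is a per-pixel sharpness comparison between the two captures, as in this hypothetical sketch (the Laplacian-energy score, the window size, and the availability of NumPy/SciPy are assumptions; color H x W x 3 images are assumed):

```python
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def focus_composite(img_a, img_b):
    """Merge two registered captures of the same scene (Figures 8A/8B):
    for each pixel, keep whichever capture is locally sharper, using
    local Laplacian energy as a simple sharpness score."""
    def sharpness(img):
        gray = img.mean(axis=2)
        return uniform_filter(np.abs(laplace(gray)), size=15)
    take_a = sharpness(img_a) > sharpness(img_b)
    return np.where(take_a[..., None], img_a, img_b)
```

A segmentation-based variant, as the text suggests, would compare sharpness per object rather than per pixel, which avoids seams inside objects.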
Figure 9A shows a top-down view of all of the data of a scene collected at one point in time.
Figure 9B shows a wide-angle 2D image frame of a video recording being displayed. Note that, assuming the user's internal FOV (the human-eye FOV) does not match the FOV of the camera system, this entire displayed field of view will appear distorted to the user.
Figure 9C shows a top-down view of user A with a viewing angle of -70° and a 55° FOV. A key innovation is that a user can select a portion of the stereoscopic imagery based on viewing angle. Note that the selected portion could in practice approach ~180°, but not more.
Figure 9D shows what user A would see given user A's viewing angle of -70° and 55° FOV. This is an improvement over the prior art because it allows different viewers to see different portions of the field of view. Although humans have a horizontal field of view of slightly more than 180 degrees, humans can only read text over a field of view of about 10 degrees, can only evaluate shapes over a field of view of about 30 degrees, and can only evaluate colors over a field of view of about 60 degrees. In some embodiments, filtering (subtraction) is performed. Humans have a vertical field of view of approximately 120 degrees, with an upward (above the horizontal) field of view of 50 degrees and a downward (below the horizontal) field of view of approximately 70 degrees. However, the maximum rotation of the eyeball is limited to approximately 25 degrees above the horizontal and approximately 30 degrees below the horizontal. Typically, the normal line of sight from a seated position is approximately 15 degrees below the horizontal.
Figure 9E shows a top-down view of user B with a viewing angle of +50° and an 85° FOV. A key innovation is that a user can select a portion of the stereoscopic imagery based on viewing angle. In addition, note that user B's FOV is larger than user A's FOV. Note that the selected portion could in practice approach ~180°, but not more, owing to the limits of the human eye.
Figure 9F shows what user B would see given user B's viewing angle of +50° and 85° FOV. This is an improvement over the prior art because it allows different viewers to see different portions of the field of view. In some embodiments, multiple cameras record a 240° movie. In one embodiment, 4 cameras (each capturing a 60° sector) record simultaneously. In another embodiment, the sectors are captured sequentially, recording one at a time. Some scenes in a movie can be captured sequentially, while other scenes can be captured simultaneously. In some embodiments, overlapping camera arrangements can be used for image stitching. Some embodiments include using the camera ball system described in U.S. patent application 17/225,610, the entire contents of which are incorporated herein by reference. After the imagery is recorded, the imagery from the cameras is edited so that the scenes are synchronized and stitched together. A LIDAR device can be integrated into the camera system for precise camera direction pointing.
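For intuition, sector selection for such a multi-camera rig might look like the following sketch, which assumes four cameras laid out symmetrically about straight ahead (the layout and function name are illustrative, not from the patent):

```python
def cameras_for_view(view_yaw_deg, user_fov_deg, n_cams=4, sector_deg=60.0):
    """Return the indices of the cameras whose 60-degree sectors overlap
    the user's FOV in a 240-degree, four-camera rig (assumed layout:
    sector centers at -90, -30, +30, +90 degrees)."""
    lo = view_yaw_deg - user_fov_deg / 2
    hi = view_yaw_deg + user_fov_deg / 2
    picked = []
    for cam in range(n_cams):
        center = (cam - (n_cams - 1) / 2) * sector_deg
        if lo < center + sector_deg / 2 and hi > center - sector_deg / 2:
            picked.append(cam)
    return picked

print(cameras_for_view(+50.0, 85.0))   # user B's view spans several sectors
```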
Figure 10A shows the field of view captured by the left camera at a first point in time. The left camera 1000 and the right camera 1001 are shown. The left FOV 1002, shown by the white region, is approximately 215° and has an α range from +90° to -135° (sweeping counterclockwise from +90° to -135°). The region 1003 not imaged within the left FOV is approximately 135° and has an α range from +90° to -135° (sweeping clockwise from +90° to -135°).
Figure 10B shows the field of view captured by the right camera at the first point in time. The left camera 1000 and the right camera 1001 are shown. The right FOV 1004, shown by the white region, is approximately 215° and has an α range from +135° to -90° (sweeping counterclockwise from +135° to -90°). The region 1005 not imaged within the right FOV is approximately 135° and has an α range from +135° to -90° (sweeping counterclockwise from +135° to -90°).
Figure 10C shows a first user's personalized field of view (FOV) at a given point in time. 1000 shows the left camera. 1001 shows the right camera. 1006a shows the left boundary of the first user's left-eye FOV, shown in light gray. 1007a shows the right boundary of the first user's left-eye FOV, shown in light gray. 1008a shows the left boundary of the first user's right-eye FOV, shown in light gray. 1009a shows the right boundary of the first user's right-eye FOV, shown in light gray. 1010a shows the centerline of the first user's left-eye FOV. 1011a shows the centerline of the first user's right-eye FOV. Note that the centerline 1010a of the first user's left-eye FOV and the centerline 1011a of the first user's right-eye FOV are parallel, which corresponds to a convergence point at infinity. Note that the first user is looking forward. It is recommended that, during filming, most of the action in the scene take place in this forward-looking direction.
Figure 10D shows a second user's personalized field of view (FOV) at a given point in time. 1000 shows the left camera. 1001 shows the right camera. 1006b shows the left boundary of the second user's left-eye FOV, shown in light gray. 1007b shows the right boundary of the second user's left-eye FOV, shown in light gray. 1008b shows the left boundary of the second user's right-eye FOV, shown in light gray. 1009b shows the right boundary of the second user's right-eye FOV, shown in light gray. 1010b shows the centerline of the second user's left-eye FOV. 1011b shows the centerline of the second user's right-eye FOV. Note that the centerline 1010b of the second user's left-eye FOV and the centerline 1011b of the second user's right-eye FOV meet at the convergence point 1012. This allows the second user to view small objects in greater detail. Note that the second user is looking forward. It is recommended that, during filming, most of the action in the scene take place in this forward-looking direction.
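As a worked example of the geometry, the convergence point 1012 is simply the intersection of the two centerline rays. The sketch below assumes a top-down 2D coordinate system with yaw measured from the straight-ahead +y axis, positive to the left; these conventions are illustrative assumptions:

```python
import math

def convergence_point(left_xy, right_xy, left_yaw_deg, right_yaw_deg):
    """Intersect the two centerline rays (top-down view). Returns None
    when the centerlines are parallel (gaze at infinity, as in Fig 10C)."""
    def direction(yaw_deg):
        rad = math.radians(yaw_deg)
        return (-math.sin(rad), math.cos(rad))
    (x1, y1), (dx1, dy1) = left_xy, direction(left_yaw_deg)
    (x2, y2), (dx2, dy2) = right_xy, direction(right_yaw_deg)
    det = dx1 * dy2 - dy1 * dx2
    if abs(det) < 1e-9:
        return None
    t = ((x2 - x1) * dy2 - (y2 - y1) * dx2) / det
    return (x1 + t * dx1, y1 + t * dy1)

# Example: eyes 6 cm apart, each toed in by 2 degrees toward the midline.
print(convergence_point((-0.03, 0.0), (0.03, 0.0), -2.0, 2.0))
```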
Figure 10E shows a third user's personalized field of view (FOV) at a given point in time. 1000 shows the left camera. 1001 shows the right camera. 1006c shows the left boundary of the third user's left-eye FOV, shown in light gray. 1007c shows the right boundary of the third user's left-eye FOV, shown in light gray. 1008c shows the left boundary of the third user's right-eye FOV, shown in light gray. 1009c shows the right boundary of the third user's right-eye FOV, shown in light gray. 1010c shows the centerline of the third user's left-eye FOV. 1011c shows the centerline of the third user's right-eye FOV. Note that the centerline 1010c of the third user's left-eye FOV and the centerline 1011c of the third user's right-eye FOV are approximately parallel, which corresponds to looking at a great distance. Note that the third user is looking moderately to the left. Note that the overlap of the left-eye FOV and the right-eye FOV provides stereoscopic viewing for the third viewer.
Figure 10F shows a fourth user's personalized field of view (FOV) at a given point in time. 1000 shows the left camera. 1001 shows the right camera. 1006d shows the left boundary of the fourth user's left-eye FOV, shown in light gray. 1007d shows the right boundary of the fourth user's left-eye FOV, shown in light gray. 1008d shows the left boundary of the fourth user's right-eye FOV, shown in light gray. 1009d shows the right boundary of the fourth user's right-eye FOV, shown in light gray. 1010d shows the centerline of the fourth user's left-eye FOV. 1011d shows the centerline of the fourth user's right-eye FOV. Note that the centerline 1010d of the fourth user's left-eye FOV and the centerline 1011d of the fourth user's right-eye FOV are approximately parallel, which corresponds to looking at a great distance. Note that the fourth user is looking far to the left. Note that the first, second, third, and fourth users are all viewing different views of the movie at the same point in time. It should be noted that some of these designs, such as the camera cluster or spherical system described
Figure 11A shows a top-down view of the first user's left-eye view at time point 1. 1100 shows the left-eye viewpoint. 1101 shows the right-eye viewpoint. 1102 shows the portion of the field of view (FOV) not covered by either camera. 1103 shows the portion of the FOV covered by at least one camera. 1104A shows the medial portion of the high-resolution FOV used by the user, which corresponds to α = +25°. This is discussed in more detail in U.S. patent application 17/225,610, the entire contents of which are incorporated herein by reference. 1105A shows the lateral portion of the high-resolution FOV used by the user, which corresponds to α = -25°.
Figure 11B shows a top-down view of the first user's left-eye view, in which the convergence point is near the left and right eyes. 1100 shows the left-eye viewpoint. 1101 shows the right-eye viewpoint. 1102 shows the portion of the field of view (FOV) not covered by either camera. 1103 shows the portion of the FOV covered by at least one camera. 1104B shows the medial portion of the high-resolution FOV used by the user, which corresponds to α = -5°. 1105B shows the lateral portion of the high-resolution FOV used by the user, which corresponds to α = +45°.
Figure 11C shows the left-eye view without convergence at time point 1. Note that the image shows the flower 1106, which is positioned along the viewing angle α = 0°.
Figure 11D shows the left-eye view with convergence at time point 2. Note that the image shows the flower 1106, which is still positioned along the viewing angle α = 0°. However, the user has converged during this time point. This convergence changes the left eye's horizontal field of view from α between -25° and +25° (as shown in Figures 11A and 11C) to α between -5° and +45° (as shown in Figures 11B and 11D). This system improves on the prior art because it provides stereoscopic convergence on a stereoscopic camera by shifting the image according to the left (right) field of view. In some embodiments, a portion of the display is non-optimized, as described in U.S. Patent 10,712,837, the entire contents of which are incorporated herein by reference.
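The FOV shift described here reduces to a one-line window computation. The sketch below simply reproduces the -25°/+25° to -5°/+45° example, under the assumption (for illustration only) that convergence adds a fixed angular offset to the left eye's window:

```python
def eye_fov_window(center_deg=0.0, half_fov_deg=25.0, convergence_shift_deg=0.0):
    """Horizontal FOV window for one eye, in the alpha convention of
    Figures 11A-11D. A positive shift models convergence for the left eye."""
    lo = center_deg - half_fov_deg + convergence_shift_deg
    hi = center_deg + half_fov_deg + convergence_shift_deg
    return lo, hi

print(eye_fov_window())                           # (-25.0, 25.0), Figs 11A/11C
print(eye_fov_window(convergence_shift_deg=20))   # (-5.0, 45.0), Figs 11B/11D
```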
Figure 12 shows the reconstruction of various stereoscopic images from previously acquired wide-angle stereoscopic imagery. 1200 shows acquiring imagery from a stereoscopic camera system. This camera system is discussed in more detail in U.S. patent application 17/225,610, the entire contents of which are incorporated herein by reference. 1201 shows using a first camera for the left-eye viewing perspective and a second camera for the right-eye viewing perspective. 1202 shows selecting the field of view of the first camera based on the left-eye viewing angle and selecting the field of view of the second camera based on the right-eye viewing angle. In the preferred embodiment, this selection is performed by a computer (e.g., integrated into the head display unit) based on an eye tracking system that tracks the user's eye movements. It should also be noted that, in the preferred embodiment, during convergence there is also an inward image shift on the display, toward the nose, as taught in U.S. Patent 10,712,837, especially Figures 15A, 15B, 16A, and 16B, the entire contents of which are incorporated herein by reference. 1203 shows presenting the left-eye field of view to the user's left eye and the right-eye field of view to the user's right eye. There are several options in this context. First, composite stereoscopic image pairs can be used, in which the left-eye image is generated from at least two lenses (e.g., a first optimized for close-up imaging and a second optimized for long-range imaging), and in which the right-eye image is likewise generated from at least two lenses (e.g., a first optimized for close-up imaging and a second optimized for long-range imaging). When the user is looking at a near object, a stereoscopic image pair is presented in which the near object is in focus and distant objects are out of focus. When the user is looking at a distant object, a stereoscopic image pair is presented in which near objects are out of focus and the distant object is in focus. Second, various display devices (e.g., augmented reality, virtual reality, or mixed reality displays) can be used.
Figure 13A shows a top-down view of a home theater. 1300 shows the user. 1301 shows the projector. 1302 shows the screen. Note that the field of view of this immersive home theater display is larger than the field of view of the user 1300. For example, if the user 1300 is looking straight ahead, the home theater displays a horizontal FOV of more than 180 degrees. Therefore, the home theater's FOV completely covers the user's horizontal FOV. Similarly, if the user is looking straight ahead, the home theater displays a vertical FOV of more than 120 degrees. Therefore, the home theater's FOV completely covers the user's vertical FOV. An AR/VR/MR headset can be used in conjunction with this system, but it is not required. Inexpensive anaglyph or disposable colored glasses can also be used. A conventional IMAX polarized projector can be used with IMAX-style disposable polarized glasses. The size of the home theater can vary. The walls of the home theater can be constructed using white reflective panels and a frame. The projector has multiple heads to cover the larger field of view.
Figure 13B shows a side view of the home theater shown in Figure 13A. 1300 shows the user. 1301 shows the projector. 1302 shows the screen. Note that the field of view of this immersive home theater display is larger than the field of view of the user 1300. For example, if the user 1300 is looking forward from a recliner, the home theater displays a vertical FOV of more than 120 degrees. Therefore, the home theater's FOV completely covers the user's FOV. Similarly, if the user is looking straight ahead, the home theater displays a horizontal FOV of more than 120 degrees. Therefore, the home theater's FOV completely covers the user's FOV.
Figure 14A shows a top-down view of a home theater. 1400A shows a first user. 1400B shows a second user. 1401 shows the projector. 1402 shows the screen. Note that the field of view of this immersive home theater display is larger than the FOV of the first user 1400A or the second user 1400B. For example, if the first user 1400A is looking straight ahead, the first user 1400A sees a horizontal FOV of more than 180 degrees. Therefore, the home theater's FOV completely covers the user's horizontal FOV. Similarly, if the first user 1400A is looking straight ahead, the home theater displays a vertical FOV of more than 120 degrees, as shown in Figure 14B. Therefore, the home theater's FOV completely covers the user's vertical FOV. An AR/VR/MR headset can be used in conjunction with this system, but it is not required. Inexpensive anaglyph or polarized glasses can also be used. A conventional IMAX polarized projector can be used with IMAX-style disposable polarized glasses. The size of the home theater can vary. The walls of the home theater can be constructed using white reflective panels and a frame. The projector has multiple heads to cover the larger field of view.
Figure 14B shows a side view of the home theater shown in Figure 14A. 1400A shows the first user. 1401 shows the projector. 1402 shows the screen. Note that the field of view of this immersive home theater display is larger than the field of view of the first user 1400A. For example, if the user 1400A is looking forward from a recliner, the user sees a vertical FOV of more than 120 degrees. Therefore, the home theater's FOV completely covers the FOV of the first user 1400A. Similarly, if the first user 1400A is looking straight ahead, the home theater displays a horizontal FOV of more than 120 degrees. Therefore, the home theater's FOV completely covers the FOV of the first user 1400A.
A typical high-resolution display has 4,000 pixels (horizontally) at a viewing distance of 1.37 meters. This is equivalent to 10×10⁶ pixels per 1.87 square meters. Consider the data for a hemispherical theater. Assume the radius of the hemispherical theater is 2 meters. The surface area of a hemisphere is 2×π×r², which equals (2)(3.14)(2²), or approximately 25.1 m². Assuming the desired spatial resolution equals that of a typical high-resolution display, this equals (25.1 m²)(10×10⁶ pixels per 1.87 m²), or approximately 134 million pixels. Assume a frame rate of 60 frames per second. This corresponds to roughly 13 times the data volume of a standard 4K monitor.
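The sizing above can be rechecked with a few lines of arithmetic (same assumptions: 10 million pixels per 1.87 m², a 2-meter dome radius, 60 frames per second):

```python
import math

def dome_pixels(radius_m=2.0, px_per_m2=10e6 / 1.87, fps=60):
    """Recompute the hemispherical-theater sizing above."""
    area_m2 = 2 * math.pi * radius_m ** 2      # hemisphere surface area
    pixels = area_m2 * px_per_m2
    return area_m2, pixels, pixels * fps       # area, pixels, pixels per second

area, px, px_per_s = dome_pixels()
print(f"{area:.1f} m^2, {px / 1e6:.0f} Mpx, {px_per_s / 1e9:.1f} Gpx/s")
# ~25.1 m^2, ~134 Mpx, ~8.1 Gpx/s
```

The roughly 8 gigapixels per second of raw display data motivates the gaze-directed partial streaming described elsewhere in this patent.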
Some embodiments include building a home theater that matches the geometry of the projector. Preferred embodiments are sub-spherical (e.g., hemispherical). One low-cost construction uses reflective surface panels tiled together, paired with a multi-head projector. In some embodiments, the field of view comprises full spherical coverage of 4π steradians; this can be achieved with an HDU. In some embodiments, the field of view comprises sub-spherical coverage of at least 3π steradians. In other embodiments, the sub-spherical coverage is at least 2π steradians, at least 1π steradian, at least 0.5π steradians, at least 0.25π steradians, or at least 0.05π steradians. In some embodiments, a sub-spherical IMAX-type system is created to improve the movie theater experience for many viewers: chairs would be placed in positions similar to a standard movie theater, but the screen would be sub-spherical. In some embodiments, non-spherical shapes may also be used.
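For intuition about these steradian figures, the solid angle of a spherical cap with half-angle θ is Ω = 2π(1 − cos θ). A short sketch (the helper name is illustrative) relating screen half-angle to coverage:

```python
import math

def cap_solid_angle_sr(half_angle_deg: float) -> float:
    """Solid angle (in steradians) of a spherical cap with the given half-angle."""
    theta = math.radians(half_angle_deg)
    return 2 * math.pi * (1 - math.cos(theta))

for half_angle in (90, 60, 45, 30):   # 90 degrees corresponds to a full hemisphere
    omega = cap_solid_angle_sr(half_angle)
    print(f"{half_angle:3d} deg -> {omega:5.2f} sr ({omega / math.pi:.2f} pi sr)")
# 90 deg yields 2*pi sr, matching the hemispherical embodiment described above
```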
Figure 15A shows time point #1, where the user is looking straight ahead and sees a field of view of approximately 60 degrees horizontally and 40 degrees vertically with fairly accurate vision (e.g., the user can perceive shape and color within this portion of the peripheral FOV).
Figure 15B shows the center portion of the TV and the field of view observed by the user at time point #1. Note that in some embodiments the data would be streamed (e.g., over the Internet). Note that one innovative feature of this patent is called "viewing-parameter-directed streaming." In this embodiment, viewing parameters are used to direct which data is streamed. For example, if the user 1500 is looking straight ahead, a first set of data would be streamed to correspond to user 1500's straight-ahead viewing angle. If, however, the user is looking toward the side of the screen, a second set of data would be streamed to correspond to user 1500's side-viewing angle. Other viewing parameters that can control the viewing angle include, but are not limited to: the user's vergence; the user's head position; and the user's head orientation. Broadly speaking, any characteristic of the user (age, gender, preferences) or action (viewing angle, position, etc.) can be used to direct streaming. Note that another innovative feature is streaming at least two image qualities. For example, a first image quality (e.g., high quality) would be streamed according to a first criterion (e.g., within the user's 30° horizontal FOV and 30° vertical FOV), and a second image quality (e.g., lower quality) would be streamed for areas not meeting that criterion (e.g., outside the user's 30° horizontal FOV and 30° vertical FOV). Surround sound would be implemented in this system.
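As a rough illustration of viewing-parameter-directed, two-quality streaming, the sketch below assigns each screen tile a quality tier from the user's gaze angle (the 30° window comes from the example above; the tile structure and function names are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class Tile:
    h_deg: float  # tile center, horizontal angle from screen center
    v_deg: float  # tile center, vertical angle from screen center

def quality_for_tile(tile: Tile, gaze_h_deg: float, gaze_v_deg: float) -> str:
    """High quality inside the user's 30-degree window, lower quality elsewhere."""
    in_window = (abs(tile.h_deg - gaze_h_deg) <= 15.0
                 and abs(tile.v_deg - gaze_v_deg) <= 15.0)
    return "high" if in_window else "low"

# The user looks 20 degrees left of center: only nearby tiles stream at high quality.
tiles = [Tile(h, 0.0) for h in range(-60, 61, 10)]
plan = {t.h_deg: quality_for_tile(t, gaze_h_deg=-20.0, gaze_v_deg=0.0) for t in tiles}
```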
Figure 15C shows time point #2, where the user is looking toward the left side of the screen and sees a field of view of approximately 60 degrees horizontally and 40 degrees vertically with fairly accurate vision (e.g., the user can perceive shape and color within this portion of the peripheral FOV).
Figure 15D shows time point #2, where the field of view observed by the user differs from that of Figure 15B. The region of interest is half that of time point #1. In some embodiments, the user is provided with more detail and higher resolution for objects within a small FOV within the scene. Outside this high-resolution field-of-view region, lower image quality can be rendered on the screen.
Figure 15E shows time point #3, where the user is looking toward the right side of the screen.
Figure 15F shows time point #3, where a circular high-resolution FOV is seen.
Figure 16A shows an unzoomed image. 1600 shows the image. 1601A shows a box representing the area within image 1600 that is set to be magnified.
Figure 16B shows digital magnification of a portion of the image. This can be achieved by the method described in U.S. Patent 8,384,771 (e.g., 1 pixel becomes 4 pixels), which is incorporated herein by reference in its entirety. Note that selection of the area to be magnified can be accomplished through various user inputs, including: a gesture tracking system; an eye tracking system; and a graphical user interface (GUI). Note that the area within box 1601A of Figure 16A is now magnified, as shown at 1601B; the pixel resolution of region 1601B equals that of image 1600, only displayed larger. Note that 1600B shows the unmagnified portion of the image, and that because 1601A is now magnified, some portions of the original image are no longer visualized.
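A minimal numpy sketch of this style of digital magnification, in which one source pixel is replicated into a 2×2 block ("1 pixel becomes 4 pixels"); the function names are illustrative, and no new detail is created:

```python
import numpy as np

def digital_zoom_2x(image: np.ndarray) -> np.ndarray:
    """Nearest-neighbor 2x upscale: each source pixel becomes a 2x2 block."""
    return np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)

def magnify_region(image: np.ndarray, top: int, left: int, h: int, w: int) -> np.ndarray:
    """Crop the selected region (e.g., box 1601A) and digitally magnify it."""
    return digital_zoom_2x(image[top:top + h, left:left + w])

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)               # stand-in for image 1600
inset = magnify_region(frame, top=400, left=800, h=200, w=300)  # 400 x 600 result
```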
Figure 17A shows an unzoomed image. 1700 shows the image. 1701A shows a box representing the area within image 1700 that is set to be magnified.
Figure 17B shows optical magnification of a portion of the image. Note that selection of the area to be magnified can be accomplished through various user inputs, including: a gesture tracking system; an eye tracking system; and a graphical user interface (GUI). Note that the area within box 1701A of Figure 17A is now magnified, as shown at 1701B, and also note that the image within 1701B exhibits higher image quality. This can be accomplished by selectively displaying the highest-quality imagery in region 1701B while magnifying region 1701B: not only are the clouds larger, the resolution of the clouds is also better. Note that 1700B shows the unmagnified portion of 1700A (and that some unmagnified portions of 1700A are now covered by the magnified region).
Figure 18A shows a single-resolution image. 1800A shows the image. 1801A shows a box representing the area within image 1800A that is set to receive increased resolution.
Figure 18B shows a multi-resolution image. Note that the region of increased resolution can be selected through various user inputs, including: a gesture tracking system; an eye tracking system; and a graphical user interface (GUI) with a joystick or controller. Note that the area within box 1801A of Figure 18A is now displayed at higher resolution, as shown at 1801B. In some embodiments, the image within 1801B can also be changed in other ways (e.g., a different color scheme, different brightness settings, etc.). This can be accomplished by selectively displaying higher-quality (e.g., highest-quality) imagery in region 1801B without magnifying region 1801B.
Figure 19A shows a large field of view in which a first user is looking at a first portion of the image and a second user is looking at a second portion of the image. 1900A is the large field of view, shown at a first resolution. 1900B is the location where the first user is looking, which is set to become high resolution, as shown in Figure 19B. 1900C is the location where the second user is looking, which is set to become high resolution, as shown in Figure 19B.
Figure 19B shows that only the first portion and the second portion of the image from Figure 19A are high resolution, while the remainder of the image is low resolution. 1900A is the large field of view at the first resolution (low resolution). 1900B is the first user's high-resolution region, which has a second resolution (high resolution in this example). 1900C is the second user's high-resolution region, which also has the second resolution. Thus, the first high-resolution region serves the first user, and the second high-resolution region serves the second user. This system can be useful for home theater displays such as those shown in Figures 14A and 14B.
Figure 20A shows a low-resolution image.
Figure 20B shows a high-resolution image.
Figure 20C shows a composite image. Note that the composite image has a low-resolution first portion 2000 and a high-resolution second portion 2001. This is described in U.S. patent application Ser. No. 16/893,291, which is incorporated herein by reference in its entirety. Which portion is shown at which quality is determined by the user's viewing parameters (e.g., viewing angle). One innovation is the near-real-time streaming of the first portion 2000 at a first image quality and the second portion 2001 at a second image quality. Note that the first portion can be displayed differently from the second portion; for example, the two portions can differ in visual presentation parameters including brightness, color scheme, or others. Thus, in some embodiments, the first portion of the image can be compressed while the second portion is not. In other embodiments, a composite image is generated for display to the user by stitching together some high-resolution imagery and some low-resolution imagery. In some embodiments, certain portions of a large image (e.g., on the order of 10⁸ pixels, as in the hemispherical theater example above) are high resolution and other portions are low resolution; the high-resolution portions of the large image are streamed according to the user's viewing parameters (e.g., convergence point, viewing angle, head angle, etc.).
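A rough numpy sketch of composing such an image: a low-resolution base is cheaply upscaled for display and one high-resolution patch is pasted over each gaze region (this also covers the two-user case of Figures 19A and 19B; the function names and sizes are illustrative):

```python
import numpy as np

def compose(base_low: np.ndarray,
            insets: list[tuple[int, int, np.ndarray]]) -> np.ndarray:
    """Upscale a low-res base 2x, then paste high-res patches at (top, left)."""
    canvas = np.repeat(np.repeat(base_low, 2, axis=0), 2, axis=1)
    for top, left, patch in insets:
        canvas[top:top + patch.shape[0], left:left + patch.shape[1]] = patch
    return canvas

base = np.zeros((540, 960, 3), dtype=np.uint8)        # streamed at low quality
patch = np.full((270, 480, 3), 255, dtype=np.uint8)   # streamed at high quality
frame = compose(base, [(405, 720, patch)])            # one inset per gaze point
```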
Figure 21 shows a method and process for performing near-real-time streaming of customized images.
Regarding the display 2100, displays include but are not limited to: a large TV; an extended reality display (e.g., augmented reality, virtual reality, or mixed reality); a projector-and-screen system; a computer monitor; and the like. A key capability of the display system is tracking where in the image the user is looking and what the user's viewing parameters are.
Regarding the viewing parameters 2101, viewing parameters include but are not limited to: viewing angle; vergence/convergence; and user preferences (e.g., objects of special interest, or filtering, whereby certain objects rated "R" can be filtered out for a particular user, etc.).
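A minimal sketch of how such parameters might be bundled per user before being sent to the cloud (the field names are assumptions for illustration, not terms from the patent):

```python
from dataclasses import dataclass, field

@dataclass
class ViewingParameters:
    user_id: str
    gaze_h_deg: float                 # horizontal viewing angle
    gaze_v_deg: float                 # vertical viewing angle
    vergence_m: float                 # distance to the eyes' convergence point
    head_position_m: tuple[float, float, float] = (0.0, 0.0, 0.0)
    head_direction_deg: float = 0.0
    preferences: dict[str, str] = field(default_factory=dict)  # e.g., content filters

params = ViewingParameters("dave", gaze_h_deg=-12.0, gaze_v_deg=3.5, vergence_m=2.1,
                           preferences={"rating_filter": "PG-13"})
```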
Regarding the cloud 2102, each frame of a movie or video will be an extremely large dataset (especially if the home theater shown in Figures 14A and 14B is used in conjunction with the camera cluster described in U.S. patent application Ser. No. 17/225,610, which is incorporated herein by reference in its entirety). Note that "cloud" here refers to storage, databases, and the like, and that the cloud is capable of cloud computing. One innovation of this patent is sending the viewing parameters of one or more users to the cloud, processing the viewing parameters in the cloud (e.g., selecting a field of view or compositing a stereoscopic image pair, as discussed in Figure 12), and determining which portions of the very large dataset to stream so as to optimize each individual user's experience. For example, multiple users can watch their movie in synchrony. Each user streams 2103 personalized, optimized data for that specific time point from the cloud to their mobile device, and each user then views their own optimized data on their own device. This yields an improved immersive viewing experience. For example, suppose a dinner scene at a single time point contains a chandelier, a dog, an old man, a bookcase, a long table, a rug, and wall ornaments. A user named Dave may be looking at the dog, and Dave's imagery would be optimized accordingly (e.g., an image of the dog at maximum resolution and optimized color is streamed to Dave's mobile device and displayed on Dave's HDU). A user named Kathy may be looking at the chandelier, and Kathy's imagery would be optimized accordingly (e.g., an image of the chandelier at maximum resolution and optimized color is streamed to Kathy's mobile device and displayed on Kathy's HDU). Finally, a user named Bob may be looking at the old man, and Bob's imagery would be optimized accordingly (e.g., an image of the old man at maximum resolution and optimized color is streamed to Bob's mobile device and displayed on Bob's HDU). Note that the cloud stores an enormous dataset for each time point, but only portions of that dataset are streamed, with those portions determined by each user's viewing parameters and/or preferences. Thus, the bookcase, long table, rug, and wall ornaments may all be within the fields of view of Dave, Kathy, and Bob, but these objects are not optimized for display (e.g., the highest available resolution of these images stored in the cloud is not streamed).
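A rough server-side sketch of that per-user selection, using the dinner-scene objects from the example above (the asset tiers and function name are hypothetical):

```python
SCENE_OBJECTS = {"chandelier", "dog", "old_man", "bookcase",
                 "long_table", "rug", "wall_ornaments"}

def assets_to_stream(gazed_object: str, fov_objects: set[str]) -> dict[str, str]:
    """Stream the gazed-at object at max resolution; the rest of the FOV at low."""
    plan = {obj: "low_res" for obj in fov_objects & SCENE_OBJECTS}
    if gazed_object in plan:
        plan[gazed_object] = "max_res"
    return plan

# Dave looks at the dog, Kathy at the chandelier, Bob at the old man.
for user, gaze in (("Dave", "dog"), ("Kathy", "chandelier"), ("Bob", "old_man")):
    print(user, assets_to_stream(gaze, SCENE_OBJECTS))
```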
Finally, the concept of pre-emptive streaming is introduced. If it is predicted that an upcoming scene is likely to change a particular user's viewing parameters (e.g., cause the user to turn their head), the additional image frames can be streamed pre-emptively. For example, if the movie is at 1:43:05 and a dinosaur will roar and leap from the left side of the screen at 1:43:30, the entire scene can be downloaded in a low-resolution format, and additional datasets for selective portions of the FOV can be downloaded as needed (e.g., based on the user's viewing parameters, or based on the prediction that the user will look at the upcoming dinosaur scene). The dinosaur that leaps out is therefore always shown at its maximum resolution. This technique creates a more immersive and improved viewing experience.
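A minimal sketch of that pre-emptive scheduling, using the timecodes from the example (the 30-second prefetch lead and the helper names are assumptions):

```python
def should_prefetch(now_s: int, event_s: int, lead_s: int = 30) -> bool:
    """Begin prefetching high-res data within `lead_s` seconds of a predicted event."""
    return 0 <= event_s - now_s <= lead_s

now = 1 * 3600 + 43 * 60 + 5               # movie time 1:43:05
dinosaur_event = 1 * 3600 + 43 * 60 + 30   # dinosaur leaps from screen left at 1:43:30

if should_prefetch(now, dinosaur_event):
    # Fetch max-resolution tiles for the screen-left region the dinosaur will occupy,
    # on top of the low-resolution download of the whole scene.
    prefetch_request = {"region": "left", "quality": "max_res", "t": dinosaur_event}
```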
Figure 22A shows the use of resection in conjunction with a stereo camera. Camera #1 has a known position (e.g., latitude and longitude from GPS). The distance (2 miles) and direction (bearing 330°, north-northwest) from camera #1 to the object 2200 are known, so the position of object 2200 can be calculated. Camera #2 has an unknown position, but the distance (1 mile) and direction (bearing 30°, north-northeast) from camera #2 to object 2200 are known. Since the position of object 2200 can be calculated, the geometry can be solved and the position of camera #2 determined.
Figure 22B shows the complementary case, intersection, in conjunction with a stereo camera. Camera #1 and camera #2 both have known positions (e.g., latitude and longitude from GPS). The direction from camera #1 to the object 2200B is known (bearing 330°, north-northwest), and the direction from camera #2 to object 2200B is known (bearing 30°, north-northeast). From these two bearings, the position of object 2200B can be calculated.
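Both geometric cases can be sketched in a flat local east/north frame with compass bearings measured clockwise from north (distances in miles; the 330° and 30° bearings come from the figures; the helper names are illustrative):

```python
import math

def project(pos, bearing_deg, dist):
    """Point located `dist` from `pos` along a compass bearing (clockwise from north)."""
    b = math.radians(bearing_deg)
    return (pos[0] + dist * math.sin(b), pos[1] + dist * math.cos(b))

def intersect(p1, bearing1_deg, p2, bearing2_deg):
    """Figure 22B case: locate an object from two known positions and two bearings."""
    b1, b2 = math.radians(bearing1_deg), math.radians(bearing2_deg)
    d1x, d1y = math.sin(b1), math.cos(b1)
    d2x, d2y = math.sin(b2), math.cos(b2)
    det = -d1x * d2y + d2x * d1y                      # assumes non-parallel bearings
    t = (-(p2[0] - p1[0]) * d2y + d2x * (p2[1] - p1[1])) / det
    return (p1[0] + t * d1x, p1[1] + t * d1y)

# Figure 22A case: camera #1 known; the object is fixed by bearing plus range,
# and camera #2 is recovered by walking back along its own 1-mile bearing ray.
cam1 = (0.0, 0.0)
obj = project(cam1, 330.0, 2.0)                    # 2 miles at bearing 330 degrees
cam2 = project(obj, (30.0 + 180.0) % 360.0, 1.0)   # reverse of camera #2's bearing
assert math.isclose(intersect(cam1, 330.0, cam2, 30.0)[0], obj[0])
```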
Figure 23A shows a top view of a person looking straight ahead toward the center of a home theater screen. The person 2300 is looking forward toward the center portion 2302B of the home theater screen 2301. During this time point, the stream is customized to provide an optimized center portion 2302B (e.g., the highest possible resolution), a non-optimized left portion 2302A (e.g., low resolution or black), and a non-optimized right portion 2302C (e.g., low resolution or black). Note that for proper streaming, input from a monitoring system (which detects the user's viewing direction and other viewing parameters, such as posture or facial expressions) or from a controller (which receives the user's commands) must also be in place.
Figure 23B shows a top view of a person looking toward the right side of the home theater screen. The person 2300 is looking toward the right portion 2302C of the home theater screen 2301. During this time point, the stream is customized to provide an optimized right portion 2302C (e.g., the highest possible resolution), a non-optimized left portion 2302A (e.g., low resolution or black), and a non-optimized center portion 2302B (e.g., low resolution or black). Note that for proper streaming, input from a monitoring system (which detects the user's viewing direction and other viewing parameters, such as posture or facial expressions) or from a controller (which receives the user's commands) must also be in place.
Figure 24 shows a method, system, and apparatus for optimizing stereo camera settings during image acquisition on the move. 2400 shows determining the distance to an object at a time point (e.g., using a laser rangefinder); an object-tracking/target-tracking system can be implemented here. 2401 shows adjusting the zoom setting of the stereo camera system so that it is optimized for the distance determined in step 2400; in a preferred embodiment, this is done with an optical zoom lens, as opposed to digital zoom. 2402 shows adjusting the separation distance between the stereo cameras (the stereo baseline) so that it is optimized for the distance determined in step 2400; optionally, the camera orientations can also be adjusted for that distance. 2403 shows acquiring a stereoscopic image of the target at the time point of step 2400. 2404 shows recording, viewing, and/or analyzing the acquired stereoscopic imagery.
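A minimal sketch of the Figure 24 acquisition loop under assumed heuristics (the baseline-to-range ratio and the zoom mapping are illustrative choices, not values from the patent):

```python
from dataclasses import dataclass

@dataclass
class StereoRig:
    focal_length_mm: float = 50.0
    baseline_m: float = 0.065        # human-like interpupillary distance by default

def optimize_rig(rig: StereoRig, target_range_m: float) -> StereoRig:
    """Steps 2401-2402: set optical zoom and stereo baseline for the measured range."""
    rig.focal_length_mm = min(400.0, 25.0 * target_range_m)  # assumed zoom heuristic
    rig.baseline_m = target_range_m / 30.0                   # assumed baseline ratio
    return rig

range_m = 12.0                             # step 2400: e.g., laser rangefinder reading
rig = optimize_rig(StereoRig(), range_m)   # steps 2401-2402
# Step 2403: trigger a synchronized capture; step 2404: record, view, and/or analyze.
```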
Claims (20)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/237,152 US11589033B1 (en) | 2021-02-28 | 2021-04-22 | Immersive viewing experience |
US17/237,152 | 2021-04-22 | ||
PCT/US2022/025818 WO2022226224A1 (en) | 2021-04-22 | 2022-04-21 | Immersive viewing experience |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117321987A (en) | 2023-12-29 |
Family
ID=83723167
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202280030471.XA Pending CN117321987A (en) | 2021-04-22 | 2022-04-21 | Immersive viewing experience |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP4327552A4 (en) |
JP (1) | JP2024518243A (en) |
CN (1) | CN117321987A (en) |
WO (1) | WO2022226224A1 (en) |
Also Published As
Publication number | Publication date |
---|---|
JP2024518243A (en) | 2024-05-01 |
EP4327552A4 (en) | 2025-03-12 |
WO2022226224A1 (en) | 2022-10-27 |
EP4327552A1 (en) | 2024-02-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |