Disclosure of Invention
In view of the deficiencies of the prior art, it is an object of the present invention to provide a method and system for immersive media content that changes with user movement.
According to the invention, a method for immersive media content that changes with user movement is provided, comprising:
a parsing step: reading and parsing a video stream;
a viewpoint determining step: determining the number of viewpoints, the initial viewpoint, the relative relationship among the viewpoints, and the maximum coverage radius of each viewpoint;
a displacement feedback step: feeding back the user's displacement relative to the initial viewpoint according to the user's position;
a viewpoint selection step: selecting the current viewpoint according to the relative relationship among the viewpoints, the maximum coverage radius of each viewpoint, and the relative displacement;
a viewing content determination step: determining the current depth value according to the relative displacement, and determining the viewing content of the current viewpoint according to the relationship between viewing range and depth within the viewpoint;
a video presenting step: presenting the content in the video stream that corresponds to the determined viewing content.
Preferably, the relative relationship among the viewpoints includes: establishing a coordinate system with the initial viewpoint as the origin, and determining the coordinates of the viewpoints other than the initial viewpoint.
Preferably, the user's relative displacement includes: the x-coordinate, y-coordinate, and z-coordinate information of the user's movement.
Preferably, the depth value is the distance of the user's position from the current viewpoint.
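As a hedged illustration of the steps above, the viewpoint selection and depth determination can be sketched as follows. All coordinates and radii are hypothetical numbers invented for this sketch, and the nearest-covering-viewpoint tie-breaking rule is an assumption not stated in the text:

```python
import math

# Hypothetical numbers for illustration: viewpoint coordinates are given
# in the coordinate system whose origin is the initial viewpoint, as in
# the preferred embodiment above.
viewpoints = {
    0: (0.0, 0.0, 0.0),  # initial viewpoint (coordinate origin)
    1: (4.0, 0.0, 0.0),
    2: (4.0, 3.0, 0.0),
}
max_radius = {0: 2.5, 1: 2.5, 2: 2.5}  # maximum coverage radius of each viewpoint

def select_viewpoint(displacement):
    """Select the current viewpoint and its depth value.

    `displacement` is the user's (x, y, z) displacement relative to the
    initial viewpoint.  Among all viewpoints whose maximum coverage
    radius contains that position, the nearest one is chosen (an assumed
    tie-breaking rule; the text does not specify one).  The returned
    depth is the distance of the user's position from the chosen
    viewpoint, as stated in the preferred embodiment.
    """
    candidates = [(math.dist(displacement, center), vid)
                  for vid, center in viewpoints.items()
                  if math.dist(displacement, center) <= max_radius[vid]]
    if not candidates:
        return None, None  # outside every viewpoint's coverage
    depth, vid = min(candidates)
    return vid, depth
```

For example, a displacement of (3.5, 0, 0) falls only inside the coverage sphere of viewpoint 1, so that viewpoint is selected with depth 0.5.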
According to the invention, a system for immersive media content that changes with user movement is provided, comprising:
a parsing module: reading and parsing a video stream;
a viewpoint determining module: determining the number of viewpoints, the initial viewpoint, the relative relationship among the viewpoints, and the maximum coverage radius of each viewpoint;
a displacement feedback module: feeding back the user's displacement relative to the initial viewpoint according to the user's position;
a viewpoint selection module: selecting the current viewpoint according to the relative relationship among the viewpoints, the maximum coverage radius of each viewpoint, and the relative displacement;
a viewing content determination module: determining the current depth value according to the relative displacement, and determining the viewing content of the current viewpoint according to the relationship between viewing range and depth within the viewpoint;
a video presentation module: presenting the content in the video stream that corresponds to the determined viewing content.
Preferably, the relative relationship among the viewpoints includes: establishing a coordinate system with the initial viewpoint as the origin, and determining the coordinates of the viewpoints other than the initial viewpoint.
Preferably, the user's relative displacement includes: the x-coordinate, y-coordinate, and z-coordinate information of the user's movement.
Preferably, the depth value is the distance of the user's position from the current viewpoint.
Compared with the prior art, the invention has the following beneficial effects:
for immersive media content supporting motion parallax, the method and system allow the user's viewing area to change in a defined relation to the current viewpoint relative to the initial viewpoint, and at the same time support switching among different viewpoints, so that the user feels fully immersed in the scene and the degree of freedom in consuming the immersive media is substantially improved according to the user's interaction behavior.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit it in any way. It should be noted that various changes and modifications, obvious to those skilled in the art, can be made without departing from the spirit of the invention; all of these fall within the scope of the present invention.
As shown in fig. 1, a method for immersive media content that changes with user movement is provided according to the present invention, comprising:
A parsing step: the server side reads and parses the video stream.
A viewpoint determining step: determining the number of viewpoints, the initial viewpoint, the relative relationship among the viewpoints, and the maximum coverage radius of each viewpoint. The relative relationship among the viewpoints includes: establishing a coordinate system with the initial viewpoint as the origin, and determining the coordinates of the viewpoints other than the initial viewpoint.
A displacement feedback step: feeding back the user's displacement relative to the initial viewpoint according to the user's position. The user's relative displacement comprises: the x-coordinate, y-coordinate, and z-coordinate information of the user's movement.
A viewpoint selection step: selecting the current viewpoint according to the relative relationship among the viewpoints, the maximum coverage radius of each viewpoint, and the relative displacement.
A viewing content determination step: determining the current depth value according to the relative displacement, and determining the viewing content of the current viewpoint according to the relationship between viewing range and depth within the viewpoint. The depth value is the distance of the user's position from the current viewpoint.
A video presenting step: presenting the content in the video stream that corresponds to the determined viewing content.
Based on the above method for immersive media content that changes with user movement, the invention also provides a system for immersive media content that changes with user movement, comprising:
A parsing module: the server side reads and parses the video stream.
A viewpoint determining module: determining the number of viewpoints, the initial viewpoint, the relative relationship among the viewpoints, and the maximum coverage radius of each viewpoint. The relative relationship among the viewpoints includes: establishing a coordinate system with the initial viewpoint as the origin, and determining the coordinates of the viewpoints other than the initial viewpoint.
A displacement feedback module: feeding back the user's displacement relative to the initial viewpoint according to the user's position. The user's relative displacement comprises: the x-coordinate, y-coordinate, and z-coordinate information of the user's movement.
A viewpoint selection module: selecting the current viewpoint according to the relative relationship among the viewpoints, the maximum coverage radius of each viewpoint, and the relative displacement.
A viewing content determination module: determining the current depth value according to the relative displacement, and determining the viewing content of the current viewpoint according to the relationship between viewing range and depth within the viewpoint. The depth value is the distance of the user's position from the current viewpoint.
A video presentation module: presenting the content in the video stream that corresponds to the determined viewing content.
The invention aims to provide an identification method for immersive media content that changes correspondingly with user interaction behavior (such as head movement and body movement). The method can indicate the user's interaction behavior while consuming the immersive media content and feed it back to the server side, so as to obtain immersive media content that satisfies the user's requirements and different application scenarios.
In this embodiment, an immersive virtual museum is taken as an example. In an immersive virtual museum, the user wears a stereoscopic head-mounted device and can interact with the surrounding environment and nearby objects within an exhibition area; the visible exhibition area changes in a defined relation to the position to which the user moves, and viewpoint switching between different exhibition areas is supported. In immersive media, the position at which a user views the media content is called the viewpoint, i.e., the position of the camera. The immersive virtual museum allows the user to switch among a plurality of panoramic videos of the exhibition areas based on the position information of the different viewpoints, and also supports motion parallax, i.e., the viewing content at the current viewpoint is allowed to change with the user's interaction behavior; for example, the user can observe finer detail in the scene by moving closer to an exhibit, obtaining a better immersive experience. Assuming that the different exhibition-area scenes are shot by independent, fixed panoramic cameras, the user can freely change the walking route so as to switch between different viewpoints and be immersed in the corresponding exhibition-area scene; meanwhile, the user can change head position to perceive motion parallax, achieving the effect of viewing objects within the current viewpoint's coverage from nearer or farther, i.e., the user's actual viewing area is adaptively adjusted according to the interaction behavior (recorded as the real-time relative displacement) within the viewpoint's range.
In order to achieve the above purpose, the following technical solutions are adopted in this embodiment:
for immersive media content supporting motion parallax, the user's viewing area is enabled to change in a defined relation to the current viewpoint relative to the initial viewpoint, while switching among different viewpoints is also enabled, as if the user were fully immersed in the scene. Necessary indication information uniquely associated with the content (e.g., encapsulation information, transport information, consumption information) is attached, so as to substantially increase the degree of freedom in consuming the immersive media according to the user's interaction behavior.
In the invention, the necessary indication information to be added to the immersive media content may, for example, comprise the following:
Information one: indicating the total number of viewpoints within the complete scene;
Information two: indicating the initial viewpoint;
Information three: indicating the relative position information between different viewpoints, so as to distinguish viewpoint contents;
Information four: on the basis of information three, indicating constraint information for switching among different viewpoints;
Information five: indicating the coverage range information of the different viewpoint contents, so that the user's actual viewing area can be adjusted according to the user's real-time relative displacement;
Information six: on the basis of information five, adding the correspondence between the user's actual viewing area and the real-time relative displacement;
Information seven: on the basis of information five, indicating the user's displacement relative to the current viewpoint.
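Purely as an illustration, the seven kinds of indication information above could be packaged into a single metadata record as follows. Every key and value in this sketch is hypothetical and not drawn from any standard:

```python
# Illustrative packaging of information one through seven; all keys and
# values here are invented for this sketch.
indication_info = {
    "num_viewpoints": 3,              # information one: total number of viewpoints
    "initial_viewpoint": 0,           # information two: the initial viewpoint
    "relative_positions": {           # information three: positions relative to the initial viewpoint
        1: (4.0, 0.0, 0.0),
        2: (4.0, 3.0, 0.0),
    },
    "switch_constraints": {           # information four: which viewpoint switches are allowed
        0: [1], 1: [0, 2], 2: [1],
    },
    "coverage_radius": {              # information five: coverage of each viewpoint's content
        0: 2.5, 1: 2.5, 2: 2.5,
    },
    "range_vs_displacement": "spherical",  # information six: viewing-area/displacement relation
    "user_displacement": (0.0, 0.0, 0.0),  # information seven: displacement w.r.t. current viewpoint
}
```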
Information identification is carried out on the variation of the immersive media content with movement, wherein the identification information indicates the user's movement information, the encapsulation information of the relationship between the area viewed by the user and that movement information, and the user's current viewpoint information.
For immersive media content supporting motion parallax, the user's viewing area is enabled to change in a defined relation to the current viewpoint relative to the initial viewpoint, while switching among different viewpoints is also enabled, as if the user were fully immersed in the scene. Necessary indication information uniquely associated with the video media is added, so as to fully feed back the interaction behavior on the immersive media content and thereby satisfy the specific application requirements of the video media.
To this end, as shown in fig. 2, the following fields can be added as needed:
hmovement_x: x-coordinate information indicating the viewer's movement;
hmovement_y: y-coordinate information indicating the viewer's movement;
hmovement_z: z-coordinate information indicating the viewer's movement;
parallelx_flag: indicating that the video supports motion parallax;
viewpoint_ID: indicating the user's current viewpoint information;
move_depth: the depth value of the viewing area, specifying the relative distance of the user's real-time position from the initial viewpoint, obtained from the fed-back coordinates hmovement_x, hmovement_y, and hmovement_z;
behavior_coefficient: defined as the magnification behavior coefficient;
sphere_radius: the radius of the spherical viewing area;
viewing_range_field: the area range that the user can view at the real-time position, determined according to the viewing depth, the behavior coefficient, and the spherical area radius;
num_viewpoint: the preset total number of viewpoints;
viewpoint_x(0), viewpoint_y(0), viewpoint_z(0): the x, y, z coordinate information of the initial viewpoint position, set to (0, 0, 0);
viewpoint_x(i): the x-coordinate information of the position of viewpoint i relative to the initial viewpoint;
viewpoint_y(i): the y-coordinate information of the position of viewpoint i relative to the initial viewpoint;
viewpoint_z(i): the z-coordinate information of the position of viewpoint i relative to the initial viewpoint;
rmax(i): indicating the maximum coverage radius of viewpoint i.
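The per-user fields above can be gathered into one feedback structure. The following sketch is illustrative only: the field names follow the list above, while the concrete viewing-range formula is an assumption, since the text only states that viewing_range_field depends on the viewing depth, the behavior coefficient, and the spherical area radius:

```python
import math
from dataclasses import dataclass

@dataclass
class ViewpointChangeFeedback:
    """Illustrative container for the fields listed above.

    The viewing-range formula below is an ASSUMPTION; the text only
    says the range is determined by the viewing depth, the behavior
    coefficient, and the spherical area radius.
    """
    hmovement_x: float           # x-coordinate of viewer movement
    hmovement_y: float           # y-coordinate of viewer movement
    hmovement_z: float           # z-coordinate of viewer movement
    parallelx_flag: bool         # video supports motion parallax
    viewpoint_ID: int            # current viewpoint of the user
    behavior_coefficient: float  # magnification behavior coefficient
    sphere_radius: float         # radius of the spherical viewing area

    @property
    def move_depth(self) -> float:
        # Depth of the viewing area: distance of the user's real-time
        # position from the initial viewpoint, derived from the
        # fed-back coordinates as described above.
        return math.sqrt(self.hmovement_x ** 2
                         + self.hmovement_y ** 2
                         + self.hmovement_z ** 2)

    @property
    def viewing_range_field(self) -> float:
        # ASSUMED relation: the viewable range shrinks as the user
        # moves deeper into the viewpoint's sphere and scales with the
        # behavior coefficient, never exceeding the sphere radius.
        return min(self.sphere_radius,
                   self.behavior_coefficient * (self.sphere_radius - self.move_depth))

fb = ViewpointChangeFeedback(0.6, 0.8, 0.0, True, 1, 1.0, 2.5)
```

With these invented numbers, the fed-back position (0.6, 0.8, 0.0) yields a depth of 1.0 and, under the assumed formula, a viewing range of 1.5.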
Based on the above information, and taking VRViewpointChangeFeedback and the depth and viewing range information box as examples, the organization structure of this information is given below.
1. VRViewpointChangeFeedback
2. Depth and viewing range information box
2.1 Definition
Box Type: 'dvri'
Container: ProjectedOmniVideoBox
Mandatory: No
Quantity: Zero or one
This data box provides the relative relationship between the user viewing range and the viewpoint depth in a spherical area.
2.2 Syntax
ViewpointChangeStruct (viewpoint change structure) syntax
For a better understanding of the meaning of the above fields, reference is made to the application examples mentioned below.
Based on the above expression, specific application examples are given below:
In conventional panoramic video applications, owing to technical limitations, a user watching the video is limited to the included 360-degree viewing-angle range, and the viewing content cannot change as the user walks. For panoramic video with depth information, the user can freely change the walking route so as to switch between different viewpoints and be immersed in the corresponding exhibition-area scene; meanwhile, the user can change head position to perceive motion parallax, achieving the effect of viewing objects within the current viewpoint's coverage from nearer or farther, i.e., the user's actual viewing area is adaptively adjusted according to the real-time relative displacement.
Specifically, as shown in fig. 3, when the user moves within the viewpoint coverage of exhibition area 1 in the museum, the client directly feeds the user's real-time relative displacement back to the server via the corresponding hmovement_x, hmovement_y, and hmovement_z information. The server can obtain the depth value of the viewing area from the fed-back coordinates hmovement_x, hmovement_y, and hmovement_z, determine the area range that the user can view at the current viewpoint from the parsed viewing depth, behavior coefficient, and spherical area radius, and then present the corresponding viewing area at viewpoint 1 to the user. Generally speaking, the user's movement within the current viewpoint's coverage achieves the effect of "approaching" and "moving away from" the viewed object. When the user moves from exhibition area 1 to exhibition area 2, the real-time position falls within the maximum-radius coverage of exhibition area 2; accordingly, the viewing content is obtained from the scene provided by viewpoint 2, and as the user moves freely within the coverage of exhibition area 2, the viewing area changes as described above.
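The museum walk-through can be simulated with the following sketch. All coordinates and radii are invented for illustration, and the nearest-covering-viewpoint rule for overlapping coverages is an assumption not stated in the text:

```python
import math

# Hypothetical museum layout: two exhibition areas, each served by a
# fixed panoramic camera (viewpoint); coordinates are relative to
# viewpoint 1, and all numbers are invented for this sketch.
viewpoint_pos = {1: (0.0, 0.0, 0.0), 2: (6.0, 0.0, 0.0)}
rmax = {1: 3.0, 2: 3.0}  # maximum coverage radius of each exhibition area

def current_viewpoint(pos):
    """Return the viewpoint whose maximum-radius coverage contains the
    user's real-time position (nearest one if coverages overlap), or
    None when the position lies outside every coverage."""
    inside = [(math.dist(pos, center), vid)
              for vid, center in viewpoint_pos.items()
              if math.dist(pos, center) <= rmax[vid]]
    return min(inside)[1] if inside else None

path = [
    (0.5, 0.0, 0.0),  # moving inside exhibition area 1 ("approaching")
    (2.5, 0.0, 0.0),  # still within viewpoint 1's coverage
    (5.0, 0.0, 0.0),  # now within viewpoint 2's coverage -> switch
]
track = [current_viewpoint(p) for p in path]  # [1, 1, 2]
```

The first two positions keep the user at viewpoint 1 (only the depth, and hence the presented viewing area, changes); the third position triggers the switch to viewpoint 2.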
Those skilled in the art will appreciate that, in addition to implementing the system and its various devices, modules, and units provided by the present invention as pure computer-readable program code, the method steps can be logically programmed so that the system and its devices, modules, and units are realized in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Therefore, the system and its various devices, modules, and units provided by the invention can be regarded as a hardware component, and the devices, modules, and units included therein for realizing the various functions can be regarded as structures within that hardware component; the means, modules, and units for realizing the various functions can likewise be regarded both as software modules implementing the method and as structures within the hardware component.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.