
CN109257584B - User watching viewpoint sequence prediction method for 360-degree video transmission - Google Patents


Info

Publication number
CN109257584B
CN109257584B
Authority
CN
China
Prior art keywords
viewpoint
sequence
user
future
positions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810886661.7A
Other languages
Chinese (zh)
Other versions
CN109257584A (en)
Inventor
邹君妮
杨琴
刘昕
李成林
熊红凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiao Tong University
Original Assignee
Shanghai Jiao Tong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiao Tong University filed Critical Shanghai Jiao Tong University
Priority to CN201810886661.7A priority Critical patent/CN109257584B/en
Publication of CN109257584A publication Critical patent/CN109257584A/en
Application granted granted Critical
Publication of CN109257584B publication Critical patent/CN109257584B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method for predicting the sequence of viewpoints a user will watch during 360-degree video transmission, comprising the following steps: taking the user's viewpoint positions at past moments as the input of a viewpoint sequence prediction model and predicting, through the viewpoint sequence prediction model, the viewpoint positions at multiple future moments, which constitute a first viewpoint sequence; taking the video content as the input of a viewpoint tracking model and predicting, through the viewpoint tracking model, the viewpoint positions at multiple future moments, which constitute a second viewpoint sequence; and combining the first viewpoint sequence and the second viewpoint sequence to determine the user's future viewing viewpoint sequence. The prediction method has good practicability and extensibility, and the length of the predicted viewpoint sequence can be changed according to the speed of the user's head movement.

Description

Prediction method of a user viewing viewpoint sequence for 360-degree video transmission

Technical Field

The present invention relates to the technical field of video communication, and in particular to a method for predicting the sequence of viewpoints a user watches during 360-degree video transmission.

Background Art

360-degree video is an important application of virtual reality technology. Compared with traditional video, 360-degree video uses omnidirectional cameras to capture scenes in every direction of the real world and stitches these scenes into panoramic images. When watching a 360-degree video, the user can freely turn the head to adjust the viewing angle and obtain an immersive experience. However, 360-degree video has an ultra-high resolution, and transmitting a complete 360-degree video consumes more than six times the bandwidth of traditional video. Under limited network bandwidth, especially on mobile networks, transmitting the complete 360-degree video is difficult.

Limited by the field of view of the head-mounted display, the user can only watch a portion of the 360-degree video at each moment. Therefore, selecting the video region the user is interested in for transmission, according to the user's head movement, uses bandwidth more effectively. From acquiring the user's demand information and feeding it back to the server until the user receives the video content, a round-trip time (RTT) between the user and the server elapses. The user's head may have moved during this period, so that the received content is no longer the part the user is interested in. To avoid the transmission lag caused by the RTT, the user's viewpoint needs to be predicted.

A search of the prior art shows that a common approach to user viewpoint prediction is to infer future viewpoint positions from past viewpoint positions. Y. Bao et al. published a paper entitled "Shooting a moving target: Motion-prediction-based transmission for 360-degree videos" at the IEEE International Conference on Big Data, which proposed three regression models: a simple model that directly uses the current viewpoint position as the future viewpoint position, and linear regression and feed-forward neural network models that regress the user's viewpoint position over time and then predict the future viewpoint position. However, factors such as the user's occupation, age, gender and preferences affect which regions of a 360-degree video the user is interested in, and the relationship between future and past viewpoint positions is nonlinear with long-term dependencies; the three prediction models proposed in that paper can only predict a single viewpoint position and cannot predict viewpoint positions at multiple future moments.

The search also found that A. D. Aladagli et al. published a paper entitled "Predicting head trajectories in 360 virtual reality videos" at the International Conference on 3D Immersion (2018, pp. 1-6), which considered the influence of video content on the user's viewpoint position and predicted the salient regions of the video with a saliency algorithm in order to predict the user's viewpoint position. However, that paper did not consider the influence of past viewpoint positions on the viewing viewpoint.

Summary of the Invention

In view of the defects in the prior art, the purpose of the present invention is to provide a method for predicting a user's viewing viewpoint sequence for 360-degree video transmission.

The present invention provides a method for predicting a user's viewing viewpoint sequence for 360-degree video transmission, comprising:

taking the user's viewpoint positions at past moments as the input of a viewpoint sequence prediction model, and predicting the viewpoint positions at multiple future moments through the viewpoint sequence prediction model, the viewpoint positions at the multiple future moments constituting a first viewpoint sequence;

taking the video content as the input of a viewpoint tracking model, and predicting the viewpoint positions at multiple future moments through the viewpoint tracking model, the viewpoint positions at the multiple future moments constituting a second viewpoint sequence;

combining the first viewpoint sequence and the second viewpoint sequence to determine the user's future viewing viewpoint sequence.

Optionally, before taking the user's viewpoint positions at past moments as the input of the viewpoint sequence prediction model and predicting the viewpoint positions at multiple future moments through the viewpoint sequence prediction model, the method further comprises:

constructing the viewpoint sequence prediction model based on a recurrent neural network; wherein the viewpoint sequence prediction model is used to encode the input viewpoint positions and feed them into the recurrent neural network, compute the values of the hidden units and the output units, learn the long-term dependencies between the user's viewing viewpoints at different moments, and output the viewpoint positions at multiple future moments; the viewpoint positions comprise the unit-circle projections of the pitch angle, yaw angle and roll angle, and the viewpoint positions vary between -1 and 1; the hyperbolic tangent function is used as the activation function of the output unit, and this activation function limits the output range of the viewpoint positions.

Optionally, taking the user's viewpoint positions at past moments as the input of the viewpoint sequence prediction model and predicting the viewpoint positions at multiple future moments through the viewpoint sequence prediction model comprises:

taking the viewpoint position of the user at the current moment as the input of the first iteration of the viewpoint sequence prediction model to obtain the predicted viewpoint position of the first iteration;

cyclically taking the predicted viewpoint position of the previous iteration as the input of the next iteration of the viewpoint sequence prediction model to obtain the predicted viewpoint positions at multiple future moments.

Optionally, the length of the first viewpoint sequence is related to the speed of the user's head movement while viewing: the slower the user's head moves, the longer the corresponding first viewpoint sequence; the faster the user's head moves, the shorter the corresponding first viewpoint sequence.
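As a purely illustrative sketch of this idea (not part of the patented method), the following Python function maps a measured head speed to a prediction window length; the speed thresholds and window lengths are assumptions chosen only for the example.

```python
def prediction_window(head_speed_deg_per_s, slow=30.0, fast=90.0, t_w_long=10, t_w_short=3):
    """Choose a longer prediction window t_w for slow head motion and a shorter one for fast motion.

    The speed thresholds (30 and 90 deg/s) and the window lengths (10 and 3 steps)
    are illustrative assumptions, not values taken from the patent.
    """
    if head_speed_deg_per_s <= slow:
        return t_w_long
    if head_speed_deg_per_s >= fast:
        return t_w_short
    frac = (head_speed_deg_per_s - slow) / (fast - slow)   # interpolate in between
    return int(round(t_w_long + frac * (t_w_short - t_w_long)))
```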

Optionally, before taking the video content as the input of the viewpoint tracking model and predicting the viewpoint positions at multiple future moments through the viewpoint tracking model, the method further comprises:

constructing the viewpoint tracking model according to a correlation filter algorithm for target tracking, wherein the correlation filter algorithm means setting a correlation filter that produces its maximum response value over the video region at the viewpoint position.

Optionally, taking the video content as the input of the viewpoint tracking model and predicting the viewpoint positions at multiple future moments through the viewpoint tracking model comprises:

projecting the spherical image of a 360-degree video frame at a future moment into a planar image using the equirectangular projection;

determining a bounding box in the planar image through the viewpoint tracking model, the region within the bounding box being the viewpoint region, and determining the corresponding viewpoint position according to the viewpoint region.

Optionally, combining the first viewpoint sequence and the second viewpoint sequence to determine the user's future viewing viewpoint sequence comprises:

setting different weight values w1 and w2 for the viewpoint positions in the first viewpoint sequence and the viewpoint positions in the second viewpoint sequence respectively, the weights w1 and w2 satisfying w1 + w2 = 1; wherein the weight values w1 and w2 are set so as to minimise the error between the predicted future viewing viewpoint positions and the user's actual viewing viewpoint positions;

calculating the user's future viewing viewpoint sequence according to the weight values w1 and w2, the viewpoint positions in the first viewpoint sequence and the viewpoint positions in the second viewpoint sequence; the calculation formula is as follows:

v̂_{t+1:t+t_w} = w1 ⊙ v̂¹_{t+1:t+t_w} + w2 ⊙ v̂²_{t+1:t+t_w}

where v̂_{t+1:t+t_w} is the user's future viewing viewpoint positions from time t+1 to time t+t_w, w1 is the weight value of the first viewpoint sequence, v̂¹_{t+1:t+t_w} is the viewpoint positions in the first viewpoint sequence from time t+1 to time t+t_w, w2 is the weight value of the second viewpoint sequence, v̂²_{t+1:t+t_w} is the viewpoint positions in the second viewpoint sequence from time t+1 to time t+t_w, ⊙ denotes element-wise multiplication, t is the current time, and t_w is the prediction time window.

Optionally, as the prediction time increases, the weight w2 of the second viewpoint sequence predicted by the viewpoint tracking model decreases gradually.

Compared with the prior art, the present invention has the following beneficial effects:

The method for predicting a user's viewing viewpoint sequence for 360-degree video transmission provided by the present invention uses a recurrent neural network to learn the long-term dependencies between the user's viewing viewpoints at different moments and predicts the viewpoint positions at multiple future moments based on the user's past viewpoint positions; it also considers the influence of the video content on the viewing viewpoint and predicts the future viewpoint sequence based on the video content; finally, it combines the influence of the recurrent neural network and of the video content on the viewing viewpoint to obtain the user's future viewing viewpoint sequence. The length of the predicted viewpoint sequence can be changed according to the speed of the user's head movement, so the method has good practicability and extensibility.

Brief Description of the Drawings

Other features, objects and advantages of the present invention will become more apparent by reading the detailed description of non-limiting embodiments with reference to the following drawings:

FIG. 1 is a system block diagram of a method for predicting a user's viewing viewpoint sequence for 360-degree video transmission provided by an embodiment of the present invention;

FIG. 2 is a schematic diagram of the viewpoint region provided by an embodiment of the present invention.

Detailed Description

The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the present invention, but do not limit the present invention in any form. It should be noted that those of ordinary skill in the art can make several variations and improvements without departing from the concept of the present invention; these all fall within the protection scope of the present invention.

FIG. 1 is a system block diagram of a method for predicting a user's viewing viewpoint sequence for 360-degree video transmission provided by an embodiment of the present invention. As shown in FIG. 1, the system comprises a viewpoint prediction module based on a recurrent neural network, a viewpoint tracking module based on a correlation filter, and a fusion module. The recurrent neural network viewpoint prediction module learns the long-term dependencies between the user's viewing viewpoints at different moments and predicts the viewpoint positions at multiple future moments based on the user's past viewpoint positions. The correlation filter viewpoint tracking module considers the influence of the video content on the viewing viewpoint, explores the relationship between the video content and the viewpoint sequence, and predicts the future viewpoint sequence based on the video content. The fusion module combines the prediction results of the recurrent neural network viewpoint prediction module and the correlation filter viewpoint tracking module; the two modules complement each other and improve the prediction accuracy of the model.

In this embodiment, a recurrent neural network learns the long-term dependencies between the user's viewing viewpoints at different moments, and the viewpoint positions at multiple future moments are predicted based on the user's past viewpoint positions; the influence of the video content on the viewing viewpoint is also considered, a viewpoint tracking module based on a correlation filter is proposed to explore the relationship between the video content and the viewpoint sequence, and the future viewpoint sequence is predicted based on the video content; finally, the fusion module combines the prediction results of the recurrent neural network viewpoint prediction module and the correlation filter viewpoint tracking module, so that the two modules complement each other and the prediction accuracy of the model is improved. The viewpoint sequence prediction structure proposed by the present invention can change the length of the predicted viewpoint sequence according to the speed of the user's head movement, has good practicability and extensibility, and lays a solid foundation for the efficient transmission of 360-degree video.

Specifically, in this embodiment, predicting the viewpoint position means predicting the unit-circle projections of the pitch angle (θ), yaw angle (φ) and roll angle (ψ), which correspond to the rotation of the user's head about the X, Y and Z axes. FIG. 2 is a schematic diagram of the viewpoint region provided by an embodiment of the present invention; referring to FIG. 2, the three angles of the initial position of the user's head are all defined as 0 degrees, and each angle varies from -180° to 180°. For a user watching the video with a head-mounted display, these three angles determine a unique viewpoint position. Experiments show that when the user turns the head, the yaw angle φ changes most markedly relative to the other two angles and is therefore the hardest to predict.

This example focuses on the prediction of the yaw angle φ; the proposed system structure can be extended directly to the prediction of the other two angles. By the angle definition, -180° and 179° differ by 1° rather than 359°. To avoid this wrap-around problem, the predicted angle is first transformed, and V_t = g(φ_t) = [sin φ_t, cos φ_t] is used as the input; before the prediction result is output, the inverse transform g⁻¹ is applied to the predicted V_t to recover the angle. Here V_t is the output vector obtained by applying the g transform to the yaw angle φ_t at time t, sin φ_t is the sine of the yaw angle at time t, cos φ_t is the cosine of the yaw angle at time t, and g(φ_t) denotes the transform of the yaw angle at time t.

In this embodiment, the recurrent neural network viewpoint prediction module takes the viewpoint position at the current moment (the transformed yaw angle V_t) as input and predicts the yaw angles at multiple future moments v̂_{t+1}, ..., v̂_{t+t_w}, where t_w is the prediction time window, v̂_{t+1} is the value of the yaw angle at time t+1, and v̂_{t+t_w} is the value of the yaw angle at time t+t_w. If the user's head moves slowly, a larger prediction time window t_w can be chosen; otherwise the prediction time window should be set to a smaller value. During training, for each time step i (with i ranging from t to t+t_w-1), V_i is encoded into a 128-dimensional vector x_i. Then x_i is fed into the recurrent neural network, and the hidden unit h_i and the output unit o_i are computed. For each time step from t to t+t_w-1, the following update equations are applied:

x_i = σ1(W_xv V_i + b_x)    (1)

h_i = σ2(W_hx x_i + W_hh h_{i-1} + b_h)    (2)

y_i = W_oh h_i + b_o    (3)

v̂_{i+1} = g⁻¹(y_i)    (4)

where W_xv is the weight matrix that encodes the yaw angle into the 128-dimensional vector x_i, W_hx is the weight matrix connecting the input unit x_i to the hidden unit h_i, W_hh is the weight matrix connecting the hidden unit h_{i-1} at time i-1 to the hidden unit h_i at time i, W_oh is the weight matrix connecting the hidden unit h_i to the output unit o_i, b_x is the bias vector of the encoding step, b_h is the bias vector for computing the hidden unit h_i, and b_o is the bias vector for computing the output unit o_i. During testing, the viewpoint position at the current moment is used as the input of the first iteration; for the other time steps, the prediction result of the previous iteration is used as the input of the next iteration. σ1 and σ2 are activation functions, where σ1 is the rectified linear (ReLU) function and σ2 is the hyperbolic tangent function; v̂_{i+1} is the predicted value of the user's viewpoint position at time i+1, g(φ_i) is the transformed value of the yaw angle φ_i at time i, h_{i-1} is the hidden unit at time i-1, and g⁻¹(y_i) is the inverse g transform of the output y_i.
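A minimal NumPy sketch of update equations (1)-(4) is given below. The weight shapes, the way the previous prediction is re-encoded for the next step, and the use of a plain (non-gated) recurrent cell are assumptions made only for illustration.

```python
import numpy as np

def relu(x):                      # sigma_1: rectified linear function
    return np.maximum(x, 0.0)

def rnn_rollout(yaw_deg, params, t_w):
    """Autoregressively predict t_w future yaw angles from the current yaw angle.

    params holds numpy weight matrices W_xv, W_hx, W_hh, W_oh and bias vectors b_x, b_h, b_o.
    """
    V = np.array([np.sin(np.deg2rad(yaw_deg)), np.cos(np.deg2rad(yaw_deg))])   # V_t = g(yaw)
    h = np.zeros(params["W_hh"].shape[0])                                      # hidden state (e.g. 256-d)
    predictions = []
    for _ in range(t_w):
        x = relu(params["W_xv"] @ V + params["b_x"])                           # (1) encode to 128-d
        h = np.tanh(params["W_hx"] @ x + params["W_hh"] @ h + params["b_h"])   # (2) sigma_2 = tanh
        y = params["W_oh"] @ h + params["b_o"]                                 # (3) output unit
        yaw_pred = np.rad2deg(np.arctan2(y[0], y[1]))                          # (4) inverse g transform
        predictions.append(yaw_pred)
        V = np.array([np.sin(np.deg2rad(yaw_pred)),
                      np.cos(np.deg2rad(yaw_pred))])                           # previous prediction feeds the next step
    return np.array(predictions)
```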

In this embodiment, the correlation filter viewpoint tracking module, following a target-tracking correlation filter algorithm, designs a correlation filter that yields its maximum response over the region where the viewpoint is located. It takes the future 360-degree video frames F_{t+1}, ..., F_{t+t_w} as input and predicts the viewpoint positions based on the video content, where F_{t+1} is the 360-degree video frame at time t+1 and F_{t+t_w} is the 360-degree video frame at time t+t_w. Since target-tracking correlation filter algorithms are mainly used to track concrete objects in a video, the viewpoint tracked in this embodiment is more abstract than a concrete object. Therefore, the spherical image of each 360-degree video frame is first projected into a planar image using the equirectangular projection, and the region corresponding to the viewpoint is re-located on the planar image. In the projected planar image, content near the poles is stretched horizontally, so the region corresponding to the viewpoint is no longer rectangular; a bounding box is therefore set around the viewpoint to redefine the size and shape of the viewpoint region. In this way, the bounding box of the viewpoint can be predicted from the video content, and thus the viewpoint positions v̂²_{t+1}, ..., v̂²_{t+t_w} are predicted.
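The sketch below illustrates only the coordinate handling described above: placing a small bounding box around the viewpoint on the 1800×900 equirectangular frame and converting a tracked box centre back to angles. The correlation filter tracker itself is left as a pluggable component with an assumed init/update interface, and the linear pixel-angle mapping is an assumption for illustration.

```python
FRAME_W, FRAME_H = 1800, 900      # equirectangular frame size used in this embodiment
BOX = 10                          # bounding-box side length in pixels

def viewpoint_to_pixel(yaw_deg, pitch_deg):
    """Locate the bounding-box centre of a viewpoint on the equirectangular frame."""
    u = (yaw_deg + 180.0) / 360.0 * FRAME_W
    v = (90.0 - pitch_deg) / 180.0 * FRAME_H
    return u, v

def pixel_to_viewpoint(u, v):
    """Map a tracked bounding-box centre back to yaw/pitch angles in degrees."""
    return u / FRAME_W * 360.0 - 180.0, 90.0 - v / FRAME_H * 180.0

def track_viewpoints(tracker, frames, yaw0, pitch0):
    """Predict the second viewpoint sequence from the video content alone.

    tracker is any correlation-filter tracker exposing init(frame, box) and
    update(frame) -> (x, y, w, h); this interface is an assumption.
    frames are the future equirectangular frames F_{t+1} ... F_{t+t_w}.
    """
    u, v = viewpoint_to_pixel(yaw0, pitch0)
    tracker.init(frames[0], (u - BOX / 2, v - BOX / 2, BOX, BOX))
    sequence = []
    for frame in frames:
        x, y, w, h = tracker.update(frame)                   # box with maximum filter response
        sequence.append(pixel_to_viewpoint(x + w / 2, y + h / 2))
    return sequence
```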

In this embodiment, the prediction results of the recurrent neural network viewpoint prediction module and the correlation filter viewpoint tracking module are combined with different weights to obtain the final prediction, i.e.

v̂_{t+1:t+t_w} = w1 ⊙ v̂¹_{t+1:t+t_w} + w2 ⊙ v̂²_{t+1:t+t_w}

where v̂_{t+1:t+t_w} is the final prediction, v̂¹_{t+1:t+t_w} and v̂²_{t+1:t+t_w} are the predictions of the recurrent neural network viewpoint prediction module and the correlation filter viewpoint tracking module respectively, ⊙ denotes element-wise multiplication, and the weights w1 and w2 satisfy w1 + w2 = 1; the weight values that minimise the error of the final predicted viewpoint positions are adopted. In the correlation filter viewpoint tracking module the filter cannot be updated, so the gap between the estimated and true viewpoints grows as errors accumulate; for a large prediction window, the weight of the correlation filter prediction is therefore decreased gradually. The viewpoint sequence prediction module based on the recurrent neural network and the viewpoint tracking module based on the correlation filter thus complement each other, improving the prediction accuracy.
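A minimal sketch of the weighted fusion and of a simple grid search for the weight is shown below; the scalar weights broadcast over the window and the grid-search granularity are assumptions, since the description only requires the weights to minimise the prediction error.

```python
import numpy as np

def fuse(seq_rnn, seq_cf, w1):
    """Element-wise weighted combination of the two predicted viewpoint sequences."""
    w2 = 1.0 - w1                                            # weights satisfy w1 + w2 = 1
    return w1 * np.asarray(seq_rnn) + w2 * np.asarray(seq_cf)

def select_weight(seq_rnn, seq_cf, actual, candidates=np.linspace(0.0, 1.0, 101)):
    """Pick the w1 that minimises the error between fused and actual viewpoint positions."""
    errors = [np.mean(np.abs(fuse(seq_rnn, seq_cf, w) - np.asarray(actual))) for w in candidates]
    return float(candidates[int(np.argmin(errors))])
```

Because the combination is element-wise, w1 and w2 can equally be vectors over the prediction window, which is one way the weight of the correlation filter prediction can be reduced for later time steps.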

The key parameters in this embodiment are set as follows. The experimental data come from the paper "Shooting a moving target: Motion-prediction-based transmission for 360-degree videos" published by Y. Bao et al. at the IEEE International Conference on Big Data; the dataset recorded the head movement of 153 volunteers watching 16 360-degree videos, some volunteers watched only part of the videos, and 985 viewing samples were collected in total. In the data preprocessing of this embodiment, each viewing sample is sampled 10 times per second, 289 motion records are kept per viewing sample, and 285,665 motion records are obtained in total. 80% of the motion data are used as the training set and 20% as the test set. For the recurrent neural network module, the hidden unit size is set to 256, the Adam (adaptive moment estimation) optimisation method is used, and the momentum and weight decay are set to 0.8 and 0.999 respectively. The batch size is 128, and the model is trained for 500 epochs. The learning rate decays linearly from 0.001 to 0.0001 during the first 250 epochs. For the correlation filter viewpoint tracking module, the image is resized to 1800×900 and the bounding box size is set to 10×10. For the fusion module, different values are assigned to w1 and w2, and the values that minimise the error of the final predicted viewpoint positions are chosen as the final weights.
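A hedged PyTorch sketch of this training configuration follows. Reading the stated "momentum and weight decay" of 0.8 and 0.999 as Adam's beta coefficients is an assumption, and the torch.nn.RNN module stands in as a placeholder for the full predictor (angle encoder, recurrent cell and output layer).

```python
import torch

HIDDEN_SIZE = 256
BATCH_SIZE = 128
EPOCHS = 500

# Placeholder for the viewpoint sequence predictor described in the embodiment.
model = torch.nn.RNN(input_size=128, hidden_size=HIDDEN_SIZE, batch_first=True)

# Interpreting "momentum and weight decay of 0.8 and 0.999" as Adam's betas (assumption).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.8, 0.999))

def lr_factor(epoch):
    """Linear decay from 1e-3 to 1e-4 over the first 250 epochs, then held constant."""
    return 1.0 - 0.9 * epoch / 250.0 if epoch < 250 else 0.1

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_factor)
```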

To meet the need to improve bandwidth utilisation in 360-degree video transmission, the present invention proposes a viewpoint sequence prediction system based on the user's past viewpoint positions and the 360-degree video content. The viewpoint sequence prediction structure proposed by the present invention can predict the user's viewpoint positions at multiple future moments and can change the length of the predicted viewpoint sequence according to the speed of the user's head movement; it has good practicability and extensibility and lays a solid foundation for the efficient transmission of 360-degree video.

Specific embodiments of the present invention have been described above. It should be understood that the present invention is not limited to the above specific embodiments, and those skilled in the art can make various changes or modifications within the scope of the claims without affecting the essential content of the present invention. Where there is no conflict, the embodiments of the present application and the features in the embodiments may be combined with one another arbitrarily.

Claims (8)

1. A method for predicting a user viewing viewpoint sequence for 360-degree video transmission, characterized by comprising:

taking the viewpoint positions of the user at past moments as the input of a viewpoint sequence prediction model, and predicting the viewpoint positions at multiple future moments through the viewpoint sequence prediction model, the viewpoint positions at the multiple future moments predicted by the viewpoint sequence prediction model constituting a first viewpoint sequence; the viewpoint sequence prediction model being constructed based on a recurrent neural network and used to encode the input viewpoint positions and feed them into the recurrent neural network, compute the values of the hidden units and output units, learn the long-term dependencies between the user's viewing viewpoints at different moments, and output the viewpoint positions at multiple future moments; the viewpoint positions comprising the unit-circle projections of the pitch angle, yaw angle and roll angle, the viewpoint positions varying between -1 and 1; the hyperbolic tangent function being used as the activation function of the output unit, the activation function limiting the output range of the viewpoint positions;

taking video content as the input of a viewpoint tracking model, and predicting the viewpoint positions at multiple future moments through the viewpoint tracking model, the viewpoint positions at the multiple future moments predicted by the viewpoint tracking model constituting a second viewpoint sequence; the viewpoint tracking model being constructed according to a correlation filter algorithm for target tracking, wherein the correlation filter algorithm means setting a correlation filter that forms a maximum response value over the video region at the viewpoint position;

combining the first viewpoint sequence and the second viewpoint sequence to determine the user's future viewing viewpoint sequence.

2. The method for predicting a user viewing viewpoint sequence for 360-degree video transmission according to claim 1, characterized in that before taking the viewpoint positions of the user at past moments as the input of the viewpoint sequence prediction model and predicting the viewpoint positions at multiple future moments through the viewpoint sequence prediction model, the method further comprises: constructing the viewpoint sequence prediction model based on a recurrent neural network.

3. The method for predicting a user viewing viewpoint sequence for 360-degree video transmission according to claim 2, characterized in that taking the viewpoint positions of the user at past moments as the input of the viewpoint sequence prediction model and predicting the viewpoint positions at multiple future moments through the viewpoint sequence prediction model comprises:

taking the viewpoint position of the user at the current moment as the input of the first iteration of the viewpoint sequence prediction model to obtain the predicted viewpoint position of the first iteration;

cyclically taking the predicted viewpoint position of the previous iteration as the input of the next iteration of the viewpoint sequence prediction model to obtain the predicted viewpoint positions at multiple future moments.

4. The method for predicting a user viewing viewpoint sequence for 360-degree video transmission according to claim 1, characterized in that the length of the first viewpoint sequence is related to the speed of the user's head movement while viewing: the slower the user's head moves, the longer the corresponding first viewpoint sequence; the faster the user's head moves, the shorter the corresponding first viewpoint sequence.

5. The method for predicting a user viewing viewpoint sequence for 360-degree video transmission according to claim 1, characterized in that before taking video content as the input of the viewpoint tracking model and predicting the viewpoint positions at multiple future moments through the viewpoint tracking model, the method further comprises: constructing the viewpoint tracking model according to the correlation filter algorithm for target tracking.

6. The method for predicting a user viewing viewpoint sequence for 360-degree video transmission according to claim 5, characterized in that taking video content as the input of the viewpoint tracking model and predicting the viewpoint positions at multiple future moments through the viewpoint tracking model comprises:

projecting the spherical image of a 360-degree video frame at a future moment into a planar image using the equirectangular projection;

determining a bounding box in the planar image through the viewpoint tracking model, the region within the bounding box being the viewpoint region, and determining the corresponding viewpoint position according to the viewpoint region.

7. The method for predicting a user viewing viewpoint sequence for 360-degree video transmission according to any one of claims 1-6, characterized in that combining the first viewpoint sequence and the second viewpoint sequence to determine the user's future viewing viewpoint sequence comprises:

setting different weight values w1 and w2 for the viewpoint positions in the first viewpoint sequence and the viewpoint positions in the second viewpoint sequence respectively, the weights w1 and w2 satisfying w1 + w2 = 1, wherein the weight values w1 and w2 are set so as to minimise the error between the predicted future viewing viewpoint positions and the user's actual viewing viewpoint positions;

calculating the user's future viewing viewpoint sequence according to the weight values w1 and w2, the viewpoint positions in the first viewpoint sequence and the viewpoint positions in the second viewpoint sequence; the calculation formula is as follows:

v̂_{t+1:t+t_w} = w1 ⊙ v̂¹_{t+1:t+t_w} + w2 ⊙ v̂²_{t+1:t+t_w}

where v̂_{t+1:t+t_w} is the user's future viewing viewpoint positions from time t+1 to time t+t_w, w1 is the weight value of the first viewpoint sequence, v̂¹_{t+1:t+t_w} is the viewpoint positions in the first viewpoint sequence from time t+1 to time t+t_w, w2 is the weight value of the second viewpoint sequence, v̂²_{t+1:t+t_w} is the viewpoint positions in the second viewpoint sequence from time t+1 to time t+t_w, ⊙ denotes element-wise multiplication, t is the current time, and t_w is the prediction time window.

8. The method for predicting a user viewing viewpoint sequence for 360-degree video transmission according to claim 7, characterized in that as the prediction time increases, the weight w2 of the second viewpoint sequence predicted by the viewpoint tracking model decreases gradually.
CN201810886661.7A 2018-08-06 2018-08-06 User watching viewpoint sequence prediction method for 360-degree video transmission Active CN109257584B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810886661.7A CN109257584B (en) 2018-08-06 2018-08-06 User watching viewpoint sequence prediction method for 360-degree video transmission

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810886661.7A CN109257584B (en) 2018-08-06 2018-08-06 User watching viewpoint sequence prediction method for 360-degree video transmission

Publications (2)

Publication Number Publication Date
CN109257584A CN109257584A (en) 2019-01-22
CN109257584B true CN109257584B (en) 2020-03-10

Family

ID=65048730

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810886661.7A Active CN109257584B (en) 2018-08-06 2018-08-06 User watching viewpoint sequence prediction method for 360-degree video transmission

Country Status (1)

Country Link
CN (1) CN109257584B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109862019B (en) * 2019-02-20 2021-10-22 联想(北京)有限公司 Data processing method, device and system
CN110248212B (en) * 2019-05-27 2020-06-02 上海交通大学 Multi-user 360-degree video stream server-side bit rate adaptive transmission method and system
CN110166850B (en) * 2019-05-30 2020-11-06 上海交通大学 Method and system for predicting panoramic video watching position by multiple CNN networks
CN110248178B (en) * 2019-06-18 2021-11-23 深圳大学 Viewport prediction method and system using object tracking and historical track panoramic video
CN114040184B (en) * 2021-11-26 2024-07-16 京东方科技集团股份有限公司 Image display method, system, storage medium and computer program product

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104768018A (en) * 2015-02-04 2015-07-08 浙江工商大学 A Fast Viewpoint Prediction Method Based on Depth Map
CN106612426A (en) * 2015-10-26 2017-05-03 华为技术有限公司 Method and device for transmitting multi-view video
CN107274472A (en) * 2017-06-16 2017-10-20 福州瑞芯微电子股份有限公司 A kind of method and apparatus of raising VR play frame rate
CN107422844A (en) * 2017-03-27 2017-12-01 联想(北京)有限公司 A kind of information processing method and electronic equipment
CN107533230A (en) * 2015-03-06 2018-01-02 索尼互动娱乐股份有限公司 Head mounted display tracing system
CN107770561A (en) * 2017-10-30 2018-03-06 河海大学 A kind of multiresolution virtual reality device screen content encryption algorithm using eye-tracking data
CN108134941A (en) * 2016-12-01 2018-06-08 联发科技股份有限公司 Adaptive video decoding method and apparatus thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10432988B2 (en) * 2016-04-15 2019-10-01 Ati Technologies Ulc Low latency wireless virtual reality systems and methods
US9681096B1 (en) * 2016-07-18 2017-06-13 Apple Inc. Light field capture

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104768018A (en) * 2015-02-04 2015-07-08 浙江工商大学 A Fast Viewpoint Prediction Method Based on Depth Map
CN107533230A (en) * 2015-03-06 2018-01-02 索尼互动娱乐股份有限公司 Head mounted display tracing system
CN106612426A (en) * 2015-10-26 2017-05-03 华为技术有限公司 Method and device for transmitting multi-view video
CN108134941A (en) * 2016-12-01 2018-06-08 联发科技股份有限公司 Adaptive video decoding method and apparatus thereof
CN107422844A (en) * 2017-03-27 2017-12-01 联想(北京)有限公司 A kind of information processing method and electronic equipment
CN107274472A (en) * 2017-06-16 2017-10-20 福州瑞芯微电子股份有限公司 A kind of method and apparatus of raising VR play frame rate
CN107770561A (en) * 2017-10-30 2018-03-06 河海大学 A kind of multiresolution virtual reality device screen content encryption algorithm using eye-tracking data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Xiaochuan, Liang Xiaohui; "Viewpoint-predicting-based Remote Rendering on Mobile Devices using Multiple"; 2015 International Conference on Virtual Reality and Visualization; 2016-05-12; full text *

Also Published As

Publication number Publication date
CN109257584A (en) 2019-01-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant